# Sequences

<style>
section.present > section.present { 
    max-height: 90%; 
    overflow-y: scroll;
}
</style>

<small><a href="https://colab.research.google.com/github/brandeis-jdelfino/cosi-10a/blob/main/lectures/notebooks/7_sequences.ipynb">Link to interactive slides on Google Colab</a></small>


# Sequence types

We've looked at a few **sequence types** already - can you guess which types are sequence types?

* **Sequence types** are data types that represent sequences of things. 
  * `str` - sequence of characters
  * `list` - sequence of values
  * `range` - (as in: `for i in range(5)`!) generated sequence of integers 
  * `tuple` - coming soon, similar to `list`
  
Today, we'll learn about some common operations that work on some/all sequence types, and learn more about strings.

# Tuples

Tuples are a sequence type. They are almost the same as lists, with a few small differences:
* Tuples are created and represented with parentheses `(` `)` instead of brackets `[` `]`
* Tuples are **immutable**. This means they can't change after creation.
  * You can't add, remove, or change the items a tuple contains.

In [None]:
i_am_a_tuple = (1,2,3)
print(i_am_a_tuple)

In [None]:
im_also_a_tuple = (7,)
print(im_also_a_tuple)

In [None]:
flavors = ("chocolate", "vanilla", "strawberry")
flavors[1] = "coffee"  # raises TypeError: tuples are immutable
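
The assignment above stops the program with a `TypeError`. Here is a minimal sketch of what happens, using `try`/`except` to catch the error so the remaining lines still run. The name `error_message` is only for illustration.

```python
flavors = ("chocolate", "vanilla", "strawberry")

try:
    flavors[1] = "coffee"  # not allowed: tuples are immutable
except TypeError as err:
    error_message = str(err)
    print("Could not change the tuple:", error_message)

# The tuple still holds its original items
print(flavors)
```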

You can convert back and forth between lists and tuples, similar to converting between `int`, `float`, and `str`:

In [None]:
flavors = ("chocolate", "vanilla", "strawberry")
print(flavors)


In [None]:
new_flavors = list(flavors)
new_flavors[1] = "coffee"
print(new_flavors)

In [None]:
flavors = tuple(new_flavors)
print(flavors)

# `range` revisited

`range` is actually a sequence type that generates integers:

In [None]:
list(range(5))

In [None]:
for i in range(5):
    print(i)

behaves the same as:

In [None]:
for i in [0,1,2,3,4]:
    print(i)

In [None]:
nums = [10,11,12,13]
for i in range(len(nums)):
    print(nums[i])

behaves the same as:

In [None]:
nums = [10,11,12,13]
for num in nums:
    print(num)

# `for` revisited

It turns out that `for i in range(...):` is actually just a special case of the general `for` loop form:

```
for <var> in <sequence>:
   statement(s)
```

A `for` loop can iterate over any **sequence**. It executes the code block once for each item in the sequence. 

This means you can use `for` loops on strings, lists, tuples, and ranges.


In [None]:
for c in "Hello, world!":
    print(c)
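
The same loop form works on the other sequence types too. A short sketch, with made-up variable names, that loops over a tuple and then over a range:

```python
flavors = ("chocolate", "vanilla", "strawberry")

# Loop over a tuple: one pass per item
for flavor in flavors:
    print(flavor)

# Loop over a range: add up 0 + 1 + 2 + 3
total = 0
for n in range(4):
    total = total + n
print(total)
```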

# Common sequence operations

These work on most sequence types! We've seen most of these before on strings or lists. They also work on tuples, and all of them except `+` and `*` work on ranges.

| Operation | Description | Example | Result |
| --- | :--- | --- | :--- |
| `x in s` | `True` if an item of `s` is equal to `x` | `"hi" in ["hello", "hi", "yo"]` | `True` |
| `x not in s` | Opposite of `in` | `"hi" not in ["hello", "hi", "yo"]` | `False` |
| `s + t` | Concatenate (combine end-to-end) | `[1,2,3] + [4,5]` | `[1,2,3,4,5]` |
| `s * n` | Replicate | `"abc" * 3` | `"abcabcabc"` |
| `s[i]` | *i*th item of `s` | `"hello"[1]` | `"e"` |
| `s[i:j]` | slice of `s` from `i` up to (but not including) `j` | `"hello"[1:3]` | `"el"` |
| `len(s)` | length of `s` | `len("hello")` | `5` |
| `min(s)` | The smallest item of `s` | `min([4, 8, 7, 3])` | `3` |
| `max(s)` | The largest item of `s` | `max([4, 8, 7, 3])` | `8` |
| `s.index(x)` | index of the first occurrence of `x` in `s` | `[4, 8, 7, 3].index(7)` | `2` |
| `s.count(x)` | total number of occurrences of `x` in `s` | `"hello".count("l")` | `2` |
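
To see that these operations are not limited to strings and lists, here is a small sketch that tries several of them on a tuple and on a range. The names `point` and `evens` are just examples.

```python
point = (3, 1, 4, 1, 5)

print(4 in point)               # True
print(point + (9, 2))           # (3, 1, 4, 1, 5, 9, 2)
print(point[1:3])               # (1, 4)
print(len(point))               # 5
print(min(point), max(point))   # 1 5
print(point.index(4))           # 2
print(point.count(1))           # 2

evens = range(0, 10, 2)         # 0, 2, 4, 6, 8
print(6 in evens)               # True
print(evens[1])                 # 2
print(len(evens))               # 5
print(list(evens[1:3]))         # [2, 4]
```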

# More on Strings

In addition to the common sequence operations, strings provide many other helpful methods.

The full list can be found in the [official documentation on strings](https://docs.python.org/3/library/stdtypes.html#string-methods). 

We'll look through a few notable ones now.

`s.split(<delim>)` splits a string into a list. By default it splits at any whitespace, or you can tell it which character(s) to split at:

In [None]:
sentence = "This is a sentence with lots of words"
sentence.split()

In [None]:
sentence = "This,is a sentence,with,lots of words"
sentence.split(",")

`s.join(<sequence of strings>)` joins a sequence of strings together into a single string, with `s` between each string.

In [None]:
' '.join(["some", "words", "to", "stitch", "together"])

In [None]:
'%T%'.join(["some", "words", "to", "stitch", "together"])
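
`split` and `join` are often used together: split a string apart, then stitch the pieces back together with a different separator. A quick sketch, using a made-up date string:

```python
date = "2024-03-15"

parts = date.split("-")      # break the string apart at each "-"
print(parts)                 # ['2024', '03', '15']

rejoined = "/".join(parts)   # glue the pieces back together with "/"
print(rejoined)              # 2024/03/15
```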

`s.isdigit()` returns `True` if all characters in the string are digits and there is at least one character, `False` otherwise.

In [None]:
'12345678'.isdigit()

In [None]:
'123abc'.isdigit()
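
One common use of `isdigit()` is checking that a string is safe to convert with `int()`. A minimal sketch, assuming a made-up list of strings called `entries`:

```python
entries = ["42", "7up", "", "2024"]

total = 0
for entry in entries:
    if entry.isdigit():
        # Only digits, so int() will succeed
        total = total + int(entry)
    else:
        print("Skipping:", entry)

print(total)  # 2066
```

Note that the empty string is skipped too, since `isdigit()` requires at least one character.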

`s.upper()` and `s.lower()` convert strings to upper and lower case.

In [None]:
'SpongeBob SquarePants'.upper()

In [None]:
'SpongeBob SquarePants'.lower()

`s.startswith(<search string>)` and `s.endswith(<search string>)` return `True` if a string starts/ends with the given substring.

In [None]:
'SpongeBob SquarePants'.startswith("Sponge")

In [None]:
# case matters!
'SpongeBob SquarePants'.endswith("pants")
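
If you want the check to ignore case, one option is to convert the string to lower case first. A small sketch combining `lower()` with `startswith`/`endswith`:

```python
title = "SpongeBob SquarePants"

print(title.endswith("pants"))             # False: "Pants" has a capital P
print(title.lower().endswith("pants"))     # True
print(title.lower().startswith("sponge"))  # True
```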

# String formatting

There are many, many ways to format strings for printing. Let's peek at a tiny slice that will make your life easier: formatted string literals, or [**f-strings**](https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals).

Adding an `f` before a string lets you include the value of Python expressions inside a string. Here's an example:

In [None]:
name = "Spongebob"
age = 72
foods = ["Cake", "Pie", "Peanut Butter"]
print(f"Hi, I am {name}, I'm {age} years old, and I like {len(foods)} foods: {', '.join(foods)}.")

Note that we were able to include an integer (`age`) without an explicit type conversion. 

This makes printing **much** easier. No more awful lines like this: `print("Some text: " + str(some_number) + ".")`

One more formatting trick: to control the number of decimal places, add `:.<num decimals>f` after an expression in an f-string.

For example:

In [None]:
fraction = 1/3
print(fraction)
print(f"{fraction:.2f}")
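
The same trick is handy for things like prices. A short sketch, using a made-up list of prices, that formats each one with exactly two decimal places:

```python
prices = [3.5, 12, 0.999]

labels = []
for price in prices:
    # :.2f rounds to two decimal places and pads with zeros if needed
    labels.append(f"${price:.2f}")

print(labels)  # ['$3.50', '$12.00', '$1.00']
```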

The [official documentation on string formatting](https://docs.python.org/3/reference/lexical_analysis.html#f-strings) is extremely obtuse. Try [fstring.help](https://fstring.help/cheat/) instead for a decent reference.

# Iterables

An **iterable** object is an object capable of returning its members one at a time. 

All sequence types are **iterables**. Other types can be iterable too. 

An **iterator** is an object you use to iterate over an **iterable**.

Some examples of non-sequence iterable usage:
* Reading a file line-by-line - a file reader creates an iterator over all the lines in the file.
* A "generator" - an iterator that generates a sequence on-demand, as each item is requested.

The details of iterables and iterators are beyond the scope of this class. The important things to know for now are:
* **Iterables** are objects that can return their members one at a time
* All iterables can be iterated over in a `for` loop.
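
To make the iterable/iterator distinction a little more concrete, here is a minimal sketch using Python's built-in `iter()` and `next()` functions. This is roughly what a `for` loop does behind the scenes; you won't need to write it yourself in this class.

```python
flavors = ["chocolate", "vanilla"]   # a list is an iterable

flavor_iterator = iter(flavors)      # ask the iterable for an iterator

first = next(flavor_iterator)        # get the members one at a time
second = next(flavor_iterator)

print(first)   # chocolate
print(second)  # vanilla
```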

# `for` revisited, again

Last time, I promise. 

```
for <var> in <iterable>:
   statement(s)
```

`for` loops can loop over anything that is an **iterable**. This captures more than just **sequences**.
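
As a sketch of a non-sequence iterable, here is a tiny generator used in a `for` loop. The function `countdown` is a made-up example, and the `yield` keyword it relies on is beyond the scope of this class.

```python
def countdown(start):
    """Generate start, start - 1, ..., 1, one value at a time."""
    while start > 0:
        yield start
        start = start - 1

collected = []
for n in countdown(3):
    print(n)
    collected.append(n)

print(collected)  # [3, 2, 1]
```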