
### Sequence Types

Sequence types share the concept of a first element, a second element, and so on: their items are ordered, and positions are numbered using the natural numbers. In Python (and many other languages) indexing starts at `0`, not `1`.

So the first item has index `0`, the second item has index `1`, and so on.

Python has built-in mutable and immutable sequence types.

Strings and tuples are immutable - we can access but not modify the **content** of the **sequence**:

In [None]:
t = (1, 2, 3)

In [None]:
t[0]

1

In [None]:
t[0] = 100

TypeError: 'tuple' object does not support item assignment

But of course, if the sequence contains mutable objects, then although we cannot modify the sequence of elements (cannot replace, delete or insert elements), we certainly **can** change the contents of the mutable objects:

In [None]:
t = ( [1, 2], 3, 4)

`t` is immutable, but its first element is a mutable object:

In [None]:
t[0][0] = 100

In [None]:
t

([100, 2], 3, 4)

#### Iterables

An **iterable** is just something that can be iterated over, for example using a `for` loop:

In [None]:
t = (10, 'a', 1+3j)

In [None]:
s = {10, 'a', 1+3j}

In [None]:
for c in t:
    print(c)

10
a
(1+3j)


In [None]:
for c in s:
    print(c)

a
10
(1+3j)


Note how we could iterate over both the tuple and the set. Iterating over the tuple preserved the **order** of its elements, but iterating over the set did not. Sets have no defined ordering of elements - they are iterable, but they are not sequences.
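
The difference between an iterable and a sequence can also be checked programmatically. The following is a minimal sketch using the abstract base classes in `collections.abc`; the variable names simply mirror the ones used in the cells above:

```python
from collections.abc import Iterable, Sequence

t = (10, 'a', 1+3j)
s = {10, 'a', 1+3j}

# Both objects can be iterated over...
print(isinstance(t, Iterable), isinstance(s, Iterable))   # True True

# ...but only the tuple is a sequence (ordered and indexable).
print(isinstance(t, Sequence), isinstance(s, Sequence))   # True False
```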

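Most sequence types support the `in` and `not in` operations. Ranges do too: in CPython, testing an integer against a `range` is computed arithmetically rather than by scanning, whereas lists, tuples and strings are searched element by element.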

In [None]:
'a' in ['a', 'b', 100]

True

In [None]:
100 in range(200)

True
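
As a rough illustration of membership tests on ranges: in CPython, checking whether an integer is in a `range` is worked out from the start, stop and step, so it stays fast even when the range is far too large to materialize. The values below are arbitrary:

```python
# A range of even numbers far too large to store as a list.
r = range(0, 10**12, 2)

print(10**11 in r)          # True: even and within bounds
print(10**11 + 1 in r)      # False: odd, so not a member
print(10**11 + 1 not in r)  # True
```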

#### Min, Max and Length

Sequences also generally support the built-in `len` function, which returns the number of items in the collection. Some other iterables, such as sets and dictionaries, support it too.

In [None]:
len('python'), len([1, 2, 3]), len({10, 20, 30}), len({'a': 1, 'b': 2})

(6, 3, 3, 2)

Sequences (and even some iterables) may support `max` and `min` as long as the data types in the collection can be **ordered** in some sense (`<` or `>`).

In [None]:
a = [100, 300, 200]
min(a), max(a)

(100, 300)

In [None]:
s = 'python'
min(s), max(s)

('h', 'y')

In [None]:
s = {'p', 'y', 't', 'h', 'o', 'n'}
min(s), max(s)

('h', 'y')

But if the elements do not have an ordering defined:

In [None]:
a = [1+1j, 2+2j, 3+3j]
min(a)

TypeError: '<' not supported between instances of 'complex' and 'complex'
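
One possible workaround, shown here only as a sketch, is to pass a `key` function to `min` or `max` so that the comparison is done on a derived value that does have an ordering. Using `abs` (the magnitude of the complex number) as the key is just one choice among many:

```python
a = [1+1j, 2+2j, 3+3j]

# abs() maps each complex number to its real-valued magnitude,
# and those magnitudes can be compared with < and >.
print(min(a, key=abs), max(a, key=abs))   # (1+1j) (3+3j)
```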

`min` and `max` will work for heterogeneous types as long as the elements are pairwise comparable (`<` or `>` is defined). 

For example:

In [None]:
from decimal import Decimal

In [None]:
t = 10, 20.5, Decimal('30.5')

In [None]:
min(t), max(t)

(10, Decimal('30.5'))

In [None]:
t = ['a', 10, 1000]
min(t)

TypeError: '<' not supported between instances of 'int' and 'str'

Even `range` objects support `min` and `max`:

In [None]:
r = range(10, 200)
min(r), max(r)

(10, 199)

#### Concatenation

We can **concatenate** sequences using the `+` operator:

In [None]:
[1, 2, 3] + [4, 5, 6]

[1, 2, 3, 4, 5, 6]

In [None]:
(1, 2, 3) + (4, 5, 6)

(1, 2, 3, 4, 5, 6)

Note that the type of the concatenated result is the same as the type of the sequences being concatenated, so concatenating sequences of varying types will not work:

In [None]:
(1, 2, 3) + [4, 5, 6]

TypeError: can only concatenate tuple (not "list") to tuple

In [None]:
'abc' + ['d', 'e', 'f']

TypeError: must be str, not list

Note: if you really want to concatenate varying types you'll have to transform them to a common type first:

In [None]:
(1, 2, 3) + tuple([4, 5, 6])

(1, 2, 3, 4, 5, 6)

In [None]:
tuple('abc') + ('d', 'e', 'f')

('a', 'b', 'c', 'd', 'e', 'f')

In [None]:
''.join(tuple('abc') + ('d', 'e', 'f'))

'abcdef'

#### Repetition

Most sequence types also support **repetition**, which is essentially concatenating the same sequence an integer number of times:

In [None]:
'abc' * 5

'abcabcabcabcabc'

In [None]:
[1, 2, 3] * 5

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

We'll come back to some caveats of concatenation and repetition in a bit.

#### Finding things in Sequences

We can find the index of the first occurrence of an element in a sequence, optionally starting the search at a given index:

In [None]:
s = "gnu's not unix"

In [None]:
s.index('n')

1

In [None]:
s.index('n', 1), s.index('n', 2), s.index('n', 8)

(1, 6, 11)

An exception is raised if the element is not found, so you'll want to catch it if you don't want your app to crash:

In [None]:
s.index('n', 13)

ValueError: substring not found

In [None]:
try:
    idx = s.index('n', 13)
except ValueError:
    print('not found')

not found
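
Combining the start argument of `index` with the `ValueError` handling gives a simple way to collect every position of an element. The helper below, `find_all`, is a hypothetical name used only for this sketch, not something built into Python:

```python
def find_all(seq, value):
    """Hypothetical helper: indexes of every occurrence of value in seq."""
    indexes = []
    start = 0
    while True:
        try:
            idx = seq.index(value, start)
        except ValueError:
            # No further occurrence: stop searching.
            break
        indexes.append(idx)
        start = idx + 1  # resume the search just past the match
    return indexes

print(find_all("gnu's not unix", 'n'))  # [1, 6, 11]
print(find_all([1, 2, 1, 3], 1))        # [0, 2]
print(find_all((1, 2, 3), 99))          # []
```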


Note that these methods of finding objects in sequences do not assume that the objects in the sequence are ordered in any way. These are basically searches that iterate over the sequence until they find (or not) the requested element.

If you have a sorted sequence, then other search techniques are available - such as binary searches. I'll cover some of these topics in the extras section of this course.
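
As a small preview of what a binary search can look like, here is a sketch built on `bisect_left` from the standard library's `bisect` module. The helper name `sorted_index` is hypothetical, and the code assumes the sequence is already sorted in ascending order:

```python
from bisect import bisect_left

def sorted_index(sorted_seq, value):
    """Hypothetical helper: index of value in an already-sorted sequence."""
    i = bisect_left(sorted_seq, value)  # leftmost position where value could go
    if i < len(sorted_seq) and sorted_seq[i] == value:
        return i
    raise ValueError(f'{value!r} not found')

data = [10, 20, 30, 40, 50]
print(sorted_index(data, 40))  # 3
```

Like `index`, this sketch raises a `ValueError` when the element is missing, so it can be handled with the same `try`/`except` pattern.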

#### Slicing

We'll come back to slicing in a later lecture, but sequence types generally support slicing, even ranges (as of Python 3.2). Just like concatenation, slices will return the same type as the sequence being sliced:

In [None]:
s = 'python'
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [None]:
s[0:3], s[4:6]

('pyt', 'on')

In [None]:
l[0:3], l[4:6]

([1, 2, 3], [5, 6])

It's ok for slice bounds to extend past the end of the sequence:

In [None]:
s[4:1000]

'on'

If your first argument in the slice is `0`, you can even omit it. Omitting the second argument means it will include all the remaining elements:

In [None]:
s[0:3], s[:3]

('pyt', 'pyt')

In [None]:
s[3:1000], s[3:], s[:]

('hon', 'hon', 'python')

We can even have extended slicing, which provides a start, stop and a step:

In [None]:
s, s[0:5], s[0:5:2]

('python', 'pytho', 'pto')

In [None]:
s, s[::2]

('python', 'pto')

Technically we can also use negative values in slices, including extended slices (more on that later):

In [None]:
s, s[-3:-1], s[::-1]

('python', 'ho', 'nohtyp')

In [None]:
r = range(11)  # numbers from 0 to 10 (inclusive)

In [None]:
print(r)
print(list(r))

range(0, 11)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [None]:
print(r[:5])

range(0, 5)


In [None]:
print(list(r[:5]))

[0, 1, 2, 3, 4]


As you can see, slicing a range returns a range object as well, as expected.
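
The same holds for extended slices of a range. A brief sketch, using arbitrary variable names:

```python
r = range(11)

evens = r[::2]      # every second number
backwards = r[::-1] # the same numbers in reverse order

# Both slices are still range objects, not lists.
print(type(evens).__name__, list(evens))
print(type(backwards).__name__, list(backwards))
```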

#### Hashing

Immutable sequences generally support the built-in `hash` function, which we'll discuss in detail in the section on mapping types:

In [None]:
l = (1, 2, 3)
hash(l)

2528502973977326415

In [None]:
s = '123'
hash(s)

-1892188276802162953

In [None]:
r = range(10)
hash(r)

-6299899980521991026

But mutable sequences (and mutable types in general) do not:

In [None]:
l = [1, 2, 3]

In [None]:
hash(l)

TypeError: unhashable type: 'list'

Note also that an otherwise hashable sequence is no longer hashable if one (or more) of its elements is not hashable:

In [None]:
t = (1, 2, [10, 20])
hash(t)

TypeError: unhashable type: 'list'

But this would work:

In [None]:
t = ('python', (1, 2, 3))
hash(t)

-8790163410081325536

In general, immutable types are likely hashable, while mutable types are not. So numbers, strings, tuples, etc. are hashable, but lists and sets are not:

In [None]:
from decimal import Decimal
d = Decimal(10.5)
hash(d)

1152921504606846986

Sets are not hashable:

In [None]:
s = {1, 2, 3}
hash(s)

TypeError: unhashable type: 'set'

But frozensets, an immutable variant of the set, are:

In [None]:
s = frozenset({1, 2, 3})

In [None]:
hash(s)

-7699079583225461316
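
To summarize the cases above, one way to test hashability without letting the exception propagate is to wrap the call to `hash` in a `try` block. The function `is_hashable` below is a hypothetical helper written for this sketch:

```python
def is_hashable(obj):
    """Hypothetical helper: True if hash(obj) succeeds, False otherwise."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

samples = [(1, 2, 3), '123', range(10), frozenset({1, 2}),
           [1, 2, 3], {1, 2}, (1, [2, 3])]

for obj in samples:
    print(repr(obj), is_hashable(obj))
```

The first four samples report `True`; the list, the set and the tuple containing a list report `False`.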

#### Caveats with Concatenation and Repetition

Consider this:

In [None]:
x = [2000]

In [None]:
id(x[0])

2177520743920

In [None]:
l = x + x

In [None]:
l

[2000, 2000]

In [None]:
id(l[0]), id(l[1])

(2177520743920, 2177520743920)

As expected, the objects in `l[0]` and `l[1]` are the same.

We could also check this with the `is` operator:

In [None]:
l[0] is l[1]

True

This is not a big deal if the objects being concatenated are immutable. But if they are mutable:

In [None]:
x = [ [0, 0] ]
l = x + x

In [None]:
l

[[0, 0], [0, 0]]

In [None]:
l[0] is l[1]

True

And then we have the following:

In [None]:
l[0][0] = 100

In [None]:
l[0]

[100, 0]

In [None]:
l

[[100, 0], [100, 0]]

Notice how changing the first item of the first element also changed the first item of the second element: both elements refer to the same list object.

While this seems fairly obvious when concatenating using the `+` operator as we have just done, the same actually happens with repetition and may not seem so obvious:

In [None]:
x = [ [0, 0] ]

In [None]:
m = x * 3

In [None]:
m

[[0, 0], [0, 0], [0, 0]]

In [None]:
m[0][0] = 100

In [None]:
m

[[100, 0], [100, 0], [100, 0]]

And in fact, even `x` changed, because `m` holds three references to the very list stored in `x`:

In [None]:
x

[[100, 0]]

If you really want these repeated objects to be different objects, you'll have to copy them somehow. A simple list comprehension would work well here:

In [None]:
x = [ [0, 0] ]
m = [e.copy() for e in x*3]

In [None]:
m

[[0, 0], [0, 0], [0, 0]]

In [None]:
m[0][0] = 100

In [None]:
m

[[100, 0], [0, 0], [0, 0]]

In [None]:
x

[[0, 0]]
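
Two other options are sketched here. The first builds a new inner list on each iteration instead of repeating an existing one. The second uses `deepcopy` from the standard library's `copy` module on each element, which may be useful when the elements themselves contain mutable objects (a shallow `.copy()` would still share those). The nested data is made up for illustration:

```python
from copy import deepcopy

# Option 1: create a fresh inner list on every iteration.
m = [[0, 0] for _ in range(3)]
m[0][0] = 100
print(m)  # [[100, 0], [0, 0], [0, 0]]

# Option 2: deep-copy each repeated element individually.
x = [[[0], 0]]
n = [deepcopy(e) for e in x * 3]
n[0][0][0] = 100
print(n)  # [[[100], 0], [[0], 0], [[0], 0]]
print(x)  # [[[0], 0]]
```

Note that calling `deepcopy` once on the whole repeated list would not help: it preserves shared references inside the object it copies, so the three elements would still be one and the same (new) list.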