# Sequence Types

Sequence types share the general concept of a first element, a second element, and so on: the items of the sequence are ordered using the natural numbers. In Python (and many other languages) the starting index is `0`, not `1`.

So the first item has index `0`, the second item has index `1`, and so on.

> In a sequence, the order in which elements are returned is guaranteed (for example: list, string, tuple). Sets are not sequence types, so their iteration order should not be relied upon.

In [3]:
#  Built-in sequence types in python
l = ['a',2,3]
t = (1,2,3)
s = "string"

for e in l:
    print(e)
    
# here order is maintained

a
2
3


In [5]:
s = {'a',2,3}
for e in s:
    print(e)
    
# Here order is not maintained

2
3
a


> Also, we can't index the items in a set.
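A short sketch of that difference. The variable names here are illustrative and not from the original cells: indexing works on the list, but the same operation on the set raises a `TypeError`.

```python
items = ['a', 2, 3]
unique_items = {'a', 2, 3}

first = items[0]          # sequences support positional indexing

try:
    unique_items[0]       # sets have no positions, so this fails
except TypeError as exc:
    error_message = str(exc)

print(first)
print(error_message)
```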

In [1]:
l = [1,2,3]
t = (1,2,3)

l[0] = 100
print(l)

[100, 2, 3]


In [2]:
t[0] = 100

TypeError: 'tuple' object does not support item assignment

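Lists are mutable, so we can replace an item in place; tuples are immutable, so item assignment raises a `TypeError`. But of course, if an immutable sequence contains mutable objects, then although we cannot modify the sequence of elements (cannot replace, delete or insert elements), we certainly **can** change the contents of the mutable objects: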


In [4]:
t = ( [1, 2], 3, 4)
# `t` is immutable, but its first element is a mutable object:

In [6]:
t[0][0] = 100
t

([100, 2], 3, 4)
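To make the distinction a little more concrete, here is a minimal sketch (the names `inner`, `id_before` and `same_object` are illustrative). It suggests that the tuple keeps referring to the same list object throughout: the list's contents change, but the slot in the tuple cannot be rebound.

```python
inner = [1, 2]
t = (inner, 3, 4)
id_before = id(t[0])

t[0][0] = 100             # mutates the list in place
t[0].append(5)            # still the same list, now longer

same_object = id(t[0]) == id_before   # the tuple holds the same object

try:
    t[0] = [7, 8]         # rebinding a tuple slot is not allowed
except TypeError as exc:
    error_message = str(exc)

print(t, same_object)
print(error_message)
```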

#### Iterables
An **iterable** is just something that can be iterated over, for example using a `for` loop:

In [10]:
t = (10, 'a', 1+3j)
for c in t:
    print(c)

10
a
(1+3j)


In [9]:
s = {10, 'a', 1+3j}
for c in s:
    print(c)

(1+3j)
10
a


Note how we could iterate over both the tuple and the set. Iterating over the tuple preserved the **order** of its elements, but iterating over the set did not. Sets do not have an ordering of elements: they are iterable, but they are not sequences.
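One way to check this distinction programmatically is with the abstract base classes in `collections.abc`. This is only a sketch, and the `samples` list and `results` dictionary are illustrative names:

```python
from collections.abc import Iterable, Sequence

samples = [[1, 2], (1, 2), 'ab', range(2), {1, 2}, {'a': 1}]

# Map each type name to (is it iterable?, is it a sequence?)
results = {
    type(obj).__name__: (isinstance(obj, Iterable), isinstance(obj, Sequence))
    for obj in samples
}

for name, (is_iterable, is_sequence) in results.items():
    print(name, is_iterable, is_sequence)
```

All six objects are iterable, but only the list, tuple, string and range are reported as sequences.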

Most sequence types support the `in` and `not in` operations. Ranges do too; in Python 3, testing whether an integer is in a range is computed arithmetically rather than by scanning every element, as is done for lists and tuples.


In [11]:
'a' in ['a', 'b', 100]

True

In [12]:
100 in range(200)

True
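A few more membership tests, as a small sketch with illustrative variable names. Note that for strings `in` checks for a substring rather than a single element, and that a range with a step only contains the values it would actually produce:

```python
has_substring = 'py' in 'python'       # substring test, not just single characters
missing = 'a' not in ('b', 'c')
odd_in_evens = 7 in range(0, 20, 2)    # 7 is never produced by this range
even_in_evens = 8 in range(0, 20, 2)

print(has_substring, missing, odd_in_evens, even_in_evens)
```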

#### Min, Max and Length
Sequences also generally support the built-in `len` function to obtain the number of items in the collection. Some iterables that are not sequences (such as sets and dictionaries) support it as well.

In [13]:
len('python'), len([1, 2, 3]), len({10, 20, 30}), len({'a': 1, 'b': 2})

(6, 3, 3, 2)
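Not every iterable supports `len`, though. As a minimal sketch (the generator `gen` is an illustrative example), a generator has no length until its items are materialized into a collection:

```python
gen = (n * n for n in range(3))

try:
    len(gen)                      # generators do not define a length
except TypeError as exc:
    error_message = str(exc)

count = len(list(gen))            # materialize first, then measure

print(error_message)
print(count)
```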

Sequences (and even some iterables) may support `max` and `min` as long as the data types in the collection can be **ordered** in some sense (`<` or `>`).

In [14]:
a = [100, 300, 200]
min(a), max(a)

(100, 300)

In [15]:
s = 'python'
min(s), max(s)

('h', 'y')

In [16]:
s = {'p', 'y', 't', 'h', 'o', 'n'}
min(s), max(s)

('h', 'y')

In [18]:
# But if the elements do not have an ordering defined:
a = [1+1j, 2+2j, 3+3j]
min(a)

TypeError: '<' not supported between instances of 'complex' and 'complex'

`min` and `max` will work for heterogeneous types as long as the elements are pairwise comparable (`<` or `>` is defined). 

For example:

In [19]:
from decimal import Decimal
t = 10, 20.5, Decimal('30.5')
min(t), max(t)

(10, Decimal('30.5'))

In [20]:
t = ['a', 10, 1000]
min(t)

TypeError: '<' not supported between instances of 'int' and 'str'

Even `range` objects support `min` and `max`:

In [21]:
r = range(10, 200)
min(r), max(r)

(10, 199)

#### Concatenation
We can **concatenate** sequences using the `+` operator:

In [22]:
[1, 2, 3] + [4, 5, 6]

[1, 2, 3, 4, 5, 6]

In [23]:
(1, 2, 3) + (4, 5, 6)

(1, 2, 3, 4, 5, 6)

Note that the type of the concatenated result is the same as the type of the sequences being concatenated, so concatenating sequences of varying types will not work:

In [24]:
(1, 2, 3) + [4, 5, 6]

TypeError: can only concatenate tuple (not "list") to tuple

In [25]:
'abc' + ['d', 'e', 'f']

TypeError: can only concatenate str (not "list") to str

Note: if you really want to concatenate varying types you'll have to transform them to a common type first:

In [26]:
(1, 2, 3) + tuple([4, 5, 6])

(1, 2, 3, 4, 5, 6)

In [27]:
tuple('abc') + ('d', 'e', 'f')

('a', 'b', 'c', 'd', 'e', 'f')

In [28]:
''.join(tuple('abc') + ('d', 'e', 'f'))

'abcdef'
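An alternative to converting explicitly is iterable unpacking, assuming Python 3.5 or later. This is a different technique from the `+` operator shown above, sketched here with illustrative names; the type of the result is decided by the enclosing brackets rather than by the operands:

```python
t = (1, 2, 3)
l = [4, 5, 6]

as_tuple = (*t, *l)               # unpack both into a new tuple
as_list = [*t, *l]                # or into a new list
as_str = ''.join(['abc', *['d', 'e', 'f']])

print(as_tuple)
print(as_list)
print(as_str)
```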

#### Repetition
Most sequence types also support **repetition**, which is essentially concatenating the same sequence an integer number of times:

In [29]:
'abc' * 5

'abcabcabcabcabc'

In [30]:
[1, 2, 3] * 5

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
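A few edge cases of repetition, as a short sketch with illustrative names: the integer may appear on either side, a count of zero or less gives an empty sequence, and `range` is one sequence type that does not support repetition.

```python
doubled = 2 * (1, 2)              # the integer can come first
empty_str = 'abc' * 0             # zero repetitions give an empty string
empty_list = [1, 2] * -1          # negative counts behave like zero

try:
    range(3) * 2                  # ranges do not support repetition
except TypeError as exc:
    error_message = str(exc)

print(doubled, repr(empty_str), empty_list)
print(error_message)
```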

We'll come back to some caveats of concatenation and repetition in a bit.

#### Finding things in Sequences
We can find the index of the first occurrence of an element in a sequence, optionally starting the search at a given index:

In [1]:
s = "gnu's not unix"

In [2]:
s.index('n')

1

In [7]:
s.index('n', 1), s.index('n', 2), s.index('n', 7)

(1, 6, 11)

In [8]:
s.index('n', 13)

ValueError: substring not found
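Building on the start argument, the following sketch collects every position of an element by calling `index` repeatedly and stopping at the `ValueError`. The helper `find_all` is hypothetical, not part of the standard library:

```python
def find_all(seq, value):
    """Return every index at which value occurs in seq (hypothetical helper)."""
    positions = []
    start = 0
    while True:
        try:
            pos = seq.index(value, start)   # search from `start` onwards
        except ValueError:
            return positions                # no further occurrences
        positions.append(pos)
        start = pos + 1                     # resume just after the match


in_string = find_all("gnu's not unix", 'n')
in_list = find_all([1, 2, 1, 3, 1], 1)
print(in_string)
print(in_list)
```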

Note that these methods of finding objects in sequences do not assume that the objects in the sequence are ordered in any way. These are basically searches that iterate over the sequence until they find (or not) the requested element.

If you have a sorted sequence, then other search techniques are available - such as binary searches. I'll cover some of these topics in the extras section of this course.
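As a rough preview of what a binary search could look like, here is a sketch using the standard library's `bisect` module. The function `index_sorted` is a hypothetical helper, and it assumes the sequence is already sorted in ascending order:

```python
from bisect import bisect_left


def index_sorted(sorted_seq, value):
    """Locate value in an ascending sorted sequence (hypothetical helper)."""
    pos = bisect_left(sorted_seq, value)    # leftmost insertion point
    if pos < len(sorted_seq) and sorted_seq[pos] == value:
        return pos
    raise ValueError(f'{value!r} is not in the sequence')


data = [10, 20, 30, 40]
found_at = index_sorted(data, 30)

try:
    index_sorted(data, 25)
except ValueError as exc:
    error_message = str(exc)

print(found_at)
print(error_message)
```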


#### Slicing
We'll come back to slicing in a later lecture, but sequence types generally support slicing, even ranges (as of Python 3.2). Just like concatenation, slices will return the same type as the sequence being sliced:

In [9]:
s = 'python'
l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [10]:
s[0:3], s[4:6]

('pyt', 'on')

In [11]:
l[0:3], l[4:6]

([1, 2, 3], [5, 6])

It's ok for the slice bounds to extend past the bounds of the sequence:

In [12]:
s[4:1000]

'on'
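This leniency applies to slices only. A brief sketch with illustrative names: an out-of-range slice is simply clipped (possibly to an empty result), whereas an out-of-range index raises an `IndexError`.

```python
s = 'python'

tail = s[4:1000]          # slice bounds are clipped to the sequence
nothing = s[100:200]      # a slice entirely out of range is just empty

try:
    s[1000]               # a single out-of-range index is an error
except IndexError as exc:
    error_message = str(exc)

print(repr(tail), repr(nothing))
print(error_message)
```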

If your first argument in the slice is `0`, you can even omit it. Omitting the second argument means it will include all the remaining elements:

In [13]:
s[0:3], s[:3]

('pyt', 'pyt')

In [14]:
s[3:1000], s[3:], s[:]

('hon', 'hon', 'python')

We can even have extended slicing, which provides a start, stop and a step:

In [15]:
s, s[0:5], s[0:5:2]

('python', 'pytho', 'pto')

In [16]:
s, s[::2]

('python', 'pto')

Technically we can also use negative values in slices, including extended slices (more on that later):

In [17]:
s, s[-3:-1], s[::-1]

('python', 'ho', 'nohtyp')

In [18]:
r = range(11)  # numbers from 0 to 10 (inclusive)

In [19]:
print(r)
print(list(r))

range(0, 11)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


In [20]:
print(r[:5])

range(0, 5)


In [21]:
print(list(r[:5]))

[0, 1, 2, 3, 4]


As you can see, slicing a range returns a range object as well, as expected.
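Extended slices appear to behave the same way on ranges. The sketch below (variable names are illustrative) slices a range with a step and with a negative step; in each case the result is again a range, which we convert to a list only to display it:

```python
r = range(11)

evens = r[::2]            # every second value
backwards = r[::-1]       # reversed
middle = r[3:8]

print(type(evens).__name__, list(evens))
print(list(backwards))
print(list(middle))
```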

#### Hashing
Immutable sequences generally support hashing via the built-in `hash` function, which we'll discuss in detail in the section on mapping types:

In [22]:
l = (1, 2, 3)
hash(l)

529344067295497451

In [23]:
s = '123'
hash(s)

6817638206448553287

In [27]:
hash('123')

6817638206448553287

In [28]:
r = range(10)
hash(r)

-7546101314042312252

But mutable sequences (and mutable types in general) do not:

In [30]:
l = [1, 2, 3]
hash(l)

TypeError: unhashable type: 'list'

In [31]:
t = (1, 2, [10, 20])
hash(t)

TypeError: unhashable type: 'list'

But this would work:

In [32]:
t = ('python', (1, 2, 3))
hash(t)

-9039585191953443576

In general, immutable types are likely hashable, while mutable types are not. So numbers, strings, tuples, etc. are hashable, but lists and sets are not:

In [33]:
from decimal import Decimal
d = Decimal(10.5)
hash(d)

1152921504606846986

In [35]:
# Sets are not hashable:
s = {1, 2, 3}
hash(s)

TypeError: unhashable type: 'set'

In [37]:
# But frozensets, an immutable variant of the set, are:
s = frozenset({1, 2, 3})
hash(s)

-272375401224217160
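One practical consequence, sketched here ahead of the mapping types section, is that hashable objects can serve as dictionary keys and set members while unhashable ones cannot. The names `points` and `seen` are illustrative:

```python
points = {(0, 0): 'origin', (1, 0): 'unit x'}     # tuples as dictionary keys
seen = {frozenset({1, 2}), frozenset({2, 1})}     # equal frozensets collapse to one

try:
    points[[0, 1]] = 'list key'                   # lists are unhashable
except TypeError as exc:
    error_message = str(exc)

print(points[(0, 0)], len(seen))
print(error_message)
```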

#### Caveats with Concatenation and Repetition

In [38]:
# Consider this:
x = [2000]
id(x[0])

4578647888

In [39]:
l = x + x
l

[2000, 2000]

In [40]:
id(l[0]), id(l[1])

(4578647888, 4578647888)

As expected, the objects in `l[0]` and `l[1]` are the same.

We could also check this with the `is` operator:

In [41]:
l[0] is l[1]

True

This is not a big deal if the objects being concatenated are immutable. But if they are mutable:



In [45]:
x = [ [0, 0] ]
l = x + x
l

[[0, 0], [0, 0]]

In [46]:
l[0] is l[1]

True

In [48]:
l[0][0] = 100
l[0]

[100, 0]

In [49]:
l

[[100, 0], [100, 0]]

Notice how changing the first item of the first element also changed the first item of the second element, because both elements are the same list object.

While this seems fairly obvious when concatenating using the `+` operator as we have just done, the same actually happens with repetition and may not seem so obvious:


In [50]:
x = [ [0, 0] ]
m = x * 3
m

[[0, 0], [0, 0], [0, 0]]

In [51]:
m[0][0] = 100
m

[[100, 0], [100, 0], [100, 0]]

And in fact, even `x` changed:

In [52]:
x

[[100, 0]]

If you really want these repeated objects to be different objects, you'll have to copy them somehow. A simple list comprehension would work well here:

In [54]:
x = [ [0, 0] ]
m = [e.copy() for e in x*3]
m

[[0, 0], [0, 0], [0, 0]]

In [56]:
m[0][0] = 100
m

[[100, 0], [0, 0], [0, 0]]

In [57]:
x

[[0, 0]]
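Two related variations, offered as a sketch with illustrative names. Building each inner list inside the comprehension avoids the shared object altogether, and `copy.deepcopy` may be needed when the elements themselves contain mutable objects, since `.copy()` is only a shallow copy:

```python
from copy import deepcopy

# Create a fresh inner list on every iteration instead of repeating one.
rows = [[0, 0] for _ in range(3)]
rows[0][0] = 100

# With deeper nesting, a shallow copy still shares the innermost list.
nested = [[[0], 0]]
shallow = [e.copy() for e in nested * 2]
shallow[0][0][0] = 100
shared_value = shallow[1][0][0]        # also changed: the inner [0] is shared

# A deep copy duplicates every level, so the elements are independent.
nested_again = [[[0], 0]]
deep = [deepcopy(e) for e in nested_again * 2]
deep[0][0][0] = 100
independent_value = deep[1][0][0]      # unchanged

print(rows)
print(shared_value, independent_value)
```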