## Tuples

_Tuples_ are an immutable sequence type. Items keep their order and can be accessed by index.

In [34]:
tuple(())

()

In [35]:
type((1)) # pay attention here!

int

In [36]:
type((1,))

tuple

In [37]:
point = 0, 1

In [38]:
x, y = point

In [39]:
date = "April", 2

In [40]:
point = (5, 10)
_, y = point
y

10

In [41]:
point[0]

5

⚠️ _Searching for an item in a large tuple is slow: `in`, `index` and `count` check the items one by one (O(n))._

In [42]:
animals = ("tiger", 2, "cat", 3, "wolf", 1)

In [43]:
animals.index("cat")

2

In [44]:
animals.count("wolf")

1
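The warning above can be illustrated with a minimal sketch. The names `haystack` and `lookup` are made up for this example, and converting to a `set` only pays off when the items are hashable and many membership tests run against the same data.

```python
# Hypothetical data, for illustration only.
haystack = tuple(range(100_000))

# `in`, `index` and `count` on a tuple scan the items one by one: O(n).
found_by_scan = 99_999 in haystack
position = haystack.index(99_999)

# A set built once from the tuple answers membership tests in O(1) on average.
lookup = set(haystack)
found_by_hash = 99_999 in lookup

print(found_by_scan, position, found_by_hash)
```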

ℹ️ _Parentheses are not required, but they can improve readability:_

In [45]:
# Bad
xs = 42,
x, = xs
x

42

In [46]:
# Good
xs = (42, )
[x] = xs
x

42

### Slices

In [47]:
person = ("Guido", "van Rossum", "January", 31, 1956)

In [48]:
name, birthday = person[:2], person[2:]

In [49]:
name

('Guido', 'van Rossum')

In [50]:
birthday

('January', 31, 1956)

### Named Slices 💡

In [51]:
NAME, BIRTHDAY = slice(2), slice(2, None)

In [52]:
person[NAME]

('Guido', 'van Rossum')

In [53]:
person[BIRTHDAY]

('January', 31, 1956)
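Named slices work on any sequence, including strings. The sketch below assumes a made-up fixed-width record layout (10 characters of name, 4 of year, 3 of country code); the layout and the names `YEAR` and `COUNTRY` are illustrative only.

```python
# Build a hypothetical fixed-width record: name (10) + year (4) + country (3).
record = "Guido".ljust(10) + "1956" + "NLD"

NAME, YEAR, COUNTRY = slice(0, 10), slice(10, 14), slice(14, 17)

# The slice names document which part of the record is being read.
print(record[NAME].strip(), int(record[YEAR]), record[COUNTRY])
```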

### Reversing

[Why is reversing a list with slicing slower than reverse iterator](https://stackoverflow.com/questions/16465046/why-is-reversing-a-list-with-slicing-slower-than-reverse-iterator)

In [54]:
tuple(reversed((1, 2, 3)))

(3, 2, 1)

In [55]:
(1, 2, 3)[::-1]

(3, 2, 1)
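The two approaches differ in when the work happens. As a rough sketch: `reversed()` returns a lazy iterator, while slicing builds a new tuple immediately.

```python
xs = (1, 2, 3)

it = reversed(xs)       # lazy iterator; nothing is copied yet
first = next(it)        # the last element comes out first
print(first)

total = 0
for x in reversed(xs):  # typical use: walk backwards without making a copy
    total += x

copy = xs[::-1]         # slicing creates a new reversed tuple right away
print(copy, type(copy).__name__)
```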

### Concatenation

In [56]:
xs, ys = (1, 2), (3, )

In [57]:
id(xs), id(ys)

(140039042992256, 140038885582832)

In [58]:
# Tuples are copied when we concatenate them
id(xs + ys)

140038488224064

### Comparison

ℹ️ _Tuples and lists are **compared lexicographically** using comparison of corresponding elements._

In [59]:
(1, 2, 3) < (1, 2, 4)

True

In [60]:
(1, 2, 3, 4) < (1, 2, 4)

True

In [61]:
(1, 2) < (1, 2, 42)

True

In [62]:
(None, 1) < (2, 4)

TypeError: '<' not supported between instances of 'NoneType' and 'int'
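Lexicographic comparison is what makes sorting sequences of tuples useful. A minimal sketch with made-up `(priority, name)` records:

```python
# Hypothetical (priority, name) records, used only for illustration.
tasks = [(2, "write tests"), (1, "fix bug"), (2, "deploy"), (1, "add docs")]

# Tuples compare element by element, so sorting orders by priority first
# and breaks ties by name.
ordered = sorted(tasks)
print(ordered)

# max() follows the same rule: highest priority, then the greatest name.
latest = max(tasks)
print(latest)
```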

## collections.namedtuple

In [63]:
from collections import namedtuple

In [64]:
Person = namedtuple("Person", ["name", "age"])

In [65]:
p = Person("Guido", age=64)

In [66]:
p._fields

('name', 'age')

In [67]:
p._asdict()

{'name': 'Guido', 'age': 64}

In [68]:
p._replace(name="Andrey")

Person(name='Andrey', age=64)

## Lists

In [69]:
[1, 2, 3, 4]

[1, 2, 3, 4]

In [70]:
["a", "b", "c", "d"]

['a', 'b', 'c', 'd']

### Initialization

In [71]:
[0] * 2

[0, 0]

In [72]:
[""] * 2

['', '']

⚠️ _Initial value is not copied:_

In [73]:
chunks = [[0]] * 2  # matrix 2x1 with zero values

In [74]:
chunks

[[0], [0]]

In [75]:
chunks[0][0] = 42

In [76]:
chunks

[[42], [42]]

ℹ️ _This problem can pop up in a tech interview._

A better way is to **use a list comprehension**:

In [77]:
chunks = [[0] for _ in range(2)]

In [78]:
chunks

[[0], [0]]

In [79]:
chunks[0][0] = 42

In [80]:
chunks

[[42], [0]]
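The same pitfall shows up when building a two-dimensional grid. A small sketch, with `rows` and `cols` chosen arbitrarily:

```python
rows, cols = 2, 3  # illustrative sizes

# Wrong: the outer `*` repeats a reference to the same inner list.
shared = [[0] * cols] * rows
# Right: the comprehension builds a fresh inner list for every row.
grid = [[0] * cols for _ in range(rows)]

shared[0][0] = 1
grid[0][0] = 1

print(shared)  # both rows changed
print(grid)    # only the first row changed
print(shared[0] is shared[1], grid[0] is grid[1])
```

The inner `[0] * cols` is safe because integers are immutable; only the repetition of the mutable inner list causes sharing.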

### Appending and extending

In [81]:
xs = [1, 2, 3]

In [82]:
xs.append(42)
xs

[1, 2, 3, 42]

In [83]:
xs.extend({-1, -2})  # accepts any iterable; the iteration order of a set is arbitrary
xs

[1, 2, 3, 42, -1, -2]

### Insertion

`insert` is a relatively slow operation (O(n)), because every item after the insertion point has to be shifted. In practice, it is often better to append items to the end of the list and sort the whole container afterwards.

In [84]:
xs = [1, 2, 3]

In [85]:
xs.insert(0, 4)  # where 0 is the index and 4 is the value
xs

[4, 1, 2, 3]

In [86]:
xs.insert(-1, 42)
xs

[4, 1, 2, 42, 3]
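The append-then-sort advice can be sketched as follows. The second half uses `bisect.insort` from the standard library as an alternative for when the list must stay sorted after every insertion; that is an addition for comparison, not something the text above prescribes.

```python
import bisect

incoming = [5, 1, 4, 2, 3]  # hypothetical unsorted input

# Option 1: append everything (amortized O(1) each), then sort once.
batch = []
for value in incoming:
    batch.append(value)
batch.sort()

# Option 2: keep the list sorted after every insertion.
# bisect.insort finds the position in O(log n), but the insertion itself
# still shifts items, so each call is O(n).
always_sorted = []
for value in incoming:
    bisect.insort(always_sorted, value)

print(batch, always_sorted)
```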

### Changing sublist in-place

In [87]:
xs = [1, 2, 3]
xs[:2] = [0] * 2
xs

[0, 0, 3]

### Concatenation

In [88]:
xs, ys = [1, 2], [3]

In [89]:
id(xs), id(ys)

(140038488312704, 140038488355840)

In [90]:
id(xs + ys)

140038488325376

_In-place_ concatenation:

In [91]:
xs += ys  # equivalent to xs.extend(ys); the same list object is kept
id(xs)

140038488312704

#### Reasons against `+=`

Example #1:

In [92]:
xs = []
def f():
    xs += [42]

In [93]:
f()

UnboundLocalError: local variable 'xs' referenced before assignment

Example #2:

In [94]:
xs = []
xs += "abc"

In [95]:
xs

['a', 'b', 'c']

`+=` on a list behaves like `extend`, which accepts any iterable object. That is why this works, even though `xs + "abc"` would raise a `TypeError`.
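Both surprises can be reproduced and avoided in a few lines. This is a sketch; the function name `add_with_extend` is made up for the example.

```python
xs = []

def add_with_extend():
    # A method call does not rebind the name, so the outer list is
    # found and mutated without an UnboundLocalError.
    xs.extend([42])

add_with_extend()
print(xs)

# `+` is stricter than `+=`: it only concatenates a list with another list.
try:
    xs + "abc"
except TypeError as exc:
    error = str(exc)
print(error)
```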

### Removal

In [96]:
xs = [1, 2, 3]

In [97]:
del xs[:2]  # del also works with slices
xs

[3]

In [98]:
xs = [1, 2, 3]
del xs[:]
xs

[]

In [99]:
xs = [1, 2, 3]
xs.pop(1)  # returns removed value

2

In [100]:
xs

[1, 3]

In [101]:
xs = [1, 1, 0]
xs.remove(1)  # removes the first occurrence of the value
xs

[1, 0]

### Reversing

In [102]:
list(reversed([1, 2, 3]))

[3, 2, 1]

In [103]:
xs = [1, 2, 3]
xs.reverse()  # in-place operation, returns None
xs

[3, 2, 1]

### Sorting

[Timsort description (timsort.txt)](https://bugs.python.org/file4451/timsort.txt)

In [104]:
xs = [3, 2, 1]
sorted(xs), xs

([1, 2, 3], [3, 2, 1])

In [105]:
xs.sort()  # in-place operation, returns None
xs

[1, 2, 3]

In [106]:
xs = [3, 2, 1]
xs.sort(key=lambda x: x % 2, reverse=True)
xs

[3, 1, 2]

### Stack, Queue

In [107]:
stack = []
stack.append(1)
stack.append(2)
stack

[1, 2]

In [108]:
q = []
q.append(1)
q.append(2)
q.pop(0)  # warn: O(n), shifts all remaining items!
q

[2]

### Deque

In [109]:
from collections import deque

In [110]:
q = deque([1, 2, 3])

In [111]:
q.appendleft(0)
q

deque([0, 1, 2, 3])

In [112]:
q.append(4)
q

deque([0, 1, 2, 3, 4])

In [113]:
q.popleft()

0

In [114]:
q[0]

1

---

In [115]:
q = deque([1, 2], maxlen=2)

In [116]:
q.appendleft(0)
q

deque([0, 1])

In [117]:
q.append(2)
q

deque([1, 2])
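A bounded deque is a convenient way to keep only the most recent items. A minimal sketch with made-up readings:

```python
from collections import deque

# Keep only the three most recent readings (the values are made up).
recent = deque(maxlen=3)
for reading in [10, 20, 30, 40, 50]:
    recent.append(reading)  # once full, the oldest item falls off the left

print(list(recent))
average = sum(recent) / len(recent)
print(average)
```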

## Sets

A set is an **unordered** collection **without duplicate elements**. It supports very fast membership tests (O(1) on average), but it doesn't support indexing.

In [118]:
letters = set('somerandomstringwithrepeatableletters')
letters

{'a',
 'b',
 'd',
 'e',
 'g',
 'h',
 'i',
 'l',
 'm',
 'n',
 'o',
 'p',
 'r',
 's',
 't',
 'w'}

In [119]:
xs, ys, zs = {1, 2}, {2, 3}, {3, 4}

In [120]:
set.union(xs, ys, zs)  # xs | ys | zs

{1, 2, 3, 4}

In [121]:
set.intersection(xs, ys, zs)  # xs & ys & zs

set()

In [122]:
set.difference(xs, ys, zs)  # xs - ys - zs

{1}

In [123]:
xs.isdisjoint(ys)

False

In [124]:
xs <= ys  # xs ⊆ ys

False

In [125]:
xs < xs  # xs ⊂ xs

False

In [126]:
xs | ys >= xs  # xs ∪ ys ⊇ xs

True

### Adding to a set and updating

In [127]:
seen = set()
seen.add(42)  # adds a single element to the set
seen

{42}

In [128]:
seen.update([1, 2])  # adds every element of an iterable to the set
seen

{1, 2, 42}

In [129]:
seen.update([], [1], [2], [3])
seen

{1, 2, 3, 42}
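A common use of `add` together with fast membership tests is removing duplicates while keeping the original order. The helper name `unique_in_order` is made up for this sketch, and it assumes that all items are hashable.

```python
def unique_in_order(items):
    """Return the items with duplicates dropped, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:  # O(1) on average thanks to hashing
            seen.add(item)
            result.append(item)
    return result

print(unique_in_order([3, 1, 3, 2, 1]))
```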

### Removal

In [130]:
seen = {1, 2, 3}
seen.remove(3)
seen

{1, 2}

In [131]:
seen.remove(100500)

KeyError: 100500

In [132]:
seen.discard(100500)
seen

{1, 2}

In [133]:
seen.clear()
seen

set()

### frozenset

ℹ️ Sets in Python are hash sets, which means they can contain only hashable objects.

In [134]:
{set(), set()}

TypeError: unhashable type: 'set'

In [135]:
{frozenset(), frozenset()}

{frozenset()}

`frozenset` supports all set operations except those that mutate the set (such as `add`, `remove`, `discard`, `update` and `clear`).
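Because a `frozenset` is hashable, it can also serve as a dictionary key. The sketch below assumes a made-up use case of undirected edges, where the order of the endpoints should not matter.

```python
# Hypothetical example: undirected edges, where (a, b) and (b, a) are the same.
weights = {}
weights[frozenset({"a", "b"})] = 7

# The key does not depend on the order of the endpoints.
print(weights[frozenset({"b", "a"})])

# Non-mutating set operations return a new frozenset.
merged = frozenset({1, 2}) | frozenset({2, 3})
print(merged == frozenset({1, 2, 3}), hasattr(merged, "add"))
```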

## Dictionaries

Sets are **not ordered**: they do not guarantee the order in which elements are produced during iteration. Dicts, in contrast, preserve insertion order.

_Insertion order is a language guarantee since Python 3.7 (in CPython 3.6 it was an implementation detail)._

* [Are dictionaries ordered in Python 3.6+?](https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6?rq=1)
* [[Python-Dev] Guarantee ordered dict literals in v3.7?](https://mail.python.org/pipermail/python-dev/2017-December/151283.html)
* [[Python-Dev] More compact dictionaries with faster iteration](https://mail.python.org/pipermail/python-dev/2012-December/123028.html)
* [[Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered](https://mail.python.org/pipermail/python-dev/2016-September/146327.html)
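A minimal sketch of what insertion-order preservation means in practice, assuming Python 3.7 or newer:

```python
d = {}
d["b"] = 1
d["a"] = 2
d["c"] = 3
first = list(d)   # insertion order, not sorted order
print(first)

d["b"] = 10       # updating an existing key keeps its position
second = list(d)
print(second)

del d["b"]
d["b"] = 1        # deleting and re-adding moves the key to the end
third = list(d)
print(third)
```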

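⚠️ Dictionary keys must be **hashable**, otherwise a `TypeError: unhashable type` is raised. Built-in immutable types such as strings, numbers and tuples of hashable items qualify.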

ℹ️ Looking up a key stays fast even in a large dictionary (O(1) on average).

In [136]:
type({})

dict

In [137]:
d = dict(foo="bar")

In [138]:
dict(d)  # a (shallow) copy

{'foo': 'bar'}

In [139]:
dict(d, boo="baz")  # copy the dict and add a new key

{'foo': 'bar', 'boo': 'baz'}

In [140]:
dict.fromkeys(["foo", "bar"])

{'foo': None, 'bar': None}

In [141]:
dict.fromkeys("abcd", 0)  # note: any iterable works; a string yields one key per character

{'a': 0, 'b': 0, 'c': 0, 'd': 0}

💥 Dictionary creation with `fromkeys` and mutable objects:

In [142]:
d = dict.fromkeys("abcd", [])
d

{'a': [], 'b': [], 'c': [], 'd': []}

In [143]:
d["a"].append(42)
d

{'a': [42], 'b': [42], 'c': [42], 'd': [42]}

💡 Use a dict comprehension:

In [144]:
d = {ch: [] for ch in "abcd"}
d

{'a': [], 'b': [], 'c': [], 'd': []}

In [145]:
d["a"].append(42)
d

{'a': [42], 'b': [], 'c': [], 'd': []}

### Keys and values

In [146]:
d = dict.fromkeys(["foo", "bar"], 42)
d.keys()

dict_keys(['foo', 'bar'])

In [147]:
d.values()

dict_values([42, 42])

In [148]:
d.items()

dict_items([('foo', 42), ('bar', 42)])

In [149]:
len(d.items())

2

In [150]:
42 in d.values()

True

Additionally, keys support some set operations:

In [151]:
d.keys() & {"foo"}

{'foo'}

### Iterating through a dict

In [152]:
{v for v in d.values()}

{42}

⚠️ We can't add or remove keys while iterating over a dict:

In [153]:
for k in d:
    del d[k]

RuntimeError: dictionary changed size during iteration

💡 Workaround:

In [154]:
for k in set(d):
    del d[k]

In [155]:
d

{}
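When only some keys should go, the same idea applies: either iterate over a snapshot of the keys, or build a new dict. A sketch with made-up data:

```python
d = {"foo": 42, "bar": 0, "baz": 7}

# Option 1: iterate over a snapshot of the keys and mutate the original.
for k in list(d):
    if d[k] == 0:
        del d[k]
print(d)

# Option 2: build a new dict and leave the original untouched.
big = {k: v for k, v in d.items() if v > 10}
print(big)
```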

### Getting a value

In [156]:
d = {"foo": "bar"}

In [157]:
d["foo"]

'bar'

In [158]:
d["boo"]

KeyError: 'boo'

In [159]:
d.get("boo", 42)

42

This is shorter and less error-prone than spelling out the check by hand:

In [160]:
if "boo" not in d:
    value = 42
else:
    value = d["boo"]
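A typical place where `get` with a default keeps code short is counting. A minimal sketch:

```python
counts = {}
for ch in "abracadabra":
    # Missing keys start from 0 instead of raising KeyError.
    counts[ch] = counts.get(ch, 0) + 1
print(counts)

# get() never inserts the key it was asked about.
missing = counts.get("z")
print(missing, "z" in counts)
```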

### Adding to a dict and updating

In [161]:
d["fizz"] = "buzz"
d

{'foo': 'bar', 'fizz': 'buzz'}

[setdefault(key[, default])](https://docs.python.org/3/library/stdtypes.html#dict.setdefault)

If _key_ is in the dictionary, return its value. If not, insert _key_ with a value of _default_ and return _default_. _default_ defaults to `None`.

In [162]:
d = {"foo": "bar"}
d.setdefault("foo", "???")

'bar'

In [163]:
d.setdefault("boom", 42)

42

In [164]:
d

{'foo': 'bar', 'boom': 42}
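`setdefault` is handy for grouping, because it returns the stored value either way. A sketch with a made-up word list:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]  # made-up data

by_letter = {}
for word in words:
    # Returns the existing list, or inserts a new empty one and returns it.
    by_letter.setdefault(word[0], []).append(word)
print(by_letter)
```

Note that the default (`[]` here) is created on every call, even when the key already exists.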

---

In [165]:
d = {}
d.update([("foo", "bar")])
d.update(boo=42)
d

{'foo': 'bar', 'boo': 42}

In [166]:
d = {}
# or
d.update([("foo", "bar")], boo=42)
d

{'foo': 'bar', 'boo': 42}

### Removal

In [167]:
del d["boo"]
d

{'foo': 'bar'}

In [168]:
d.pop("foo")

'bar'

In [169]:
d

{}

In [170]:
d["what"] = "?"
d

{'what': '?'}

In [171]:
d.clear()
d

{}

### Merge two dicts

In [172]:
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
x.update(y)
x

{'a': 1, 'b': 3, 'c': 4}

Python 3.5+

In [173]:
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = {**x, **y}
z

{'a': 1, 'b': 3, 'c': 4}
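Python 3.9 added the merge operators `|` and `|=` for dicts (PEP 584). The sketch below assumes an interpreter of that version or newer:

```python
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}

z = x | y   # new dict; values from the right operand win
print(z)

x |= y      # in-place, like x.update(y)
print(x)
```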

## collections.defaultdict

💡 _It can be very helpful for working with graphs._

In [174]:
g = {"a": {"b"}, "b": {"c"}}
g["a"]

{'b'}

How can we add the graph edges `("b", "a")` and `("c", "a")`?

In [175]:
g["b"].add("a")
g["c"].add("a")

KeyError: 'c'

Using `defaultdict`:

In [176]:
from collections import defaultdict

In [177]:
g = defaultdict(set, **{"a": {"b"}, "b": {"c"}})
g

defaultdict(set, {'a': {'b'}, 'b': {'c'}})

In [178]:
g["c"].add("a")

In [179]:
g

defaultdict(set, {'a': {'b'}, 'b': {'c'}, 'c': {'a'}})
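The same approach builds a whole adjacency map from a list of edges. This is a sketch with an illustrative edge list; the last lines show a caveat worth knowing.

```python
from collections import defaultdict

edges = [("a", "b"), ("b", "c"), ("b", "a"), ("c", "a")]  # illustrative edges

g = defaultdict(set)
for src, dst in edges:
    g[src].add(dst)  # a missing key gets a fresh empty set first

print(dict(g))

# Caveat: merely reading a missing key also inserts it.
_ = g["z"]
print("z" in g)
```

Use `g.get("z")` or `"z" in g` when a lookup should not create the key.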

## collections.OrderedDict 🆕

As of Python 3.7, a new improvement is:

> the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.

This means there is rarely a need for `OrderedDict` anymore. They are **almost** the same.

Since Python 3.8, `dict` and its views are also iterable in reverse insertion order using `reversed()`. The remaining differences are `move_to_end`, `popitem(last=False)`, and order-sensitive equality between two `OrderedDict` objects.
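A short sketch of what `OrderedDict` still offers beyond a plain `dict`:

```python
from collections import OrderedDict

od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")              # move a key to the end...
od.move_to_end("c", last=False)  # ...or to the beginning
order = list(od)
print(order)

front = od.popitem(last=False)   # pop from the front; dict.popitem() is LIFO only
print(front)

# Equality between two OrderedDicts is order-sensitive; for dicts it is not.
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
plain_equal = dict(a=1, b=2) == dict(b=2, a=1)
print(ordered_equal, plain_equal)
```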

## collections.Counter

[PyMOTW-3: Counter — Count Hashable Objects](https://pymotw.com/3/collections/counter.html)

> A Counter is a container that keeps track of how many times equivalent values are added. It can be used to implement the same algorithms for which other languages commonly use bag or multiset data structures.

In [180]:
from collections import Counter

In [181]:
c = Counter(["foo", "foo", "foo", "bar"])
c["foo"] + 1  # the result is discarded; the counter itself is unchanged
c

Counter({'foo': 3, 'bar': 1})

`Counter` supports the usual dictionary methods (except `fromkeys`) and adds a few of its own:

In [182]:
c.pop("foo")

3

In [183]:
c["boo"]  # doesn't raise an exception

0

In [184]:
c = Counter(foo=4, bar=-1)
list(c.elements())

['foo', 'foo', 'foo', 'foo']

In [185]:
c.most_common(1)

[('foo', 4)]

In [186]:
c.update(["bar"])
c

Counter({'foo': 4, 'bar': 0})

In [187]:
c.subtract({"foo": 2})
c

Counter({'foo': 2, 'bar': 0})

---

In [188]:
c1 = Counter(foo=4, bar=-1)
c2 = Counter(foo=2,  bar=2)
c1 + c2  # c1[k] + c2[k], keeping only positive counts

Counter({'foo': 6, 'bar': 1})

In [189]:
c1 - c2  # c1[k] - c2[k], keeping only positive counts

Counter({'foo': 2})

In [190]:
c1 & c2  # min(c1[k], c2[k])

Counter({'foo': 2})

In [191]:
c1 | c2  # max(c1[k], c2[k])

Counter({'foo': 4, 'bar': 2})
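A typical use is counting words. A minimal sketch with a made-up sentence:

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the end"  # made-up input
words = Counter(text.split())

top = words.most_common(1)
print(top)

total = sum(words.values())  # total number of words
print(total)

# Unary plus drops zero and negative counts.
positive = +Counter(foo=2, bar=0, baz=-1)
print(positive)
```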

## Arrays

- Numerical data types
- Must all be the same type

vs Lists:

- Store any object
- Items can be of different types

In [192]:
from array import array

scores = array('d')
scores.append(97)
scores.append(98)

In [193]:
scores

array('d', [97.0, 98.0])

In [194]:
scores[1]

98.0
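The type restriction is enforced at run time. A small sketch of what happens when a value does not fit the typecode:

```python
from array import array

scores = array('d', [97, 98])  # 'd' is a C double; ints are converted to float

rejected = False
try:
    scores.append("99")        # a value that is not a number is rejected
except TypeError:
    rejected = True

print(scores.itemsize)         # bytes per item
print(scores.tolist(), rejected)
```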