In [None]:
# encoding=utf8

# Fundamentals of Python


# Built-In Data Structures

We have seen Python's simple types: ``int``, ``float``, ``complex``, ``bool``, ``str``, and so on.
Python also has several built-in compound types, which act as containers for other types.
These compound types are:

| Type Name | Example                   |Description                            |
|-----------|---------------------------|---------------------------------------|
| ``list``  | ``[1, 2, 3]``             | Ordered collection                    |
| ``tuple`` | ``(1, 2, 3)``             | Immutable ordered collection          |
| ``dict``  | ``{'a':1, 'b':2, 'c':3}`` | (key, value) mapping                  |
| ``set``   | ``{1, 2, 3}``             | Unordered collection of unique values |

As you can see, round, square, and curly brackets have distinct meanings when it comes to the type of collection produced.
We'll take a quick tour of these data structures here.

## Lists
Lists are the basic *ordered* and *mutable* data collection type in Python.
They can be defined with comma-separated values between square brackets; the cells below build empty lists, lists of strings, and a list of the first several prime numbers:

### Basics

In [45]:
empty_list1 = [ ]
empty_list1,type(empty_list1)

([], list)

In [46]:
empty_list2 = list()
empty_list2,type(empty_list2)

([], list)

In [38]:
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
weekdays

['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

In [54]:
press = ["Springer", "Apress", "清华大学出版社", "电子工业出版社", "наурыз"]
press

['Springer', 'Apress', '清华大学出版社', '电子工业出版社', 'наурыз']

In [35]:
L = [2, 3, 5, 7]

Lists have a number of useful properties and methods available to them.
Here we'll take a quick look at some of the more common and useful ones:

In [2]:
# Length of a list
len(L)

4

In [3]:
# Append a value to the end
L.append(11)
L

[2, 3, 5, 7, 11]

In [4]:
# Addition concatenates lists
L + [13, 17, 19]

[2, 3, 5, 7, 11, 13, 17, 19]

In [5]:
# sort() method sorts in-place
L = [2, 5, 1, 6, 3, 4]
L.sort()
L

[1, 2, 3, 4, 5, 6]

In addition, there are many more built-in list methods; they are well-covered in Python's [online documentation](https://docs.python.org/3/tutorial/datastructures.html).

While we've been demonstrating lists containing values of a single type, one of the powerful features of Python's compound objects is that they can contain objects of *any* type, or even a mix of types. For example:

In [6]:
L = [1, 'two', 3.14, [0, 3, 5]]

This flexibility is a consequence of Python's dynamic type system.
Creating such a mixed sequence in a statically-typed language like C can be much more of a headache!
We see that lists can even contain other lists as elements.
Such type flexibility is an essential piece of what makes Python code relatively quick and easy to write.

So far we've been considering manipulations of lists as a whole; another essential piece is the accessing of individual elements.
This is done in Python via *indexing* and *slicing*, which we'll explore next.

### List indexing and slicing
Python provides access to elements in compound types through *indexing* for single elements, and *slicing* for multiple elements.
As we'll see, both are indicated by a square-bracket syntax.
Suppose we return to our list of the first several primes:

In [7]:
L = [2, 3, 5, 7, 11]

Python uses *zero-based* indexing, so we can access the first and second elements using the following syntax:

In [8]:
L[0]

2

In [9]:
L[1]

3

Elements at the end of the list can be accessed with negative numbers, starting from -1:

In [10]:
L[-1]

11

In [11]:
L[-2]

7

Where *indexing* is a means of fetching a single value from the list, *slicing* is a means of accessing multiple values in sub-lists.
It uses a colon to indicate the start point (inclusive) and end point (non-inclusive) of the sub-list.
For example, to get the first three elements of the list, we can write:

In [12]:
L[0:3]

[2, 3, 5]

The slice starts at index ``0`` and stops just before index ``3``, so it contains the elements at indices ``0``, ``1``, and ``2``.
If we leave out the first index, ``0`` is assumed, so we can equivalently write:

In [13]:
L[:3]

[2, 3, 5]

Similarly, if we leave out the last index, it defaults to the length of the list.
Thus, the last three elements can be accessed as follows:

In [14]:
L[-3:]

[5, 7, 11]

Finally, it is possible to specify a third integer that represents the step size; for example, to select every second element of the list, we can write:

In [15]:
L[::2]  # equivalent to L[0:len(L):2]

[2, 5, 11]

A particularly useful version of this is to specify a negative step, which will reverse the list:

L[::-1]
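As a brief recap of the slicing rules above, here is a minimal sketch. The list name `primes` is an assumption made for this example and is not part of the original cells. It shows the default start, stop, and step values, and that a slice of a list is a new list rather than a view of the original.

```python
# Illustrative list; the name `primes` is chosen only for this sketch.
primes = [2, 3, 5, 7, 11]

first_three = primes[:3]       # start defaults to 0
last_three = primes[-3:]       # stop defaults to len(primes)
every_other = primes[::2]      # step of 2
reversed_copy = primes[::-1]   # negative step walks backwards

# A slice is a new list, so changing it leaves the original untouched.
first_three[0] = 99

print(first_three)    # [99, 3, 5]
print(primes)         # [2, 3, 5, 7, 11]
print(reversed_copy)  # [11, 7, 5, 3, 2]
```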

### Manipulating lists

Both indexing and slicing can be used to set elements as well as access them.
The syntax is as you would expect:

In [61]:
L = [100, 3, 5, 7]
L

[100, 3, 5, 7]

In [64]:
L[0] = 101
print(L)

[101, 3, 5, 7]


In [18]:
# Slice assignment, shown on a separate list so that L is left unchanged
M = [100, 3, 5, 7, 11]
M[1:3] = [55, 56]
print(M)

[100, 55, 56, 7, 11]
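Slice assignment does not have to preserve the length of the list. The following sketch, which uses a hypothetical list named `values`, shows a slice being replaced by a longer sequence and a slice being deleted by assigning an empty list.

```python
# Hypothetical list used only for this illustration.
values = [100, 3, 5, 7, 11]

# Replace two elements with three: the list grows by one.
values[1:3] = [55, 56, 57]
print(values)  # [100, 55, 56, 57, 7, 11]

# Assigning an empty list to a slice deletes that range.
values[1:4] = []
print(values)  # [100, 7, 11]
```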


In [65]:
L.append(110)
L

[101, 3, 5, 7, 110]

In [66]:
L.extend(range(10))
L

[101, 3, 5, 7, 110, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [67]:
L += range(3)
L

[101, 3, 5, 7, 110, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

In [70]:
L.append([1,2,3,4,5,6,7])
L

[101,
 3,
 5,
 7,
 110,
 0,
 1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 0,
 1,
 2,
 [1, 2, 3, 4, 5, 6, 7]]
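The cells above mix ``append``, ``extend``, and ``+=``, and the difference is easy to miss. A small sketch with throwaway lists (`a`, `b`, and `c` are names assumed for this example) may make it clearer: ``append`` adds its argument as a single element, while ``extend`` and ``+=`` add each item of an iterable.

```python
a = [1, 2]
a.append([3, 4])   # adds ONE element, which is itself a list
print(a, len(a))   # [1, 2, [3, 4]] 3

b = [1, 2]
b.extend([3, 4])   # adds each element of the iterable
print(b, len(b))   # [1, 2, 3, 4] 4

c = [1, 2]
c += "xy"          # += behaves like extend, so any iterable works
print(c)           # [1, 2, 'x', 'y']
```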

In [71]:
L.insert(0,0)
L

[0,
 101,
 3,
 5,
 7,
 110,
 0,
 1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 0,
 1,
 2,
 [1, 2, 3, 4, 5, 6, 7]]

In [72]:
del L[-1]
L

[0, 101, 3, 5, 7, 110, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

In [73]:
L.remove(0)
L

[101, 3, 5, 7, 110, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]

In [79]:
L.pop()

2

In [80]:
L

[101, 3, 5, 7, 110, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]

In [81]:
L.index(0)

5

In [82]:
0 in L

True

In [83]:
L.count(0)

2
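The removal and lookup operations above differ in what they take and what they return. The sketch below, using a hypothetical list named `items`, puts them side by side and shows that ``remove()`` and ``index()`` raise ``ValueError`` when the value is absent.

```python
# Hypothetical list used only for this illustration.
items = ['a', 'b', 'c', 'b', 'd']

del items[0]             # delete by position
items.remove('b')        # delete the first matching value only
last = items.pop()       # remove and return the last element
print(items, last)       # ['c', 'b'] d

# A membership test with `in` is a common guard before remove().
if 'z' in items:
    items.remove('z')

# index() raises ValueError for a missing value.
try:
    items.index('z')
except ValueError as err:
    print('not found:', err)
```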

In [84]:
sorted(L)

[0, 0, 1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 9, 101, 110]

In [85]:
L

[101, 3, 5, 7, 110, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]

In [86]:
L.sort()

In [87]:
L

[0, 0, 1, 1, 2, 3, 3, 4, 5, 5, 6, 7, 7, 8, 9, 101, 110]

In [88]:
L.sort(reverse=True)

In [89]:
L

[110, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0]
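Both ``sorted()`` and ``list.sort()`` accept a ``key`` function as well as ``reverse``. The sketch below uses a made-up list of words to show the difference between the two: ``sorted()`` returns a new list, whereas ``sort()`` works in place and returns ``None``.

```python
# Made-up data for illustration.
words = ['pear', 'Fig', 'banana', 'kiwi']

by_length = sorted(words, key=len)   # new list, shortest first (stable)
print(by_length)   # ['Fig', 'pear', 'kiwi', 'banana']

words.sort(key=str.lower)            # in place, case-insensitive
print(words)       # ['banana', 'Fig', 'kiwi', 'pear']

# list.sort() returns None, so its result should not be assigned.
result = words.sort()
print(result)      # None
```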

In [90]:
len(L)

17

In [92]:
L1 = L
L1 

[110, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0]

In [93]:
L1[0] = 111
L1,L

([111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0],
 [111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0])

In [94]:
L2 = L.copy()
L2

[111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0]

In [95]:
L2[0] = 112
L2,L

([112, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0],
 [111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0])

In [96]:
L3 = list(L)
L3

[111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0]

In [97]:
L3[0] = 112
L3,L

([112, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0],
 [111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0])

In [98]:
L4 = L[:]
L4

[111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0]

In [99]:
L4[0] = 112
L4,L

([112, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0],
 [111, 101, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 1, 1, 0, 0])
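The cells above show that plain assignment creates an alias, while ``copy()``, ``list()``, and ``[:]`` create a new list. All three of those copies are *shallow*: nested lists are still shared. The following sketch, with a hypothetical nested list named `grid`, contrasts an alias, a shallow copy, and ``copy.deepcopy`` from the standard library.

```python
import copy

# Hypothetical nested list used only for this illustration.
grid = [[1, 2], [3, 4]]

alias = grid                 # same object, no copy at all
shallow = grid.copy()        # new outer list, shared inner lists
deep = copy.deepcopy(grid)   # fully independent copy

grid[0][0] = 99

print(alias[0][0])    # 99 (alias is the same list)
print(shallow[0][0])  # 99 (the inner list is shared)
print(deep[0][0])     # 1  (the inner list was copied too)
print(alias is grid, shallow is grid)  # True False
```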

### Conversion from other Python types to list
+ To list from string
+ To list from tuple

In [47]:
list('cat')

['c', 'a', 't']

In [48]:
a_tuple = ('ready', 'fire', 'aim')
a_tuple

('ready', 'fire', 'aim')

In [49]:
list(a_tuple)

['ready', 'fire', 'aim']

In [51]:
birthday = '1/10/1949'
birthday

'1/10/1949'

In [52]:
birthday.split('/')

['1', '10', '1949']
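Note that ``split()`` returns a list of strings. If numbers are needed, each piece can be converted explicitly, as in this small sketch (the names `parts` and `rejoined` are chosen just for this example):

```python
birthday = '1/10/1949'

# Convert each string piece to an int.
parts = [int(piece) for piece in birthday.split('/')]
print(parts)      # [1, 10, 1949]

# str.join() is the inverse operation: it needs strings, not ints.
rejoined = '/'.join(str(number) for number in parts)
print(rejoined)   # 1/10/1949
```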

### Lists of Lists

In [55]:
small_birds = ['hummingbird', 'finch']
small_birds

['hummingbird', 'finch']

In [56]:
extinct_birds = ['dodo', 'passenger pigeon', 'Norwegian Blue']
extinct_birds

['dodo', 'passenger pigeon', 'Norwegian Blue']

In [57]:
carol_birds = [3, 'French hens', 2, 'turtledoves']
carol_birds

[3, 'French hens', 2, 'turtledoves']

In [58]:
all_birds = [['hummingbird', 'finch'], ['dodo', 'passenger pigeon', 'Norwegian Blue'], 
             'macaw',[3, 'French hens', 2, 'turtledoves']]
all_birds

[['hummingbird', 'finch'],
 ['dodo', 'passenger pigeon', 'Norwegian Blue'],
 'macaw',
 [3, 'French hens', 2, 'turtledoves']]

In [59]:
all_birds[-1]

[3, 'French hens', 2, 'turtledoves']
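Elements of an inner list can be reached by chaining index operations. The sketch below rebuilds the same `all_birds` list so that it runs on its own:

```python
all_birds = [['hummingbird', 'finch'],
             ['dodo', 'passenger pigeon', 'Norwegian Blue'],
             'macaw',
             [3, 'French hens', 2, 'turtledoves']]

print(all_birds[1][0])    # dodo
print(all_birds[-1][1])   # French hens
print(all_birds[2][0])    # m  (indexing a string gives a character)
```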

### List Comprehensions

In [189]:
[number for number in range(10)]

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [190]:
[number ** 2 for number in range(10) if number%2 == 0 ]

[0, 4, 16, 36, 64]

In [191]:
odds = [number+1 for number in range(10) if number%2 == 0]
odds

[1, 3, 5, 7, 9]

In [192]:
evens = [number for number in range(10) if number%2 == 0]
evens

[0, 2, 4, 6, 8]

In [193]:
[(odd,even) for odd in odds for even in evens]

[(1, 0),
 (1, 2),
 (1, 4),
 (1, 6),
 (1, 8),
 (3, 0),
 (3, 2),
 (3, 4),
 (3, 6),
 (3, 8),
 (5, 0),
 (5, 2),
 (5, 4),
 (5, 6),
 (5, 8),
 (7, 0),
 (7, 2),
 (7, 4),
 (7, 6),
 (7, 8),
 (9, 0),
 (9, 2),
 (9, 4),
 (9, 6),
 (9, 8)]
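A list comprehension is a compact way of writing a loop that builds a list. As a rough guide to reading the examples above, the sketch below writes one comprehension alongside its equivalent explicit loop (the variable names are assumptions for this example), and shows that two ``for`` clauses nest from left to right.

```python
# Comprehension form
squares = [n ** 2 for n in range(10) if n % 2 == 0]

# Equivalent explicit loop
squares_loop = []
for n in range(10):
    if n % 2 == 0:
        squares_loop.append(n ** 2)

print(squares == squares_loop)  # True

# Two `for` clauses behave like nested loops, outer loop first.
pairs = [(x, y) for x in [1, 3] for y in [0, 2]]
print(pairs)  # [(1, 0), (1, 2), (3, 0), (3, 2)]
```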

The slicing syntax shown earlier for lists is also used, in very similar form, in many data science-oriented packages, including NumPy and Pandas (mentioned in the introduction).

Now that we have seen Python lists and how to access elements in ordered compound types, let's take a look at the other three standard compound data types mentioned earlier.

## Tuples
Tuples are in many ways similar to lists, but they are defined with parentheses rather than square brackets:

In [102]:
empty_tuple = ()
empty_tuple,type(empty_tuple),len(empty_tuple)

((), tuple, 0)

In [106]:
one_d_tuple = 1,
one_d_tuple

(1,)

In [104]:
t = (1, 2, 3)
t

(1, 2, 3)

In [105]:
type(t),len(t),t[0]

(tuple, 3, 1)

They can also be defined without any brackets at all:

In [20]:
t = 1, 2, 3
print(t)

(1, 2, 3)


Like the lists discussed before, tuples have a length, and individual elements can be extracted using square-bracket indexing:

In [21]:
len(t)

3

In [22]:
t[0]

1

In [108]:
a,b,c = t
a,b,c

(1, 2, 3)

In [109]:
type(a),type(b),type(c)

(int, int, int)

In [111]:
a,b,c = c,a,b
a,b,c

(2, 3, 1)
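The unpacking shown above also supports a starred name that collects any remaining items into a list. This is a minimal sketch with names chosen only for illustration:

```python
first, *rest = (1, 2, 3, 4)
print(first, rest)     # 1 [2, 3, 4]

x, y = 10, 20
x, y = y, x            # the right-hand side is packed into a tuple first
print(x, y)            # 20 10
```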

The main distinguishing feature of tuples is that they are *immutable*: this means that once they are created, their size and contents cannot be changed:

In [23]:
t[1] = 4

TypeError: 'tuple' object does not support item assignment

In [24]:
t.append(4)

AttributeError: 'tuple' object has no attribute 'append'

Tuples are often used in a Python program; a particularly common case is in functions that have multiple return values.
For example, the ``as_integer_ratio()`` method of floating-point objects returns a numerator and a denominator; this dual return value comes in the form of a tuple:

In [25]:
x = 0.125
x.as_integer_ratio()

(1, 8)

These multiple return values can be individually assigned as follows:

In [26]:
numerator, denominator = x.as_integer_ratio()
print(numerator / denominator)

0.125
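User-defined functions can return several values in the same way. In the sketch below, `min_max` is a hypothetical helper written for this example; the built-in ``divmod()`` is a real function that also returns a tuple.

```python
def min_max(values):
    """Return the smallest and largest value as a 2-tuple (hypothetical helper)."""
    return min(values), max(values)

result = min_max([3, 1, 4, 1, 5])
print(result)               # (1, 5)

low, high = result          # unpack the returned tuple
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2
```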


The indexing and slicing logic covered earlier for lists works for tuples as well, along with the ``count()`` and ``index()`` methods.
Refer to the online [Python documentation](https://docs.python.org/3/tutorial/datastructures.html) for a more complete list of these.

## Dictionaries
Dictionaries are extremely flexible mappings of keys to values, and form the basis of much of Python's internal implementation.
They can be created via a comma-separated list of ``key:value`` pairs within curly braces:

In [112]:
empty_dict = {}
empty_dict

{}

In [113]:
type(empty_dict)

dict

In [120]:
dict1 = dict([[1,2],[3,4],[5,6]])
dict1

{1: 2, 3: 4, 5: 6}

In [121]:
dict2 = dict([(1,2),(3,4),(5,6)])
dict2

{1: 2, 3: 4, 5: 6}

In [123]:
dict3 = dict(['ab','cd','ef'])
dict3

{'a': 'b', 'c': 'd', 'e': 'f'}

In [118]:
numbers = {'one':1, 'two':2, 'three':3}
numbers

{'one': 1, 'three': 3, 'two': 2}

Items are accessed and set via the indexing syntax used for lists and tuples, except here the index is not a zero-based position but a valid key in the dictionary:

In [28]:
# Access a value via the key
numbers['two']

2

New items can be added to the dictionary using indexing as well:

In [125]:
# Set a new key:value pair
numbers['ninety'] = 90
numbers,len(numbers)

({'ninety': 90, 'one': 1, 'three': 3, 'two': 2}, 4)

In [126]:
del numbers['ninety']
numbers

{'one': 1, 'three': 3, 'two': 2}

In [127]:
numbers.clear()
numbers

{}

In [128]:
numbers = {'one':1, 'two':2, 'three':3}
numbers

{'one': 1, 'three': 3, 'two': 2}

In [151]:
for name,contents in numbers.items():
    print(name,contents)

one 1
two 2
three 3


In [129]:
'one' in numbers

True

In [130]:
numbers['one']

1
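Indexing with a key that is not present raises ``KeyError``. When a missing key is expected, the ``get()`` method is a common alternative. The sketch below redefines `numbers` so that it runs on its own:

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

print(numbers.get('two'))        # 2
print(numbers.get('ten'))        # None (no error for a missing key)
print(numbers.get('ten', 0))     # 0    (explicit default)

try:
    numbers['ten']
except KeyError as err:
    print('missing key:', err)   # missing key: 'ten'
```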

In [133]:
numbers.keys(),numbers.values()

(dict_keys(['one', 'two', 'three']), dict_values([1, 2, 3]))

In [143]:
list(numbers.keys()),list(numbers.values())[1]

(['one', 'two', 'three'], 2)

In [152]:
dict1,dict2

({1: 2, 3: 4, 5: 6}, {1: 2, 3: 4, 5: 6})

In [154]:
set(dict1) & set(dict2)

{1, 3, 5}

In [155]:
set(dict1).intersection(set(dict2)) 

{1, 3, 5}

In [157]:
set(dict1) - set(dict2)

set()

In [158]:
set(dict1) ^ set(dict2)

set()

### Dictionary Comprehensions

In [195]:
dict3 = {key: key ** 2 for key in dict2}  # iterating over a dict yields its keys
dict2,dict3

({1: 2, 3: 4, 5: 6}, {1: 1, 3: 9, 5: 25})

In [197]:
dict4 = {key: key ** 2 for key in dict2 if key % 2 == 0}  # no even keys, so the result is empty
dict2,dict4

({1: 2, 3: 4, 5: 6}, {})
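Dictionary comprehensions can also iterate over ``items()`` to use both the key and the value. The following sketch, with a small example dictionary, inverts a mapping and filters it by value:

```python
numbers = {'one': 1, 'two': 2, 'three': 3}

# Swap keys and values (assumes the values are unique and hashable).
inverse = {value: key for key, value in numbers.items()}
print(inverse)      # {1: 'one', 2: 'two', 3: 'three'}

# Keep only the entries whose value is even.
evens_only = {key: value for key, value in numbers.items() if value % 2 == 0}
print(evens_only)   # {'two': 2}
```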

Keep in mind that dictionaries are not indexed by position: since Python 3.7 they preserve the order in which keys were inserted, but elements are always looked up by key.
Key lookup is implemented very efficiently, so that random element access is very fast, regardless of the size of the dictionary (if you're curious how this works, read about the concept of a *hash table*).
The [Python documentation](https://docs.python.org/3/library/stdtypes.html) has a complete list of the methods available for dictionaries.
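One practical consequence of the hash-table design is that dictionary keys must be *hashable*. Immutable objects such as strings, numbers, and tuples of hashable items qualify; lists do not. A minimal sketch, using a made-up `points` dictionary:

```python
# Tuples of numbers are hashable, so they can serve as keys.
points = {(0, 0): 'origin', (1, 0): 'unit x'}
print(points[(1, 0)])     # unit x

# Lists are mutable and unhashable, so using one as a key fails.
try:
    bad = {[0, 0]: 'origin'}
except TypeError as err:
    print('TypeError:', err)
```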

## Sets

The fourth basic collection is the set, an unordered collection of unique items.
Sets are defined much like lists and tuples, except they use the curly brackets of dictionaries; note that an empty set must be written ``set()``, because ``{}`` creates an empty dictionary:

In [164]:
empty_set = set()
empty_set,type(empty_set),len(empty_set)

(set(), set, 0)

In [165]:
alphebat_set = set("abcdefg")
alphebat_set

{'a', 'b', 'c', 'd', 'e', 'f', 'g'}

In [166]:
dict_to_set = set(dict([(1,2),(3,4),(5,6)]))
dict_to_set

{1, 3, 5}

In [167]:
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

In [172]:
primes.remove(7)

In [173]:
primes

{2, 3, 5}
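Besides ``remove()``, sets offer ``add()`` and ``discard()``. The difference between ``remove()`` and ``discard()`` only shows up when the item is missing, as this sketch with a fresh `primes` set illustrates:

```python
primes = {2, 3, 5, 7}

primes.add(11)       # add a single item
primes.add(3)        # adding an existing item has no effect
primes.discard(4)    # no error if the item is absent

try:
    primes.remove(4) # remove() raises KeyError if the item is absent
except KeyError:
    print('4 is not in the set')

print(sorted(primes))  # [2, 3, 5, 7, 11]
```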

If you're familiar with the mathematics of sets, you'll be familiar with operations like the union, intersection, difference, symmetric difference, and others.
Python's sets have all of these operations built-in, via methods or operators.
For each, we'll show the two equivalent methods:

In [174]:
# union: items appearing in either
primes | odds      # with an operator
primes.union(odds) # equivalently with a method

{1, 2, 3, 5, 7, 9}

In [175]:
# intersection: items appearing in both
primes & odds             # with an operator
primes.intersection(odds) # equivalently with a method

{3, 5}

In [176]:
primes > odds

False

In [177]:
# difference: items in primes but not in odds
primes - odds           # with an operator
primes.difference(odds) # equivalently with a method

{2}

In [179]:
# symmetric difference: items appearing in only one set
primes ^ odds                     # with an operator
primes.symmetric_difference(odds) # equivalently with a method

{1, 2, 7, 9}

In [180]:
# subset check
odds <= primes

False

In [181]:
odds.issubset(primes)

False

In [182]:
alphebat_set.issuperset(set('ab'))

True

In [184]:
alphebat_set > (set('ab'))

True

In [187]:
primes.remove(2)

In [188]:
primes

{3, 5}

Many more set methods and operations are available.
You've probably already guessed what I'll say next: refer to Python's [online documentation](https://docs.python.org/3/library/stdtypes.html) for a complete reference.
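As one example of those further operations, the set operators have in-place forms that update the left-hand set. The sketch below uses a throwaway set named `a`:

```python
a = {1, 2, 3}

a |= {3, 4}          # in-place union        -> {1, 2, 3, 4}
a &= {2, 3, 4, 5}    # in-place intersection -> {2, 3, 4}
a -= {2}             # in-place difference   -> {3, 4}
print(sorted(a))     # [3, 4]

print(a.isdisjoint({7, 8}))  # True: no elements in common
print({3} < a, a <= a)       # True True (proper subset, subset)
```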

### Set Comprehension

In [198]:
alphebat_set

{'a', 'b', 'c', 'd', 'e', 'f', 'g'}

In [201]:
alphebat_set_double = {letter for letter in alphebat_set}
alphebat_set_double, type(alphebat_set_double)

({'a', 'b', 'c', 'd', 'e', 'f', 'g'}, set)
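A set comprehension becomes more useful when the expression transforms each item, because duplicates in the result collapse automatically. A small sketch with a made-up list of words:

```python
# Made-up data for illustration.
words = ['apple', 'Avocado', 'banana', 'blueberry', 'cherry']

# Collect the distinct lowercase first letters.
initials = {word[0].lower() for word in words}
print(sorted(initials))  # ['a', 'b', 'c']
```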

## More Specialized Data Structures

Python contains several other data structures that you might find useful; these can generally be found in the standard-library ``collections`` module.
The collections module is fully documented in [Python's online documentation](https://docs.python.org/3/library/collections.html), and you can read more about the various objects available there.

In particular, I've found the following very useful on occasion:

- ``collections.namedtuple``: Like a tuple, but each value has a name
- ``collections.defaultdict``: Like a dictionary, but unspecified keys have a user-specified default value
- ``collections.OrderedDict``: Like a dictionary, but with extra order-aware features such as ``move_to_end()`` (plain dictionaries also keep insertion order since Python 3.7)

Once you've seen the standard built-in collection types, the use of these extended functionalities is very intuitive, and I'd suggest [reading about their use](https://docs.python.org/3/library/collections.html).
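To give a flavor of the three types listed above, here is a brief sketch. The names `Point`, `counts`, and `od` are chosen only for this example.

```python
from collections import namedtuple, defaultdict, OrderedDict

# namedtuple: a tuple whose fields can also be accessed by name
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=1, y=2)
print(p.x, p[1])      # 1 2

# defaultdict: missing keys get a default value (int() gives 0)
counts = defaultdict(int)
for letter in 'banana':
    counts[letter] += 1
print(dict(counts))   # {'b': 1, 'a': 3, 'n': 2}

# OrderedDict: a dictionary with order-aware methods
od = OrderedDict([('first', 1), ('second', 2)])
od.move_to_end('first')
print(list(od))       # ['second', 'first']
```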