To generate a presentation run the following command:

``jupyter nbconvert Lecture_3.ipynb --to slides --post serve``

# How to use this notebook

This notebook serves as both a presentation and an interactive environment for students to experiment with Python. If you run it in interactive mode using [Binder](https://mybinder.org/v2/gh/krzysztofarendt/deap/master), you can modify all code cells. Press `Shift+Enter` to run the modified code.

Link to the repository: https://github.com/krzysztofarendt/deap 

# Data types

You will probably be using several different data types when programming in Python. Today, you'll learn about the built-in types, which are those readily available in Python without any additional libraries. You have already been using some of them while learning the basics.

### Built-in scalar types

Scalar types represent a single value:

In [2]:
a = 1   # Integer
b = 1.  # Floating-point number (real number)
c = 'Hello!'                                  # String
d = "How are you?"                            # String
e = "String with single quote (') character"  # String
f = 'String with double quote (") character'  # String
g = True          # Boolean
h = False         # Boolean
i = 2 + 3j        # Complex number
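
If you are ever unsure which type a value has, the built-in `type` function tells you. A small sketch reusing a few of the values defined above:

```python
a = 1
b = 1.
c = 'Hello!'
g = True
i = 2 + 3j

print(type(a))  # <class 'int'>
print(type(b))  # <class 'float'>
print(type(c))  # <class 'str'>
print(type(g))  # <class 'bool'>
print(type(i))  # <class 'complex'>
```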

### Built-in data structures

Data structures represent multiple values and possibly their relations to one another. The built-in data structures are:

- `list` - list, e.g. `[1, 2, 3]`
- `tuple` - tuple, e.g. `(1, 2, 3)`
- `set` - set, e.g. `{1, 2, 3}`
- `dict` - dictionary, e.g. `{1: [1, 2], 2: [3, 4], 3: [5, 6]}`

You already know a bit about lists, but let's summarize...

# List

- A list is an ordered sequence of objects.
- Any objects can be stored in a list.
- Objects of different types can be held in the same list.

In [11]:
a = [1, 2, 3]  # A list holding integers
b = [1., 2., 3.]  # A list holding floating-point numbers
c = ['a', 'b', 'c']  # A list holding strings
d = [1, 2., '3', 'anything can be added to a list']
e = [a, b, c, d, 'even another list...']

print(e)

[[1, 2, 3], [1.0, 2.0, 3.0], ['a', 'b', 'c'], [1, 2.0, '3', 'anything can be added to a list'], 'even another list...']


A prettier way to print the elements of this list:

In [12]:
for x in e:
    print(x)  # x takes the value of each element of e in turn

[1, 2, 3]
[1.0, 2.0, 3.0]
['a', 'b', 'c']
[1, 2.0, '3', 'anything can be added to a list']
even another list...


The same result can be achieved with another loop:

In [13]:
for i in range(len(e)):
    print(e[i])

[1, 2, 3]
[1.0, 2.0, 3.0]
['a', 'b', 'c']
[1, 2.0, '3', 'anything can be added to a list']
even another list...


There are a few useful methods worth remembering when working with lists: `append`, `extend`, `pop`. A method is a function specific to some object type, in this example a list.

- `append` adds a single new element to the end of the list.
- `extend` extends the given list with another list.
- `pop` returns the last element of a list and deletes it from the list.

In [27]:
a = [1, 2, 3]
print(a)

a.append(0)  # Add 0 at the end
print(a)

a.extend([4, 5])  # Add 4, 5 at the end
print(a)

b = a.pop()  # Return last element and delete
print(a)
print(b)

[1, 2, 3]
[1, 2, 3, 0]
[1, 2, 3, 0, 4, 5]
[1, 2, 3, 0, 4]
5
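
A common source of confusion is the difference between `append` and `extend` when the argument is itself a list. The following sketch shows that `append` adds the whole list as a single element, while `extend` adds its elements one by one:

```python
a = [1, 2, 3]
a.append([4, 5])   # The whole list [4, 5] becomes ONE new element
print(a)           # [1, 2, 3, [4, 5]]

b = [1, 2, 3]
b.extend([4, 5])   # Each element of [4, 5] is added separately
print(b)           # [1, 2, 3, 4, 5]
```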


If two lists are added together, this is what happens:

In [108]:
a = [1, 2]
b = [3, 4, 5]
print(a + b)

[1, 2, 3, 4, 5]
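
Note that `+` builds a new list and leaves both operands unchanged, whereas `extend` modifies the list in place. A short sketch of the difference:

```python
a = [1, 2]
b = [3, 4, 5]

c = a + b      # A new list is created
print(c)       # [1, 2, 3, 4, 5]
print(a)       # [1, 2] (unchanged)

a.extend(b)    # a itself is modified
print(a)       # [1, 2, 3, 4, 5]
```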


There are also two generic functions often used with lists:

- `len` - return the length of the list
- `reversed` - return the reversed sequence

Both of these functions can also be used with other data structures, **as long as the operation makes sense**. E.g. a set is an unordered collection, so it has a size (`len` can be used), but cannot be reversed (`reversed` cannot be used).
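
A quick way to convince yourself is to try both functions on a tuple and on a set. In this sketch the flag `set_is_reversible` is just an illustrative name. Note that `reversed` returns an iterator, so it is wrapped in `list` for printing:

```python
t = (1, 2, 3)
print(len(t))              # 3
print(list(reversed(t)))   # [3, 2, 1]

s = {1, 2, 3}
print(len(s))              # 3
try:
    reversed(s)            # A set has no order, so this fails
    set_is_reversible = True
except TypeError:
    set_is_reversible = False
print(set_is_reversible)   # False
```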

There are many ways to achieve the same result in Python. E.g. these are four loops that all print a list in reversed order:

In [51]:
print('--- EASY ---')
a = [1, 2, 3]
while len(a) > 0:
    print(a.pop())

print('--- ALSO EASY ---')
a = [1, 2, 3]
for x in reversed(a):
    print(x)
    
print('--- CONFUSING? ---')
a = [1, 2, 3]
while len(a) > 0:
    print(a[-1])
    a = a[:-1]
    
print('--- MORE CONFUSING? ---')
a = [1, 2, 3]
for i in range(len(a) - 1, -1, -1):
    print(a[i])

--- EASY ---
3
2
1
--- ALSO EASY ---
3
2
1
--- CONFUSING? ---
3
2
1
--- MORE CONFUSING? ---
3
2
1


Surprised by the way `range` was used? You can always get a quick `help`:

In [52]:
help(range)

Help on class range in module builtins:

class range(object)
 |  range(stop) -> range object
 |  range(start, stop[, step]) -> range object
 |  
 |  Return an object that produces a sequence of integers from start (inclusive)
 |  to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
 |  start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
 |  These are exactly the valid indices for a list of 4 elements.
 |  When step is given, it specifies the increment (or decrement).
 |  
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __contains__(self, key, /)
 |      Return key in self.
 |  
 |  __eq__(self, value, /)
 |      Return self==value.
 |  
 |  __ge__(self, value, /)
 |      Return self>=value.
 |  
 |  __getattribute__(self, name, /)
 |      Return getattr(self, name).
 |  
 |  __getitem__(self, key, /)
 |      Return self[key].
 |  
 |  __gt__(self, value, /)
 |      Return self>value.
 |  
 |  __hash__(self, /)
 |

Let's do a quick review of indexing and slicing:

In [58]:
a = ['a', 'b', 'c', 'd']
print('a =', a)
print('a[0] =', a[0])
print('a[0:2] =', a[0:2])
print('a[-1] =', a[-1])
print('a[:] =', a[:])

a = ['a', 'b', 'c', 'd']
a[0] = a
a[0:2] = ['a', 'b']
a[-1] = d
a[:] = ['a', 'b', 'c', 'd']
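
Slices can also omit the start or the stop, and take an optional step. A few more cases, sketched on the same list:

```python
a = ['a', 'b', 'c', 'd']

print(a[1:])     # ['b', 'c', 'd']       from index 1 to the end
print(a[:2])     # ['a', 'b']            from the start up to (not including) index 2
print(a[-2:])    # ['c', 'd']            the last two elements
print(a[::2])    # ['a', 'c']            every second element
print(a[::-1])   # ['d', 'c', 'b', 'a']  the list in reversed order
```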


With indexing (and slicing) you can also replace specific elements in the list:

In [65]:
a = ['a', 'b', 'c', 'd']
a[1] = 'x'
print(a)

['a', 'x', 'c', 'd']
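
The same works with a slice on the left-hand side. In this sketch a whole range of elements is replaced at once, and the replacement does not need to have the same length as the slice:

```python
a = ['a', 'b', 'c', 'd']

a[1:3] = ['x', 'y']   # Replace the elements at indices 1 and 2
print(a)              # ['a', 'x', 'y', 'd']

a[1:3] = ['z']        # Replace two elements with a single one
print(a)              # ['a', 'z', 'd']
```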


Finally, unlike basic data types (scalars), data structures require special attention when copying variables.

In [78]:
# Create new list
a = [1, 2, 3]
print(a)

# Make a "copy" and replace the first element
b = a
b[0] = 'x'
print(a)

[1, 2, 3]
['x', 2, 3]


What happened here is that `b` is in fact not a copy of `a`, but just an *alias*. They both point to the same object in memory!
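
You can check whether two names refer to the same object with the `is` operator, while `==` only compares the contents. A small sketch:

```python
a = [1, 2, 3]
b = a            # Alias: b refers to the same list object as a
c = [1, 2, 3]    # A separate list with equal contents

print(a == b, a is b)   # True True
print(a == c, a is c)   # True False
```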

**What we need is a real copy.** A list can be copied in a number of ways:

In [79]:
a = [1, 2, 3]
b = a.copy()  # Call the list-specific method copy()
b = list(a)   # Create a new list using the values of a
b = a[:]      # Slice the entire list

b[0] = 'x'

print(a)
print(b)

[1, 2, 3]
['x', 2, 3]
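
One caveat: the three approaches above make *shallow* copies. The new list is a separate object, but any lists nested inside it are still shared with the original. If that matters, `copy.deepcopy` from the standard library copies the nested objects too. A sketch with illustrative variable names:

```python
import copy

a = [[1, 2], [3, 4]]
shallow = a.copy()        # New outer list, but the inner lists are shared
deep = copy.deepcopy(a)   # The inner lists are copied as well

a[0][0] = 'x'
print(shallow)  # [['x', 2], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```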


### List comprehensions

Python has a special and very useful way to construct new lists, called *list comprehensions*. In many cases *list comprehensions* are equivalent to several lines of code!

In [174]:
a = [x for x in range(10)]  # Create a sequence from 0 to 9
print(a)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [173]:
b = [x for x in a if x > 2]  # Take values larger than 2
print(b)

[3, 4, 5, 6, 7, 8, 9]


In [172]:
c = [x for x in a if (x > 2) and (x % 2 == 0)]  # Take only EVEN values larger than 2
print(c)

[4, 6, 8]
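
To see what a list comprehension saves you, compare the last example with an ordinary loop doing the same work (the name `c_loop` is only used for this comparison):

```python
a = list(range(10))

# List comprehension: one line
c = [x for x in a if (x > 2) and (x % 2 == 0)]

# Equivalent loop: four lines
c_loop = []
for x in a:
    if (x > 2) and (x % 2 == 0):
        c_loop.append(x)

print(c)       # [4, 6, 8]
print(c_loop)  # [4, 6, 8]
```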


# Tuple

Tuples are very similar to lists. The main difference is that they are *immutable*, meaning that once created, their elements cannot be replaced, added, or removed.

In [80]:
a = (1, 2, 3)
print(a)

(1, 2, 3)


In [81]:
a[0] = 0

TypeError: 'tuple' object does not support item assignment

Why do you think it is useful to use tuples instead of lists in some cases?
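
One possible answer, given here only as a sketch: because a tuple cannot change, it can be used as a dictionary key or as an element of a set, which a list cannot. The names `grid` and `list_key_allowed` are made up for this illustration:

```python
# A tuple of coordinates works as a dictionary key
grid = {(0, 0): 'origin', (1, 0): 'right of origin'}
print(grid[(0, 0)])       # origin

# A list in the same role is rejected with a TypeError
try:
    bad = {[0, 0]: 'origin'}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)   # False
```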

# Set

Sets are unordered collections of objects.

Some frequently used methods are `add`, `remove`, `intersection`, `union`, `difference`.

In [115]:
basket1 = {'Apple', 'Orange', 'Tomato'}
print(basket1)

basket1.add('Lemon')
print(basket1)

{'Tomato', 'Orange', 'Apple'}
{'Tomato', 'Lemon', 'Orange', 'Apple'}


In [116]:
basket2 = {'Onion', 'Apple'}
basket3 = basket1.union(basket2)
print(basket3)

{'Tomato', 'Apple', 'Lemon', 'Onion', 'Orange'}


In [117]:
basket3 = basket1.intersection(basket2)
print(basket3)

{'Apple'}
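
The `difference` method works along the same lines: it returns the elements of one set that are not in the other. In this sketch `sorted` is used only to print the result in a predictable order, and the `-` operator is shown as an equivalent shorthand:

```python
basket1 = {'Apple', 'Orange', 'Tomato', 'Lemon'}
basket2 = {'Onion', 'Apple'}

only_in_1 = basket1.difference(basket2)
print(sorted(only_in_1))                 # ['Lemon', 'Orange', 'Tomato']
print(only_in_1 == basket1 - basket2)    # True
```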


In [118]:
basket1.remove('Apple')  # 'Apple' MUST be in the set, otherwise a KeyError is raised
print(basket1)

{'Tomato', 'Lemon', 'Orange'}


In [120]:
# You can always check if a given object is in the set or not
if 'Apple' in basket1:
    basket1.remove('Apple')
else:
    print('No more apples in the basket')

No more apples in the basket
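
If you do not care whether the element was present, the `discard` method is a shorter alternative: it removes the element if it exists and does nothing otherwise. A sketch, with `basket` and `removed` as illustrative names:

```python
basket = {'Orange', 'Tomato'}

basket.discard('Apple')   # No error, even though 'Apple' is not in the set
print(sorted(basket))     # ['Orange', 'Tomato']

try:
    basket.remove('Apple')   # remove() is strict and raises KeyError
    removed = True
except KeyError:
    removed = False
print(removed)            # False
```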


Finally, sets can be initialized from other sequences, like lists and tuples. This is a useful way to get rid of any duplicate values.

In [121]:
my_list = [1, 2, 3, 3]
my_set = set(my_list)
print(my_set)

{1, 2, 3}
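
Keep in mind that a set does not remember the order of the original list. If the order matters, one option is `dict.fromkeys`, assuming Python 3.7 or newer, where dictionaries keep insertion order:

```python
my_list = [3, 1, 3, 2, 1]

# Dictionary keys are unique, so duplicates are dropped;
# the first occurrence of each value keeps its position
unique = list(dict.fromkeys(my_list))
print(unique)   # [3, 1, 2]
```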


# Dictionary

Dictionaries are used to map some values (called *keys*) to other values (called *values*).

In [143]:
d = {'City': 'Odense', 'Postal code': 5230, 'Country': 'Denmark', 'Buildings': [1, 2, 3]}
print(d)

{'City': 'Odense', 'Postal code': 5230, 'Country': 'Denmark', 'Buildings': [1, 2, 3]}


Once a dictionary is initialized, new keys and values can be added easily:

In [145]:
d['Street'] = 'Campusvej'
print(d)

{'City': 'Odense', 'Postal code': 5230, 'Country': 'Denmark', 'Buildings': [1, 2, 3], 'Street': 'Campusvej'}


Specific values can be accessed by their keys:

In [146]:
print('We are on {} in {} ({}).'.format(d['Street'], d['City'], d['Country']))

We are on Campusvej in Odense (Denmark).
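
Accessing a key that does not exist with square brackets raises a `KeyError`. When a missing key is a normal situation, you can test for it with `in` or use the `get` method, which returns `None` or a default of your choice instead. A short sketch:

```python
d = {'City': 'Odense', 'Country': 'Denmark'}

print('Street' in d)                # False
print(d.get('City'))                # Odense
print(d.get('Street'))              # None
print(d.get('Street', 'unknown'))   # unknown
```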


In a typical scenario, the iteration over the objects stored in a dictionary looks as follows:

In [147]:
for key in d:
    print(key, '-->', d[key])

City --> Odense
Postal code --> 5230
Country --> Denmark
Buildings --> [1, 2, 3]
Street --> Campusvej
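
An alternative that many find more readable is the `items` method, which yields each key together with its value, so there is no need to look up `d[key]` inside the loop. A sketch on a shortened dictionary:

```python
d = {'City': 'Odense', 'Postal code': 5230}

for key, value in d.items():
    print(key, '-->', value)

# The keys and values are also available separately
print(list(d.keys()))     # ['City', 'Postal code']
print(list(d.values()))   # ['Odense', 5230]
```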


Another example:

In [148]:
food = {'vegetables': ['beetroot', 'potato'], 'fruit': ['apple', 'banana']}

for key in food:
    print('\nCategory:', key)  # \n is simply 'new line'
    print('=========================')
    for x in food[key]:
        print(x)


Category: vegetables
=========================
beetroot
potato

Category: fruit
=========================
apple
banana


# Pitfalls

### Do not modify the sequence while iterating over it!
Avoid modifying a list or a dictionary when iterating over its objects. For example, this might have an unexpected outcome:

In [131]:
animals = ['Dog', 'Cat', 'Horse']

for a in animals:
    print(a)
    animals.pop()

Dog
Cat


Example with a dictionary:

In [142]:
food = {'vegetables': ['beetroot', 'potato'], 'fruit': ['apple', 'banana']}

for key in food:
    food['meat'] = ['chicken']  # This is not a good idea...  
    print('\nCategory:', key)
    print('=========================')
    for x in food[key]:
        print(x)


Category: vegetables
=========================
beetroot
potato


RuntimeError: dictionary changed size during iteration
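
The same workaround applies to dictionaries: `list(food)` takes a snapshot of the keys, so the loop is no longer affected when the dictionary grows. This is only a sketch of one possible pattern:

```python
food = {'vegetables': ['beetroot', 'potato'], 'fruit': ['apple', 'banana']}

for key in list(food):          # list(food) is a snapshot of the keys
    food['meat'] = ['chicken']  # No error: the loop iterates over the snapshot
    print(key, '-->', food[key])

print(sorted(food))   # ['fruit', 'meat', 'vegetables']
```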

### Dictionary keys should be treated as unordered

In recent versions of Python the insertion order of dictionary keys is preserved (as an implementation detail of CPython 3.6, and as a guaranteed language feature since Python 3.7):

In [156]:
d = {'a': 1, 'b': 2, 'c': 3}

for key in d:
    print(key)

a
b
c


However, in older Python versions the outcome is undefined. It could be `a, b, c` or `b, a, c` or any other combination.
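
If your code depends on a particular order, it is safer to request it explicitly. For instance, `sorted` gives the keys in alphabetical order regardless of the Python version or of how the dictionary was built:

```python
d = {'b': 2, 'c': 3, 'a': 1}

for key in sorted(d):
    print(key, '-->', d[key])

print(sorted(d))   # ['a', 'b', 'c']
```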

### Copy vs. alias

Consider the following code:

In [161]:
abc = ['a', 'b', 'c']
d = {1: abc}  # A dictionary with one key
print(d)

abc[0] = 'x'  # When you change a value using indexing, you change the underlying object
print(d)

abc = ['x', 'x', 'x']  # When you assign a new list to variable abc, it points to another object
print(d)

{1: ['a', 'b', 'c']}
{1: ['x', 'b', 'c']}
{1: ['x', 'b', 'c']}


It's always better to be explicit when you want to use a copy:

In [164]:
abc = ['a', 'b', 'c']
d = {1: abc.copy()}  # A dictionary with one key
print(d)

abc[0] = 'x'  # We change abc, but not the list in the dictionary (because it holds a copy)
print(d)

{1: ['a', 'b', 'c']}
{1: ['a', 'b', 'c']}


# Exercises

1. Define a list of lists storing the following matrix:

```
1 2 3
4 5 6
7 8 9
```

2. Define a dictionary storing the following table:

```
x  y
----
1  4
2  5
3  6
```

# Final remarks

The data types presented here are the most generic types, which can be used in any application. E.g. you could use lists of lists to store matrices and write your own functions for matrix multiplication and other matrix-related operations. Similarly, dictionaries could be used to work with tabular data.
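
As a rough sketch of what that would involve, here is a naive matrix multiplication on lists of lists. The function name `matmul` is made up for this illustration, and no checks are made that the matrix sizes match:

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows (naive sketch)."""
    n_rows, n_inner, n_cols = len(A), len(B), len(B[0])
    C = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            total = 0
            for k in range(n_inner):
                total += A[i][k] * B[k][j]  # Row i of A times column j of B
            row.append(total)
        C.append(row)
    return C

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul(A, B))   # [[2, 1], [4, 3]]
```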

**However**, using the built-in data types for those specialized applications would require you to write quite complex code for operations that are needed very often. In addition, your code for those purposes would likely be slow and inefficient, because:

* Python built-in types are not optimized for mathematical computations,
* You have no experience in writing such applications.

**Fortunately**, there are specialized scientific libraries available:
- `numpy` - vector and matrix data types and operations,
- `scipy` - scientific algorithms for linear algebra, optimization, statistics etc.,
- `pandas` - data types and functions to work with tabular data,
- `matplotlib` - data types and functions for data visualization.

These libraries are compatible with one another and closely inter-related.

We will get to them soon!

# The end