# DNDS6013 Scientific Python: 2nd class
## Central European University, Winter 2019/2020

Instructor: Márton Pósfai, TA: Luis Natera Orozco

Emails: posfaim@ceu.edu, natera_luis@phd.ceu.edu

#### Today's topics

- Strings + regular expressions -> extract information from a website
- Lists
- Functions
- Analyze city data
- Dictionaries

### Quiz results

Average ~50%, best 73%

Solutions on Moodle

## Lists

In [None]:
a = [ 7, 3 ,8, 10, 7, 1, 9, 1, 5]
print(a[0], a[0:4], a[-1])

### Problem 12 from the quiz

Assume the following list definition:
```python
>>> a = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
```
Several short interactive sessions are shown below. Which display the correct output?

1. ```python
>>> print(a[-6])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  IndexError: list index out of range
```
    
2. ```python
>>> print(a[4::-2])
['quux', 'baz', 'foo']
   ```
3. ```python
>>> print(a[-5:-3])
['bar', 'baz']
```
    
4. ```python
>>> a[:] is a
True
```
    
5. ```python
>>> max(a[2:4] + ['grault'])
'qux'
```

In [None]:
a = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
print(a[-6])
print(a[4::-2])
print(a[-5:-3])
print(a[:] is a)
print(max(a[2:4]+['grault']))

In [None]:
a = [1, "a", [1, 2], 45.576]
print(a[1], a[2])
b = a + [ "apple", True ]
print(b)
a += [ "pear", False ]
print(a)

In [None]:
print(a[2], "is a list.")
# we can access its elements the same way as we access the elements of a (e.g. a[2])
print("a[2][1] is the second element of a[2]:", a[2][1])

### Problem 11 from the quiz

Write an expression that returns 'z' from 'baz'.

In [None]:
x = [10,[3.141,20, [30,'baz', 2.718]],'foo']

print(x[1])
print(x[1][2])
print(x[1][2][1])
print(x[1][2][1][2])

In [None]:
b = []
a = [ 1, 2 ]
print(b)
b.append("a")
b.append(5)
b.append(a)
print(b)
b.remove("a")
print("a removed:", b)
print(b)
print("Pop",b.pop(-1))
a.reverse()
print("a reversed:",a)

In [None]:
print(list(range(10)))
print(list(range(3,10,2)))

### Basic python control flow

#### While loops, for loops and conditionals (if else)

In [None]:
for a in range(10):
    if a < 5:
        print(a)
    else:
        print(10-a)

In [None]:
f = 1.0
v = 1.0
while f * v < 1e10:
    f *= v
    v += 1.0
print("The largest factorial less than 1e10 is: %d! = %g" % (v - 1, f))

### Problem 9 of the quiz

Will the `print()` statement on line 5 execute?

In [None]:
a = ['foo', 'bar', 'baz', 'qux', 'quux', 'corge']
while a:
    print(a.pop())
else:
    print("Done.")

### Strings
Escape sequences, operations, conversions

In [None]:
s = "Hello world!"
print(s[0],s[1],s[-1],s[-2])
print(s[0:4],s[:4])
print(s[0:-1])
print(s[0:-1:2],s[::2])

#### Useful string methods

In [None]:
s = "Hello world!\n"
print(s[6:].capitalize() + "XXX")
print(s.rstrip() + "YYY")
print(s.count("l"))
print(s.index("l"))
print("123".isdigit(),"1e3".isdigit(),s.isprintable(),"Körte".isprintable())
print(s.split("l"))
print(s.strip("\n! lH"))
print(s.upper())

# Some more advanced details

Python is a dynamically typed language. This means that you do not have to tell the computer the type of a variable when you create it.

In a statically typed language, working with variables normally has two steps:
* instantiation
* initialization

To instantiate a variable means that the programming language reserves space in memory according to the type of the variable. To initialize a variable means to use that space and write into it the value that you want to give to the variable.
In Python the language works out the type of the variable by itself, and the two steps happen at the same time.
To work with variables and types efficiently, you need to understand how Python manages them.
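
As a small illustration of the point above (a sketch only; the name `x` is made up for this example), the same name can refer to values of different types, and `type()` reports the type of whatever value the name currently refers to:

```python
# The name x is hypothetical, used only for this illustration.
x = 3                    # no type declaration is needed
type_before = type(x)

x = "three"              # the same name can later refer to a string
type_after = type(x)

print(type_before, type_after)
```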

When we assign a value to a variable, Python stores some information in a given portion of memory. To do so it needs to write the information. Memory looks like a long list of 1s and 0s (bits), and writing information means replacing some of those 1s and 0s with another series of 1s and 0s. To retrieve the information later, Python needs to know where we wrote our series of 1s and 0s and how long it was. Basically, we need to know an address and a size.

Let's play a little to understand how python does that.

In [None]:
import sys
gso = sys.getsizeof

What we have just done is assign to the variable `gso` the same function object that `sys.getsizeof` refers to. `sys.getsizeof` is a function (`getsizeof`) defined within the module `sys`. We will talk about modules later in the course. Calling `gso` or `sys.getsizeof` is now the same thing.

In [None]:
?gso

In [None]:
gso(3)

In [None]:
sys.getsizeof(3)

`getsizeof` returns the size of an object in bytes. A byte is a collection of 8 bits and is a unit of digital information. A bit represents a binary digit. To learn more, ask Wikipedia ;)

An address, on the other hand, looks like this:

In [None]:
?id

In [None]:
a = 3
id(a)

In [None]:
id(3)

In [None]:
b = 3
id(b)

In [None]:
b = b+1
id(b)

In [None]:
id(a)

What happened? What did Python do with the addresses?
Python tries to optimize memory usage. `a` and `b` are two different variables, yet their address is the same: both names refer to the same integer object `3`. CPython keeps a single copy of each small integer and reuses it, so this sharing is an implementation detail and is not guaranteed for every value. Integers cannot be modified in place, so `b = b+1` does not change the object `3`; it makes `b` refer to a different object with a different address, while `a` is unaffected.
We don't have to worry about freeing the memory used by values that we no longer use. Python does that automatically. A technical way of saying this is: Python has a "garbage collector".
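
A minimal sketch of the difference between "same address" and "same value", assuming the standard CPython interpreter (the sharing of small integers is an implementation detail, and all names here are illustrative). The `is` operator compares addresses, while `==` compares values:

```python
a = 3
b = 3
same_small_int = a is b      # small integers are shared in CPython

l1 = [1, 2]
l2 = [1, 2]
same_list = l1 is l2         # two separate list objects
equal_list = l1 == l2        # ...that hold equal values

print(same_small_int, same_list, equal_list)
```

This is why `==` is the right tool for comparing values; `is` only answers whether two names refer to the very same object.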

In [None]:
var = 5
id(var)

In [None]:
var += 10
id(var)

The memory at the old address can be reused for other purposes once nothing refers to it anymore; Python knows that `var` no longer does.

Data types that behave like this are called *immutable*.

### Something on the lists now!

In [None]:
a = [1,2,3,4]

In [None]:
gso(a)

In [None]:
id(a)

In [None]:
a.append(5)
a

In [None]:
gso(a)

In [None]:
id(a)

The address of the list hasn't changed even though we modified its contents. This is really important! We can modify a list, but its address stays the same.
This happens because a list is a more complex object. It is in fact a collection of addresses, each of which points to one element of the list.
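
To make the "collection of addresses" picture concrete, here is a small sketch (the names `inner` and `outer` are made up for this example). Two slots of a list can hold the address of the same object, and growing the list does not change the address of the list itself:

```python
inner = [1, 2]
outer = [inner, inner]         # both slots hold the address of the same list
shared = outer[0] is outer[1]

id_before = id(outer)
inner.append(3)                # change the shared object...
outer.append("new")            # ...and grow the outer list
id_after = id(outer)

print(outer)                   # the change to inner shows up in both slots
print(shared, id_before == id_after)
```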

In [None]:
id(a[0])

In [None]:
a[0] = 2

In [None]:
id(a[0])

In [None]:
id(a)

In [None]:
b=a
print(b)
b[3]='hello'
print(b)
print(a)

In [None]:
c = a[:]
d = a.copy()
print(id(a),id(c),id(d))

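The address of `a` didn't change when we modified it, but the address of its first element did. The copies `c` and `d` are new lists with their own addresses, whereas `b = a` only gives a second name to the same list.

* Data types that keep their address when changed: *mutable*
* Data types that are re-created: *immutable*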
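
A practical consequence for mutable types, sketched with illustrative names: assignment creates an alias, while slicing (or `.copy()`) creates a new, independent list.

```python
a = [1, 2, 3]
alias = a              # a second name for the same list
duplicate = a[:]       # a new list with the same elements (a.copy() is equivalent)

alias[0] = "changed"

print(a)               # modified through the alias
print(duplicate)       # unaffected by the change
print(alias is a, duplicate is a)
```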

### Read/write to file

Write a text file

In [None]:
f = open("test.txt","w")

f.write("First line\n")
f.write("Second line\n")

f.close()

Read

In [None]:
f = open("test.txt","r")
for line in f:
    print(line)
f.close()

The `with` statement

In [None]:
with open("test.txt") as f:
    for line in f:
        print(line.rstrip())

Append

In [None]:
with open("test.txt","a") as f:
    f.write("\tThird line\n")

with open("test.txt") as f:
    for line in f:
        print(line, end='')

### Exercise 1

Import a web page (http://www.hydroinfo.hu/tables/ENG/442027.html) and get the water level forecast of the Danube.

But first, the tools we need: a module to download the website and regular expressions to parse it.

In [None]:
import urllib.request
link = "http://www.hydroinfo.hu/tables/ENG/442027.html"
#link = "http://posfaim.web.elte.hu/watertable.html"
f = urllib.request.urlopen(link)
content = str(f.read())

print(content.split("<tr>")[4])

### Regular expressions (regex)

Regular expressions are a simple but versatile language for matching and manipulating patterns in text.

A regular expression is a string that will be used as a search pattern in another string.

For example, if we have a string: <pre>"This is a test"</pre>, and a regular expression <pre>"is a"</pre>.
This regular expression will 'match' the substring <pre>"is a"</pre> in the first string.

Special characters <pre>. ^ $ * + ? { } [ ] \ | ( )</pre>

The dot character (.) matches any character.

The star character (*) matches any number of repetitions (including zero) of the previous character.

The plus character (+) matches one or more repetitions of the previous character.

The ? character matches zero or one repetition of the previous character.

The ^ character matches the beginning of the string.

The $ character matches the end of the string.

Helpful tool for debugging and understanding regexp: https://regexr.com
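
The special characters listed above can be tried out one by one. The following is a small sketch with made-up test strings; each entry of `checks` is `True` when the pattern behaves as described:

```python
import re

# Sample strings are invented for illustration only.
checks = {
    "dot":      re.search(r"c.t", "cat") is not None,        # . = any one character
    "star":     re.search(r"ab*c", "ac") is not None,        # * allows zero repetitions of b
    "plus":     re.search(r"ab+c", "ac") is None,            # + needs at least one b
    "question": re.search(r"colou?r", "color") is not None,  # ? = zero or one u
    "caret":    re.search(r"^is", "This is") is None,        # "is" is not at the start
    "dollar":   re.search(r"is$", "This is") is not None,    # "is" is at the end
}
print(checks)
```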

In [None]:
import re

name1 = "MP"
name2 = "János Kertész"

res = re.search(r"M.+P", name1) # + = nonzero number of repetitions
if res is not None:
    print(res.group(), res.start(), res.end())
else:
    print("No match.")
    
res = re.search(r"M.*P", name1) # * = zero or more repetitions
if res is not None:
    print(res.group(), res.start(), res.end())
else:
    print("No match.")

res = re.search(r"J.*K", name2)
if res is not None:
    print(res.group(), res.start(), res.end())
else:
    print("No match.")

In [None]:
re.sub(r"(.*) (.*)",r"\2 \1",name2)

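Parsing goes from left to right: the match starts at the first position where the pattern can match, and the repetition operators are greedy, i.e. they take the longest substring that fits the pattern.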
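
To make this concrete, here is a small sketch using `re.search`. It also shows the non-greedy form `.*?`, which is not used elsewhere in this class and is included only for contrast:

```python
import re

t = "abracadabra, poof!"

greedy = re.search(r"a.*a", t).group()   # takes as many characters as possible
lazy = re.search(r"a.*?a", t).group()    # takes as few characters as possible

print(greedy)
print(lazy)
```

Both matches start at the first `a`; they differ only in how far they extend.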

In [None]:
t = "abracadabra, poof!"

print(re.sub(r"a.*a","",t))
print(re.sub(r"a.{1}a","",t))  # {n} : exactly n repetitions
print(re.sub(r"a.{2}a","",t))  # we can have multiple matches
print(re.sub(r"^a.{2}a","",t)) # ^ = beginning of string

`[...]` one of the characters inside the brackets, for example, `[ab]` = either `a` or `b`

You can also give ranges:
* `[0-9]` = any digit
* `[A-Z]` = any uppercase letter

In [None]:
t = "asda12aasd324asdad"

for v in re.finditer(r"([0-9]{2})", t):
    print(v.group())

Other useful shorthands:

|Character|Description
|:---|:---
|`\d`| one digit, same as `[0-9]`
|`\w`| word character = letter, digit, or underscore
|`\s`| white space, such as space, tab, new line

[More info on regex in python.](https://docs.python.org/3.7/library/re.html)
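
A brief sketch of the three shorthands from the table in action, on a made-up sample string:

```python
import re

text = "Room 12, floor 3"   # invented sample string

digits = re.findall(r"\d+", text)    # runs of digits
words = re.findall(r"\w+", text)     # runs of word characters
pieces = re.split(r"\s+", text)      # split on whitespace

print(digits)
print(words)
print(pieces)
```

Note that the comma is not a word character, so it disappears from `words` but stays attached to `12,` in `pieces`.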

In [None]:
import urllib.request
import re
link = "http://www.hydroinfo.hu/tables/ENG/442027.html"
#link = "http://posfaim.web.elte.hu/watertable.html"
f = urllib.request.urlopen(link)
content = str(f.read())
print(content[:1000])
print(len(content))

### Exercise 1 again

Import a web page (http://www.hydroinfo.hu/tables/ENG/442027.html) and get the water level forecast of the Danube.

Write a regular expression that matches the date and time in the table, and another that matches the water level. Iterate through the matches and print them out.


### Exercise 2
Make a list from the previous results. Each element should contain a tuple with two elements: `(date, waterlevel)`. Watch out, for the first water level we do not have a date.

### Exercise 3
Write a regexp that finds the first number in a string, so that it can be converted to a float.
Example: "I have 205 Euros.", "The value of pi is 3.14.", "The gravitational constant is 6.67e-11 m$^{3}$kg$^{-1}$s$^{-2}$."

In [None]:
examples = ["I have 205 Euros.", "The value of pi is 3.14.", "6.67e-11 m3kg−1s−2"]


### Exercise 4
Change date format using regex, use the `re.sub` function:

In [3]:
import re
name1 = "Peter parker"
re.sub(r"(.*) (.*)",r"\2 \1",name1)

'parker Peter'

In [None]:
date = "Easter Sunday this year was 01.04.2018."
#change date to yyyy.mm.dd format


## List comprehensions

So far we created lists like this:

In [None]:
L = []
for x in range(5):
    L.append(x**2)
print(L)

In [None]:
L = [x**2 for x in range(5)]
print(L)

List comprehensions are similar to the mathematical formalism of defining sets: 
$$ L=\lbrace x^2 : x \in \lbrace 0, 1, 2, 3, 4\rbrace \rbrace.$$

In [None]:
L = [x**2 for x in range(5) if x > 2]
print(L)

In [None]:
L = []
for x in range(5):
    if x%2 == 0:
        L.append([x, x*x])
print(L)

In [None]:
L = [[x, x**2] for x in range(5) if x%2 == 0]
print(L)

In [None]:
import timeit

s_append = '''
L = []
for x in range(1000):
    if x%2==0:
        L.append([x,x**2])
'''
t_append = timeit.timeit(s_append, number=1000)

s_compr = 'L = [[x, x**2] for x in range(1000) if x%2 == 0]'
t_compr = timeit.timeit(s_compr, number=1000)

print(f"Run time w append {t_append}\nRuntime w list comprehension {t_compr}" )

In [None]:
%timeit L = [[x, x**2] for x in range(100) if x%2 == 0]

## Functions

In [None]:
def f(x,y):
    z = x + y
    
    return z
    
print(f(1,10))

The variable `z` is local:

In [None]:
f(1,2)
print(z)

Argument `a` of function `f(a)` is passed by assignment.

For example, if I call `f(10)`, a local variable is created as `a=10`
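
A minimal sketch of what "passed by assignment" implies for mutable arguments (the function names `rebind` and `mutate` are made up for this example). Assigning a new value to the local name does not affect the caller, while modifying the object in place does:

```python
def rebind(lst):
    lst = [0, 0, 0]      # binds the local name to a new list; the caller is unaffected

def mutate(lst):
    lst.append(4)        # changes the object that the caller also refers to

b = [1, 2, 3]
rebind(b)
after_rebind = list(b)   # snapshot of b after the first call
mutate(b)

print(after_rebind, b)
```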

In [None]:
def f(a):
    a += 1
    
a = 3
f(a)
print(a)

In [None]:
def f(a):
    a[0] += 1
    
b = [ 1, 2, 3 ]
f(b)
print(b)

In [None]:
def f0():
    L = []
    for x in range(100):
        if x%2 == 0:
            L.append([x, x*x])
    return L

timeit.timeit(f0)

### Exercise 5
Write a small function which returns a list whose elements are 2-tuples: the first element is a sequence number, the second is the corresponding value of the input list, e.g.
l = \[ "alma", "apple", "pomme", "Apfel", "תפוח" \] to 
lnumbered = \[ (0,"alma"), (1,"apple"), (2,"pomme"), (3,"Apfel"), (4,"תפוח") \]

### Exercise 6
Write a short piece of code which creates a list of lists in which element i has i elements, from 0 to i-1. Example:
<pre>
[], [0], [0, 1], [0, 1, 2], [0, 1, 2, 3]
</pre>

### Exercise 7
Write code which returns a list of all pairs of numbers where both are less than 10 and the first is divisible by the second.

## Sorting

In [None]:
l = [34,1,2,78,3]

sl = sorted(l)
print("Sorted list:",sl)

print()
print("Original:",l)

In place sorting: `sort()`

In [None]:
l.sort()
print(l)

In [None]:
guests = ["Kate","Peter", "Adam", "Jenny", "Zack", "Eva"]

print("Zack">"Eva")

print(sorted(guests))
print(sorted(guests, reverse=True))

### International characters

In [None]:
guests = ["Kate","Peter", "Adam", "Jenny", "Zack", "Eva"]

guests.append('Ödön')

print(sorted(guests))

print()
#The problem
print("Zack">"Odon")
print("Zack">"Ödön")

In [None]:
import locale
locale.getlocale()

In [None]:
locale.setlocale(locale.LC_ALL, ('hu_HU','UTF-8'))

print(locale.strxfrm("Ödön"))
print(locale.strxfrm("Zack")>locale.strxfrm("Ödön"))

print()
print(sorted(guests, key=locale.strxfrm))

In [None]:
def f1(x): 
    return x % 7
  
L = [15, 3, 11, 7] 
  
print("Normal sort :", sorted(L) )
print("Sorted with key:", sorted(L, key = f1) )

### Exercise 8
Write code which orders the list generated by <pre>
L = [[x, y] for x in range(5) for y in range(5) if x != y]
</pre>
by the sum of the two elements

In [None]:
L = [[x, y] for x in range(5) for y in range(5) if x != y]


## Lambdas: one-line nameless functions

In [None]:
f1 = lambda x: x%7
L = [15, 3, 11, 7] 
print("Sorted with key:", sorted(L, key = f1) )

In [None]:
L = [15, 3, 11, 7] 
print("Sorted with key:", sorted(L, key = lambda x: x%7) )

### Exercise 9
Use lambda to sort the <pre>
L = [[x, y] for x in range(5) for y in range(5) if x != y]
</pre>
array by the second value in the list

In [None]:
L = [[x, y] for x in range(5) for y in range(5) if x != y]


## Analyzing city data

In [None]:
f = open("Hun_cities.csv","r")
myfile = str(f.read())
f.close()

In [None]:
myfile.split('\n')[5]

In [None]:
cities = []
for line in myfile.split('\n'):
    cities.append(line.split(','))
print(cities[0])
print(cities[1])

Alternative way of doing this:

In [None]:
cities = []
with open("Hun_cities.csv","r") as f:
    for line in f:
        cities.append(line.rstrip().split(','))
print(cities[4])
print(cities[5])
print(len(cities))

In [None]:
lat = [ float(row[5]) for row in cities[1:]]  # column 5 holds the latitude
print(lat[0:5])

### Exercise 10
Remove the double quotes from city names

### Exercise 11

 * Print the top 10 most populous cities
 * Print the top 10 cities with the largest area
 * Print the smallest city with a university
 * Print the top 10 cities with the highest population density

### Exercise 12
 * find the cities with the largest distance between them
 * find the cities with the smallest distance between them

A little help: to convert latitude and longitude to distance, use the [haversine formula](https://en.wikipedia.org/wiki/Haversine_formula):

In [None]:
import math
def latlongdist(lat1,long1,lat2,long2):
    rlat1 = math.radians(lat1)
    rlat2 = math.radians(lat2)
    rlong1 = math.radians(long1)
    rlong2 = math.radians(long2)
    dlat = rlat2 - rlat1
    dlong = rlong2 - rlong1
    a = math.sin(dlat / 2)**2 + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlong / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return 6371.0 * c

In [None]:
print(latlongdist(48.105625, 20.790556, 46.07308, 18.22857))

In [None]:
import timeit
timeit.timeit(alldist,number=1)  # assumes alldist is the function you wrote for Exercise 12

### Bucketing
On my laptop the calculation took ~9 seconds for $n = 2561$ cities. That is $\frac{n(n-1)}{2} \approx 3.3$ million distances.

What about larger systems? -> Bucketing!

* Divide the space into boxes
* Put the cities into boxes
* Number of boxes is much less than number of cities
* Select only the boxes which are candidates for the given quantity: for minimal distance only neighboring ones, for maximal distance the few with maximal distance

In [None]:
lat = [ float(row[5]) for row in cities[1:]]
print("Latitude:",min(lat),max(lat))
long = [ float(row[6]) for row in cities[1:]]
print("Longitude:",min(long),max(long))

In [None]:
shape = (4, 7)  # shape of the grid

la_range = (min(lat),  max(lat))
lo_range = (min(long), max(long))

dla = (la_range[1]-la_range[0])/shape[0]
dlo = (lo_range[1]-lo_range[0])/shape[1]

grid = [[[] for k in range(shape[1])] for j in range(shape[0])]

for c in cities[1:]:
    ilat  = int((float(c[5]) - la_range[0]) / dla)
    ilong = int((float(c[6]) - lo_range[0]) / dlo)
    
    if ilat == shape[0]:
        ilat -= 1
    if ilong == shape[1]:
        ilong -= 1
    
    grid[ilat][ilong].append(c)

for i in range(shape[0]):
    for j in range(shape[1]):
        print("%3d" % len(grid[i][j]), end=" ")
    print()

<b>Largest distance</b><br>
* Get largest distance between nonempty boxes (28 distance calculations)
* Pair cities from the two boxes
* This is most probably the largest distance between two cities (Stop here)
* To be really sure, one should also pair cities in boxes whose distance is less than the found distance

In [None]:
boxcoords = [[grid[j][k],(j+.5)*dla+la_range[0],(k+.5)*dlo+lo_range[0]] for k in range(shape[1]) for j in range(shape[0])]

boxdists = [(b1[0],b2[0],latlongdist(b1[1],b1[2],b2[1],b2[2])) for b1 in boxcoords for b2 in boxcoords if b1[0] and b2[0]]
maxboxdist = max(boxdists, key= lambda x: x[2])
citydists = [(c1[1],c2[1],latlongdist(float(c1[5]),float(c1[6]),float(c2[5]),float(c2[6]))) for c1 in maxboxdist[0] for c2 in maxboxdist[1]]
print(max(citydists, key=lambda x: x[2]))  # compare by distance, not by city name

In [None]:
for c in cities[1:]:
    if c[2] in ['"Or"','"Ortilos"','"Uszka"', '"Szakonyfalu"']:
        ilat  = int((float(c[5]) - la_range[0]) / dla)
        ilong = int((float(c[6]) - lo_range[0]) / dlo)

        if ilat == shape[0]:
            ilat -= 1
        if ilong == shape[1]:
            ilong -= 1
        
        print(c[1],ilat,ilong)

for i in range(shape[0]):
    for j in range(shape[1]):
        print("%3d" % len(grid[i][j]), end=" ")
    print()

### Exercise 13
Find smallest distance between cities.

* Pair all cities inside boxes, and with cities in the neighboring boxes.
* Select the minimum

## Dictionaries

Dictionaries are also like lists, except that each element is a key-value pair. The syntax for dictionaries is `{key1 : value1, key2 : value2, ...}`:

In [None]:
fruits = {"bananas" : 1,
          "oranges" : 2,
          "apples" : 3,}

print(type(fruits))
print(fruits)

In [None]:
print("bananas = " + str(fruits["bananas"]))
print("oranges = " + str(fruits["oranges"]))
print("apples  = " + str(fruits["apples"]))

In [None]:
# change value
fruits["bananas"] = "no bananas"
fruits["oranges"] = 100

# add a new entry
fruits["pineapples"] = "D"

print("bananas = " + str(fruits["bananas"]))
print("oranges = " + str(fruits["oranges"]))
print("apples = " + str(fruits["apples"]))
print("pineapples = " + str(fruits["pineapples"]))

Strings, numbers, and tuples work as keys, and any type can be a value. Other types may or may not work correctly as keys (strings and tuples work cleanly since they are immutable). Looking up a value which is not in the dict throws a KeyError -- use "in" to check if the key is in the dict, or use dict.get(key) which returns the value or None if the key is not present (or get(key, not-found) allows you to specify what value to return in the not-found case).
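
A short sketch of the statement about keys, using rough, illustrative coordinates (the dictionary `places` is made up for this example). A tuple works as a key, a list raises a `TypeError` because it is mutable and therefore unhashable, and `get` with a default avoids the `KeyError`:

```python
# Coordinates are approximate and only for illustration.
places = {(47.5, 19.0): "Budapest", (46.25, 20.15): "Szeged"}
print(places[(47.5, 19.0)])

try:
    bad = {[47.5, 19.0]: "Budapest"}   # a list is mutable, hence unhashable
    list_key_failed = False
except TypeError as err:
    list_key_failed = True
    print("TypeError:", err)

fallback = places.get((0.0, 0.0), "unknown")   # key is missing, default is returned
print(fallback)
```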

In [None]:
print(fruits['bananas'])     ## Simple lookup

If you try to access something that does not exist in the dictionary, you will get an error:

In [None]:
print(fruits['strawberries'])

To avoid key errors, you can simply check with an ``if`` that the key is present in the dictionary: 

In [None]:
if 'bananas' in fruits: print(fruits['bananas']) ## Yes, you can also write an if in this way 

if 'strawberries' in fruits: print (fruits['strawberries']) ## and an if-else in this way 
else: print("I don\'t know what a strawberry is")

An alternative way to access keys in a dictionary is with the method ``get``. If the key does not exist you get a None:

In [None]:
print(fruits.get('bananas'))
print(fruits.get('strawberries')) 

You can also define a default different from ``None``:

In [None]:
print(fruits.get('strawberries',0))  

To iterate over key-value pairs of a dictionary:

In [None]:
for key in fruits: 
    print(key, fruits[key])

This iteration is equivalent to iterating over ```fruits.keys()```:

In [None]:
for key in fruits.keys(): 
    print(key)

In [None]:
type(fruits.keys())

If you want to show the values instead:

In [None]:
for value in fruits.values():
    print(value)

In [None]:
type(fruits.values())

If you want to show both at the same time, you can use fruits.items()

In [None]:
for key, value in fruits.items():
    print(key + " = " + str(value))

_**Note that the keys are not sorted! Since Python 3.7 a dictionary keeps its keys in the order in which you added them.**_ If you want them sorted, you should sort the keys first:
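
A small sketch of the difference between insertion order and sorted order, assuming Python 3.7 or later (the dictionary `d` is made up for this example):

```python
d = {}
d["pear"] = 1
d["apple"] = 2
d["fig"] = 3

insertion_order = list(d)          # order in which the keys were added
alphabetical = sorted(d)           # keys sorted as strings
by_value = sorted(d, key=d.get)    # keys sorted by their values

print(insertion_order)
print(alphabetical)
print(by_value)
```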

In [None]:
for key in sorted(fruits.keys()):
    print(key, fruits[key])

In [None]:
# sorted() cannot compare numbers with strings, so make all values numbers first
fruits['bananas']=23
fruits['pineapples']=45
for value in sorted(fruits.values()):
    print(value)

Dictionary comprehensions

In [None]:
D = {k:k*k for k in range(3)}
print(D)

In [None]:
for value in D.values():
    print(value)

### Exercise 14
Build a dictionary from the cities list. The key should be the accentless city name (column 2), the value should be the population.

In [None]:
cdict = {c[2]:int(c[3]) for c in cities[1:]}
print(cdict['Szeged'])

### Exercise 15
Build a dictionary containing dictionaries from the cities list. The key should be the accentless city name (column 2), the value should be a dictionary with 'population' holding the population and 'area' holding the size in km$^2$.

In [None]:
cdict2 = {c[2]:{"population":int(c[3]), "area":float(c[4])} for c in cities[1:]}
print(cdict2['Szeged'])

## Advanced part: More on lambda functions, map and iterators.

In [None]:
%timeit map(lambda x: x**2, range(1000000))

In [None]:
%timeit [x**2 for x in range(1000000)]

The first expression seems to be much faster than the second.

`map` applies the function given as its first argument to each member of the iterable given as its second argument.
But there is a trick here: in the first case we are working with a lazy iterator, which has not computed anything yet.

If we convert everything to a list:

In [None]:
%timeit list(map(lambda x: x**2, range(1000000)))

What happened? What is a generator?

In [None]:
range(1000)

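`range` itself is lazy in a similar way: it is not a generator in the strict sense, but it does not store its numbers and computes each one on request.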
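
A short sketch of how a `range` object behaves (the name `r` is illustrative). It produces its values on demand, but unlike a generator it also supports `len()` and indexing, and it can be iterated more than once:

```python
r = range(3, 10, 2)

print(r)                  # shows the range object, not the numbers
print(len(r), r[1])       # length and indexing work without building a list

first_pass = list(r)
second_pass = list(r)     # a range can be iterated again
print(first_pass, second_pass)
```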

In [None]:
map(lambda x: x%2, range(1000))

As we see, Python doesn't compute any value when we call `map` or `range`; it does so only when a value is required, i.e. when we iterate through the object.
This is a big difference and a key concept in Python 3, where many built-in functions return iterators.

Working with iterators **saves a lot of memory!!**
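
The lazy behaviour can be made visible with a small sketch (the helper `square` and the list `calls` are made up for this example). The function records every call, so we can see that `map` computes nothing until values are requested, and that the iterator is empty after one pass:

```python
calls = []

def square(x):
    calls.append(x)          # record every actual call
    return x ** 2

m = map(square, range(5))
calls_before = len(calls)    # nothing has been computed yet

first = next(m)              # computes only the first value
rest = list(m)               # consumes the remaining values
again = list(m)              # the iterator is now exhausted

print(calls_before, first, rest, again)
```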

In fact, we don't have to store the 1-million-long list that we created with `[x**2 for x in range(1000000)]`; each element is created only when it is needed.
Let's say we have to do an operation on the elements of that list (in this example we sum them), and let's explore the memory usage and computational time of the two approaches:

In [None]:
from sys import getsizeof

In [None]:
getsizeof(map(lambda x: x**2, range(10000000)))

In [None]:
getsizeof([x**2 for x in range(10000000)])

The difference is not negligible!!!

Another way to create a generator is the following (note the parentheses!):

In [None]:
(x**2 for x in range(10))

An observation now: since a generator returns the elements it generates only when we iterate through them, we can't get the i-th element:

In [None]:
map(lambda x:3, range(10))[4]

In [None]:
(x**2 for x in range(10))[4]
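
If the element at a given position is needed anyway, one option (a sketch, not the only way) is `itertools.islice` from the standard library, which skips ahead by consuming the generator:

```python
from itertools import islice

gen = (x**2 for x in range(10))

fifth = next(islice(gen, 4, 5))   # skips four values and returns the one at index 4
print(fifth)

remaining = list(gen)             # the generator has been advanced past index 4
print(remaining)
```

The skipped values are computed and thrown away, so this is not a replacement for a list when random access is needed often.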

Let's take a look at the computational time:

In [None]:
%timeit [x**2+3 for x in range(100)]

In [None]:
%timeit (x**2+3 for x in range(100))

In [None]:
%timeit sum([x**2+3 for x in range(10000000)])

In [None]:
getsizeof([x**2+3 for x in range(1000)])

In [None]:
%timeit sum(x**2+3 for x in range(10000000))

In [None]:
getsizeof((x**2+3 for x in range(1000)))

In [None]:
%%timeit
for i in [x**2+3 for x in range(100)]:
    i+3

In [None]:
%%timeit
for i in (x**2+3 for x in range(100)):
    i+3

In [None]:
%%timeit
l = [x**2+3 for x in range(10000000)]

In [None]:
%%timeit
g = map(lambda x: x**2+3, range(10000000))