# Built-in data structures: containers and sequences

- One of the most popular features of Python is its built-in containers for sequences of objects.
  - These are similar to the STL containers discussed in C++.

- Since everything in Python is an object and all objects are referenced in the same way, a container can hold objects of different types.
  - This is unlike the STL containers in C++, where a container is declared for a single element type.
  
- These built-in types and the reference-driven flexibility of Python have made it very popular for data analysis.

- The basic built-in data structures in Python are
  - tuple
  - list
  - set
  - dictionary
  
- Today we only focus on these types
- We will introduce more advanced types when discussing [NumPy](https://www.numpy.org) and [pandas](http://pandas.pydata.org) packages, e.g.
  - ndarrays
  - series
  - time series
  - DataFrame
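
As a quick preview, the four basic containers listed above (tuple, list, set, dictionary) can be created as follows. This is only a minimal sketch: the variable names and values are made up for illustration.

```python
# Illustrative values only; these names are not used elsewhere in the lecture.
point = (1.5, 'label', 3)          # tuple: fixed length, immutable
readings = [1.5, 'label', 3]       # list: variable length, mutable
unique = {3, 1, 3, 2}              # set: duplicates are dropped
ages = {'paul': 24, 'anna': 31}    # dictionary: maps keys to values

print(type(point), type(readings), type(unique), type(ages))
print(len(unique))      # 3, because the duplicate 3 is stored only once
print(ages['paul'])     # 24
```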

## Tuples

- sequence of python objects
  - fixed length
  - immutable

To create a tuple, simply separate its elements with commas.

In [1]:
a = 'lec23', 'lec24', 'lec25'
print(a)


('lec23', 'lec24', 'lec25')


In [2]:
len(a)

3

A tuple can contain objects of different types.

In [3]:
b = 'paul', 24, 1.75, 85.3
print(b)

('paul', 24, 1.75, 85.3)


In [4]:
print(a,b)

('lec23', 'lec24', 'lec25') ('paul', 24, 1.75, 85.3)


## access tuple elements
The i-th element of a tuple is accessed with the `[]` operator; indices start at 0.

In [5]:
print(a[2])
print(b[3])
print(type(b[1]))
print(len(b))
print(b[4])

lec25
85.3
<class 'int'>
4


IndexError: tuple index out of range

Note that tuples are protected against out-of-bounds access: `b` has four elements, so `b[4]` raises an `IndexError` instead of reading past the end.
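
If the program should keep running after a bad index, the error can be caught. The following is a minimal sketch that rebuilds the tuple `b` from the cells above; the variable `message` is introduced only for illustration.

```python
b = 'paul', 24, 1.75, 85.3

try:
    print(b[4])                 # valid indices are 0..3
except IndexError as err:
    message = str(err)
    print("caught:", message)   # caught: tuple index out of range

print(b[-1])                    # negative indices count from the end: 85.3
```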

## empty or one-element tuple

In [6]:
c = ()
print(type(c),c)

d = 'something',
print(type(d),d)



<class 'tuple'> ()
<class 'tuple'> ('something',)


Note that the `,` is critical to distinguish a one-element tuple from a plain value: parentheses alone do not create a tuple.

In [7]:
e = 'valore'
print(type(e), e)

g = ('something')
print(type(g),g)


f = 'valore',
print(type(f), f)


h = ('another',)
print(type(h), h)

<class 'str'> valore
<class 'str'> something
<class 'tuple'> ('valore',)
<class 'tuple'> ('another',)


## conversion to tuple

In [10]:
print(range(3,10))

range(3, 10)


In [11]:
tup = range(10)
print("length: ",len(tup))
print("tup:",tup)

length:  10
tup: range(0, 10)


Note that `tup` is not a tuple but a `range` object, which produces its values on demand.

If you want a tuple, you have to explicitly convert the output of `range(10)` to a tuple.

In [12]:
tup = tuple(range(10))
print("length: ", len(tup))
print("tup: ", tup)

length:  10
tup:  (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
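
Although a `range` object is not a tuple, it still behaves like a read-only sequence. The short sketch below (the name `r` is chosen here for illustration) shows that length, indexing and membership tests work without converting it first.

```python
r = range(3, 10)

print(len(r))       # 7
print(r[0], r[-1])  # 3 9
print(5 in r)       # True
print(tuple(r))     # (3, 4, 5, 6, 7, 8, 9)
```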


Iterating over a tuple is easy

In [13]:
for i in tup:
    print(i)

0
1
2
3
4
5
6
7
8
9


## converting strings to tuples


In [15]:
tup = tuple("hello world")
print("tup: ",tup)
print(len(tup))

for i in tup:
    print(i)

for i in tup:
    print(i,end=":")
    
print("\n")    


tup:  ('h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd')
11
h
e
l
l
o
 
w
o
r
l
d
h:e:l:l:o: :w:o:r:l:d:



## Tuples can contain any object
Even a function is a valid element.

In [17]:
def myprod(a,b=3.145,scale=1.0):
    return a*b*scale

tup = (1, 'name', myprod)
print("tup: ",tup)

for i in tup:
    print(type(i))

tup:  (1, 'name', <function myprod at 0x105a02950>)
<class 'int'>
<class 'str'>
<class 'function'>


In [18]:
scale = 3.2
print(tup[2](3,4))
print(myprod(2,scale=scale))

12.0
20.128


A tuple can contain other tuples as its elements.

In [19]:
x = a,b,c, tup

for i in x:
    print("i: ", i)

i:  ('lec23', 'lec24', 'lec25')
i:  ('paul', 24, 1.75, 85.3)
i:  ()
i:  (1, 'name', <function myprod at 0x105a02950>)


In [20]:
print(x[2])
print(x[0])
print(x[3][2](3,5))

()
('lec23', 'lec24', 'lec25')
15.0


## Tuples are immutable
You can bind a variable to a new tuple, but you cannot change an element of an existing tuple.

In [21]:
print(b)

('paul', 24, 1.75, 85.3)


In [22]:
b[0] = 'one'

TypeError: 'tuple' object does not support item assignment
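
Immutability concerns the tuple itself, not the objects it refers to. As a minimal sketch (the tuple `record` is a made-up example), a list stored inside a tuple can still be modified even though the tuple's elements cannot be rebound.

```python
record = ('paul', [24, 25])   # hypothetical record: a name and a list of values

try:
    record[1] = [30]          # rebinding an element of the tuple is not allowed
except TypeError as err:
    print("caught:", err)

record[1].append(26)          # but the list inside the tuple can still change
print(record)                 # ('paul', [24, 25, 26])
```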

In [25]:
y = 'one', a, (2,3)
print(y)
print(b)

('one', ('lec23', 'lec24', 'lec25'), (2, 3))
('paul', 24, 1.75, 85.3)


In [26]:
b = y
print(b)

('one', ('lec23', 'lec24', 'lec25'), (2, 3))


In [27]:
ntuple  = 'lec23', 'lec27', 'lec25', 'lec25', 3.14, 3.56, 3.97
b = ntuple
print(b)
print(b.index('lec25'))
print(b.count('lec25'))
print(b.count(3.14))
print(type(b.count('lec25')))

('lec23', 'lec27', 'lec25', 'lec25', 3.14, 3.56, 3.97)
2
2
1
<class 'int'>


## Tuple methods
Because the content and size of a tuple are immutable, tuples have very few methods (check `dir(tuple)`).

A very useful one is `count()`, which returns how many times a value occurs.

In [28]:
grades = (30, 22, 24, 23, 30, 18, 24, 27, 28, 28, 25, 24, 22, 30, 30, 18, 20)
grades.count(30)

4
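
To see how short the list of tuple methods is, the names returned by `dir(tuple)` can be filtered to drop the special ones that start with an underscore. This is a small sketch; `public_methods` and the shortened `grades` tuple are illustrative.

```python
public_methods = [name for name in dir(tuple) if not name.startswith('_')]
print(public_methods)    # ['count', 'index']

grades = (30, 22, 24, 30)
print(grades.count(30))  # 2
print(grades.index(24))  # 2, the position of the first 24
```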

## Lists
- Lists are also collections of objects but, unlike tuples, they are mutable
  - variable length
  - each element can be modified
  

In [29]:
alist = [2,3,4]
print(alist)
print(alist[2])
alist[2] = -3
print(alist)

[2, 3, 4]
4
[2, 3, -3]


Lists (like tuples) are protected against out-of-range indices.

In [30]:
print(len(alist))
alist[3]

3


IndexError: list index out of range

A list can contain any type of data. In this example the list holds a string, an int, a float, a function, a tuple, and a nested list.

In [31]:
alist = ['one', 2, 3.24, myprod, (23,24), ['lec1', 'lec2', [myprod,3.14]]]
print(alist)
print(alist[5][2][0](6,7))

['one', 2, 3.24, <function myprod at 0x105a02950>, (23, 24), ['lec1', 'lec2', [<function myprod at 0x105a02950>, 3.14]]]
42.0


## lists and tuples
- A list is created with square brackets `[]` or with the explicit type `list`.
- A tuple is created with comma-separated values, usually wrapped in `()`, or with the explicit type `tuple`.
- Lists and tuples are semantically similar.
  - Many functions accept either a tuple or a list.
  
- In data analysis, lists are often used to store the values produced by iterators or generators.

In [34]:
values  = range(-3,10, 2)
print(values)
print(list(values))
print(tuple(values))

range(-3, 10, 2)
[-3, -1, 1, 3, 5, 7, 9]
(-3, -1, 1, 3, 5, 7, 9)


Note that, as with tuples, you have to explicitly convert the output of `range` to get a list.
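
The same conversion works for generators. The sketch below uses a made-up generator function `squares` to show that `list()` stores the generated values, and that a generator can be consumed only once.

```python
def squares(n):
    """Hypothetical generator: yields the squares of 0, 1, ..., n-1 one at a time."""
    for i in range(n):
        yield i * i

gen = squares(5)
values = list(gen)     # consumes the generator and stores its values
print(values)          # [0, 1, 4, 9, 16]
print(list(gen))       # [] because the generator is now exhausted
```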

## list from tuple
You can create a list from a tuple by explicit conversion. Converting back with `tuple` gives a modified copy of the original tuple.

In [35]:
print(a)
blist = list(a)
print(blist)
blist[2] = 'lec28'
blist
a = tuple(blist)
print(a)

('lec23', 'lec24', 'lec25')
['lec23', 'lec24', 'lec25']
('lec23', 'lec24', 'lec28')


## Manipulating lists

### adding and removing elements
To add an element at the end of a list, use `append`.

In [36]:
clist = ['one', 2, 3.14, 4, 'five']
clist.append(6)
print(clist)

['one', 2, 3.14, 4, 'five', 6]


We can also insert a value at a specific location by providing the index

In [37]:
clist.insert(2, 'two')
print(clist)

['one', 2, 'two', 3.14, 4, 'five', 6]


Note how the new element is inserted __before__ the element currently at the given index.

You can also remove an element from the list at a specific location with `pop`

In [38]:
clist.pop(2)
print(clist)

['one', 2, 3.14, 4, 'five', 6]


`pop` returns the element it removed, while `insert` returns `None`.

The return value of `pop` is useful to see what you have removed from the list.

In [39]:
x = clist.insert(2, 'test')
print (x)
x = clist.pop(2)
print(x)
print(clist)

None
test
['one', 2, 3.14, 4, 'five', 6]
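
When called without an index, `pop` removes and returns the last element. A minimal sketch, with an illustrative list named `stack`:

```python
stack = ['one', 2, 3.14]

last = stack.pop()      # no index: removes and returns the last element
print(last, stack)      # 3.14 ['one', 2]

first = stack.pop(0)    # index 0: removes and returns the first element
print(first, stack)     # one [2]
```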


### removing by value
Although it is not very efficient, you can remove a given value from the list with `remove`. It only removes the first occurrence: Python goes linearly through the elements until it finds the value.

In [41]:
print(4 in clist)
print(clist)
clist.append(4)
print(clist)

True
['one', 2, 3.14, 4, 'five', 6, 4]
['one', 2, 3.14, 4, 'five', 6, 4, 4]


In [42]:
if 4 in clist:
    clist.remove(4)
print(clist)


['one', 2, 3.14, 'five', 6, 4, 4]


In [43]:
if 4 in clist:
    clist.remove(4)
print(clist)

['one', 2, 3.14, 'five', 6, 4]
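
Two related points, sketched with an illustrative list `data`: `remove` raises a `ValueError` when the value is absent (which is why the cells above check with `in` first), and removing every occurrence is usually simpler with a list comprehension that builds a new list.

```python
data = ['one', 4, 3.14, 4, 'five', 4]

# remove() raises ValueError when the value is not in the list
try:
    data.remove('missing')
except ValueError as err:
    print("caught:", err)

# build a new list without any 4 instead of calling remove() repeatedly
no_fours = [item for item in data if item != 4]
print(no_fours)   # ['one', 3.14, 'five']
```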


### combining lists
You can use `+` to combine existing or new lists into a new list.

In [44]:
print(blist)
print(clist)
all = blist + ['id', 'name', 'major']
print(all)

['lec23', 'lec24', 'lec28']
['one', 2, 3.14, 'five', 6, 4]
['lec23', 'lec24', 'lec28', 'id', 'name', 'major']


Note that this is very different from

In [58]:
all = [blist,'id', 'name', 'major']
print(all)

[['lec23', 'lec24', 'lec28'], 'id', 'name', 'major']


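An efficient way to grow an existing list is `extend`, which takes an iterable (e.g. a list or a tuple) and adds each of its elements in place. In contrast, `append` adds its argument as one single element.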

In [59]:
print(all.index('id'))
print(all[-1])
print(all[-3])

1
major
id


In [60]:
all.extend([2,3,4, 'test', 'python'])
print(all)
all.append(4.56)
print(all)
all.extend( (2,3))
print(all)
all.append( (2,3))
print(all)
print(all[ all.index( (2,3)) ][1])
print(all[-1][1])

[['lec23', 'lec24', 'lec28'], 'id', 'name', 'major', 2, 3, 4, 'test', 'python']
[['lec23', 'lec24', 'lec28'], 'id', 'name', 'major', 2, 3, 4, 'test', 'python', 4.56]
[['lec23', 'lec24', 'lec28'], 'id', 'name', 'major', 2, 3, 4, 'test', 'python', 4.56, 2, 3]
[['lec23', 'lec24', 'lec28'], 'id', 'name', 'major', 2, 3, 4, 'test', 'python', 4.56, 2, 3, (2, 3)]
3
3
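
The difference between `+`, `extend` and `append` is easier to see on a small list. In this sketch the names `base`, `alias` and `combined` are made up; `alias` is a second reference to the same list, so it reveals which operations modify the list in place.

```python
base = [1, 2]
alias = base               # second name for the same list

combined = base + [3, 4]   # `+` builds a new list; base is unchanged
print(base, combined)      # [1, 2] [1, 2, 3, 4]

base.extend([3, 4])        # extend adds each element of the iterable in place
print(alias)               # [1, 2, 3, 4]

base.append([5, 6])        # append adds its argument as one single element
print(alias)               # [1, 2, 3, 4, [5, 6]]
```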


### sorting a list
Lists whose elements can be compared to each other can be sorted. A list that mixes incomparable types, such as strings and lists, cannot.

In [52]:
all.sort()
print(all)


TypeError: '<' not supported between instances of 'str' and 'list'
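
One possible way around this error, shown here only as a sketch on a made-up list `mixed`, is to give the sort a `key` function that maps every element to a comparable value, for example its string form. Whether sorting by string form is meaningful depends on the data.

```python
mixed = ['id', 2, 'name', 3.14]

try:
    sorted(mixed)                 # int and str cannot be compared with '<'
except TypeError as err:
    print("caught:", err)

by_text = sorted(mixed, key=str)  # compare the string form of each element
print(by_text)                    # [2, 3.14, 'id', 'name']
```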

In [53]:
months = ['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']
print(months)

['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']


In [54]:
months.sort()
print(months)

['april', 'august', 'december', 'february', 'january', 'july', 'june', 'march', 'may', 'november', 'october', 'september']


In [61]:
months.sort(key=len)
print(months)

['may', 'july', 'june', 'april', 'march', 'august', 'january', 'october', 'december', 'february', 'november', 'september']


In [62]:
help(list.sort)

Help on method_descriptor:

sort(self, /, *, key=None, reverse=False)
    Stable sort *IN PLACE*.



In [64]:
months.sort(key=len,reverse=True)
print(months)

['september', 'december', 'february', 'november', 'january', 'october', 'august', 'april', 'march', 'july', 'june', 'may']


### sort vs sorted
In the examples above `sort()` is applied to the list itself and modifies it in place. Instead, we might prefer to keep the data intact and get a new sorted copy with `sorted()`.

In [65]:
months = ['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']
print(months)
sorted_months_byname = sorted(months)
print(sorted_months_byname)
sorted_months_bylen = sorted(months, key=len)
print(sorted_months_bylen)
help(sorted)

['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']
['april', 'august', 'december', 'february', 'january', 'july', 'june', 'march', 'may', 'november', 'october', 'september']
['may', 'june', 'july', 'march', 'april', 'august', 'january', 'october', 'february', 'november', 'december', 'september']
Help on built-in function sorted in module builtins:

sorted(iterable, /, *, key=None, reverse=False)
    Return a new list containing all items from the iterable in ascending order.
    
    A custom key function can be supplied to customize the sort order, and the
    reverse flag can be set to request the result in descending order.



### lists and strings

In [66]:
chars = list("in a far away galaxy")
print(chars)
chars.count(' ')


['i', 'n', ' ', 'a', ' ', 'f', 'a', 'r', ' ', 'a', 'w', 'a', 'y', ' ', 'g', 'a', 'l', 'a', 'x', 'y']


4

### enumerate function
`enumerate` is a useful Python function to keep track of the index while iterating over a collection, e.g. a list.

See how the `for` loop takes advantage of `enumerate`:

In [67]:
for i,m in enumerate(months):
    print("month %-2d: %s"%(i+1,m))

month 1 : january
month 2 : february
month 3 : march
month 4 : april
month 5 : may
month 6 : june
month 7 : july
month 8 : august
month 9 : september
month 10: october
month 11: november
month 12: december
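
Instead of adding 1 by hand, `enumerate` accepts a `start` argument. The sketch below uses a shortened `months` list for illustration.

```python
months = ['january', 'february', 'march']

for number, name in enumerate(months, start=1):
    print("month %-2d: %s" % (number, name))

print(list(enumerate(months, start=1)))
# [(1, 'january'), (2, 'february'), (3, 'march')]
```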


### slicing
One of the most popular features for data analysis with Python is the ability to access a subset of a collection by specifying a range of indices. A slice `[start:stop]` includes the element at `start` and excludes the one at `stop`.

In [68]:
print(months[:3])

['january', 'february', 'march']


In [69]:
print(months[4:6])

['may', 'june']


In [70]:
print(months[5:])
print(len(months[6:]))

['june', 'july', 'august', 'september', 'october', 'november', 'december']
6


In [71]:
print(months[:-2])
print(months[-2:])
print(months[-6:-2])

['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october']
['november', 'december']
['july', 'august', 'september', 'october']
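
A slice can also take a third value, the step, and it tolerates bounds beyond the end of the list. A short sketch on a shortened `months` list:

```python
months = ['january', 'february', 'march', 'april', 'may', 'june']

print(months[::2])     # ['january', 'march', 'may']
print(months[1:5:2])   # ['february', 'april']
print(months[::-1])    # reversed copy, starting from 'june'
print(months[2:100])   # ['march', 'april', 'may', 'june'], no IndexError
```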


### references and lists
In Python, variables are references to objects: assigning a list to a new name does not copy it. This is shown explicitly in this example.

In [72]:
newlist = months
print(newlist)

['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december']


In [73]:
newlist.append('NewMonth')
print(months)

['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december', 'NewMonth']


so `newlist` __is not a new copy__. `newlist` and `months` are simply two references to the same list object!

To get a new copy you have to use an explicit conversion.

In [75]:
newlist = list(months)
newlist.append('CrazyMonth')
print(months)
print(newlist)

['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december', 'NewMonth']
['january', 'february', 'march', 'april', 'may', 'june', 'july', 'august', 'september', 'october', 'november', 'december', 'NewMonth', 'CrazyMonth']
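
Note that `list(...)` makes a shallow copy: the outer list is new, but nested objects are still shared. When the list contains other lists, `copy.deepcopy` from the standard library copies those as well. A minimal sketch with a made-up nested list:

```python
import copy

nested = [['jan', 'feb'], ['mar', 'apr']]

shallow = list(nested)          # new outer list, same inner lists
deep = copy.deepcopy(nested)    # inner lists are copied as well

nested[0].append('CHANGED')
print(shallow[0])   # ['jan', 'feb', 'CHANGED']
print(deep[0])      # ['jan', 'feb']
```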


# Using a list for plotting

## motion of a body under gravity
We want to simulate the motion of a body under gravity. This is one of the first exercises in __Laboratorio di Calcolo__. This time we also want to quickly plot the trajectory to check our equations.

In [77]:
%matplotlib notebook
import matplotlib.pyplot as plt
import math


# initial conditions
g = 9.8
h = 0.
theta = (45./180.)*math.pi
v0 = 1.
dt=0.01
        
#compute velocity components
v0x = v0*math.cos(theta)
v0y = v0*math.sin(theta)
print("v0_x: %.3f m/s \t v0_y: %.3f m/s"%(v0x,v0y))

t=0.
x=[]
y=[]
xi=0
yi=h

while yi>=0:
    x.append(xi)
    y.append(yi)
    t+=dt
    xi=v0x*t
    yi=h+v0y*t-0.5*g*t*t

#print(x,y)
plt.plot(x,y)

v0_x: 0.707 m/s 	 v0_y: 0.707 m/s



[<matplotlib.lines.Line2D at 0x10fe3a048>]
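
The simulated trajectory can be checked against the closed-form solution of the same equations. The helper `analytic_range` below is a hypothetical function written for this check, not part of the lecture code; it solves `h + v0y*t - 0.5*g*t*t = 0` for the time of flight and returns the horizontal distance.

```python
import math

def analytic_range(v0, theta_deg, h=0.0, g=9.8):
    """Hypothetical helper: horizontal distance travelled until y returns to 0."""
    theta = math.radians(theta_deg)
    v0x = v0 * math.cos(theta)
    v0y = v0 * math.sin(theta)
    # positive root of h + v0y*t - 0.5*g*t*t = 0
    t_flight = (v0y + math.sqrt(v0y**2 + 2.0 * g * h)) / g
    return v0x * t_flight

print("range: %.4f m" % analytic_range(1.0, 45.0))   # range: 0.1020 m
```

The last `x` value stored by the simulation loop should be close to this number, within one step of size `v0x*dt`.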

To make it more flexible, we could ask the user to provide the initial conditions.

In [78]:
%matplotlib notebook
import matplotlib.pyplot as plt
import math

# initial conditions
g = 9.8
h = 0.

v0 = 10.
dt=0.1

theta = 45
while True:
    theta = float(input("angle theta in [0,90] degrees: "))
    if(theta>0 and theta<90): break
theta = (theta/180.)*math.pi

#compute velocity components
v0x = v0*math.cos(theta)
v0y = v0*math.sin(theta)
print("v0_x: %.3f m/s \t v0_y: %.3f m/s"%(v0x,v0y))

t=0.
x=[]
y=[]
xi=0
yi=h

while yi>=0:
    x.append(xi)
    y.append(yi)
    t+=dt
    xi=v0x*t
    yi=h+v0y*t-0.5*g*t*t

#print(x,y)
plt.plot(x,y)

angle theta in [0,90] degrees: 67
v0_x: 3.907 m/s 	 v0_y: 9.205 m/s



[<matplotlib.lines.Line2D at 0x11014a4a8>]

__To make it more user friendly we could provide a default value for the angle!__

We do this by setting a default value, which is kept when the user presses return without typing anything.

In [80]:
%matplotlib notebook
import matplotlib.pyplot as plt
import math

# initial conditions
g = 9.8
h = 0.

v0 = 10.
dt=0.01

theta = 23.
while True:
    x = input("angle theta in [0,90] degrees (press return for {0} degree): ".format(theta))
    if x == "" : break
    theta  = float(x)
    if(theta>0 and theta<90): break
theta = (theta/90.)*math.pi/2.
 
#compute velocity components
v0x = v0*math.cos(theta)
v0y = v0*math.sin(theta)
print("v0_x: %.3f m/s \t v0_y: %.3f m/s"%(v0x,v0y))

t=0.
x=[]
y=[]
xi=0
yi=h

while yi>=0:
    x.append(xi)
    y.append(yi)
    t+=dt
    xi=v0x*t
    yi=h+v0y*t-0.5*g*t*t

#print(x,y)
plt.plot(x,y)

angle theta in [0,90] degrees (press return for 23.0 degree): 
v0_x: 9.205 m/s 	 v0_y: 3.907 m/s



[<matplotlib.lines.Line2D at 0x1105cb908>]

We now make all the variables configurable by the user.

In [81]:
%matplotlib notebook
import matplotlib.pyplot as plt
import math

g = 9.8

h = 0.
while True:
    x = input("initial height h in m: (press return for h = 0 m): ")
    if x == "":  break
    h = float(x)
    if(h>=0): break


theta = 20.
while True:
    x = input("angle theta in [0,90] degrees (press return for {0} degree): ".format(theta))
    if x == "" : break
    theta  = float(x)
    if(theta>0 and theta<90): break
theta = (theta/90.)*math.pi/2.
        

v0 = 10.
while True:
    x = input("insert v_0 > 0 in m/s (press return for {0} m/s): ".format(v0))
    if x == "":  break
    v0 = float(x)
    if(v0>0): break

dt=0.1
while True:
    x = input("insert dt > 0 in sec (press return for {0} sec): ".format(dt))
    if x == "": break
    dt = float(x)
    if(dt>0): break
      
        
v0x = v0*math.cos(theta)
v0y = v0*math.sin(theta)

print("v0_x: %.3f m/s \t v0_y: %.3f m/s"%(v0x,v0y))

t=0.
x=[]
y=[]
xi=0
yi=h

while yi>=0:
    x.append(xi)
    y.append(yi)
    t+=dt
    xi=v0x*t
    yi=h+v0y*t-0.5*g*t*t

#print(x,y)
plt.plot(x,y)

# we also make the plot nicer
plt.title('motion under gravity')
plt.xlabel("x [m]")
plt.ylabel("y [m]")
plt.grid(True)
plt.xlim(-0.1, max(x)*1.1)
plt.ylim(-0.1,max(y)*1.10)


initial height h in m: (press return for h = 0 m): 3
angle theta in [0,90] degrees (press return for 20.0 degree): 56
insert v_0 > 0 in m/s (press return for 10.0 m/s): 3
insert dt > 0 in sec (press return for 0.1 sec): 0.001
v0_x: 1.678 m/s 	 v0_y: 2.487 m/s



(-0.1, 3.6471580533156365)
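
The four input loops above repeat the same pattern, so they could be factored into one function. The sketch below is one possible way to do it; `ask_float`, `is_valid` and `reader` are names invented here. Unlike the original loops, it also asks again when the answer is not a number instead of stopping with a `ValueError`.

```python
def ask_float(prompt, default, is_valid, reader=input):
    """Hypothetical helper: ask for a float until is_valid(value) is True.

    An empty answer keeps the default. `reader` is only a parameter so the
    function can be tried without a keyboard.
    """
    while True:
        answer = reader("{0} (press return for {1}): ".format(prompt, default))
        if answer == "":
            return default
        try:
            value = float(answer)
        except ValueError:
            continue            # not a number: ask again
        if is_valid(value):
            return value

# usage sketch with canned answers instead of real keyboard input
answers = iter(["abc", "-5", "56"])
theta = ask_float("angle theta in (0,90) degrees", 20.0,
                  lambda v: 0 < v < 90, reader=lambda prompt: next(answers))
print(theta)   # 56.0
```

In the notebook, the default `reader=input` would be used, e.g. `v0 = ask_float("insert v_0 > 0 in m/s", 10., lambda v: v > 0)`.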