# Containers   
Fisica Terrestre  
7/10/2022  
Marco Scuderi

To compute with data we need containers to hold them, so that we can then manipulate and analyze them.  
Understanding which container is **best** for your data and your analysis is fundamental to obtaining good results and reducing computational time.  
There are many different types of containers:  
- Lists  
- Dictionaries  
- Pandas Series/DataFrame  
- Arrays  
- Record arrays  
- And many others ... 

### But what is the basic concept behind a container?  
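A container is an object that holds a collection of items. Many containers, such as lists and arrays, keep their items in an ordered sequence.  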
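
As a rough sketch of why the choice of container matters, the same three measurements can be stored in a list (accessed by position) or in a dictionary (accessed by name). The station names and values below are invented for illustration.

```python
# Hypothetical readings from three stations (names and values are invented)
readings_list = [2.1, 3.4, 1.8]                            # accessed by position
readings_dict = {"STA1": 2.1, "STA2": 3.4, "STA3": 1.8}    # accessed by key

print(readings_list[1])         # second reading, found by its position
print(readings_dict["STA2"])    # the same reading, found by its station name
```

Both lines print the same value; what changes is how you find it.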

### Lists  
Lists in Python are one-dimensional, ordered containers whose elements may be any Python objects. Lists are mutable and have methods for adding and removing elements to and from themselves.  
You can think of them as a box where you can put whatever you want to collect.  
**In Python, unlike in many other languages, the elements of a list do not have to match each other in type.**

Let's look at some examples:

In [None]:
my_list = [1,2,3] # this list has three items
print(type(my_list))
empty_list = []                                              # this list has no items
mixed_list = [-0.2, 300, 'a string', 5.3, True, my_list]     # this list has six items of different types; one is another list

print(mixed_list)

Note, **lists** use square brackets [] whereas **functions** use round brackets () - ***syntax!***

#### List **INDEXING**  
We can **access** the items of a list by passing an **index** to the variable name. The indices begin at 0 (the first item in a list) and increment by 1. For example:

In [None]:
my_list = [1,2,3]  
print(my_list[2])
# sum two elements of the list
print(my_list[1]+my_list[2])
var1 = 2
var2 = 3
print(var1+var2)

In [None]:
# let's look at some indexing
print(my_list[2], my_list[-1])                 # these access the same element of the list
print(my_list[1], my_list[-2])                 # so do these
print(my_list[-0])                             # -0 is the same as 0, so this is the FIRST element

Some more examples

In [None]:
print(mixed_list)
print(mixed_list[1:5])                         # all items from index 1 up to index 5 NOT including 5
print(mixed_list[:3])                          # all items from the START of the list up to index 3 NOT including 3
print(mixed_list[3:])                          # all items from index 3 up to the END of the list

In [None]:
# reverse the list 
print(mixed_list[::-1])

Is there anything you notice when performing indexing?   

In other words, what elements would you expect by running the cell below?

In [None]:
print (mixed_list[0:1])

In Python, slice indices do not point at the elements, but at the positions in between them !!
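
A small sketch of this idea, using an example list that is not part of the cells above: if the indices are read as positions between elements, the length of a slice is simply the difference between the two positions.

```python
a = [10, 20, 30, 40, 50]

#  positions:  0    1    2    3    4    5
#  elements:   | 10 | 20 | 30 | 40 | 50 |

i, j = 1, 4
part = a[i:j]                  # everything between position 1 and position 4
print(part)                    # [20, 30, 40]
print(len(part) == j - i)      # True
print(a[:2] + a[2:] == a)      # True: cutting at one position loses nothing
print(a[0:1])                  # [10]: only one element lies between positions 0 and 1
```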


In [None]:
a = [1,2,3,4,5]
a[:]

#### Some basic properties and methods of Lists 

You can **concatenate** two lists together using the addition operator (+) to form a longer list:

In [None]:
list1 = [1, 1]
list2 = [2, 3, 5]
tot = list1+list2
print(tot)

[1, 1] + [2, 3, 5] + [8]

You can also **append** to lists in-place using the append() method, which adds a single
element to the end:

In [None]:
# define a list 
fib = [1, 1, 2, 3, 5, 8]
fib

In [None]:
# append an element (here a whole list, which becomes ONE nested element)
pippo = [100,101,102]
fib.append(pippo)
fib
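
Note that append() adds its argument as a single element, even when that argument is itself a list. The sketch below contrasts it with extend(), a list method not used elsewhere in this notebook, which adds the items one by one. The variable names are illustrative.

```python
fib_a = [1, 1, 2]
fib_b = [1, 1, 2]
extra = [3, 5]

fib_a.append(extra)    # adds the list itself as ONE nested element
fib_b.extend(extra)    # adds each item of the list separately

print(fib_a)           # [1, 1, 2, [3, 5]]
print(fib_b)           # [1, 1, 2, 3, 5]
```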

You can **insert an element**

In [None]:
# insert element
fib.insert(1,'foo')
fib

Or **remove elements**

In [None]:
# remove the last element 
fib.pop(-1)
fib
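
As a complement, here is a minimal sketch of two ways to remove items, using an illustrative list: pop() removes by position and returns the removed item, while remove() deletes the first item equal to a given value.

```python
items = [1, 'foo', 2, 3]

last = items.pop()       # with no argument, pop removes and returns the last item
print(last)              # 3
print(items)             # [1, 'foo', 2]

items.remove('foo')      # remove deletes the first item equal to the given value
print(items)             # [1, 2]
```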

Or check if an element is in the list 

In [None]:
# check if an element is in the list 
'foo' in fib

#### List sorting

In [None]:
pippo = [4,3,1,2,8,9]
pippo.sort()
pippo

In [None]:
pluto = ['foo','thisislong','v']
pluto.sort(key=len)
pluto
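
The sort() method changes the list in place. When the original order must be kept, the built-in sorted() function can be used instead, since it returns a new list. A short sketch with illustrative names:

```python
values = [4, 3, 1, 2, 8, 9]

ordered = sorted(values)                     # a new list, smallest first
descending = sorted(values, reverse=True)    # a new list, largest first

print(values)        # [4, 3, 1, 2, 8, 9]  (unchanged)
print(ordered)       # [1, 2, 3, 4, 8, 9]
print(descending)    # [9, 8, 4, 3, 2, 1]
```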

## Arrays  
They are like a **list of numbers**, but all the elements must be of the same type. Arrays have special properties that can greatly help us in data handling and scientific computing.   


### Why numpy ? 
Let's say that I have two series of numbers and I want to add them together.  
If I use two lists and add them together, what will be the result?

In [None]:
a = [1,2,3]
b = [4,5,6]
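print (a+b)                  # the lists are concatenated, not added element by element
print ('This is wrong')
# with lists, an element-wise sum needs an explicit loop
results=[]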
for i, j in zip(a,b):
    results.append(i+j)
print(results)

An **array** is a generic multidimensional container for homogeneous data. This means that all the elements must be of the same type (e.g. int, float etc.).  
Arrays have a **dtype** attribute that shows the type of the elements the array is composed of.

In [None]:
import numpy as np 
a = np.array([1,2,3,4])
b = np.array([1.0,2.,3.,4.])
print (type(a))
print (a.dtype)
print (b.dtype)
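
Because all elements must share one type, NumPy picks a common type when the input is mixed. A minimal sketch (the variable names are illustrative):

```python
import numpy as np

mixed = np.array([1, 2, 3.5])                # one float among integers
print(mixed.dtype)                           # float64: the integers were converted

forced = np.array([1, 2, 3], dtype=float)    # ask for floats explicitly
print(forced.dtype)                          # float64
```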
  

When using arrays be careful with the type of the elements

In [None]:
a[0] = 10
print (a)
a[0] = 11.7 # !! the element is truncated at the decimal place because it has to be an integer !!
print (a)
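
One way to avoid this silent truncation is to convert the array to floats before assigning, as in the sketch below. astype() returns a new array, so the original integer array is left as it was. The names are illustrative.

```python
import numpy as np

a_int = np.array([1, 2, 3, 4])
a_float = a_int.astype(float)    # a new array with floating-point elements

a_float[0] = 11.7                # the decimal part is kept
print(a_float[0])                # 11.7
print(a_int)                     # [1 2 3 4]  (the original is untouched)
```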

## Creating arrays #
There are many functions that allow us to create arrays:  
- array([])
- arange(start,stop,step,dtype)
- linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
- zeros(shape, dtype=)
- ones(shape,dtype=)

### arange(start,stop,step,dtype)

In [None]:
a = np.arange(0,10,0.1)
print(a.dtype)

### linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)


In [None]:
np.linspace(0,10,100)
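
The two functions answer different questions: arange fixes the step and excludes the stop value, while linspace fixes the number of points and, by default, includes the stop value. A short sketch with illustrative names:

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)      # fixed step; the stop value is excluded
by_count = np.linspace(0, 1, 5)      # fixed number of points; the stop value is included

print(by_step)     # [0.   0.25 0.5  0.75]
print(by_count)    # [0.   0.25 0.5  0.75 1.  ]

# retstep=True also returns the spacing between the points
points, step = np.linspace(0, 10, 11, retstep=True)
print(step)        # 1.0
```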


### zeros(shape, dtype=)


In [None]:
np.zeros(10)

In [None]:
np.zeros((10,3))

In [None]:
np.ones((2,3))
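
Both functions return floats unless a dtype is given, and a multi-dimensional shape is passed as a tuple. A minimal sketch with illustrative names:

```python
import numpy as np

counts = np.zeros(4, dtype=int)    # integer zeros instead of the default floats
grid = np.ones((2, 3))             # the shape is a tuple: 2 rows, 3 columns

print(counts)                      # [0 0 0 0]
print(np.zeros(4).dtype)           # float64 (the default)
print(grid.shape)                  # (2, 3)
```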

**Arrays can have multiple dimensions; NumPy is not limited to one- or two-dimensional arrays.**

In [None]:
a = np.array([1,2,3,4])
print (a)
a.ndim

To show the number of elements along each dimension we use the attribute ***shape***.  
It always returns a tuple with the number of elements along each axis.

In [None]:
a.shape

In [None]:
a = np.arange(0,9)
print(a)
b = a.reshape(3,3)
print (b)
b.shape
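
A reshape only works when the new shape holds exactly the same number of elements. The sketch below, with illustrative names, also shows that one dimension can be given as -1 so that NumPy works it out.

```python
import numpy as np

flat = np.arange(12)

three_by_four = flat.reshape(3, 4)    # 3 * 4 == 12, so this works
two_rows = flat.reshape(2, -1)        # -1 lets NumPy compute the missing dimension
print(three_by_four.shape)            # (3, 4)
print(two_rows.shape)                 # (2, 6)

try:
    flat.reshape(5, 3)                # 5 * 3 != 12, so this raises an error
except ValueError as err:
    print("reshape failed:", err)
```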


## Perform arithmetic with arrays element-wise #  
Unlike lists, arrays are designed to perform element-wise operations.  
This is called **vectorized computation**: instead of writing an explicit for loop as before, the loop is implicit (it happens behind the scenes).  
It can be done between two arrays, or between an array and a constant.


In [None]:
a = np.arange(0,5)
print (a)
f = np.array([20,21,22,23,24])

# operations between arrays
print (a+f)
print (a*f)
print (a/f)
print (a**f) # raise to the power 

# operations between arrays and constants
print (f+10)
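
To make the implicit loop visible, the sketch below computes the same sum twice, once with an explicit loop and once with a single vectorized expression, and checks that the results agree. The variable names are illustrative.

```python
import numpy as np

x = np.arange(0, 5)
y = np.array([20, 21, 22, 23, 24])

# explicit loop, as we had to do with lists
loop_sum = []
for xi, yi in zip(x, y):
    loop_sum.append(xi + yi)

# vectorized: one expression, no visible loop
vector_sum = x + y

print(vector_sum)                              # [20 22 24 26 28]
print(np.array_equal(loop_sum, vector_sum))    # True
```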


## Array indexing ##
As we have seen before we can access different elements of an array.  
Arrays are mutable containers !!  
For one-dimensional arrays the indexing works the same way as for lists.

Extract a portion of an array by specifying a lower and upper bound.  
As always the lower element is included but the upper element is excluded.  
It works the same way as for lists as we have seen before.  
**array[lower:upper:step]**  

In [None]:
a = np.arange(10)
print (a)
a[3:6]
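
One difference from lists is worth knowing: a slice of an array is a view on the same data, so changing the slice changes the original array, whereas a slice of a list is an independent copy. A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.arange(10)
window = arr[3:6]         # a view: it shares its data with arr
window[0] = 99
print(arr[3])             # 99: the original array has changed

lst = list(range(10))
piece = lst[3:6]          # a copy: independent of lst
piece[0] = 99
print(lst[3])             # 3: the original list has not changed

safe = arr[3:6].copy()    # use copy() when the original must stay as it is
```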

**Multi-dimensional arrays**  
These are very important because data retrieved from an instrument typically contain at least two columns, for example time and the measured quantity.  

In [None]:
# 2-dimensional array
a = np.linspace(1,10,10).reshape(2,5)
a

**The shape of this array is given by**  

In [None]:
a.shape

**Access and set elements of a multi-dimensional array**  
**array[row,column]**  
Think of them as coordinates in a two-dimensional object.  


In [None]:
print (a)
a[0,-2] # access a single element: first row, second-to-last column

In [None]:
a[1,1]
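
The same [row,column] notation is used to set elements. A minimal sketch on an array built like the one above (the name `table` is illustrative):

```python
import numpy as np

table = np.linspace(1, 10, 10).reshape(2, 5)

table[0, 0] = -1.0       # set one element: first row, first column
table[1, -1] = 100.0     # set the last element of the second row

print(table[0, 0])       # -1.0
print(table[1, -1])      # 100.0
print(table[1])          # a single index returns the whole row
```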

## Array slicing ##
As a reminder, a slice extracts a portion of an array between a lower and an upper bound: the lower element is included, the upper element is excluded, exactly as for lists.  
**a[lower:upper:step]**  

In [None]:
# just a reminder
a = np.arange(0,10,1)
print (a)
print (a[::2])  #even number 
print (a[::-1]) #reverse array
print (a[1::2]) #odd numbers

Now let's see how **2-dimensional array slicing** works

In [None]:
a = np.arange(0,12,1).reshape(3,4)
a

In [None]:
# take the two middle elements of the middle row 
a[1,1:3]

In [None]:
# take the last two elements of the third row (your answer)
a[2,2:4]

In [None]:
# take the same column (here the third) from all the rows 
a[:,2]
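
Column slicing is how the time and the measured quantity can be separated in a two-column record. The values below are invented for illustration and the names are hypothetical.

```python
import numpy as np

# Hypothetical record: first column is time (s), second is the measured quantity
record = np.array([[0.0, 1.2],
                   [0.5, 1.9],
                   [1.0, 2.4],
                   [1.5, 2.2]])

time = record[:, 0]           # all rows, first column
quantity = record[:, 1]       # all rows, second column
first_two = record[:2, :]     # first two rows, all columns

print(time)                   # [0.  0.5 1.  1.5]
print(quantity.max())         # 2.4
print(first_two.shape)        # (2, 2)
```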

# Visualizing data with **Matplotlib** #

In general, if you produce or analyze data, you need to communicate the results to an audience (a colleague, the scientific community, your employer, etc.) in an effective way.  

The gold standard for scientific communication is the ***figure***: a presentation of your findings in pictorial form. The goal of making *figures* is to show your data clearly, so that the reader can follow your ideas.  

Also, when you analyze large datasets you need to visualize them directly (looking at hundreds of thousands of numbers is not useful), so you will want to make *figures*. 


In [None]:
import matplotlib.pyplot as plt
import numpy as np

Make a plot of a straight line

In [None]:
q = 2
m = 4
x = np.arange(0,10,1)

y = m*x+q

fig = plt.figure()
plt.plot(x,y)

In [None]:
x = np.linspace(start=0,stop=100, num = 100)   # create an array with 100 samples between 0 and 100
m = 2
q = 5
y = m*x + q
plt.plot(x,y, label = 'Uniform linear motion', c = 'r', linestyle = 'dashed')
plt.xlabel('Time', fontsize = 14)
plt.ylabel('Distance', fontsize = 14)
plt.legend(loc='best')
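
The same plot can also be written with Matplotlib's object-oriented interface, where the figure and the axes are explicit objects. This is an alternative sketch, not the style used in the rest of this notebook, and the label text is only an example.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 100, 100)
m, q = 2, 5

fig, ax = plt.subplots()             # one figure containing one set of axes
ax.plot(x, m * x + q, color='r', linestyle='dashed', label='y = m*x + q')
ax.set_xlabel('Time', fontsize=14)
ax.set_ylabel('Distance', fontsize=14)
ax.legend(loc='best')
```

Working with `ax` directly becomes convenient when a figure holds several panels, because each panel has its own axes object.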

### Plotting some simple functions

Plot an exponential function

In [None]:
plt.figure(num=1, figsize=(10,5))
x = np.linspace(start=-10, stop=4, num=100)
y1 = np.exp(x) # exponential
plt.plot(x,y1, color = 'k', linestyle='dashdot', marker='o' )
plt.title('Exponential: y = e$^x$', fontsize=20)
plt.axhline(y=0, color='r')
plt.axvline(x=0, color='r')
plt.ylim(-10,20)

In [None]:
plt.figure(num=2, figsize=(10,5))
x=np.arange(0.1,5,0.1)
y = np.exp(1.0/x)             # exponential of 1/x
plt.plot(x,-y, color = 'k', linestyle='dashdot', marker='o' )   # note: -y is plotted
plt.title('Exponential: y = -e$^{(1/x)}$', fontsize=20)
plt.axhline(y=0, color='r')
plt.axvline(x=0, color='r')
plt.ylim(-20,0)

Square root

In [None]:
plt.figure(3, figsize=(10,5))
x1 = np.arange(0,10,0.1)   # start, stop, step
y2 = np.sqrt(x1) # square root
plt.plot(x1,y2, color = 'k', linestyle='dashdot', marker='o' )
plt.title('Square root', fontsize=20)
plt.axhline(y=0, color='r')
plt.axvline(x=0, color='r')

Logarithm 

In [None]:
plt.figure(4, figsize=(10,5))
x = np.arange(0.01,4,0.1)
y = np.log(x)              # natural logarithm
plt.plot(x,y, color = 'k', linestyle='dashdot', marker='H' )
plt.title('Logarithm', fontsize=20)
plt.axhline(y=0, color='r')
plt.axvline(x=0, color='r')

Parabola

In [None]:
plt.figure(5, figsize=(10,5))
a = 1
b = 2
c = 3
x = np.arange(-10,10,0.5)
y = a +b*x +c*(x**2)              # parabola
plt.plot(x,y, color = 'k', linestyle='dashdot',marker='o'  )
plt.title('Parabola', fontsize=20)
plt.axhline(y=0, color='r')
plt.axvline(x=0, color='r')
plt.ylim(-2,80)

# <font color='blue'>Summary </font>

Variables: pippo = 2 (int), pippo = 'cane' (str)

Numbers: int, float, complex

String & indexing: p = 'Seismogram', p[2:4] gives 'is'

List: everything = [a, b, c, d,1, 2, 3, "hello"]
    
    operation with list: a = [1,2,3], b = [2,3,3], a+b= [1,2,3,2,3,3]
    
    list sorting: pippo = [4,3,1,2,8,9], pippo.sort(), [1,2,3,4,8,9]

Numpy module
    
    a = np.array([1,2,3]); b = np.array([2,3,4]), (a+b)= [3 5 7]
    
    a = np.arange(0,10,1)
    
    a = np.arange(1,10,1), a[3:6] = [4,5,6]
    
    Slicing:
    a = [[0. 1. 2. 3. 4.]
    [5. 6. 7. 8. 9.]]   a[1] (access a row), a[1,2] (access an element)

Matplotlib module

Plotting simple functions