<center><font size = "10"> Week 1 - Python 3 </font></center>
<center><font size = "8">Tutorial 02: Useful Modules and Packages</font></center>

# Module and Package

<font size = "3"> We don't usually store all of our files in the same location on our computer. Instead, we use a well-organized hierarchy of directories and keep similar files together: for example, we might keep our music collection in a "music" directory. Analogously, Python has __packages__ (like directories) and __modules__ (like files). As an application grows, we place related modules in the same package and unrelated modules in different packages. This keeps a project easy to manage and conceptually clear.
    
<font size = "3">- A __module__ is a file containing Python definitions and statements. The file name is the module name with the suffix ".py" appended.
  
<font size = "3">- __Packages__ are a way of structuring Python’s module namespace using “dotted module names”. For example, the module name `A.B` designates a submodule named `B` in a package named `A`.
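
The relationship between files, directories, and dotted names can be sketched with a throwaway package. The names `demo_shapes`, `area`, and `square` are hypothetical and exist only for this illustration; the sketch writes the files to a temporary directory and then imports them.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk:
#   demo_shapes/            <- the package (a directory)
#       __init__.py
#       area.py             <- the module (a file)
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_shapes"          # hypothetical package name
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "area.py").write_text("def square(side):\n    return side * side\n")

# Make the temporary directory importable
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Dotted module name: module "area" inside package "demo_shapes"
import demo_shapes.area as area

print(area.__name__)      # the full dotted name of the module
print(area.square(3))     # call a function defined in the module
```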

<font size="3">Let's see some important packages for this course and learn how to use them. Much more information can be found online at the links provided or elsewhere.

## PANDAS

<font size ="3">Pandas is a powerful package for data management in Python: https://pandas.pydata.org/pandas-docs/stable/

<font size ="3">Example: creating a pandas DataFrame, which takes the form of a table (similar to an Excel table):

In [None]:
"""
Import the pandas module
"as" lets us give the module a shorthand name
Most Python programmers use the same conventional shorthands (e.g. pd for pandas)
"""
import pandas as pd


"""
Initialise a dictionary of lists
"""
data = {'Name':['Tom', 'nick', 'krish', 'jack'], 
        'Age':[20, 21, 19, 18]} 


"""
Create the dataframe
"""
df = pd.DataFrame(data) 


"""
Print the dataframe
"""
print(df)

In [None]:
"""
Simpler (reduced) visualization of the dataframe with head()
"""

print(df.head())     # shows by default the first 5 rows of the data
print (df.head(1))   # shows only the first row

In [None]:
"""
Check data:
- shape (number of rows, number of columns)
- length (number of rows)
- ndim (number of axes, which is 2 for a dataframe)
"""

print (df.shape)
print (len(df))
print (df.ndim)

In [None]:
"""
Select a column
"""
df['Age']

In [None]:
"""
Select a specific row by its index label
"""
df.loc[1]      

In [None]:
"""
Select row and column
"""

df.loc[1,'Name'] # 1 is the row index, 'Name' is the column name
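
`.loc` selects by index label. In the example above the labels happen to coincide with the row positions because the default index is 0, 1, 2, 3. A minimal sketch of the difference, assuming a copy of the same data with made-up labels 10, 20, 30, 40 and using `.iloc` for position-based selection:

```python
import pandas as pd

# Same data as above, but with non-default row labels (illustrative only)
df_labeled = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
                           'Age': [20, 21, 19, 18]},
                          index=[10, 20, 30, 40])

by_label = df_labeled.loc[20, 'Name']     # row whose label is 20
by_position = df_labeled.iloc[1, 0]       # second row, first column

print(by_label, by_position)              # both refer to the same cell
```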

In [None]:
"""
Use a column as the row index
(set_index returns a new dataframe; df itself is unchanged)
"""
df.set_index('Name')
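
As a small sketch of what `set_index` does: the result has to be assigned to a name to be kept, and afterwards rows can be selected by the new labels. The variable name `df_by_name` is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Tom', 'nick', 'krish', 'jack'],
                   'Age': [20, 21, 19, 18]})

# Keep the re-indexed table under a new name
df_by_name = df.set_index('Name')

print(list(df.columns))                   # the original still has 'Name' as a column
print(df_by_name.loc['krish', 'Age'])     # rows are now looked up by name
```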

In [None]:
"""
Operate on groups (groupby operation)
"""
import pandas as pd

"""
Create dataframe
"""
data = pd.DataFrame({
    'height (cm)': [100, 112, 132, 115, 200, 184, 192, 153, 180],
    'age': [10, 12, 10, 14, 20, 20, 50, 20, 50],
    'likes bananas': [True, False, True, True, False, False, True, True, True]
})

"""
Iterate through elements grouped by value of "likes bananas"
"""
for likes_bananas, subdata in data.groupby('likes bananas'):
    print(subdata)

"""
Calculate mean height by age
"""
data.groupby('age')['height (cm)'].mean()
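
Several statistics can be computed per group in one call with `agg`. The sketch below reuses the same data; the choice of `count`, `mean`, and `max` is only an example.

```python
import pandas as pd

data = pd.DataFrame({
    'height (cm)': [100, 112, 132, 115, 200, 184, 192, 153, 180],
    'age': [10, 12, 10, 14, 20, 20, 50, 20, 50],
    'likes bananas': [True, False, True, True, False, False, True, True, True]
})

# One row per age, one column per statistic of the height
summary = data.groupby('age')['height (cm)'].agg(['count', 'mean', 'max'])
print(summary)
```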

## NUMPY

<font size ="3">NumPy is the fundamental Python package for scientific computing and array processing: https://numpy.org


In [None]:
"""
Examples of array creation
"""

import numpy as np 
  
"""
Create an array from a list, with type float
"""
a = np.array([[1, 2, 4], [5, 8, 7]], dtype = 'float') 
print ("Array created from list:\n", a) 
    
"""
Create a 3x4 array with all zeros
"""
b = np.zeros((3, 4)) 
print ("\nAn array initialized with all zeros:\n", b) 
    
"""
Create an array with random values 
"""
c = np.random.random((2, 2)) 
print ("\nA random array:\n", c) 
  
"""
Create a sequence of integers from 0 up to (but not including) 30 in steps of 5
"""
d = np.arange(0, 30, 5) 
print ("\nA sequential array with steps of 5:\n", d) 
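
Because `np.arange` excludes its stop value, the array above ends at 25. A short sketch of two ways to include the endpoint, either by raising the stop value or by using `np.linspace`, which takes the number of points instead of the step:

```python
import numpy as np

d = np.arange(0, 30, 5)       # stop value 30 is excluded
e = np.arange(0, 31, 5)       # raise the stop value to include 30
f = np.linspace(0, 30, 7)     # 7 evenly spaced floats, both endpoints included

print(d)
print(e)
print(f)
```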

In [None]:
"""
Basic operations on a single array 
"""

import numpy as np 
  
a = np.array([1, 2, 5, 3]) 
  
"""
Add 1 to every element 
"""
print ("Adding 1 to every element:", a+1) 
    
"""
Multiply each element by 10 
"""
print ("Multiplying each element by 10:", a*10) 
    
"""
Transpose of array 
"""
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]]) 
  
print ("\nOriginal array:\n", a) 
print ("Transpose of array:\n", a.T) 

In [None]:
# Unary operators in numpy 

import numpy as np 
  
arr = np.array([[1, 5, 6], 
                [4, 7, 2], 
                [3, 1, 9]]) 
  
# Maximum element of array 
print ("Largest element is:", arr.max()) 

print ("\nRow-wise maximum elements:", 
                    arr.max(axis = 1)) 
    
# Sum of array elements 
print ("\nSum of all array elements:", 
                            arr.sum()) 
  
# Cumulative sum along each row 
print ("\nCumulative sum along each row:\n", 
                        arr.cumsum(axis = 1)) 
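
The `axis` argument names the axis that is collapsed by the operation: `axis=0` works down the rows and gives one value per column, while `axis=1` works across the columns and gives one value per row. A brief sketch with the same array:

```python
import numpy as np

arr = np.array([[1, 5, 6],
                [4, 7, 2],
                [3, 1, 9]])

col_max = arr.max(axis=0)     # one value per column
row_max = arr.max(axis=1)     # one value per row
col_sum = arr.sum(axis=0)     # column totals

print(col_max)
print(row_max)
print(col_sum)
```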

In [None]:
# Binary operators in Numpy 

import numpy as np 
  
a = np.array([[1, 2], 
            [3, 4]]) 
b = np.array([[4, 3], 
            [2, 1]]) 
  
# Add arrays 
print ("Array sum:\n", a + b) 
  
# Multiply arrays (elementwise multiplication) 
print ("\nArray multiplication:\n", a*b) 
  
# Matrix multiplication 
print ("\nMatrix multiplication:\n", a.dot(b)) 
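
For 2D arrays, the `@` operator performs the same matrix multiplication as `dot`. The sketch below compares it with elementwise multiplication on the same two arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

elementwise = a * b     # multiplies matching entries
product = a @ b         # matrix product, same result as a.dot(b) for 2D arrays

print(elementwise)
print(product)
```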

## MATPLOTLIB

<font size ="3">Matplotlib is a Python 2D plotting library: https://matplotlib.org

In [None]:
# Easy example

import matplotlib.pyplot as plt     # here we import a specific submodule of matplotlib
                                    # and give it a shorthand name

# IPython magic command: plots will be generated inline instead of in a separate window.
# See http://ipython.org/ipython-doc/dev/interactive/magics.html
%matplotlib inline

x = list(range(0, 50, 2))
y = list(range(0, 25, 1))

plt.scatter(x, y, c="g", alpha=0.5, marker='o', label="Luck")
plt.xlabel("Leprechauns")
plt.ylabel("Gold")
plt.legend(loc='upper left')
plt.show()

<font size ="3">Now a more complex example: loading data from a file and plotting it using __NumPy__ and __Matplotlib__:

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# If you have already run this line, you don't need to run it again
# %matplotlib inline

# Open data1.txt and read its rows into a list of strings
with open('data1.txt', 'r') as f:
    data = f.readlines()
    #print (data)                   # uncomment this line to check what is inside data1.txt
    
# Organize data: skip the header row, then split each tab-separated row
xdata = []
ydata = []
for line in data[1:]:
    d_split = line.strip().split('\t')   # strip() removes the trailing newline
    xdata.append(float(d_split[0]))
    ydata.append(float(d_split[1]))
    
# Plot data
fig0, ax = plt.subplots(figsize=(10,6))
ax.scatter(xdata, ydata)
ax.set_xlabel('t [ms]')
ax.set_ylabel('Y0')
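
The same kind of file can also be read in one call with `np.loadtxt`. This is a minimal sketch under the assumption that the file has one header row followed by two tab-separated numeric columns; it writes a small stand-in file with made-up values, since the contents of data1.txt are not shown here.

```python
import os
import tempfile
import numpy as np

# Write a small stand-in file (made-up values and column names)
path = os.path.join(tempfile.mkdtemp(), 'example_data.txt')
with open(path, 'w') as f:
    f.write('t\tY0\n0.0\t0.1\n1.0\t0.6\n2.0\t0.9\n')

# skiprows=1 drops the header row; unpack=True returns one array per column
xdata, ydata = np.loadtxt(path, delimiter='\t', skiprows=1, unpack=True)

print(xdata)
print(ydata)
```

With NumPy arrays instead of lists, the plotting calls stay the same.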

<font size ="3">Let's plot a histogram:

In [None]:
# np and plt are already imported in the cells above

x = np.random.normal(size = 1000)
plt.hist(x, density=True, bins=100, facecolor='green', alpha=0.75)
plt.ylabel('Probability density')   # density=True normalizes the total bar area to 1
plt.title('Histogram')
plt.grid(True)
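
With `density=True`, the bar heights are scaled so that the total area of the bars equals 1; the heights are densities rather than counts or probabilities. A quick numerical check, sketched with `np.histogram` (which computes the same bins without drawing) and a seeded random generator so the run is repeatable:

```python
import numpy as np

rng = np.random.default_rng(0)      # seeded generator (the seed value is arbitrary)
x = rng.normal(size=1000)

# Bar heights and bin edges, as plt.hist would compute them
heights, edges = np.histogram(x, bins=100, density=True)

# Area of each bar = height * bin width; the areas sum to (approximately) 1
area = np.sum(heights * np.diff(edges))
print(area)
```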

<font size ="3">Now let's plot the ydata:

In [None]:
plt.hist(ydata, density=True, bins=100, facecolor='blue', alpha=0.75)

plt.xlabel('y')
plt.ylabel('Probability density')
plt.title('Histogram')
plt.grid(True)

## SCIPY

<font size ="3">SciPy is an "ecosystem" of open-source software for mathematics, science, and engineering: https://docs.scipy.org/doc/scipy/reference/

As an example, let's solve the system:

\begin{equation}
x + 3y + 5z = 10
\end{equation}

\begin{equation}
2x + 5y + z = 8
\end{equation}

\begin{equation}
2x + 3y + 8z = 3
\end{equation}
 
We can write it in matrix form:

\begin{equation}
\left[\begin{array}{ccc} 1 & 3 & 5\\ 2 & 5 & 1\\ 2 & 3 & 8\end{array}\right]\left[\begin{array}{c} x\\ y\\ z\end{array}\right]=\left[\begin{array}{c} 10\\ 8\\ 3\end{array}\right]
\end{equation}

In [None]:
from scipy import linalg        # here we import only the linalg submodule from scipy
                                # this is equivalent to "import scipy.linalg as linalg"

A = np.array([[1,3,5], [2,5,1], [2,3,8]])
b = np.array([10, 8, 3])

x = linalg.solve(A,b)

In [None]:
# Solution of the system, then determinant and 1-norm of A
print (x)
print (linalg.det(A))
print (linalg.norm(A, 1))
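
A simple way to check a computed solution is to substitute it back into the system and compare with the right-hand side. The sketch below repeats the setup so that it runs on its own and uses `np.allclose` to allow for floating-point rounding.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
b = np.array([10, 8, 3])

x = linalg.solve(A, b)

# Substitute the solution back: A @ x should reproduce b up to rounding
print(x)
print(np.allclose(A @ x, b))
```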

<font size ="3">Now we will fit a curve using the data loaded from the data1.txt file in the previous example

In [None]:
# We will use the function "curve_fit", that can be found in the module "optimize" of the library "SciPy"

from scipy.optimize import curve_fit

def expcurve(x, a):
    '''A model'''
    return 1 - a*np.exp(-x)

def betterexpcurve(x, a, b, c):
    '''A better model'''
    return a - b*np.exp(-x*c)


# Fit model
popt, pcov = curve_fit(betterexpcurve, xdata, ydata)

# Standard deviation errors on the parameters, see curve_fit documentation
perr = np.sqrt(np.diag(pcov))

# Generate data from the model
x = np.linspace(min(xdata)-1, max(xdata)+1)
y = betterexpcurve(x, *popt)

# Plot raw data and model
fig1, ax = plt.subplots(figsize=(12,8))
ax.scatter(xdata, ydata, label='Raw Data')
ax.plot(x,y, '--g', lw=3, label='Model')
ax.set_xlabel('t [ms]')
ax.set_ylabel('Y0')
ax.legend(loc=2)
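
To see how `popt` and `perr` relate to the model parameters, the fit can be tried on synthetic data where the true values are known. This is only a sketch: the true parameters, noise level, seed, and initial guess `p0` below are made up for illustration and are not related to data1.txt.

```python
import numpy as np
from scipy.optimize import curve_fit

def betterexpcurve(x, a, b, c):
    '''Same model as above'''
    return a - b*np.exp(-x*c)

# Synthetic stand-in data with known parameters plus a little Gaussian noise
rng = np.random.default_rng(1)
true_params = (1.0, 0.8, 0.5)
xs = np.linspace(0, 10, 60)
ys = betterexpcurve(xs, *true_params) + rng.normal(scale=0.01, size=xs.size)

# p0 is the initial guess for (a, b, c)
popt, pcov = curve_fit(betterexpcurve, xs, ys, p0=(1.0, 1.0, 1.0))
perr = np.sqrt(np.diag(pcov))

# Report each fitted parameter with its estimated standard deviation
for name, value, err in zip('abc', popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

The fitted values should land close to the true parameters, with uncertainties that shrink as the noise level decreases.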