<a name="top"></a>
# Python Workshop 2
To help you manage complex programs, Python lets you bundle code together, which keeps it organized and makes it easier to reuse. There are three main levels at which to do this:
1. [Function](#function)
2. [Class](#class)
3. [Module](#module)

At the end, we will look at [NumPy](#numpy), which is one of the most popular packages among data analysts these days.

<a name="function"></a>
## 1. Function
A function is a section of a program that performs a specific task. To use a function, first **define** it, then **call** it. Python uses indentation to mark which lines belong to the function body.

In [1]:
def speak():
    print("Speak")
    print("hi")
    print("Yo!")

speak()

Speak
hi
Yo!


### Parameters and Arguments
You can pass data to a function, but the inputs must be declared in the function definition. These declared inputs are called **parameters**. When we call a function that has parameters, we must provide input values or variables, called **arguments**. A parameter can also be given a default value, which is used when the argument is omitted.

In [2]:
def speak(name):
    print("Hello!", name)

name = input("Name ")
speak(name)

Name  Yoyo


Hello! Yoyo


In [3]:
def speak(name = 'John Doe'):
    print("Hello!", name)

speak()

Hello! John Doe
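
As a small illustration of parameters, arguments, and defaults, the sketch below uses a hypothetical `greet` function (it is not part of the workshop code) that returns its message instead of printing it. It shows that arguments can be matched by position or by keyword.

```python
def greet(name, greeting="Hello!"):
    """Build a greeting; `greeting` falls back to its default when omitted."""
    return "{} {}".format(greeting, name)

# Hypothetical calls for illustration.
by_position = greet("Yoyo")                      # `greeting` uses the default
by_keyword = greet(name="Yoyo", greeting="Hi!")  # arguments matched by name

print(by_position)  # Hello! Yoyo
print(by_keyword)   # Hi! Yoyo
```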


### Return Values
Functions can also send back results using the `return` statement.

In [4]:
def add(a, b):
    # Named `add` rather than `sum` to avoid shadowing Python's built-in sum().
    summation = a + b
    return summation

a = int(input("input a number to sum : "))
b = int(input("input b number to sum : "))

summation = add(a, b)
print("summation is : ",summation)

input a number to sum :  5
input b number to sum :  7


summation is :  12
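
A returned value can be used directly in a larger expression, and a function with no `return` statement gives back `None`. The following minimal sketch shows both cases; the `area` and `announce` functions are hypothetical names chosen for illustration.

```python
def area(width, height):
    """Return the area of a rectangle."""
    return width * height

def announce(message):
    """Print a message; this function has no return statement."""
    print(message)

total = area(2, 3) + area(4, 5)   # returned values can be combined in expressions
nothing = announce("done")        # a function without `return` gives back None

print(total)    # 26
print(nothing)  # None
```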


[[top](#top)]

<a name="class"></a>
## 2. Class
A class allows you to bundle both variables and functions into a self-contained, reusable unit. It helps us tackle problems at an abstract level by a) treating any component relevant to the problem as an object; and b) representing any relationship through a function.

First we must define a **class**, which is like a blueprint of a thing, using the `class` keyword:

In [5]:
class Employee:
    def __init__(self, name, dept, salary=1000):
        self.name = name
        self.dept = dept
        self.salary = salary
    
    def show_record(self):
        print('Name: {}\nDepartment: {}\nSalary: {}$'
                .format(self.name, self.dept, self.salary))

Then we must create an instance of the class, called an **object**.

In [6]:
mrA = Employee('Sarun Gulyanon', 'Sales')
mrA.show_record()

Name: Sarun Gulyanon
Department: Sales
Salary: 1000$


The `__init__` method acts as the constructor: it defines how a new object is initialized. The `self` parameter refers to the instance itself. Variables and functions that belong to an object are called "attributes", and we access them with the expression `obj.name`, for example `mrA.salary` or `mrA.show_record()`.
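
To make the idea of per-object attributes concrete, here is a minimal sketch based on the `Employee` class. The `give_raise` method and the second employee are assumptions added for illustration; they do not appear in the workshop code.

```python
class Employee:
    def __init__(self, name, dept, salary=1000):
        self.name = name
        self.dept = dept
        self.salary = salary

    def give_raise(self, amount):
        # Hypothetical method, added for illustration only.
        self.salary += amount

mrA = Employee('Sarun Gulyanon', 'Sales')
mrB = Employee('Jane Doe', 'IT', salary=2000)   # hypothetical second employee

mrA.give_raise(500)   # changes the attribute of mrA only

print(mrA.salary)  # 1500
print(mrB.salary)  # 2000
```

Each object keeps its own copy of the attributes set in `__init__`, so updating one employee leaves the other untouched.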

[[top](#top)]

<a name="module"></a>
## 3. Module
A module is like a code library. Functions and classes bundle code within a single file, whereas modules separate code into different files for better organization and reusability. There are three types of modules:
1. [Built-in module](#builtin)
2. [Customized module](#customized)
3. [External module](#external)

<a name="builtin"></a>
### 3.1. Built-in Module
The built-in modules, which make up the Python standard library, are extensive and offer a wide range of facilities; they are a large part of what makes Python powerful ([full list of Python standard library](https://docs.python.org/3/library/)).

To call a function in a module, first `import` the module, then call the function using both the module name and the function name.

In [7]:
import os
# folder = 'C:\\users\\sarun\\Desktop'  # for Windows
folder = '/home/sarun/Desktop'  # for linux
file = 'a.txt'
print(os.path.join(folder, file))

/home/sarun/Desktop/a.txt


You can import only specific names using the `from` statement.

In [8]:
from os import getcwd
print('Current Directory is {}'.format(getcwd()))

Current Directory is /home/sarung/Desktop/datascience/day02_database
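
The two import styles can be compared side by side. This short sketch uses the standard library `math` module, which is chosen here only as an example and is not used elsewhere in the workshop.

```python
import math               # names are accessed as math.<name>
from math import sqrt     # sqrt is imported directly into the current namespace

floor_value = math.floor(2.7)
root = sqrt(16)

print(floor_value)  # 2
print(root)         # 4.0
```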


<a name="customized"></a>
### 3.2. Customized Module
You can build your own module as well. Save your functions and/or classes (or even variables) in a module, which is simply a file with a `.py` extension. Then import it using the filename without the extension (make sure your script and your module are in the same folder).

In [9]:
import mymodule
print(mymodule.mul(2,3))

6
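
The contents of `mymodule.py` are not shown in this workshop. The sketch below assumes it defines a `mul` function that multiplies two numbers, writes that assumed source to a temporary folder under the hypothetical name `mymodule_sketch.py`, and imports it, so the whole example can run in one cell.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed contents of a module file; the real mymodule.py may differ.
module_source = '''
def mul(a, b):
    """Return the product of a and b."""
    return a * b
'''

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "mymodule_sketch.py").write_text(module_source)
    sys.path.insert(0, folder)        # make the folder importable
    importlib.invalidate_caches()
    try:
        mymodule_sketch = importlib.import_module("mymodule_sketch")
        product = mymodule_sketch.mul(2, 3)
    finally:
        sys.path.remove(folder)

print(product)  # 6
```

In everyday use you would skip the temporary folder: save the file next to your script and write `import mymodule`.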


### Naming a Module
You can create an alias when you import a module for easy access by using the `as` keyword.

In [10]:
import mycode.myfile as mm
from mymodule import mul as mu
mm.greeting('Yoyo')
print(mu(3,4))

Hi! Yoyo
12


You can bundle a collection of modules into a "package" by putting the modules in a folder. In Python 2 and in Python 3 versions before 3.3, you must create a file called `__init__.py` (it may be empty) in every folder you import from. Python 3.3+ supports [Implicit Namespace Packages](https://www.python.org/dev/peps/pep-0420/), which allow a package to be created without an `__init__.py` file.
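
A package layout can be sketched as follows. The folder name `mycode_sketch` and the body of `greeting` are assumptions made for illustration (here `greeting` returns its text rather than printing it); the example builds the folder in a temporary directory so that it is runnable on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Layout: <folder>/mycode_sketch/__init__.py and <folder>/mycode_sketch/myfile.py
    package_dir = Path(folder, "mycode_sketch")
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")   # empty file marks a regular package
    (package_dir / "myfile.py").write_text(
        "def greeting(name):\n"
        "    return 'Hi! ' + name\n"
    )
    sys.path.insert(0, folder)
    importlib.invalidate_caches()
    try:
        # Equivalent to: import mycode_sketch.myfile as mm
        mm = importlib.import_module("mycode_sketch.myfile")
        message = mm.greeting("Yoyo")
    finally:
        sys.path.remove(folder)

print(message)  # Hi! Yoyo
```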



<a name="external"></a>
### 3.3. External Module
These modules are created by third parties, and you may use them as long as you comply with their licenses. Many of them can be installed easily through a package manager such as [pip](https://packaging.python.org/tutorials/installing-packages/) or [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-pkgs.html). Two important packages for data analytics are:

1. [NumPy](http://www.numpy.org/) is the fundamental package for scientific computing.
2. [Pandas](https://pandas.pydata.org/) provides high-performance, easy-to-use data structures and data analysis tools.

[[top](#top)]

-----
<a name="numpy"></a>
## NumPy
NumPy’s main object is the homogeneous multidimensional array: an N-dimensional grid of elements, usually numbers, that are all of the same type. NumPy’s array class is called `ndarray`. Here we will cover:
* Array Creation
* Basic Operations
* Indexing, Slicing and Iterating
* Shape Manipulation
* Copies

More details [here](https://docs.scipy.org/doc/numpy-1.15.1/user/quickstart.html).

### Array Creation
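First, we need to create an `ndarray`. There are several ways to create arrays. Note that `np.empty` does not initialize its entries, so the values it shows are arbitrary and will differ from run to run.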

In [11]:
import numpy as np
a = np.array([2,3,4])
print('a={}\ntype={}'.format(a, type(a)))
b = np.array([1.2, 3.5, 5.1])
print('b=',b)
c = np.zeros((2,4))
print('c=',c)
d = np.ones((2,3,4))
print('d=',d)
e = np.empty((2,3))
print('e=',e)

a=[2 3 4]
type=<class 'numpy.ndarray'>
b= [1.2 3.5 5.1]
c= [[0. 0. 0. 0.]
 [0. 0. 0. 0.]]
d= [[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]
e= [[5.95525570e+228 9.40039426e-154 1.23475616e-259]
 [3.68777421e+180 4.47593816e-091 7.13637443e+159]]


Notice how NumPy displays arrays (in NumPy, dimensions are called axes):
* The last axis is printed from left to right,
* The second-to-last is printed from top to bottom,
* The rest are also printed from top to bottom, with each slice separated from the next by an empty line.

You can create sequences of numbers as well:

In [12]:
f = np.arange(10, 30, 5)
print('arange=', f)
g = np.linspace(0, 2, 9)
print('linspace=', g)
h = np.random.random((3,4))
print('random=', h)

arange= [10 15 20 25]
linspace= [0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]
random= [[0.25775132 0.5618098  0.18777053 0.4303184 ]
 [0.51306857 0.15371667 0.90919544 0.67974337]
 [0.90938036 0.68866324 0.72536581 0.82115209]]
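
The difference between `np.arange` and `np.linspace` is easy to miss: `arange` takes a step and excludes the stop value, while `linspace` takes a number of points and, by default, includes the stop value. A minimal sketch using the same arguments as above:

```python
import numpy as np

stepped = np.arange(10, 30, 5)   # start, stop (excluded), step
spaced = np.linspace(0, 2, 9)    # start, stop (included), number of points

print(stepped.tolist())   # [10, 15, 20, 25]
print(len(spaced))        # 9
print(spaced[-1])         # 2.0
```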


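An `ndarray` has a number of attributes, including its number of axes (`ndim`), its shape, its total number of elements (`size`), and its element type (`dtype`):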

In [13]:
print('a=',a)
print('a.ndim=',a.ndim)
print('a.shape=',a.shape)
print('a.size=',a.size)
print('a.dtype=',a.dtype)

a= [2 3 4]
a.ndim= 1
a.shape= (3,)
a.size= 3
a.dtype= int64
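
The same attributes are more informative on an array with more than one axis. The sketch below uses a hypothetical 2-by-3 array named `grid`; `np.zeros` produces floating-point values by default.

```python
import numpy as np

grid = np.zeros((2, 3))   # 2 rows, 3 columns of floating-point zeros

print(grid.ndim)    # 2
print(grid.shape)   # (2, 3)
print(grid.size)    # 6
print(grid.dtype)   # float64
```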


### Basic Operations
Arithmetic operators on arrays apply elementwise.

In [14]:
print('a=',a)
print('b=',b)
print('a-b=',a-b)

a= [2 3 4]
b= [1.2 3.5 5.1]
a-b= [ 0.8 -0.5 -1.1]


In [15]:
print(a**2)
print(np.sqrt(a))
print(b>3)

[ 4  9 16]
[1.41421356 1.73205081 2.        ]
[False  True  True]
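
Because operators apply elementwise, `*` on two arrays is not a matrix product. The sketch below contrasts it with the `@` operator on two small, hypothetical matrices.

```python
import numpy as np

A = np.array([[1, 1],
              [0, 1]])
B = np.array([[2, 0],
              [3, 4]])

elementwise = A * B   # multiplies matching entries
matrix = A @ B        # matrix product

print(elementwise.tolist())  # [[2, 0], [0, 4]]
print(matrix.tolist())       # [[5, 4], [3, 4]]
```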


Many unary operations, such as the sum of all the elements in the array, are implemented as methods of the `ndarray` class.

In [16]:
print(a.sum())
print(a.mean())
print(a.std())
print(a.min(), a.max())

9
3.0
0.816496580927726
2 4
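
By default these methods treat the array as a flat list of numbers. Passing the `axis` argument applies the operation along one axis instead, as this small sketch with a hypothetical 3-by-4 array shows.

```python
import numpy as np

table = np.arange(12).reshape(3, 4)   # rows: [0..3], [4..7], [8..11]

total = table.sum()               # all elements
column_sums = table.sum(axis=0)   # collapse the rows: one sum per column
row_sums = table.sum(axis=1)      # collapse the columns: one sum per row

print(total)                  # 66
print(column_sums.tolist())   # [12, 15, 18, 21]
print(row_sums.tolist())      # [6, 22, 38]
```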


### Indexing, Slicing and Iterating
**Indexing** and **slicing** work much as they do for lists. Multidimensional arrays can have one index per axis. These indices are given as a tuple, separated by commas:

In [17]:
print('a=',a)
print('a[1]=',a[1])
print('a[1:]=',a[1:])

a= [2 3 4]
a[1]= 3
a[1:]= [3 4]


In [18]:
print('e=',e)
print('e[0,1]=',e[0,1])
print('e[:,1]=',e[:,1])
print('e[0,0:2]=',e[0,0:2])

e= [[5.95525570e+228 9.40039426e-154 1.23475616e-259]
 [3.68777421e+180 4.47593816e-091 7.13637443e+159]]
e[0,1]= 9.400394261596198e-154
e[:,1]= [9.40039426e-154 4.47593816e-091]
e[0,0:2]= [5.95525570e+228 9.40039426e-154]


When fewer indices are provided than the number of axes, the missing indices are considered complete slices.

In [19]:
print('d=',d)
print('d[1]=',d[1])
print('d[...,3]=', d[...,3])

d= [[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]
d[1]= [[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
d[...,3]= [[1. 1. 1.]
 [1. 1. 1.]]
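
Since every entry of `d` is 1, it is hard to see which elements each expression selects. The sketch below repeats the same indexing on a hypothetical array `x` whose entries are all distinct.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)   # values 0..23 arranged on 3 axes

second_block = x[1]        # same as x[1, :, :]
last_column = x[..., 3]    # same as x[:, :, 3]

print(second_block.shape)     # (3, 4)
print(last_column.tolist())   # [[3, 7, 11], [15, 19, 23]]
```

The dots (`...`) stand for as many complete slices as are needed to cover the remaining axes.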


**Iterating** over multidimensional arrays is done with respect to the first axis:

In [20]:
for row in e:
    print('row=',row)

row= [5.95525570e+228 9.40039426e-154 1.23475616e-259]
row= [3.68777421e+180 4.47593816e-091 7.13637443e+159]
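
To visit every element rather than every row, one option is the `flat` attribute, which is an iterator over all elements of the array. A minimal sketch on a hypothetical 2-by-3 array:

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

row_sums = [int(row.sum()) for row in table]      # iterates over the first axis
elements = [int(value) for value in table.flat]   # visits every element

print(row_sums)   # [6, 15]
print(elements)   # [1, 2, 3, 4, 5, 6]
```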


### Shape Manipulation
You can change the shape of an array.

In [21]:
print('h=',h)
print(h.shape)

h= [[0.25775132 0.5618098  0.18777053 0.4303184 ]
 [0.51306857 0.15371667 0.90919544 0.67974337]
 [0.90938036 0.68866324 0.72536581 0.82115209]]
(3, 4)


In [22]:
print('reshape=',h.reshape(6,2))

reshape= [[0.25775132 0.5618098 ]
 [0.18777053 0.4303184 ]
 [0.51306857 0.15371667]
 [0.90919544 0.67974337]
 [0.90938036 0.68866324]
 [0.72536581 0.82115209]]


In [23]:
print('transpose=',h.T)

transpose= [[0.25775132 0.51306857 0.90938036]
 [0.5618098  0.15371667 0.68866324]
 [0.18777053 0.90919544 0.72536581]
 [0.4303184  0.67974337 0.82115209]]
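
Two details are worth noting: `reshape` and `T` return a new array object and leave the shape of the original unchanged, and one dimension passed to `reshape` may be `-1`, in which case NumPy infers it from the array size. A short sketch with a hypothetical array `values`:

```python
import numpy as np

values = np.arange(12)
grid = values.reshape(3, -1)   # -1 lets NumPy infer the second dimension

print(grid.shape)     # (3, 4)
print(values.shape)   # (12,)
print(grid.T.shape)   # (4, 3)
```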


### Copies
**No Copy at All**: Simple assignments make no copy of array objects or of their data.

In [24]:
a = np.arange(12)
b = a
print('a=',a)
print(b is a)
b[0] = 100
print('a=',a)

a= [ 0  1  2  3  4  5  6  7  8  9 10 11]
True
a= [100   1   2   3   4   5   6   7   8   9  10  11]


**Deep Copy**: The `copy` method makes a complete copy of the array and its data.

In [25]:
d = a.copy()
print(d is a)
d[0] = 50
print('a=',a)
print('d=',d)

False
a= [100   1   2   3   4   5   6   7   8   9  10  11]
d= [50  1  2  3  4  5  6  7  8  9 10 11]
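
Between these two cases there is a third one that often surprises newcomers: basic slicing returns a view, a new array object that shares its data with the original. The sketch below is an illustration under that assumption and uses `np.shares_memory` to check whether two arrays overlap in memory.

```python
import numpy as np

a = np.arange(12)
view = a[:3]           # basic slicing returns a view, not a copy
view[0] = 100          # writes through to `a`

independent = a[:3].copy()
independent[1] = -1    # leaves `a` untouched

print(a[:3].tolist())                     # [100, 1, 2]
print(np.shares_memory(a, view))          # True
print(np.shares_memory(a, independent))   # False
```

When you need to modify a slice without affecting the original array, call `copy` on it first.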


[[top](#top)]