# Python Refresher

This notebook is a quick refresher on Python. The basic data types are integers, floats, booleans and strings, and each comes with useful methods for manipulating its values. I will mainly be drawing from [this resource](http://cs231n.github.io/python-numpy-tutorial/).

Python is an *interpreted* language: code is executed by an interpreter at run time rather than compiled to machine code ahead of time. Since I am using Jupyter notebooks, which run one cell at a time, I don't really have to worry about this.

At any point in IPython or Jupyter you can type `function?` to display the docstring of the function.
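
The `?` suffix only works inside IPython or Jupyter. As a minimal sketch of a plain-Python equivalent, the docstring can be read from the `__doc__` attribute or cleaned up with `inspect.getdoc`; the built-in `len` is used here purely as an example.

```python
import inspect

# The raw docstring is stored on the function object itself
doc = len.__doc__
print(doc)

# inspect.getdoc returns the same text with indentation cleaned up
clean_doc = inspect.getdoc(len)
print(clean_doc)

# In an interactive session, help(len) shows a formatted version of the same text
```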

In [9]:
a = 3 # this is an integer
print(type(a))
print(a)
a += 1 # Python has no ++ or -- operators, but it does have +=, -= and *=
print(a)

b = 3. # this is a float
print(type(b))
print(b)

print(a+b) # returns a float
print(type(a+b))

# Returns the docstring of a function (IPython/Jupyter only).
print?

<class 'int'>
3
4
<class 'float'>
3.0
7.0
<class 'float'>


## Data Types 

A few basic data types are native to Python:
- Integer
- Complex
- Float
- Boolean
- String
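
A short sketch of one literal per basic type (integer, float, complex, boolean and string); the values themselves are arbitrary and chosen only for illustration.

```python
# One example literal for each basic built-in type
examples = [3, 3.0, 2 + 1j, True, "three"]

for value in examples:
    print(repr(value), type(value).__name__)

type_names = [type(v).__name__ for v in examples]

# bool is a subclass of int, so True behaves like 1 in arithmetic
bool_is_int = isinstance(True, int)
print(bool_is_int, True + 1)
```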

Just for fun, I will use the Wallis formula $$ \pi = 2\prod_{i=1}^{\infty} \frac{4i^2}{4i^2 - 1} $$ to approximate pi. I will start with a plain loop and then try a list comprehension. Of course there are better ways to get pi: it is stored as the constant `math.pi` in the standard library.



In [18]:
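# Number of Wallis factors to multiply; more factors give a better approximation
num = 10000
pi = 1

# Accumulate the partial product for i = 1..num
for i in range(1, num + 1):
    pi = pi*((4*i**2)/(4*i**2 - 1))

# Time building the same factors with a list comprehension.
# The assignment inside %timeit does not change the outer `pi`.
%timeit pi = [((4*i**2)/(4*i**2 - 1)) for i in range(1, num + 1)]
print(pi) # the partial product converges to pi/2, so multiply by 2 to estimate pi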

9.98 ms ± 233 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.5707570593409783
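
The printed value is close to pi/2 because the loop accumulates only the product; the Wallis formula carries an extra factor of 2. A minimal sketch of the full estimate follows, assuming Python 3.8+ for `math.prod`; the helper name `wallis_pi` is made up for this example.

```python
import math

def wallis_pi(terms):
    """Estimate pi from the first `terms` factors of the Wallis product."""
    product = math.prod(4 * i**2 / (4 * i**2 - 1) for i in range(1, terms + 1))
    return 2 * product  # the product alone converges to pi/2

estimate = wallis_pi(10_000)
print(estimate)
print(abs(estimate - math.pi))  # the error shrinks roughly like 1/terms
```

Convergence is slow, so this is a toy rather than a practical way to compute pi.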


# NumPy and SciPy

SciPy is a scientific library for Python. NumPy is the fundamental building block of SciPy's computations: it provides a multi-dimensional array on which operations are done. A good *quickstart guide* can be found [here](https://docs.scipy.org/doc/numpy/user/quickstart.html) and explains the fundamental object, the `ndarray` or n-dimensional array. The *full documentation* can be found [here](https://docs.scipy.org/doc/).

#### The ndarray object

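In NumPy an n-dimensional array has *axes*, and the number of axes is the number of dimensions. The following code instantiates an ndarray explicitly by passing a shape: the shape `[1,2,3]` gives 3 axes of lengths 1, 2 and 3, and therefore a size of 1 × 2 × 3 = 6. The contents are uninitialized; to build an array from values, use `np.array` instead.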

In [2]:
import numpy as np

# explicitly creating an uninitialized array of shape (1, 2, 3): 3 axes, 6 elements
a = np.ndarray([1,2,3])
print(a.size)
print(a.ndim)

6
3
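
A minimal sketch contrasting an array built from values with one built from a shape, since the two are easy to confuse; the names `vec` and `cube` are illustrative only.

```python
import numpy as np

vec = np.array([1, 2, 3])                 # the list holds the values
print(vec.ndim, vec.shape, vec.size)      # 1 (3,) 3

cube = np.zeros((1, 2, 3))                # here the tuple is a shape, not data
print(cube.ndim, cube.shape, cube.size)   # 3 (1, 2, 3) 6
```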



#### Array Creation

NumPy also offers a variety of functions for creating an array (each returns an ndarray), with the initial shape specified up front and the contents filled with placeholder values. These are especially useful for linear algebra applications, which will be the topic of the next notebook.


In [25]:
a = np.zeros((2,2))   # Create an array of all zeros
print(a)              # Prints "[[0. 0.]
                      #          [0. 0.]]"

b = np.ones((1,2))    # Create an array of all ones
print(b)              # Prints "[[1. 1.]]"

c = np.full((2,2), 7)  # Create a constant array (integer dtype, since 7 is an int)
print(c)               # Prints "[[7 7]
                       #          [7 7]]"

d = np.eye(2)         # Create a 2x2 identity matrix
print(d)              # Prints "[[1. 0.]
                      #          [0. 1.]]"

e = np.random.random((2,2))  # Create an array filled with random values
print(e)                     # Might print "[[0.46175494 0.8826019 ]
                             #               [0.28934112 0.02091026]]"
    


[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.46175494 0.8826019 ]
 [0.28934112 0.02091026]]


There are many other ways to create an array, which can be found in the [documentation](https://docs.scipy.org/doc/numpy/user/basics.creation.html#arrays-creation).
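
As a rough sketch of a few of those other creation routines, here are `np.arange`, `np.linspace` and `np.zeros_like`; the variable names and values are arbitrary.

```python
import numpy as np

steps = np.arange(0, 10, 2)        # start, stop (exclusive), step
grid = np.linspace(0.0, 1.0, 5)    # 5 evenly spaced points, endpoints included
blank = np.zeros_like(steps)       # zeros with the same shape and dtype as `steps`

print(steps)
print(grid)
print(blank)
```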

#### Array Indexing

As in native Python, slicing is a great way to reach certain portions of an array, allowing one to take parts of an array and do operations with them. A slice is a view of the original data, so modifying the slice modifies the original array. For example, the following code creates an array and then references slices of it.

Because arrays in NumPy are multidimensional, you can specify a slice for each axis of the array; any trailing axes you omit are taken in full.


In [27]:
x = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(x,"\n")

ele0 = x[0]            # first row (index 0 along the first axis)
axis1 = x[0,:]         # same first row, with the second axis sliced explicitly
axis2 = x[:,0]         # first column
axis1_2 = x[1:3,1:3]   # rows 1-2 and columns 1-2 (the bottom-right 2x2 block)


print("Just the 0th element:", ele0,"\n") # x[0] is the whole first row
print("The elements along first axis", axis1,"\n")
print("The elements along second axis", axis2,"\n") # the first column
print("2nd and 3rd element of second and third row:\n", axis1_2,"\n") 
# Assigning through a slice modifies the array in place
x[2,:] = [0,0,0]
print(x)
print(x)

[[1 2 3]
 [4 5 6]
 [7 8 9]] 

Just the 0th element: [1 2 3] 

The elements along first axis [1 2 3] 

The elements along second axis [1 4 7] 

2nd and 3rd element of second and third row:
 [[5 6]
 [8 9]] 

[[1 2 3]
 [4 5 6]
 [0 0 0]]
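
Because a slice shares memory with its parent array, writing to a slice stored in a variable also changes the original. A minimal sketch of that behaviour, and of using `.copy()` to avoid it; the names `view` and `safe` are illustrative.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = x[1:3, 1:3]          # basic slicing returns a view onto x's data
view[0, 0] = 50             # writes through to x[1, 1]
print(x[1, 1])              # 50

safe = x[1:3, 1:3].copy()   # an explicit copy owns its own data
safe[0, 0] = -1             # x is unaffected
print(x[1, 1])              # still 50

print(np.shares_memory(x, view), np.shares_memory(x, safe))  # True False
```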


A really useful relative of slicing is **boolean array indexing**, which lets you select the elements of an array that satisfy a condition. This could be very useful for cleaning data.

In [30]:
a = np.array([[1,2],[3,4],[5,6],[7,8]])

exp = (a > 3)

print(exp) # Prints whether values in a are greater than 3
print(a[exp]) # Prints all values in a greater than 3

[[False False]
 [False  True]
 [ True  True]
 [ True  True]]
[4 5 6 7 8]
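
To make the data-cleaning use concrete, here is a small sketch that drops missing and negative values with a boolean mask. The `readings` array is made-up sample data, and the rule "negative means invalid" is an assumption for the example.

```python
import numpy as np

readings = np.array([2.5, -1.0, 3.1, np.nan, 4.8])   # made-up sample data

# True where a reading is present and non-negative
valid = ~np.isnan(readings) & (readings >= 0)

clean = readings[valid]       # new 1-D array holding only the kept values
print(clean)                  # keeps 2.5, 3.1 and 4.8

readings[~valid] = 0.0        # a mask can also be used on the left-hand side
print(readings)               # invalid entries are now 0.0
```

Unlike a basic slice, indexing with a boolean mask returns a copy, so changing `clean` would not affect `readings`.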


Again, a more complete description of array indexing can be found in the [documentation](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html).

#### Data Types

As in native Python, there are different data types, all of which can be found in the [documentation](https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html) of data types. The main ones that I am interested in are integer, float and complex numbers. The default integer type depends on the platform and NumPy version (`int32` in the output below, `int64` on many other systems).

In [37]:
int_arr = np.array([1,2,3,4]) # Integer array
float_arr = np.array([1.,2.,3.,4.]) # Float array
cplx_arr = np.array([1+0j,2+0j,3+0j,4+0j]) # Complex Array

print("Integer array:", int_arr, int_arr.dtype)
print("Float array:", float_arr, float_arr.dtype)
print("Complex array:", cplx_arr, cplx_arr.dtype)

Integer array: [1 2 3 4] int32
Float array: [1. 2. 3. 4.] float64
Complex array: [1.+0.j 2.+0.j 3.+0.j 4.+0.j] complex128


You can also specify a particular data type by passing the `dtype` parameter to the array creation function.

In [41]:
arr = np.array([1,2,3,4], dtype='float64') # Forces the array to be float
print(arr, arr.dtype) 

[1. 2. 3. 4.] float64
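
The data type of an existing array can also be changed after creation. A minimal sketch using `astype`, with arbitrary example values:

```python
import numpy as np

floats = np.array([1.7, 2.2, -3.9])
ints = floats.astype(np.int64)     # returns a new array; fractions are truncated toward zero
print(ints, ints.dtype)            # values 1, 2, -3 with dtype int64

mixed = np.array([1, 2.5])         # mixing ints and floats upcasts to float64
print(mixed.dtype)
```

Note that `astype` leaves `floats` unchanged, and that converting to an integer type discards the fractional part rather than rounding.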


#### Array Operations

There are many built-in functions for performing arithmetic and other operations on ndarrays; a complete list can be found in the [documentation](https://docs.scipy.org/doc/numpy/reference/routines.math.html) on mathematical routines. Simple operations such as addition and subtraction can all be written with the standard Python arithmetic operators `+`, `-`, `*` and `/`. Note, however, that `*` and `/` are not matrix multiplication and division; they are elementwise operations.

The routines for manipulating arrays, such as transpositions and rotations, can be found in the [documentation](https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html).

In [7]:
a = np.array([[1,1,1],[2,2,2],[3,3,3]])
b = a.copy() # independent copy of a

print(a+b) # elementwise addition
print(np.add(a,b)) # same result via the function form

print(a-b) # elementwise subtraction
print(np.subtract(a,b))

print(a*b) # elementwise multiplication (not the matrix product)
print(np.multiply(a,b))

print(a/b) # elementwise true division, always returns floats
print(np.divide(a,b))

[[2 2 2]
 [4 4 4]
 [6 6 6]]
[[2 2 2]
 [4 4 4]
 [6 6 6]]
[[0 0 0]
 [0 0 0]
 [0 0 0]]
[[0 0 0]
 [0 0 0]
 [0 0 0]]
[[1 1 1]
 [4 4 4]
 [9 9 9]]
[[1 1 1]
 [4 4 4]
 [9 9 9]]
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


To do matrix multiplication we must use `np.dot()` (or the `@` operator), which for 2-D arrays computes the matrix product. Another very useful function is `sum()`, which adds up all the elements in an array. You can also specify the axis over which to do the sum.

In [17]:
to_sum = np.array([[1,2,3],[1,2,3],[1,2,3]])

print(to_sum.sum())
print(to_sum.sum(axis=0)) # Sums each column
print(to_sum.sum(axis=1)) # Sums each row
print(to_sum[0:3,2].sum()) # Takes the third element of each row and sums them.

18
[3 6 9]
[6 6 6]
9
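
The matrix multiplication mentioned before the sum example is not demonstrated there, so here is a small sketch contrasting it with elementwise multiplication. The two 2x2 matrices are arbitrary examples.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

elementwise = a * b      # multiplies matching entries
matrix = a @ b           # matrix product; same as np.dot(a, b) for 2-D arrays

print(elementwise)       # [[0 2]
                         #  [3 0]]
print(matrix)            # [[2 1]
                         #  [4 3]]
```

Multiplying by `b` as a matrix swaps the columns of `a`, which makes the difference from the elementwise result easy to see.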


#### Broadcasting

Can't be bothered writing this yet, will do it soon.