# Python Libraries

For this tutorial, we are going to outline the most common uses for each of the following libraries:

* **NumPy** is a library for working with arrays of data.

* **SciPy** is a library of techniques for numerical and scientific computing.

* **Matplotlib** is a library for making visualizations.

* **Seaborn** is a higher-level interface to Matplotlib that can be used to simplify many visualization tasks.

*__Important__: This tutorial covers only the basics of these libraries; I recommend digging into their online documentation.*

## NumPy

NumPy is the fundamental package for scientific computing with Python. It contains, among other things:

* a powerful N-dimensional array object
* sophisticated (broadcasting) functions
* tools for integrating C/C++ and Fortran code
* useful linear algebra, Fourier transform, and random number capabilities

We will focus on the NumPy array object.

#### NumPy Array

A NumPy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
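As a minimal sketch of the terms above, the following block inspects a small 2 x 3 array. The variable name `m` is only illustrative, and the integer width reported by `dtype` depends on the platform.

```python
import numpy as np

# A 2 x 3 array: two rows, three columns
m = np.array([[1, 2, 3], [4, 5, 6]])

print(m.ndim)   # number of dimensions (the "rank"): 2
print(m.shape)  # size along each dimension: (2, 3)
print(m.dtype)  # common element type (integer width is platform-dependent)
print(m[1, 2])  # indexed by a tuple of nonnegative integers: row 1, column 2
```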

# Useful libraries

## Numpy

In [4]:
import numpy as np
a = np.array([1,2,3])
print(type(a))


<class 'numpy.ndarray'>


### 1d array

In [7]:
print(np.shape(a))  # print the shape of the array

(3,)


### 2d array + indexing (row, column)

In [13]:
b = np.array([[1, 2], [3, 4]])
print(b, 2*'\n', np.shape(b))

[[1 2]
 [3 4]] 

 (2, 2)


In [17]:
b[1,0]

3

In [19]:
c = np.zeros([3, 2])  # the shape can be passed as a list or a tuple
print(c)

[[0. 0.]
 [0. 0.]
 [0. 0.]]


In [20]:
d = np.ones((3,2))
print(d)

[[1. 1.]
 [1. 1.]
 [1. 1.]]


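- Other useful constructors: `np.full` fills an array with a constant, and `np.random.random` draws uniform samples from [0, 1):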

In [25]:
e = np.full((3,3),8.0)
print(e)
print(type(e[1,0]))

[[8. 8. 8.]
 [8. 8. 8.]
 [8. 8. 8.]]
<class 'numpy.float64'>


In [30]:
f = np.random.random((3,3))
print(f)

[[0.77970012 0.22041842 0.86902066]
 [0.73880808 0.77479003 0.40686538]
 [0.3407052  0.92532396 0.24195048]]


- Slicing an array to create a subarray

In [38]:
h = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print(h)

# slice: first two rows, columns 1 onward
h1 = h[:2,1:]
print(h1)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
[[2 3 4]
 [6 7 8]]


- A slice is a view into the original array, so modifying the slice also modifies the original array:

In [49]:
h1[0,0] = 1000
print(h1, 2*'\n', h)
print(h1.dtype)

[[1000    3    4]
 [   6    7    8]] 

 [[   1 1000    3    4]
 [   5    6    7    8]
 [   9   10   11   12]]
int32
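If the original array should stay untouched, one common option is to take an explicit copy of the slice. The sketch below re-creates the same array `h` so that it runs on its own; `h_copy` is an illustrative name.

```python
import numpy as np

h = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# .copy() allocates new memory, so the result is independent of h
h_copy = h[:2, 1:].copy()
h_copy[0, 0] = 1000

print(h[0, 1])                          # still 2: the original is unchanged
print(np.shares_memory(h, h_copy))      # False: the copy owns its data
print(np.shares_memory(h, h[:2, 1:]))   # True: a plain slice is a view
```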


## Array math

In [72]:
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
y = np.array([[5, 6], [7, 8]], dtype=np.float32)

print(x+y, 2*'\n', x-y,  2*'\n')
print(x*y, 2*'\n', x/y, 2*'\n')
print(np.sqrt(x))

[[ 6.  8.]
 [10. 12.]] 

 [[-4. -4.]
 [-4. -4.]] 


[[ 5. 12.]
 [21. 32.]] 

 [[0.2        0.33333334]
 [0.42857143 0.5       ]] 


[[1.        1.4142135]
 [1.7320508 2.       ]]
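The operations above are all elementwise on arrays of the same shape. Broadcasting, mentioned in the NumPy feature list, extends this to compatible shapes. The following is a small sketch; the array `row` is an illustrative addition, not part of the original notebook.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float32)
row = np.array([10, 20], dtype=np.float32)  # shape (2,)

# row is broadcast across each row of x
shifted = x + row
print(shifted)

# a scalar is broadcast to every element
doubled = x * 2
print(doubled)
```

Here `shifted` adds 10 to the first column and 20 to the second, without building a full 2 x 2 array for `row` by hand.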


## Descriptive mathematics

In [76]:
np.sum(x)

10.0

In [77]:
np.sum(x, axis=1)

array([3., 7.], dtype=float32)

In [78]:
np.sum(x, axis=0)

array([4., 6.], dtype=float32)
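The `axis` argument names the dimension that is collapsed by the reduction, and the same convention applies to other reductions such as `np.mean` and `np.max`. A short sketch using the same `x` as above:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float32)

# axis=0 collapses the rows, leaving one value per column
col_means = np.mean(x, axis=0)
print(col_means)  # [2. 3.]

# axis=1 collapses the columns, leaving one value per row
row_max = np.max(x, axis=1)
print(row_max)    # [2. 4.]

# keepdims=True keeps the collapsed axis with length 1
row_sums = np.sum(x, axis=1, keepdims=True)
print(row_sums.shape)  # (2, 1)
```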

## SciPy

NumPy provides a high-performance multidimensional array and basic tools to compute with and manipulate these arrays. SciPy builds on this, and provides a large number of functions that operate on NumPy arrays and are useful for different types of scientific and engineering applications.

For this course, we will primarily be using the **SciPy.Stats** sub-library.

### SciPy.Stats

The SciPy.Stats module contains a large number of probability distributions as well as a growing library of statistical functions such as:

* Continuous and Discrete Distributions (e.g., Normal, Uniform, Binomial)

* Descriptive Statistics

* Statistical Tests (e.g., the t-test)
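As a rough illustration of these three categories, the sketch below evaluates a continuous and a discrete distribution and runs a one-sample t-test. The sample, its parameters (`loc=0.5`, `size=50`), and the seed are arbitrary choices made for this example.

```python
from scipy import stats as st

# Continuous distribution: standard normal density and cumulative probability at 0
print(st.norm.pdf(0))  # about 0.3989
print(st.norm.cdf(0))  # 0.5

# Discrete distribution: probability of 3 successes in 10 trials with p = 0.5
p3 = st.binom.pmf(3, n=10, p=0.5)
print(p3)  # 120/1024, about 0.1172

# Statistical test: one-sample t-test of the null hypothesis "the mean is 0"
sample = st.norm.rvs(loc=0.5, size=50, random_state=0)
result = st.ttest_1samp(sample, popmean=0)
print(result.statistic, result.pvalue)
```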

In [80]:
from scipy import stats as st

- Generate normally distributed random values

In [82]:
print(st.norm.rvs(size=1000))

[ 6.58613795e-01  1.38919711e-01 -2.31934093e-01  1.95620918e-01
  3.65389442e-01 -1.24165554e+00 -1.20105577e+00  3.29464331e-02
 -1.06095570e+00  1.81016658e+00 -1.34123032e+00 -1.35422471e+00
 -1.35619703e+00 -4.53340936e-01  5.03728052e-01  3.65101645e-01
 -9.13855925e-01  7.88843014e-01 -6.55033891e-01  1.56285537e+00
  2.71586632e-01  1.16384839e+00  2.22569569e-01 -5.22512054e-01
  4.34123577e-02 -1.55798914e+00  1.48733165e+00 -5.41335040e-02
 -1.68740643e+00  2.07389875e+00 -5.17842854e-01 -2.75431072e-02
 -1.10673976e+00  6.94776836e-01  1.01467278e-01 -7.49329641e-01
 -6.50565042e-01  1.92513274e-01 -1.66630120e-01 -1.12823475e+00
  9.97486196e-01 -8.66006049e-01 -3.18366655e-02  3.85921093e-01
  5.86626797e-01  1.04357440e-01  3.93672867e-02 -5.85298827e-01
  4.21544247e-01 -3.59872706e-01  1.65374177e-01 -1.25788741e+00
  5.16728599e-01  8.64007356e-01 -1.33012827e-01 -1.25620567e+00
  1.61828179e-01  7.32010513e-01  1.53015262e+00 -1.83978433e+00
  8.35397721e-01 -3.77774
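Printing all 1000 values is rarely informative. A possible alternative, sketched below, is to inspect the shape, a few draws, and summary values. The `loc`, `scale`, and `random_state` values are arbitrary choices for illustration.

```python
from scipy import stats as st

# loc is the mean and scale the standard deviation of the normal distribution;
# random_state makes the draws reproducible
sample = st.norm.rvs(loc=5.0, scale=2.0, size=1000, random_state=42)

print(sample.shape)                 # (1000,)
print(sample[:5])                   # first five draws only
print(sample.mean(), sample.std())  # close to 5 and 2, up to sampling error
```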

## Descriptive statistics

In [89]:

np.random.seed(1234)
x = st.t.rvs(10, size=100)  # 100 draws from a Student's t distribution with 10 degrees of freedom

print(x.min())

st.describe(x)  # summary statistics for array-like input; not designed for data frames


-2.8910757913419554


DescribeResult(nobs=100, minmax=(-2.8910757913419554, 2.874095696918692), mean=0.05748894228519995, variance=1.008103430750141, skewness=-0.13595535434253678, kurtosis=1.0429155607860423)
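The returned `DescribeResult` behaves like a named tuple, so individual statistics can be read by name. The sketch below passes `random_state` directly instead of calling `np.random.seed`, which is an assumption of this example rather than what the cell above does.

```python
import numpy as np
from scipy import stats as st

x = st.t.rvs(10, size=100, random_state=1234)
summary = st.describe(x)

print(summary.nobs)                    # number of observations
print(summary.minmax)                  # (min, max) tuple
print(summary.mean, summary.variance)

# describe uses ddof=1 by default, so variance is the sample variance
print(np.isclose(summary.variance, np.var(x, ddof=1)))

# kurtosis is the Fisher (excess) kurtosis, which is 0 for a normal distribution
print(np.isclose(summary.kurtosis, st.kurtosis(x)))
```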