## Introduction to Numpy and Pandas

### Numpy
NumPy is short for ```Numerical Python```.

It is a library for working with matrices and multidimensional arrays.

```python3
import numpy as np
```
```np``` is an **alias** for numpy, used to refer to the library within a file.

In [2]:
# including numpy in the project
import numpy as np 

In [3]:
# creating a one dim array
# starting with a list
l = [1, 2, 3, 4, 5]

# confirm the type 
type(l)

list

In [5]:
# creating a one dim
oneDim = np.array(l)

# confirm type
type(oneDim)

numpy.ndarray

In [6]:
# number of dimensions (axes) of the array
oneDim.ndim

1

In [7]:
# shape: the number of elements along each dimension
oneDim.shape 

(5,)
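
To see how `ndim` and `shape` differ, it may help to compare a one-dimensional array with a two-dimensional one. This is a minimal sketch; the names `vec` and `grid` are illustrative and are not part of the notebook.

```python
import numpy as np

vec = np.array([1, 2, 3, 4, 5])          # one axis
grid = np.array([[1, 2, 3], [4, 5, 6]])  # two axes: 2 rows, 3 columns

print(vec.ndim, vec.shape)    # 1 (5,)
print(grid.ndim, grid.shape)  # 2 (2, 3)
print(grid.size)              # 6, the total number of elements
```

For a two-dimensional array, `shape[0]` is the number of rows and `shape[1]` is the number of columns.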

In [14]:
# check the data type 
oneDim.dtype

dtype('int32')
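
The default integer type depends on the platform and the NumPy version, so the same cell may report `int64` on another machine. When the type matters, it can be set explicitly. A small sketch, with illustrative names (`ints`, `floats`, `mixed`) that do not appear in the notebook:

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int64)  # request 64-bit integers explicitly
floats = ints.astype(np.float64)            # astype returns a converted copy
mixed = np.array([1, 2.5])                  # mixed int/float input is upcast to float

print(ints.dtype)    # int64
print(floats.dtype)  # float64
print(mixed.dtype)   # float64
```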

In [15]:
# transposition (a one dim array is unchanged by .T)
print(oneDim)
t = oneDim.T
t

[1 2 3 4 5]


array([1, 2, 3, 4, 5])

## Python Data Types
### integers
```python
x = 9
print(type(x))
```
**output**
```
>> <class 'int'>
```
### floats
```python
x = 9.0
print(type(x))
```
**output**
```
>> <class 'float'>
```
### Booleans
```python
x = True
print(type(x))
```
**output**
```
>> <class 'bool'>
```
### String
```python
x = 'hello'
print(type(x))
```
**output**
```
>> <class 'str'>
```
### list
```python
x = [9.0, 1, 2, 4]
print(type(x))
```
**output**
```
>> <class 'list'>
```
### tuple
```python
x = (9.0, 1, 2, 4)
print(type(x))
```
**output**
```
>> <class 'tuple'>
```

### np.array().T
- transposes arrays of two or more dimensions by reversing the order of their axes; a one-dimensional array is returned unchanged

In [19]:
# building towards an array with 3 rows and 4 columns
# using arange(start, stop, step) to create an array of evenly spaced integers (here 0 to 11)
samp = np.arange(12)

In [20]:
samp

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
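
`np.arange` also accepts explicit `start`, `stop` and `step` arguments; `stop` itself is never included. A short sketch with illustrative names (`evens`, `countdown`):

```python
import numpy as np

evens = np.arange(0, 10, 2)      # start=0, stop=10 (excluded), step=2
countdown = np.arange(5, 0, -1)  # a negative step counts downwards

print(evens)      # [0 2 4 6 8]
print(countdown)  # [5 4 3 2 1]
```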

In [21]:
# rearrange the array to be 3 * 4
re_samp = samp.reshape(3, 4)
re_samp

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
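
A 3 * 4 array like this one is a convenient way to see what `.T` does: rows become columns, while a one-dimensional array keeps its shape. A minimal sketch, with `grid` and `vec` as illustrative names:

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns
vec = np.arange(5)                  # one-dimensional

print(grid.T.shape)  # (4, 3)
print(grid.T[0])     # [0 4 8], the first column of grid
print(vec.T.shape)   # (5,)
```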

In [23]:
# create values equally spaced between two numbers
# using np.linspace(start, stop, number of values); stop is included
# the values form an arithmetic progression
equal = np.linspace(0, 1, 10)
equal

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
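
Because both end points are included, 10 values leave 9 gaps, so the spacing here is 1/9 rather than 0.1. The sketch below checks this with `retstep=True` and shows `endpoint=False` as one way to get steps of 0.1; the variable names are illustrative.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values
values, step = np.linspace(0, 1, 10, retstep=True)
print(step)  # approximately 0.1111, i.e. (1 - 0) / (10 - 1)

# endpoint=False leaves out the stop value, giving 0.0, 0.1, ..., 0.9
half_open = np.linspace(0, 1, 10, endpoint=False)
print(half_open)
```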

In [24]:
# np.logspace(start, stop, number of values): the first value is 10**start and the last is 10**stop
# notice that the base-10 logs of the values are equally spaced
logs = np.logspace(1, 3, 8)
logs

array([  10.        ,   19.30697729,   37.2759372 ,   71.9685673 ,
        138.94954944,  268.26957953,  517.94746792, 1000.        ])
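
One way to read `np.logspace` is as `10 ** np.linspace(...)`: the exponents are evenly spaced, not the values themselves. A small sketch that checks this equivalence (the name `exponents` is illustrative):

```python
import numpy as np

logs = np.logspace(1, 3, 8)
exponents = np.linspace(1, 3, 8)  # evenly spaced exponents from 1 to 3

print(np.allclose(logs, 10.0 ** exponents))  # True
print(np.log10(logs))                        # recovers the evenly spaced exponents

# the base can be changed with the base argument
print(np.logspace(0, 3, 4, base=2))          # [1. 2. 4. 8.]
```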

In [27]:
# generating an n-dim array filled with zeros
# notice that it accepts a tuple as input
nil = np.zeros((2, 4))
nil

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [29]:
# a vector of ones of length 6
one = np.ones(6)
one


array([1., 1., 1., 1., 1., 1.])

In [30]:
# a matrix of ones (4 * 4)
mat_one = np.ones((4,4))
mat_one

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [31]:
# identity matrix
# generating a 4 * 4 identity matrix
ident = np.eye(4)
ident

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [33]:
# array arithmetic
# create the array used in the examples below
# note: adding the scalar 1 is equivalent to adding an equally sized array of ones
x = np.array((1, 2, 3, 4, 5))
x

array([1, 2, 3, 4, 5])

In [35]:
# add one to each element
x + 1

array([2, 3, 4, 5, 6])

In [36]:
# element multiplication
x * 2

array([ 2,  4,  6,  8, 10])

In [38]:
# element-wise squaring (same result as x ** 2)
x * x

array([ 1,  4,  9, 16, 25])

In [39]:
# element integer (floor) division
# returns the quotient rounded down to the nearest integer
x // 2

array([0, 1, 1, 2, 2], dtype=int32)

In [40]:
# modulus
x % 2

array([1, 0, 1, 0, 1], dtype=int32)
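
For positive numbers, floor division looks like simply dropping the decimal part, but with negative numbers the two differ: `//` rounds down, while truncation rounds toward zero. A short sketch, using an illustrative array `z`:

```python
import numpy as np

z = np.array([-7, -3, 3, 7])

print(z // 2)           # [-4 -2  1  3]  rounded down (floor)
print(np.trunc(z / 2))  # [-3. -1.  1.  3.]  rounded toward zero
print(z % 2)            # [1 1 1 1]  the remainder takes the sign of the divisor
```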

In [41]:
# create an array of 10 elements
y = np.arange(-5, 5)
y

array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])

In [43]:
# slicing and indexing
# indexing
y[8]

3

In [47]:
# slicing
# get the third to the sixth element 
y[2:6]

array([-3, -2, -1,  0])

In [48]:
# get the last two elements
y[-2:]

array([3, 4])

In [49]:
# creating a copy
a = np.arange(1, 10)
a

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [51]:
b = a[2:6]
b

array([3, 4, 5, 6])

In [53]:
b[1] = 5
b

array([3, 5, 5, 6])

In [55]:
a

array([1, 2, 3, 5, 5, 6, 7, 8, 9])

In [56]:
# editing the slice also edits the original array; to prevent this, use copy()
a = np.arange(1, 10)
b = a[2:6].copy()
b

array([3, 4, 5, 6])

In [57]:
b[1] = 5
a

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [58]:
b

array([3, 5, 5, 6])
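
Whether two arrays share the same underlying data can be checked directly instead of being inferred from a side effect. A minimal sketch, assuming the illustrative names `view` and `dup`:

```python
import numpy as np

a = np.arange(1, 10)
view = a[2:6]        # basic slicing returns a view onto a's data
dup = a[2:6].copy()  # copy() allocates separate data

print(np.shares_memory(a, view))  # True
print(np.shares_memory(a, dup))   # False
print(view.base is a)             # True, the view refers back to a
```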

## Numpy Arithmetic and Statistical Functions


In [59]:
samp2 = np.array([-1.4, 0.4, -3.2, 2.5, 3.4])
samp2

array([-1.4,  0.4, -3.2,  2.5,  3.4])

In [60]:
# absolute value
ab_val = np.abs(samp2)
ab_val

array([1.4, 0.4, 3.2, 2.5, 3.4])

In [61]:
# maximum (np.min gives the minimum in the same way)
max_val = np.max(samp2)
max_val

3.4

In [62]:
# subtract 2 from each element (np.add works the same way for addition)
minus = np.subtract(samp2, 2)
minus

array([-3.4, -1.6, -5.2,  0.5,  1.4])

In [63]:
# mean
mn = np.mean(samp2)
mn

0.33999999999999997

In [64]:
# standard deviation
sd = np.std(samp2)
sd

2.432776191925595
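
By default `np.std` computes the population standard deviation, dividing by `n`. If the data is a sample and the `n - 1` divisor is wanted instead, the `ddof` argument can be used. A short sketch (`pop_sd` and `sample_sd` are illustrative names):

```python
import numpy as np

samp2 = np.array([-1.4, 0.4, -3.2, 2.5, 3.4])

pop_sd = np.std(samp2)             # divides by n (ddof=0 is the default)
sample_sd = np.std(samp2, ddof=1)  # divides by n - 1

print(pop_sd)     # about 2.4328
print(sample_sd)  # about 2.7199
```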

## Class Task
1. Generate 20 elements that are normally distributed. Calculate the:
- mean
- standard deviation
- mode

2. Generate another 30 elements that are normally distributed. Calculate the:
- mean
- standard deviation
- mode

3. Combine the two sets of elements and find the:
- mean
- standard deviation
- mode
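
One possible way to approach the task is sketched below. It is not the only solution and rests on several assumptions: the seed, the distribution parameters (`loc=0.0`, `scale=1.0`) and the helper `summarize` are all chosen here for illustration. Because normally distributed floats almost never repeat, the mode is taken over the values rounded to the nearest integer, which is itself a simplifying choice.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for repeatable results


def summarize(values):
    """Return (mean, standard deviation, mode of the values rounded to integers)."""
    rounded = np.round(values).astype(int)
    uniques, counts = np.unique(rounded, return_counts=True)
    mode = uniques[np.argmax(counts)]  # on ties, the smallest value is returned
    return np.mean(values), np.std(values), mode


first = rng.normal(loc=0.0, scale=1.0, size=20)
second = rng.normal(loc=0.0, scale=1.0, size=30)
combined = np.concatenate([first, second])

for name, data in [("first", first), ("second", second), ("combined", combined)]:
    mean, sd, mode = summarize(data)
    print(f"{name}: n={data.size}, mean={mean:.3f}, sd={sd:.3f}, mode={mode}")
```

The combined mean is the size-weighted average of the two individual means, which gives a quick way to check the result.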