### NumPy Primer
----------------------

### Arrays using Python Lists
---------------------------------

Let us set a benchmark using pure Python. 

First, a vector as a `list` object.

In [1]:
v = [0.5, 0.75, 1.0, 1.5, 2.0]  # vector of numbers

Second, a matrix as a `list` of `list` objects.

In [2]:
m = [v, v, v]  # matrix of numbers
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]

In [3]:
m[1]

[0.5, 0.75, 1.0, 1.5, 2.0]

In [4]:
m[1][0]

0.5

Third, a tensor (here a cube of numbers) as a nested `list`.

In [5]:
v1 = [0.5, 1.5]
v2 = [1, 2]
m = [v1, v2]
c = [m, m]  # cube of numbers
c

[[[0.5, 1.5], [1, 2]], [[0.5, 1.5], [1, 2]]]

In [6]:
c[1][1][0]

1

In [7]:
v = [0.5, 0.75, 1.0, 1.5, 2.0]
m = [v, v, v]
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]

In [8]:
v[0] = 'Python'
m

[['Python', 0.75, 1.0, 1.5, 2.0],
 ['Python', 0.75, 1.0, 1.5, 2.0],
 ['Python', 0.75, 1.0, 1.5, 2.0]]

#### What just happened? 

Python does not copy an object when it is assigned to a name or placed in a container. The expression `[v, v, v]` does build a new list `m`, but its three elements are references to the same list object `v`. Changing `v` in place is therefore visible through every row of `m`.
To get independent objects, Python provides the built-in `copy` module, whose `deepcopy` function recursively copies an object.

In [9]:
from copy import deepcopy
v = [0.5, 0.75, 1.0, 1.5, 2.0]
m = [deepcopy(v) for _ in range(3)]  # three independent copies; 3 * [deepcopy(v)] would repeat one copy three times
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]

In [10]:
v[0] = 'Python'
m

[[0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0],
 [0.5, 0.75, 1.0, 1.5, 2.0]]
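
The aliasing seen in cells 7 and 8, and the effect of copying, can be checked directly with the `is` operator, which tests whether two expressions refer to the same object. The following is a minimal sketch; the names `shared` and `independent` are illustrative and do not appear in the original notebook.

```python
from copy import deepcopy

v = [0.5, 0.75, 1.0, 1.5, 2.0]

# Three references to the same list object: every row is v itself.
shared = [v, v, v]

# Three separate deep copies: each row is its own list object.
independent = [deepcopy(v) for _ in range(3)]

print(shared[0] is v)                    # True
print(independent[0] is v)               # False
print(independent[0] is independent[1])  # False

# An in-place change to v shows up only where v itself is referenced.
v[0] = 'Python'
print(shared[0][0])        # Python
print(independent[0][0])   # 0.5
```

Building the rows with a list comprehension calls `deepcopy` once per row, so the rows are also independent of each other.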

### NumPy Data Structures

#### Regular NumPy Arrays

NumPy is a cornerstone of the scientific Python and PyData ecosystem.

It provides the `ndarray` class for handling multi-dimensional arrays whose elements all share one data type.

In [11]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
%matplotlib inline
import random

In [12]:
data2 = [[1,2,3,4],[5,6,7,8]]

In [13]:
arr2 = np.array(data2)
arr2

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [14]:
print(arr2.ndim)   # number of dimensions (axes)
print(arr2.shape)  # length along each axis: (rows, columns)
print(arr2.dtype)  # data type of the elements

2
(2, 4)
int32
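
As a small sketch of how these attributes relate, the example below builds a three-dimensional array and requests an explicit `dtype`. The default integer type depends on the platform and NumPy version, so the `int32` shown above may appear as `int64` on other systems; passing `dtype` removes that ambiguity. The name `cube` is illustrative.

```python
import numpy as np

# A 2 x 2 x 3 array: two blocks, each holding two rows of three values.
cube = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]], dtype=np.int64)

print(cube.ndim)    # number of axes: 3
print(cube.shape)   # length along each axis: (2, 2, 3)
print(cube.dtype)   # int64, because it was requested explicitly
print(cube.size)    # total number of elements: 12
```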


In [15]:
print ('Creating an array with 0s as elements')
np.zeros(8)

Creating an array with 0s as elements


array([0., 0., 0., 0., 0., 0., 0., 0.])

In [16]:
np.zeros((2,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [17]:
print('It creates an array with 1s as elements')
np.ones((2, 3))

It creates an array with 1s as elements


array([[1., 1., 1.],
       [1., 1., 1.]])

In [18]:
print('It creates the identity matrix; the size must be provided')
np.eye(3)

It creates the identity matrix; the size must be provided


array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
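
The outputs above show trailing dots because `np.zeros`, `np.ones` and `np.eye` return floating-point arrays by default. A brief sketch of how a different `dtype` can be requested, and of what makes the identity matrix special; the variable names are illustrative.

```python
import numpy as np

default_zeros = np.zeros((2, 4))
int_zeros = np.zeros((2, 4), dtype=np.int64)

print(default_zeros.dtype)  # float64
print(int_zeros.dtype)      # int64

# Multiplying a square matrix by the identity matrix leaves it unchanged.
a = np.arange(9.0).reshape(3, 3)
identity = np.eye(3)
print(np.array_equal(a @ identity, a))  # True
```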

In [19]:
arr3 = np.array([1.1,2.2,3.3,4.4])

In [20]:
print('It converts floats to integers by truncating the fractional part')
print('arr3: Integers ', arr3.astype(np.int32))
print('It does not change the original array')
print('arr3: ', arr3)

It converts floats to integers by truncating the fractional part
arr3: Integers  [1 2 3 4]
It does not change the original array
arr3:  [1.1 2.2 3.3 4.4]
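
Truncation means the fractional part is discarded, so values move toward zero. This differs from rounding to the nearest integer, and the difference is most visible for negative numbers. A brief sketch with illustrative values that are not from the original notebook:

```python
import numpy as np

values = np.array([-1.7, -0.2, 0.9, 1.7])

truncated = values.astype(np.int32)          # drops the fractional part
rounded = np.rint(values).astype(np.int32)   # rounds to the nearest integer first

print(truncated)  # [-1  0  0  1]
print(rounded)    # [-2  0  1  2]
print(values)     # the source array keeps its float values
```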


In [21]:
print('It retrieves a specific element in the array')
print(arr3[2])

It retrieves a specific element in the array
3.3
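
Indexing extends naturally to more dimensions. Nested lists need chained brackets such as `m[1][0]`, whereas an `ndarray` accepts several indices and slices inside a single pair of brackets. A short sketch, rebuilding the two-dimensional array from earlier so that the block runs on its own:

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

print(arr2[1, 0])     # row 1, column 0: 5
print(arr2[0])        # first row: [1 2 3 4]
print(arr2[:, 1])     # second column: [2 6]
print(arr2[1, 1:3])   # row 1, columns 1 and 2: [6 7]
```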


#### Structured Arrays

In [22]:
dt = np.dtype([('Name', 'S10'), ('Age', 'i4'),
               ('Height', 'f'), ('Children/Pets', 'i4', 2)])
s = np.array([('Smith', 45, 1.83, (0, 1)),
              ('Jones', 53, 1.72, (2, 2))], dtype=dt)
s

array([(b'Smith', 45, 1.83, [0, 1]), (b'Jones', 53, 1.72, [2, 2])],
      dtype=[('Name', 'S10'), ('Age', '<i4'), ('Height', '<f4'), ('Children/Pets', '<i4', (2,))])

In [23]:
s['Name']

array([b'Smith', b'Jones'], dtype='|S10')

In [24]:
s['Height'].mean()

1.7750001

In [25]:
s[1]['Age']

53
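
Because each field behaves like a regular array, a field can also drive boolean selection of whole records. The sketch below rebuilds the structured array from above so that it runs on its own; the name `older` is illustrative.

```python
import numpy as np

dt = np.dtype([('Name', 'S10'), ('Age', 'i4'),
               ('Height', 'f'), ('Children/Pets', 'i4', 2)])
s = np.array([('Smith', 45, 1.83, (0, 1)),
              ('Jones', 53, 1.72, (2, 2))], dtype=dt)

# A boolean mask built from one field selects complete records.
older = s[s['Age'] > 50]
print(older['Name'])              # [b'Jones']

# A field with a sub-shape yields one extra axis per record.
print(s['Children/Pets'].shape)   # (2, 2): two records, two values each
print(s['Children/Pets'][:, 0])   # first sub-value of each record: [0 2]
```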