# M.A.D. Python Libraries - `numpy`

<span style="color:red;">**M.A.D.** => **M**achine Learning **A**nd **D**ata Science</span>

**Purpose:** The purpose of this workbook is to help you get comfortable with the topics outlined below.

**Prereqs**
* None
    
**Recommended Usage**
* Run each of the cells (Shift+Enter) and edit them as necessary to solidify your understanding
* Do any of the exercises that are relevant to helping you understand the material

**Topics Covered**
* Numpy

# Workbook Setup

## Troubleshooting Tips

If you run into issues running any of the code in this notebook, check your version of Jupyter, Python, extensions, libraries, etc.

```bash
!jupyter --version

jupyter core     : 4.6.1
jupyter-notebook : 6.0.2
qtconsole        : not installed
ipython          : 7.9.0
ipykernel        : 5.1.3
jupyter client   : 5.3.4
jupyter lab      : 1.2.3
nbconvert        : 5.6.1
ipywidgets       : not installed
nbformat         : 4.4.0
traitlets        : 4.3.3
```

```bash
!jupyter-labextension list

JupyterLab v1.2.3
Known labextensions:
   app dir: /usr/local/share/jupyter/lab
        @aquirdturtle/collapsible_headings v0.5.0  enabled  OK
        @jupyter-widgets/jupyterlab-manager v1.1.0  enabled  OK
        @jupyterlab/git v0.8.2  enabled  OK
        @jupyterlab/github v1.0.1  enabled  OK
        jupyterlab-flake8 v0.4.0  enabled  OK

Uninstalled core extensions:
    @jupyterlab/github
    jupyterlab-flake8
```

In [56]:
# # Run this cell to check the version of Jupyter you are running
# !jupyter --version

In [57]:
# # Run one of these cells to check what extensions you are using
# !jupyter-labextension list
# !jupyter-nbextension list

In [58]:
# # Check Python version
# import sys
# print(sys.version)

## Notebook Configs

In [1]:
# AUTO GENERATED CELL FOR NOTEBOOK SETUP

# NOTEBOOK WIDE MAGICS

# Reload all modules before executing a new line
%load_ext autoreload
%autoreload 2

# Abide by PEP8 code style
%load_ext pycodestyle_magic
%pycodestyle_on

# LIBRARY SPECIFIC MAGICS - UNCOMMENT AS NEEDED

# Plot all matplotlib plots in output cell and save on close
# %matplotlib inline

In [3]:
import numpy as np

In [1]:
# If numpy didn't import, you may need to install it using pip
# !pip install numpy

In [4]:
def print_a(a):
    print('Shape: {}\n{}'.format(a.shape, a))

# [`numpy`](https://docs.scipy.org/doc/numpy/reference/)

`numpy` (Numerical Python) is a widely used library for representing and manipulating numerical data as N-dimensional arrays. Its core is written in C.

[Numpy Cheatsheet (pdf)](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf)

[Numpy Docs](https://docs.scipy.org/doc/numpy/reference/)

## Creating Arrays

Numpy arrays can be created from Python lists or initialized with placeholder content such as zeros, ones, or random numbers.

#### Create arrays from lists

```python
np.array()
```

In [4]:
a = np.array([1, 2, 3])
print_a(a)

Shape: (3,)
[1 2 3]


In [5]:
b = np.array([(1.5, 2, 3), (4, 5, 6)], dtype=float)
print_a(b)

Shape: (2, 3)
[[1.5 2.  3. ]
 [4.  5.  6. ]]


In [6]:
c = np.array([[(1.5, 2, 3), (4, 5, 6)], [(3, 2, 1), (4, 5, 6)]], dtype=float)
print_a(c)

Shape: (2, 2, 3)
[[[1.5 2.  3. ]
  [4.  5.  6. ]]

 [[3.  2.  1. ]
  [4.  5.  6. ]]]


#### Create arrays of constant numbers

```python
np.zeros()
np.ones()
np.full()
```

In [7]:
# Create an array of zeros with shape 3 by 4
a = np.zeros((3, 4))
print_a(a)

Shape: (3, 4)
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]


In [8]:
# Create an array of ones with shape 2 x 3 x 4
a = np.ones((2, 3, 4), dtype=np.int16)
print_a(a)

Shape: (2, 3, 4)
[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]


In [9]:
# Create a constant array of 7s
e = np.full((2, 2), 7)
print_a(e)

Shape: (2, 2)
[[7 7]
 [7 7]]


#### Create arrays of random numbers

```python
np.random.random()
```

In [10]:
# Create a 2x2 array with random values (defaults to random nums b/t 0 and 1)
a = np.random.random((2, 2))
print_a(a)

Shape: (2, 2)
[[0.73995323 0.27790262]
 [0.20077016 0.55050662]]


In [11]:
# Create a 2x3x4 array with random values
a = np.random.rand(2, 3, 4)
print_a(a)

Shape: (2, 3, 4)
[[[0.00633874 0.27644084 0.17340644 0.76341107]
  [0.64315063 0.1006094  0.24823375 0.06521121]
  [0.10391418 0.69666518 0.03685724 0.23748964]]

 [[0.39801309 0.92291537 0.82092685 0.09837016]
  [0.57353373 0.44527238 0.52481273 0.70337655]
  [0.53421923 0.54637318 0.26687017 0.2495686 ]]]


In [12]:
# Create an array with random ints
a = np.random.randint(-3, 3, size=12)
print_a(a)

Shape: (12,)
[-2 -2 -1 -1  1 -3 -2 -2 -3  2 -3 -3]


In [13]:
# Create uniform distribution between 0 and 1 in a 2x4 matrix
a = np.random.random_sample((2, 4))
print_a(a)

Shape: (2, 4)
[[0.62049994 0.00231598 0.17504248 0.05006043]
 [0.68392223 0.70521001 0.93827918 0.42485141]]


#### Create empty arrays
```python
np.empty()
```

In [14]:
# Create an empty array (uninitialized so whatever is in mem loc already stays)
np.empty((3, 2))

array([[1.39069238e-309, 1.39069238e-309],
       [1.39069238e-309, 1.39069238e-309],
       [1.39069238e-309, 1.39069238e-309]])

#### Create evenly spaced arrays

```python
np.arange()  # start, stop, step
np.linspace()  # start, stop, num quantities
```

What's the difference?
* `linspace` enables you to control the precise end value
* `arange` gives you more direct control over the increments between values
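A minimal sketch of that difference, assuming only that `numpy` is imported as `np`; the variable names `by_step` and `by_count` are illustrative.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded
by_step = np.arange(0, 1, 0.25)

# linspace: you choose the number of samples; the stop value is included by default
by_count = np.linspace(0, 1, 5)

print(by_step)   # 4 values, ends before 1
print(by_count)  # 5 values, ends exactly at 1
```

When the step is a float that cannot be represented exactly, `linspace` is usually the safer choice because the number of elements is fixed up front.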

In [15]:
# Create an array of evenly spaced values (step value)
# from 10 to 25 in steps of 5
d = np.arange(10, 25, 5)
print_a(d)

Shape: (3,)
[10 15 20]


In [16]:
# Create an array of evenly spaced values (number of samples)
# 9 numbers from 0 to 2
print_a(np.linspace(0, 2, 9))

Shape: (9,)
[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]


#### Create an identity matrix

```python
np.eye()
```

In [6]:
# Create a 10X10 identity matrix; diagonal of 1s
f = np.eye(10)
print_a(f)

Shape: (10, 10)
[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]


#### Create a mesh grid

```python
np.mgrid[start:stop:step]  # indexed with slices, not called
```

In [18]:
# Create 2 meshgrids with nums from 0 to 4 (the stop value 5 is excluded)
x, y = np.mgrid[0:5, 0:5]
print_a(x)
print_a(y)

Shape: (5, 5)
[[0 0 0 0 0]
 [1 1 1 1 1]
 [2 2 2 2 2]
 [3 3 3 3 3]
 [4 4 4 4 4]]
Shape: (5, 5)
[[0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]
 [0 1 2 3 4]]


In [19]:
# Create a meshgrid from -1 to 1 with 5 numbers
x = np.mgrid[-1:1:5j]
print_a(x)

Shape: (5,)
[-1.  -0.5  0.   0.5  1. ]


## I/O

```python
np.save()
np.load()
np.savez()
```

### Saving and Loading on Disk

Save an array as a .npy file

In [78]:
np.save('my_array', a)

In [79]:
np.load('my_array.npy')

array([[0.97794309, 0.61588732, 0.03405745, 0.17937364],
       [0.43927472, 0.98477142, 0.61058369, 0.51144631]])

Save several arrays in an uncompressed .npz file

In [80]:
np.savez('array.npz', array1=a, array2=b)

In [81]:
my_arrays = np.load('array.npz')

In [82]:
my_arrays['array1']

array([[0.97794309, 0.61588732, 0.03405745, 0.17937364],
       [0.43927472, 0.98477142, 0.61058369, 0.51144631]])

### Saving and Loading Text Files

We can also load data from a text file (each row must have the same number of values)

In [8]:
# Create the file to run this cell
# np.loadtxt("myfile.txt")

Or load data from a text file with missing values handled in a specific way

In [None]:
# Create the file to run this cell
# np.genfromtxt("my_file.csv", delimiter=',')

In [9]:
# Uncomment to write the array to a text file
# np.savetxt("myarray.txt", a, delimiter=" ")
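The following sketch shows how missing values can be handled with `np.genfromtxt`. It uses in-memory `io.StringIO` objects as stand-ins for a real CSV file, and the sample text is made up for illustration.

```python
import io
import numpy as np

# Hypothetical CSV content; the second row has a missing middle value
csv_text = "1,2,3\n4,,6"

# By default, missing numeric fields are read as nan
with_nan = np.genfromtxt(io.StringIO(csv_text), delimiter=',')

# filling_values replaces missing fields with a value of your choice
with_zero = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                          filling_values=0)

print(with_nan)
print(with_zero)
```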

## Datatypes

```python
np.int64, np.float32, np.complex128, np.bool_, np.object_, np.bytes_, np.str_
```

In [None]:
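np.int64  # Signed 64-bit integer type

In [None]:
# Single-precision (32-bit) floating point; np.float64 is double precision
np.float32

In [None]:
# Complex numbers represented by two 64-bit floats
# (the old alias np.complex was removed in NumPy 1.24)
np.complex128

In [None]:
np.bool_  # Boolean type storing True and False values

In [None]:
np.object_  # Python object type (the old alias np.object was removed)

In [None]:
np.bytes_  # Fixed-length byte string type (np.string_ in older versions)

In [None]:
np.str_  # Fixed-length unicode type (np.unicode_ in older versions)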

## Inspecting Your Array

```python
my_array.shape  # array shape
my_array.ndim  # num dimensions
my_array.size  # num elements
len(my_array)  # array length
my_array.astype(int)  # convert type
```

In [20]:
print(a)
a.shape  # Array dimensions

[[0.62049994 0.00231598 0.17504248 0.05006043]
 [0.68392223 0.70521001 0.93827918 0.42485141]]


(2, 4)

In [21]:
# Length of array
len(a)

2

In [22]:
# Length of the first row (size of the second dimension)
len(a[0])

4

In [23]:
b

array([[1.5, 2. , 3. ],
       [4. , 5. , 6. ]])

In [24]:
b.ndim  # Number of array dimensions

2

In [25]:
e

array([[7, 7],
       [7, 7]])

In [26]:
e.size  # Number of array elements

4

In [27]:
b.dtype  # Data type of array elements

dtype('float64')

In [28]:
b.dtype.name  # Name of data type

'float64'

In [29]:
b.astype(int)  # Convert an array to a different type

array([[1, 2, 3],
       [4, 5, 6]])

## Array Mathematics

### Arithmetic Ops

**Broadcasting** is the set of rules numpy uses to apply arithmetic to arrays with different shapes: the smaller array is virtually stretched to match the larger one without copying data. The looping happens in C instead of Python, so it is fast.

Broadcasting happens automatically and is generally the fastest approach.

It is done using four rules:

* All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes.

* The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.

* An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.

* If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension.

[More on broadcasting from the docs](https://docs.scipy.org/doc/numpy-1.10.0/user/basics.broadcasting.html)
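As a rough illustration of the shape rules, the sketch below checks which shapes are compatible. The arrays `x`, `y`, and `z` are made up for this example.

```python
import numpy as np

x = np.ones((2, 3))  # shape (2, 3)
y = np.arange(3)     # shape (3,): treated as (1, 3), then stretched to (2, 3)
z = np.ones((2, 1))  # shape (2, 1): stretched along the last axis

print(np.broadcast(x, y).shape)
print(np.broadcast(x, z).shape)

# Shapes (2, 3) and (2,) are incompatible: the trailing sizes are 3 and 2,
# and neither of them is 1
try:
    x + np.ones(2)
    compatible = True
except ValueError:
    compatible = False
print(compatible)
```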

Let's see it in action using the two arrays defined below

In [30]:
a = np.array([1, 2, 3])
b = np.array([[1.5, 2.,  3.], [4.,  5.,  6.]])
print_a(a)
print_a(b)

Shape: (3,)
[1 2 3]
Shape: (2, 3)
[[1.5 2.  3. ]
 [4.  5.  6. ]]


#### Subtraction and Addition

We can see that no dimension mismatch error occurs because array `a` is broadcast to each row in `b`

In [31]:
g = a - b
print('Subtraction\n a - b = g\n\n {}\n - \n{}\n =\n{}'.format(a, b, g))

Subtraction
 a - b = g

 [1 2 3]
 - 
[[1.5 2.  3. ]
 [4.  5.  6. ]]
 =
[[-0.5  0.   0. ]
 [-3.  -3.  -3. ]]


In [32]:
np.subtract(a, b)

array([[-0.5,  0. ,  0. ],
       [-3. , -3. , -3. ]])

In [33]:
h = b + a
print('Addition\n a + b = h\n\n {}\n + \n{}\n =\n{}'.format(a, b, h))

Addition
 a + b = h

 [1 2 3]
 + 
[[1.5 2.  3. ]
 [4.  5.  6. ]]
 =
[[2.5 4.  6. ]
 [5.  7.  9. ]]


In [34]:
np.add(b, a)

array([[2.5, 4. , 6. ],
       [5. , 7. , 9. ]])

#### Division and Multiplication

In [35]:
i = a / b
print('Division\n a / b = i\n\n {}\n / \n{}\n =\n{}'.format(a, b, i))

Division
 a / b = i

 [1 2 3]
 / 
[[1.5 2.  3. ]
 [4.  5.  6. ]]
 =
[[0.66666667 1.         1.        ]
 [0.25       0.4        0.5       ]]


In [36]:
np.divide(a, b)

array([[0.66666667, 1.        , 1.        ],
       [0.25      , 0.4       , 0.5       ]])

In [37]:
j = a * b
print('Multiplication\n a * b = j\n\n {}\n * \n{}\n =\n{}'.format(a, b, j))

Multiplication
 a * b = j

 [1 2 3]
 * 
[[1.5 2.  3. ]
 [4.  5.  6. ]]
 =
[[ 1.5  4.   9. ]
 [ 4.  10.  18. ]]


In [38]:
np.multiply(a, b)

array([[ 1.5,  4. ,  9. ],
       [ 4. , 10. , 18. ]])

*Note: If you want to do matrix multiplication you need to use `np.dot()`, `np.matmul()`, or the `@` operator; the `*` operator and the `multiply()` function do element-wise multiplication*
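A small sketch contrasting the two kinds of multiplication. The matrices `m` and `n` are made up for this example.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])

elementwise = m * n            # same as np.multiply(m, n)
matrix_product = np.dot(m, n)  # for 2-D arrays, same as m @ n

print(elementwise)
print(matrix_product)
```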

#### Other Mathy Stuff

Exponentiation, square roots, sin, cos, etc

In [39]:
print(b)
np.exp(b)  # Exponentiation (ie. e^b)

[[1.5 2.  3. ]
 [4.  5.  6. ]]


array([[  4.48168907,   7.3890561 ,  20.08553692],
       [ 54.59815003, 148.4131591 , 403.42879349]])

In [40]:
print(b)
np.sqrt(b)  # Square root

[[1.5 2.  3. ]
 [4.  5.  6. ]]


array([[1.22474487, 1.41421356, 1.73205081],
       [2.        , 2.23606798, 2.44948974]])

In [41]:
print(a)
np.sin(a)  # Element-wise sine

[1 2 3]


array([0.84147098, 0.90929743, 0.14112001])

In [42]:
print(b)
np.cos(b)  # Element-wise cosine

[[1.5 2.  3. ]
 [4.  5.  6. ]]


array([[ 0.0707372 , -0.41614684, -0.9899925 ],
       [-0.65364362,  0.28366219,  0.96017029]])

In [43]:
print(a)
np.log(a)  # Element-wise natural logarithm

[1 2 3]


array([0.        , 0.69314718, 1.09861229])

### Comparison (element-wise and array-wise)

```python
array_1 == array_2  # element-wise comp
np.array_equal()  # test if same shape, same elements values
np.array_equiv()  # test if broadcastable shape, same elements values
np.allclose()  # test if broadcastable shape, elements equal within a tolerance
```

In [44]:
print_a(a)
a < 2  # Element-wise comparison array([True, False, False], dtype=bool)

Shape: (3,)
[1 2 3]


array([ True, False, False])

In [45]:
print_a(a)
print_a(b)
a == b  # Element-wise comparison

Shape: (3,)
[1 2 3]
Shape: (2, 3)
[[1.5 2.  3. ]
 [4.  5.  6. ]]


array([[False,  True,  True],
       [False, False, False]])

Let's take a look at some of the equivalency functions that numpy provides and learn how they are different.

In [46]:
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])

In [47]:
print(np.array_equal(a, b))
print(np.array_equiv(a, b))
print(np.allclose(a, b))

True
True
True


Arrays `a` and `b` are equal in shape and element values.

In [48]:
a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [1, 2, 3]])

In [49]:
print(np.array_equal(a, b))
print(np.array_equiv(a, b))
print(np.allclose(a, b))

False
True
True


Arrays `a` and `b` have different shapes, so `array_equal` is `False`. They can be broadcast together, and the element values match after broadcasting, so `array_equiv` and `allclose` are `True`.

In [52]:
a = np.array([1e10, 1e-8])
b = np.array([1.00001e10, 1e-9])

In [53]:
print(np.array_equal(a, b))
print(np.array_equiv(a, b))
print(np.allclose(a, b))

False
False
True


Arrays `a` and `b` are clearly different, but each pair of elements falls within the default tolerances of `allclose`, so it returns `True`.
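To make "close enough" concrete, the sketch below reproduces the documented `allclose` test, `|a - b| <= atol + rtol * |b|`, by hand with the default tolerances `rtol=1e-05` and `atol=1e-08`. The name `within_tol` is illustrative.

```python
import numpy as np

a = np.array([1e10, 1e-8])
b = np.array([1.00001e10, 1e-9])

rtol, atol = 1e-05, 1e-08  # documented defaults of np.allclose
within_tol = np.abs(a - b) <= atol + rtol * np.abs(b)

print(within_tol)                  # per-element result
print(np.allclose(a, b))           # passes with the defaults
strict = np.allclose(a, b, rtol=1e-06)
print(strict)                      # fails with a tighter relative tolerance
```

Note that the test is not symmetric: the relative tolerance is scaled by the second argument.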

### Aggregate Functions

```python
my_array.sum()
my_array.min()
my_array.max()
my_array.cumsum()
np.median(my_array)  # median is a function, not an array method
my_array.mean()
my_array.std()
```

Starting with two arrays `a` and `b`, let's demonstrate some aggregation functions.

In [10]:
a = np.array([1, 2, 3])
b = np.array([[1, 2, 3], [4, 5, 6]])

In [55]:
a.sum()  # Array-wise sum

6

In [56]:
print('{}\n'.format(a))

print(a.min())  # Array-wise minimum value
print(a.max())  # Array-wise maximum value

[1 2 3]

1
3


In [57]:
b[0][2] = 0
b[1][1] = 1
print('{}\n'.format(b))

print(b.min(axis=0))  # Minimum of each column (reduces along axis 0)
print(b.max(axis=0))  # Maximum of each column (reduces along axis 0)

[[1 2 0]
 [4 1 6]]

[1 1 0]
[4 2 6]


**Note on `axis` indexing**

In numpy, `axis` refers to the dimension along which a function operates; that dimension is collapsed in the result.

* For a 2D matrix, axis 0 runs down the rows and axis 1 runs across the columns, so `min(axis=0)` returns one value per column and `min(axis=1)` returns one value per row.
* For a higher-dimensional array, each axis corresponds to one entry in the array's shape.
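A short sketch of how the `axis` argument changes the result shape. The arrays `m` and `t` are made up for this example.

```python
import numpy as np

m = np.array([[1, 2, 0],
              [4, 1, 6]])  # shape (2, 3)

col_sums = m.sum(axis=0)  # collapse axis 0: one value per column
row_sums = m.sum(axis=1)  # collapse axis 1: one value per row

print(col_sums, col_sums.shape)
print(row_sums, row_sums.shape)

# For higher dimensions the reduced axis disappears from the shape
t = np.zeros((2, 4, 3))
print(t.sum(axis=1).shape)
```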

In [22]:
a = np.random.randint(5, size=(2, 4, 3))
print(a.ndim)
print(a.shape)
print(a)

3
(2, 4, 3)
[[[1 0 1]
  [4 3 2]
  [4 0 1]
  [1 3 3]]

 [[4 0 3]
  [2 4 4]
  [1 1 4]
  [2 1 3]]]


In [None]:
print('{}\n'.format(a))
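print(a.min(axis=0))  # Reduce along axis 0 -> shape (4, 3)
print(a.min(axis=1))  # Reduce along axis 1 -> shape (2, 3)
print(a.min(axis=2))  # Reduce along axis 2 -> shape (2, 4)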


In [58]:
print('{}\n'.format(b))

print(b.min(axis=1))  # Minimum of each row (reduces along axis 1)
print(b.max(axis=1))  # Maximum of each row (reduces along axis 1)

[[1 2 0]
 [4 1 6]]

[0 1]
[2 6]


In [59]:
print('{}\n'.format(b))
b.cumsum(axis=1)  # Cumulative sum of the elements

[[1 2 0]
 [4 1 6]]



array([[ 1,  3,  3],
       [ 4,  5, 11]])

In [60]:
print('{}\n'.format(a))
a.mean()  # Mean

[1 2 3]



2.0

In [61]:
print('{}\n'.format(b))
np.median(b)  # Median

[[1 2 0]
 [4 1 6]]



1.5

In [62]:
print('{}\n'.format(a))
np.corrcoef(a)  # Correlation coefficient

[1 2 3]



1.0

In [63]:
print('{}\n'.format(b))
np.std(b)  # Standard deviation

[[1 2 0]
 [4 1 6]]



2.0548046676563256

## Copying

```python
my_array.view()
my_array.copy()
```

It's important to know whether you are creating an actual copy of your array or just another view of the same array. If you create a view and start experimenting with it, you are changing the original array's data.

**Copy / Deep Copy:** When the contents are physically stored in another memory location, it is called a copy (deep by default).

**View / Shallow Copy:** When a different view of the same memory content is provided, it is called a view.

In [11]:
print('Array: {} --> mem loc: {}\n'.format(a, a.__array_interface__['data']))

h = a  # Plain assignment: h is a second name for the same array object

print('Array: {} --> mem loc: {}\n'.format(h, h.__array_interface__['data']))

Array: [1 2 3] --> mem loc: (140695021199088, False)

Array: [1 2 3] --> mem loc: (140695021199088, False)



In [12]:
print('Array: {} --> mem loc: {}\n'.format(a, a.__array_interface__['data']))

h = a.view()  # Create a view of the array with the same data

print('Array: {} --> mem loc: {}\n'.format(h, h.__array_interface__['data']))

Array: [1 2 3] --> mem loc: (140695021199088, False)

Array: [1 2 3] --> mem loc: (140695021199088, False)



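In both cases `h` references the same memory location as `a`. Plain assignment (`h = a`) only binds a second name to the same array object, while `a.view()` creates a new array object that shares `a`'s data.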

In [13]:
print('Array: {} --> mem loc: {}\n'.format(a, a.__array_interface__['data']))

h = np.copy(a)  # Create a copy of the array

print('Array: {} --> mem loc: {}\n'.format(h, h.__array_interface__['data']))

Array: [1 2 3] --> mem loc: (140695021199088, False)

Array: [1 2 3] --> mem loc: (140695021943792, False)



In [14]:
print('Array: {} --> mem loc: {}\n'.format(a, a.__array_interface__['data']))

h = a.copy()  # Create a copy of the array

print('Array: {} --> mem loc: {}\n'.format(h, h.__array_interface__['data']))

Array: [1 2 3] --> mem loc: (140695021199088, False)

Array: [1 2 3] --> mem loc: (140695021943792, False)



These two examples however create actual copies of the array that are stored in different memory locations.
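The practical consequence shows up when you modify the data. The sketch below uses illustrative names (`original`, `alias`, `view`, `copy`) and `np.shares_memory` to check which arrays share a buffer.

```python
import numpy as np

original = np.array([1, 2, 3])
alias = original        # same object, just a second name
view = original.view()  # new array object, same underlying data
copy = original.copy()  # new array object, separate data

view[0] = 99            # writes through to the shared buffer

print(original)         # first element changed via the view
print(copy)             # unaffected
print(alias is original, view is original)
print(np.shares_memory(original, view), np.shares_memory(original, copy))
```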

## Sorting

```python
my_array.sort()
```

In [68]:
a = np.array([2, 1, 4])
print(a)

a.sort()  # Sort an array in place
print(a)

[2 1 4]
[1 2 4]


We can also sort along a specific axis of an array

In [69]:
c = np.random.randint(0, 9, (2, 2, 3))
print('{}\n\n'.format(c))

c.sort(axis=0)  # Sort the elements of an array's axis
print(c)

[[[4 8 1]
  [5 8 2]]

 [[1 3 0]
  [4 7 6]]]


[[[1 3 0]
  [4 7 2]]

 [[4 8 1]
  [5 8 6]]]


## Subsetting, Slicing, Indexing

```python
my_array[2]
my_array[0][1]
my_array[0, 1] 
my_array[0:2, 1]
my_array[0:2, ...]
```

**Subsetting** - selecting specific rows and columns within the data

**Slicing** - selecting subsets based on a range of values

**Indexing** - selecting specific array values

In [70]:
a = np.array([1, 2, 4])
b = np.array([[1, 2, 0], [4, 1, 6]])
print_a(a)
print_a(b)

Shape: (3,)
[1 2 4]
Shape: (2, 3)
[[1 2 0]
 [4 1 6]]


In [71]:
a[2]  # Select the element at index 2 (the third element)

4

In [72]:
b[0][1]  # Select row 0, then element 1 of that row

2

In [73]:
a[a < 2]  # Select elements less than 2

array([1])

In [74]:
b[[1, 0, 1, 0], [0, 1, 2, 0]]  # Select elements (1,0),(0,1),(1,2) and (0,0)

array([4, 2, 6, 1])

In [75]:
b[0, 1]  # Select 0th item in first dim, 1st item in second dim

2

In [76]:
a[0:2]  # Select items at indices 0 and 1 (the stop index 2 is excluded)

array([1, 2])

In [77]:
b[0:2, 1]  # Select items at rows 0 and 1 in column 1

array([2, 1])

In [78]:
b[:1]  # Select all items at row 0

array([[1, 2, 0]])

In [81]:
b[0:1, :]  # Select all items at row 0

array([[1, 2, 0]])

In [82]:
b[1, ...]  # Same as b[1, :] for a 2D array

array([4, 1, 6])

In [83]:
# Try predicting the output of this one on your own
b[[1, 0, 1, 0]][:, [0, 1, 2, 0]]

array([[4, 1, 6, 4],
       [1, 2, 0, 1],
       [4, 1, 6, 4],
       [1, 2, 0, 1]])
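The sketch below contrasts the two styles of integer-array indexing used above. The `np.ix_` call is an addition for illustration, showing one way to build the same grid in a single indexing step.

```python
import numpy as np

b = np.array([[1, 2, 0], [4, 1, 6]])
rows = [1, 0, 1, 0]
cols = [0, 1, 2, 0]

paired = b[rows, cols]           # pairs up indices: (1,0), (0,1), (1,2), (0,0)
chained = b[rows][:, cols]       # picks rows first, then columns of that result
gridded = b[np.ix_(rows, cols)]  # every row index combined with every column index

print(paired.shape, chained.shape, gridded.shape)
```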

## Array Manipulation

Using the following arrays let's see how we can manipulate them.

In [87]:
a = np.array([1, 2, 4])
b = np.array([[1, 2, 0], [4, 1, 6]])

c = np.array([[1], [2], [3]])
d = np.array([[[0], [0], [0]], [[0], [0], [0]]])

print_a(a)
print_a(b)
print_a(c)
print_a(d)

Shape: (3,)
[1 2 4]
Shape: (2, 3)
[[1 2 0]
 [4 1 6]]
Shape: (3, 1)
[[1]
 [2]
 [3]]
Shape: (2, 3, 1)
[[[0]
  [0]
  [0]]

 [[0]
  [0]
  [0]]]


### Changing Array Shape

```python
np.transpose()
my_array.ravel()
my_array.flatten()
my_array.reshape()
```

In [88]:
# Transpose flips the rows and columns
i = np.transpose(b)
i

array([[1, 4],
       [2, 1],
       [0, 6]])

In [89]:
# Transpose again gives back the original array
i.T  # alternative to writing out transpose

array([[1, 2, 0],
       [4, 1, 6]])

In [90]:
# Return a flattened array
b.ravel()
# b.flatten()  # alternative to ravel; always returns a copy

array([1, 2, 0, 4, 1, 6])
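`ravel` and `flatten` give the same values but differ in memory behavior. This sketch assumes a freshly created (contiguous) array, for which `ravel` can return a view.

```python
import numpy as np

b = np.array([[1, 2, 0], [4, 1, 6]])

raveled = b.ravel()      # a view here, because b is contiguous in memory
flattened = b.flatten()  # always a new copy

print(np.shares_memory(b, raveled))
print(np.shares_memory(b, flattened))
```

Writing to `raveled` would therefore change `b`, while writing to `flattened` would not.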

In [95]:
# Return array containing the same data with a new shape
b.reshape(3, 2)

array([[1, 2],
       [0, 4],
       [1, 6]])

In [98]:
b

array([[1, 2, 0],
       [4, 1, 6]])

In [102]:
# Change shape and size of array in place
b.resize((1, 6))
b

array([[1, 2, 0, 4, 1, 6]])

In [104]:
b.reshape(2, 3)

array([[1, 2, 0],
       [4, 1, 6]])

### Adding/Removing Elements

```python
np.append()  # these are numpy functions, not array methods
np.insert()  # each returns a new array
np.delete()  # the original array is left unchanged
```

In [106]:
# Append to array
np.append(a, [3, 3, 3])

array([1, 2, 4, 3, 3, 3])

In [109]:
# Insert values along the given axis before the given indices
np.insert(a, 1, 5)

array([1, 5, 2, 4])

In [110]:
# Return a new array with sub-arrays along an axis deleted
np.delete(a, [1])

array([1, 4])

### Combining Arrays

```python
np.concatenate()
np.vstack()
np.hstack()
np.column_stack()
np.row_stack()
```

Let's redefine our arrays again.

In [123]:
a = np.array([[1, 2, 4], [1, 2, 4]])
b = np.array([[1, 2, 0], [4, 1, 6]])

print_a(a)
print_a(b)

Shape: (2, 3)
[[1 2 4]
 [1 2 4]]
Shape: (2, 3)
[[1 2 0]
 [4 1 6]]


In [124]:
# Concatenate arrays
print(a.shape)
np.concatenate((a, b), axis=0)

(2, 3)


array([[1, 2, 4],
       [1, 2, 4],
       [1, 2, 0],
       [4, 1, 6]])

Stack arrays vertically (row-wise)

In [125]:
np.vstack((a, b))

array([[1, 2, 4],
       [1, 2, 4],
       [1, 2, 0],
       [4, 1, 6]])

Stack arrays horizontally (column-wise)

In [126]:
np.hstack((a, b))

array([[1, 2, 4, 1, 2, 0],
       [1, 2, 4, 4, 1, 6]])

### Splitting Arrays

```python
np.hsplit()
np.vsplit()
```

In [128]:
# Split the array horizontally (column-wise) into 3 equal parts
np.hsplit(a, 3)

[array([[1],
        [1]]), array([[2],
        [2]]), array([[4],
        [4]])]

In [129]:
# Split the array vertically (row-wise) into 2 equal parts
np.vsplit(a, 2)

[array([[1, 2, 4]]), array([[1, 2, 4]])]
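The second argument of `np.hsplit` and `np.vsplit` can be either an integer (number of equal parts) or a list of indices to split at. A brief sketch with illustrative variable names:

```python
import numpy as np

a = np.array([[1, 2, 4], [1, 2, 4]])

equal_parts = np.hsplit(a, 3)    # integer: 3 equally sized column blocks
left, right = np.hsplit(a, [1])  # list: split before column index 1

print(len(equal_parts))
print(left.shape, right.shape)
```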

# Exercises

We all know we don't really learn anything until we have to struggle through doing it :D 

Roll up your sleeves and dive in.

### Create np array of tuples with date range

Create a numpy array of tuples with dates from 2019-2030 in the first position followed by 6 random numbers. Do this in the most efficient way.

```python
data_to_create = [('2019-01-01', 100.  , 104.06,  95.96, 100.34, 22351900, 100.34)
                  ('2020-01-01', 101.01, 109.08, 100.5 , 108.31, 11428600, 108.31)
                  ('2021-01-01', 110.75, 113.48, 109.05, 109.4 ,  9137200, 109.4 )
                  ...
                  ('2028-01-01', 313.16, 341.89, 310.3 , 332.  , 10597800, 332.  )
                  ('2029-01-01', 355.79, 381.95, 345.75, 381.02,  8905500, 381.02)
                  ('2030-01-01', 393.53, 394.5 , 357.  , 362.71,  7784800, 362.71)]
```

[Numpy Datetime Docs](https://docs.scipy.org/doc/numpy/reference/arrays.datetime.html)

In [None]:
my_data = np.arange('2005-02', '2005-03', dtype='datetime64[D]')

### Create a new array with only the 3rd, 4th, and 6th columns

### Clean up this data (replace all NaNs with 0 and break the array up into three)

### Flip this array (so rows are columns and columns are rows)

### Remove all rows with a cumulative sum less than 10

### Sort the 5th column and 3rd row of this array

### Create a square grid of values between 0 and 1

### Create a column of linearly spaced values and concatenate it to an existing array

# Answers

Many of the exercises will have several correct answers. I tried to pick the fastest and most memory efficient answers but if you think you have a better one 1) prove it using the `%timeit` magic then 2) let me know!