# import numpy as np

## Workshop: NumPy and Data Representation

NumPy provides:
  1. An N-dimensional array object whose items all share one data type (homogeneous)
  2. Fast mathematical operations over arrays
  3. Linear algebra, Fourier transforms, and random number generation
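
As a quick taste of the three points above, here is a minimal sketch. The variable names and values are made up for illustration, and `np.random.default_rng` assumes NumPy 1.17 or newer.

```python
import numpy as np

# 1. A homogeneous array: every item shares one dtype
a = np.array([1.0, 2.0, 3.0, 4.0])
print(a.dtype)  # float64

# 2. Fast element-wise math, with no explicit Python loop
squared = a ** 2
total = a.sum()

# 3. Linear algebra, Fourier transforms, random number generation
M = np.array([[2.0, 0.0], [0.0, 0.5]])
M_inv = np.linalg.inv(M)        # matrix inverse
spectrum = np.fft.fft(a)        # discrete Fourier transform of a
rng = np.random.default_rng(0)  # seeded generator (seed chosen arbitrarily)
samples = rng.random(3)         # three floats drawn from [0, 1)
```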

### Cheatsheet

https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf

More cheatsheets:
https://www.datacamp.com/community/data-science-cheatsheets?page=3

### Import the package

In [None]:
import numpy as np

In [None]:
np?

### Basic Data Structures

![scalar vector matrix tensor](assets/numpy/scalar-vector-matrix-tensor.png)

(Image source: https://hadrienj.github.io/posts/Deep-Learning-Book-Series-2.1-Scalars-Vectors-Matrices-and-Tensors/)

### Scalar

In [None]:
x = np.array([1, 2, 3, 4])
x[0] # indexing a 1-D array returns a single element (a scalar)

### Vector

In [None]:
x

In [None]:
x.shape

In [None]:
len(x)

### Matrix

In [None]:
A = np.zeros((3, 4))
A

In [None]:
A.shape

### Tensor (dimensions $\geq$ 3)

In [None]:
X = np.ones((2, 2, 6))
X

In [None]:
X.shape

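### Rank

In this notebook, "rank" means the number of axes of an array (its `ndim`), which is not the same as the linear-algebra rank of a matrix.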

In [52]:
s = np.array(4)
print('A scalar is of rank {}'.format(s.ndim))

print("")
x = np.linspace(0, 1, 10) # 10 equally-spaced values between 0 and 1
print('A vector is of rank {}'.format(x.ndim))
print(x)

print("")
A = np.random.random((3, 1)) # uniform random floats in [0, 1)
print('A matrix is of rank {}'.format(A.ndim))
print(A)

print("")
X = np.random.random((3, 1, 4))
print('Tensor X is of rank {}'.format(X.ndim))
print(X)

print("")
XX = np.empty((3, 1, 4, 2)) # uninitialized memory: contents are arbitrary, not random draws
print('Tensor XX is of rank {}'.format(XX.ndim))
print(XX)

A scalar is of rank 0

A vector is of rank 1
[ 0.          0.11111111  0.22222222  0.33333333  0.44444444  0.55555556
  0.66666667  0.77777778  0.88888889  1.        ]

A matrix is of rank 2
[[ 0.93905103]
 [ 0.63165122]
 [ 0.62902619]]

Tensor X is of rank 3
[[[ 0.21003664  0.19411033  0.95353435  0.03714082]]

 [[ 0.4268503   0.07245153  0.53138107  0.83209966]]

 [[ 0.65337557  0.9786256   0.57047978  0.4753161 ]]]

Tensor XX is of rank 4
[[[[ 0.63460413  0.52622078]
   [ 0.64221227  0.95125177]
   [ 0.94785865  0.65059513]
   [ 0.46013309  0.13267439]]]


 [[[ 0.67149455  0.37941553]
   [ 0.50059386  0.84152501]
   [ 0.97954904  0.53086358]
   [ 0.43180785  0.89261078]]]


 [[[ 0.1638495   0.05872894]
   [ 0.31565151  0.35396833]
   [ 0.99781086  0.66716404]
   [ 0.82542593  0.24375789]]]]
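
The "rank" printed in the cell above is the number of axes (`ndim`). The sketch below contrasts it with the linear-algebra rank returned by `np.linalg.matrix_rank`, using two small matrices picked only for illustration.

```python
import numpy as np

A = np.ones((3, 4))  # every row is identical

print(A.ndim)                    # number of axes: 2
print(np.linalg.matrix_rank(A))  # linearly independent rows: 1

I = np.eye(3)                    # 3x3 identity matrix
print(I.ndim)                    # still 2 axes
print(np.linalg.matrix_rank(I))  # all rows independent: 3
```

Both arrays are matrices with two axes, yet their linear-algebra ranks differ, so it helps to say "number of dimensions" when `ndim` is meant.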


### Array manipulation

### Indexing

In [55]:
A = np.random.random((3, 2))
A

array([[ 0.714337  ,  0.15985045],
       [ 0.64382694,  0.38746079],
       [ 0.80504797,  0.01887335]])

In [66]:
A[0] # first row

array([ 0.714337  ,  0.15985045])

In [64]:
A[1][0] # second row, first column

0.64382694421780418

In [61]:
A[-2][-1] # second to last row, last column

0.3874607934437333

In [67]:
A[-4] # out of bounds access

IndexError: index -4 is out of bounds for axis 0 with size 3
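
The chained form `A[1][0]` works, but NumPy also accepts one index per axis inside a single pair of brackets. A minimal sketch on a small made-up array (not the random `A` above) that also shows where negative indices stop being valid:

```python
import numpy as np

A = np.arange(6).reshape(3, 2)  # [[0, 1], [2, 3], [4, 5]]

# One index per axis in a single bracket gives the same element as chaining
print(A[1, 0], A[1][0])    # 2 2

# Negative indices count from the end of each axis
print(A[-2, -1], A[1, 1])  # 3 3

# For an axis of size n, valid indices run from -n to n - 1
n = A.shape[0]
first_row = A[-n]          # same row as A[0]
try:
    A[-n - 1]              # one step past the start
except IndexError as err:
    message = str(err)
    print(message)
```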

Exercise: Try indexing a tensor, such as `X` (shape `(3, 1, 4)`) from the Rank section.

### Slicing
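
This section has no cells yet, so the following is only a possible starting point. It sketches basic `start:stop:step` slicing on a small made-up matrix.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

first_two_rows = A[:2]   # rows 0 and 1
last_column = A[:, -1]   # every row, last column: [3, 7, 11]
block = A[1:, 1:3]       # rows 1-2, columns 1-2: [[5, 6], [9, 10]]
every_other = A[:, ::2]  # columns 0 and 2

# Basic slices are views, so writing through one changes A as well
block[0, 0] = -1
print(A[1, 1])           # -1
```

Call `.copy()` on a slice when an independent array is needed.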

### Transposing
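
This section is also empty, so here is a hedged sketch of what it might cover: `.T` for matrices and `np.transpose` with an explicit axis order for tensors. The arrays are made up for illustration.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)  # shape (2, 3)
At = A.T                        # shape (3, 2), with At[i, j] == A[j, i]
print(At.shape)                 # (3, 2)

x = np.arange(4)
print(x.T.shape)                # (4,): transposing a 1-D array changes nothing

X = np.zeros((2, 3, 4))
print(X.T.shape)                # (4, 3, 2): .T reverses all axes
swapped = np.transpose(X, (1, 0, 2))  # swap only the first two axes
print(swapped.shape)            # (3, 2, 4)
```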

# import pandas as pd

## Workshop: Pandas and Data Transformation

# import matplotlib.pyplot as plt

## Workshop: Matplotlib and Data Visualization

# Putting everything together

## Workshop: Data Workflow

## Assessment 1: Data Workflow