In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Numpy
https://numpy.org/doc/stable/
- Implemented in C
- Runs on the **CPU** (vectorized operations)
- Doesn't target generic Python objects or provide DL/ML functions

## + DL + GPU
- PyTorch (Facebook) -> Tensor modeled on the NumPy ndarray, with GPU support
- TensorFlow (Google) -> tensors that interoperate with NumPy, with GPU support
- scikit-learn -> built on NumPy, but generally doesn't use the GPU

## NDArray
```Block memory + Indexing Scheme + Data Type Descriptor```
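- Row-major (C order) by default
- Homogeneous type
  - All elements in an ndarray have the same type
  - This makes element access and arithmetic faster
- Stored as a contiguous **array**, not a **linked list**
  - A Python list is a dynamic array of pointers to separately allocated objects, so its values are not stored contiguously
- If an object has an ```__array__``` method, it can be converted to an NDArray
  - ```np.array``` is the function used to convert an object to an NDArray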
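
The ```__array__``` hook can be sketched as follows. The ```Temperatures``` class is a hypothetical container invented for this illustration, and the ```copy``` parameter is included only so the signature is accepted by both older and newer NumPy releases.

```python
import numpy as np


class Temperatures:
    """Hypothetical container, used only for illustration."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this hook when the object is passed to np.array / np.asarray.
        return np.array(self._values, dtype=dtype)


temps = Temperatures([20.5, 21.0, 19.5])
arr = np.asarray(temps)            # conversion goes through __array__
print(type(arr).__name__, arr.dtype, arr.shape)
print(arr.flags["C_CONTIGUOUS"])   # a freshly created array is row-major (C order)
```

Any function that accepts array-like input, such as ```np.mean(temps)```, goes through the same conversion.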
  
### dType
- It matters for both time and space performance
- Both C-style types (e.g. ```np.int32```) and Python types (e.g. ```int```) are accepted
  - Fixed-size C types are faster than ```object``` arrays of Python values
  - **Smaller types are usually better than larger ones** (less memory to move)
- Since an NDArray is homogeneous, mixed inputs are promoted to the most general type
- Specify the dtype with ```dtype=np.Type```
- Check the dtype with ```object.dtype```
- Convert the dtype with ```object.astype(np.Type)```
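
A minimal sketch of the three dtype operations listed above, plus type promotion on mixed input. The variable names are arbitrary.

```python
import numpy as np

# Specify the dtype at creation time.
small = np.array([1, 2, 3], dtype=np.int16)

# Check it: int16 uses 2 bytes per element.
print(small.dtype, small.itemsize)

# Convert it: astype returns a new array and leaves the original untouched.
wide = small.astype(np.float64)
print(wide.dtype, wide.nbytes)     # 3 elements * 8 bytes = 24 bytes

# Mixed inputs are promoted to the most general type (here float64).
mixed = np.array([1, 2.5, True])
print(mixed.dtype)
```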


In [106]:
# Check supported type
print(np.sctypeDict.keys())

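# Performance: the same operation on a wide and a narrow dtype
# Note: 1e9 elements need about 8 GB (int64) and 2 GB (int16) of RAM
x = np.ones((int(1e9),), dtype=np.int64)
y = np.zeros((int(1e9),), dtype=np.int16)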

%timeit x*x
%timeit y*y

dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'b1', 'int64', 'i8', 'uint64', 'u8', 'float16', 'f2', 'float32', 'f4', 'float64', 'f8', 'complex64', 'c8', 'complex128', 'c16', 'object0', 'bytes0', 'str0', 'void0', 'datetime64', 'M8', 'timedelta64', 'm8', 'int32', 'i4', 'uint32', 'u4', 'int16', 'i2', 'uint16', 'u2', 'int8', 'i1', 'uint8', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'bytes_', 'string_', 'str_', 'unicode_', 'object_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a'])
4.14 s ± 126 

## Vectorization
- Arrays let you express **batch operations** on data without explicit loops, because the elements are stored **sequentially**
- Vectorization means applying an operation elementwise over sequential data
- NumPy's ndarray is a true array, so its operations can be parallelized. However, NumPy is optimized for large data, so on small data a plain Python list can be faster
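
As a small illustration, the same arithmetic can be written as an explicit loop or as one vectorized expression. The array size here is deliberately tiny and is not a benchmark.

```python
import numpy as np

values = np.arange(5, dtype=np.float64)

# Explicit Python loop: one interpreter step per element.
looped = np.empty_like(values)
for i in range(values.size):
    looped[i] = values[i] * 2.0 + 1.0

# Vectorized form: the same arithmetic expressed on the whole array at once.
vectorized = values * 2.0 + 1.0

print(vectorized)
print(np.array_equal(looped, vectorized))
```

Both forms give the same values; the difference is where the loop runs (in the Python interpreter versus in compiled code).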



## Shape and Stride

### Shape
- ```(z, y, x)```
- Each ```vector``` has x elements
- Each ```matrix``` has y vectors
- Each ```3D array``` has z matrices
- **NDIM** equals ```len(shape)```
- An ndarray with **NDIM** dimensions is called an NDIM-dimensional tensor
  - shape = (n,) is a 1D tensor, same as a vector
  - shape = (m,n,) is a 2D tensor, same as a matrix
  - shape = (p,m,n,) is a 3D tensor
- Examples (dataset structure)
  - ```Vector Data```
    - Simple dataset
    - 2D tensor (samples, features)
  - ```Timeseries or Sequential Data```
    - Natural language or audio
    - 3D tensor (samples, timesteps, features)
  - ```Image Data```
    - 4D tensor (samples, height, width, channels) or (samples, channels, height, width)
  - ```Video Data```
    - 5D tensor (samples, frames, image dims), e.g. (samples, frames, height, width, channels)
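
The dataset layouts above can be sketched with placeholder arrays. All sizes below are arbitrary illustrative values, and the video layout assumes channels-last images.

```python
import numpy as np

# All sizes are arbitrary illustrative values.
tabular = np.zeros((100, 8))             # (samples, features)
sequences = np.zeros((100, 50, 8))       # (samples, timesteps, features)
images = np.zeros((10, 32, 32, 3))       # (samples, height, width, channels)
videos = np.zeros((2, 16, 32, 32, 3))    # (samples, frames, height, width, channels)

for name, arr in [("tabular", tabular), ("sequences", sequences),
                  ("images", images), ("videos", videos)]:
    # ndim always equals len(shape)
    print(name, arr.ndim, arr.shape)

# Moving the channel axis converts between the two common image layouts.
channels_first = np.moveaxis(images, -1, 1)
print(channels_first.shape)              # (10, 3, 32, 32)
```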
    
### Stride
- ```(matrix's stride, vector's stride, element's stride)```
- Strides are measured in **bytes** (octets)
- ```transpose``` returns a view with the shape and strides permuted and the contiguity flags updated; no data is copied, so it is just **O(1)**

In [45]:
tensor = np.arange(100, dtype=np.int32).reshape((5, 5, 4, 1))
assert tensor.ndim == 4 == len(tensor.shape)
assert tensor.strides == (4*20, 4*4, 4*1, 4*1)
assert tensor.transpose().strides == (4*1, 4*1, 4*4, 4*20)
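
To make the role of strides more concrete, the sketch below computes the byte offset of one element by hand and checks that a transpose shares its buffer with the original. The array and index are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(24, dtype=np.int32).reshape((2, 3, 4))
print(a.strides)   # (48, 16, 4): bytes to step along each axis

# The byte offset of a[1, 2, 3] is the dot product of the index and the strides.
index = (1, 2, 3)
offset = sum(i * s for i, s in zip(index, a.strides))
flat_position = offset // a.itemsize
print(offset, flat_position, a[index])

# Transposing only permutes shape and strides; the underlying buffer is shared.
t = a.T
print(t.shape, t.strides)
print(np.shares_memory(a, t))
```

Because ```a``` was filled with ```np.arange```, the value stored at a position equals its flat position, which makes the offset arithmetic easy to verify.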

## Broadcasting
- Operations on arrays with **different shapes**
- Basic arithmetic operations are not defined for arrays of different shapes
- By **PULLING** (stretching) the smaller tensor to the larger shape, the operation can be done
- **PULLING** doesn't copy memory; the existing data is reused whenever the pulling conditions are met

### Pulling Condition
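- Each paired dimension must be the same size, or one of the two must be ```1```
- Dimensions are paired from the last axis backward; the pairing order can't be transposed
- A pulled dimension **must** have size ```1```
- Missing leading dimensions are treated as size ```1```, following the tensor definition
  - 1D -> 2D -> 3D -> ...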
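
These conditions can be checked without doing any arithmetic. The sketch below assumes NumPy 1.20 or later, where ```np.broadcast_shapes``` is available, and uses ```np.broadcast_to``` to show that a pulled axis has stride 0 rather than copied data.

```python
import numpy as np

# Shapes are compared from the trailing axis; missing leading axes count as 1.
print(np.broadcast_shapes((3, 4, 5), (5,)))
print(np.broadcast_shapes((3, 4, 5), (4, 5)))
print(np.broadcast_shapes((2, 3), (2, 1)))

# (3, 4) is paired with the trailing axes (4, 5), so this pair is rejected.
try:
    np.broadcast_shapes((3, 4, 5), (3, 4))
    compatible = True
except ValueError as exc:
    compatible = False
    print("incompatible:", exc)

# A pulled axis gets stride 0, so every row reads the same memory.
row = np.arange(3, dtype=np.float64)
stretched = np.broadcast_to(row, (2, 3))
print(stretched.strides)
print(np.shares_memory(row, stretched))
```

Reshaping the last operand to ```(3, 4, 1)``` would make it compatible, because the trailing axis would then have size 1.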

In [68]:
x = np.ones((10,))
y = np.ones((1,))
x + y
z = np.ones((2,))
x + z # Error

ValueError: operands could not be broadcast together with shapes (10,) (2,) 

In [73]:
x = np.ones((2, 3)) # shape = (2, 3)
y = np.ones((3,)) # shape = (3,), pulled as (1, 3)
x + y

z = np.ones((2, 1)) # shape = (2,1)
x + z

w = np.ones((3, 1)) # shape = (3, 1)
x + w # Error

ValueError: operands could not be broadcast together with shapes (2,3) (3,1) 

In [84]:
x = np.ones((3, 4, 5)) # shape = (3, 4, 5)
y = np.ones((5, )) # shape = (5,), pulled as (1, 1, 5)
x + y

z = np.ones((4, 5)) # shape = (4, 5), pulled as (1, 4, 5)
x + z

w = np.ones((3, 4)) # shape = (3, 4)
x + w # Error: (3, 4) is paired with the trailing axes (4, 5), which don't match

ValueError: operands could not be broadcast together with shapes (3,4,5) (3,4) 

## ufunc (Universal Functions)
- NumPy ships precompiled, vectorized operations
- These operations apply elementwise by default
- Such an operation is called a **UFUNC**

### Benefits
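- Simple and optimized
- *How* you compute something affects speed as much as *what* you compute
  - Choosing an appropriate dtype can improve performance
- Vectorization is better than loops
- **DON'T** mix in built-in Python functions (e.g. ```sum```, ```max```) on ndarrays; use the NumPy equivalents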
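
A short sketch of these points: ufuncs are objects of type ```np.ufunc```, they offer a compiled reduction that replaces the built-in ```sum```, and they accept an ```out=``` argument to control the result dtype and reuse memory. The input values are arbitrary.

```python
import numpy as np

x = np.arange(-5, 5)

# np.abs and np.add are ufunc objects, not plain Python functions.
print(isinstance(np.abs, np.ufunc), isinstance(np.add, np.ufunc))

# Built-in sum iterates in Python; np.add.reduce stays in compiled code.
# Both give the same total here.
builtin_total = sum(x)
ufunc_total = np.add.reduce(x)
print(builtin_total, ufunc_total)

# out= writes the result into an existing array with a chosen dtype.
out = np.empty(x.shape, dtype=np.float32)
np.multiply(x, 0.5, out=out)
print(out.dtype, out[:3])
```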

In [91]:
# ufunc for one-tensor
x = np.arange(-20, 0)
np.abs(x)

# ufunc for two tensors
y = np.arange(1, 2) # array([1]), shape = (1,)
np.add(x, y) # broadcasting and casting are supported
np.floor_divide(x, y)

array([-20, -19, -18, -17, -16, -15, -14, -13, -12, -11, -10,  -9,  -8,
        -7,  -6,  -5,  -4,  -3,  -2,  -1])