## Outline

- read and write dataset
- data types
- mathematics
- broadcasting
- example

## Reading Materials

- [HDF5](https://en.wikipedia.org/wiki/Hierarchical_Data_Format) (HDF5 Python I/O module: [h5py](http://docs.h5py.org/en/stable/))
- [NumPy basics](https://docs.scipy.org/doc/numpy/user/basics.html)
    - [Data types](https://docs.scipy.org/doc/numpy/user/basics.types.html)
    - [Array creation](https://docs.scipy.org/doc/numpy/user/basics.creation.html)
    - [Indexing](https://docs.scipy.org/doc/numpy/user/basics.indexing.html)
    - [Broadcasting](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
    

- [vector](https://en.wikipedia.org/wiki/Vector_(mathematics_and_physics))
- [matrix](https://en.wikipedia.org/wiki/Matrix_(mathematics))
- [tensor](https://en.wikipedia.org/wiki/Tensor#As_multidimensional_arrays)
- [dot product](https://en.wikipedia.org/wiki/Dot_product) (`np.dot()`)
- [matrix product](https://en.wikipedia.org/wiki/Matrix_multiplication#Matrix_product_.28two_matrices.29) (`np.matmul()`)
- [entrywise product](https://en.wikipedia.org/wiki/Hadamard_product_(matrices)) (`*`)
- [cross product](https://en.wikipedia.org/wiki/Cross_product) (`np.cross()`)

# NumPy Demo

- read and write dataset
- data types
- mathematics
- broadcasting
- review example

## Math Operations in NumPy

- vector, matrix, tensor
- dot product
- matrix product
- entrywise product
- cross product

In [None]:
import numpy as np

### ndarray

In [None]:
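# vector

v = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

print(v)
print("v ndim: ", v.ndim)    # number of dimensions
print("v shape:", v.shape)   # length along each dimension
print("v size: ", v.size)    # total number of elements
print("v dtype:", v.dtype)   # data type of the elements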

In [None]:
# matrix

m = np.array([
    [1,2,3], 
    [4,5,6]
])

print(m)
print("m ndim: ", m.ndim)
print("m shape:", m.shape)
print("m size: ", m.size)
print("m dtype:", m.dtype)

In [None]:
# tensor

t = np.array([
    [[1,2,3], 
     [3,4,5]], 
    [[5,6,7], 
     [7,8,9]]])

print(t)
print("t ndim: ", t.ndim)
print("t shape:", t.shape)
print("t size: ", t.size)
print("t dtype:", t.dtype)
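
The outline lists data types, while the cells above only print `dtype`. The following is a minimal sketch, assuming NumPy's default casting and promotion rules, of how a dtype can be set explicitly and converted. The variable names are illustrative only.

```python
import numpy as np

# dtype is inferred from the input unless it is given explicitly
a = np.array([1, 2, 3])                     # platform default integer
b = np.array([1, 2, 3], dtype=np.float32)   # explicit 32-bit float
c = np.array([1.5, 2.5, 3.5])               # float64 by default

# astype returns a converted copy; float -> int drops the fractional part
c_int = c.astype(np.int16)

# mixing dtypes in arithmetic promotes to a type that can hold both
d = b + c                                   # float32 + float64 -> float64

print(a.dtype, b.dtype, c.dtype)
print(c_int, c_int.dtype)
print(d.dtype)
print(b.itemsize, c.itemsize)               # bytes per element
```

Choosing a smaller dtype reduces memory use and file size, at the cost of range or precision.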

In [None]:
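# Arrays from functions
# (each second assignment overwrites the first; both forms are shown for comparison)
z = np.zeros((2,3,4,7,9))    # 5-D array of zeros
z = np.zeros_like(m)         # zeros with the shape and dtype of m

o = np.ones((2,3))
o = np.ones_like(m)

r = np.random.random((2,3))  # uniform samples in [0, 1); shape passed as a tuple
r = np.random.randn(2,3)     # standard normal samples; shape passed as separate arguments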
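
A few other creation functions are commonly used alongside the ones above. This is a hedged sketch rather than an exhaustive list; the seeded `default_rng` generator is an addition not used in the original notebook, chosen here so the random values are reproducible.

```python
import numpy as np

steps = np.arange(0, 10, 2)        # start, stop (exclusive), step
grid = np.linspace(0.0, 1.0, 5)    # 5 evenly spaced points, both ends included
ident = np.eye(3)                  # 3x3 identity matrix
sevens = np.full((2, 3), 7)        # constant-filled array

rng = np.random.default_rng(0)     # seeded generator (assumption: NumPy >= 1.17)
u = rng.random((2, 3))             # uniform samples in [0, 1)
n = rng.standard_normal((2, 3))    # standard normal samples

print(steps)
print(grid)
print(ident.shape, sevens.shape, u.shape, n.shape)
```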


### Indexing

In [None]:
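# indexing and slicing
print(v)
v[9]       # single element at index 9 (the last one)
v[1:8]     # indices 1 through 7; the stop index is exclusive
v[1:9:2]   # every second element from index 1 up to (not including) 9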

In [None]:
# integer array indexing
v[[1,3,5]]

In [None]:
# Boolean array indexing
v < 4
v[(v > 9) | (v <= 8)]   # parenthesize each comparison: | binds tighter than > and <=
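
In Python, `|` and `&` bind more tightly than comparison operators, so each comparison in a combined mask needs its own parentheses. The sketch below rebuilds the same vector `v` so that it runs on its own, and shows the element-wise operators `|`, `&` and `~` on boolean masks.

```python
import numpy as np

v = np.arange(1, 11)                 # 1..10, same values as the vector above

either = v[(v > 9) | (v <= 8)]       # element-wise OR: everything except 9
between = v[(v > 2) & (v < 6)]       # element-wise AND
odd = v[~(v % 2 == 0)]               # element-wise NOT

# a boolean mask can also select the elements to assign to
w = v.copy()
w[w > 5] = 0

print(either)
print(between)
print(odd)
print(w)
```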

### Basic Math

In [None]:
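# commonly used functions (listed here by name only, not called)
np.sin, np.cos, np.tan      # trigonometric, applied element-wise
np.exp, np.log, np.log10    # exponential and logarithms, applied element-wise
np.fft.fft, np.fft.ifft     # discrete Fourier transform and its inverse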

In [None]:
# Transposing
m.transpose()
m.T

In [None]:
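# entrywise (Hadamard) product
m * m

# dot and matrix product

np.dot(v, v)          # inner product of two 1-D arrays: a scalar
np.matmul(m, m.T)     # (2,3) x (3,2) -> (2,2); same as m @ m.T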
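
The reading list also mentions the cross product, which the cells above do not demonstrate. The sketch below puts the four products side by side on small, hand-checkable inputs; the arrays `a`, `b` and `A` are illustrative and not part of the original notebook.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
A = np.array([[1, 2, 3],
              [4, 5, 6]])

entrywise = a * b             # element by element
inner = np.dot(a, b)          # 1*4 + 2*5 + 3*6
matprod = np.matmul(A, A.T)   # (2,3) x (3,2) -> (2,2)
cross = np.cross(a, b)        # vector perpendicular to both a and b

print(entrywise)
print(inner)
print(matprod)
print(cross)
```

The cross product is perpendicular to both inputs, which can be checked with `np.dot(cross, a)` and `np.dot(cross, b)`: both are zero.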

In [None]:
m.shape

In [None]:
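# operation across a dimension
print(m)
np.sum(m, axis=1)    # sum along axis 1: one value per row

s = np.random.randn(10, 200)
np.mean(s, axis=1)   # mean of each of the 10 rows: result has shape (10,)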
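
The `axis` argument names the dimension that is collapsed. A minimal sketch with the same 2x3 matrix follows; `keepdims=True` is an addition here, shown because it keeps the reduced axis with length 1 so the result can be subtracted from the original array directly.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = np.sum(m, axis=0)    # collapse rows: one value per column
row_sums = np.sum(m, axis=1)    # collapse columns: one value per row
total = np.sum(m)               # no axis: sum over every element

# keepdims keeps the reduced axis as length 1, giving shape (2, 1)
row_means = np.mean(m, axis=1, keepdims=True)
centered = m - row_means        # each row now has zero mean

print(col_sums, row_sums, total)
print(row_means.shape)
print(centered)
```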

### Advanced Operations


In [None]:
# broadcasting

v1 = np.random.rand(5,3,4)
v2 = np.random.rand(5,1,4)
v3 = np.random.rand(1,3,4)

v1 * v2 * v3   # axes of length 1 are stretched to match; the result has shape (5, 3, 4)
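
Broadcasting compares shapes from the last axis backwards: two axes are compatible when they are equal or when one of them is 1. The sketch below uses constant arrays instead of random ones so the result is easy to check, and also shows a shape pair that fails. The names `bias` and `bad` are illustrative.

```python
import numpy as np

v1 = np.ones((5, 3, 4))
v2 = np.full((5, 1, 4), 2.0)    # length-1 middle axis is stretched to 3
v3 = np.full((1, 3, 4), 3.0)    # length-1 first axis is stretched to 5

out = v1 * v2 * v3
print(out.shape)
print(np.broadcast(v1, v2, v3).shape)   # shape computed without doing the arithmetic

# a lower-dimensional array is aligned against the trailing axes
bias = np.arange(4)             # shape (4,) behaves like (1, 1, 4)
shifted = v1 + bias
print(shifted.shape)

# axes of length 3 and 2 are neither equal nor 1, so this raises
bad = np.ones((5, 2, 4))
try:
    v1 * bad
except ValueError as err:
    print("incompatible shapes:", err)
```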

In [None]:
# flattening and reshaping
m.reshape((1,6))   # 2x3 -> 1x6; the result is still 2-D (m.reshape(-1) gives a 1-D array)
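
Reshaping to `(1, 6)` keeps two dimensions; true flattening produces a 1-D array. A short sketch of the usual options follows. For a contiguous array such as this one, `ravel` returns a view while `flatten` always returns a copy.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

row = m.reshape((1, 6))     # still 2-D, shape (1, 6)
tall = m.reshape((3, 2))    # elements are refilled in row-major order
flat = m.reshape(-1)        # -1 lets NumPy infer the length: shape (6,)

copy = m.flatten()          # independent copy
view = m.ravel()            # view onto m's data for a contiguous array
view[0] = 100               # writes through to m, but not to copy

print(row.shape, tall.shape, flat.shape)
print(tall)
print(m[0, 0], copy[0])
```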

In [None]:
# concatenation
np.concatenate((m,m),0)

np.hstack((m,m)) # np.concatenate((m,m),1)
np.vstack((m,m)) # np.concatenate((m,m),0)
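
The resulting shapes make the difference between the joins explicit. In this sketch, `np.stack` is an addition not used in the notebook: unlike `concatenate`, it joins along a new axis.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

rows = np.concatenate((m, m), axis=0)   # same as np.vstack: shape (4, 3)
cols = np.concatenate((m, m), axis=1)   # same as np.hstack: shape (2, 6)
stacked = np.stack((m, m), axis=0)      # new leading axis: shape (2, 2, 3)

print(rows.shape, cols.shape, stacked.shape)
```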

---

## HDF5 and MAT

In [None]:
from scipy.io import loadmat, savemat

# 'f16' is a 16-byte float (np.float128), which NumPy does not provide on every platform
M1 = np.array(np.random.rand(2, 100), dtype='f16')

sample = {'M1': M1}

savemat('test.mat', sample)

a = loadmat('test.mat')
a['M1'].dtype
# Tip: a 1-D vector saved with savemat comes back from loadmat as a 2-D matrix.

In [None]:
a = loadmat('test.mat')
a
a['M1'].shape
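
The 1-D to 2-D behaviour can be seen with a small round trip. This is a sketch under stated assumptions: the file name `vec_demo.mat` and the temporary directory are illustrative, `oned_as='row'` is passed explicitly so the saved orientation does not depend on the SciPy default, and `squeeze_me=True` is the `loadmat` option that drops length-1 axes on load.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

vec = np.arange(5, dtype='f8')            # 1-D, shape (5,)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'vec_demo.mat')   # hypothetical file name
    savemat(path, {'vec': vec}, oned_as='row')

    contents = loadmat(path)
    as_2d = contents['vec']                    # MATLAB arrays are at least 2-D
    squeezed = loadmat(path, squeeze_me=True)['vec']

print(sorted(contents.keys()))   # includes metadata keys such as '__header__'
print(as_2d.shape)
print(squeezed.shape)
```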

---

In [None]:
import h5py

# create dataset

sample_f = h5py.File('sample.h5', 'w')

sample_f.create_group('matrix')

sample_f.create_dataset('matrix/3darray', data=np.zeros((10, 100, 20), dtype='f8'))

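# gzip compression at level 4 (a bare integer such as compression=4 is shorthand for the same thing)
sample_f.create_dataset('matrix/1darray', data=np.ones(44100*60, dtype='int16'), compression='gzip', compression_opts=4)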

sample_f['matrix/2darray'] = np.random.rand(100, 100)

sample_f.close()


In [None]:
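# compression
# Three candidate payloads; each assignment overwrites the previous one,
# so keep only the line you want to test.
# Constant data compresses best, random data worst.
sample_f_data = np.ones(44100*60)
sample_f_data = np.sin(np.linspace(0, 60, 44100*60))
sample_f_data = np.random.random((44100*60,))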

sample_f = h5py.File('big_data.h5', 'w')
sample_f['matrix'] = sample_f_data
sample_f.close()

sample_f = h5py.File('big_data_compression.h5', 'w')
sample_f.create_dataset(name='matrix', data=sample_f_data, compression='gzip', compression_opts=4)
sample_f.close()
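
To see how much the payload matters, the file sizes can be compared directly. The sketch below is one possible way to run that comparison, not the notebook's own procedure. It uses a smaller array than the cell above and writes to a temporary directory; the file names and the `candidates` dictionary are illustrative.

```python
import os
import tempfile

import h5py
import numpy as np

n = 200_000
candidates = {
    'constant': np.ones(n),
    'sine': np.sin(np.linspace(0, 60, n)),
    'random': np.random.default_rng(0).random(n),
}

sizes = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, data in candidates.items():
        raw_path = os.path.join(tmp, name + '_raw.h5')
        gz_path = os.path.join(tmp, name + '_gzip.h5')

        with h5py.File(raw_path, 'w') as f:
            f.create_dataset('matrix', data=data)
        with h5py.File(gz_path, 'w') as f:
            f.create_dataset('matrix', data=data,
                             compression='gzip', compression_opts=4)

        # the files are closed here, so the sizes on disk are final
        sizes[name] = (os.path.getsize(raw_path), os.path.getsize(gz_path))

for name, (raw, gz) in sizes.items():
    print(f"{name:8s} raw={raw:>9d} B  gzip={gz:>9d} B  ratio={gz / raw:.2f}")
```

Compression is lossless, so reading the dataset back returns the original values regardless of the ratio achieved.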

In [None]:
# read in

_f = h5py.File('sample.h5', 'r')

# list(_f['matrix'].keys())

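a = _f['matrix']['1darray']          # Dataset handle: no data is read yet, and it is unusable once the file is closed
b = np.array(_f['matrix/3darray'])   # copies the data into memory, so b survives the close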

_f.close()
b.dtype

In [None]:
with h5py.File('sample.h5', 'r') as _f:
    a = np.array(_f['matrix/2darray'])
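
A dataset does not have to be loaded in full. Slicing an h5py dataset reads only the requested block and returns a NumPy array, which stays valid after the file is closed. The sketch below writes its own small file to a temporary directory so that it runs on its own; the file name `slice_demo.h5` is illustrative.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'slice_demo.h5')   # hypothetical file name

    with h5py.File(path, 'w') as f:
        data = np.arange(100 * 100, dtype='f8').reshape(100, 100)
        f.create_dataset('matrix/2darray', data=data)

    with h5py.File(path, 'r') as f:
        dset = f['matrix/2darray']
        shape, dtype = dset.shape, dset.dtype   # metadata only, no data read
        block = dset[:2, :3]                    # reads just this 2x3 block
        full = dset[...]                        # reads the whole dataset

# block and full are plain NumPy arrays, usable after the file is closed
print(shape, dtype)
print(block)
print(full.shape)
```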