### Numpy Matrices

NumPy (Numerical Python) is a library with support for large, multi-dimensional `arrays and matrices`.

* Vectors
* Matrices
* Access Elements
* Sparse Matrices
* Vectorization
* Broadcasting
* Count Vectorizer

### Vectors

A vector is a `one-dimensional` array. A column vector is written as a two-dimensional array with a single column.

In [None]:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1], 
              [4], 
              [3]])

print("Row vector:\n", a)
print("Column vector:\n", b)
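
The two arrays above differ in shape: `a` has one dimension, while `b` is stored with two dimensions and a single column. A minimal sketch of how to check this and convert between the two forms (the names `row`, `col` and `flat` are illustrative):

```python
import numpy as np

row = np.array([1, 2, 3])    # shape (3,), one dimension
col = row.reshape(-1, 1)     # shape (3, 1), two dimensions

print(row.shape, row.ndim)
print(col.shape, col.ndim)

# ravel() flattens the column back to a one-dimensional array
flat = col.ravel()
print(flat.shape)
```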

### Matrices

NumPy's `main data structure` is the multidimensional array; a matrix is a two-dimensional array.

In [None]:
# matrix with three rows, four columns

A = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

print("Matrix:\n", A)
print("Shape:", A.shape)
print("Size:", A.size)
print("Number of array dimensions:", A.ndim)
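
The same matrix can be generated instead of typed out. This sketch assumes `np.arange` followed by `reshape` is an acceptable substitute for the literal above (the name `M` is illustrative), and it shows that `size` is the product of the entries in `shape`:

```python
import numpy as np

# integers 1..12 laid out row by row in 3 rows and 4 columns
M = np.arange(1, 13).reshape(3, 4)

rows, cols = M.shape
print(M)
print(M.size == rows * cols)
```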

### Access Elements

Elements of an array can be accessed with `[start:stop]` `slice notation`: `[:n]` selects the elements up to (but not including) index `n`, and `[n:]` selects the elements from index `n` onward.

In [None]:
a = np.array([1, 2, 3, 4, 5, 6])

print("a[:] =", a[:])           # entire range of elements
print("a[:3] =", a[:3])         # 0 to 3 (not included)
print("a[0:3] =", a[0:3])       # 0 to 3 (not included)
print("a[3:] =", a[3:])         # 3 (included) to last
print("a[-1] =", a[-1])         # last
print("a[3:-1] =", a[3:-1])     # 3 to last (not included)

Arrays are `zero-indexed`: the first element has index 0.

In [None]:
A = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

print("A[1,1] =", A[1,1], "\n")       # second row, second column
print("A[:2, :] ="); print(A[:2, :], "\n")   # rows 0 and 1 (row 2 not included), all columns
print("A[:, 1:2] ="); print(A[:, 1:2]) # all rows, second column (kept as a column)
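
Two details of indexing are easy to miss: an integer index drops an axis while a slice keeps it, and basic slices are views onto the original data rather than copies. A short sketch, using illustrative names (`col_1d`, `col_2d`, `top`, `safe`):

```python
import numpy as np

A = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
])

col_1d = A[:, 1]      # integer index drops the axis: shape (3,)
col_2d = A[:, 1:2]    # slice keeps the axis: shape (3, 1)
print(col_1d.shape, col_2d.shape)

# basic slices are views, so writing through one changes A
top = A[:2, :]
top[0, 0] = 100
print(A[0, 0])

# copy() gives an independent array, so A is left unchanged
safe = A[:2, :].copy()
safe[0, 0] = -1
print(A[0, 0])
```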

### Sparse Matrices

A sparse matrix stores `only non-zero` elements, which saves memory and computation when most entries are zero.

In [None]:
from scipy import sparse

A = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0], 
    [3, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
])
B = sparse.csr_matrix(A)
print(B)
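
To see what is actually stored, the CSR (compressed sparse row) matrix exposes three arrays: `data`, `indices` and `indptr`. The sketch below inspects them for the same matrix; the names `dense` and `S` are illustrative.

```python
import numpy as np
from scipy import sparse

dense = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [3, 0, 0, 0, 0, 0, 0, 0, 0, 0],
])
S = sparse.csr_matrix(dense)

print("stored values:", S.nnz, "of", dense.size)
print("data:   ", S.data)      # the non-zero values, row by row
print("indices:", S.indices)   # column index of each stored value
print("indptr: ", S.indptr)    # offsets into data where each row starts

# toarray() rebuilds the dense matrix
print(np.array_equal(S.toarray(), dense))
```

Only 2 of the 30 entries are stored, along with the bookkeeping needed to locate them.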

### Vectorization

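Vectorization is used to `speed up` Python code by avoiding explicit loops.  
Instead of operating on a single value at a time, it operates on a `set of values` (a vector) at a time. Note that `np.vectorize` is a convenience wrapper that still calls the function once per element, so it does not give the speed of native array operations.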

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
B = np.vectorize(lambda x: x + 100)(A)
print(B)
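
In NumPy, arithmetic operators applied to an array already act on every element, so the same result is available without `np.vectorize`. The sketch below compares an explicit loop, the plain array expression, and `np.vectorize`; the variable names are illustrative. The NumPy documentation describes `np.vectorize` as provided mainly for convenience, not performance.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

# explicit Python loop, one element at a time
looped = np.empty_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        looped[i, j] = A[i, j] + 100

# array expression: the loop runs inside NumPy's compiled code
vectorized = A + 100

# np.vectorize gives the same values but still calls the lambda per element
wrapped = np.vectorize(lambda x: x + 100)(A)

print(np.array_equal(looped, vectorized))
print(np.array_equal(wrapped, vectorized))
```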

### Broadcasting

Broadcasting allows NumPy to handle arrays of `different shapes` during arithmetic operations. In the cell below, the scalar 100 is treated as if it had the same shape as `A`.

In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
B = A + 100
print(B)
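
Broadcasting is not limited to scalars. As a sketch, a one-dimensional array of length 3 is applied to every row of a 3x3 matrix, and a 3x1 column is applied to every column; the names `row` and `col` are illustrative.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
row = np.array([10, 20, 30])             # shape (3,)
col = np.array([[100], [200], [300]])    # shape (3, 1)

print(A + row)             # row is added to every row of A
print(A + col)             # col is added to every column of A
print((row + col).shape)   # shapes (3,) and (3, 1) broadcast to (3, 3)
```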

### Count Vectorizer

We can represent `texts as vectors` of word counts and compute the similarity between them.

In [None]:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

A = ['London Paris London', 'Paris Paris London']

cv = CountVectorizer()
A = cv.fit_transform(A)
print(A)
print(A.toarray())

similarity_scores = cosine_similarity(A)
print(similarity_scores)
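
To make the numbers less opaque, the same counts and the cosine similarity between the two texts can be reproduced by hand with plain NumPy. This is a simplified sketch: it assumes lowercasing and whitespace splitting are close enough to `CountVectorizer`'s tokenization for these two texts, and the names `texts`, `tokens`, `vocab` and `counts` are illustrative.

```python
from collections import Counter
import numpy as np

texts = ['London Paris London', 'Paris Paris London']

# lowercase and split on whitespace; a simplification of real tokenization
tokens = [t.lower().split() for t in texts]
vocab = sorted({w for doc in tokens for w in doc})

# one row per text, one column per vocabulary word
counts = np.array([[Counter(doc)[w] for w in vocab] for doc in tokens])
print(vocab)
print(counts)

# cosine similarity = dot product divided by the product of the norms
u, v = counts
cosine = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
print(round(float(cosine), 4))
```

With count vectors (2, 1) and (1, 2), the dot product is 4 and each norm is the square root of 5, giving a similarity of 4/5 = 0.8.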

### References

* [Scikit Count Vectorizer](https://medium.com/@sumanadhikari/building-a-movie-recommendation-engine-using-scikit-learn-8dbb11c5aa4b)
* [Numpy Matrices](https://www.minte9.com/mlearning/numpy-matrices-1434)