# NumPy and SciPy

NumPy and SciPy are the twin pillars of data science in Python. Early in Python's history, it became clear that the built-in list was not well suited to heavy-duty number crunching on vectors and matrices.

NumPy was created to solve this problem by introducing an array data structure to Python.

Let's first create an array:

In [24]:
import numpy as np
a = np.array([1,2,3])
a

array([1, 2, 3])

Notice that we have to pass in a list of numbers rather than separate arguments:

np.array(1,2,3)  ## ERROR: won't work

np.array([1,2,3]) ## Correct

Let's create a sequence of numbers with arange:

In [25]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [26]:
# Multiply every element of the sequence by a scalar
np.arange(10) * np.pi

array([  0.        ,   3.14159265,   6.28318531,   9.42477796,
        12.56637061,  15.70796327,  18.84955592,  21.99114858,
        25.13274123,  28.27433388])
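
To make the contrast with plain lists concrete, here is a small sketch that scales the same ten numbers both ways. The variable names are illustrative only.

```python
import math
import numpy as np

values = list(range(10))

# With a plain list, scaling needs an explicit loop or comprehension
scaled_list = [v * math.pi for v in values]

# With an array, the scalar is applied to every element at once
scaled_array = np.arange(10) * np.pi

# Both approaches produce the same numbers
same = np.allclose(scaled_list, scaled_array)
```

The array version is shorter and avoids a Python-level loop.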

We can turn a one-dimensional array into a multi-dimensional one by assigning to its shape attribute.

In [27]:
a = np.array([1,2,3,4,5,6])
a.shape = (2,3)
a

array([[1, 2, 3],
       [4, 5, 6]])
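
An alternative to assigning to shape is the reshape method, which leaves the original array's shape alone. The sketch below uses illustrative names (flat, grid, inferred) that do not appear elsewhere in this notebook.

```python
import numpy as np

flat = np.array([1, 2, 3, 4, 5, 6])

# reshape returns a 2x3 array and does not change the shape of flat
grid = flat.reshape(2, 3)

# -1 asks NumPy to infer that dimension from the total size
inferred = flat.reshape(-1, 2)
```

Here grid has shape (2, 3), inferred has shape (3, 2), and flat is still one-dimensional.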

### Matrices

NumPy also provides a matrix type, which can be built from a string:

In [28]:
np.matrix('1 2; 3 4')

matrix([[1, 2],
        [3, 4]])

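For np.matrix objects, the * operator performs matrix multiplication. On plain arrays, * is element-wise, and matrix products are computed with np.dot or the @ operator.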

In [29]:
## Matrix Multiply

a1 = np.matrix('1 2; 3 4')
a2 = np.matrix('3 4; 5 7')
a1 * a2


matrix([[13, 18],
        [29, 40]])
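
Plain arrays can be multiplied as matrices too. As a minimal sketch, assuming Python 3.5 or later for the @ operator, the same product can be computed without np.matrix. The names b1 and b2 are illustrative.

```python
import numpy as np

b1 = np.array([[1, 2], [3, 4]])
b2 = np.array([[3, 4], [5, 7]])

# @ performs matrix multiplication on ordinary arrays
product = b1 @ b2

# * on ordinary arrays multiplies element by element instead
elementwise = b1 * b2
```

Here product holds the same values as the matrix result above, while elementwise is [[3, 8], [15, 28]].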

In [30]:
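# Convert array-like input to a matrix (a1 is already a matrix, so it is unchanged).
# np.asmatrix is used here because the np.mat alias was removed in NumPy 2.0.
mat_a = np.asmatrix(a1)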
mat_a

matrix([[1, 2],
        [3, 4]])

### Sparse Matrices

When most of the entries are zero, we can save memory by storing the data as a sparse matrix, which keeps only the non-zero values:

In [31]:
import numpy, scipy.sparse
n = 100000
x = (numpy.random.rand(n) * 2).astype(int).astype(float) # 50% sparse vector
x_csr = scipy.sparse.csr_matrix(x)
x_dok = scipy.sparse.dok_matrix(x.reshape(x_csr.shape))

x_dok


<1x100000 sparse matrix of type '<class 'numpy.float64'>'
	with 50149 stored elements in Dictionary Of Keys format>
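
As a smaller sketch of the same idea, the example below converts a tiny, mostly-zero array to CSR format, counts its stored values, and converts it back. The data is made up for illustration.

```python
import numpy as np
import scipy.sparse

# A small dense array that is mostly zeros (illustrative data)
dense = np.array([[0.0, 0.0, 3.0],
                  [4.0, 0.0, 0.0]])

sparse = scipy.sparse.csr_matrix(dense)

stored = sparse.nnz            # number of explicitly stored values
round_trip = sparse.toarray()  # back to a dense ndarray
```

Only the two non-zero entries are stored, and toarray recovers the original dense data.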

### Loading from CSV file

We can load data from a CSV file with the standard csv module, converting each row to floats:

In [32]:
import csv
with open('../../data/array/array.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile)
    data = []
    for row in csvreader:
        row = [float(x) for x in row]
        data.append(row)

data



[[2.0, 3.0, 4.0, 5.0], [3.0, 4.0, 5.0, 6.0], [7.0, 9.0, 9.0, 10.0]]
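
For purely numeric files, np.loadtxt can do the parsing and return an array directly. The sketch below reads from an in-memory string rather than a file so that it is self-contained. The string contents are an assumption chosen to mirror the output above, not the actual file.

```python
import io
import numpy as np

# Stand-in for a CSV file on disk (hypothetical contents)
csv_text = "2,3,4,5\n3,4,5,6\n7,9,9,10\n"

# loadtxt accepts a file name or a file-like object
data_array = np.loadtxt(io.StringIO(csv_text), delimiter=",")
```

The result is a 3x4 array of floats rather than a list of lists.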

### Solving a linear system

np.linalg.solve finds the vector x that satisfies a x = b:

In [33]:
import numpy as np
import scipy as sp
a = np.array([[3,2,0],[1,-1,0],[0,5,1]])
b = np.array([2,4,-1])
x = np.linalg.solve(a,b)
x



array([ 2., -2.,  9.])

In [34]:
# Check the answer (exact == happens to work here; np.allclose is more robust for floats)

np.dot(a, x) == b


array([ True,  True,  True], dtype=bool)
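
Because the solution is computed in floating point, an exact comparison can fail on other systems even when the answer is correct. A more tolerant check, sketched here with np.allclose, compares within a small tolerance.

```python
import numpy as np

a = np.array([[3, 2, 0], [1, -1, 0], [0, 5, 1]])
b = np.array([2, 4, -1])
x = np.linalg.solve(a, b)

# True when a @ x matches b within a small floating-point tolerance
ok = np.allclose(a @ x, b)
```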