# NumPy and SciPy

NumPy and SciPy are two foundational libraries for data science in Python. Early in Python's history, it became clear that the built-in list was poorly suited to heavy-duty number crunching on vectors and matrices.

So NumPy was created to solve this problem by introducing an array data structure into Python.

Let's first import NumPy and then create an array:

## Import

In [1]:
import numpy as np

## Array

In [2]:
## Notice that we have to pass in a list of numbers rather than separate arguments
## np.array(1,2,3)  ## ERROR: won't work

np.array([1,2,3]) ## Correct
a = np.array([1,2,3])
a

array([1, 2, 3])
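To make the difference between lists and arrays concrete, here is a minimal sketch comparing how the two respond to the same arithmetic operators. The variable names are illustrative only and not part of the original notebook.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

# Multiplying a list repeats it; multiplying an array scales each element.
doubled_list = values * 2      # [1, 2, 3, 1, 2, 3]
doubled_arr = arr * 2          # array([2, 4, 6])

# Adding lists concatenates them; adding arrays sums element by element.
summed_list = values + values  # [1, 2, 3, 1, 2, 3]
summed_arr = arr + arr         # array([2, 4, 6])

print(doubled_list, doubled_arr)
print(summed_list, summed_arr)
print(arr.shape)               # (3,)
```

This element-wise behavior is what makes arrays convenient for vector and matrix arithmetic.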

In [3]:
# TODO: Make a numpy array with the values 3.1, 4.2, 5.6, 7.8



## Sequence

In [4]:
## Sequence 1 to 9 (the stop value, 10, is excluded)
np.arange(1,10)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])
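Since np.arange leaves out its stop value, a short sketch of the common variations may help. The variable names here are made up for illustration.

```python
import numpy as np

up_to_nine = np.arange(1, 10)        # stop value 10 is excluded
up_to_ten = np.arange(1, 11)         # pass stop + 1 to include 10
odds = np.arange(1, 10, 2)           # optional third argument is the step
five_points = np.linspace(0, 1, 5)   # linspace includes both endpoints

print(up_to_nine[-1], up_to_ten[-1])  # 9 10
print(odds)                           # [1 3 5 7 9]
print(five_points)
```

np.linspace is often the easier choice when you know how many points you want rather than the step between them.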

In [5]:
## TODO : create a range of 1 to 100 (including 100)

r = np.arange(1, 101)
print(r)

## Try to multiply sequence by a scalar (np.pi)
print(r * np.pi)

[  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36
  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72
  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89  90
  91  92  93  94  95  96  97  98  99 100]
[   3.14159265    6.28318531    9.42477796   12.56637061   15.70796327
   18.84955592   21.99114858   25.13274123   28.27433388   31.41592654
   34.55751919   37.69911184   40.8407045    43.98229715   47.1238898
   50.26548246   53.40707511   56.54866776   59.69026042   62.83185307
   65.97344573   69.11503838   72.25663103   75.39822369   78.53981634
   81.68140899   84.82300165   87.9645943    91.10618695   94.24777961
   97.38937226  100.53096491  103.67255757  106.81415022  109.95574288
  113.09733553  116.23892818  119.38052084  122.52211349  125.66370614
  128.8052988   131.946891
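Multiplying an array by a scalar applies the multiplication to every element. The sketch below uses a shorter range than the cell above, so the output stays small, and compares the result with an explicit loop. The names scaled and looped are illustrative.

```python
import numpy as np

r = np.arange(1, 6)            # array([1, 2, 3, 4, 5])
scaled = r * np.pi             # every element multiplied by pi

# The same computation written as an explicit Python loop.
looped = np.array([v * np.pi for v in r])

print(np.allclose(scaled, looped))  # True
print(scaled.dtype)                 # float64: integers are promoted to floats
```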

## Multidimensional Arrays With Shape

In [6]:
a = np.array([1,2,3,4,5,6])
a.shape = (2,3) # 2 rows, 3 columns
a

array([[1, 2, 3],
       [4, 5, 6]])
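Assigning to shape changes the array in place. As an alternative sketch, reshape returns a reshaped array and leaves the original variable's shape untouched; passing -1 asks NumPy to infer that dimension. The names flat, grid, and auto are illustrative.

```python
import numpy as np

flat = np.arange(1, 7)         # array([1, 2, 3, 4, 5, 6])
grid = flat.reshape(2, 3)      # 2 rows, 3 columns; flat keeps its shape
auto = flat.reshape(-1, 2)     # -1 is inferred as 3 rows

print(flat.shape, grid.shape, auto.shape)  # (6,) (2, 3) (3, 2)
print(grid[1, 0])                          # row 1, column 0 -> 4
```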

## Matrices

In [7]:
## 2 rows and 3 columns
np.matrix('1 2 3; 4 5 6')

matrix([[1, 2, 3],
        [4, 5, 6]])

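With np.matrix objects, the * operator performs matrix multiplication. On plain arrays, * multiplies element by element, and matrix multiplication is written with the @ operator or np.dot.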

In [8]:
## Matrix Multiply

a1 = np.matrix('1 2; 3 4')
a2 = np.matrix('3 4; 5 7')
a1 * a2


matrix([[13, 18],
        [29, 40]])
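The same product can also be computed with ordinary arrays. This sketch reuses the values from the cell above and contrasts element-wise multiplication with the @ operator.

```python
import numpy as np

a1 = np.array([[1, 2], [3, 4]])
a2 = np.array([[3, 4], [5, 7]])

elementwise = a1 * a2      # multiplies matching entries
product = a1 @ a2          # matrix multiplication
same = np.dot(a1, a2)      # equivalent to @ for 2-D arrays

print(elementwise)  # [[ 3  8] [15 28]]
print(product)      # [[13 18] [29 40]]
```

Working with plain arrays and @ avoids having to convert between array and matrix types.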

In [9]:
# Converting an array to a matrix (np.asmatrix replaces np.mat, which was removed in NumPy 2.0)
mat_a = np.asmatrix(np.array([[1, 2], [3, 4]]))
mat_a

matrix([[1, 2],
        [3, 4]])

## Sparse Matrices

Sometimes most of the entries in our data are zero, and we want to store only the nonzero values in a sparse matrix.

In [10]:
import numpy, scipy.sparse
n = 100000
x = (numpy.random.rand(n) * 2).astype(int).astype(float) # 50% sparse vector
x_csr = scipy.sparse.csr_matrix(x)
x_dok = scipy.sparse.dok_matrix(x.reshape(x_csr.shape))

x_dok


<1x100000 sparse matrix of type '<class 'numpy.float64'>'
	with 49987 stored elements in Dictionary Of Keys format>
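To show why a sparse format is worth the trouble, here is a rough sketch that compares the memory used by a mostly-zero dense array with its CSR equivalent. The matrix size, the seed, and the number of nonzero entries are arbitrary choices for illustration.

```python
import numpy as np
import scipy.sparse

rng = np.random.default_rng(0)

# Dense 1000 x 1000 array with at most 5000 nonzero entries.
dense = np.zeros((1000, 1000))
rows = rng.integers(0, 1000, size=5000)
cols = rng.integers(0, 1000, size=5000)
dense[rows, cols] = 1.0

sparse = scipy.sparse.csr_matrix(dense)

# CSR stores only the nonzero values plus two index arrays.
dense_bytes = dense.nbytes
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print(sparse.nnz)                 # number of stored nonzero entries
print(dense_bytes, sparse_bytes)

# Converting back recovers the original dense array.
roundtrip = sparse.toarray()
```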

## Loading from CSV file


In [11]:
import csv
with open('/data/misc/array.csv', 'r') as csvfile:
    csvreader = csv.reader(csvfile)
    data = []
    for row in csvreader:
        row = [float(x) for x in row]
        data.append(row)

data



[[2.0, 3.0, 4.0, 5.0], [3.0, 4.0, 5.0, 6.0], [7.0, 9.0, 9.0, 10.0]]
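NumPy can also parse delimited text directly into an array. In the sketch below, an in-memory string stands in for the CSV file so the example runs anywhere. Its contents are an assumption modeled on the output shown above, not a copy of the actual file.

```python
import io
import numpy as np

# Stand-in for the CSV file; the values mirror the output shown above.
csv_text = "2,3,4,5\n3,4,5,6\n7,9,9,10\n"

data = np.loadtxt(io.StringIO(csv_text), delimiter=",")

print(data.shape)   # (3, 4)
print(data.dtype)   # float64
print(data[2])      # last row as an array
```

With a real file, the file path can be passed to np.loadtxt in place of the StringIO object.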

## Solving a linear system

In [12]:
import numpy as np
import scipy as sp

a = np.array([[3,2,0],[1,-1,0],[0,5,1]])
b = np.array([2,4,-1])
x = np.linalg.solve(a,b)
x



array([ 2., -2.,  9.])

In [13]:
# Checking the answer (np.isclose tolerates floating-point rounding, unlike ==)

np.isclose(np.dot(a, x), b)


array([ True,  True,  True], dtype=bool)
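When only a single pass/fail answer is needed, the element-wise check can be collapsed into one boolean. The sketch below repeats the system from above and also reports the size of the residual. The names ok and residual are illustrative.

```python
import numpy as np

a = np.array([[3, 2, 0], [1, -1, 0], [0, 5, 1]])
b = np.array([2, 4, -1])
x = np.linalg.solve(a, b)

# One boolean for the whole system, with a tolerance for rounding error.
ok = np.allclose(a @ x, b)

# Norm of a @ x - b; values near zero indicate a good solution.
residual = np.linalg.norm(a @ x - b)

print(ok)        # True
print(residual)
```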