### Scientific Computing in Python

### NumPy

NumPy is the foundation of scientific computing in Python. Many other libraries build on it, including pandas, SciPy, Matplotlib, scikit-learn, and statsmodels.

In [1]:
import numpy as np

In [2]:
arr = [[1, 2, 3],
       [4, 5, 6],
       [7,8,9]]

arr = np.array(arr)

In [3]:
arr[1, 2]

6

In [4]:
arr.dtype  

dtype('int32')

* Changing the type of NumPy Arrays

In [5]:
arr.astype('float32')

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]], dtype=float32)
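
A point that is easy to miss in the cell above: astype returns a new array and leaves the original untouched. The short sketch below illustrates this, using the same 3x3 array; the variable name as_float is only for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

as_float = arr.astype('float32')    # new array with the requested dtype

print(as_float.dtype)               # float32
print(arr.dtype == as_float.dtype)  # False: the original keeps its integer dtype

# Converting floats to integers truncates toward zero rather than rounding.
print(np.array([1.9, -1.9]).astype('int64'))  # [ 1 -1]
```

To change the type of arr itself, assign the result back: arr = arr.astype('float32').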

* Dimensions of an Array

In [6]:
arr.ndim

2

* Shape of an Array

In [7]:
arr.shape

(3, 3)

The length of the shape tuple also gives the number of dimensions of an array

In [8]:
len(arr.shape)

2

### NumPy Array Construction and Indexing

**Array Construction**

Functions such as np.ones, np.zeros and np.empty create placeholder arrays of a given shape. They are useful when the initial values do not matter because they will be overwritten before being used in computations

In [9]:
np.ones((2,3), dtype= np.int32)

array([[1, 1, 1],
       [1, 1, 1]])

In [10]:
np.zeros((3, 3), dtype= np.float64)

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [11]:
np.empty((2, 2))

array([[2.12199579e-314, 9.57020102e-312],
       [5.73116149e-321, 9.57020102e-312]])

In [12]:
np.diag((1, 2))

array([[1, 0],
       [0, 2]])

In [13]:
np.arange(5)

array([0, 1, 2, 3, 4])

In [14]:
np.arange(2, 3, 0.25)

array([2.  , 2.25, 2.5 , 2.75])

In [15]:
np.linspace(0, 5, num = 3)

array([0. , 2.5, 5. ])
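
The difference between np.arange and np.linspace is mainly in how the end point is handled, and np.empty deserves some care because its contents are whatever happened to be in memory. A minimal sketch of both points (the variable names are illustrative):

```python
import numpy as np

# np.arange takes a step and excludes the stop value.
by_step = np.arange(0, 1, 0.25)
print(by_step)     # [0.   0.25 0.5  0.75]

# np.linspace takes a number of points and includes the stop value by default.
by_count = np.linspace(0, 1, num=5)
print(by_count)    # [0.   0.25 0.5  0.75 1.  ]

# np.empty only allocates memory, so fill the array before reading from it.
buf = np.empty((2, 2))
buf[:] = 0.0
print(buf)
```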

**Array Indexing**

Indexing and slicing NumPy arrays work the same way as for Python lists, with the addition that a multi-dimensional array accepts one index per axis, separated by commas

In [16]:
arr[0,1]

2

In [17]:
arr[:, 2]  # Entire third column (index 2)

array([3, 6, 9])
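
A few more indexing and slicing patterns, sketched on the original 3x3 array (it is redefined here so the example runs on its own):

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

print(arr[1])         # second row: [4 5 6]
print(arr[:, 0])      # first column: [1 4 7]
print(arr[:2, 1:])    # first two rows, last two columns
print(arr[-1, -1])    # negative indices count from the end: 9
print(arr[::2, ::2])  # every other row and every other column
```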

### Math and Universal Functions

Universal functions (ufuncs) operate element by element on whole arrays, which is both more efficient and more convenient than writing Python loops

In [18]:
arr = np.add(arr, 1)
arr

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

The same operations are available through the arithmetic operators +, -, * and /

In [19]:
arr = arr + 1
arr

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

**Reduce:** Applies an operation repeatedly along an axis, collapsing that axis to a single value per remaining position

In [20]:
np.add.reduce(arr, axis = 0) # Calculating the sums of each column

array([18, 21, 24])

In [21]:
np.multiply.reduce(arr, axis = 1)  # Calculating the product of each row

array([ 60, 336, 990])

In [22]:
arr.sum(axis = 0)

array([18, 21, 24])
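
The axis argument names the axis that is collapsed, which is why axis=0 yields one value per column and axis=1 one value per row. A small sketch, using the same values arr holds at this point in the notebook:

```python
import numpy as np

arr = np.array([[3, 4, 5],
                [6, 7, 8],
                [9, 10, 11]])

col_sums = np.add.reduce(arr, axis=0)        # collapses the rows: one sum per column
row_sums = np.add.reduce(arr, axis=1)        # collapses the columns: one sum per row
row_prods = np.multiply.reduce(arr, axis=1)  # one product per row
total = arr.sum()                            # no axis: reduce over every element

print(col_sums)   # [18 21 24]
print(row_sums)   # [12 21 30]
print(row_prods)  # [ 60 336 990]
print(total)      # 63
```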

* np.mean (computes the arithmetic mean or average)
* np.std (computes the standard deviation)
* np.var (computes the variance)
* np.sort (returns a sorted copy of an array)
* np.min (returns the minimum value of array)
* np.max (returns the maximum value of array)
* np.argmin (returns the index of the minimum value)
* np.argmax (returns the index of the maximum value)
* np.argsort (returns the indices that would sort an array)
* np.array_equal (checks if two arrays have same shape and elements)
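
The functions listed above can be tried on a small array. This is only a sketch with an illustrative array x; note that np.var and np.std compute the population variance and standard deviation by default (ddof=0).

```python
import numpy as np

x = np.array([3, 1, 2])

print(np.mean(x))     # 2.0
print(np.var(x))      # population variance, 2/3 here
print(np.std(x))      # square root of the variance
print(np.sort(x))     # [1 2 3], returned as a new array
print(np.min(x), np.max(x))  # 1 3
print(np.argmin(x))   # 1, the position of the smallest value
print(np.argmax(x))   # 0, the position of the largest value
print(np.argsort(x))  # [1 2 0]

# Indexing with the argsort result gives the same array as np.sort.
print(np.array_equal(x[np.argsort(x)], np.sort(x)))  # True
```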

### Broadcasting

![image.png](attachment:image.png)

In [23]:
a = np.array([1, 2, 3])
a * 2

array([2, 4, 6])

In [24]:
arr + a

array([[ 4,  6,  8],
       [ 7,  9, 11],
       [10, 12, 14]])

In [25]:
arr

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
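
The cells above broadcast a scalar and a 1-D array against a larger array. More generally, NumPy compares shapes from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. A minimal sketch with illustrative names row, col and grid:

```python
import numpy as np

row = np.array([1, 2, 3])            # shape (3,)
col = np.array([[10], [20], [30]])   # shape (3, 1)

# The size-1 dimensions are stretched, so (3, 1) + (3,) gives (3, 3).
grid = col + row
print(grid.shape)  # (3, 3)
print(grid)

# Incompatible shapes raise a ValueError instead of being silently padded.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("cannot broadcast:", err)
```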

### Memory Copy

In [26]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

In [27]:
first_row = a[0]
first_row += 99

a

array([[100, 101, 102],
       [  4,   5,   6]])

In [28]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])

In [29]:
first_row = a[0].copy()
first_row += 99

a

array([[1, 2, 3],
       [4, 5, 6]])

Basic indexing with integers and slices returns a view that shares memory with the original array, so modifying the view in place also modifies the original. Call .copy() when an independent array is needed
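
One way to check whether two arrays share memory is np.shares_memory. The sketch below uses it to compare basic indexing, slicing, and indexing with a list of positions; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

row_view = a[0]       # integer index: a view
block_view = a[:, 1:] # slice: a view
row_copy = a[[0]]     # list of indices: a copy

print(np.shares_memory(a, row_view))    # True
print(np.shares_memory(a, block_view))  # True
print(np.shares_memory(a, row_copy))    # False

row_view += 99        # in-place update writes through to a
print(a)

row_copy += 1000      # the copy is independent, so a is unchanged by this
print(a)
```

Note that first_row = first_row + 99 would not modify a, because it builds a new array and rebinds the name; only in-place operations such as += write through a view.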

### Fancy Indexing

In [30]:
a = np.array([1, 2, 3, 4, 5])

In [31]:
a[[0, 2, 4, 1, 3]] # Python lists cannot be indexed with a list of positions; this can be used to reorder or shuffle an array

array([1, 3, 5, 2, 4])

In [32]:
a > 2, a < 5

(array([False, False,  True,  True,  True]),
 array([ True,  True,  True,  True, False]))

In [33]:
a[(a > 2) & (a < 5)] # This is called boolean masking

array([3, 4])
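
Boolean masks can also be combined with | (or) and ~ (not), counted, and used on the left-hand side of an assignment. The parentheses around each comparison are required because & and | bind more tightly than < and >. A short sketch:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

mask = (a > 2) & (a < 5)
print(mask.sum())             # number of matching elements: 2

b = a.copy()
b[mask] = 0                   # assign to the selected positions only
print(b)                      # [1 2 0 0 5]

print(a[(a < 2) | (a > 4)])   # [1 5]
print(a[~mask])               # [1 2 5]
```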

### Random Number Generators

In [34]:
import random  # Standard-library module; the cells below use np.random instead

In [35]:
np.random.rand(10)  # Generates 10 random floats drawn uniformly from [0, 1)

array([0.08515074, 0.36376597, 0.62253141, 0.26109241, 0.34677838,
       0.07536789, 0.26734587, 0.36105679, 0.06244751, 0.62904909])

In [37]:
np.random.seed(123)  # Fixing the seed draws the same numbers on every run, which makes results reproducible
np.random.rand(3) 

array([0.69646919, 0.28613933, 0.22685145])

In [46]:
from sklearn.neighbors import KNeighborsClassifier



y = np.array([0]* 50 + [1]*50)
X = np.random.rand(100, 2)


knn = KNeighborsClassifier(n_neighbors= 3)
knn.fit(X, y)
knn.score(X, y) 

0.7

Because X is drawn at random, the score on the training set changes every time the code block runs. Calling np.random.seed before generating the data makes the result reproducible
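
The sketch below shows the reproducibility part in isolation, using NumPy only. The helper make_data is hypothetical and simply wraps the seeding and the call to np.random.rand from the cell above. Since the features are then identical on every run, a model fitted to them should score the same each time, provided the model itself does not use randomness.

```python
import numpy as np

def make_data(seed):
    # Hypothetical helper: seed the global generator, then draw the features.
    np.random.seed(seed)
    return np.random.rand(100, 2)

X1 = make_data(123)
X2 = make_data(123)
print(np.array_equal(X1, X2))  # True: same seed, same data

# Alternative: a local Generator object, which avoids touching global state.
rng = np.random.default_rng(123)
X3 = rng.random((100, 2))
print(X3.shape)                # (100, 2)
```

The local Generator keeps the seed next to the code that depends on it, which can be easier to reason about in longer notebooks than a global np.random.seed call.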