# NumPy 

NumPy (sometimes written Numpy) is a linear algebra library for Python. It is important for data science with Python because almost all of the libraries in the PyData ecosystem rely on NumPy as one of their main building blocks.

NumPy is also fast, as its core is implemented in C and it has bindings to C libraries.

It helps in working with arrays and matrices in Python.

## Using NumPy

Once you've installed NumPy you can import it as a library:

In [1]:
import numpy as np



# NumPy Arrays

NumPy arrays are the main way we use NumPy. For our purposes they come in two flavors: vectors and matrices. Vectors are strictly 1-d arrays and matrices are 2-d (note that a matrix can still have only one row or one column). Arrays can also have more than two dimensions.

Let's begin our introduction by exploring how to create NumPy arrays.

## Creating NumPy Arrays

### From a Python List

We can create an array by directly converting a list or list of lists:

In [2]:
my_list = [1,2,3]
my_list

[1, 2, 3]

In [3]:
np.array(my_list)

array([1, 2, 3])

In [4]:
# Check whether a value exists in the array
# `in` evaluates to True if the value is present, otherwise False

integerArray = np.array(my_list)

2 in integerArray # True

True

In [5]:
my_matrix = [[1,2,3],[4,5,6],[7,8,9]]
my_matrix

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [6]:
np.array(my_matrix)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

---
We can also specify the data type of the output array by passing it as the second argument (`dtype`):

In [7]:
# Define an array integerArray with elements 1, 2, 3 and 4 of type int
integerArray = np.array([1,2,3,4], int)

print(integerArray)

[1 2 3 4]
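As a small sketch of the same idea, the type can also be passed by keyword as `dtype`. The variable names `as_int` and `as_float` are only illustrative.

```python
import numpy as np

# Same values, two different element types (names are illustrative)
as_int = np.array([1, 2, 3, 4], dtype=int)
as_float = np.array([1, 2, 3, 4], dtype=float)

print(as_int)    # [1 2 3 4]
print(as_float)  # [1. 2. 3. 4.]
```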


## Built-in Methods

There are lots of built-in ways to generate arrays.

### arange

Return evenly spaced values within a given interval.

Values start at the first argument, increase by the given step size (1 by default), and stop before the second argument, which is excluded.

In [8]:
np.arange(0,10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [9]:
np.arange(0,11,2)

array([ 0,  2,  4,  6,  8, 10])
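A short sketch of two details that are easy to miss with `arange`: the stop value itself is never included, and the step does not have to be an integer. The names `evens` and `halves` are illustrative.

```python
import numpy as np

# The stop value is excluded, so 10 does not appear here
evens = np.arange(0, 10, 2)
print(evens)  # [0 2 4 6 8]

# A fractional step gives a float array
halves = np.arange(0, 2, 0.5)
print(halves)  # [0.  0.5 1.  1.5]
```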

### zeros and ones

Generate arrays of zeros or ones

In [10]:
np.zeros(3)

array([0., 0., 0.])

In [11]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [12]:
np.ones(3)

array([1., 1., 1.])

In [13]:
np.ones((3,3))

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

### linspace
Return evenly spaced numbers over a specified interval.

Unlike `arange`, the third argument is the number of points to return, not the step size, and both end points are included by default.

In [14]:
# 3 evenly spaced points from 0-10
np.linspace(0,10,3)

array([ 0.,  5., 10.])

In [15]:
# 10 evenly spaced points from 0-5
np.linspace(0, 5, 10)

array([0.        , 0.55555556, 1.11111111, 1.66666667, 2.22222222,
       2.77777778, 3.33333333, 3.88888889, 4.44444444, 5.        ])

In [16]:
np.linspace(0,10,50)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])
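As a minimal sketch, `linspace` also accepts the keyword arguments `retstep` and `endpoint`, which make the spacing explicit. The variable names are illustrative.

```python
import numpy as np

# retstep=True also returns the spacing between consecutive points
points, step = np.linspace(0, 10, 5, retstep=True)
print(points)  # [ 0.   2.5  5.   7.5 10. ]
print(step)    # 2.5

# endpoint=False leaves out the stop value
open_points = np.linspace(0, 10, 5, endpoint=False)
print(open_points)  # [0. 2. 4. 6. 8.]
```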

## eye

Creates an identity matrix

In [17]:
np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

## diag

---
`diag` extracts a diagonal or constructs a diagonal array.

In [18]:
y = np.ones(3)

np.diag(y)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
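The cell above shows `diag` constructing a diagonal array from a 1-d input. The following sketch shows the other direction, extracting the diagonal from a 2-d input; the matrix `m` is made up for illustration.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# 2-d input: diag extracts the main diagonal
extracted = np.diag(m)
print(extracted)  # [1 5 9]

# 1-d input: diag builds a square matrix with those values on the diagonal
rebuilt = np.diag(extracted)
print(rebuilt.shape)  # (3, 3)
```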

## Concatenate

Joins a sequence of arrays along an existing axis and returns a single array

In [19]:
integerArray = np.array([1,2,3,4])
integerArray2 = np.array([5,6])

# Concatenate two arrays
np.concatenate((integerArray, integerArray2))

array([1, 2, 3, 4, 5, 6])

---
`concatenate` can join multidimensional arrays along any axis. For a two-dimensional array (array[row][column]), `axis=0` (the default) joins along the row axis, so the rows of the second array are appended below the first. `axis=1` joins along the column axis, so the columns of the second array are appended to the right.

In [20]:
# Concatenation of multi-dimensional arrays
arr1 = np.array([[1,2], [3,4]])
arr2 = np.array([[5,6], [7,8]])

np.concatenate((arr1, arr2))

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [21]:
# Along axis 0: rows are appended below (same as the default)

np.concatenate((arr1, arr2), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6],
       [7, 8]])

In [22]:
# Along axis 1: columns are appended to the right

np.concatenate((arr1, arr2), axis=1)

array([[1, 2, 5, 6],
       [3, 4, 7, 8]])
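One way to keep the axes straight is to look at the resulting shapes: the axis that is concatenated along is the one that grows. A minimal sketch, with illustrative names `rows_joined` and `cols_joined`:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

rows_joined = np.concatenate((arr1, arr2), axis=0)
cols_joined = np.concatenate((arr1, arr2), axis=1)

print(rows_joined.shape)  # (4, 2): axis 0 grew from 2 to 4
print(cols_joined.shape)  # (2, 4): axis 1 grew from 2 to 4
```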

## Repeat

---
Repeat elements of an array using `repeat`.

In [23]:
# Repeating element wise

np.repeat([1, 2, 3], 3)

array([1, 1, 1, 2, 2, 2, 3, 3, 3])

In [24]:
# Repeating the whole list (Python list multiplication, then conversion to an array)

np.array([1, 2, 3] * 3)

array([1, 2, 3, 1, 2, 3, 1, 2, 3])
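The cell above relies on Python list multiplication. As an alternative sketch, NumPy's own `np.tile` repeats a whole array and also works when the input is already an array rather than a list.

```python
import numpy as np

# Repeat the whole sequence three times
tiled = np.tile([1, 2, 3], 3)
print(tiled)  # [1 2 3 1 2 3 1 2 3]
```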


## Combining Arrays

---
Use `vstack` to stack arrays in sequence vertically (row wise).

In [25]:
p = np.ones([2, 3])

np.vstack([p, 2*p])

array([[1., 1., 1.],
       [1., 1., 1.],
       [2., 2., 2.],
       [2., 2., 2.]])

---
Use `hstack` to stack arrays in sequence horizontally (column wise).

In [26]:
np.hstack([p, 2*p])

array([[1., 1., 1., 2., 2., 2.],
       [1., 1., 1., 2., 2., 2.]])
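For 2-d inputs like `p`, these two functions give the same results as `concatenate` along axis 0 and axis 1. The following sketch checks that; `same_rows` and `same_cols` are illustrative names.

```python
import numpy as np

p = np.ones([2, 3])

# vstack matches concatenate along axis 0, hstack along axis 1 (for 2-d inputs)
same_rows = np.array_equal(np.vstack([p, 2*p]), np.concatenate([p, 2*p], axis=0))
same_cols = np.array_equal(np.hstack([p, 2*p]), np.concatenate([p, 2*p], axis=1))

print(same_rows, same_cols)  # True True
```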

## Random 

NumPy also has lots of ways to create arrays of random numbers:

### rand
Create an array of the given shape and populate it with
random samples from a uniform distribution
over ``[0, 1)``.

In [27]:
np.random.rand(2)

array([0.99013578, 0.58019938])

---
We don't need to pass a tuple; we can just pass the number of rows and columns of the random matrix as separate arguments

In [28]:
np.random.rand(5,5)

array([[0.24095088, 0.44951065, 0.1773361 , 0.69582222, 0.75228918],
       [0.34210182, 0.02994916, 0.47039868, 0.0460612 , 0.08552357],
       [0.81454426, 0.41892588, 0.17788338, 0.30093259, 0.11506391],
       [0.26105924, 0.10995686, 0.84247375, 0.34572739, 0.68654975],
       [0.71886537, 0.50562975, 0.88902078, 0.27830778, 0.76131031]])

In [29]:
# Random whole numbers from 0 to 9 (np.floor keeps the float dtype)

randomArray = np.floor(np.random.rand(10) * 10)
randomArray

array([1., 3., 4., 5., 6., 5., 7., 2., 4., 6.])
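Note that the output above still holds floats (`1.`, `3.`, ...). A minimal sketch of one way to get a true integer dtype is to cast the result with `astype(int)`; the names `random_floats` and `random_ints` are illustrative.

```python
import numpy as np

random_floats = np.floor(np.random.rand(10) * 10)
random_ints = random_floats.astype(int)  # cast the whole numbers to an integer dtype

print(random_floats.dtype)  # float64
print(random_ints.dtype)    # an integer dtype (exact width is platform dependent)
```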

### randn

Return a sample (or samples) from the "standard normal" distribution (mean 0, standard deviation 1), unlike `rand`, which samples from a uniform distribution.

The normal distribution, also known as the Gaussian distribution, is a probability distribution that is symmetric about the mean: values near the mean occur more frequently than values far from the mean.

In [30]:
np.random.randn(2)

array([-0.02396113, -0.25969922])

In [31]:
np.random.randn(5,5)

array([[-0.93960144, -1.33610528,  0.13974792, -2.36473344, -0.08716036],
       [ 0.27837912,  0.42887017, -0.97162597,  0.56060645, -0.49582244],
       [-1.08039723,  0.15652606,  0.7089266 , -1.49258108,  0.58734965],
       [-0.79293774, -3.09847417,  0.79975476,  1.14735736,  0.50406118],
       [ 0.54990178,  1.39710785, -0.2757326 , -1.20316936,  0.93978288]])
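To see the "standard normal" property in practice, we can draw a large sample and check that its mean is close to 0 and its standard deviation is close to 1. This is only a sketch; the seed and sample size are arbitrary choices.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)
samples = np.random.randn(100000)

print(samples.mean())  # close to 0
print(samples.std())   # close to 1
```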

### randint
Return random integers from `low` (inclusive) to `high` (exclusive).

In [32]:
np.random.randint(1,100)

10

In [33]:
np.random.randint(1,100,10)

array([58, 93, 67, 34, 73, 76, 17,  5, 22, 28])
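The third argument (`size`) can also be a tuple to get a 2-d array. The sketch below simulates dice rolls as an illustration; because `high` is exclusive, passing 7 gives values from 1 to 6.

```python
import numpy as np

# size may be a tuple; high (here 7) is never returned
rolls = np.random.randint(1, 7, size=(3, 4))

print(rolls.shape)                         # (3, 4)
print(rolls.min() >= 1, rolls.max() <= 6)  # True True
```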

## Array Attributes and Methods

Let's discuss some useful attributes and methods of an array:

In [34]:
arr = np.arange(25)
ranarr = np.random.randint(0,50,10)

In [35]:
arr

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])

In [36]:
ranarr

array([ 8, 25, 48, 28, 43, 22, 14, 47, 33, 13])

## Shape

Shape is an attribute that arrays have (not a method), so it is accessed without parentheses. For a 2-d array it has the form:

(row, column)

In [37]:
arr = np.array([1, 2, 3])

# Vector
arr.shape

(3,)

`(3,)` indicates that it is a 1-d array with 3 elements
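Two related attributes, `ndim` and `size`, are often read together with `shape`. A short sketch comparing a vector and a matrix (the names `vector` and `matrix` are illustrative):

```python
import numpy as np

vector = np.array([1, 2, 3])
matrix = np.array([[1, 2, 3], [4, 5, 6]])

# shape, number of dimensions, total number of elements
print(vector.shape, vector.ndim, vector.size)  # (3,) 1 3
print(matrix.shape, matrix.ndim, matrix.size)  # (2, 3) 2 6
```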

**Getting Transpose**


In [38]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

arr

array([[1, 2, 3],
       [4, 5, 6]])

---
Use `.T` to get the transpose.

In [39]:
arr.T

array([[1, 4],
       [2, 5],
       [3, 6]])

In [40]:
arr.T.shape

(3, 2)

## Reshape
Returns an array containing the same data with a new shape.

**The new shape must hold exactly the same number of elements as the original array; otherwise `reshape` raises a `ValueError`.**

In [41]:
arr = np.arange(25)

arr.reshape(5,5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])

In [42]:
arr.shape

(25,)

In [43]:
# Notice the two sets of brackets
arr.reshape(1,25)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24]])

In [44]:
arr.reshape(1,25).shape

(1, 25)

In [45]:
arr.reshape(25,1)

array([[ 0],
       [ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12],
       [13],
       [14],
       [15],
       [16],
       [17],
       [18],
       [19],
       [20],
       [21],
       [22],
       [23],
       [24]])
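The sketch below shows two more things about `reshape`: passing `-1` lets NumPy infer one dimension from the array size, and a shape that does not match the number of elements raises a `ValueError`. The flag `reshape_failed` is an illustrative name.

```python
import numpy as np

arr = np.arange(25)

# -1 asks NumPy to infer that dimension from the array size
inferred = arr.reshape(5, -1)
print(inferred.shape)  # (5, 5)

# 25 elements cannot fill a 4 x 6 array, so reshape raises ValueError
try:
    arr.reshape(4, 6)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```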

## Resize

---
`resize` changes the shape and size of the array in-place.

In [46]:
arr = np.arange(9)
arr

array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [47]:
arr.resize(3, 3)
arr

array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
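Unlike `reshape`, `resize` may change the total number of elements. As a hedged sketch, the method `arr.resize` pads with zeros when the array grows, while the separate function `np.resize` returns a new array filled with repeated copies of the input. Passing `refcheck=False` is a choice made here so the example also runs in environments that hold extra references to the array.

```python
import numpy as np

a = np.arange(4)

# The np.resize function returns a new array, repeating values to fill it
repeated = np.resize(a, (2, 3))
print(repeated.tolist())  # [[0, 1, 2], [3, 0, 1]]

# The ndarray.resize method works in place and pads with zeros
b = np.arange(4)
b.resize((2, 3), refcheck=False)
print(b.tolist())  # [[0, 1, 2], [3, 0, 0]]
```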

## Math Functions

NumPy has many built-in math functions that can be performed on arrays.

In [48]:
a = np.array([-4, -2, 1, 3, 5])

In [49]:
# Sum of array elements
a.sum()    

3

In [50]:
# Max of array elements
a.max()

5

In [51]:
# Min of array elements
a.min()

-4

In [52]:
# Mean of array elements
a.mean()

0.6

In [53]:
# Standard Deviation
a.std()

3.2619012860600183
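On a 2-d array these methods reduce over all elements by default, and they accept an `axis` argument to reduce along rows or columns instead. A small sketch with a made-up matrix `m`:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = m.sum()            # all elements
col_sums = m.sum(axis=0)   # one sum per column
row_sums = m.sum(axis=1)   # one sum per row
col_means = m.mean(axis=0)

print(total)      # 21
print(col_sums)   # [5 7 9]
print(row_sums)   # [ 6 15]
print(col_means)  # [2.5 3.5 4.5]
```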

---
`argmax` and `argmin` return the index of the maximum and minimum values in the array.

In [54]:
# Gives the index at which the maximum value is in the array

a.argmax()

4

In [55]:
# Gives the index at which the minimum value is in the array

a.argmin()

0

In [56]:
x = np.array([[5,2,3], [3,4,5], [1,1,1]])

In [57]:
# Get unique values (returned sorted, as a flat array)
np.unique(x)

array([1, 2, 3, 4, 5])

In [58]:
# Get diagonal values
x.diagonal()

array([5, 4, 1])

In [59]:
# Sort values in the multidimensional array (along the last axis by default, so each row is sorted)
np.sort(x)

array([[2, 3, 5],
       [3, 4, 5],
       [1, 1, 1]])
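By default `np.sort` works along the last axis, which is why each row above is sorted independently. The sketch below uses the `axis` argument to sort each column instead, or to flatten and sort everything; the result names are illustrative.

```python
import numpy as np

x = np.array([[5, 2, 3], [3, 4, 5], [1, 1, 1]])

by_column = np.sort(x, axis=0)     # each column sorted top to bottom
flattened = np.sort(x, axis=None)  # flatten first, then sort

print(by_column.tolist())  # [[1, 1, 1], [3, 2, 3], [5, 4, 5]]
print(flattened)           # [1 1 1 2 3 3 4 5 5]
```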

### dtype, astype

You can also grab the data type of the object in the array:

In [60]:
arr = np.array([1, 2, 3, 4, 5])

---
Use `.dtype` to see the data type of the elements in the array.

In [61]:
arr.dtype

dtype('int64')

---
Use `.astype` to cast to a specific type.

In [62]:
arr = arr.astype('f')  # 'f' is the type code for 32-bit float
arr.dtype

dtype('float32')
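Casting in the other direction, from float to int, drops the fractional part rather than rounding. A minimal sketch with made-up values:

```python
import numpy as np

floats = np.array([1.7, -1.7, 2.0])

# astype(int) truncates toward zero; it does not round
ints = floats.astype(int)
print(ints)  # [ 1 -1  2]
```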