# Numpy Tutorial

### What is a numpy array?

'numpy' is short for "Numerical Python". The numpy library is the core library for scientific computing in Python, and the numpy array is its core data structure. A numpy array can be considered a grid of values that all have the same type.

In [None]:
import numpy as np
num_arr_1d = np.array([1, 2, 3])
num_arr_2d = np.array([[1, 2, 3], [3, 4, 5]])
print("num_arr_1d :", num_arr_1d)
print("num_arr_2d :", num_arr_2d)


Four basic properties that together define an entire numpy array are:
<ul>
<li> <b>data</b>: A pointer to the first byte of the array's memory buffer
<li> <b>dtype</b>: The type of the elements in the numpy array
<li> <b>shape</b>: The number of elements along each dimension of the numpy array
<li> <b>strides</b>: The number of bytes to be skipped in memory to go to the next element along each dimension (the next row and the next column respectively for a 2-d array)
</ul>
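
To make the strides property concrete, here is a minimal sketch. It assumes an explicit int64 dtype (8 bytes per element) and numpy's default row-major memory layout; the variable name `arr` is only for illustration.

```python
import numpy as np

# Explicit dtype so that every element occupies exactly 8 bytes
arr = np.array([[1, 2, 3], [3, 4, 5]], dtype=np.int64)

print(arr.itemsize)   # bytes per element: 8
print(arr.shape)      # (2, 3)
# Moving to the next row skips 3 elements * 8 bytes; the next column skips 8 bytes
print(arr.strides)    # (24, 8)
# Transposing swaps the strides without copying any data
print(arr.T.strides)  # (8, 24)
```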

In [None]:
print("Buffer holding the data of num_arr_1d: ", num_arr_1d.data)
# Let's verify that this buffer starts at the first element by reading its first item
print("First element read through the buffer: ", num_arr_1d.data[0])

# Similarly, we can look at the other three attributes
print("dtype: ", num_arr_1d.dtype)
print("Shape: ", num_arr_1d.shape)
print("Strides:", num_arr_1d.strides)

The most important among the above four properties are shape and dtype. Shape comes in handy while broadcasting, and dtype comes in handy when we run into data-type conversion issues.

###### Side Note: Another important thing to pay attention to in Python is that indexing starts from 0. The values in shape, however, are counts of elements, so a dimension of size 3 has the valid indices 0, 1 and 2.
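
The difference between shape values and index values can be sketched as follows. The names `arr`, `n_rows`, `n_cols` and `last` are only for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3], [3, 4, 5]])
n_rows, n_cols = arr.shape        # shape counts elements: (2, 3)

# The last valid index along each axis is one less than the shape value
last = arr[n_rows - 1, n_cols - 1]
print(arr.shape, last)            # (2, 3) 5

# Using the shape value itself as an index goes out of bounds
try:
    arr[n_rows, 0]
except IndexError as err:
    print("IndexError:", err)
```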

### Initialising numpy array

We initialise a numpy array with the np.array() command. We can either specify the dtype or leave it out (in this case, numpy will automatically infer the dtype from the data).
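
As a small illustration of inferred versus explicit dtypes, the sketch below uses floating-point data, because the default integer dtype can differ between platforms. The variable names are only for illustration.

```python
import numpy as np

inferred_float = np.array([1.0, 2.5, 3.0])        # dtype inferred from the values
mixed = np.array([1, 2.5])                        # the integer is upcast to float
explicit = np.array([1, 2, 3], dtype=np.float32)  # dtype given explicitly

print(inferred_float.dtype)  # float64
print(mixed.dtype)           # float64
print(explicit.dtype)        # float32
```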

In [None]:
num_arr_2d = np.array([[1, 2, 3], [3, 4, 5]])

print(num_arr_2d)

Some of the other ways to define a numpy array are:

In [None]:
print(np.ones((2, 4))) # create an array of ones with given shape

print(np.zeros((2, 4))) # create an array of zeros with given shape

print(np.full((2, 4), 2, dtype = np.int64)) # create a constant array

print(np.random.random((2, 4))) # create an array of random values drawn uniformly from [0, 1)

There are also other random functions that we can use to draw from uniform, normal and other similar distributions.

In [None]:
print(np.random.normal(0, 1, size = (2, 4)), "\n")

print(np.random.uniform(-1, 1, size = (2, 4)))

### Array Slicing

Slicing a numpy array is similar to slicing a Python list. The first index is for axis 0, the second for axis 1, and so on.

In [None]:
num_arr_2d = np.array([[1, 2, 3, 4], 
                       [5, 6, 7, 8],
                       [9, 10, 11, 12],
                       [13, 14, 15, 16]])
print("Slicing the first two rows of num_arr_2d: ", num_arr_2d[:2, :], "\n")
print("Slicing the first two columns of num_arr_2d: ", num_arr_2d[:, :2], "\n")
print("Slicing the values corresponding to mid two rows and columns: ", num_arr_2d[1:3, 1:3])

<b> Boolean array indexing: </b>is used to select the elements of the array that satisfy a certain condition

In [None]:
num_arr_2d[num_arr_2d > 11] # Note this outputs a 1-dim array irrespective of the dimension of the original array
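
The condition itself produces a boolean array (a mask), which can be inspected, combined with other conditions, or used for assignment. The following is a minimal sketch; the names `arr`, `mask` and `clipped` are only for illustration.

```python
import numpy as np

arr = np.arange(1, 17).reshape(4, 4)   # same values as num_arr_2d above
mask = arr > 11                        # boolean array with the same shape as arr

print(mask.shape)    # (4, 4)
print(arr[mask])     # [12 13 14 15 16]

# Conditions can be combined with & and | (each condition in parentheses)
print(arr[(arr > 5) & (arr % 2 == 0)])   # [ 6  8 10 12 14 16]

# A mask can also be used on the left-hand side to assign values
clipped = arr.copy()
clipped[clipped > 11] = 0
print(clipped.max())   # 11
```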

<b> Note:</b><i> Slicing returns a view of the underlying data, i.e. if we change the sliced data, the changes will be reflected in the underlying data as well. Boolean array indexing, in contrast, returns a copy.</i>

In [None]:
x = np.array([1, 2, 3, 4, 5, 6])
y = x[3:5]
y[0] = 0
print(x, y)
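
When the original array must stay untouched, one option is to take an explicit copy of the slice. The sketch below contrasts a view with a copy and uses np.shares_memory to check which one shares data with x; the names `view` and `copy` are only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])

view = x[3:5]           # basic slicing: a view on x's data
copy = x[3:5].copy()    # explicit copy: independent data

copy[0] = -1            # does not touch x
print(x)                # [1 2 3 4 5 6]

view[0] = 0             # writes through to x
print(x)                # [1 2 3 0 5 6]

print(np.shares_memory(x, view))   # True
print(np.shares_memory(x, copy))   # False
```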

### Array Math

All the basic arithmetic operations are performed element-wise on numpy arrays of the same shape.

In [None]:
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
print("x + y: \n", x + y, "\n")
print('x * y: \n', x * y, "\n")
print('x / y: \n', x / y, "\n")
print('x - y: \n', x - y, "\n")

### Matrix multiplication

In order to do matrix multiplication, we can use the np.dot function (for 2-d arrays, the @ operator gives the same result)

In [None]:
num_1d_1 = np.array([1, 2, 3])
num_1d_2 = np.array([0, 0, 1])
np.dot(num_1d_1, num_1d_2) # this is a dot product between the two vectors

In [None]:
num_2d_1 = np.array([[1, 2],
                     [3, 4]])
num_2d_2 = np.array([[1, 2, 3],
                     [4, 5, 6]])
np.dot(num_2d_1, num_2d_2) # Note that the shapes of the two arrays must be conformable for matrix multiplication: (2, 2) x (2, 3) gives (2, 3)
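
As a small check of the conformability requirement, the sketch below multiplies the same two matrices with the @ operator and shows that reversing the order fails. The names `A`, `B` and `product` are only for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])          # shape (2, 2)
B = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3)

product = A @ B                 # same result as np.dot(A, B) for 2-d arrays
print(product)
# [[ 9 12 15]
#  [19 26 33]]
print(np.array_equal(product, np.dot(A, B)))   # True

# Reversing the order gives (2, 3) x (2, 2), which is not conformable
try:
    B @ A
except ValueError as err:
    print("ValueError:", err)
```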
                     

### Broadcasting

Broadcasting is one of the most powerful features of numpy. It is very useful when we have a smaller array and a larger array and we want to use the smaller array multiple times. The rule for broadcasting is that the shapes are compared dimension by dimension, starting from the last one: two dimensions are compatible when they are equal or when one of them is 1, and missing leading dimensions are treated as 1.


In [None]:
a = np.array([[1, 2, 3], [2, 3, 4]])
b = np.array([1, 1, 1])
print(a + b) # This works because the last dimension of both arrays is 3, so b is added to every row of a


In [None]:
a = np.array([[1, 2, 3], [2, 3, 4]])
b = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
try:
    print(a + b) # This doesn't work because the first dimensions (2 and 3) differ and neither of them is 1
except ValueError as err:
    print("Broadcasting failed:", err)


In [None]:
a = np.array([[1, 2, 3], [2, 3, 4]])
b = np.array([1, 2])
print(a.T * b) # This works because a.T has shape (3, 2) and its last dimension matches the shape (2,) of b
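
The broadcasting rule can also be checked on shapes alone. The sketch below assumes a numpy version that provides np.broadcast_shapes (1.20 or later), and then shows one way to broadcast b along the rows of a without transposing, by giving b an extra axis of length 1. The name `scaled` is only for illustration.

```python
import numpy as np

# np.broadcast_shapes applies the broadcasting rule to shapes only
print(np.broadcast_shapes((2, 3), (3,)))   # (2, 3)
print(np.broadcast_shapes((3, 2), (2,)))   # (3, 2)

try:
    np.broadcast_shapes((2, 3), (3, 3))    # 2 vs 3, and neither is 1
except ValueError as err:
    print("ValueError:", err)

# An extra axis of length 1 turns b into a column that is reused for every column of a
a = np.array([[1, 2, 3], [2, 3, 4]])
b = np.array([1, 2])
scaled = a * b[:, np.newaxis]              # shapes (2, 3) and (2, 1)
print(scaled)
# [[1 2 3]
#  [4 6 8]]
```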

### Other useful functions

<b>np.arange(start, stop, step): </b> Returns evenly spaced values from start up to, but not including, stop


In [None]:
print(np.arange(12, 20, 2))
np.arange(20)

<b>array.astype(dtype):</b> For type conversion

In [None]:
a = np.array([1, 2, 3], dtype = "int64")
a = a.astype(np.int32)
a.dtype
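
One thing to watch for in type conversion is that converting floats to integers discards the fractional part instead of rounding. A minimal sketch, with `floats` and `ints` as illustrative names:

```python
import numpy as np

floats = np.array([1.7, -1.7, 2.0])
ints = floats.astype(np.int64)   # fractional part is discarded (truncation towards zero)

print(ints)           # [ 1 -1  2]
print(ints.dtype)     # int64
print(floats.dtype)   # float64 (astype returned a new array; floats is unchanged)
```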

### Reducing functions

numpy has lots of reducing functions like sum, mean, min, max etc. which collapse multidimensional arrays over the given axis.

In [None]:
x = np.ones((2, 10))
print(x.sum()) # Sum of all the values
print(x.sum(axis=1)) # Sum along each row: one value per row
print(x.sum(axis=0)) # Sum down each column: one value per column
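
With an array of distinct values it is easier to see which axis gets collapsed. The sketch below uses a small 2 x 3 array; the keepdims argument keeps the reduced axis with length 1 instead of dropping it.

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
# [[0 1 2]
#  [3 4 5]]

print(x.sum(axis=0))                        # [3 5 7]   (axis 0 collapsed, shape (3,))
print(x.sum(axis=1))                        # [ 3 12]   (axis 1 collapsed, shape (2,))
print(x.mean())                             # 2.5
print(x.max(axis=1))                        # [2 5]
print(x.sum(axis=1, keepdims=True).shape)   # (2, 1)
```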

<i>Side Note: An attribute of x, like shape or data, is accessed without parentheses as x.attribute. A function (method) of x, like mean or sum, is called with parentheses, e.g. x.sum(0)</i>