# NumPy tutorials

NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays. It is the fundamental package for scientific computing with Python.

## What is an array?

An array is a data structure that stores values of the same data type. In Python, this is the main difference between arrays and lists: a Python list can hold values of different data types, whereas a NumPy array holds values of a single data type.
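
The difference can be seen in a minimal sketch. The variable names are illustrative and not part of the original notebook. When the input mixes types, NumPy converts every element to one common dtype.

```python
import numpy as np

# A Python list keeps each element's own type.
mixed_list = [1, 2.5, "three"]
list_types = [type(item).__name__ for item in mixed_list]
print(list_types)  # ['int', 'float', 'str']

# A NumPy array has a single dtype, so an int next to a float becomes a float.
int_and_float = np.array([1, 2.5])
print(int_and_float.dtype)  # float64

# Mixing a number with text gives a Unicode string dtype (kind 'U').
number_and_text = np.array([1, "three"])
print(number_and_text.dtype.kind)  # U
```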

In [3]:
# First we need to import numpy

import numpy as np

## Creating arrays in NumPy

In [9]:
# One-dimensional array

In [4]:
my_list = [1, 2, 3, 4, 5]
arr = np.array(my_list)

In [5]:
type(arr)

numpy.ndarray

In [6]:
print(arr.dtype)

int64


In [7]:
arr

array([1, 2, 3, 4, 5])

In [8]:
arr.shape

(5,)

In [10]:
# Two-dimensional array

my_list1 = [1, 2, 5, 6, 8]
my_list2 = [2, 5, 8, 3, 9]
my_list3 = [6, 4, 9, 8, 3]

arr_2d = np.array([my_list1, my_list2, my_list3])

In [11]:
arr_2d

array([[1, 2, 5, 6, 8],
       [2, 5, 8, 3, 9],
       [6, 4, 9, 8, 3]])

In [12]:
arr_2d.shape

(3, 5)

## Reshaping a NumPy array

In [13]:
arr_2d.reshape(5, 3)

array([[1, 2, 5],
       [6, 8, 2],
       [5, 8, 3],
       [9, 6, 4],
       [9, 8, 3]])

In [14]:
arr_2d.shape

(3, 5)
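
As the unchanged shape above suggests, `reshape` returns a new array object and leaves the original as it was. The sketch below uses an illustrative array named `grid` (not from the notebook) to show this. It also shows `-1`, which asks NumPy to infer one dimension, and the error raised when the sizes do not match.

```python
import numpy as np

grid = np.arange(15).reshape(3, 5)

# reshape returns a new array object; grid keeps its own shape.
reshaped = grid.reshape(5, 3)
print(grid.shape, reshaped.shape)  # (3, 5) (5, 3)

# -1 lets NumPy infer that dimension from the total size (15 / 3 = 5).
inferred = grid.reshape(-1, 3)
print(inferred.shape)  # (5, 3)

# The total number of elements must match, otherwise a ValueError is raised.
try:
    grid.reshape(4, 4)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("cannot reshape:", error)
```

To keep the new shape, assign the result to a name, for example `grid = grid.reshape(5, 3)`.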

## Indexing

In [16]:
print(arr[3])

4


In [17]:
arr[3:]

array([4, 5])

In [19]:
## arr_2d[rows, columns]
# The stop index of each slice is excluded, so this selects rows 0 and 1 and columns 2 and 3.
arr_2d[0:2, 2:4]

array([[5, 6],
       [8, 3]])
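
A few more indexing patterns on the same data, as a short sketch. The array is rebuilt here so the block runs on its own.

```python
import numpy as np

arr_2d = np.array([[1, 2, 5, 6, 8],
                   [2, 5, 8, 3, 9],
                   [6, 4, 9, 8, 3]])

single = arr_2d[1, 2]       # element at row 1, column 2
row = arr_2d[1]             # whole row 1
column = arr_2d[:, 0]       # whole column 0
last = arr_2d[-1, -1]       # negative indices count from the end

print(single)  # 8
print(row)     # [2 5 8 3 9]
print(column)  # [1 2 6]
print(last)    # 3
```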

## Special functions

In [20]:
## arange: creates 1d arrays
'''
Provide a start value and a stop value (excluded). The step is optional and defaults to 1.
If you provide a step, consecutive values are spaced by that amount.
'''
ar = np.arange(0, 10)

In [21]:
ar

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [22]:
ar1 = np.arange(0, 10, step=2)
ar1

array([0, 2, 4, 6, 8])
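
Two further variations of `np.arange`, sketched with illustrative names: the start can be left out (it then defaults to 0), and the step can be negative to count down.

```python
import numpy as np

# With a single argument, that argument is the stop and the start is 0.
from_zero = np.arange(5)
print(from_zero)  # [0 1 2 3 4]

# A negative step counts down; the stop value (0) is still excluded.
descending = np.arange(10, 0, -3)
print(descending)  # [10  7  4  1]
```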

In [24]:
## linspace: creates a 1D array of `num` evenly spaced points from start to stop (both included by default).

ar = np.linspace(0, 10, num=50)
ar

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])
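
A smaller case makes the spacing easier to check by eye. This sketch also uses two optional parameters of `np.linspace`: `retstep=True` returns the spacing along with the array, and `endpoint=False` leaves out the stop value.

```python
import numpy as np

# Five points from 0 to 1, both ends included; step is the spacing between them.
points, step = np.linspace(0, 1, num=5, retstep=True)
print(points)  # [0.   0.25 0.5  0.75 1.  ]
print(step)    # 0.25

# With endpoint=False the stop value is excluded, as it is in arange.
half_open = np.linspace(0, 1, num=5, endpoint=False)
print(half_open)  # [0.  0.2 0.4 0.6 0.8]
```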

## Copy function and broadcasting

In [25]:
'''
Let's say you want to create a copy of an array, so you create a new name and assign the old array to it.
What happens in that case? No new array is created: both names refer to the same array. Let's see what that means.
'''
arr1 = arr
print(arr)

[1 2 3 4 5]


In [26]:
arr1 

array([1, 2, 3, 4, 5])

In [27]:
arr1[2:] = 500

In [28]:
arr1

array([  1,   2, 500, 500, 500])

In [29]:
arr

array([  1,   2, 500, 500, 500])

In [35]:
'''
As you can see, a change made through one name shows up under the other, because both names refer to the same array.
So what should we do? Use copy(). The copy() method copies the values into a new array instead of just referencing the original.
'''

arr1 = arr.copy()
arr

array([  1,   2, 500, 500, 500])

In [31]:
arr1

array([  1,   2, 500, 500, 500])

In [32]:
arr1[2:] = 1200

In [33]:
arr1

array([   1,    2, 1200, 1200, 1200])

In [34]:
arr

array([  1,   2, 500, 500, 500])
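
One way to check which arrays share data is `np.shares_memory`. The sketch below uses illustrative names and compares a second name, a slice, and a copy. A slice is a new array object, but it is a view onto the same data, so writing through it also changes the original.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])
alias = original               # second name for the same object
view = original[2:]            # slice: new object, same underlying data
independent = original.copy()  # new object with its own data

print(alias is original)                        # True
print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False

# Writing through the view changes the original but not the copy.
view[:] = 0
print(original)     # [1 2 0 0 0]
print(independent)  # [1 2 3 4 5]
```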

In [36]:
'''
Boolean conditions are very useful in Exploratory Data Analysis. A comparison is applied element-wise and returns a Boolean array.
'''

val = 3

arr<val

array([ True,  True, False, False, False])

In [37]:
arr*2

array([   2,    4, 1000, 1000, 1000])

In [38]:
arr/2

array([  0.5,   1. , 250. , 250. , 250. ])
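
Multiplying or dividing an array by a scalar is the simplest case of broadcasting: the scalar is applied to every element. The same rule lets arrays of different but compatible shapes be combined. A minimal sketch with illustrative arrays:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])       # shape (2, 3)
row = np.array([10, 20, 30])         # shape (3,)
column = np.array([[100], [200]])    # shape (2, 1)

# The row is added to every row of the matrix.
plus_row = matrix + row
print(plus_row)
# [[11 22 33]
#  [14 25 36]]

# The column is added to every column of the matrix.
plus_column = matrix + column
print(plus_column)
# [[101 102 103]
#  [204 205 206]]
```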

In [39]:
arr[arr<val]

array([1, 2])
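
Conditions can also be combined. The sketch below uses an illustrative array with the same values as `arr`. It joins two comparisons with `&` (the parentheses are required) and uses `np.where` to replace the values that match a condition.

```python
import numpy as np

values = np.array([1, 2, 500, 500, 500])

# Element-wise AND of two conditions; each comparison needs its own parentheses.
in_range = (values > 1) & (values < 600)
print(in_range)          # [False  True  True  True  True]
print(values[in_range])  # [  2 500 500 500]

# Summing a Boolean array counts the True entries.
match_count = in_range.sum()
print(match_count)  # 4

# np.where(condition, a, b) picks a where the condition holds, b elsewhere.
capped = np.where(values > 100, 100, values)
print(capped)  # [  1   2 100 100 100]
```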

In [40]:
np.arange(0,10).reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

## Inbuilt functions

In [41]:
## np.ones() is an inbuilt function that creates an array filled with 1s, which are floats by default.
# If you want the 1s to be ints, you need to specify the dtype explicitly.
np.ones(4)

array([1., 1., 1., 1.])

In [43]:
np.ones(4, dtype=int)

array([1, 1, 1, 1])

In [44]:
np.ones((2, 5), dtype=int)

array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])
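
NumPy has similar constructors for other fill values. The sketch below shows `np.zeros`, `np.full`, and `np.eye`; the variable names are illustrative.

```python
import numpy as np

# Like np.ones, np.zeros produces floats unless a dtype is given.
zeros = np.zeros((2, 3))
print(zeros)
# [[0. 0. 0.]
#  [0. 0. 0.]]

# np.full fills the given shape with any constant value.
sevens = np.full((2, 2), 7)
print(sevens)
# [[7 7]
#  [7 7]]

# np.eye builds an identity matrix: 1s on the diagonal, 0s elsewhere.
identity = np.eye(3, dtype=int)
print(identity)
# [[1 0 0]
#  [0 1 0]
#  [0 0 1]]
```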

In [46]:
## Random distribution
# np.random.rand() draws values from a uniform distribution over [0, 1) and returns an array of the given shape.
np.random.rand(3, 3)

array([[0.34079114, 0.6790605 , 0.11228975],
       [0.96766568, 0.08495584, 0.47906187],
       [0.8282118 , 0.94756739, 0.57778538]])

In [48]:
## np.random.randn() draws values from the standard normal distribution.
arr_std = np.random.randn(4, 4)

In [49]:
arr_std

array([[-1.19489721, -0.05525238, -0.80578797,  0.20086589],
       [-2.47931336, -0.75411959, -0.50473663,  0.28359155],
       [ 0.22993987, -0.7930691 ,  0.79487114,  0.21091136],
       [ 0.73220139,  0.95100436, -0.60899093, -0.35420536]])

In [52]:
## np.random.randint()
# The call below draws 8 random integers from 0 (included) to 100 (excluded).
np.random.randint(0, 100, 8)

array([47, 32, 18, 61, 81,  3,  0, 91])

In [53]:
# We can reshape it as well
np.random.randint(0, 100, 8).reshape(4, 2)

array([[44, 37],
       [75,  2],
       [82, 20],
       [58, 22]])
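
The random outputs above differ on every run. If reproducible values are needed, one option is a seeded generator from `np.random.default_rng`, available in NumPy 1.17 and later. This is an alternative to the `np.random.rand`, `randn`, and `randint` calls used in this notebook, not what the notebook itself does. The seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))             # uniform over [0, 1), like np.random.rand(3, 3)
normal = rng.standard_normal((4, 4))     # standard normal, like np.random.randn(4, 4)
integers = rng.integers(0, 100, size=8)  # 0 included, 100 excluded, like np.random.randint

# A second generator with the same seed reproduces the same first draw.
repeat = np.random.default_rng(seed=42).random((3, 3))
print(np.array_equal(uniform, repeat))  # True
```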