## What is numpy?
- NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.

- At the core of the NumPy package is the ndarray object, which encapsulates n-dimensional arrays of homogeneous data types.

## Numpy Arrays Vs Python Sequences
1. NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.

2. The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory.

3. NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.

4. A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays.
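
The differences listed above can be seen in a small comparison. This is only a minimal sketch; the names `prices`, `arr` and `mixed` are made up for illustration and are not part of NumPy.

```python
import numpy as np

prices = [10, 20, 30]          # plain Python list (hypothetical data)
arr = np.array(prices)         # ndarray with a single dtype for all elements

# A list needs an explicit loop (or comprehension) for element-wise math.
doubled_list = [p * 2 for p in prices]

# An ndarray applies the operation to every element at once.
doubled_arr = arr * 2

# Mixing ints and floats upcasts every element to one common dtype.
mixed = np.array([1, 2.5, 3])

print(doubled_list)   # [20, 40, 60]
print(doubled_arr)    # [20 40 60]
print(mixed.dtype)    # float64
```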

Around `2008`, most data science work was done in the R programming language and Matlab, because Python was considered a slow programming language (it still is to this date).

- NumPy addressed this: it is a library whose core is written in C, which is one of the reasons it is very fast.
- That C core is wrapped in a Python interface, since Python is very easy to learn.
- After that, people started using Python for machine learning and data science.

# Creation of numpy arrays
- To create a NumPy array, we pass a Python list to the np.array() function.
- The list can be nested to any depth; each level of nesting adds one dimension.

In [1]:
import numpy as np

In [2]:
l = [1, 2, 3]

np.array(l)

array([1, 2, 3])

In [3]:
mat = [
        [1,2],
        [3,4]
    ]

In [4]:
np.array(mat)

array([[1, 2],
       [3, 4]])

In [5]:
tensor = [
            # first 2D array             
            [
                [1,2], 
                [3,4]
            ], 
            # second 2D array 
            [
                [11, 12],
                [34, 56]
            ],
            # third 2D array
            [
                [-1, -5],
                [-56, -68]
            ]
        ]

In [6]:
array = np.array(tensor)
array

array([[[  1,   2],
        [  3,   4]],

       [[ 11,  12],
        [ 34,  56]],

       [[ -1,  -5],
        [-56, -68]]])

In [7]:
array.ndim

3

- We can also create NumPy arrays using the np.arange() function, which is similar to Python's range() function.
- It takes up to 3 parameters: start, stop and step (jump). The stop value is excluded; start defaults to 0 and step defaults to 1.

In [8]:
np.arange(1, 20, 3) # 1, 4, 7, 10, 13, 16, 19

array([ 1,  4,  7, 10, 13, 16, 19])

#### Note
- It is very common practice in data science to first create an array using the arange() function and then reshape it as needed.
- The reshape() method expects a tuple giving the shape we want to convert the array into.
- Reshaping is only possible when the number of elements matches. For example, you cannot reshape a NumPy array of 12 elements into shape (7, 2), which would need 14 elements.
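
The element-count rule can be checked directly. The sketch below also uses `-1`, which asks NumPy to work out one dimension on its own; the variable names are illustrative only.

```python
import numpy as np

a = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4).
b = a.reshape((3, -1))
print(b.shape)  # (3, 4)

# 12 elements cannot fill a (7, 2) array, so NumPy raises ValueError.
try:
    a.reshape((7, 2))
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```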

In [10]:
np.arange(1,26).reshape((5,5))

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15],
       [16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25]])

In [11]:
np.arange(1,13).reshape((6,2))

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [ 7,  8],
       [ 9, 10],
       [11, 12]])

In [12]:
np.arange(1,13).reshape((3,4))

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

- We can also specify the data type of the numpy array.

In [2]:
np.array([1, 2, 3, 4, 5], dtype = float)

array([1., 2., 3., 4., 5.])

In [3]:
np.array([1, 2, 3, 4, 5], dtype = bool)

array([ True,  True,  True,  True,  True])

# np.ones() and np.zeros()

- np.ones() and np.zeros() expect us to pass in the shape of the required matrix.
- They are used to initialize a matrix. For example, when we need to give initial values to the weights of a neural network, we can use np.ones().

In [4]:
np.ones((3, 2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])
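
np.zeros() works the same way as np.ones(). A short sketch (variable names are illustrative) that also shows the optional `dtype` argument both functions accept:

```python
import numpy as np

z = np.zeros((2, 3))               # float zeros by default
o = np.ones((2, 2), dtype=int)     # the dtype can be overridden

print(z)
print(z.dtype)   # float64
print(o)
```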

# np.random.random()
- This also expects us to pass the shape of the array.
- It returns an array in which every element is a random float drawn uniformly from the half-open interval [0, 1).

In [5]:
np.random.random((3,4))

array([[0.59285804, 0.07746648, 0.25794081, 0.04058932],
       [0.3285476 , 0.69953381, 0.15488714, 0.46204566],
       [0.56683216, 0.77841539, 0.1339179 , 0.533695  ]])
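
The numbers above change on every run. If reproducible values are needed, one option is NumPy's `Generator` API, sketched below. It assumes NumPy 1.17 or newer, and the seed value 42 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value 42 is arbitrary
sample = rng.random((3, 4))            # uniform floats in [0, 1)

# Re-creating the generator with the same seed reproduces the same numbers.
again = np.random.default_rng(seed=42).random((3, 4))
print(np.array_equal(sample, again))   # True
```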

# np.linspace()

In [6]:
np.linspace(-10, 10, 24) # gives us an array of 24 evenly spaced elements between -10 and 10 (both endpoints included)

array([-10.        ,  -9.13043478,  -8.26086957,  -7.39130435,
        -6.52173913,  -5.65217391,  -4.7826087 ,  -3.91304348,
        -3.04347826,  -2.17391304,  -1.30434783,  -0.43478261,
         0.43478261,   1.30434783,   2.17391304,   3.04347826,
         3.91304348,   4.7826087 ,   5.65217391,   6.52173913,
         7.39130435,   8.26086957,   9.13043478,  10.        ])

In [7]:
np.linspace(-10, 10, 2)

array([-10.,  10.])
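
Two optional arguments of np.linspace() are worth knowing: `retstep` and `endpoint`. A minimal sketch with made-up variable names:

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(0, 1, 5, retstep=True)
print(values)  # [0.   0.25 0.5  0.75 1.  ]
print(step)    # 0.25

# endpoint=False leaves the stop value out.
half_open = np.linspace(0, 1, 4, endpoint=False)
print(half_open)  # [0.   0.25 0.5  0.75]
```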

# np.identity()

In [8]:
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [13]:
np.identity(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [12]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [15]:
np.eye(3,5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.]])
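
As the outputs above suggest, np.identity() always builds a square matrix, while np.eye() can also take a column count. np.eye() additionally accepts a diagonal offset `k`, sketched here:

```python
import numpy as np

upper = np.eye(3, k=1)    # ones on the first diagonal above the main one
print(upper)
# [[0. 1. 0.]
#  [0. 0. 1.]
#  [0. 0. 0.]]

# For a square matrix with the default k=0, np.eye and np.identity agree.
print(np.array_equal(np.eye(4), np.identity(4)))  # True
```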

# Attributes

In [16]:
a1 = np.arange(10) # vector
a2 = np.arange(12, dtype = float).reshape((3,4)) # matrix
a3 = np.arange(8).reshape((2,2,2)) # tensor

In [17]:
a1.ndim, a2.ndim, a3.ndim # ndim gives us the number of dimensions

(1, 2, 3)

In [18]:
# shape
print(a1.shape)
print(a2.shape)
print(a3.shape)

# The length of shape tuple is equal to ndim

(10,)
(3, 4)
(2, 2, 2)


In [19]:
# size
# gives us the number of items
print(a1.size)
print(a2.size)
print(a3.size)

10
12
8


In [20]:
# itemsize
# gives us the memory occupied by each item of the ndarray in bytes
# (the default integer type is platform dependent; it is int32 here, so 4 bytes)
print(a1.itemsize)
print(a2.itemsize)
print(a3.itemsize)

4
8
4
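
The total memory used by the array data is `size * itemsize`, which NumPy exposes as `nbytes`. A small sketch, using an explicit dtype so the result does not depend on the platform default:

```python
import numpy as np

# An explicit dtype makes the item size independent of the platform default.
x = np.arange(10, dtype=np.int64)

print(x.itemsize)                        # 8
print(x.nbytes)                          # 80
print(x.nbytes == x.size * x.itemsize)   # True
```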


### Changing the data type of an ndarray
- Sometimes we may need to change the default data type of a NumPy array. In this case we use the array's astype() method.

In [21]:
ages = np.array([10, 20, 30, 45, 67])

In [22]:
ages

array([10, 20, 30, 45, 67])

In [23]:
ages.dtype

dtype('int32')

We do not need a 32-bit integer to store ages.
So we will convert the array to an 8-bit unsigned integer (uint8), which holds values from 0 to 255.

In [26]:
ages = ages.astype(np.uint8)
ages

array([10, 20, 30, 45, 67], dtype=uint8)

In [27]:
ages.dtype

dtype('uint8')
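
Before casting to a smaller type it helps to check that every value fits, because out-of-range integers wrap around silently. One possible way to do that check with `np.iinfo` is sketched below; the name `small` is illustrative.

```python
import numpy as np

ages = np.array([10, 20, 30, 45, 67])

# np.iinfo reports the range an integer dtype can represent.
limits = np.iinfo(np.uint8)
print(limits.min, limits.max)   # 0 255

# Only cast when every value fits (true for this data).
if ages.min() >= limits.min and ages.max() <= limits.max:
    small = ages.astype(np.uint8)

print(small.nbytes)   # 5, one byte per element
```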

# Array operations

- arithmetic operations (+, -, *, /, //, **)
- relational operations (==, > , >= , <, <= , !=)
- vector operations (adding, subtracting, multiplying and dividing two arrays element-wise)

In [29]:
a1 = np.array([1,2,3,4])
a1

array([1, 2, 3, 4])

In [33]:
print(a1 + 5)
print(a1 - 5)
print(a1 * 5)
print(a1 / 5)
print(a1 // 5)
print(a1 ** 5)

[6 7 8 9]
[-4 -3 -2 -1]
[ 5 10 15 20]
[0.2 0.4 0.6 0.8]
[0 0 0 0]
[   1   32  243 1024]


In [34]:
print(a1 == 2)
print(a1 != 45)
print(a1 > 3)
print(a1 >= 3)
print(a1 < 45)
print(a1 <= 1)

[False  True False False]
[ True  True  True  True]
[False False False  True]
[False False  True  True]
[ True  True  True  True]
[ True False False False]
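
The cells above combine an array with a scalar. Vector operations between two arrays of the same shape work element by element, as this short sketch (with made-up arrays `x` and `y`) shows:

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([10, 20, 30, 40])

print(x + y)   # [11 22 33 44]
print(y - x)   # [ 9 18 27 36]
print(x * y)   # [ 10  40  90 160]
print(y / x)   # [10. 10. 10. 10.]
```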


# Array functions

In [37]:
a = np.random.random((3,4))
a = np.round(a * 100)
a = a.astype(np.uint8)
a

array([[25, 40, 25, 80],
       [34, 44, 47, 54],
       [58, 80, 21,  8]], dtype=uint8)

In [38]:
# math functions
print(np.sum(a))
print(np.min(a))
print(np.max(a))
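print(np.prod(a)) # the true product is too large for the 32-bit accumulator used here, so the printed value has wrapped around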

516
8
80
3025666048


In [39]:
# trigonometric functions (values are interpreted as radians)
np.sin(a)

array([[-0.1323,  0.745 , -0.1323, -0.9937],
       [ 0.5293,  0.0177,  0.1236, -0.5586],
       [ 0.9927, -0.9937,  0.8364,  0.9893]], dtype=float16)

In [40]:
# statistical functions
print(np.std(a))
print(np.mean(a))
print(np.median(a))
print(np.var(a))

21.501937897160182
43.0
42.0
462.3333333333333


In [41]:
# we can also pass in the axis along which we want to perform the operation
np.sum(a, axis = 0) # axis = 0 collapses the rows, giving one sum per column

array([117, 164,  93, 142], dtype=uint32)

In [42]:
a

array([[25, 40, 25, 80],
       [34, 44, 47, 54],
       [58, 80, 21,  8]], dtype=uint8)

In [43]:
np.min(a, axis = 1) # axis = 1 collapses the columns, giving one minimum per row

array([25, 34,  8], dtype=uint8)
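
The axis argument is easier to follow on a tiny array. In this sketch (the array `m` is made up), the axis passed is the one that disappears from the result's shape:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])      # shape (2, 3)

col_sums = np.sum(m, axis=0)   # collapses the 2 rows -> shape (3,)
row_sums = np.sum(m, axis=1)   # collapses the 3 columns -> shape (2,)

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```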

- We also have functions which we can apply between 2 vectors or matrices.
- For example, np.dot() computes the dot product; for two 2-D arrays this is matrix multiplication.

In [46]:
a = np.arange(12).reshape((3,4))
b = np.arange(12, 24).reshape((4,3))
a, b

(array([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]]),
 array([[12, 13, 14],
        [15, 16, 17],
        [18, 19, 20],
        [21, 22, 23]]))

In [47]:
np.dot(a, b) # multiplied matrix a with matrix b

array([[114, 120, 126],
       [378, 400, 422],
       [642, 680, 718]])
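
For 2-D arrays the same product can be written with the `@` operator. The sketch below reuses the shapes from the cells above and contrasts `@` with `*`, which is element-wise:

```python
import numpy as np

a = np.arange(12).reshape((3, 4))
b = np.arange(12, 24).reshape((4, 3))

# For 2-D arrays, the @ operator and np.dot give the same result.
product = a @ b
print(product.shape)                           # (3, 3)
print(np.array_equal(product, np.dot(a, b)))   # True

# The * operator is element-wise, not matrix multiplication.
print((a * a).shape)                           # (3, 4)
```

The inner dimensions must match: a (3, 4) array can be multiplied with a (4, 3) array, giving a (3, 3) result.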

- We also have logarithmic and exponential functions.

In [48]:
np.exp(a)

array([[1.00000000e+00, 2.71828183e+00, 7.38905610e+00, 2.00855369e+01],
       [5.45981500e+01, 1.48413159e+02, 4.03428793e+02, 1.09663316e+03],
       [2.98095799e+03, 8.10308393e+03, 2.20264658e+04, 5.98741417e+04]])

In [51]:
np.log(a) # a[0, 0] is 0, so NumPy emits a RuntimeWarning and returns -inf for that entry


array([[      -inf, 0.        , 0.69314718, 1.09861229],
       [1.38629436, 1.60943791, 1.79175947, 1.94591015],
       [2.07944154, 2.19722458, 2.30258509, 2.39789527]])

- Rounding functions: round, floor, ceil

In [55]:
a = np.random.random((2,3)) * 100
a

array([[15.66873756, 42.1544118 , 10.19257288],
       [55.90902418, 68.16800253, 23.41172917]])

In [56]:
np.floor(a)

array([[15., 42., 10.],
       [55., 68., 23.]])

In [57]:
np.ceil(a)

array([[16., 43., 11.],
       [56., 69., 24.]])

In [58]:
np.round(a)

array([[16., 42., 10.],
       [56., 68., 23.]])
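
np.round() also takes the number of decimals to keep. The sketch below shows that, along with one behaviour that can surprise: values exactly halfway between two integers are rounded to the nearest even integer. Variable names are illustrative.

```python
import numpy as np

vals = np.array([15.66873756, 42.1544118])
two_decimals = np.round(vals, 2)
print(two_decimals)   # [15.67 42.15]

# Values exactly halfway between two integers round to the nearest even one.
halves = np.round(np.array([0.5, 1.5, 2.5]))
print(halves)         # [0. 2. 2.]
```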

# Indexing and slicing

In [59]:
a1 = np.arange(1, 10)
a1

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [60]:
a1[0], a1[-2]

(1, 8)

In [61]:
a2 = np.arange(12).reshape((3,4))
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [62]:
a2[1, 2] # a2[row_number, col_number]

6

In [63]:
a3 = np.arange(8).reshape((2,2,2))
a3

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [64]:
a3[1,0,1]

5

In [65]:
a3[0,1,0]

2

In [66]:
# fetch 0
a3[0, 0, 0]

0

- Slicing

In [69]:
a1 = np.arange(5,11)

In [70]:
a1

array([ 5,  6,  7,  8,  9, 10])

In [71]:
a1[2:5]

array([7, 8, 9])

In [84]:
a2 = np.arange(12).reshape((3,4))

In [85]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [86]:
a2[2:, 0:3]
# all rows from index 2 (the third row) onward and
# columns at index 0 through 2 (the stop index 3 is excluded)

array([[ 8,  9, 10]])

In [87]:
# give me the 3rd column
a2[::, 2]

array([ 2,  6, 10])

In [88]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [89]:
# i want 5, 6, 9, 10
a2[1:, 1:3]

array([[ 5,  6],
       [ 9, 10]])

In [90]:
# give me 0, 3, 8, 11 -- all corners
a2[0::2, 0::3]

array([[ 0,  3],
       [ 8, 11]])

In [91]:
# give me 1, 3, 9, 11
a2[::2, 1::2]

array([[ 1,  3],
       [ 9, 11]])

    Give the following output
        [
            [1,2,3],
            [5,6,7]
        ]


In [92]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [93]:
a2[:2, 1::]

array([[1, 2, 3],
       [5, 6, 7]])
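
One thing to keep in mind with slicing: a basic slice is a view of the original array, not a copy, so writing to the slice changes the original. A minimal sketch (the names `view` and `independent` are illustrative):

```python
import numpy as np

a2 = np.arange(12).reshape((3, 4))

view = a2[:2, 1:]        # basic slicing returns a view
view[0, 0] = 100         # writes through to a2[0, 1]
print(a2[0, 1])          # 100

independent = a2[:2, 1:].copy()   # .copy() gives separate storage
independent[0, 0] = -1
print(a2[0, 1])          # still 100
```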

# Mega question -- indexing and slicing

In [94]:
a3 = np.arange(27).reshape((3,3,3))
a3

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

    Q1. Give the following output
    [
        [ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]
    ]

In [95]:
a3[1]

array([[ 9, 10, 11],
       [12, 13, 14],
       [15, 16, 17]])

    Give the first and last 2D arrays

In [96]:
a3[0::2] # or a3[::2]

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

    Give the second row of first 2D array of a3

In [97]:
a3[0,1,::]

array([3, 4, 5])

    Give me the following output
    [
        [22, 23],
        [25, 26]
    ]

In [98]:
a3

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

In [99]:
a3[2, 1::, 1::]

array([[22, 23],
       [25, 26]])

    Q. Give the following output
    [
        [0, 2],
        [18, 20]
    ]

In [100]:
a3[::2, 0, ::2]

array([[ 0,  2],
       [18, 20]])

# Iterating over numpy arrays

In [103]:
a1 = np.arange(12)
a2 = np.arange(12).reshape((3,4))
a3 = np.arange(24).reshape((3,4,2))

In [105]:
for i in a1:
    print(i, end = ' ')

0 1 2 3 4 5 6 7 8 9 10 11 

In [106]:
for i in a2:
    print(i, end = ' ')

[0 1 2 3] [4 5 6 7] [ 8  9 10 11] 

In [113]:
for i in a3:
    print(i,end = "\n=================\n")

[[0 1]
 [2 3]
 [4 5]
 [6 7]]
[[ 8  9]
 [10 11]
 [12 13]
 [14 15]]
[[16 17]
 [18 19]
 [20 21]
 [22 23]]


    To iterate over each and every element of the array, we use np.nditer().

In [114]:
for i in np.nditer(a3):
    print(i, end = ' ')

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 

In [117]:
for i in np.nditer(a2):
    print(i, end = ' ')

0 1 2 3 4 5 6 7 8 9 10 11 
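
When the position of each element is needed as well as its value, np.ndenumerate() is one option. A short sketch on a small made-up array:

```python
import numpy as np

a2 = np.arange(6).reshape((2, 3))

# np.ndenumerate yields (index_tuple, value) pairs in row-major order.
pairs = [(idx, int(val)) for idx, val in np.ndenumerate(a2)]
print(pairs[:3])   # [((0, 0), 0), ((0, 1), 1), ((0, 2), 2)]
```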

# Reshaping an array

In [118]:
# reshape

In [119]:
# transpose
a2 = np.arange(12).reshape((3,4))
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [120]:
np.transpose(a2)

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [121]:
a2.T

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [122]:
# ravel -- it is used to flatten an n-dimensional array into a 1-D array

In [123]:
a2.ravel()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
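
NumPy also has a flatten() method that looks identical to ravel(). The difference is in memory: ravel() returns a view when it can, while flatten() always copies. A sketch using `np.shares_memory` to check:

```python
import numpy as np

a2 = np.arange(12).reshape((3, 4))

r = a2.ravel()     # returns a view of the same data when possible
f = a2.flatten()   # always returns a copy

print(np.shares_memory(a2, r))   # True
print(np.shares_memory(a2, f))   # False
```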

# stacking

In [124]:
# horizontal stacking (the arrays must have the same number of rows)

In [126]:
a1 = np.arange(12).reshape((3,4))
a1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [129]:
a2 = np.arange(12, 27).reshape((3,5))
a2

array([[12, 13, 14, 15, 16],
       [17, 18, 19, 20, 21],
       [22, 23, 24, 25, 26]])

In [130]:
np.hstack((a1, a2))

array([[ 0,  1,  2,  3, 12, 13, 14, 15, 16],
       [ 4,  5,  6,  7, 17, 18, 19, 20, 21],
       [ 8,  9, 10, 11, 22, 23, 24, 25, 26]])

In [131]:
# vertical stacking (the arrays must have the same number of columns)

In [132]:
b1 = np.arange(4).reshape((2,2))
b1

array([[0, 1],
       [2, 3]])

In [134]:
b2 = np.arange(4, 16).reshape((6,2))
b2

array([[ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])

In [135]:
np.vstack((b1, b2))

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15]])
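
For 2-D arrays, both kinds of stacking can also be written with np.concatenate() and an axis. The sketch below reuses the shapes from the horizontal stacking cells and shows what happens when the shapes do not line up:

```python
import numpy as np

a1 = np.arange(12).reshape((3, 4))
a2 = np.arange(12, 27).reshape((3, 5))

# axis=1 joins side by side (like hstack); axis=0 joins top to bottom (like vstack).
side_by_side = np.concatenate((a1, a2), axis=1)
print(side_by_side.shape)                                  # (3, 9)
print(np.array_equal(side_by_side, np.hstack((a1, a2))))   # True

# a1 has 4 columns and a2 has 5, so vertical stacking raises ValueError.
try:
    np.vstack((a1, a2))
    stacked = True
except ValueError:
    stacked = False
print(stacked)   # False
```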

# hsplit and vsplit

- These functions are used to split a NumPy array into equal parts; a ValueError is raised if the array cannot be divided equally.

- The parts are returned in a list.

In [136]:
a = np.arange(1, 25).reshape((6,4))
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])

In [138]:
part_one, part_two = np.hsplit(a, 2)

In [139]:
part_one

array([[ 1,  2],
       [ 5,  6],
       [ 9, 10],
       [13, 14],
       [17, 18],
       [21, 22]])

In [140]:
part_two

array([[ 3,  4],
       [ 7,  8],
       [11, 12],
       [15, 16],
       [19, 20],
       [23, 24]])

    Similarly, we can split the array vertically into equal parts.

In [141]:
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20],
       [21, 22, 23, 24]])

In [142]:
x, y, z = np.vsplit(a, 3)

In [143]:
x

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [144]:
y

array([[ 9, 10, 11, 12],
       [13, 14, 15, 16]])

In [145]:
z

array([[17, 18, 19, 20],
       [21, 22, 23, 24]])
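
Besides a number of equal parts, hsplit() and vsplit() also accept a list of indices at which to cut. The sketch below shows that form and the error raised by an unequal split; the names `left`, `right` and `ok` are illustrative.

```python
import numpy as np

a = np.arange(1, 25).reshape((6, 4))

# A list of indices splits at those column positions instead of into equal parts.
left, right = np.hsplit(a, [1])
print(left.shape, right.shape)   # (6, 1) (6, 3)

# 4 columns cannot be divided into 3 equal parts, so this raises ValueError.
try:
    np.hsplit(a, 3)
    ok = True
except ValueError:
    ok = False
print(ok)   # False
```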