## Creating NumPy Arrays

Creating a NumPy array from a Python list


In [None]:
import numpy as np

In [4]:
lst = [2,4,2,5,2]
narray = np.array(lst)


Unlike a Python list, a NumPy array holds elements of a single data type. If the input values have mixed types, NumPy casts them to a common type.

In [6]:
a = np.array([1.2,3,5,3,5])
print(a)
print(a.dtype)

[1.2 3.  5.  3.  5. ]
float64


We can also set the data type of a NumPy array explicitly

In [8]:
a = np.array([2,3,4,5.6], dtype='int32')
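
The cell above shows no output, so here is a small sketch of what the explicit cast does. It assumes NumPy's usual behavior of dropping the fractional part (not rounding) when floats are stored in an integer array; the names `a_int` and `a_float` are only illustrative.

```python
import numpy as np

# Requesting an integer dtype truncates the float 5.6 to 5.
a_int = np.array([2, 3, 4, 5.6], dtype='int32')
print(a_int)        # [2 3 4 5]
print(a_int.dtype)  # int32

# astype applies the same kind of cast to an existing array.
a_float = np.array([1.2, 3, 5])
print(a_float.astype('int32'))  # [1 3 5]
```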


Multidimensional NumPy array and its shape

In [12]:
l = [[1,2,4],[4,7,4]]
a = np.array(l)
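
The heading mentions the shape, but the cell does not print it. A minimal sketch of inspecting it, using the same nested list (the names `nested` and `arr` are illustrative):

```python
import numpy as np

nested = [[1, 2, 4], [4, 7, 4]]
arr = np.array(nested)

print(arr.shape)  # (2, 3): 2 rows, 3 columns
print(arr.ndim)   # 2 dimensions
print(arr.size)   # 6 elements in total
```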


Creating NumPy arrays with built-in utilities

In [20]:
a = np.zeros((3,3))
print(a,a.dtype)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]] float64


In [22]:
a = np.ones((2,2))
print(a)

[[1. 1.]
 [1. 1.]]


In [23]:
a = np.full((3,4), 2)
print(a)

[[2 2 2 2]
 [2 2 2 2]
 [2 2 2 2]]


In [24]:
# arange and linspace function
np.arange(0,10,2)

array([0, 2, 4, 6, 8])

In [26]:
np.linspace(1,5,3)

array([1., 3., 5.])
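
The two functions above differ in what their third argument means. The following sketch contrasts them; the `endpoint=False` call is an extra illustration that does not appear in the original cells.

```python
import numpy as np

# arange(start, stop, step): the third argument is the step, stop is excluded.
evens = np.arange(0, 10, 2)
print(evens.tolist())   # [0, 2, 4, 6, 8]

# linspace(start, stop, num): the third argument is the number of points,
# and stop is included by default.
points = np.linspace(1, 5, 3)
print(points.tolist())  # [1.0, 3.0, 5.0]

# endpoint=False leaves the stop value out.
open_points = np.linspace(0, 1, 4, endpoint=False)
print(open_points.tolist())  # [0.0, 0.25, 0.5, 0.75]
```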

In [30]:
# Random numbers
np.random.random((3,3))

array([[0.70380172, 0.2574642 , 0.31729126],
       [0.1343205 , 0.1268172 , 0.07086981],
       [0.70233214, 0.1660791 , 0.08978648]])

In [49]:
np.random.randint(1,5,(3,))

array([4, 2, 4])

In [36]:
# Create an uninitialized array (its contents are arbitrary)
np.empty((2,3))

array([[4.9e-324, 9.9e-324, 2.0e-323],
       [2.0e-323, 3.5e-323, 2.0e-323]])
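
The odd-looking values above are whatever happened to be in memory, because `np.empty` only allocates the array. A short sketch of the usual pattern, filling the array before reading it (the name `buf` is illustrative):

```python
import numpy as np

buf = np.empty((2, 3))  # allocated, but the values are not meaningful yet
buf[:] = 0.0            # assign every element before using the array
print(buf)
```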

## Data Manipulation

In [61]:
np.random.seed(0)

a1 = np.random.randint(5,size=5)
a2 = np.random.randint(5,size=(3,3))
a1

array([3, 3, 3, 0, 1])

In [62]:
#shape, ndim, size
print(a2.shape)
print(a2.ndim)
print(a2.size)

(3, 3)
2
9


Array indexing and slicing

In [64]:
a1

array([3, 3, 3, 0, 1])

In [65]:
a1[2]

3

In [66]:
a1[-1]

1

In [67]:
a2

array([[1, 1, 0],
       [2, 4, 3],
       [3, 2, 4]])

In [69]:
a2[1,0]

2

In [72]:
a2[:3,1:2]

array([[1],
       [4],
       [2]])

In [75]:
a2[:3,::3] # All rows, every third column

array([[1],
       [2],
       [3]])

In [76]:
a2[::-1] # Reverse the order of the rows

array([[3, 2, 4],
       [2, 4, 3],
       [1, 1, 0]])

Accessing rows and columns of an array

In [77]:
a2[:,2] # Third column

array([0, 3, 4])

In [78]:
a2[0,:] # First row

array([1, 1, 0])

## View of array

Slicing a NumPy array returns a view of the original array, whereas slicing a Python list returns a copy.

In [86]:
a3 = np.random.randint(5,size=(5,5))
a3

array([[4, 3, 4, 4, 4],
       [3, 4, 4, 4, 0],
       [4, 3, 2, 0, 1],
       [1, 3, 0, 0, 1],
       [2, 4, 2, 0, 3]])

In [90]:
a3_view = a3[2:,3:]
a3_view

array([[0, 1],
       [0, 1],
       [0, 3]])

If we modify this subarray, the change is also reflected in the original array

In [91]:
a3_view[2,1] = 10
a3_view

array([[ 0,  1],
       [ 0,  1],
       [ 0, 10]])

In [92]:
a3

array([[ 4,  3,  4,  4,  4],
       [ 3,  4,  4,  4,  0],
       [ 4,  3,  2,  0,  1],
       [ 1,  3,  0,  0,  1],
       [ 2,  4,  2,  0, 10]])

This behavior is useful when working with large datasets, because slicing does not copy the underlying data
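
To make the contrast with Python lists concrete, here is a minimal side-by-side sketch. The variable names are illustrative and not taken from the cells above.

```python
import numpy as np

# A list slice is a new list, so editing it leaves the original alone.
py_list = [0, 1, 2, 3, 4]
list_slice = py_list[1:3]
list_slice[0] = 99
print(py_list)  # [0, 1, 2, 3, 4]

# An array slice is a view, so editing it changes the original.
arr = np.arange(5)
arr_slice = arr[1:3]
arr_slice[0] = 99
print(arr)      # [ 0 99  2  3  4]
```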

## Copy of array

In [93]:
a3_copy = a3[:1,:3].copy()
a3_copy

array([[4, 3, 4]])

In [94]:
a3_copy[0,1] = 30
a3_copy

array([[ 4, 30,  4]])

In [95]:
a3

array([[ 4,  3,  4,  4,  4],
       [ 3,  4,  4,  4,  0],
       [ 4,  3,  2,  0,  1],
       [ 1,  3,  0,  0,  1],
       [ 2,  4,  2,  0, 10]])
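
One way to check whether two arrays are backed by the same data is `np.shares_memory`. The sketch below uses a small illustrative array (`grid`) rather than the random `a3` above.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)
view = grid[:1, :3]              # slice: shares data with grid
copied = grid[:1, :3].copy()     # explicit copy: independent data

print(np.shares_memory(grid, view))    # True
print(np.shares_memory(grid, copied))  # False

# Changing the copy leaves the original untouched.
copied[0, 1] = 30
print(grid[0, 1])  # 1
```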

## Reshaping array

Reshape from 1D to 2D

In [99]:
a1.shape

(5,)

In [101]:
a1_rs = a1.reshape((5,1))
a1_rs

array([[3],
       [3],
       [3],
       [0],
       [1]])

In [102]:
a2

array([[1, 1, 0],
       [2, 4, 3],
       [3, 2, 4]])

In [103]:
a2.shape

(3, 3)

In [109]:
a2_rs = a2.reshape((1,3,3))

In [110]:
a2_rs.shape

(1, 3, 3)

In [113]:
a2_rs[0,1,1]

4
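
Two related conveniences, sketched here as an addition to the cells above: passing `-1` lets NumPy infer one dimension, and `np.newaxis` adds a length-1 axis through indexing. The array `v` is illustrative.

```python
import numpy as np

v = np.arange(6)

# -1 asks NumPy to work out the missing dimension from the total size.
print(v.reshape(2, -1).shape)  # (2, 3)

# np.newaxis inserts an axis of length 1 at the given position.
print(v[:, np.newaxis].shape)  # (6, 1)
print(v[np.newaxis, :].shape)  # (1, 6)

# For a contiguous array like this one, reshape returns a view,
# so writes through the reshaped array show up in the original.
m = v.reshape(2, 3)
m[0, 0] = 100
print(v[0])  # 100
```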

## Concatenate and Split array

Concatenate

In [118]:
# Same dimension
x = np.array([1,2,3])
y = np.array([10,20,30])
np.concatenate([x,y])

array([ 1,  2,  3, 10, 20, 30])

In [124]:
x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])
np.concatenate([x,y], axis = 1)

array([[1, 2, 5, 6],
       [3, 4, 7, 8]])

In [139]:
# Different dimensions
x = np.array([1, 2, 3])
grid = np.array([[9, 8, 7],
                 [6, 5, 4]])

# vertically stack the arrays
np.vstack([x, grid])

array([[1, 2, 3],
       [9, 8, 7],
       [6, 5, 4]])

In [140]:
# horizontally stack the arrays
y = np.array([[99],
              [99]])
np.hstack([grid, y])

array([[ 9,  8,  7, 99],
       [ 6,  5,  4, 99]])

Splitting array

In [144]:
x = np.array([1,4,67,3,7,8,4,7,8,9])
x1,x2 = np.split(x,[4])
print(x1,x2)

[ 1  4 67  3] [7 8 4 7 8 9]


In [155]:
matrix = np.random.randint(1,10,(4,4))
matrix

array([[7, 9, 3, 4],
       [1, 1, 7, 1],
       [7, 4, 4, 9],
       [9, 9, 3, 4]])

In [156]:
l,r = np.hsplit(matrix,[2])
print('left',l)
print('right',r)

left [[7 9]
 [1 1]
 [7 4]
 [9 9]]
right [[3 4]
 [7 1]
 [4 9]
 [3 4]]


In [None]:
u,d = np.vsplit(matrix,[3])
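
The `vsplit` cell above was not run, so here is a sketch of what it returns. It uses a deterministic 4x4 array (`m`) instead of the random `matrix`, so the numbers differ from the earlier output.

```python
import numpy as np

m = np.arange(16).reshape(4, 4)

# Split along the rows at index 3: the first three rows, then the rest.
upper, lower = np.vsplit(m, [3])
print(upper.shape)  # (3, 4)
print(lower.shape)  # (1, 4)
print(lower)        # [[12 13 14 15]]
```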


## Broadcasting

Addition of two arrays

In [157]:
a = np.array([1,2,3])
b = np.array([5,4,3])
a+b

array([6, 6, 6])

In [158]:
b = 10
a+b

array([11, 12, 13])

In the above example, the value of ``b`` is broadcast over ``a``.
Another way to think of it: ``b`` is stretched to ``[10,10,10]`` and then added element-wise to ``a``.
This way of handling arrays of different sizes is called broadcasting in NumPy.
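
The mental picture of stretching the scalar can be made visible with `np.broadcast_to`. This is only an illustration: during `a + b` NumPy does not need to build the stretched array in memory.

```python
import numpy as np

a = np.array([1, 2, 3])
b = 10

# Show what b looks like once it is stretched to the shape of a.
b_stretched = np.broadcast_to(b, a.shape)
print(b_stretched)      # [10 10 10]
print(a + b_stretched)  # [11 12 13]
```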

In [163]:
# Broadcasting over multiple dimensions
matrix = np.full((3,3),4)
matrix

array([[4, 4, 4],
       [4, 4, 4],
       [4, 4, 4]])

In [161]:
a

array([1, 2, 3])

In [165]:
matrix + a

array([[5, 6, 7],
       [5, 6, 7],
       [5, 6, 7]])

Here the one-dimensional array ``a`` is broadcast across the rows to match the shape of the two-dimensional ``matrix``

In [170]:
array = np.arange(3)
matrix = np.arange(3).reshape(3,1)
print(array)
print(matrix)
print(array + matrix)

[0 1 2]
[[0]
 [1]
 [2]]
[[0 1 2]
 [1 2 3]
 [2 3 4]]


<img src="../Fresher%20Training/broadcasting.png">

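Example to understand the broadcasting rules

Rule 1: if the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its left side.
Rule 2: if the shapes differ in a dimension, the array whose size is 1 in that dimension is stretched to match the other.
Rule 3: if the sizes differ in any dimension and neither is 1, an error is raised.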

In [173]:
matrix = np.ones((2,3))
a = np.arange(3)

print(matrix)
print(a)
print(matrix.shape)
print(a.shape)

[[1. 1. 1.]
 [1. 1. 1.]]
[0 1 2]
(2, 3)
(3,)


According to rule 1, ``a`` has fewer dimensions, so its shape is padded on the left and becomes ``(1,3)``

The shapes still do not match, so by rule 2 ``a`` is stretched along its first dimension to match ``matrix``
The new shape of ``a`` is ``(2,3)``. Imagine ``a`` as ``[[0,1,2],[0,1,2]]``

In [174]:
matrix + a

array([[1., 2., 3.],
       [1., 2., 3.]])
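
The two steps described above can be reproduced by hand as a check. This sketch spells out the padding and stretching explicitly; NumPy does the equivalent internally without materializing the stretched array.

```python
import numpy as np

matrix = np.ones((2, 3))
a = np.arange(3)

# Rule 1: pad the shape of a on the left, (3,) -> (1, 3).
a_padded = a.reshape(1, 3)

# Rule 2: stretch the length-1 dimension, (1, 3) -> (2, 3).
a_stretched = np.broadcast_to(a_padded, (2, 3))
print(a_stretched)
# [[0 1 2]
#  [0 1 2]]

# The manual version gives the same result as plain broadcasting.
print(np.array_equal(matrix + a, matrix + a_stretched))  # True
```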

In [175]:
matrix = np.ones((3,2))

Rule 1 = The shape of ``a`` is padded to ``(1,3)``
Rule 2 = The first dimension of ``a`` is stretched to match ``matrix``, giving ``(3,3)``

The shapes ``(3,3)`` and ``(3,2)`` still disagree in the last dimension, and neither size is 1, so ``a`` and ``matrix`` cannot be broadcast together.

In [176]:
a + matrix

ValueError: operands could not be broadcast together with shapes (3,) (3,2) 
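
If the intent was to add one value of ``a`` to each row of ``matrix``, one possible fix is to turn ``a`` into a column of shape ``(3,1)``. The sketch below assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available for checking compatibility without doing the arithmetic.

```python
import numpy as np

matrix = np.ones((3, 2))
a = np.arange(3)

# (3,) and (3, 2) are incompatible: the trailing sizes 3 and 2 differ.
try:
    np.broadcast_shapes(a.shape, matrix.shape)
except ValueError as err:
    print("incompatible:", err)

# A column of shape (3, 1) can be stretched along its last dimension.
col = a[:, np.newaxis]
print(np.broadcast_shapes(col.shape, matrix.shape))  # (3, 2)
print(col + matrix)
# [[1. 1.]
#  [2. 2.]
#  [3. 3.]]
```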

## Aggregation

Aggregation functions are used to summarize data.
Let's find the average height of US presidents

In [187]:
import csv
height = []
with open('president_heights.csv') as file:
    lines = csv.reader(file)
    next(lines)
    for line in lines:
        height.append(int(line[2]))
height = np.array(height)
type(height)
print(height)

[189 170 189 163 183 171 185 168 173 183 173 173 175 178 183 193 178 173
 174 183 183 168 170 178 182 180 183 178 182 188 175 179 183 193 182 183
 177 185 188 188 182 185]


In [188]:
# Compute stats
print("Mean", height.mean())
print("SD", height.std())
print("Minimum", height.min())
print("Max", height.max())

Mean 179.73809523809524
SD 6.931843442745892
Minimum 163
Max 193


Some more aggregators

In [191]:
print("50th percentile", np.percentile(height,50))
print("Median", np.median(height))

50th percentile 182.0
Median 182.0
Median 182.0
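
Aggregators also work on multidimensional arrays, where the `axis` argument selects the direction to collapse. The sketch below uses a small made-up array (`scores`) rather than the presidents data.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(scores.sum())         # 21, over all elements
print(scores.sum(axis=0))   # [5 7 9], one sum per column
print(scores.mean(axis=1))  # [2. 5.], one mean per row

# percentile accepts several percentiles at once.
quartiles = np.percentile(scores, [25, 75])
print(quartiles)            # [2.25 4.75]
```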
