![](logo.png)
## Day Objectives

# Introduction to NumPy

The learning objectives of this section are:

* Create NumPy arrays
    * Convert lists and tuples to NumPy arrays 
    * Create (initialise) arrays
* Inspect the structure and content of arrays
* Subset, slice, index and iterate through arrays
* Compare computation times in NumPy and standard Python lists


### NumPy Basics

NumPy is a library for scientific computing and data analysis. The name stands for Numerical Python.

The most basic object in NumPy is the ```ndarray```, or simply an ```array```, which is an **n-dimensional, homogeneous** array. By homogeneous, we mean that all the elements in a NumPy array have to be of the **same data type**, which is commonly numeric (float or integer).


- Numerical Python, popularly known as NumPy, is designed to make mathematical computations faster and easier to write.
- It extends Python with powerful data structures such as multi-dimensional arrays, going beyond one-dimensional arrays and matrices.
- NumPy also provides a large library of high-level mathematical functions that operate on these structures.


### Why NumPy when we have “Lists”?
**Python has a built-in data structure, the “list”, which behaves like an array but allows elements of different data types.**

The answer comes down to three aspects:
+ Size – NumPy data structures take less space.
+ Performance – Array operations are typically much faster than the equivalent list loops.
+ Functionality – SciPy and NumPy provide optimized functions.
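
The size claim can be checked with a rough measurement. The sketch below assumes CPython on a 64-bit platform; the names `values` and `arr` are illustrative and not part of the original notebook. The list figure adds up the list object itself and every integer object it references.

```python
import sys
import numpy as np

values = list(range(1000))              # plain Python list of int objects
arr = np.arange(1000, dtype=np.int64)   # the same numbers in a NumPy array

# A list stores pointers to separate int objects, so count both parts.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = arr.nbytes                # raw data buffer: 1000 * 8 bytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```

The exact list size depends on the Python version, but it should come out several times larger than the array's data buffer.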


Let's see some examples of arrays.

In NumPy, dimensions are called **axes**. A 2-D array with two rows and three columns, for example, has two axes, of lengths two and three respectively.

In NumPy terminology, for 2-D arrays:
* ```axis = 0``` refers to the rows
* ```axis = 1``` refers to the columns
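
A small sketch of what the axis argument does in practice, using an illustrative 2 x 3 array `a` (not from the original notebook). Reducing along `axis=0` moves down the rows and yields one value per column, while `axis=1` moves across the columns and yields one value per row.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])       # shape (2, 3): two rows, three columns

col_sums = a.sum(axis=0)        # collapses the rows    -> [5 7 9]
row_sums = a.sum(axis=1)        # collapses the columns -> [ 6 15]

print(a.shape, col_sums, row_sums)
```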

<img src="numpy_axes.jpg" style="width: 600px; height: 400px">

### Advantages of NumPy 
1. NumPy is **much faster** than standard Python approaches to numerical computation.
2. NumPy arrays are more compact than lists, i.e. they take much less storage space.

In [3]:
li = [34,45,67]
li

[34, 45, 67]

In [5]:
import numpy as np

In [7]:
# 1-D array using a list
n1 = np.array(li)
print(n1)

[34 45 67]


In [8]:
type(li)

list

In [10]:
print(type(n1))

<class 'numpy.ndarray'>


In [12]:
t = (1,2,3,4,5.6,6.7, "string")
n2 = np.array(t)
n2

array(['1', '2', '3', '4', '5.6', '6.7', 'string'], dtype='<U32')
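
The output above follows from the homogeneity rule: when the inputs have mixed types, NumPy picks one common type for all elements. A minimal sketch with illustrative variable names:

```python
import numpy as np

# Integers mixed with floats are promoted to float.
ints_and_floats = np.array([1, 2, 3, 4, 5.6, 6.7])

# A single string forces every element to be stored as text.
with_string = np.array([1, 2, 5.6, "string"])

print(ints_and_floats.dtype)    # float64
print(with_string.dtype.kind)   # 'U' (Unicode string)
```

Arithmetic such as `with_string * 2` will not work on the string array, so mixed input is usually a sign that the data needs cleaning first.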

In [13]:
# Creating 2-D Array
n_2d = np.array([[1,2,3,4],[5,6,7,8]])
n_2d

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [14]:
for i in li:
    m = i * i
    print(m)

1156
2025
4489

In [15]:
l1 = [1,3,5,7,9,11,13]
# Task:  get the squares of the elements in the list
l = []
for i in l1:
    l.append(i**2)
    
l

[1, 9, 25, 49, 81, 121, 169]

In [18]:
l = list(map(int, input().split(",")))  # read comma-separated integers
m = []
for i in l:
    a = i*i
    m.append(a)  # append to m, not to the list being iterated
m


1,2,3,4,5


[1, 4, 9, 16, 25]

In [19]:
[i**2 for i in l1]

[1, 9, 25, 49, 81, 121, 169]

In [21]:
np.array(l1)**2

array([  1,   9,  25,  49,  81, 121, 169], dtype=int32)

In [22]:
l1**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [24]:
len(dir(np))  # number of names (functions, classes, constants) in the numpy namespace; varies by version

622

In [44]:
n1 = np.array([1,2,3,4,])
print(n1.ndim)
n1.shape

1


(4,)

In [43]:
n2 = np.array([[1,2,3],[4,5,6]])
print(n2.shape)
print(n2.ndim)
n2

(2, 3)
2


array([[1, 2, 3],
       [4, 5, 6]])

In [42]:
n3 = np.array([[[1,2,3],[4,5,6]]])
print(n3.shape)
print(n3.ndim)
n3

(1, 2, 3)
3


array([[[1, 2, 3],
        [4, 5, 6]]])

In [36]:
n1.dtype

dtype('int32')

In [37]:
n2.dtype

dtype('int32')

In [38]:
n2.ndim

2

In [39]:
n1.ndim

1

In [40]:
n3.ndim

3

In [49]:
m1 = np.eye(5, dtype = "int")
m1   # identity matrix

array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]])

In [52]:
m2 = np.ones((4,5), dtype = "str")
m2   # array filled with ones

array([['1', '1', '1', '1', '1'],
       ['1', '1', '1', '1', '1'],
       ['1', '1', '1', '1', '1'],
       ['1', '1', '1', '1', '1']], dtype='<U1')

In [54]:
m3 = np.zeros((3,3))
m3

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [57]:
m4 = np.full((3,2),-1)  # shape, value to be filled 
m4

array([[-1, -1],
       [-1, -1],
       [-1, -1]])

In [58]:
m5 = np.arange(10)
m5

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [59]:
for i in range(10):
    print(i,end = " ")

0 1 2 3 4 5 6 7 8 9 

In [60]:
m6 = np.arange(1,100,5) # start value, end value (exclusive), step
m6

array([ 1,  6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71, 76, 81,
       86, 91, 96])

In [61]:
m7 = np.linspace(1,100, 5) # start value, end value (inclusive), number of evenly spaced samples
m7 

array([  1.  ,  25.75,  50.5 ,  75.25, 100.  ])
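
The difference between the two functions can be sketched as follows: `np.linspace` takes the number of samples and includes the end value, while `np.arange` takes a step and excludes the end value. The `retstep=True` option returns the spacing that `linspace` computed.

```python
import numpy as np

# 5 samples from 1 to 100, end value included; also return the spacing.
samples, step = np.linspace(1, 100, 5, retstep=True)

# Fixed step of 25, end value excluded.
stepped = np.arange(1, 100, 25)

print(samples, step)
print(stepped)
```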

In [65]:
m7 = np.linspace(0,1, 10)
m7

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])

In [71]:
m8 = np.tile(li, 4)
m8

array([34, 45, 67, 34, 45, 67, 34, 45, 67, 34, 45, 67])

In [70]:
li*4

[34, 45, 67, 34, 45, 67, 34, 45, 67, 34, 45, 67]

## Random data

In [88]:
m9 = np.random.randint(50, 100)
m9  # returns a single random integer within the given bounds (low inclusive, high exclusive)

51

In [102]:
m9 = np.random.randint(1,100, 3) # start value (inclusive), end value (exclusive), number of values
m9

array([95, 96, 28])

In [104]:
m9 = np.random.randint(1,10,(4,5))
m9  # the start value is always inclusive and the end value exclusive

array([[1, 7, 8, 7, 9],
       [8, 4, 4, 1, 2],
       [1, 9, 5, 9, 6],
       [9, 6, 6, 6, 8]])
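
Random output changes on every run, which makes results hard to reproduce. One common approach, sketched here under the assumption that a NumPy version providing `np.random.default_rng` is installed, is to create a seeded generator. The seed value 42 is arbitrary.

```python
import numpy as np

rng_a = np.random.default_rng(42)   # 42 is an arbitrary example seed
rng_b = np.random.default_rng(42)

# Like randint: low is inclusive, high is exclusive.
first = rng_a.integers(1, 10, size=(4, 5))
second = rng_b.integers(1, 10, size=(4, 5))

print(np.array_equal(first, second))  # True: same seed, same numbers
```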

In [80]:
m10 = np.random.rand(100)
m10   # generates the given number of uniformly distributed values in the range [0, 1)

array([0.49769267, 0.92043254, 0.29829905, 0.86210256, 0.04442535,
       0.36291844, 0.05170278, 0.80246603, 0.91302518, 0.48679582,
       0.97056578, 0.27983458, 0.23880113, 0.94315232, 0.35151163,
       0.01098268, 0.29279544, 0.49516094, 0.45485789, 0.70213194,
       0.97432417, 0.38103569, 0.80927452, 0.07985277, 0.80501234,
       0.26044086, 0.65319595, 0.35565933, 0.23316658, 0.94631771,
       0.90264831, 0.09792537, 0.24346583, 0.95011337, 0.88185523,
       0.25291941, 0.5936349 , 0.28058733, 0.58318048, 0.83253507,
       0.98545566, 0.82502308, 0.2441291 , 0.4169867 , 0.44210151,
       0.1354737 , 0.95229778, 0.56708167, 0.04023841, 0.91666958,
       0.80115685, 0.9538497 , 0.69889756, 0.31307273, 0.47203081,
       0.96138368, 0.83969109, 0.11455224, 0.10930868, 0.40680912,
       0.20699351, 0.9322982 , 0.41489477, 0.64052022, 0.26237635,
       0.2416749 , 0.27065598, 0.86688321, 0.76237038, 0.29383135,
       0.99201174, 0.75769544, 0.01873807, 0.75461433, 0.73287

In [101]:
m11 = np.random.random((2,3))
m11   # returns an array of the given shape, here (2, 3), with values in the range [0, 1)

array([[0.46795819, 0.0636768 , 0.17811271],
       [0.48967821, 0.69106277, 0.41914517]])

In [111]:
m12 = np.random.rand()
m12

0.6749177363682674

In [108]:
print(m9.ndim) # number of dimensions
print(m9.itemsize) # memory used by each element in bytes

2
4


In [113]:
m7.itemsize

8

In [114]:
m7

array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])

In [115]:
m9.dtype # data type of the array

dtype('int32')

In [116]:
m7.shape

(10,)

## N-D array


In [126]:
n_array = np.random.randint(1, 100, (5,3)).reshape(1,15)
n_array

# valid 2-D shapes for 15 elements: 3*5, 5*3, 1*15, 15*1

array([[93, 25, 47, 13, 37, 73, 22, 19, 98, 37, 28, 49, 80, 78, 55]])

In [123]:
n_array.ndim

2

In [127]:
n_array = np.random.randint(1, 100, (5,6))
n_array
# valid 2-D shapes for 30 elements include 2*15, 15*2, 3*10, 10*3, 6*5, 5*6

array([[76, 47,  4, 19, 75, 58],
       [90, 92, 25, 48, 36,  3],
       [72, 37, 39, 39, 78, 35],
       [81, 57, 86, 40, 32, 24],
       [58, 35,  4, 73, 48, 58]])

In [129]:
n_array = np.random.randint(1, 100, (5,6)).reshape(3,10)
n_array

array([[75, 17, 42, 79, 94, 94, 87, 46, 73, 74],
       [11, 92, 39, 21, 39, 24, 81, 31, 55, 24],
       [85, 39, 77, 75, 29, 12, 55, 17, 77, 63]])

In [130]:
n_array = np.random.randint(1, 100, (5,6)).reshape(10,3)
n_array

array([[60, 96, 80],
       [77, 71,  8],
       [56, 39, 84],
       [13,  1,  2],
       [54, 32,  2],
       [46, 23, 22],
       [79,  6, 26],
       [80, 57, 36],
       [59, 11, 43],
       [ 9, 28, 67]])

In [132]:
n_array = np.random.randint(1, 100, (5,6)).reshape(2,15)
n_array.ndim

2

In [135]:
n_array = np.random.randint(1, 100, (5,6)).reshape(1,3,10)
n_array.ndim

3

In [139]:
n_array = np.random.randint(1, 100, (5,6)).reshape(2,3,5)
print(n_array.ndim)
n_array

3


array([[[52, 62, 18, 26, 63],
        [66, 45, 96, 80, 54],
        [98, 57, 36, 15, 36]],

       [[98, 41, 50, 29, 64],
        [12, 50, 44, 97, 97],
        [39, 52, 29, 70, 83]]])

In [141]:
n_array = np.random.randint(1, 100, (5,6)).reshape(1,2,3,5)
n_array.ndim

4

In [142]:
n_array = np.random.randint(1, 100, (5,6)).reshape(3,10)
n_array

array([[13, 16, 71, 48, 88, 17,  2, 60,  8, 66],
       [24, 50, 98, 41, 92, 63, 70, 20, 73, 25],
       [21, 75, 68, 69, 34, 34, 18, 39, 73, 63]])

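The new shape must contain the same number of elements as the original array: (10, 3) works for 30 elements because 10 × 3 = 30, whereas a shape holding only 25 elements would not.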
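
The element-count rule for reshaping can be sketched as follows. Passing `-1` for one dimension asks NumPy to infer it from the total size, and an incompatible shape raises a `ValueError`. The variable names are illustrative.

```python
import numpy as np

a = np.arange(30)            # 30 elements

b = a.reshape(5, -1)         # -1 is inferred as 6
c = a.reshape(2, 3, -1)      # -1 is inferred as 5

try:
    a.reshape(5, 5)          # 25 != 30, so this fails
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("reshape error:", exc)

print(b.shape, c.shape)
```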

## Subsets, Slicing and Indexing

In [143]:
m10

array([0.49769267, 0.92043254, 0.29829905, 0.86210256, 0.04442535,
       0.36291844, 0.05170278, 0.80246603, 0.91302518, 0.48679582,
       0.97056578, 0.27983458, 0.23880113, 0.94315232, 0.35151163,
       0.01098268, 0.29279544, 0.49516094, 0.45485789, 0.70213194,
       0.97432417, 0.38103569, 0.80927452, 0.07985277, 0.80501234,
       0.26044086, 0.65319595, 0.35565933, 0.23316658, 0.94631771,
       0.90264831, 0.09792537, 0.24346583, 0.95011337, 0.88185523,
       0.25291941, 0.5936349 , 0.28058733, 0.58318048, 0.83253507,
       0.98545566, 0.82502308, 0.2441291 , 0.4169867 , 0.44210151,
       0.1354737 , 0.95229778, 0.56708167, 0.04023841, 0.91666958,
       0.80115685, 0.9538497 , 0.69889756, 0.31307273, 0.47203081,
       0.96138368, 0.83969109, 0.11455224, 0.10930868, 0.40680912,
       0.20699351, 0.9322982 , 0.41489477, 0.64052022, 0.26237635,
       0.2416749 , 0.27065598, 0.86688321, 0.76237038, 0.29383135,
       0.99201174, 0.75769544, 0.01873807, 0.75461433, 0.73287

In [149]:
m10[ 10: 100: 5] # start index, end index (exclusive), step

array([0.97056578, 0.01098268, 0.97432417, 0.26044086, 0.90264831,
       0.25291941, 0.98545566, 0.1354737 , 0.80115685, 0.96138368,
       0.20699351, 0.2416749 , 0.99201174, 0.63034184, 0.2124659 ,
       0.30410469, 0.94107502, 0.1636036 ])

In [159]:
n = np.arange(1,100,5)
n

array([ 1,  6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71, 76, 81,
       86, 91, 96])

In [162]:
n = np.arange(1,100,5).reshape(5,-1)
n

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [165]:
# accessing rows
n[1]

array([21, 26, 31, 36])

In [166]:
n[1:3]

array([[21, 26, 31, 36],
       [41, 46, 51, 56]])

In [168]:
n[::2]  # rows at even indices (0, 2, 4)

array([[ 1,  6, 11, 16],
       [41, 46, 51, 56],
       [81, 86, 91, 96]])

In [171]:
n[1 : : 1] # all rows from index 1 onward (a step of 2, as in n[1::2], is needed for odd rows only)

array([[21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [174]:
n[0:4:1]

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76]])

In [175]:
n[1:3:1]

array([[21, 26, 31, 36],
       [41, 46, 51, 56]])

In [176]:
n[ : : ]

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [178]:
n[1:3:]

array([[21, 26, 31, 36],
       [41, 46, 51, 56]])

In [179]:
n[::2]

array([[ 1,  6, 11, 16],
       [41, 46, 51, 56],
       [81, 86, 91, 96]])

In [180]:
n[1::2]  # rows at odd indices (1, 3)

array([[21, 26, 31, 36],
       [61, 66, 71, 76]])

### Accessing Values

In [185]:
n

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [186]:
n[0][0]

1

In [187]:
n[0][-1]

16

In [188]:
n[-1][-1]

96

In [189]:
n[-1][0]

81

In [190]:
n[1][2]

31
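
The chained form `n[1][2]` works, but NumPy also accepts both indices in a single pair of brackets. A short sketch that rebuilds the same array `n`:

```python
import numpy as np

n = np.arange(1, 100, 5).reshape(5, -1)

chained = n[1][2]     # first builds row 1, then indexes into it
direct = n[1, 2]      # a single indexing operation on both axes

print(chained, direct)   # 31 31
```

The single-bracket form avoids creating the intermediate row and is the one that extends to slices on both axes.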

### Accessing Subsets

In [191]:
n

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [198]:
n[0:2, 0:2] # rows slicing, columns slicing

array([[ 1,  6],
       [21, 26]])

In [201]:
n[3:, 2:]

array([[71, 76],
       [91, 96]])
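
One point worth knowing when taking subsets: basic slicing returns a view onto the original data, not a copy. The sketch below, with illustrative names, shows that writing to the view changes `n`, while an explicit `.copy()` is independent.

```python
import numpy as np

n = np.arange(1, 100, 5).reshape(5, -1)

view = n[0:2, 0:2]                 # basic slicing returns a view
independent = n[0:2, 0:2].copy()   # explicit copy with its own data

view[0, 0] = -1                    # also changes n[0, 0]
independent[1, 1] = -99            # leaves n untouched

print(n[0, 0], n[1, 1])            # -1 26
```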

### Accessing Columns


In [203]:
n

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [202]:
n[:, 0]

array([ 1, 21, 41, 61, 81])

In [204]:
n[:, 0::2] # even columns

array([[ 1, 11],
       [21, 31],
       [41, 51],
       [61, 71],
       [81, 91]])

In [205]:
n[:, 1::2]

array([[ 6, 16],
       [26, 36],
       [46, 56],
       [66, 76],
       [86, 96]])

In [211]:
n.ndim

2

In [206]:
n[:, -1]

array([16, 36, 56, 76, 96])

In [208]:
n[-1,-1]

96

In [212]:
for i in n:
    for j in i:
        print(j)

1
6
11
16
21
26
31
36
41
46
51
56
61
66
71
76
81
86
91
96
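
The nested loop above can also be written without explicit nesting. As a sketch, `n.flat` iterates over all elements in row-major order, and `np.ndenumerate` additionally yields each element's index.

```python
import numpy as np

n = np.arange(1, 100, 5).reshape(5, -1)

# All elements in row-major order, converted to plain Python ints.
flat_values = [int(x) for x in n.flat]

# (row, column) index together with each value.
indexed = [(idx, int(x)) for idx, x in np.ndenumerate(n)]

print(flat_values[:5])
print(indexed[:2])
```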


# Statistical Operations

In [214]:
n.min()

1

In [215]:
n.max()

96

In [216]:
n.mean()

48.5

In [219]:
s = 0
for i in li:
    s += i
s/len(li)

48.666666666666664

In [220]:
np.array(li).mean() # Average 

48.666666666666664

In [223]:
np.median(n) # gives the middle value after sorting (the mean of the two middle values for an even count)

48.5

In [224]:
n.var()

831.25

In [225]:
n.std()

28.83140648667699
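
The statistics above are computed over the whole array. As a sketch, the same methods accept an `axis` argument for per-column or per-row results, and `var`/`std` accept `ddof` to switch from the population formula (the default, `ddof=0`) to the sample formula (`ddof=1`).

```python
import numpy as np

n = np.arange(1, 100, 5).reshape(5, -1)

col_means = n.mean(axis=0)       # one mean per column
row_means = n.mean(axis=1)       # one mean per row

population_var = n.var()         # divides by N      (ddof=0)
sample_var = n.var(ddof=1)       # divides by N - 1  (ddof=1)

print(col_means, row_means)
print(population_var, sample_var)
```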

In [229]:
n.sort()  # sorts in place along the last axis (within each row); the rows are already sorted here
n

array([[ 1,  6, 11, 16],
       [21, 26, 31, 36],
       [41, 46, 51, 56],
       [61, 66, 71, 76],
       [81, 86, 91, 96]])

In [232]:
np.sin(100)  # the argument is in radians

-0.5063656411097588

In [233]:
np.log(10)

2.302585092994046

In [234]:
np.log10(10)

1.0

In [237]:
np.power(10,2)

100

In [240]:
np.mod(11,2)

1

In [244]:
np.divide(10,3)

3.3333333333333335
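
The calls above use scalars, but these functions are mainly useful because they apply element by element to whole arrays. A minimal sketch with an illustrative array `a`:

```python
import numpy as np

a = np.array([10, 11, 12])

squares = np.power(a, 2)      # [100 121 144]
remainders = np.mod(a, 2)     # [0 1 0]
quotients = np.divide(a, 4)   # [2.5  2.75 3.  ]

print(squares, remainders, quotients)
```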

In [245]:
dir(np)

['ALLOW_THREADS',
 'AxisError',
 'BUFSIZE',
 'CLIP',
 'DataSource',
 'ERR_CALL',
 'ERR_DEFAULT',
 'ERR_IGNORE',
 'ERR_LOG',
 'ERR_PRINT',
 'ERR_RAISE',
 'ERR_WARN',
 'FLOATING_POINT_SUPPORT',
 'FPE_DIVIDEBYZERO',
 'FPE_INVALID',
 'FPE_OVERFLOW',
 'FPE_UNDERFLOW',
 'False_',
 'Inf',
 'Infinity',
 'MAXDIMS',
 'MAY_SHARE_BOUNDS',
 'MAY_SHARE_EXACT',
 'MachAr',
 'NAN',
 'NINF',
 'NZERO',
 'NaN',
 'PINF',
 'PZERO',
 'RAISE',
 'SHIFT_DIVIDEBYZERO',
 'SHIFT_INVALID',
 'SHIFT_OVERFLOW',
 'SHIFT_UNDERFLOW',
 'ScalarType',
 'Tester',
 'TooHardError',
 'True_',
 'UFUNC_BUFSIZE_DEFAULT',
 'UFUNC_PYVALS_NAME',
 'WRAP',
 '_NoValue',
 '_UFUNC_API',
 '__NUMPY_SETUP__',
 '__all__',
 '__builtins__',
 '__cached__',
 '__config__',
 '__dir__',
 '__doc__',
 '__file__',
 '__getattr__',
 '__git_revision__',
 '__loader__',
 '__mkl_version__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 '_add_newdoc_ufunc',
 '_distributor_init',
 '_globals',
 '_mat',
 '_pytesttester',
 'abs',
 'absol

In [246]:
help(np.divide)

Help on ufunc object:

true_divide = class ufunc(builtins.object)
 |  Functions that operate element by element on whole arrays.
 |  
 |  To see the documentation for a specific ufunc, use `info`.  For
 |  example, ``np.info(np.sin)``.  Because ufuncs are written in C
 |  (for speed) and linked into Python with NumPy's ufunc facility,
 |  Python's help() function finds this page whenever help() is called
 |  on a ufunc.
 |  
 |  A detailed explanation of ufuncs can be found in the docs for :ref:`ufuncs`.
 |  
 |  Calling ufuncs:
 |  
 |  op(*x[, out], where=True, **kwargs)
 |  Apply `op` to the arguments `*x` elementwise, broadcasting the arguments.
 |  
 |  The broadcasting rules are:
 |  
 |  * Dimensions of length 1 may be prepended to either array.
 |  * Arrays may be repeated along dimensions of length 1.
 |  
 |  Parameters
 |  ----------
 |  *x : array_like
 |      Input arrays.
 |  out : ndarray, None, or tuple of ndarray and None, optional
 |      Alternate array object(s) in 

In [252]:
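l1 = [i**2 for i in range(1,10000)] 
# Note: `j` is not the loop variable of this comprehension. It still holds 96
# from the earlier nested loop, so every element of l2 is 96**2 = 9216.
l2 = [j**2 for i in range(1,10000)]

# multiply both lists element-wise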

mul_list = list(map(lambda x,y : x*y, l1,l2))
mul_list


[9216,
 36864,
 82944,
 147456,
 230400,
 331776,
 451584,
 589824,
 746496,
 921600,
 1115136,
 1327104,
 1557504,
 1806336,
 2073600,
 2359296,
 2663424,
 2985984,
 3326976,
 3686400,
 4064256,
 4460544,
 4875264,
 5308416,
 5760000,
 6230016,
 6718464,
 7225344,
 7750656,
 8294400,
 8856576,
 9437184,
 10036224,
 10653696,
 11289600,
 11943936,
 12616704,
 13307904,
 14017536,
 14745600,
 15492096,
 16257024,
 17040384,
 17842176,
 18662400,
 19501056,
 20358144,
 21233664,
 22127616,
 23040000,
 23970816,
 24920064,
 25887744,
 26873856,
 27878400,
 28901376,
 29942784,
 31002624,
 32080896,
 33177600,
 34292736,
 35426304,
 36578304,
 37748736,
 38937600,
 40144896,
 41370624,
 42614784,
 43877376,
 45158400,
 46457856,
 47775744,
 49112064,
 50466816,
 51840000,
 53231616,
 54641664,
 56070144,
 57517056,
 58982400,
 60466176,
 61968384,
 63489024,
 65028096,
 66585600,
 68161536,
 69755904,
 71368704,
 72999936,
 74649600,
 76317696,
 78004224,
 79709184,
 81432576,
 83174400,
 

In [253]:
n1 = np.array(l1)
n2 = np.array(l2)

n1*n2

array([       9216,       36864,       82944, ...,  1924121600,
        2108395520, -2002279424])
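
The negative last value in the output above is a sign of integer overflow: the arrays in this notebook have dtype `int32`, and the true product of the last pair does not fit in 32 bits, so it wraps around silently. A sketch reproducing that last element, assuming the operands are 9999**2 and 96**2 as the list output suggests, and showing that `int64` gives the exact result:

```python
import numpy as np

a = np.array([99980001], dtype=np.int32)   # 9999**2
b = np.array([9216], dtype=np.int32)       # 96**2

wrapped = a * b                                     # int32 arithmetic wraps around
exact = a.astype(np.int64) * b.astype(np.int64)     # enough room in 64 bits

print(wrapped, exact)
```

Python's own integers never overflow, which is why the list version does not show this problem.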

# Comparing Computation Time in NumPy and Lists

In [261]:
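import time 
l1 = [i**2 for i in range(1,10000)] 
# `j` still holds 96 from the earlier nested loop, so l2 is 9999 copies of 9216
l2 = [j**2 for i in range(1,10000)]

t0 = time.time()  # start of the timed section
mul_list = list(map(lambda x,y : x*y, l1,l2))
t1 = time.time()  # end of the timed section
print("Using List ", t1- t0)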

Using List  0.02789139747619629



In [260]:
t0 = time.time()
n1*n2
t1 = time.time()
print("Using Numpy ", t1-t0)

Using Numpy  0.0009582042694091797
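
A single `time.time()` measurement is noisy, especially for operations that take under a millisecond. A more robust sketch uses the standard-library `timeit` module to repeat each operation many times. The function names are illustrative, both inputs are squares here, and `int64` is requested explicitly so the products do not overflow.

```python
import timeit
import numpy as np

l1 = [i**2 for i in range(1, 10000)]
l2 = [i**2 for i in range(1, 10000)]
n1 = np.array(l1, dtype=np.int64)
n2 = np.array(l2, dtype=np.int64)

def multiply_lists():
    # Element-wise product using plain Python lists.
    return [x * y for x, y in zip(l1, l2)]

def multiply_arrays():
    # Element-wise product using NumPy arrays.
    return n1 * n2

# Total time for 100 runs of each version.
list_time = timeit.timeit(multiply_lists, number=100)
array_time = timeit.timeit(multiply_arrays, number=100)

print("Using List ", list_time)
print("Using NumPy", array_time)
```

The absolute numbers depend on the machine, but the array version is expected to be considerably faster.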
