# NumPy Module

![image.png](attachment:image.png)

<p>NumPy, which stands for Numerical Python, is a library consisting of multidimensional array objects and a collection of routines for processing those arrays. Using NumPy, mathematical and logical operations on arrays can be performed.</p>
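
As a small illustration of the mathematical and logical operations mentioned above, the sketch below applies them element-wise to two made-up arrays. The names `a` and `b` and their values are assumptions chosen only for the example.

```python
import numpy as np

# Two small example arrays (values chosen only for illustration)
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

total = a + b                          # element-wise addition
scaled = a * 2                         # multiply every element by a scalar
mask = a > 2                           # comparison gives a boolean array
both = np.logical_and(a > 1, b < 40)   # element-wise logical AND

print(total)   # [11 22 33 44]
print(scaled)  # [2 4 6 8]
print(mask)    # [False False  True  True]
print(both)    # [False  True  True False]
```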

### Credits
[tutorialspoint](https://www.tutorialspoint.com/numpy/index.htm)

### NumPy Array Objects
Creating a multi-dimensional array object from a nested Python list:

![arrays.png](attachment:arrays.png)

In [1]:
import numpy as np

array1=[[11,12,13,14],
        [22,23,24,25],
       [33,34,35,36]]

array1_new=np.array(array1)

print(array1)

[[11, 12, 13, 14], [22, 23, 24, 25], [33, 34, 35, 36]]


In [2]:
print(array1_new)

[[11 12 13 14]
 [22 23 24 25]
 [33 34 35 36]]


In [4]:
x=[1,2,3,4]

array2=np.array(x)
#array2=np.asarray(x)
print(array2)

[1 2 3 4]


In [3]:
import numpy as np

myList=[[1, 2, 3],
        [4, 5, 6]]

array3 = np.array(myList,dtype=int)  # np.int was removed in NumPy 1.24; the built-in int is equivalent
print(array3)

[[1 2 3]
 [4 5 6]]


<h3>shape and ndim</h3>

In [13]:
array1=[[11,12,13,14],
        [22,23,24,25],
       [33,34,35,36]]

array1_new=np.array(array1)

In [14]:
print(array1_new.ndim)

2


In [17]:
x=[1,2,3,4]
array2=np.array(x)
print(array2.ndim)

1


In [18]:
print(array1_new.shape)

(3, 4)
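
The same attributes extend to higher dimensions. The following sketch, which uses a hypothetical array named `cube`, shows `ndim`, `shape`, and the related `size` attribute on a 3-D array.

```python
import numpy as np

# A 3-D array: 2 blocks, each with 3 rows and 4 columns
cube = np.zeros((2, 3, 4))

print(cube.ndim)   # 3  (number of axes)
print(cube.shape)  # (2, 3, 4)  (length along each axis)
print(cube.size)   # 24 (total number of elements)
```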


<h3>reshape() and np.arange()</h3>

In [19]:
print(array1_new)

[[11 12 13 14]
 [22 23 24 25]
 [33 34 35 36]]


In [3]:
print(array1_new.shape)

(3, 4)


In [4]:
array2=array1_new.reshape(4,3)
print(array2)

[[11 12 13]
 [14 22 23]
 [24 25 33]
 [34 35 36]]


In [5]:
array2=array1_new.reshape(6,2)
print(array2)

[[11 12]
 [13 14]
 [22 23]
 [24 25]
 [33 34]
 [35 36]]


In [6]:
array2=array1_new.reshape(12,1)
print(array2)

[[11]
 [12]
 [13]
 [14]
 [22]
 [23]
 [24]
 [25]
 [33]
 [34]
 [35]
 [36]]


In [8]:
array2=array1_new.reshape(1,12)
print(array2)

[[11 12 13 14 22 23 24 25 33 34 35 36]]
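
Every reshape above keeps the element count at 12. The sketch below, using an illustrative array named `values`, shows two related behaviours: passing `-1` lets NumPy infer one dimension, and a shape whose element count does not match raises a `ValueError`.

```python
import numpy as np

values = np.arange(12)          # 0, 1, ..., 11

# -1 asks NumPy to work out that dimension from the total size
three_rows = values.reshape(3, -1)
print(three_rows.shape)         # (3, 4)

# The element count must match, otherwise reshape raises ValueError
try:
    values.reshape(5, 3)
except ValueError as err:
    print("cannot reshape:", err)
```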


In [9]:
array=np.arange(-10,10,2)
print(array)

[-10  -8  -6  -4  -2   0   2   4   6   8]


<h3>Array Creation Routines</h3>

In [32]:
import numpy as np 
x = np.empty((3,2), dtype=int)  # np.int was removed in NumPy 1.24
print(x)

[[1065353216 1073741824]
 [1077936128 1082130432]
 [1084227584 1086324736]]


`np.empty` creates an uninitialized array of the specified shape and dtype. Its elements hold whatever values happened to be in memory, so they look random until they are assigned.
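
Because the contents are arbitrary, an array from `np.empty` should be filled before it is read. A minimal sketch, with the name `buffer` and the fill value 7 chosen only for illustration:

```python
import numpy as np

buffer = np.empty((3, 2), dtype=int)   # contents are arbitrary at this point
buffer[:] = 7                          # assign every element before reading
print(buffer)

# np.full allocates and fills in a single step
filled = np.full((3, 2), 7)
print(np.array_equal(buffer, filled))  # True
```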

In [33]:
# array of five zeros. Default dtype is float 
x = np.zeros(5) 
print(x)

[0. 0. 0. 0. 0.]


In [37]:
x = np.zeros((5,3),dtype=float)  # np.float was removed in NumPy 1.24
print(x)

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [38]:
x = np.zeros((5,3),dtype=int) 
print(x)

[[0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]
 [0 0 0]]


In [39]:
x = np.ones((5,3),dtype=int) 
print(x)

[[1 1 1]
 [1 1 1]
 [1 1 1]
 [1 1 1]
 [1 1 1]]


Routines such as `np.array` and `np.asarray` are useful for converting a Python sequence into an ndarray.
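
When the input is already an ndarray, `np.array` and `np.asarray` behave differently: the first copies by default, while the second returns the input unchanged when no conversion is needed. The sketch below uses made-up names (`original`, `copied`, `shared`) to show the difference.

```python
import numpy as np

original = np.array([1, 2, 3, 4])

copied = np.array(original)     # np.array copies by default
shared = np.asarray(original)   # no conversion needed, so the same object comes back

original[0] = 99
print(copied[0])           # 1  (the copy is unaffected)
print(shared[0])           # 99 (same underlying data)
print(shared is original)  # True
```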

In [40]:
x = np.arange(-10,10,2) 
print(x)

[-10  -8  -6  -4  -2   0   2   4   6   8]


`np.arange` returns an ndarray of evenly spaced values in the half-open interval `[start, stop)`, so `stop` itself is never included. The format of the function is as follows:

```numpy.arange(start, stop, step, dtype)```
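
A few calls with different argument combinations may make the signature clearer. The variable names below are illustrative only.

```python
import numpy as np

only_stop = np.arange(5)              # start defaults to 0, step to 1: 0, 1, 2, 3, 4
with_step = np.arange(2, 10, 3)       # 2, 5, 8 (10 is excluded)
fractional = np.arange(0, 1, 0.25)    # 0, 0.25, 0.5, 0.75
descending = np.arange(10, 0, -2)     # a negative step counts down: 10, 8, 6, 4, 2
as_float = np.arange(3, dtype=float)  # dtype forces the element type: 0.0, 1.0, 2.0

for result in (only_stop, with_step, fractional, descending, as_float):
    print(result)
```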

<h3>Indexing & Slicing</h3>

In [42]:
array=np.random.randint(0,100,(10,5))
print(array)

[[11 60 99 41 45]
 [49 67 95 71 32]
 [23  4 66 14 43]
 [69 91 45 38 70]
 [81 84 68 47 45]
 [51 12 52 39 14]
 [22 84 55 82 97]
 [31 11 60 82 12]
 [ 6 63 36 33 18]
 [36 43 42 15 37]]


In [10]:
array1=np.random.randint(0,100,(4,3))
array2=np.random.randint(0,100,(4,3))

array3=array1+array2
print('array1:',array1)
print('array2:',array2)
print('array3:',array3)

array1: [[52 19 16]
 [92 76 88]
 [53 49 49]
 [ 6 60 61]]
array2: [[79 87 18]
 [78 37 22]
 [92 37  5]
 [38 21 37]]
array3: [[131 106  34]
 [170 113 110]
 [145  86  54]
 [ 44  81  98]]


In [13]:
array1=np.random.randint(0,100,(4,3))

array2=array1-10

print('array1:',array1)
print('array2:',array2)

array1: [[73 88 57]
 [10 93 12]
 [34  0 36]
 [58  1  9]]
array2: [[ 63  78  47]
 [  0  83   2]
 [ 24 -10  26]
 [ 48  -9  -1]]


In [14]:
array1=np.random.randint(0,100,(4,3))
array2=np.random.randint(0,100,(3,5))

print('array1:',array1)
print('array2:',array2)

array4=np.matmul(array1,array2)

print('array4:',array4)

array1: [[24 67  5]
 [17 56 99]
 [94 62 26]
 [21 84 66]]
array2: [[89 93 22 61 68]
 [26 95 10 24  2]
 [ 0 75 23 96 60]]
array4: [[ 3878  8972  1313  3552  2066]
 [ 2969 14326  3211 11885  7208]
 [ 9978 16582  3286  9718  8076]
 [ 4053 14883  2820  9633  5556]]
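
`np.matmul` requires the inner dimensions to agree: a (4, 3) array times a (3, 5) array gives a (4, 5) result. The sketch below, with placeholder arrays `left` and `right`, shows the rule and what happens when it is violated.

```python
import numpy as np

left = np.ones((4, 3), dtype=int)
right = np.ones((3, 5), dtype=int)

product = left @ right          # @ is the operator form of np.matmul
print(product.shape)            # (4, 5)

try:
    np.matmul(right, left)      # (3, 5) x (4, 3): inner sizes 5 and 4 differ
except ValueError as err:
    print("shape mismatch:", err)
```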


In [16]:
array=np.random.randint(0,100,(5,7))
print(array)

[[88  4  5 74 29  8 26]
 [78 85 11 53  4 99 32]
 [32 50 83 95  9 49 73]
 [20  1  6 26 12  9 97]
 [37 38 14  7 55 81 46]]


In [22]:
print(np.max(array))

99


In [23]:
print(np.argmax(array))

12


In [24]:
print(np.max(array,axis=0))

[88 85 83 95 55 99 97]


In [26]:
print(np.argmax(array,axis=0))

[0 1 2 2 4 1 3]


In [27]:
print(np.max(array,axis=1))

[88 99 95 97 81]


In [28]:
print(np.argmax(array,axis=1))

[0 5 3 6 5]
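
Without an `axis` argument, `np.argmax` returns an index into the flattened array, which is why the result above was 12 rather than a row and column. As a sketch, `np.unravel_index` converts that flat index back into coordinates; the array is retyped here from the printed output so the example is self-contained.

```python
import numpy as np

array = np.array([[88,  4,  5, 74, 29,  8, 26],
                  [78, 85, 11, 53,  4, 99, 32],
                  [32, 50, 83, 95,  9, 49, 73],
                  [20,  1,  6, 26, 12,  9, 97],
                  [37, 38, 14,  7, 55, 81, 46]])

flat_index = np.argmax(array)                         # 12
row, col = np.unravel_index(flat_index, array.shape)  # flat index -> (row, column)
print(row, col)          # 1 5
print(array[row, col])   # 99
```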


In [48]:
print(np.max(array))

98


In [51]:
print(np.argmax(array))

19


In [49]:
print(np.mean(array))

54.22


In [50]:
print(np.sum(array))

2711


In [45]:
print(array[0][2])

69


In [47]:
print(array[0:3,2:4])

[[69  4]
 [92 27]
 [70 83]]


In [53]:
print(np.mean(array))

56.16


In [54]:
print(np.min(array))

0


In [48]:
print(np.max(array))

96


In [49]:
print(np.argmax(array))

25


In [50]:
print(np.min(array,axis=0))

[27  7 33  1  6 20 76 27  0 27]


In [51]:
print(np.mean(array,axis=0))

[62.  43.4 69.  34.2 63.8 62.2 83.6 52.6 33.6 57.2]


In [31]:
array=np.random.randint(0,100,(5,10))
print(array)

[[94 95 84 11 82 80  3 47 46 74]
 [91 64 20 76 91 32 93 78 13 72]
 [68 98 39 60 29 63 70 51 55 55]
 [ 3 58 42 17 93 20 49 37 30 78]
 [66 35 31 92 48 30  1 59 22 72]]


In [32]:
print(array[0:2,0:4])

[[94 95 84 11]
 [91 64 20 76]]


In [33]:
print(array[:,0:4])

[[94 95 84 11]
 [91 64 20 76]
 [68 98 39 60]
 [ 3 58 42 17]
 [66 35 31 92]]


In [34]:
print(array[:,4])

[82 91 29 93 48]


In [35]:
print(array[:2,:])

[[94 95 84 11 82 80  3 47 46 74]
 [91 64 20 76 91 32 93 78 13 72]]


In [36]:
avg=np.mean(array[:2,:])
print(avg)

62.3
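
One detail worth knowing about the slices used above is that basic slicing returns a view of the original data, not a copy. The following sketch, with a hypothetical array `grid`, shows that writing through a slice changes the source array unless `.copy()` is used.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

view = grid[:2, :2]            # basic slicing returns a view
view[0, 0] = -1
print(grid[0, 0])              # -1 (the original changed too)

independent = grid[:2, :2].copy()
independent[0, 0] = 100
print(grid[0, 0])              # -1 (the copy does not affect the original)
```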


## Exercise 1 - Numpy and Pandas

In [2]:
import pandas as pd

dataset=pd.read_csv(r'C:\Users\Migara Liyanage\Desktop\Advaced Python Workshop\DAY 02\codes\student-marks.csv')

In [3]:
print(dataset)

        Name  Maths  Physics  Chemistry
0       John     45       22         98
1    William     44       23          2
2      Henry     22       89         35
3      Chris     88       55         23
4       Jack     45       89         45
5   Stephane     64       45         48
6     Harald     12       12         12
7     Hearmy     78       94         65
8     Sinsel     98       95         23
9     George     45       65         89
10     James     64       32         26
11      Jade     23       78         47
12     Tommy     45       98         14
13     Yonus     78       45         44
14     Randy     95       12         84
15       Rex     31       95         98
16    Tucsan     77       32         56
17     Amell     87       78         34
18       Vin     65       45         78
19    Hector     32       12         78


In [26]:
dataset=pd.read_csv(r'C:\Users\Migara Liyanage\Desktop\Advaced Python Workshop\DAY 02\codes\student-marks.csv').values
print(dataset)

[['John' 45 22 98]
 ['William' 44 23 2]
 ['Henry' 22 89 35]
 ['Chris' 88 55 23]
 ['Jack' 45 89 45]
 ['Stephane' 64 45 48]
 ['Harald' 12 12 12]
 ['Hearmy' 78 94 65]
 ['Sinsel' 98 95 23]
 ['George' 45 65 89]
 ['James' 64 32 26]
 ['Jade' 23 78 47]
 ['Tommy' 45 98 14]
 ['Yonus' 78 45 44]
 ['Randy' 95 12 84]
 ['Rex' 31 95 98]
 ['Tucsan' 77 32 56]
 ['Amell' 87 78 34]
 ['Vin' 65 45 78]
 ['Hector' 32 12 78]]


In [27]:
import numpy as np
avg=np.mean(dataset[:,1:],axis=1)
print(avg)

[55.0 23.0 48.666666666666664 55.333333333333336 59.666666666666664
 52.333333333333336 12.0 79.0 72.0 66.33333333333333 40.666666666666664
 49.333333333333336 52.333333333333336 55.666666666666664
 63.666666666666664 74.66666666666667 55.0 66.33333333333333
 62.666666666666664 40.666666666666664]


In [28]:
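def findGrade(mark):
    # Map a mark in the range 0-100 to a letter grade
    if(mark>=0 and mark<35):
        grade='F'
    elif(mark>=35 and mark<45):
        grade='S'
    elif(mark>=45 and mark<65):
        grade='C'
    elif(mark>=65 and mark<75):
        grade='B'
    elif(mark>=75 and mark<=100):
        grade='A'
    else:
        # Without this branch an out-of-range mark would leave grade unassigned
        raise ValueError('mark must be between 0 and 100')
    return grade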
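
The same grade boundaries can also be applied to a whole array at once. The sketch below is an alternative using `np.digitize`, not the approach taken in this exercise; the names `boundaries`, `letters`, and `marks` are illustrative, and it assumes every mark already lies between 0 and 100 because no range check is performed.

```python
import numpy as np

# Lower bounds of grades S, C, B and A; marks below 35 fall into F
boundaries = np.array([35, 45, 65, 75])
letters = np.array(['F', 'S', 'C', 'B', 'A'])

marks = np.array([45, 44, 22, 88, 65])   # sample marks for illustration

# np.digitize returns, for each mark, how many boundaries it has reached
grades = letters[np.digitize(marks, boundaries)]
print(grades)   # ['C' 'S' 'F' 'A' 'B']
```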

In [29]:
mathsGrades=np.array([findGrade(i) for i in dataset[:,1]])

print(mathsGrades)
mathsGrades=mathsGrades.reshape(-1,1)
print(mathsGrades)

['C' 'S' 'F' 'A' 'C' 'C' 'F' 'A' 'A' 'C' 'C' 'F' 'C' 'A' 'A' 'F' 'A' 'A'
 'B' 'F']
[['C']
 ['S']
 ['F']
 ['A']
 ['C']
 ['C']
 ['F']
 ['A']
 ['A']
 ['C']
 ['C']
 ['F']
 ['C']
 ['A']
 ['A']
 ['F']
 ['A']
 ['A']
 ['B']
 ['F']]


In [30]:
phyGrades=np.array([findGrade(i) for i in dataset[:,2]])
phyGrades=phyGrades.reshape(-1,1)
print(phyGrades)

[['F']
 ['F']
 ['A']
 ['C']
 ['A']
 ['C']
 ['F']
 ['A']
 ['A']
 ['B']
 ['F']
 ['A']
 ['A']
 ['C']
 ['F']
 ['A']
 ['F']
 ['A']
 ['C']
 ['F']]


In [31]:
chemGrades=np.array([findGrade(i) for i in dataset[:,3]])
chemGrades=chemGrades.reshape(-1,1)
print(chemGrades)

[['A']
 ['F']
 ['S']
 ['F']
 ['C']
 ['C']
 ['F']
 ['B']
 ['F']
 ['A']
 ['F']
 ['C']
 ['F']
 ['S']
 ['A']
 ['A']
 ['C']
 ['F']
 ['A']
 ['A']]


In [32]:
avg=avg.reshape(-1,1)
print(avg)

[[55.0]
 [23.0]
 [48.666666666666664]
 [55.333333333333336]
 [59.666666666666664]
 [52.333333333333336]
 [12.0]
 [79.0]
 [72.0]
 [66.33333333333333]
 [40.666666666666664]
 [49.333333333333336]
 [52.333333333333336]
 [55.666666666666664]
 [63.666666666666664]
 [74.66666666666667]
 [55.0]
 [66.33333333333333]
 [62.666666666666664]
 [40.666666666666664]]


In [33]:
print(dataset)

[['John' 45 22 98]
 ['William' 44 23 2]
 ['Henry' 22 89 35]
 ['Chris' 88 55 23]
 ['Jack' 45 89 45]
 ['Stephane' 64 45 48]
 ['Harald' 12 12 12]
 ['Hearmy' 78 94 65]
 ['Sinsel' 98 95 23]
 ['George' 45 65 89]
 ['James' 64 32 26]
 ['Jade' 23 78 47]
 ['Tommy' 45 98 14]
 ['Yonus' 78 45 44]
 ['Randy' 95 12 84]
 ['Rex' 31 95 98]
 ['Tucsan' 77 32 56]
 ['Amell' 87 78 34]
 ['Vin' 65 45 78]
 ['Hector' 32 12 78]]


In [34]:
dataset=np.append(dataset,mathsGrades,axis=1)
print(dataset)

[['John' 45 22 98 'C']
 ['William' 44 23 2 'S']
 ['Henry' 22 89 35 'F']
 ['Chris' 88 55 23 'A']
 ['Jack' 45 89 45 'C']
 ['Stephane' 64 45 48 'C']
 ['Harald' 12 12 12 'F']
 ['Hearmy' 78 94 65 'A']
 ['Sinsel' 98 95 23 'A']
 ['George' 45 65 89 'C']
 ['James' 64 32 26 'C']
 ['Jade' 23 78 47 'F']
 ['Tommy' 45 98 14 'C']
 ['Yonus' 78 45 44 'A']
 ['Randy' 95 12 84 'A']
 ['Rex' 31 95 98 'F']
 ['Tucsan' 77 32 56 'A']
 ['Amell' 87 78 34 'A']
 ['Vin' 65 45 78 'B']
 ['Hector' 32 12 78 'F']]


In [35]:
dataset=np.append(dataset,phyGrades,axis=1)
print(dataset)

[['John' 45 22 98 'C' 'F']
 ['William' 44 23 2 'S' 'F']
 ['Henry' 22 89 35 'F' 'A']
 ['Chris' 88 55 23 'A' 'C']
 ['Jack' 45 89 45 'C' 'A']
 ['Stephane' 64 45 48 'C' 'C']
 ['Harald' 12 12 12 'F' 'F']
 ['Hearmy' 78 94 65 'A' 'A']
 ['Sinsel' 98 95 23 'A' 'A']
 ['George' 45 65 89 'C' 'B']
 ['James' 64 32 26 'C' 'F']
 ['Jade' 23 78 47 'F' 'A']
 ['Tommy' 45 98 14 'C' 'A']
 ['Yonus' 78 45 44 'A' 'C']
 ['Randy' 95 12 84 'A' 'F']
 ['Rex' 31 95 98 'F' 'A']
 ['Tucsan' 77 32 56 'A' 'F']
 ['Amell' 87 78 34 'A' 'A']
 ['Vin' 65 45 78 'B' 'C']
 ['Hector' 32 12 78 'F' 'F']]


In [36]:
dataset=np.append(dataset,chemGrades,axis=1)
print(dataset)

[['John' 45 22 98 'C' 'F' 'A']
 ['William' 44 23 2 'S' 'F' 'F']
 ['Henry' 22 89 35 'F' 'A' 'S']
 ['Chris' 88 55 23 'A' 'C' 'F']
 ['Jack' 45 89 45 'C' 'A' 'C']
 ['Stephane' 64 45 48 'C' 'C' 'C']
 ['Harald' 12 12 12 'F' 'F' 'F']
 ['Hearmy' 78 94 65 'A' 'A' 'B']
 ['Sinsel' 98 95 23 'A' 'A' 'F']
 ['George' 45 65 89 'C' 'B' 'A']
 ['James' 64 32 26 'C' 'F' 'F']
 ['Jade' 23 78 47 'F' 'A' 'C']
 ['Tommy' 45 98 14 'C' 'A' 'F']
 ['Yonus' 78 45 44 'A' 'C' 'S']
 ['Randy' 95 12 84 'A' 'F' 'A']
 ['Rex' 31 95 98 'F' 'A' 'A']
 ['Tucsan' 77 32 56 'A' 'F' 'C']
 ['Amell' 87 78 34 'A' 'A' 'F']
 ['Vin' 65 45 78 'B' 'C' 'A']
 ['Hector' 32 12 78 'F' 'F' 'A']]


In [37]:
dataset=np.append(dataset,avg,axis=1)
print(dataset)

[['John' 45 22 98 'C' 'F' 'A' 55.0]
 ['William' 44 23 2 'S' 'F' 'F' 23.0]
 ['Henry' 22 89 35 'F' 'A' 'S' 48.666666666666664]
 ['Chris' 88 55 23 'A' 'C' 'F' 55.333333333333336]
 ['Jack' 45 89 45 'C' 'A' 'C' 59.666666666666664]
 ['Stephane' 64 45 48 'C' 'C' 'C' 52.333333333333336]
 ['Harald' 12 12 12 'F' 'F' 'F' 12.0]
 ['Hearmy' 78 94 65 'A' 'A' 'B' 79.0]
 ['Sinsel' 98 95 23 'A' 'A' 'F' 72.0]
 ['George' 45 65 89 'C' 'B' 'A' 66.33333333333333]
 ['James' 64 32 26 'C' 'F' 'F' 40.666666666666664]
 ['Jade' 23 78 47 'F' 'A' 'C' 49.333333333333336]
 ['Tommy' 45 98 14 'C' 'A' 'F' 52.333333333333336]
 ['Yonus' 78 45 44 'A' 'C' 'S' 55.666666666666664]
 ['Randy' 95 12 84 'A' 'F' 'A' 63.666666666666664]
 ['Rex' 31 95 98 'F' 'A' 'A' 74.66666666666667]
 ['Tucsan' 77 32 56 'A' 'F' 'C' 55.0]
 ['Amell' 87 78 34 'A' 'A' 'F' 66.33333333333333]
 ['Vin' 65 45 78 'B' 'C' 'A' 62.666666666666664]
 ['Hector' 32 12 78 'F' 'F' 'A' 40.666666666666664]]


In [39]:
np.save('new-dataset',dataset)
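
Because `dataset` mixes strings and numbers, it has dtype `object`, and `np.save` stores such arrays using pickle. Reading the file back therefore needs `allow_pickle=True`. The sketch below demonstrates this with a small made-up array and a temporary folder instead of the workshop files.

```python
import os
import tempfile
import numpy as np

records = np.array([['John', 45, 'C'],
                    ['William', 44, 'S']], dtype=object)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'records.npy')
    np.save(path, records)                       # object arrays are stored via pickle
    restored = np.load(path, allow_pickle=True)  # required for object arrays

print(restored[0, 0], restored[0, 1])  # John 45
```

Only load pickled files from sources you trust, since unpickling can execute arbitrary code.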

In [41]:
np.savetxt("dataset-new.csv", dataset, delimiter=",")

TypeError: Mismatch between array dtype ('object') and format specifier ('%.18e,%.18e,%.18e,%.18e,%.18e,%.18e,%.18e,%.18e')
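
The `TypeError` above arises because `np.savetxt` defaults to a floating-point format (`fmt='%.18e'`), which cannot format the strings in an object array. One possible workaround, sketched here with a small made-up array and an in-memory buffer standing in for a file, is to pass `fmt="%s"` so that each value is written using its string form.

```python
import io
import numpy as np

mixed = np.array([['John', 45, 55.0],
                  ['William', 44, 23.0]], dtype=object)

buffer = io.StringIO()          # in-memory stand-in for a file on disk
np.savetxt(buffer, mixed, delimiter=",", fmt="%s")
print(buffer.getvalue())
```

This writes the values but no header row, so column names still have to be handled separately.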

In [46]:
df=pd.DataFrame(data=dataset)
print(df)

           0   1   2   3  4  5  6        7
0       John  45  22  98  C  F  A       55
1    William  44  23   2  S  F  F       23
2      Henry  22  89  35  F  A  S  48.6667
3      Chris  88  55  23  A  C  F  55.3333
4       Jack  45  89  45  C  A  C  59.6667
5   Stephane  64  45  48  C  C  C  52.3333
6     Harald  12  12  12  F  F  F       12
7     Hearmy  78  94  65  A  A  B       79
8     Sinsel  98  95  23  A  A  F       72
9     George  45  65  89  C  B  A  66.3333
10     James  64  32  26  C  F  F  40.6667
11      Jade  23  78  47  F  A  C  49.3333
12     Tommy  45  98  14  C  A  F  52.3333
13     Yonus  78  45  44  A  C  S  55.6667
14     Randy  95  12  84  A  F  A  63.6667
15       Rex  31  95  98  F  A  A  74.6667
16    Tucsan  77  32  56  A  F  C       55
17     Amell  87  78  34  A  A  F  66.3333
18       Vin  65  45  78  B  C  A  62.6667
19    Hector  32  12  78  F  F  A  40.6667


In [48]:
df=pd.DataFrame(data=dataset,columns=["name","maths","phy","chem","m-grade","p-grade","c-grade","avg"])
print(df)

        name maths phy chem m-grade p-grade c-grade      avg
0       John    45  22   98       C       F       A       55
1    William    44  23    2       S       F       F       23
2      Henry    22  89   35       F       A       S  48.6667
3      Chris    88  55   23       A       C       F  55.3333
4       Jack    45  89   45       C       A       C  59.6667
5   Stephane    64  45   48       C       C       C  52.3333
6     Harald    12  12   12       F       F       F       12
7     Hearmy    78  94   65       A       A       B       79
8     Sinsel    98  95   23       A       A       F       72
9     George    45  65   89       C       B       A  66.3333
10     James    64  32   26       C       F       F  40.6667
11      Jade    23  78   47       F       A       C  49.3333
12     Tommy    45  98   14       C       A       F  52.3333
13     Yonus    78  45   44       A       C       S  55.6667
14     Randy    95  12   84       A       F       A  63.6667
15       Rex    31  95  

In [50]:
df.to_csv('new-dataset.csv',index=False)