# Numpy

**Objective:** My aim for this notebook is to learn the basics of Python's NumPy library.

Before doing anything else, we need to import the NumPy library.

In [1]:
import numpy as np

Now we can try some basic NumPy functions.
Let's start by creating a NumPy array.

In [2]:
a1= np.array([1,2,3,4])
print(a1)

[1 2 3 4]


The example above uses only a few elements.
To create a larger array, say one with a hundred elements ranging from 0 to 99, we can use another NumPy function, *arange()*, which is similar to Python's built-in range() function.

In [3]:
a2= np.arange(100)
print(a2)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]



## Difference between python list and numpy array

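The main benefits of using a NumPy array over a Python list are:
1. NumPy array operations are faster than the equivalent list operations
2. A NumPy array occupies less memory
3. NumPy arrays are more convenient: arithmetic applies to whole arrays without explicit loops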

### Numpy array vs list speed  
To show that NumPy array operations are faster than list operations, let's import the time library and measure how long each takes to add two sequences of 1,000,000 elements.

In [4]:
import time

In [5]:
size= 1000000

l1= list(range(size))
l2= list(range(size))

arr1= np.arange(size)
arr2= np.arange(size)

We have created two Python lists, l1 and l2, with a million elements each, and two NumPy arrays of the same size.  
Now let's add l1 and l2 element by element and measure the time taken.

In [6]:
start= time.time()
result= [(a + b) for a, b in zip(l1,l2)]
stop= time.time()
timeTaken= (stop- start)* 1000
print("Time taken to add two lists",timeTaken,"ms")

Time taken to add two lists 85.19792556762695 ms


Now let's measure the time required to add two NumPy arrays of the same size as the lists.

In [7]:
start= time.time()
result= arr1+ arr2
stop= time.time()
timeTaken= (stop- start)* 1000
print("Time taken to add two numpy arrays is ", timeTaken, "ms")

Time taken to add two numpy arrays is  14.988899230957031 ms
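
A single `time.time()` measurement can be noisy. As a rough sketch, the standard-library `timeit` module can repeat each operation and keep the best run. The smaller size and the repeat counts here are arbitrary assumptions, not values from the run above, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

size = 100_000  # assumption: smaller than above so the sketch runs quickly
l1, l2 = list(range(size)), list(range(size))
arr1, arr2 = np.arange(size), np.arange(size)

# Run each addition 10 times per trial, do 3 trials, and keep the best trial.
list_time = min(timeit.repeat(lambda: [a + b for a, b in zip(l1, l2)],
                              number=10, repeat=3))
array_time = min(timeit.repeat(lambda: arr1 + arr2, number=10, repeat=3))

print(f"lists : {list_time / 10 * 1000:.3f} ms per addition")
print(f"arrays: {array_time / 10 * 1000:.3f} ms per addition")
```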


### Numpy array vs list space   
Now let's look at the amount of memory that a list and an array need to store the same data.

In [8]:
import sys

In [9]:
print("Size of list l1   : ", sys.getsizeof(1)* len(l1))
print("Size of array arr1: ", arr1.size* arr1.itemsize)

Size of list l1   :  28000000
Size of array arr1:  4000000


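So the list clearly occupies more memory: roughly 28 bytes per integer object versus 4 bytes per array element. Note that this estimate counts only the integer objects and leaves out the list's own array of references, so the list's real footprint is likely somewhat larger.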
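
The comparison can be made a little more directly. The sketch below adds the list container's own size to the size of every integer it references, and reads the array's buffer size from `nbytes`. The size of 1000 is an arbitrary choice, exact numbers depend on the platform and Python version, and CPython shares some small integer objects, so the list figure is only an estimate.

```python
import sys

import numpy as np

size = 1000  # assumption: small size, just for illustration
numbers = list(range(size))
array = np.arange(size)

# List: the container's reference array plus every integer object it points to.
list_bytes = sys.getsizeof(numbers) + sum(sys.getsizeof(n) for n in numbers)
# Array: nbytes is the size of the raw element buffer (size * itemsize).
array_bytes = array.nbytes

print("list :", list_bytes, "bytes")
print("array:", array_bytes, "bytes")
```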

### 2D array

In [10]:
a2d= np.array([[1,2,3],[4,5,6],[7,8,9]])
print(a2d)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


### Some numpy attributes and functions:  
1. ndim
2. dtype
3. itemsize
4. arange
5. linspace
6. size
7. shape
8. reshape
9. ravel
10. min
11. max
12. sum
13. sqrt
14. std

**ndim** attribute returns the number of dimensions of the array

In [11]:
print(a2d.ndim)

2


**dtype** attribute returns the data type of the array elements

In [12]:
print(a2d.dtype)

int32


**itemsize** attribute returns the size of each array element in bytes

In [13]:
print(a2d.itemsize)

4


We can also initialize an array with a different data type

In [14]:
a2d= np.array([[1,2,3],[4,5,6],[7,8,9]], dtype= float)
print(a2d)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]


In [15]:
print(a2d.dtype)

float64


In [16]:
print(a2d.itemsize)

8


**size** attribute returns the total number of elements in the array

In [17]:
print(a2d.size)

9


**shape** attribute returns the length of each dimension; for a 2D array, that is the number of rows and columns

In [18]:
print(a2d.shape)

(3, 3)


**linspace** function generates linearly spaced elements between two numbers, including both endpoints by default

In [19]:
arr3= np.linspace(1,5,10)
print(arr3)

[1.         1.44444444 1.88888889 2.33333333 2.77777778 3.22222222
 3.66666667 4.11111111 4.55555556 5.        ]


In [20]:
arr4= np.arange(1,5,0.5)
print(arr4)

[1.  1.5 2.  2.5 3.  3.5 4.  4.5]


### The difference between linspace and arange  
With linspace, the third argument is the number of elements, and the function returns that many linearly spaced values across the given range, including the end value.  
With arange, the third argument is the step size, and the end value is excluded, as the examples above show.
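
As a small side-by-side illustration (the range 0 to 1 is chosen here only for the example), the same interval can be covered by count or by step:

```python
import numpy as np

by_count = np.linspace(0, 1, 5)    # 5 values, end value 1 included
by_step = np.arange(0, 1, 0.25)    # step 0.25, end value 1 excluded

print(by_count)  # [0.   0.25 0.5  0.75 1.  ]
print(by_step)   # [0.   0.25 0.5  0.75]
```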

`reshape()` function returns an array with the same elements arranged in a different shape; the total number of elements must stay the same

In [21]:
print(a2d.reshape(9, 1))

[[1.]
 [2.]
 [3.]
 [4.]
 [5.]
 [6.]
 [7.]
 [8.]
 [9.]]
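
When only one dimension is known, `reshape()` accepts `-1` for the other and works it out from the number of elements. A minimal sketch, using a fresh array rather than `a2d`:

```python
import numpy as np

a = np.arange(1, 10)          # 9 elements
column = a.reshape(-1, 1)     # -1 lets NumPy infer 9 rows
grid = a.reshape(3, -1)       # -1 lets NumPy infer 3 columns

print(column.shape)  # (9, 1)
print(grid.shape)    # (3, 3)
```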


`ravel()` function returns a flattened, one-dimensional array containing all the elements

In [22]:
print(a2d.ravel())

[1. 2. 3. 4. 5. 6. 7. 8. 9.]


*Note:* The original array stays intact; neither reshape() nor ravel() changes its shape

In [23]:
print(a2d)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]
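
One caveat: although the original array keeps its shape, `ravel()` returns a view of the same data when it can, so writing to the result may change the original. `flatten()` always returns a copy. A short sketch with a separate example array:

```python
import numpy as np

a = np.array([[1., 2., 3.], [4., 5., 6.]])
flat_view = a.ravel()      # shares memory with `a` when possible
flat_copy = a.flatten()    # always an independent copy

print(np.shares_memory(a, flat_view))  # True
print(np.shares_memory(a, flat_copy))  # False
```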


**min** function returns the smallest element of the array

In [24]:
print(a2d.min())

1.0


**max** function returns the largest element of the array

In [25]:
print(a2d.max())

9.0


**sum** function returns the sum of all elements if no argument is passed.  
If axis= 0, it returns the sum of each column.  
If axis= 1, it returns the sum of each row.

In [26]:
print(a2d.sum())
print(a2d.sum(axis= 0))    #Sum of each column
print(a2d.sum(axis= 1))    #Sum of each row

45.0
[12. 15. 18.]
[ 6. 15. 24.]
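
The axis argument is easier to follow on a non-square array, where the two results have different lengths. A minimal sketch with a made-up 2x3 array:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 rows, 3 columns

col_sums = m.sum(axis=0)    # collapses the rows: one sum per column
row_sums = m.sum(axis=1)    # collapses the columns: one sum per row

print(col_sums)  # [5 7 9]
print(row_sums)  # [ 6 15]
```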


**sqrt** function returns a new array with the square root of each element; the original array is unchanged

In [27]:
print(np.sqrt(a2d))

[[1.         1.41421356 1.73205081]
 [2.         2.23606798 2.44948974]
 [2.64575131 2.82842712 3.        ]]


In [28]:
print(a2d)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]


**std** function returns the standard deviation of all the array elements (the population standard deviation by default)

In [29]:
print(np.std(a2d))

2.581988897471611
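
To see where that number comes from, the default result can be reproduced by hand from the population formula $\sigma = \sqrt{\frac{1}{N}\sum_i (x_i - \mu)^2}$, where $\mu$ is the mean. The sketch below rebuilds the same 1 to 9 values rather than reusing `a2d`:

```python
import numpy as np

a = np.arange(1, 10, dtype=float).reshape(3, 3)

# Square root of the mean squared deviation from the mean.
manual = np.sqrt(np.mean((a - a.mean()) ** 2))

print(manual)
print(np.isclose(manual, np.std(a)))  # True
```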


## Slicing arrays

Slicing an array is similar to slicing a list

In [32]:
arr1= np.array([23,12,64])
print(arr1[0:2])              #index 2 is not included

[23 12]


### Slicing 2D array

In [33]:
a2d= np.array([[7,8,9],[3,4,7],[2,9,4]])
print(a2d)

[[7 8 9]
 [3 4 7]
 [2 9 4]]


In [35]:
print(a2d[0])   #Prints 0th row

[7 8 9]


In [38]:
print(a2d[:,0]) #Prints 0th column

[7 3 2]


In [41]:
print(a2d[1,2]) #Prints the element at row index 1, column index 2
                #Note: Python indices start from 0

7


In [42]:
print(a2d[1:3,1:3]) #Prints rows 1-2 and columns 1-2 (index 3 is excluded)

[[4 7]
 [9 4]]
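
Unlike list slices, basic NumPy slices are views into the original array, so assigning to a slice changes the original. The sketch below shows this and uses `.copy()` when an independent array is wanted; the values 100 and -5 are arbitrary.

```python
import numpy as np

a = np.array([[7, 8, 9], [3, 4, 7], [2, 9, 4]])

block = a[1:3, 1:3]        # a view into `a`, not a copy
block[0, 0] = 100          # writes through to a[1, 1]
print(a[1, 1])             # 100

safe = a[1:3, 1:3].copy()  # independent copy
safe[0, 0] = -5
print(a[1, 1])             # still 100
```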


### Stacking arrays

In [44]:
arr1= np.arange(6).reshape(2,3)
arr2= np.arange(6,12).reshape(2,3)
print(arr1, "\n")
print(arr2)

[[0 1 2]
 [3 4 5]] 

[[ 6  7  8]
 [ 9 10 11]]


**hstack**: horizontally stacks two arrays

In [47]:
print(np.hstack((arr1, arr2)))

[[ 0  1  2  6  7  8]
 [ 3  4  5  9 10 11]]


**vstack**: vertically stacks two arrays

In [48]:
print(np.vstack((arr1,arr2)))

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
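
For 2D arrays like these, both stacking functions can also be written with `np.concatenate` and an explicit axis. A brief sketch that checks the equivalence:

```python
import numpy as np

arr1 = np.arange(6).reshape(2, 3)
arr2 = np.arange(6, 12).reshape(2, 3)

side_by_side = np.concatenate((arr1, arr2), axis=1)  # same result as hstack
stacked = np.concatenate((arr1, arr2), axis=0)       # same result as vstack

print(np.array_equal(side_by_side, np.hstack((arr1, arr2))))  # True
print(np.array_equal(stacked, np.vstack((arr1, arr2))))       # True
```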


### Splitting arrays

In [49]:
arr= np.arange(40).reshape(4,10)
print(arr)

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]
 [20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]]


In [53]:
x= np.hsplit(arr, 2)
print(x[0])
print("\n")
print(x[1])

[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]]


[[ 5  6  7  8  9]
 [15 16 17 18 19]
 [25 26 27 28 29]
 [35 36 37 38 39]]


In [55]:
y= np.vsplit(arr, 2)
print(y[0], "\n")
print(y[1])

[[ 0  1  2  3  4  5  6  7  8  9]
 [10 11 12 13 14 15 16 17 18 19]] 

[[20 21 22 23 24 25 26 27 28 29]
 [30 31 32 33 34 35 36 37 38 39]]
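
In the cells above, `hsplit` cuts the array into equal groups of columns and `vsplit` into equal groups of rows. As a further sketch, `hsplit` also accepts a list of column indices to cut at, and `np.array_split` allows pieces of unequal size. The cut points 3 and 7 are arbitrary choices for illustration.

```python
import numpy as np

arr = np.arange(40).reshape(4, 10)

left, middle, right = np.hsplit(arr, [3, 7])    # cut before columns 3 and 7
print(left.shape, middle.shape, right.shape)    # (4, 3) (4, 4) (4, 3)

parts = np.array_split(arr, 3, axis=1)          # uneven split is allowed here
print([p.shape for p in parts])                 # [(4, 4), (4, 3), (4, 3)]
```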


### Boolean arrays

In [58]:
a= np.arange(12).reshape(3,4)
print(a)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [59]:
b= a>4
print(b)

[[False False False False]
 [False  True  True  True]
 [ True  True  True  True]]


In [60]:
print(a[b])

[ 5  6  7  8  9 10 11]


In [62]:
a[b]= -1
print(a)

[[ 0  1  2  3]
 [ 4 -1 -1 -1]
 [-1 -1 -1 -1]]
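
Assigning through a boolean mask changes the array in place. When the original should be kept, `np.where` builds a new array instead, and masks can be combined with `&`. A minimal sketch, with the thresholds chosen only for illustration:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

replaced = np.where(a > 4, -1, a)   # new array; `a` is left unchanged
print(replaced)
print(a.max())                      # 11

in_range = a[(a > 2) & (a < 9)]     # elements satisfying both conditions
print(in_range)                     # [3 4 5 6 7 8]
```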
