**Module-1:**

# Numpy-Basics

**NumPy** is a general-purpose array-processing package. It provides a high-performance multidimensional array object and tools for working with these arrays. It is the fundamental package for scientific computing with Python. Using **NumPy**, you can perform mathematical and logical operations on arrays.

In this module we will see the basics and various functions of **Numpy**.

An explanation accompanies each code cell.

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## 1.Importing Libraries

Let's start by importing the libraries:

In [2]:
import sys
import numpy as np
print('Modules Imported')

Modules Imported


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 2.Basic Numpy Arrays

We can create a numpy array by using the function **np.array()**

In [4]:
np.array([10, 20, 30, 40]) 

array([10, 20, 30, 40])

Let's generate and store some **numpy arrays** in variables.

In [7]:
a = np.array([10, 20, 30, 40])
b = np.array([0, .25, 10, 5.5, 20])

In [6]:
a[0], a[1] # Shows the first and second elements of the array

(10, 20)

In [42]:
a[0:] # Shows the elements from index 0 to the end

array([10, 20, 30, 40])

In [43]:
a[1:3] # Shows the elements in the given range (index 1 up to, but not including, index 3)

array([20, 30])

In [44]:
a[1:-1]

array([20, 30])

In [49]:
a[::3] # Here 3 is the step: every third element is shown, starting from the first

array([10, 40])

In [50]:
b

array([ 0.  ,  0.25, 10.  ,  5.5 , 20.  ])

In [51]:
b[0], b[2], b[-1]

(0.0, 10.0, 20.0)

In [52]:
b[[0, 2, -1]]

array([ 0., 10., 20.])
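
The indexing forms above can be gathered into one small sketch. The variable names (`first_two`, `every_other`, `reversed_a`, `picked`) are illustrative and not part of the original cells.

```python
import numpy as np

a = np.array([10, 20, 30, 40])

first_two = a[:2]      # positions 0 and 1 (the stop index is excluded)
every_other = a[::2]   # start at position 0 and move in steps of 2
reversed_a = a[::-1]   # a negative step walks the array backwards
picked = a[[0, -1]]    # a list of positions returns exactly those elements

print(first_two)    # [10 20]
print(every_other)  # [10 30]
print(reversed_a)   # [40 30 20 10]
print(picked)       # [10 40]
```

The general slice form is `start:stop:step`, and any of the three parts can be left out.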

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 3.Array Types

We can see the type of data stored in a **numpy array** through its **dtype** attribute. The data type may be **integer** or **float**, and strings and Python objects are also possible.

In [8]:
a

array([10, 20, 30, 40])

In [9]:
a.dtype 

dtype('int64')

In [56]:
b

array([ 0.  ,  0.25, 10.  ,  5.5 , 20.  ])

In [57]:
b.dtype

dtype('float64')

In [3]:
np.array([1, 2, 3, 4], dtype=float) # Represents the data as float (the np.float alias was removed in NumPy 1.24)

array([1., 2., 3., 4.])

In [60]:
np.array([1, 2, 3, 4], dtype=np.int8) # Represents the data as 8-bit integers

array([1, 2, 3, 4], dtype=int8)

In [61]:
c = np.array(['a', 'b', 'c']) 

In [64]:
c.dtype # '<U1' means a little-endian Unicode string of 1 character

dtype('<U1')

In [65]:
d = np.array([{'a': 1}, sys])

In [66]:
d.dtype # 'O'     (Python) objects

dtype('O')
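
As a rough illustration of how the data type is chosen, the sketch below mixes integers with one float and then converts the result. The `astype` method does not appear in the cells above; it is included here as one common way to change the type of an existing array.

```python
import numpy as np

mixed = np.array([1, 2, 3.5])    # a single float makes the whole array float
print(mixed.dtype)               # float64

as_int = mixed.astype(np.int32)  # astype returns a converted copy; fractions are truncated
print(as_int)                    # [1 2 3]
print(as_int.dtype)              # int32

print(mixed)                     # [1.  2.  3.5]  (the original array is unchanged)
```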

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 4.Dimension & Shapes

We can generate matrices using **Numpy**. We can also find out their dimensions and size using the appropriate attributes.

In [67]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])

In [22]:
A.shape # A tuple showing the size of the matrix (2 rows x 3 columns)

(2, 3)

In [24]:
A.ndim # The number of dimensions of the matrix

2

In [26]:
A.size # The total number of elements in the matrix

6

In [68]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])

In [28]:
B

array([[[12, 11, 10],
        [ 9,  8,  7]],

       [[ 6,  5,  4],
        [ 3,  2,  1]]])

In [30]:
B.shape # Shape: 2 blocks, each with 2 rows and 3 columns

(2, 2, 3)

In [31]:
B.ndim

3

In [32]:
B.size

12

In [33]:
C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]
    ]
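], dtype=object) # Ragged nesting needs dtype=object; NumPy 1.24 and later raise a ValueError without it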

In [34]:
C.dtype 

dtype('O')

In [38]:
C.shape # If the nested shapes aren't consistent, the elements are stored as regular Python objects (here, lists)

(2,)

In [36]:
C.size

2

In [37]:
type(C[0])

list
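
For a regular array such as `B` above, the three attributes are tied together: `ndim` is the length of `shape`, and `size` is the product of the entries of `shape`. A minimal check, rebuilding `B` so the snippet runs on its own:

```python
import numpy as np

B = np.array([
    [[12, 11, 10], [9, 8, 7]],
    [[6, 5, 4], [3, 2, 1]],
])

print(B.shape)  # (2, 2, 3)
print(B.ndim)   # 3
print(B.size)   # 12

shape_length = len(B.shape)            # number of axes
shape_product = int(np.prod(B.shape))  # 2 * 2 * 3
print(shape_length == B.ndim)          # True
print(shape_product == B.size)         # True
```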

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 5.Indexing & Slicing of Matrices

In this section we'll see how to index and slice specific elements in a matrix using **Numpy**

In [69]:
# Square matrix
A = np.array([
#.   0. 1. 2
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])

In [70]:
A[1] # Prints 2nd row

array([4, 5, 6])

In [71]:
A[1][0] #Prints element of the 2nd row and 1st column

4

In [72]:
A[1, 0] 

4

In [76]:
A[0:2]  # Prints the first 2 rows (index 0 up to, but not including, index 2)

array([[1, 2, 3],
       [4, 5, 6]])

In [88]:
A[:, :2] # Prints all rows and the first 2 columns; a bare ':' selects everything along that axis

array([[1, 2],
       [4, 5],
       [7, 8]])

In [91]:
A[:2, :2] # Prints the first 2 rows and the first 2 columns

array([[1, 2],
       [4, 5]])

In [92]:
A[:2, 2:] # Prints first two rows and skips first two columns

array([[3],
       [6]])

In [93]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [96]:
A[1] = np.array([10, 10, 10]) # Replaces the elements of the 2nd row

In [97]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [100]:
A[2] = 99  # Replaces all elements of the 3rd row

In [99]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])
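
The same row techniques carry over to columns by putting `:` in the row position. The sketch below is an illustration only. The `.copy()` call is an addition so that the saved column keeps its values after the matrix is modified, since a plain slice would otherwise reflect later changes to `A`.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

second_column = A[:, 1].copy()  # all rows, column 1, copied out of A
print(second_column)            # [2 5 8]

A[:, 1] = 0                     # assign one value to the whole column
print(A)
# [[1 0 3]
#  [4 0 6]
#  [7 0 9]]
```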

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 6.Summary Statistics

We can calculate statistical values like **mean**, **sum**, **standard deviation**, etc. using **Numpy**.

In [3]:
a = np.array([11, 21, 31, 41])

In [4]:
a.sum()

104

In [5]:
a.mean()

26.0

In [6]:
a.std()

11.180339887498949

In [7]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [8]:
A.mean()

5.0

In [9]:
A.std()

2.581988897471611

In [10]:
A.sum(axis=0) # axis=0 sums down the rows, giving one total per column

array([12, 15, 18])

In [11]:
A.sum(axis=1) # axis=1 sums across the columns, giving one total per row

array([ 6, 15, 24])

In [12]:
A.mean(axis=0)

array([4., 5., 6.])

In [13]:
A.mean(axis=1)

array([2., 5., 8.])

In [14]:
A.std(axis=0)

array([2.44948974, 2.44948974, 2.44948974])

In [15]:
A.std(axis=1)

array([0.81649658, 0.81649658, 0.81649658])
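
To see where the `std()` values come from, the standard deviation can be rebuilt step by step. By default NumPy divides by the number of elements (the population standard deviation). The intermediate names below are illustrative.

```python
import numpy as np

a = np.array([11, 21, 31, 41])

mean = a.sum() / a.size              # 104 / 4
variance = ((a - mean) ** 2).mean()  # average squared distance from the mean
std = np.sqrt(variance)

print(mean)                          # 26.0
print(variance)                      # 125.0
print(np.isclose(std, a.std()))      # True
```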

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 7.Broadcasting & Vectorized Operations

In [19]:
a = np.arange(5)

In [18]:
a

array([0, 1, 2, 3, 4])

The **np.arange()** function returns an ndarray containing evenly spaced values within a given range.

In [20]:
b = np.arange(22,44,3) # np.arange(start, stop, step); the stop value is excluded

In [21]:
b

array([22, 25, 28, 31, 34, 37, 40, 43])

In [22]:
a + 10 # Adds 10 to each value of a

array([10, 11, 12, 13, 14])

In [23]:
a * 10 # Multiplies each value of a by 10

array([ 0, 10, 20, 30, 40])

In [None]:
a

In [24]:
a += 100

In [25]:
a

array([100, 101, 102, 103, 104])

In [26]:
l = [0, 1, 2, 3] 

In [27]:
[i * 10 for i in l] # Multiplies each value in l by 10

[0, 10, 20, 30]

In [28]:
a = np.arange(4)

In [29]:
a

array([0, 1, 2, 3])

In [30]:
b = np.array([10, 10, 10, 10])

In [31]:
b

array([10, 10, 10, 10])

In [32]:
a + b

array([10, 11, 12, 13])

In [33]:
a * b

array([ 0, 10, 20, 30])
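
The examples above combine an array with a scalar or with an array of the same shape. Broadcasting also applies when the shapes differ but are compatible. The following sketch, with illustrative names `row` and `col`, stretches a 1-D row and a single column across a 2-D matrix.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
row = np.array([10, 20, 30])    # shape (3,): added to every row of A
col = np.array([[100], [200]])  # shape (2, 1): added to every column of A

row_result = A + row
col_result = A + col

print(row_result)
# [[11 22 33]
#  [14 25 36]]
print(col_result)
# [[101 102 103]
#  [204 205 206]]
```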

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 8.Boolean Arrays

Let's see some applications of Boolean arrays in **Numpy**.

In [34]:
a = np.arange(4)

In [35]:
a

array([0, 1, 2, 3])

In [36]:
a[0], a[-1]

(0, 3)

In [37]:
a[[0, -1]]

array([0, 3])

In [38]:
a[[True, False, False, True]] # Selects only the elements of a whose position is marked True

array([0, 3])

In [39]:
a

array([0, 1, 2, 3])

In [41]:
a >= 2 # Returns True for each value of a that is greater than or equal to 2

array([False, False,  True,  True])

In [42]:
a[a >= 2] # Returns the values of a that are greater than or equal to 2

array([2, 3])

In [43]:
a.mean()

1.5

In [44]:
a[a > a.mean()]

array([2, 3])

In [45]:
a[~(a > a.mean())] # Returns the values of a that are not greater than the mean of a

array([0, 1])

In [47]:
a[(a == 0) | (a == 1)] # Returns the values of a that are equal to 0 or 1

array([0, 1])

In [49]:
a[(a <= 2) & (a % 2 == 0)] # Returns the values that are less than or equal to 2 and divisible by 2

array([0, 2])

In [54]:
A = np.random.randint(100, size=(3, 3)) # Generates a random 3x3 matrix of integers from 0 to 99

In [55]:
A

array([[12, 89, 57],
       [45, 67, 80],
       [ 3, 53, 52]])

In [56]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]

array([12, 57, 67,  3, 52])

In [57]:
A > 30

array([[False,  True,  True],
       [ True,  True,  True],
       [False,  True,  True]])

In [58]:
A[A > 30]

array([89, 57, 45, 67, 80, 53, 52])
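
A Boolean mask can also be counted and used on the left-hand side of an assignment. The sketch below reuses the values of the random matrix shown above as a fixed array so that the result is reproducible. The name `mask` is illustrative.

```python
import numpy as np

A = np.array([
    [12, 89, 57],
    [45, 67, 80],
    [3, 53, 52]
])

mask = A > 30
count_above = int(mask.sum())  # True counts as 1, so this counts the matches
print(count_above)             # 7

A[~mask] = 0                   # overwrite every element that is not above 30
print(A)
# [[ 0 89 57]
#  [45 67 80]
#  [ 0 53 52]]
```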

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 9.Linear Algebra

Some **Linear Algebra** applications using **Numpy**:

In [59]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

In [60]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

In [61]:
A.dot(B)

array([[20, 14],
       [56, 41],
       [92, 68]])

In [62]:
A @ B

array([[20, 14],
       [56, 41],
       [92, 68]])

In [63]:
B.T

array([[6, 4, 2],
       [5, 3, 1]])

In [64]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [65]:
B.T @ A

array([[36, 48, 60],
       [24, 33, 42]])
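
To make the matrix product less of a black box, the sketch below recomputes one entry of `A @ B` by hand and checks the transpose rule `(A @ B).T == B.T @ A.T`. The names `product` and `entry_00` are illustrative.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])

product = A @ B

# Entry (0, 0) is the first row of A multiplied element-wise with the
# first column of B and then summed: 1*6 + 2*4 + 3*2.
entry_00 = sum(A[0, k] * B[k, 0] for k in range(3))
print(entry_00, product[0, 0])  # 20 20

# Transposing a product reverses the order of the factors.
transpose_rule_holds = np.array_equal(product.T, B.T @ A.T)
print(transpose_rule_holds)     # True
```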

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 10.System Performance

A question might arise: why do we need **Numpy** if we can store values in Python lists or dictionaries? We can see the difference in their memory use and speed.

**Numpy arrays** take up less memory and run faster.

In [66]:
# An integer in Python takes more than 24 bytes
sys.getsizeof(1)

28

In [67]:
# Larger integers take even more space
sys.getsizeof(10**100)

72

In [68]:
# Numpy size is much smaller
np.dtype(int).itemsize

8

In [69]:
# Smaller integer types take even less space
np.dtype(np.int8).itemsize

1

In [70]:
np.dtype(float).itemsize

8

In [71]:
# A one-element list
sys.getsizeof([1])

64

In [72]:
# An array of one element in numpy
np.array([1]).nbytes

8

In [73]:
l = list(range(100000))

In [74]:
a = np.arange(100000)

In [75]:
%time np.sum(a ** 2)

CPU times: user 2.23 ms, sys: 517 µs, total: 2.74 ms
Wall time: 1.53 ms


333328333350000

In [76]:
%time sum([x ** 2 for x in l])

CPU times: user 34.1 ms, sys: 4.28 ms, total: 38.4 ms
Wall time: 39.2 ms


333328333350000
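
The `%time` magic only works inside IPython or Jupyter. Outside a notebook, a similar comparison can be sketched with the standard-library `timeit` module. The function names and the repeat count of 10 are arbitrary choices for illustration, and the timings will differ from machine to machine, so no expected timings are shown.

```python
import timeit

import numpy as np

n = 100_000
values_list = list(range(n))
# int64 is requested explicitly so the sum cannot overflow on platforms
# where the default integer is 32 bits wide.
values_array = np.arange(n, dtype=np.int64)


def python_sum_of_squares():
    """Sum of squares using a plain Python loop over a list."""
    return sum(x ** 2 for x in values_list)


def numpy_sum_of_squares():
    """Sum of squares using vectorized NumPy operations."""
    return int(np.sum(values_array ** 2))


list_time = timeit.timeit(python_sum_of_squares, number=10)
array_time = timeit.timeit(numpy_sum_of_squares, number=10)

print(python_sum_of_squares() == numpy_sum_of_squares())  # True
print(f"list: {list_time:.4f} s, array: {array_time:.4f} s")
```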

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 11.Useful Numpy Functions

Let's see some of the important **Numpy** functions:

### random function

In [89]:
np.random.random(size=2) # Creates an array of 2 random values drawn uniformly from [0, 1)

array([0.37850587, 0.23713147])

In [88]:
np.random.normal(size=2) # Creates an array of 2 random values drawn from the standard normal (Gaussian) distribution

array([-0.0290857,  0.1714271])

In [79]:
np.random.rand(2, 4) # Returns uniform random numbers in an array of shape (2, 4), i.e. 2 rows and 4 columns

array([[0.59897236, 0.30840274, 0.58102975, 0.32843094],
       [0.4054352 , 0.15695813, 0.80142212, 0.96106198]])
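
The random functions above give different numbers on every run. When repeatable results are needed, one option is a seeded generator. The sketch below uses `np.random.default_rng`, which is not used in the cells above and is shown only as an alternative. The seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

sample_a = rng_a.random(size=3)  # uniform values in [0, 1)
sample_b = rng_b.random(size=3)
print(np.array_equal(sample_a, sample_b))  # True

# Normal (Gaussian) samples with an explicit mean, spread and shape.
normal_sample = rng_a.normal(loc=0.0, scale=1.0, size=(2, 4))
print(normal_sample.shape)  # (2, 4)
```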

### reshape function

In [92]:
np.arange(10).reshape(2, 5) # Arranges the values 0 to 9 into an array of shape (2, 5)

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [93]:
np.arange(10).reshape(5, 2)

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])
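
`reshape` only works when the new shape holds exactly the same number of elements. One dimension may be given as `-1`, in which case NumPy works it out from the others. A small sketch with illustrative names:

```python
import numpy as np

a = np.arange(10)

two_rows = a.reshape(2, -1)  # -1 lets NumPy infer the 5 columns
print(two_rows.shape)        # (2, 5)

flat = two_rows.reshape(-1)  # back to one dimension
print(flat)                  # [0 1 2 3 4 5 6 7 8 9]

# 10 elements cannot fill a 3 x 4 array, so this raises a ValueError.
reshape_failed = False
try:
    a.reshape(3, 4)
except ValueError as error:
    reshape_failed = True
    print("reshape failed:", error)
```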

### linspace function

In [97]:
np.linspace(0, 1, 5) #np.linspace(start,end,number of values)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [96]:
np.linspace(0, 1, 20)

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

In [100]:
np.linspace(0, 1, 20, False) # The fourth argument is endpoint; False excludes the stop value

array([0.  , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 ,
       0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95])
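
The spacing that `linspace` produces follows directly from its arguments: with the endpoint included, the step is `(stop - start) / (num - 1)`; with `endpoint=False` it is `(stop - start) / num`. A short check, using illustrative variable names:

```python
import numpy as np

start, stop, num = 0, 1, 5

with_end = np.linspace(start, stop, num)
without_end = np.linspace(start, stop, num, endpoint=False)

print(with_end)     # [0.   0.25 0.5  0.75 1.  ]
print(without_end)  # [0.  0.2 0.4 0.6 0.8]

step_with_end = (stop - start) / (num - 1)  # 0.25
step_without_end = (stop - start) / num     # 0.2
print(np.allclose(np.diff(with_end), step_with_end))        # True
print(np.allclose(np.diff(without_end), step_without_end))  # True
```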

### zeros, ones & empty functions

In [102]:
np.zeros(5) # prints 5 zero values

array([0., 0., 0., 0., 0.])

In [103]:
np.zeros((3, 3)) # prints 0 values of size(3,3)

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [104]:
np.zeros((3, 3), dtype=int) # Returns zeros of integer datatype (the np.int alias was removed in NumPy 1.24)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

In [106]:
np.ones(5) # prints 5 one values

array([1., 1., 1., 1., 1.])

In [108]:
np.ones((3, 3)) # prints one values of size(3,3)

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

#### numpy.empty(shape, dtype=float, order='C')

Returns a new array of the given shape and type without initializing its entries, so it holds whatever values were already in memory (not random numbers). Unlike zeros, empty does not set the array values to zero, and may therefore be marginally faster.

In [109]:
np.empty(5)

array([1., 1., 1., 1., 1.])

In [112]:
np.empty((2, 2))

array([[0.25, 0.5 ],
       [0.75, 1.  ]])
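
Because the contents of an `empty` array are arbitrary, as the leftover values in the outputs above suggest, it is safest to fill the array before reading from it. The sketch below shows one way to do that and compares the result with `np.full`, which is not used in the cells above. The fill value 7.0 is arbitrary.

```python
import numpy as np

buffer = np.empty((2, 3))  # contents are arbitrary until assigned
buffer[:] = 7.0            # overwrite every element in place
print(buffer)
# [[7. 7. 7.]
#  [7. 7. 7.]]

same = np.full((2, 3), 7.0)  # allocate and fill in one call
print(np.array_equal(buffer, same))  # True
```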

### Identity and eye function

In [117]:
np.identity(3) #Prints an identity matrix

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [119]:
np.eye(3, 3) #Prints identity matrix of size(3,3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [121]:
np.eye(8, 4) # Returns an 8x4 matrix with ones on the main diagonal and zeros elsewhere

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [124]:
np.eye(8, 4, k=1) # k selects the diagonal; k=1 is the first diagonal above the main one, so the ones start in the 2nd column

array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [125]:
np.eye(8, 4, k=-3) # A negative k selects a diagonal below the main one; here the ones start in the 4th row

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])
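
A short sketch of how `eye` relates to `identity` and how the `k` offset moves the diagonal. The 4x4 size and the names `upper` and `lower` are chosen only for illustration.

```python
import numpy as np

# For a square matrix with k=0, eye and identity give the same result.
print(np.array_equal(np.eye(3), np.identity(3)))  # True

upper = np.eye(4, k=1)   # ones on the first diagonal above the main one
print(upper[0, 1], upper[1, 2], upper[2, 3])      # 1.0 1.0 1.0
print(upper.sum())                                # 3.0

lower = np.eye(4, k=-1)  # ones on the first diagonal below the main one
print(np.array_equal(lower, upper.T))             # True
```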

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

In this module we have seen the basics of Numpy and some of its important functions. However, Numpy has far more syntax and functions than can be covered in a single module.

## 12.More Resources

[Numpy Documentation](https://numpy.org/doc/)

[Numpy Tutorial](https://www.tutorialspoint.com/numpy/index.htm)

[Python Numpy Tutorial by Edureka](https://www.edureka.co/blog/python-numpy-tutorial/)

**Disclaimer:** Not all of the content in this module was created by me; some of the code is compiled from multiple online courses that I have taken.

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

[Next Module: Python Pandas](https://github.com/ffarhaaan/Data-Visualization-Using-Python-Libraries/blob/master/M02-1-pandas-series.ipynb)