# NumPy

- NumPy is a general-purpose array-processing package.
- NumPy was created to work with multidimensional arrays.
- It is the fundamental package for scientific computing with Python.
- It is an open-source library.

# Why Use NumPy?
- Python lists can serve the purpose of arrays, but they are slow to process for large numeric workloads.

- NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The actual gain depends on the operation and on the size of the array.

- The array object in NumPy is called `ndarray`. It comes with many supporting functions that make it easy to work with.

- Arrays are used very frequently in data science, where speed and memory use are very important.
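
As a rough illustration of the points above, the sketch below runs the same element-wise computation on a plain list and on an `ndarray`. The variable names are illustrative only, and no timing figure is claimed, since the speed-up depends on array size, dtype, and hardware.

```python
import numpy as np

values = list(range(1_000))

# Pure Python: the list comprehension visits every element one by one.
squared_list = [v * v for v in values]

# NumPy: one vectorized expression is applied to the whole array at once.
arr = np.array(values)
squared_arr = arr * arr

# Both approaches produce the same numbers.
print(squared_arr[:5])
print(squared_list == squared_arr.tolist())
```

To measure the difference on your own machine, the two expressions can be timed with the standard-library `timeit` module or the `%timeit` notebook magic.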

In [1]:
# Installing NumPy

!pip install numpy



In [2]:
# Importing the NumPy library

import numpy as np

**Array Data Structure (Homogeneous)**

In [3]:
# Array Creation

x = np.array(10)

In [4]:
type(x)

numpy.ndarray

In [5]:
# Check the number of dimensions of the array

x.ndim

0

In [6]:
# Zero dimensional array

x.ndim

0

In [7]:
# One-dimensional array (note: one pair of square brackets for one dimension, two pairs [[ ]] for two dimensions, and so on)

x1 = np.array([10,20,30,40,50])

In [8]:
print(type(x1))
x1.ndim

<class 'numpy.ndarray'>


1

In [9]:
# Two dimensional array

x2 = np.array([[10,20,30,40,50,60,70,80]])

In [10]:
print(type(x2))
x2.ndim

<class 'numpy.ndarray'>


2

In [11]:
# Three dimensional array

x3 = np.array([[[10,20,30,40,50,60,70,80,90,100]]])

In [12]:
print(type(x3))
x3.ndim

<class 'numpy.ndarray'>


3
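
The cells above show that the nesting depth of the brackets sets `ndim`. As a small supplementary sketch (the arrays here are new examples, not the ones defined above), `shape` and `size` can be inspected alongside `ndim` to see how many elements sit along each dimension.

```python
import numpy as np

x0 = np.array(10)                                  # no brackets: 0-D
x1 = np.array([10, 20, 30])                        # one level: 1-D
x2 = np.array([[10, 20, 30], [40, 50, 60]])        # two levels: 2-D
x3 = np.array([[[10, 20], [30, 40]],
               [[50, 60], [70, 80]]])              # three levels: 3-D

# ndim = number of dimensions, shape = length along each one,
# size = total number of elements.
for arr in (x0, x1, x2, x3):
    print(arr.ndim, arr.shape, arr.size)
```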

In [13]:
# Change the data type of the array elements.

In [14]:
b = np.array([1,2,3,4,5,6,7,8,9,10], dtype = 'str')
b

array(['1', '2', '3', '4', '5', '6', '7', '8', '9', '10'], dtype='<U2')

In [15]:
b.astype(float)

array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

In [16]:
b = np.array([1,2,3,4,5,6,7,8,9,10], dtype = 'float')
b

array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])
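
Because an array is homogeneous, NumPy picks one common type when the inputs are mixed. The following sketch, using made-up example arrays, shows that behaviour and also that `astype` returns a new array instead of changing the original.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])      # int and float -> every element becomes float
texts = np.array([1, "two", 3])    # number and string -> every element becomes a string

print(ints.dtype.kind)    # 'i' (integer)
print(mixed.dtype)        # float64
print(texts.dtype.kind)   # 'U' (Unicode string)

# astype returns a converted copy; float -> int truncates toward zero.
as_int = mixed.astype(int)
print(as_int)             # values 1, 2, 3
print(mixed)              # still 1.0, 2.5, 3.0
```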

---
# Other ways of creating an array
1. arange
2. linspace
3. zeros
4. ones
5. random

**1. arange**

Return evenly spaced values within a given interval.

**Syntax:** np.arange([start,] stop[, step,], dtype=None, *, like=None)

In [17]:
c =np.arange(10,21)
c

array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

In [18]:
a = np.arange(11,22,2)
a

array([11, 13, 15, 17, 19, 21])
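
One detail worth spelling out is that `stop` is never included in the result, which is why `np.arange(10,21)` above ends at 20. A short sketch with illustrative values:

```python
import numpy as np

# With a single argument, start defaults to 0 and step to 1; stop (5) is excluded.
up = np.arange(5)
print(up)       # values 0, 1, 2, 3, 4

# A negative step counts downward; stop (0) is still excluded.
down = np.arange(10, 0, -2)
print(down)     # values 10, 8, 6, 4, 2
```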

**2. linspace**

Return evenly spaced numbers over a specified interval.

**Syntax:** np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)

In [19]:
d = np.linspace(10,15,20)
d

array([10.        , 10.26315789, 10.52631579, 10.78947368, 11.05263158,
       11.31578947, 11.57894737, 11.84210526, 12.10526316, 12.36842105,
       12.63157895, 12.89473684, 13.15789474, 13.42105263, 13.68421053,
       13.94736842, 14.21052632, 14.47368421, 14.73684211, 15.        ])

In [20]:
# retstep=True: also return the step (the spacing between consecutive values)

d = np.linspace(10,15,20, retstep=True)
d

(array([10.        , 10.26315789, 10.52631579, 10.78947368, 11.05263158,
        11.31578947, 11.57894737, 11.84210526, 12.10526316, 12.36842105,
        12.63157895, 12.89473684, 13.15789474, 13.42105263, 13.68421053,
        13.94736842, 14.21052632, 14.47368421, 14.73684211, 15.        ]),
 0.2631578947368421)
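
The step returned above can be checked by hand: with the default `endpoint=True` it equals `(stop - start) / (num - 1)`. The sketch below reuses the same numbers and also shows the effect of `endpoint=False`; the variable names are illustrative.

```python
import numpy as np

start, stop, num = 10, 15, 20

points, step = np.linspace(start, stop, num, retstep=True)

# endpoint=True (default): num points span the interval, so there are num - 1 gaps.
print(step, (stop - start) / (num - 1))

# endpoint=False: stop is left out and the step becomes (stop - start) / num.
open_points, open_step = np.linspace(start, stop, num, endpoint=False, retstep=True)
print(open_step)         # 0.25
print(open_points[-1])   # 14.75
```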

**3. zeros**

Return a new array of given shape and type, filled with zeros.

**Syntax:** np.zeros(shape, dtype=float, order='C', *, like=None)

In [21]:
e = np.zeros([3,5])
# Shape: [rows, columns]
e

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

**4. ones**

Return a new array of given shape and type, filled with ones.

**Syntax:** np.ones(shape, dtype=None, order='C', *, like=None)

In [22]:
f = np.ones([5,5])
#    Shape: [-,-] = [rows, columns]
f

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])
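
Both functions return floats unless `dtype` says otherwise. The sketch below, with illustrative names, passes the shape as a tuple, overrides the dtype, and uses `np.full` as a related constructor for any other constant.

```python
import numpy as np

# dtype=int overrides the float default.
counts = np.zeros((2, 3), dtype=int)
print(counts)
print(counts.dtype.kind)   # 'i'

# A 1-D array needs only a single length.
weights = np.ones(4)
print(weights.shape)       # (4,)

# np.full fills an array with any constant value.
sevens = np.full((2, 2), 7)
print(sevens)
```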

**5. random**

**a) random.rand:**
Return random values in a given shape, drawn from a uniform distribution over [0, 1).

**Syntax:** np.random.rand(d0, d1, ..., dn)

**b) random.randint:**
Return random integers from `low` (inclusive) to `high` (exclusive).

**Syntax:** np.random.randint(low, high=None, size=None, dtype=int)

**c) random.randn:**
Return a sample (or samples) from the "standard normal" distribution (mean 0, standard deviation 1).

**Syntax:** np.random.randn(d0, d1, ..., dn)

In [23]:
# a) random.rand:

g = np.random.rand(2,2)
g

array([[0.38079846, 0.12556806],
       [0.58457361, 0.99527529]])

In [24]:
# b) random.randint:

h = np.random.randint(1,5,10)
h

array([1, 3, 2, 4, 2, 3, 3, 1, 4, 2])

In [25]:
# c) random.randn:

i = np.random.randn(5,6)
i

array([[-0.21073067, -0.20251317,  2.08839104,  0.67346845, -0.40649727,
        -0.11586246],
       [-1.82452285, -0.303617  , -1.01656049, -1.05157801, -2.16462306,
         0.47269716],
       [-1.2758518 , -0.30571794, -0.01661663,  0.47347985, -0.78197861,
         1.70632551],
       [ 1.01239957,  0.62072309, -1.50498057,  0.56258243,  1.49152525,
         0.41387234],
       [ 0.597619  ,  0.7447676 ,  1.21270265, -0.76060033, -0.42470843,
        -0.4937675 ]])
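
The values above change on every run. When reproducible numbers are needed, one option is a seeded generator. The sketch below uses `np.random.default_rng`, NumPy's newer random interface, as an alternative to the functions shown above; the seed 42 is an arbitrary example value.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random((2, 2))              # like rand: floats in [0, 1)
integers = rng.integers(1, 5, size=10)    # like randint: 1 <= value < 5
normal = rng.standard_normal((5, 6))      # like randn: mean 0, std 1

# Re-creating the generator with the same seed reproduces the same values.
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((2, 2))))   # True
```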

---
# Slicing and Indexing of arrays

In [26]:
j = np.array([1,2,3,4,5,6,7,8,9])
j

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [27]:
j.ndim

1

In [28]:
# Extract the number 5 (index 4, because indexing starts at 0)

j[4]

5

In [29]:
# Extract the numbers 7 to 9 (indices 6 to 8; the stop index 9 is excluded)

j[6:9]

array([7, 8, 9])

In [30]:
# Indexing with two dimensional array

k = np.array([[1,2,3],[4,5,6]])
k

array([[1, 2, 3],
       [4, 5, 6]])

In [31]:
# Column index:     0  1  2
#                   |  |  |
# Row 0 --  array([[1, 2, 3],
# Row 1 --         [4, 5, 6]])

In [32]:
# Extract the number 5 (row 1, column 1)

k[1,1]

5

In [33]:
# Extract the numbers 2 and 5 (all rows, column 1)

k[0:, 1]

array([2, 5])

In [34]:
# Extract the numbers 2, 3 and 5, 6 (all rows, columns 1 and 2; a slice stop beyond the array size is simply clipped)

k[0:3,1:3]

array([[2, 3],
       [5, 6]])

In [35]:
# Extract the numbers 1 and 6 (index arrays pick the positions (0, 0) and (1, 2))

k[[0,1], [0,2]]

array([1, 6])
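
A few further indexing patterns follow the same rules. This sketch rebuilds `j` and `k` with the values used above and shows negative indices, stepped slices, boolean masks, and the fact that a basic slice is a view of the original data; `first_row` and `safe` are illustrative names.

```python
import numpy as np

j = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

print(j[-1])       # 9: negative indices count from the end
print(j[::2])      # every second element: 1, 3, 5, 7, 9
print(j[j > 6])    # boolean mask keeps 7, 8, 9

# A basic slice is a view: writing through it changes the original array.
k = np.array([[1, 2, 3], [4, 5, 6]])
first_row = k[0]
first_row[0] = 100
print(k[0, 0])     # 100

# Use .copy() when an independent array is needed.
safe = k[1].copy()
safe[0] = -1
print(k[1, 0])     # 4
```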

---
# Reshaping the arrays 

In [36]:
l = np.random.rand(5,5)
l

array([[0.33910841, 0.09393642, 0.78759086, 0.01950257, 0.72099745],
       [0.87877631, 0.42512153, 0.41971349, 0.08224356, 0.17107548],
       [0.72815847, 0.39789729, 0.42221106, 0.4583429 , 0.92057646],
       [0.49252414, 0.4976895 , 0.63264508, 0.45535686, 0.98914392],
       [0.00790143, 0.4794656 , 0.79378548, 0.51790568, 0.32152241]])

In [37]:
l.shape

(5, 5)

In [38]:
# reshape

l.reshape(1,25)

array([[0.33910841, 0.09393642, 0.78759086, 0.01950257, 0.72099745,
        0.87877631, 0.42512153, 0.41971349, 0.08224356, 0.17107548,
        0.72815847, 0.39789729, 0.42221106, 0.4583429 , 0.92057646,
        0.49252414, 0.4976895 , 0.63264508, 0.45535686, 0.98914392,
        0.00790143, 0.4794656 , 0.79378548, 0.51790568, 0.32152241]])

In [39]:
l.reshape(25,1)

array([[0.33910841],
       [0.09393642],
       [0.78759086],
       [0.01950257],
       [0.72099745],
       [0.87877631],
       [0.42512153],
       [0.41971349],
       [0.08224356],
       [0.17107548],
       [0.72815847],
       [0.39789729],
       [0.42221106],
       [0.4583429 ],
       [0.92057646],
       [0.49252414],
       [0.4976895 ],
       [0.63264508],
       [0.45535686],
       [0.98914392],
       [0.00790143],
       [0.4794656 ],
       [0.79378548],
       [0.51790568],
       [0.32152241]])

In [40]:
# Reshape automatically by passing -1 for one dimension (useful when you don't know the matching size)
# -1 stands for the unknown dimension, which NumPy infers from the total number of elements

l.reshape(5,-1)

array([[0.33910841, 0.09393642, 0.78759086, 0.01950257, 0.72099745],
       [0.87877631, 0.42512153, 0.41971349, 0.08224356, 0.17107548],
       [0.72815847, 0.39789729, 0.42221106, 0.4583429 , 0.92057646],
       [0.49252414, 0.4976895 , 0.63264508, 0.45535686, 0.98914392],
       [0.00790143, 0.4794656 , 0.79378548, 0.51790568, 0.32152241]])

In [41]:
l.reshape(-1,1)

array([[0.33910841],
       [0.09393642],
       [0.78759086],
       [0.01950257],
       [0.72099745],
       [0.87877631],
       [0.42512153],
       [0.41971349],
       [0.08224356],
       [0.17107548],
       [0.72815847],
       [0.39789729],
       [0.42221106],
       [0.4583429 ],
       [0.92057646],
       [0.49252414],
       [0.4976895 ],
       [0.63264508],
       [0.45535686],
       [0.98914392],
       [0.00790143],
       [0.4794656 ],
       [0.79378548],
       [0.51790568],
       [0.32152241]])
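
`reshape` only works when the new shape holds exactly the same number of elements, and it returns a new array object without changing the shape of the original. A small sketch with a 12-element example array (not the `l` array above):

```python
import numpy as np

a = np.arange(12)                # 12 elements: 0..11

print(a.reshape(3, 4).shape)     # (3, 4)
print(a.reshape(2, -1).shape)    # (2, 6): -1 is inferred as 12 / 2
print(a.reshape(-1).shape)       # (12,): back to 1-D
print(a.shape)                   # (12,): the original is unchanged

# The total number of elements must stay the same.
try:
    a.reshape(5, -1)             # 12 is not divisible by 5
except ValueError as err:
    print("reshape failed:", err)
```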

---
# Aggregate Functions

- NumPy's aggregate functions include sum, min, max, mean, average, prod (product), median, std (standard deviation), var (variance), percentile, and corrcoef.

In [42]:
l

array([[0.33910841, 0.09393642, 0.78759086, 0.01950257, 0.72099745],
       [0.87877631, 0.42512153, 0.41971349, 0.08224356, 0.17107548],
       [0.72815847, 0.39789729, 0.42221106, 0.4583429 , 0.92057646],
       [0.49252414, 0.4976895 , 0.63264508, 0.45535686, 0.98914392],
       [0.00790143, 0.4794656 , 0.79378548, 0.51790568, 0.32152241]])

In [43]:
# Min function:

l.min()

0.007901426875032258

In [44]:
# Max function:

l.max()

0.9891439172769714

In [45]:
# Mean (average) function:

l.mean()

0.4821276942047011

In [46]:
# Standard deviation function:

l.std()

0.27207210772839446
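
The calls above reduce the whole array to one number. Most aggregates also accept an `axis` argument to reduce along rows or columns instead. A sketch on a small integer array, chosen so the results are easy to verify by hand:

```python
import numpy as np

k = np.array([[1, 2, 3],
              [4, 5, 6]])

print(k.sum())          # 21: all elements
print(k.sum(axis=0))    # column sums: 5, 7, 9
print(k.sum(axis=1))    # row sums: 6, 15
print(k.mean(axis=1))   # row means: 2.0, 5.0
print(np.median(k))     # 3.5
print(k.argmax())       # 5: position of the largest element in the flattened array
```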

---
# Stacking 

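**Stacking joins two or more arrays into one.**

Two common methods:
- Vertical stacking (`np.vstack`): arrays are placed one on top of the other, row-wise
- Horizontal stacking (`np.hstack`): arrays are joined side by side, column-wise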

In [47]:
m = np.array([5,6,7,8,9])
n = np.array([10,20,30,40,50])

print(m)
print(n)

[5 6 7 8 9]
[10 20 30 40 50]


In [48]:
# Vertical stacking

o = np.vstack([m,n])
o

array([[ 5,  6,  7,  8,  9],
       [10, 20, 30, 40, 50]])

In [49]:
# Horizontal stacking

p = np.hstack([m,n])
p

array([ 5,  6,  7,  8,  9, 10, 20, 30, 40, 50])
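
With 2-D inputs the difference between the two directions is easier to see. The sketch below uses two small example matrices and compares the results with `np.concatenate`, which gives the same output for 2-D arrays when the matching axis is passed.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

v = np.vstack([a, b])   # rows of b go below a -> shape (4, 2)
h = np.hstack([a, b])   # columns of b go to the right of a -> shape (2, 4)

print(v.shape, h.shape)

# For 2-D inputs these match np.concatenate along axis 0 and axis 1.
print(np.array_equal(v, np.concatenate([a, b], axis=0)))   # True
print(np.array_equal(h, np.concatenate([a, b], axis=1)))   # True
```

Vertical stacking requires the arrays to have the same number of columns, and horizontal stacking of 2-D arrays requires the same number of rows.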