# Introduction to Numpy

## Agenda
- Fundamentals of Numpy
  - Advantages
  - Installation
- Numpy Array Object
  - Creating Numpy Arrays
- Attributes of Numpy arrays
- Numpy array Functions
- Arithmetic Operations using Numpy
- Statistical Operations
- String Functions in numpy
- Numpy Array Indexing
- Numpy Array Slicing

## Fundamentals of Numpy
NumPy (Numerical Python) is a free and open-source library used mainly for mathematical operations.
- It is used for working with arrays.
- It consists of multidimensional array objects and functions that operate on them.
### Advantages
- NumPy is faster than plain Python for numerical work and provides better functions for dealing with mathematical problems.

### Installation
To install NumPy in any Python environment, run the command below:

`pip install numpy`


In [1]:
# install numpy
!pip install numpy



In [2]:
#pip install --upgrade numpy

In [3]:
# import numpy 
import numpy as np

import time

In [4]:
# How numpy is faster

In [5]:
# use NumPy to create an array of 100000000 floating-point numbers and calculate their mean

In [6]:
data = np.random.rand(100000000) # this generates random floating point numbers
data[:5]

array([0.78901607, 0.41642967, 0.45156971, 0.62186502, 0.22927207])

In [7]:
# mean calculation using the NumPy mean function
start = time.time()
mn =  np.mean(data)
stop = time.time()
print(f'Time Elapsed {stop-start}')

Time Elapsed 0.4806184768676758


In [8]:
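# mean calculation using traditional Python arithmetic
# NOTE: the calculation itself is commented out (presumably because sum() over
# 100000000 elements is very slow), so the time printed below measures nothing
start = time.time()
#mn_p =  sum(data)/len(data)
stop = time.time()
print(f'Time Elapsed {stop-start}')

Time Elapsed 0.0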
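
Because the pure-Python calculation above is commented out, the comparison does not actually show a difference. The sketch below is one way to run both versions on a smaller array so that the Python loop finishes quickly. The array size and the names `data_small`, `numpy_seconds` and `python_seconds` are assumptions made for this illustration, and the timings will differ from machine to machine.

```python
import time
import numpy as np

# A smaller array than the one above so the pure-Python sum finishes quickly.
# The size (1_000_000) is an arbitrary choice for illustration.
data_small = np.random.rand(1_000_000)

# Mean using NumPy's vectorized implementation
start = time.perf_counter()
mean_np = np.mean(data_small)
numpy_seconds = time.perf_counter() - start

# Mean using Python's built-in sum() and len()
start = time.perf_counter()
mean_py = sum(data_small) / len(data_small)
python_seconds = time.perf_counter() - start

print(f'NumPy  : {numpy_seconds:.6f} s')
print(f'Python : {python_seconds:.6f} s')
print(f'Results agree: {np.isclose(mean_np, mean_py)}')
```

`time.perf_counter()` is used here instead of `time.time()` because it is intended for measuring short durations.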


## Numpy Array Object
An array is a data structure that stores data of the same datatype (homogeneous) in a contiguous memory location. Each value inside the array is called an `element` and can be accessed or modified using its `index`. <br>
In NumPy, arrays are often called ndarray (n-dimensional array). There are two ways to create an ndarray in NumPy:
- use the NumPy built-in function `array`
  - syntax is `np.array(list of values/scalar)`
- use NumPy array-generating functions such as `arange` and `linspace`
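
The two creation routes listed above can be sketched side by side. This is a minimal illustration: `np.zeros` is not used elsewhere in these notes and is included only as one more example of a generating function, and the variable names are made up for the example.

```python
import numpy as np

# Way 1: build an array from existing Python values with np.array
from_list = np.array([1, 2, 3, 4, 5])

# Way 2: let NumPy generate the values
from_arange = np.arange(1, 6)      # integers 1..5 (the stop value is exclusive)
from_zeros = np.zeros((2, 3))      # array of shape (2, 3) filled with 0.0

print(from_list)
print(from_arange)
print(from_zeros.shape)
print(np.array_equal(from_list, from_arange))
```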


### Create Numpy arrays and see the attributes

In [9]:
# Create a 0 Dimensional array.(use a scalar value in array function)
arr0 = np.array(5)
print(arr0)
print(hex(id(arr0)))
# Attributes
print(arr0.data)
print(arr0.ndim) # gives the dimensions
print(arr0.itemsize)  # gives the memory allocated in bytes to each element
print(arr0.shape)  # gives the number of values in each dimension as a tuple
print(arr0.size) # gives the total number of elements in the array
print(arr0.dtype) # gives the data type of the elements of the array


5
0x17445a33990
<memory at 0x0000017445A98540>
0
8
()
1
int64


In [10]:
# Create a 1 Dimensional array.(use a list of  values in array function)
arr1 = np.array([1,2,3,4,5])
print(arr1)
print(hex(id(arr1)))
# Attributes
print(arr1.data)
print(arr1.ndim) # gives the dimensions
print(arr1.itemsize)  # gives the memory allocated in bytes to each element
print(arr1.shape)  # gives the number of values in each dimension as a tuple
print(arr1.size) # gives the total number of elements in the array
print(arr1.dtype) # gives the data type of the elements of the array

[1 2 3 4 5]
0x17445a33ab0
<memory at 0x0000017445A7E680>
1
8
(5,)
5
int64


In [11]:
# Create a 2 Dimensional array.(use a list of lists values in array function)
arr2 = np.array([[1,2,3,4,5], [11,12,13,14,15]])
print(arr2)
print(hex(id(arr2)))
# Attributes
print(arr2.data)
print(arr2.ndim) # gives the dimensions
print(arr2.itemsize)  # gives the memory allocated in bytes to each element
print(arr2.shape)  # gives the number of values in each dimension as a tuple
print(arr2.size) # gives the total number of elements in the array
print(arr2.dtype) # gives the data type of the elements of the array

[[ 1  2  3  4  5]
 [11 12 13 14 15]]
0x17445a33bd0
<memory at 0x0000017445A92810>
2
8
(2, 5)
10
int64


In [12]:
# Create a 3 Dimensional array.(use a list of lists of lists values in array function)
arr3 = np.array([[[1,2,3,4,5], [11,12,13,14,15]], [[21,22,23,24,25], [31,32,33,34,35]]])
print(arr3)
print(hex(id(arr3)))
# Attributes
print(arr3.data)
print(arr3.ndim) # gives the dimensions
print(arr3.itemsize)  # gives the memory allocated in bytes to each element
print(arr3.shape)  # gives the number of values in each dimension as a tuple
print(arr3.size) # gives the total number of elements in the array
print(arr3.dtype) # gives the data type of the elements of the array

[[[ 1  2  3  4  5]
  [11 12 13 14 15]]

 [[21 22 23 24 25]
  [31 32 33 34 35]]]
0x174352b62b0
<memory at 0x00000174374EAF20>
3
8
(2, 2, 5)
20
int64
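
The attributes printed in the cells above are related to each other: `size` is the product of the entries of `shape`, and the total memory used by the elements (`nbytes`) is `size * itemsize`. The sketch below checks this for a 2-D array. The dtype is passed explicitly as `np.int64` here as an assumption, so that the item size is 8 bytes regardless of platform.

```python
import numpy as np

# dtype is fixed explicitly so that itemsize is 8 bytes on every platform
arr2 = np.array([[1, 2, 3, 4, 5], [11, 12, 13, 14, 15]], dtype=np.int64)

# size is the product of the shape entries: 2 * 5
size_from_shape = int(np.prod(arr2.shape))
print(size_from_shape == arr2.size)

# nbytes is the number of elements times the bytes per element: 10 * 8
print(arr2.nbytes == arr2.size * arr2.itemsize)

# ndim is simply the length of the shape tuple
print(arr2.ndim == len(arr2.shape))
```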


# Date  - 19-03-2025

In [13]:
# Numpy Array Functions

In [14]:
import numpy as np

In [15]:
arr =  np.array([[1,2,3,4], [4,5,6,7], [7,8,9,2]])
print(arr)

[[1 2 3 4]
 [4 5 6 7]
 [7 8 9 2]]


In [16]:
# transpose - converts the rows to columns
print(arr.shape)
arr = arr.transpose()
print(arr)
print(arr.shape)

(3, 4)
[[1 4 7]
 [2 5 8]
 [3 6 9]
 [4 7 2]]
(4, 3)


In [17]:
# reshape - changes the shape of the array to a desired shape, with the constraint that the array size stays the same

print(arr.size)
arr =  arr.reshape(3,2,2)
print(arr)
print(arr.shape)
print(arr.size)

12
[[[1 4]
  [7 2]]

 [[5 8]
  [3 6]]

 [[9 4]
  [7 2]]]
(3, 2, 2)
12


In [18]:
arr.reshape(1,3,2,2)

array([[[[1, 4],
         [7, 2]],

        [[5, 8],
         [3, 6]],

        [[9, 4],
         [7, 2]]]])
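
The size constraint on `reshape` can be seen directly. In this small sketch, which uses a fresh 12-element array rather than the one above, a shape whose product is not 12 raises a `ValueError`, while `-1` asks NumPy to infer one dimension from the array size. The variable names are illustrative.

```python
import numpy as np

arr12 = np.arange(12)              # 12 elements: 0..11

# -1 lets NumPy work out the missing dimension (12 / 3 = 4)
inferred = arr12.reshape(3, -1)
print(inferred.shape)

# 5 * 3 = 15 is not equal to 12, so this reshape is rejected
try:
    arr12.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(f'reshape failed: {err}')
```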

In [19]:
# flatten - converts an array of any dimension and shape to a one-dimensional array

In [20]:
print(arr)

[[[1 4]
  [7 2]]

 [[5 8]
  [3 6]]

 [[9 4]
  [7 2]]]


In [21]:
print(arr.reshape(12,))

[1 4 7 2 5 8 3 6 9 4 7 2]


In [22]:
print(arr.flatten())

[1 4 7 2 5 8 3 6 9 4 7 2]


In [23]:
# ravel - a NumPy function that converts the array to one-dimensional form with a choice of order; the default is row-major (C)
print(np.ravel(arr, order = 'C')) # Row major - C-like traversal
print(np.ravel(arr, order = 'F')) # Column major - Fortran-like traversal
print(np.ravel(arr, order = 'A')) # Fortran-like if the array is Fortran-contiguous in memory, C-like otherwise
print(np.ravel(arr, order = 'K')) # elements in the order they occur in memory

[1 4 7 2 5 8 3 6 9 4 7 2]
[1 5 9 7 3 7 4 8 4 2 6 2]
[1 4 7 2 5 8 3 6 9 4 7 2]
[1 4 7 2 5 8 3 6 9 4 7 2]
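
`flatten` and `ravel` print the same values above, but they differ in how they treat memory: `flatten` always returns a copy, while `ravel` returns a view of the original data whenever that is possible. The sketch below illustrates the difference on a small contiguous array; the array and variable names are chosen only for this example.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)   # [[0 1 2] [3 4 5]]

flat_copy = grid.flatten()          # independent copy of the data
flat_view = grid.ravel()            # view onto the same data (grid is contiguous)

flat_copy[0] = 100                  # does not touch grid
flat_view[1] = 200                  # writes through to grid[0, 1]

print(grid)
print(np.shares_memory(grid, flat_copy))
print(np.shares_memory(grid, flat_view))
```

If the original array must stay unchanged, `flatten` is the safer choice; `ravel` avoids the copy when that is not a concern.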


## Numpy Arithmetic Operations
- addition (+)
- subtraction (-)
- multiplication (*)
- division (/)
- modulus (%)
- power (**)
- floor division (//)

In [24]:
a = np.array([30, 10, 20])
b = np.array([2,3,5])

In [25]:
print(a)
print(b)
# Addition
print('Addition')
print(a+b) #Pythonic
print(np.add(a,b)) # numpy function 
# Subtraction
print('Subtraction')
print(a-b) #Pythonic
print(np.subtract(a,b)) # numpy function 
# Multiplication
print('Multiplication')
print(a*b) #Pythonic
print(np.multiply(a,b)) # numpy function 
# Division
print('Divide')
print(a/b) #Pythonic
print(np.divide(a,b)) # numpy function 
# Modulus
print('Modulus')
print(a%b) #Pythonic
print(np.mod(a,b)) # numpy function 
# power
print('Power')
print(a**b) #Pythonic
print(np.power(a,b)) # numpy function 
# floor division
print('Floor Division')
print(a//b) #Pythonic
print(np.floor_divide(a,b)) # numpy function 

[30 10 20]
[2 3 5]
Addition
[32 13 25]
[32 13 25]
Subtraction
[28  7 15]
[28  7 15]
Multiplication
[ 60  30 100]
[ 60  30 100]
Divide
[15.          3.33333333  4.        ]
[15.          3.33333333  4.        ]
Modulus
[0 1 0]
[0 1 0]
Power
[    900    1000 3200000]
[    900    1000 3200000]
Floor Division
[15  3  4]
[15  3  4]
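
The output above suggests that each operator and its NumPy function give the same result, and that `/` produces floats while `//` keeps integers. The short sketch below checks both observations programmatically, using the same `a` and `b` values; the helper variable names are assumptions for the example.

```python
import numpy as np

a = np.array([30, 10, 20])
b = np.array([2, 3, 5])

# Operator form and function form produce identical arrays
same_add = np.array_equal(a + b, np.add(a, b))
same_floor = np.array_equal(a // b, np.floor_divide(a, b))
print(same_add, same_floor)

# True division returns floats; floor division of integer arrays stays integer
print((a / b).dtype)
print((a // b).dtype)

# Floor division and modulus together rebuild the original values
rebuilt = (a // b) * b + (a % b)
print(np.array_equal(rebuilt, a))
```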


## Statistical Functions
- mean
- median
- standard deviation
- variance
- percentiles
- covariance
- correlation

In [26]:
x =  np.array([1,3,2,34,5,52,2,7,3,8,2,4,7,2])
print(f'array is {x}')
print(f'Mean is {np.mean(x)}')
print(f'Average is {np.average(x)}')
print(f'Median is {np.median(x)}')
print(f'Standard Deviation is {np.std(x)}')
print(f'Variance is {np.var(x)}')
print(f'Minimum Value {np.min(x)}')
print(f'25th Percentile {np.percentile(x, 25)}')
print(f'50th Percentile {np.percentile(x, 50)}')
print(f'75th Percentile {np.percentile(x, 75)}')
print(f'Maximum Value {np.max(x)}')



array is [ 1  3  2 34  5 52  2  7  3  8  2  4  7  2]
Mean is 9.428571428571429
Average is 9.428571428571429
Median is 3.5
Standard Deviation is 14.276425551608227
Variance is 203.81632653061226
Minimum Value 1
25th Percentile 2.0
50th Percentile 3.5
75th Percentile 7.0
Maximum Value 52


In [27]:
x =  np.array([1,3,2,34,5,52,2,7,3,8,2,4,7,2])
y =  np.array([ 6,  8,  8, 16,  9, 19, 13, 18,  4, 12, 13,  9,6,7])
print('Covariance Matrix')
print(np.cov(x,y))
print()
print('Correlation Matrix')
print(np.corrcoef(x,y))

Covariance Matrix
[[219.49450549  46.58241758]
 [ 46.58241758  21.95604396]]

Correlation Matrix
[[1.         0.67101643]
 [0.67101643 1.        ]]
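
The variance printed earlier (about 203.82) does not match the first diagonal entry of the covariance matrix (about 219.49). The reason is that `np.var` and `np.std` divide by N by default, whereas `np.cov` divides by N - 1. The sketch below, using the same `x` and `y`, shows that passing `ddof=1` makes the numbers agree, and that the correlation is the covariance scaled by the two sample standard deviations.

```python
import numpy as np

x = np.array([1, 3, 2, 34, 5, 52, 2, 7, 3, 8, 2, 4, 7, 2])
y = np.array([6, 8, 8, 16, 9, 19, 13, 18, 4, 12, 13, 9, 6, 7])

cov_matrix = np.cov(x, y)

# ddof=1 switches the divisor from N to N - 1, matching np.cov
sample_var_x = np.var(x, ddof=1)
print(np.isclose(sample_var_x, cov_matrix[0, 0]))

# correlation = covariance / (sample std of x * sample std of y)
corr_manual = cov_matrix[0, 1] / (np.std(x, ddof=1) * np.std(y, ddof=1))
print(np.isclose(corr_manual, np.corrcoef(x, y)[0, 1]))
```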


In [28]:
## String Functions in numpy

In [29]:
x =  np.array(['Hello', 'Welcome'])
y = np.array(['World', 'Learners'])

In [30]:
print(np.char.add(x,y)) # performs element-wise concatenation of string arrays

['HelloWorld' 'WelcomeLearners']


In [31]:
str1 =  'Hello How are you?'
print(np.char.replace(str1, 'H', 'h'))
print(np.char.capitalize(str1))
print(np.char.title(str1))

hello how are you?
Hello how are you?
Hello How Are You?


In [32]:
print(str1.title())

Hello How Are You?
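
The cells above apply `np.char` functions to a single Python string, where the built-in string methods (such as `str1.title()`) do the same job. The functions are more useful on arrays of strings, where they work element by element. A minimal sketch, with array and variable names chosen for the example:

```python
import numpy as np

words = np.array(['Hello', 'Welcome'])

upper_words = np.char.upper(words)            # upper-case every element
replaced = np.char.replace(words, 'e', 'E')   # replace 'e' with 'E' in every element
lengths = np.char.str_len(words)              # length of every element

print(upper_words)
print(replaced)
print(lengths)
```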


## Miscellaneous Functions

### arange()
- It generates an array of values within a specified interval, with a fixed gap between consecutive values. The syntax is
   `np.arange(start, stop, step)`, where `start` defaults to 0 and `step` defaults to 1.
- In `arange`, the stop value is exclusive.
<br>
np.arange(10) --> np.arange(start = 0, stop = 10, step = 1) <br>
np.arange(0,10) --> np.arange(start = 0, stop = 10, step = 1) <br>
np.arange(0,10,1) --> np.arange(start = 0, stop = 10, step = 1) <br>
np.arange(4,10) --> np.arange(start = 4, stop = 10, step = 1) <br>
np.arange(0,10,2) --> np.arange(start = 0, stop = 10, step = 2) <br>
np.arange(4,10,3) --> np.arange(start = 4, stop = 10, step = 3) <br>

In [33]:
print(np.arange(10))
print(np.arange(0,10))
print(np.arange(0,10,1))
print(np.arange(4,10))
print(np.arange(0,10,2))
print(np.arange(4,10,3))
print(np.arange(0,10,-2))
print(np.arange(10,0,-2))

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4 5 6 7 8 9]
[4 5 6 7 8 9]
[0 2 4 6 8]
[4 7]
[]
[10  8  6  4  2]
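
The lengths of the arrays printed above follow a simple rule: `arange` produces `ceil((stop - start) / step)` values, or none at all when that number is not positive (which is why `np.arange(0,10,-2)` is empty). The helper `arange_length` below is a hypothetical function written for this illustration, not part of NumPy.

```python
import math
import numpy as np

def arange_length(start, stop, step):
    """Number of values np.arange(start, stop, step) produces (illustrative helper)."""
    return max(0, math.ceil((stop - start) / step))

# The same argument combinations used in the cell above
cases = [(0, 10, 1), (4, 10, 1), (0, 10, 2), (4, 10, 3), (0, 10, -2), (10, 0, -2)]
for start, stop, step in cases:
    predicted = arange_length(start, stop, step)
    actual = len(np.arange(start, stop, step))
    print((start, stop, step), predicted, actual)
```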


### linspace()
- It generates an array of equidistant values within a specified interval, with a fixed count of values. The syntax is
   `np.linspace(start, stop, num=50)`
- In `linspace`, the stop value is inclusive by default; pass `endpoint=False` to exclude it.
<br>
np.linspace(0,10, 5) --> 5 equidistant values from 0 to 10: 0, 2.5, 5, 7.5, 10 <br>

In [34]:
print(np.linspace(0,10,4))
print(np.linspace(0,10))
print(np.linspace(0,10,4, endpoint =  False))


[ 0.          3.33333333  6.66666667 10.        ]
[ 0.          0.20408163  0.40816327  0.6122449   0.81632653  1.02040816
  1.2244898   1.42857143  1.63265306  1.83673469  2.04081633  2.24489796
  2.44897959  2.65306122  2.85714286  3.06122449  3.26530612  3.46938776
  3.67346939  3.87755102  4.08163265  4.28571429  4.48979592  4.69387755
  4.89795918  5.10204082  5.30612245  5.51020408  5.71428571  5.91836735
  6.12244898  6.32653061  6.53061224  6.73469388  6.93877551  7.14285714
  7.34693878  7.55102041  7.75510204  7.95918367  8.16326531  8.36734694
  8.57142857  8.7755102   8.97959184  9.18367347  9.3877551   9.59183673
  9.79591837 10.        ]
[0.  2.5 5.  7.5]


In [35]:
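# approximating np.linspace(0,10) with arange: the step is about 10/49, and the last value misses 10 slightly because of rounding
np.arange(0,10.01,0.2040816299999999) 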

array([0.        , 0.20408163, 0.40816326, 0.61224489, 0.81632652,
       1.02040815, 1.22448978, 1.42857141, 1.63265304, 1.83673467,
       2.0408163 , 2.24489793, 2.44897956, 2.65306119, 2.85714282,
       3.06122445, 3.26530608, 3.46938771, 3.67346934, 3.87755097,
       4.0816326 , 4.28571423, 4.48979586, 4.69387749, 4.89795912,
       5.10204075, 5.30612238, 5.51020401, 5.71428564, 5.91836727,
       6.1224489 , 6.32653053, 6.53061216, 6.73469379, 6.93877542,
       7.14285705, 7.34693868, 7.55102031, 7.75510194, 7.95918357,
       8.1632652 , 8.36734683, 8.57142846, 8.77551009, 8.97959172,
       9.18367335, 9.38775498, 9.59183661, 9.79591824, 9.99999987])
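
The step used in the `arange` call above comes from the spacing that `linspace` chooses: with the endpoint included, it is `(stop - start) / (num - 1)`. The sketch below retrieves that step with the `retstep=True` argument of `np.linspace` and rebuilds the same values from it; the variable names are illustrative.

```python
import numpy as np

# retstep=True returns the spacing alongside the values
values, step = np.linspace(0, 10, 50, retstep=True)
print(step)

# With the endpoint included, 50 values leave 49 gaps between 0 and 10
expected_step = (10 - 0) / (50 - 1)
print(np.isclose(step, expected_step))

# start + step * index reproduces the linspace values
rebuilt = 0 + step * np.arange(50)
print(np.allclose(values, rebuilt))
```

For a fixed number of points, `linspace` is usually the safer choice, because a fractional step passed to `arange` is subject to rounding in the final values.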

### Random Submodule
It is dedicated to generating random sequences of numbers.

In [36]:
# rand() - generates floating-point numbers in the half-open interval [0,1), as an array of the desired shape

print(np.random.rand()) # no shape given: returns a single random float in [0,1)
print(np.random.rand(5)) # 1-D array of 5 random floats in [0,1)
print(np.random.rand(3,3)) # 2-D array of shape (3,3) holding 9 random floats in [0,1)

0.6190782059764083
[0.74948558 0.8457466  0.72481439 0.96761174 0.90407213]
[[0.6867945  0.00561837 0.88541375]
 [0.86408279 0.60902117 0.41515026]
 [0.01437327 0.07872579 0.86782603]]


In [37]:
# random() - like rand(), generates floats in the half-open interval [0,1), but takes the shape as a single tuple argument

print(np.random.random()) # no shape given: returns a single random float in [0,1)
print(np.random.random(5)) # 1-D array of 5 random floats in [0,1)
print(np.random.random((3,3))) # 2-D array of shape (3,3) holding 9 random floats in [0,1)

0.6666374582595443
[0.08929097 0.23481345 0.18290064 0.64420625 0.64828378]
[[0.79640908 0.72162398 0.35904063]
 [0.49252818 0.30545899 0.51707008]
 [0.20568215 0.6263797  0.73269251]]


In [38]:
# randn() - generates an array of floating-point numbers drawn from the standard normal distribution (mean 0, standard deviation 1)
a =  np.random.randn(1000000)
#print(a) 
print(np.mean(a))
print(np.median(a))


0.0008748848156047747
0.0006133398442438408


In [39]:
# randint() - generates random integers in a specified interval, as an array of the desired shape
# Syntax = np.random.randint(low, high, size); high is exclusive, and a single argument is treated as high with low = 0

print(np.random.randint(1)) # single integer from [0,1), so this is always 0
print(np.random.randint(5)) # single random integer from 0 to 5 (exclusive)
print(np.random.randint(3,10)) # single random integer from 3 to 10 (exclusive)
print(np.random.randint(3,10, size= (3,3))) # 2-D array of shape (3,3) of random integers from 3 to 10 (exclusive)


0
3
3
[[8 6 4]
 [6 8 4]
 [3 5 5]]
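
The random values above change on every run. When repeatable results are needed, the generator can be seeded with `np.random.seed`. The sketch below is a minimal illustration; the seed value 42 and the variable names are arbitrary choices made for this example.

```python
import numpy as np

np.random.seed(42)                          # 42 is an arbitrary seed value
first = np.random.randint(3, 10, size=5)

np.random.seed(42)                          # reset to the same seed
second = np.random.randint(3, 10, size=5)

# The same seed gives the same sequence of random integers
print(first)
print(second)
print(np.array_equal(first, second))
```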


In [None]:
# seed, index, slice