# Essentials of NumPy

# 1.0 NumPy
NumPy is a Python library that is commonly used for 
scientific computing and data analysis. It stands for 
Numerical Python and provides fast and efficient numerical 
computation on large datasets. NumPy's core routines are implemented in 
compiled C, which makes array operations much faster than equivalent pure Python loops. 

## 1.1 Creating NumPy Arrays
NumPy arrays are similar to Python lists, but they are more 
efficient for numerical computation. They 
can be one-dimensional or multi-dimensional, and they are 
homogeneous: every element shares the same data type.
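
As a rough illustration of what "homogeneous" and "efficient" mean in practice, the sketch below mixes integers and a float in one array and compares list and array arithmetic. The variable names (mixed, values, doubled_list, doubled_array) are illustrative only.

```python
import numpy as np

# Mixing ints and a float: NumPy upcasts everything to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Arithmetic applies to every element of an array at once
values = [1, 2, 3]
doubled_list = values * 2              # list repetition, not arithmetic
doubled_array = np.array(values) * 2   # element-wise multiplication

print(doubled_list)   # [1, 2, 3, 1, 2, 3]
print(doubled_array)  # [2 4 6]
```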

Below are some functions that can 
be used to create NumPy arrays:

### np.array() 
Creates an array from a list or tuple.

In [7]:
import numpy as np 

lst = [1, 2, 3, 4, 5, 6]
# Creating an array from list
arr = np.array(lst)
arr

array([1, 2, 3, 4, 5, 6])

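### dtype 
The dtype argument sets the data type of the elements when an array is created, and the array's dtype attribute (arr.dtype) lets you check it.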

In [8]:
lst = [1, 2, 3, 4, 5, 6]

# Creating an array of floats from list
arr = np.array(lst, dtype = float)
arr

array([1., 2., 3., 4., 5., 6.])

In [60]:
names = [["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]]

arr1 = np.array(names)
arr1

array([['Jon', 'Mary', 'Paul'],
       ['Peter', 'Ben', 'Saul']], dtype='<U5')
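
The cells above set the data type at creation time. A minimal sketch of checking and converting it afterwards might look like the following; the names as_int and names_arr are made up for this example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6], dtype=float)
print(arr.dtype)  # float64

# astype() returns a converted copy; the original array is unchanged
as_int = arr.astype(int)
print(as_int)

# String arrays get a Unicode dtype sized to the longest string;
# '<U5' (as seen above on little-endian machines) means up to 5 characters
names_arr = np.array([["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]])
print(names_arr.dtype)
```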

### ndim 
To check the number of dimensions of an array

In [11]:
# Checking for number of dimensions in the array
arr1.ndim

2
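
ndim is usually read together with the shape and size attributes. The short sketch below shows all three on a 2-D and a 1-D array; names_arr and one_d are illustrative names.

```python
import numpy as np

names_arr = np.array([["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]])
one_d = np.array([1, 2, 3])

print(names_arr.ndim)   # 2 (rows and columns)
print(names_arr.shape)  # (2, 3): 2 rows, 3 columns
print(names_arr.size)   # 6 elements in total

print(one_d.ndim)       # 1
print(one_d.shape)      # (3,)
```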

### np.arange()
Creates an array of regularly spaced values within a given range. It works like Python's built-in range() function, but returns an array.

In [15]:
arr2 = np.arange(0, 50, 5)
arr2

array([ 0,  5, 10, 15, 20, 25, 30, 35, 40, 45])

We have generated an array of values from 0 to 50 (50 
excluded), spaced by 5. 
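
A few more call patterns, sketched here to show how np.arange() mirrors range() and where it goes further (the variable names are illustrative):

```python
import numpy as np

# A single argument is treated as the stop value, starting from 0
first_five = np.arange(5)

# A negative step counts downwards; the stop value is still excluded
countdown = np.arange(10, 0, -2)

# Unlike range(), arange() also accepts a non-integer step
quarters = np.arange(0, 1, 0.25)

# Same values as the built-in range() with the same arguments
same_as_range = np.arange(0, 50, 5).tolist() == list(range(0, 50, 5))

print(first_five)     # [0 1 2 3 4]
print(countdown)      # [10  8  6  4  2]
print(quarters)       # [0.   0.25 0.5  0.75]
print(same_as_range)  # True
```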

### np.zeros()
Creates an array with all elements set to zero. It lets us set the shape and data type of the array.

In [16]:
arr3 = np.zeros((2,3), dtype = float)
arr3

array([[0., 0., 0.],
       [0., 0., 0.]])

### np.ones()  
The np.ones() function works similarly to the np.zeros() 
function. The only difference is that the np.ones() function 
creates an array with all elements set to one (1). 

In [24]:
arr3 = np.ones((2,3), dtype = float)
arr3

array([[1., 1., 1.],
       [1., 1., 1.]])

### numpy.random.Generator.integers 
The numpy.random.Generator.integers method generates 
random integers drawn uniformly from a specified range. 
By default, the lower bound (low) is included and the 
upper bound (high) is excluded. 


In [25]:
# Create a random number generator
rng = np.random.default_rng()

# Generate random integers from 0 to 9 
rng.integers(low=0, high=10, size=(2, 4))

array([[9, 0, 9, 0],
       [4, 6, 4, 6]])
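
To make the range boundaries concrete, here is a small sketch that draws many integers and inspects the smallest and largest values. The seed value 0 and the names digits and dice are arbitrary choices for this example; endpoint=True is the Generator.integers option that makes high inclusive.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily for this sketch

# high is exclusive by default, so values fall in 0..9
digits = rng.integers(low=0, high=10, size=1000)

# endpoint=True makes high inclusive, e.g. a six-sided die (1..6)
dice = rng.integers(low=1, high=6, size=1000, endpoint=True)

print(digits.min(), digits.max())
print(dice.min(), dice.max())
```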

### numpy.random.Generator.random 
This method creates an array of random floats in the half-open interval [0, 1), so 0 can occur but 1 cannot.

In [29]:
# Create a random number generator
rng = np.random.default_rng(seed = 24)

# Generate random floats between 0 and 1 
rng.random((2, 4))

array([[0.33026884, 0.40517732, 0.57473782, 0.50639977],
       [0.56421251, 0.56968731, 0.87411653, 0.08643046]])

A seed sets the starting point for the random number generator, which makes the output predictable and repeatable. When you use the same seed, NumPy produces the exact same sequence of “random” numbers every time. This is essential for debugging, teaching, and running experiments that need consistent results.
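
One way to check the repeatability claim is to build two generators with the same seed and compare their output. In this sketch rng_a, rng_b, first and second are illustrative names.

```python
import numpy as np

# Two independent generators started from the same seed
rng_a = np.random.default_rng(seed=24)
rng_b = np.random.default_rng(seed=24)

first = rng_a.random((2, 4))
second = rng_b.random((2, 4))

# Same seed, same sequence of "random" numbers
print(np.array_equal(first, second))  # True

# Without a seed, the values differ from run to run
unseeded = np.random.default_rng().random((2, 4))
print(unseeded)
```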

## 1.2 Accessing Array Elements

Once you have created a NumPy array, you may want to 
access its elements. NumPy arrays are indexed using 
integers starting from zero (0). Here are some ways to 
access NumPy array elements:

### Slicing
Slicing in NumPy lets you extract a range of elements from an array using the colon (:), just like Python lists.

In [31]:
names = [["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]]

arr1 = np.array(names)
arr1

array([['Jon', 'Mary', 'Paul'],
       ['Peter', 'Ben', 'Saul']], dtype='<U5')

In [33]:
# Select the names "Mary" and "Paul" from the array above

select_mary_paul = arr1[0,1:]
select_mary_paul

array(['Mary', 'Paul'], dtype='<U5')

arr1[0, 1:] means “go to row 0, start at index 1, and take everything after it,” which selects "Mary" and "Paul" from that row.
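
The same row-and-column pattern extends to other slices. A few variations, sketched with illustrative variable names:

```python
import numpy as np

names_arr = np.array([["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]])

# A bare colon means "every row": this picks the first column
first_column = names_arr[:, 0]

# Row 1, columns 0 up to (but not including) 2
peter_ben = names_arr[1, :2]

# A step of 2 takes every other column
every_other = names_arr[:, ::2]

print(first_column)  # ['Jon' 'Peter']
print(peter_ben)     # ['Peter' 'Ben']
print(every_other)
```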

### Fancy Indexing 
This allows you to select elements based on a list of indices. 

In [35]:
# Row and column indices of the elements to select
row_indices = np.array([1, 0])
col_indices = np.array([0, 2])

# Using fancy indexing to select arr1[1, 0] and arr1[0, 2]
select_peter_paul = arr1[row_indices, col_indices]
select_peter_paul


array(['Peter', 'Paul'], dtype='<U5')

Fancy indexing works by creating separate index arrays for the rows and columns you want to extract, then passing them to the original array to pull out the matching elements. NumPy pairs each row index with the column index at the same position, so here it reads arr1[1, 0] ("Peter") and arr1[0, 2] ("Paul"). The index arrays must therefore have the same shape, or be broadcastable to a common shape.
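
The pairing of row and column indices can be checked against plain element access. This is a minimal sketch; rows, cols, picked, manual and reordered are names invented for the example.

```python
import numpy as np

names_arr = np.array([["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]])

rows = np.array([1, 0])
cols = np.array([0, 2])

# Pairs the indices position by position: (1, 0) and (0, 2)
picked = names_arr[rows, cols]

# The same two elements, fetched one at a time
manual = np.array([names_arr[1, 0], names_arr[0, 2]])
print(np.array_equal(picked, manual))  # True

# Passing only row indices selects whole rows, in the order given
reordered = names_arr[[1, 0]]
print(reordered)
```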

### Boolean Indexing
**Boolean indexing** filters an array by creating a second array of `True/False` values based on a condition, where True marks the elements you want to keep. NumPy then uses this Boolean array to return only the elements that satisfy the condition. The example below keeps the odd numbers and drops the even ones.

In [37]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Create a boolean mask: True where the element is odd
filter_array = arr % 2 != 0 

# select the elements from the original array based on the filter 
arr[filter_array]

array([1, 3, 5, 7, 9])
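
Conditions can also be combined, and a mask can be used to assign as well as to select. The sketch below assumes the same 3×3 array; the names mask, selected and clipped are illustrative. Note the parentheses around each condition and the use of & rather than Python's and.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Combine conditions element-wise with & (and) or | (or)
mask = (arr > 2) & (arr % 2 == 0)
selected = arr[mask]
print(selected)  # [4 6 8]

# A mask can also choose which elements to overwrite
clipped = arr.copy()
clipped[clipped > 5] = 0
print(clipped)
```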

## 1.3 Array Manipulation

###  np.reshape() 
Changes the shape of an array without changing its data.

In [38]:
# Creating an array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Flatten the array using reshape
np.reshape(arr, 9)

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

The new shape must hold exactly as many elements as the 
original array (here 3 × 3 = 9). Passing the single number 9 
tells reshape to return a 1-dimensional array of length 9. 

In [41]:
names = [["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]]

arr1 = np.array(names)
arr1

array([['Jon', 'Mary', 'Paul'],
       ['Peter', 'Ben', 'Saul']], dtype='<U5')

In [43]:
# reshaping array to a (3, 2) shape
new_array = np.reshape(arr1, (3, 2))
new_array

array([['Jon', 'Mary'],
       ['Paul', 'Peter'],
       ['Ben', 'Saul']], dtype='<U5')
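
Two details are worth sketching here: -1 asks NumPy to work out one dimension from the total number of elements, and a shape that does not fit the element count raises a ValueError. The variable names below are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
names_arr = np.array([["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]])

# -1 means "infer this dimension": 9 elements -> length 9
flat = np.reshape(arr, -1)

# 6 elements with 2 columns -> NumPy infers 3 rows
pairs = np.reshape(names_arr, (-1, 2))
print(pairs.shape)  # (3, 2)

# 6 elements cannot fill a (4, 2) array, so reshape raises ValueError
error_message = None
try:
    np.reshape(names_arr, (4, 2))
except ValueError as err:
    error_message = str(err)
print(error_message)
```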

### np.concatenate()
Joins two or more arrays together along an existing axis

In [49]:
arr1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
arr2 = np.array([[100, 110, 120], [130, 140, 150]])

# Concatenate on axis-0
arr3 = np.concatenate((arr1, arr2), axis=0)
arr3

array([[ 10,  20,  30],
       [ 40,  50,  60],
       [ 70,  80,  90],
       [100, 110, 120],
       [130, 140, 150]])

Note that for concatenation to work, the arrays must have the same number of dimensions and the same shape along every axis except the one being joined. If we tried to join these two arrays on axis 1 (side by side), we would get an error because the two arrays do not have an equal number of rows. 

* *Why concatenation on axis=0 works*

axis=0 means "stack them vertically, one on top of the other." To stack vertically, the number of columns must match. Both arrays have 3 columns, so it works.

* *Why concatenation on axis=1 fails*

axis=1 means "join them side by side, left and right." To join side by side, the number of rows must match. But arr1 has 3 rows and arr2 has 2 rows, so NumPy raises an error.
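
The axis=1 failure described above can be reproduced directly, alongside a case where it succeeds. In this sketch extra_column is a hypothetical array added only to show a side-by-side join that works.

```python
import numpy as np

arr1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])   # 3 rows
arr2 = np.array([[100, 110, 120], [130, 140, 150]])           # 2 rows

# Row counts differ (3 vs 2), so joining side by side fails
failed = False
try:
    np.concatenate((arr1, arr2), axis=1)
except ValueError as err:
    failed = True
    print(err)

# Hypothetical array with 3 rows, so the row counts now match
extra_column = np.array([[1], [2], [3]])
wide = np.concatenate((arr1, extra_column), axis=1)
print(wide.shape)  # (3, 4)
```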

### np.split()
Splits an array into equal-sized smaller arrays along an axis (axis 0, the rows, by default)

In [54]:
names = [["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]]

# Creating array
arr1 = np.array(names)

# using split to split array into two parts
split_one, split_two = np.split(arr1, 2)
split_one

array([['Jon', 'Mary', 'Paul']], dtype='<U5')

In [55]:
split_two

array([['Peter', 'Ben', 'Saul']], dtype='<U5')
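
np.split() can also cut along the columns, and it insists on an equal division. The sketch below shows both, plus np.array_split(), a related NumPy function that tolerates uneven splits; the variable names are illustrative.

```python
import numpy as np

names_arr = np.array([["Jon", "Mary", "Paul"], ["Peter", "Ben", "Saul"]])

# axis=1 splits by column: three arrays of shape (2, 1)
columns = np.split(names_arr, 3, axis=1)
print(columns[0])

# 3 columns cannot be divided equally into 2 parts
uneven_failed = False
try:
    np.split(names_arr, 2, axis=1)
except ValueError:
    uneven_failed = True

# np.array_split() allows parts of unequal size
left, right = np.array_split(names_arr, 2, axis=1)
print(left.shape, right.shape)  # (2, 2) (2, 1)
```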

### np.transpose()
Flips an array so that rows become columns and columns become rows, like turning a table on its side.
It reorganizes the data without changing the values, only how the data is oriented, which is essential for operations like matrix multiplication where shapes must match.

In [57]:
arr1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
arr2 = np.array([[100, 110, 120], [130, 140, 150]])

# Performing a dot operation
np.dot(arr1, arr2.transpose())

array([[ 6800,  8600],
       [16700, 21200],
       [26600, 33800]])

To make the dot operation possible, we transpose arr2.
Transposing flips arr2 from a 2×3 matrix into a 3×2 matrix, which satisfies the rule that:
   
    the number of columns in the first matrix must equal the number of rows in the second.

So:     
100 110 120   
130 140 150   

becomes     

100 130     
110 140     
120 150     


Now the inner dimensions match (3 columns in arr1 and 3 rows in the transposed arr2), allowing the dot product to work.
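
The shape rule can be verified in a few lines. This sketch also uses the .T attribute and the @ operator, which are shorthand for transpose() and matrix multiplication; the names product and untransposed_failed are illustrative.

```python
import numpy as np

arr1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])   # shape (3, 3)
arr2 = np.array([[100, 110, 120], [130, 140, 150]])           # shape (2, 3)

print(arr2.T.shape)  # (3, 2): .T is shorthand for transpose()

# (3, 3) @ (3, 2) -> (3, 2); @ is the matrix-multiplication operator
product = arr1 @ arr2.T
print(np.array_equal(product, np.dot(arr1, arr2.transpose())))  # True

# Without the transpose, the inner dimensions (3 and 2) do not match
untransposed_failed = False
try:
    np.dot(arr1, arr2)
except ValueError:
    untransposed_failed = True
```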

## 1.4 Mathematical Functions

Mathematical functions that can be used on arrays:

### np.add() and np.subtract() 
np.add() performs element‑wise addition between two arrays, while np.subtract() performs element‑wise subtraction, operating on matching positions in each array.

In [62]:
arr1 = np.array([[1, 2, 4], [6, 7, 8]])
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# adding two arrays
np.add(arr1, arr2)

array([[ 4,  5, 10],
       [10, 12, 15]])

In [63]:
# Subtracting two arrays
np.subtract(arr1, arr2)

array([[-2, -1, -2],
       [ 2,  2,  1]])

### np.multiply() and np.divide()
np.multiply() performs element‑wise multiplication between two arrays, and np.divide() performs element‑wise division; both operations require the arrays to be broadcastable to a compatible shape.

In [67]:
arr1 = np.array([[1, 2, 4], [6, 7, 8]])
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# multiplying two arrays
np.multiply(arr1, arr2)

array([[ 3,  6, 24],
       [24, 35, 56]])

In [68]:
arr1 = np.array([[1, 2, 4], [6, 7, 8]])
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# dividing two arrays
np.divide(arr1, arr2)

array([[0.33333333, 0.66666667, 0.66666667],
       [1.5       , 1.4       , 1.14285714]])
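
"Broadcastable" means the shapes do not have to be identical: a smaller array or a single number can be stretched to fit. A minimal sketch, with scale, scaled, halved and mismatch_failed as illustrative names:

```python
import numpy as np

arr1 = np.array([[1, 2, 4], [6, 7, 8]])   # shape (2, 3)

# A length-3 array is applied to every row of the (2, 3) array
scale = np.array([1, 10, 100])
scaled = np.multiply(arr1, scale)
print(scaled)

# A single number is broadcast to every element
halved = np.divide(arr1, 2)
print(halved)

# Shapes (2, 3) and (2,) are not compatible, so this raises ValueError
mismatch_failed = False
try:
    np.multiply(arr1, np.array([1, 2]))
except ValueError:
    mismatch_failed = True
```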

###  np.power() and  np.sqrt() 
The np.power() function raises each element of an array to 
a given power. The np.sqrt() function calculates the square 
root of each element in an array. 

In [70]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Raising array to power 2
np.power(arr2, 2)

array([[ 9,  9, 36],
       [16, 25, 49]])

In [71]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the square-root of array
np.sqrt(arr2)

array([[1.73205081, 1.73205081, 2.44948974],
       [2.        , 2.23606798, 2.64575131]])

## 1.5 Statistical Functions
NumPy also provides a wide range of statistical functions that can be used on arrays. These include:

### np.mean() 
Calculates the mean of an array. By 
default, the mean is computed over all elements, as if the 
array were flattened. You can also specify the axis 
along which you want the mean to be calculated. 

In [82]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the mean of each row
np.mean(arr2, axis=1)

array([4.        , 5.33333333])
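
For comparison with the row-wise result above, this sketch computes the default (whole-array) mean and the column-wise mean of the same array. The variable names are illustrative.

```python
import numpy as np

arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Default: one mean over all six elements (28 / 6)
overall_mean = np.mean(arr2)

# axis=0: one mean per column
column_means = np.mean(arr2, axis=0)

print(overall_mean)
print(column_means)  # [3.5 4.  6.5]
```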

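### np.median()  
Calculates the median (the middle value of the sorted data) of an array. As with np.mean(), you can specify the axis along which it is calculated.
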
In [85]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the median of each row
np.median(arr2, axis=1)

array([3., 5.])

Note on axis:
* axis=0: vertical (works down the columns, giving one result per column)
* axis=1: horizontal (works across the rows, giving one result per row)
* axis=None: ignores the axes and operates on the entire array as one flat list, so np.median(arr2, axis=None) is the same as np.median(arr2)
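
A quick way to internalise the axis rule is to look at the shape of each result. The sketch below applies np.median() to the same 2×3 array along each axis; the variable names are illustrative.

```python
import numpy as np

arr2 = np.array([[3, 3, 6], [4, 5, 7]])   # 2 rows, 3 columns

down_columns = np.median(arr2, axis=0)    # one value per column
across_rows = np.median(arr2, axis=1)     # one value per row
whole_array = np.median(arr2, axis=None)  # a single value

print(down_columns.shape)  # (3,)
print(across_rows.shape)   # (2,)
print(whole_array)         # 4.5, the median of 3, 3, 4, 5, 6, 7
```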

### np.std() 
Standard deviation is a measure of how spread out the data is. It tells you how much the data varies from the average (mean).   

* *Memory Trick* : Standard deviation = square root of the average squared distance from the mean.   
If the numbers are close together → std is small  
If the numbers are spread out → std is larger

Standard deviation = square root of variance

In [95]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the std of each column
np.std(arr2, axis=0)

array([0.5, 1. , 0.5])
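
The memory trick can be followed step by step and compared with np.std(). This is a sketch with illustrative variable names. By default np.std() divides by the number of values n (the population formula); passing ddof=1 divides by n - 1 instead, which gives the sample standard deviation.

```python
import numpy as np

arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Step 1: distance of each value from its column mean
column_means = arr2.mean(axis=0)
distances = arr2 - column_means

# Step 2: average of the squared distances (the variance)
variance = (distances ** 2).mean(axis=0)

# Step 3: square root of the variance
std_manual = np.sqrt(variance)

print(variance)    # [0.25 1.   0.25]
print(std_manual)  # [0.5 1.  0.5]
print(np.allclose(std_manual, np.std(arr2, axis=0)))  # True

# Sample standard deviation: divide by n - 1 instead of n
sample_std = np.std(arr2, axis=0, ddof=1)
print(sample_std)
```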

### np.var()  
Variance measures the average of the squared differences from the mean.

In [88]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the var of each column
np.var(arr2, axis=0)

array([0.25, 1.  , 0.25])

### np.min() and np.max()  
np.min() calculates the minimum value of an array, 
and np.max() calculates the maximum value of an 
array.

In [89]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the min of each column
np.min(arr2, axis=0)

array([3, 3, 6])

In [97]:
arr2 = np.array([[3, 3, 6], [4, 5, 7]])

# Calculating the max of the whole array
np.max(arr2, axis=None)

np.int64(7)