# 1. Introduction to NumPy

What is NumPy?  
NumPy stands for Numerical Python. It is a library that simplifies numerical and mathematical operations in Python.  
It provides a high-performance, multidimensional array object (ndarray) and tools for working with these arrays.  
NumPy is efficient because it uses vectorized operations, which run in optimized compiled code and reduce the need for explicit loops in Python code.  
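
To illustrate what "vectorized" means here, the following sketch computes the same sum of squares twice: once with an explicit Python loop and once with a single array expression. The variable names are illustrative only, and no timing figures are claimed.

```python
import numpy as np

values = np.arange(1, 6)  # array([1, 2, 3, 4, 5])

# Explicit Python loop: square each element one at a time
loop_total = 0
for v in values:
    loop_total += int(v) ** 2

# Vectorized: square every element in one expression, then sum
vectorized_total = int(np.sum(values ** 2))

print(loop_total, vectorized_total)  # 55 55
```

Both approaches give the same result; the vectorized form simply moves the loop out of Python code and into NumPy.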

Why is NumPy important?  
It is essential for machine learning, data science, and scientific computing because of its speed and simplicity.  
Many popular libraries, such as pandas, are built on top of NumPy, and frameworks like TensorFlow and PyTorch interoperate closely with NumPy arrays.

# Installation

Install NumPy via pip:

In [5]:
pip install numpy

Note: you may need to restart the kernel to use updated packages.



# 2. Arrays in NumPy

ndarray overview:  

The core object of NumPy is the N-dimensional array (ndarray), which allows you to store data of the same type.  

These arrays are more efficient than Python lists for numerical operations.  

The data is homogeneous: all elements of an array share one data type (dtype).
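
As a small sketch of what homogeneous storage implies, the example below inspects the `dtype` attribute of two arrays. When integers and floats are mixed, NumPy picks one common type for the whole array. The variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])  # an int/float mix is upcast to float

print(ints.dtype)   # a platform-dependent integer type, e.g. int64
print(mixed.dtype)  # float64
```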

# Creating arrays

From a list:

In [8]:
# Import NumPy under its conventional alias, np
import numpy as np
arr = np.array([1,2,3])
print(arr)

[1 2 3]


Zeros and Ones

In [9]:
np.zeros((3,3))  # creating a 3x3 matrix of zeros

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [11]:
np.ones((2,2))  # Creating a 2x2 matrix of ones

array([[1., 1.],
       [1., 1.]])

Range-Based arrays:

In [12]:
np.arange(0,10,2)  # Numbers from 0 up to (but not including) 10 with step 2

array([0, 2, 4, 6, 8])

In [13]:
np.linspace(0,1,5)  # 5 evenly spaced numbers from 0 to 1, both endpoints included

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

# Array Indexing

Array indexing means accessing an individual array element.  
  
You can access an array element by referring to its index number.  
  
Indexes in NumPy arrays start at 0: the first element has index 0, the second has index 1, and so on.

In [15]:
arr1 = np.array([[1,2,3],[5,6,7]])
print(arr1)

[[1 2 3]
 [5 6 7]]


In [29]:
print(f"3rd element on 2nd row is {arr1[1,2]}")  # Access element (row 1, column 2)

3rd element on 2nd row is 7


In [30]:
arr2 = np.array([1,2,3,4,5])
print(f"The 4th element is {arr2[3]}")  # Get the fourth element

The 4th element is 4


Negative indexing

Use negative indexes to access elements from the end of an array; for example, [-1] is the last element.

In [35]:
arr3 = np.array([1,2,3,4,5,8])
print(f"Access last element from array {arr3[-1]}")

Access last element from array 8


# Array Slicing

Slicing in Python means taking the elements from one given index up to, but not including, another given index.  
  
We pass a slice instead of an index, like this: [start:end]  

We can also define the step, like this: [start:end:step]  

If we don't pass end, the slice runs through the end of the array in that dimension.  

In [36]:
# Slice elements from index 2 to index 6 (index 7 is excluded) from the following array:
arr = np.array([1,2,3,4,5,6,7,8])
print(arr[2:7])

[3 4 5 6 7]


In [40]:
# Slice elements using negative indexes: from index -5 up to (but not including) index -2
arr = np.array([4,3,6,8,4,8,5,9])
print(arr[-5:-2])

[8 4 8]


In [41]:
# Return every other element from entire array:
print(arr[::2])

[4 6 4 5]


In [43]:
# Slicing 2D arrays
# From the second row, slice elements from index 2 to index 5 (index 6 not included):
arr = np.array([[1,2,3,4,5,6],[7,8,9,10,11,12]])
print(arr[1,2:6])

[ 9 10 11 12]


In [45]:
# From the first row, slice the first 3 numbers
print(arr[0,:3])

[1 2 3]


In [53]:
# Slice columns 2 to 5 from both rows (the array has only 2 rows, so the row slice is 0:2)
print(arr[0:2,2:6])

[[ 3  4  5  6]
 [ 9 10 11 12]]


# 3. Array Operations

Operations like addition, subtraction, multiplication, and division can be applied directly to arrays. These are called element-wise operations, as they happen for each element.  

They make calculations faster and simpler by removing the need for loops.

NumPy performs element-wise operations on arrays:

In [21]:
a = np.array([1,2,3])
b = np.array([4,5,6])
print("Addition of two arrays is ",a + b)  # Addition
print("Subtraction of two arrays is ",a - b)  # Subtraction
print("Multiplication of two arrays is ",a * b)  # Multiplication
print("Division of two arrays is ",a/b)    # Division

Addition of two arrays is  [5 7 9]
Subtraction of two arrays is  [-3 -3 -3]
Multiplication of two arrays is  [ 4 10 18]
Division of two arrays is  [0.25 0.4  0.5 ]


# Broadcasting:

Broadcasting lets you perform operations on arrays of different shapes. If the shapes don't match but are compatible, NumPy virtually "stretches" the smaller array so that the operation can happen.

In [56]:
a = np.array([[1,2],[3,4]])
b = np.array([10,20])
print(a+b) # Broadcasting adds b to each row of a

[[11 22]
 [13 24]]


# 4. Array Shape and Reshape

# Shape

The shape of an array is the number of elements in each dimension.  
  
* Get the shape of an array  
 NumPy arrays have an attribute called shape that returns a tuple in which each entry gives the number of elements along the corresponding dimension.

In [61]:
# Print the shape of a 2D array
arr = np.array([[3,4,3],[5,6,3]])
print(arr.shape)  # This array has 2 rows and 3 columns

(2, 3)


# Reshape

Reshaping means changing the shape of an array.  

By reshaping, we can add or remove dimensions or change the number of elements in each dimension, as long as the total number of elements stays the same.

In [65]:
# Reshape from 1D to 2D
arr = np.array([1,2,3,4,5,6,7,8])
newarr = arr.reshape(4,2)
print(newarr)
print(f"Dimension of new array is {newarr.ndim}")

[[1 2]
 [3 4]
 [5 6]
 [7 8]]
Dimension of new array is 2


In [69]:
# Reshape from 1D to 3D
arr = np.array([1,2,3,4,5,6,7,8,9,10,11,12])
newarr = arr.reshape(3,2,2)
print(newarr)
print("Shape of new array is ",newarr.shape)
print("Dim of new array is ",newarr.ndim)

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]
Shape of new array is  (3, 2, 2)
Dim of new array is  3
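
A reshape only succeeds when the new shape holds exactly as many elements as the original array. The sketch below, with illustrative names, shows a valid reshape, the use of -1 to let NumPy infer one dimension, and the ValueError raised for an incompatible shape.

```python
import numpy as np

arr = np.arange(12)  # 12 elements: 0..11

print(arr.reshape(3, 4).shape)   # (3, 4)
print(arr.reshape(2, -1).shape)  # (2, 6): -1 lets NumPy infer this dimension

try:
    arr.reshape(5, 3)  # 5 * 3 = 15 elements, but the array has 12
except ValueError as exc:
    print("reshape failed:", exc)
```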


# 5. Array Manipulation

Array manipulation in Python refers to techniques used to access, modify, reshape, combine, split, or analyze data stored in arrays. NumPy, a powerful library for numerical computation, offers a variety of functions and tools to make these operations easy and efficient.

Why is array manipulation important?  
* Simplifies handling and transforming large datasets.  
* Essential for reshaping arrays into formats suitable for machine learning models.  
* Enables advanced operations like filtering, stacking, and broadcasting.  
* Optimized for performance, making it faster and more memory-efficient than Python lists.
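
Combining, stacking, and splitting are mentioned above; a minimal sketch of each follows. It uses `np.concatenate`, `np.vstack`, and `np.split`, with illustrative variable names.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

joined = np.concatenate([a, b])  # join end to end into one 1D array
stacked = np.vstack([a, b])      # stack as rows into a 2D array
parts = np.split(joined, 3)      # split into three equal pieces

print(joined)                       # [1 2 3 4 5 6]
print(stacked.shape)                # (2, 3)
print([p.tolist() for p in parts])  # [[1, 2], [3, 4], [5, 6]]
```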

Slicing array
* To quickly access subsets of data without writing loops.

Indexing array
* To access individual elements or specific subsets of data.

Fancy indexing
* Fancy indexing uses arrays of indices to access specific elements.  
* We use fancy indexing to extract non-contiguous elements efficiently.

In [78]:
arr  = np.array([10,20,30,40,50])
indices = [0,2,4]
print(arr[indices])  # Extract the elements at specific indices

# Now use fancy indexing on a 2D array
matrix = np.array([[1,2],[3,4],[5,6]])
print(matrix)
print(matrix[[0,2],[1,0]])  # Select the elements at (row 0, column 1), which is 2, and (row 2, column 0), which is 5

[10 30 50]
[[1 2]
 [3 4]
 [5 6]]
[2 5]


Masking (Logical indexing)
* Masking uses a Boolean condition to filter elements from an array.
* We use masking to extract elements that satisfy a particular condition.

In [79]:
arr = np.array([12,43,76,86,96,4,6])

# Create mask for elements greater than 30
mask = arr > 30
print(mask) 

[False  True  True  True  True False False]


Explanation: A Boolean array is created indicating which elements satisfy the condition.

In [80]:
print(arr[mask])

[43 76 86 96]


Explanation: Filter the array to include only elements greater than 30.
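
Masks can also be combined and used for assignment. The sketch below reuses the same data; the names `in_range` and `clipped` are illustrative. Conditions are combined with `&` (and) or `|` (or), and each condition needs its own parentheses.

```python
import numpy as np

arr = np.array([12, 43, 76, 86, 96, 4, 6])

# Keep elements that are greater than 30 AND less than 90
in_range = (arr > 30) & (arr < 90)
print(arr[in_range])  # [43 76 86]

# A mask on the left-hand side modifies only the matching elements
clipped = arr.copy()
clipped[clipped > 90] = 90
print(clipped)  # [12 43 76 86 90  4  6]
```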

Reshaping array
* Reshaping changes the structure (shape) of an array without altering the data.
* It is used to prepare data for algorithms or change its dimensionality for compatibility.

# 6. Mathematical Functions in NumPy

NumPy provides a rich set of mathematical functions that help in performing various operations on arrays. These are primarily of three types:  

* Universal functions (ufuncs): These are element-wise functions that operate on arrays. They are optimized to work efficiently with NumPy arrays, providing vectorized operations without the need for explicit loops.  

* Statistical functions: These functions help analyze data by calculating various statistics (mean, median, variance, etc.) from arrays.  

* Linear Algebra Functions: These are used to perform operations specific to linear algebra such as matrix multiplication, inverse, eigenvalues, etc.  

Universal Functions (ufuncs)  

Ufuncs are functions that perform element-wise operations. Some commonly used ufuncs include:  

* np.add(), np.subtract(), np.multiply(), np.divide(): These functions perform element-wise addition, subtraction, multiplication, and division, respectively.
* Other functions: np.sqrt(), np.sin(), np.exp(), etc., are used for more specific operations like square root, sine, and exponential operations.

In [83]:
# Element-wise addition, subtraction, multiplication, and division
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
add = np.add(a, b)
sub = np.subtract(a,b)
mul = np.multiply(a,b)
div = np.divide(a,b)
print("Addition ",add)
print("Subtraction ",sub)
print("multiplication ",mul)
print("Division ",div)

# Element-wise square root
arr = np.array([4, 9, 16])
sqrt_result = np.sqrt(arr)
print("Square root ",sqrt_result) 

Addition  [5 7 9]
Subtraction  [-3 -3 -3]
multiplication  [ 4 10 18]
Division  [0.25 0.4  0.5 ]
Square root  [2. 3. 4.]
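
`np.sin()` and `np.exp()` are listed above but not demonstrated; a short sketch with illustrative inputs follows. Exact printed digits are omitted because the results are subject to floating-point rounding.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)       # approximately [0, 1, 0]
print(sines)

powers = np.array([0.0, 1.0, 2.0])
exponentials = np.exp(powers)  # e**0, e**1, e**2
print(exponentials)
```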


Statistical Functions  

NumPy provides a suite of statistical functions to compute common statistics:
* np.mean(): Compute the average of elements.
* np.median(): Compute the median of elements.
* np.std(): Compute the standard deviation of elements.
* np.var(): Compute the variance of elements.
* np.min() and np.max(): Compute the minimum and maximum values, respectively.

In [88]:
data = np.array([1, 2, 3, 4, 5])

# Mean
mean_val = np.mean(data)
print(f"Mean: {mean_val}")  

# Median
median = np.median(data)
print(f"Median: {median}")

# Standard Deviation
std_val = np.std(data)
print(f"Standard Deviation: {std_val}")  

# Variance
var = np.var(data)
print(f"Variance: {var}")

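# Minimum (named min_val to avoid shadowing Python's built-in min)
min_val = np.min(data)
print(f"Minimum: {min_val}")

# Maximum (named max_val to avoid shadowing Python's built-in max)
max_val = np.max(data)
print(f"Maximum: {max_val}")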

Mean: 3.0
Median: 3.0
Standard Deviation: 1.4142135623730951
Variance: 2.0
Minimum: 1
Maximum: 5
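
Two optional parameters are worth knowing here, sketched below with illustrative data. The `axis` argument computes a statistic per column or per row of a 2D array, and `ddof=1` switches `np.std()` from the population standard deviation (the default, which gives the 1.414... shown above) to the sample standard deviation.

```python
import numpy as np

table = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(np.mean(table))          # 3.5, over all elements
print(np.mean(table, axis=0))  # [2.5 3.5 4.5], one value per column
print(np.mean(table, axis=1))  # [2. 5.], one value per row

data = np.array([1, 2, 3, 4, 5])
sample_std = np.std(data, ddof=1)  # divides by n - 1 instead of n
print(sample_std)                  # about 1.58
```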


Linear Algebra Functions  

NumPy also provides a set of functions specific to linear algebra, such as:  
* np.dot(): Matrix multiplication or dot product of two arrays.
* np.linalg.inv(): Computes the inverse of a matrix.
* np.linalg.eig(): Computes the eigenvalues and eigenvectors of a matrix.

In [90]:
# Matrix multiplication (dot product)
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = np.dot(A, B)
print(result)  

# Inverse of a matrix
matrix = np.array([[1, 2], [3, 4]])
inverse = np.linalg.inv(matrix)
print(inverse)  

[[19 22]
 [43 50]]
[[-2.   1. ]
 [ 1.5 -0.5]]
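
`np.linalg.eig()` is listed above but not shown. The sketch below uses a simple diagonal matrix (an illustrative choice, so the eigenvalues are easy to read off) and checks the defining property that M @ v equals the eigenvalue times v for each eigenvector v.

```python
import numpy as np

M = np.array([[2.0, 0.0],
              [0.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(M)
print(eigenvalues)  # the diagonal entries 2 and 3

# Each column of `eigenvectors` is the eigenvector for the matching eigenvalue
checks = []
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    checks.append(bool(np.allclose(M @ v, eigenvalues[i] * v)))
print(checks)  # [True, True]
```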


# 7. Random Number Generation in NumPy

NumPy provides functionality for generating random numbers and performing various random operations, which is crucial for tasks such as Monte Carlo simulations, random sampling, or initializing machine learning model parameters.  

Key Functions:  
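* np.random.rand(): Generates random numbers from a uniform distribution over the half-open range [0, 1), so 1 itself is never returned.
* np.random.randint(): Generates random integers within a specified range; the lower bound is included and the upper bound is excluded.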
* np.random.randn(): Generates random numbers from a standard normal distribution.
* np.random.seed(): Sets the seed for random number generation to ensure reproducibility.

In [91]:
import numpy as np

# Generate 5 random floats between 0 and 1
rand_floats = np.random.rand(5)
print(rand_floats)

# Generate 5 random integers from 10 (inclusive) to 50 (exclusive)
rand_ints = np.random.randint(10, 50, 5)
print(rand_ints)

# Set seed for reproducibility
np.random.seed(42)
rand_vals_with_seed = np.random.rand(3)
print(rand_vals_with_seed)

[0.72394861 0.59770188 0.86929118 0.42084416 0.41691031]
[30 43 47 32 41]
[0.37454012 0.95071431 0.73199394]
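
The sketch below illustrates what reproducibility means in practice: resetting the same seed restarts the same sequence of numbers. It also shows `np.random.randn()`, which is listed above but not demonstrated. The seed value 0 and the variable names are arbitrary choices.

```python
import numpy as np

np.random.seed(0)
first = np.random.rand(3)

np.random.seed(0)  # resetting the seed restarts the same sequence
second = np.random.rand(3)

print(np.array_equal(first, second))  # True

normal_samples = np.random.randn(2, 3)  # 2x3 array from the standard normal distribution
print(normal_samples.shape)             # (2, 3)
```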


Random Sampling, Shuffling, and Permutations:
* np.random.shuffle(): Randomly shuffles the elements of an array in place.
* np.random.permutation(): Randomly permutes a sequence or returns a permuted copy of an array.

In [93]:
# Shuffling an array
arr = np.array([1, 2, 3, 4, 5])
np.random.shuffle(arr)
print(arr) 

# Permuting a sequence
arr_permutation = np.random.permutation([1, 2, 3, 4, 5])
print(arr_permutation) 


[4 2 3 1 5]
[2 1 4 5 3]


# 8. Broadcasting in NumPy

What is Broadcasting? Broadcasting in NumPy refers to the ability to perform operations on arrays of different shapes. NumPy applies broadcasting rules to automatically expand the smaller array to match the shape of the larger one without making a copy of the data. This leads to more efficient memory usage and computation.  

How Broadcasting Works:  
When performing an operation between two arrays with different shapes, NumPy compares their shapes dimension by dimension, starting from the trailing (rightmost) dimension. The smaller array is "broadcast" to match the larger array's shape, provided that the arrays meet certain conditions.  
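
One way to see that broadcasting does not copy data is `np.broadcast_to()`, which returns a read-only view of the stretched array. In the sketch below (illustrative names), the stride of 0 along the first axis means every "row" of the view points at the same memory.

```python
import numpy as np

b = np.array([10, 20, 30, 40])     # shape (4,)
view = np.broadcast_to(b, (3, 4))  # read-only view with shape (3, 4)

print(view.shape)                 # (3, 4)
print(view.strides[0])            # 0: moving to the next row does not move in memory
print(np.shares_memory(view, b))  # True
```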



Broadcasting Rules:
* If the arrays have different ranks, the shape of the smaller array is padded with ones on the left side until both arrays have the same rank.
* The two arrays are compatible if in all dimensions, the size of one of the arrays is either 1 or equal to the size of the other array.
* If these conditions are not met, NumPy will raise a ValueError.
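
The rules above can be sketched as a small pure-Python function. `broadcast_shape` is a hypothetical helper written for this illustration, not part of NumPy; it takes two shape tuples and returns the shape of the result, or raises a ValueError.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape (illustrative helper, not a NumPy API)."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with ones on the left
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        # Rule 2: sizes must be equal, or one of them must be 1
        if da == db or db == 1:
            result.append(da)
        elif da == 1:
            result.append(db)
        else:
            # Rule 3: otherwise the shapes are incompatible
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(result)

print(broadcast_shape((3, 1), (4,)))  # (3, 4)
print(broadcast_shape((2, 2), (2,)))  # (2, 2)

# Cross-check against NumPy's own result for the first case
print(np.broadcast(np.empty((3, 1)), np.empty((4,))).shape)  # (3, 4)
```

Calling `broadcast_shape((3, 2), (2, 2))` raises a ValueError, because the leading sizes 3 and 2 differ and neither is 1.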

In [94]:
# Array with shape (3, 1)
A = np.array([[1], [2], [3]])

# Array with shape (4,), which broadcasting treats as (1, 4)
B = np.array([10, 20, 30, 40])

# Broadcasting A and B together
result = A + B
print(result)

[[11 21 31 41]
 [12 22 32 42]
 [13 23 33 43]]


In this case, both arrays are broadcast: A is stretched across 4 columns and B is stretched across 3 rows, so both behave like arrays of shape (3, 4), and the addition is then performed element by element.

In [95]:
# Array with shape (3, 2)
A = np.array([[1, 2], [3, 4], [5, 6]])

# Array with shape (2, 2)
B = np.array([[1, 1], [2, 2]])

# This will raise a ValueError because the shapes are not compatible for broadcasting
result = A + B


ValueError: operands could not be broadcast together with shapes (3,2) (2,2) 