# NumPy

## Table of contents

### 1 Introduction 

* What is NumPy and why is it useful?
* Applications of NumPy in real-world scenarios
* Comparison with Python built-in lists

### 2 Installation and importing 

* Installing NumPy using pip
* Importing NumPy as np

### 3 NumPy arrays
* Creating NumPy arrays (1D, 2D, and 3D)
* Array attributes (shape, size, dtype)
* Indexing and slicing arrays
* Reshaping and flattening arrays

### 4 Array operations 

* Arithmetic operations (addition, subtraction, multiplication, division)
* Broadcasting
* Element-wise operations (sqrt, log, sin, cos)
* Matrix operations (dot product, cross product, transpose)

### 5 Array manipulation 

* Concatenation and splitting
* Stacking and unstacking
* Inserting and deleting elements

### 6 Statistical functions
* Descriptive statistics (mean, median, mode, variance, standard deviation)
* Basic operations on arrays (sum, max, min, argmax, argmin)

### 7 Random number generation 

* Generating random numbers (uniform, normal, binomial distributions)
* Setting a random seed
* Random sampling

### 8 Assignment




## Introduction

### What is NumPy and why is it useful?

NumPy, short for Numerical Python, is a powerful Python library for numerical computing. 

It provides support for multi-dimensional arrays and a large collection of mathematical functions to perform operations on these arrays.

NumPy is the foundation for many other scientific and data analysis libraries in Python, such as SciPy, Pandas, and Scikit-learn.

Some advantages of NumPy include:

* Efficient array operations: NumPy uses optimized C code under the hood, which makes it much faster than Python's built-in lists for numerical operations.


* Broadcasting: NumPy lets you combine arrays of different but compatible shapes in a single operation (for example, adding a row to every row of a matrix), making your code more concise and easier to read.


* Mathematical functions: NumPy provides a wide range of mathematical functions that can be applied element-wise on arrays, simplifying complex mathematical operations.

### Applications of NumPy in real-world scenarios

NumPy is widely used in various fields, including:

* Data analysis: NumPy provides the foundation for data manipulation and analysis libraries like Pandas.

    
* Machine learning: Libraries such as Scikit-learn and TensorFlow use NumPy for their core operations.


* Image processing: NumPy arrays can represent images, enabling efficient image manipulation and processing.


* Scientific computing: NumPy is used in various scientific domains, such as physics, biology, and engineering, for numerical simulations and modeling.

    
* Finance: NumPy can be used for financial modeling, risk analysis, and portfolio optimization.

### Comparison with Python built-in lists

While Python built-in lists are flexible and easy to use, they have some limitations when it comes to numerical computing. Here are some key differences between NumPy arrays and Python lists:

* Performance: NumPy arrays are more efficient and faster than Python lists for numerical operations due to their underlying C implementation and contiguous memory allocation.

    
* Array operations: NumPy arrays support element-wise operations and broadcasting, making it easier to perform complex mathematical operations on arrays.


* Data types: Python lists can store elements of different data types, whereas NumPy arrays can only store elements of the same data type. This restriction allows NumPy arrays to be more memory-efficient and perform operations faster.


* Multidimensional support: NumPy arrays natively support multi-dimensional data, whereas Python lists require nested lists to represent multi-dimensional data, which can be less efficient and harder to work with.
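
The differences listed above can be seen in a few lines. This is a minimal sketch; the variable names (`py_list`, `np_array`, `mixed`) are chosen only for illustration.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

# Multiplying a list by an integer repeats the list
doubled_list = py_list * 2
print(doubled_list)      # [1, 2, 3, 1, 2, 3]

# Multiplying an array by a number is applied to every element
doubled_array = np_array * 2
print(doubled_array)     # [2 4 6]

# A list can mix types; an array converts everything to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)       # float64
```

Note that the same `*` operator means repetition for a list and element-wise multiplication for an array.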


    
In conclusion, NumPy is a powerful and essential library for numerical computing in Python. It offers numerous advantages over Python's built-in lists, making it a popular choice for various scientific and data analysis applications.

## Installation and importing

Note: If you are working with Jupyter Notebook or Google Colab, NumPy should already be installed.

### Installing NumPy using pip
Before you can use NumPy, you need to install it. The easiest way is with pip, the Python package installer. Open your terminal (or Command Prompt on Windows) and run the following command:

pip install numpy

This command downloads and installs the latest version of NumPy. If you use conda to manage your environments, you can install NumPy with the following command instead:

conda install numpy


### Importing NumPy as np

Once you have installed NumPy, you can start using it in your Python script or notebook. It is a common convention to import NumPy with the alias np. To import NumPy, add the following line at the beginning of your script or notebook:

In [1]:
import numpy as np


##  NumPy arrays

### Creating NumPy Arrays (1D, 2D, and 3D)

NumPy arrays are the fundamental data structures used for numerical computing in Python. Let's create some 1D, 2D, and 3D arrays:

* A 1D array is often called a vector.
* A 2D array is often called a matrix.
* A 3D array (or any array with more dimensions) is often called a tensor.

#### 1D array (vector)

In [3]:
import numpy as np

# Creating a 1D array from a list
array_1d = np.array([1, 2, 3, 4, 5])
print("1D array:", array_1d)


1D array: [1 2 3 4 5]


#### 2D array (matrix)

In [4]:
# Creating a 2D array from a nested list
array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("2D array:\n", array_2d)


2D array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]


#### 3D array (tensor)

In [5]:
# Creating a 3D array from a nested list
array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("3D array:\n", array_3d)


3D array:
 [[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


### Array Attributes (shape, size, dtype)

NumPy arrays have several useful attributes:

* shape: the dimensions of the array
    
* size: the total number of elements in the array
    
* dtype: the data type of the elements in the array

In [6]:
print("Shape:", array_2d.shape)  # (3, 3)
print("Size:", array_2d.size)    # 9
print("Data type:", array_2d.dtype)  # int64 (or int32 on some systems)


Shape: (3, 3)
Size: 9
Data type: int32
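
Because the default integer type depends on the platform, it can help to request a dtype explicitly. The sketch below assumes illustrative names (`small_ints`, `as_floats`) and also prints `ndim`, an additional attribute that gives the number of dimensions.

```python
import numpy as np

# Request a specific dtype instead of relying on the platform default
small_ints = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# astype returns a converted copy; the original array is unchanged
as_floats = small_ints.astype(np.float64)

print(small_ints.shape)  # (2, 3)
print(small_ints.size)   # 6
print(small_ints.ndim)   # 2
print(small_ints.dtype)  # int32
print(as_floats.dtype)   # float64
```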


### Indexing and Slicing Arrays

You can access the elements of a NumPy array using indices, similar to Python lists:

In [9]:
# Creating a 1D array from a list
array_1d = np.array([1, 2, 3, 4, 5])
print("1D array:", array_1d)


# Indexing a 1D array
print(array_1d[0])  # 1

print(array_1d[-1]) # 5

# Slicing a 1D array
print(array_1d[1:4])  # [2 3 4]


1D array: [1 2 3 4 5]
1
5
[2 3 4]


For multi-dimensional arrays, use a comma-separated tuple of indices:

In [10]:
# Creating a 2D array from a nested list
array_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("2D array:\n", array_2d)

# Indexing a 2D array
print(array_2d[1, 2])  # 6

# Slicing a 2D array
print(array_2d[0:2, 1:])  # [[2 3]
                          #  [5 6]]


2D array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
6
[[2 3]
 [5 6]]
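
One detail worth knowing when slicing: a basic slice is a view of the original data, not a copy. The following sketch uses a hypothetical array named `grid` to show whole-row and whole-column selection and the effect of writing through a slice.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row = grid[1]          # second row
column = grid[:, 2]    # third column
print(row)             # [4 5 6]
print(column)          # [3 6 9]

# A basic slice is a view: writing to it changes the original array
corner = grid[0:2, 0:2]
corner[0, 0] = 100
print(grid[0, 0])      # 100

# Use .copy() when an independent array is needed
safe = grid[0:2, 0:2].copy()
safe[0, 0] = -1
print(grid[0, 0])      # still 100
```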


### Reshaping and Flattening Arrays
You can change the shape of an array using the reshape method:




In [11]:
# Reshaping a 1D array to a 2D array
reshaped_array = array_1d.reshape(5, 1)
print("Reshaped array:\n", reshaped_array)


Reshaped array:
 [[1]
 [2]
 [3]
 [4]
 [5]]
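
As a further sketch of reshape (with illustrative names `values`, `table`, and `auto`), one dimension can be given as -1 so that NumPy infers it from the total number of elements. The element count itself must not change.

```python
import numpy as np

values = np.arange(12)            # the integers 0 through 11

table = values.reshape(3, 4)      # 3 rows, 4 columns
auto = values.reshape(-1, 6)      # -1 is inferred as 2, giving shape (2, 6)

print(table.shape)  # (3, 4)
print(auto.shape)   # (2, 6)

# The total number of elements must stay the same
try:
    values.reshape(5, 3)          # 15 elements requested, only 12 available
except ValueError as error:
    print("Cannot reshape:", error)
```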


To flatten an array, you can use the flatten method or the ravel function:

In [12]:
# Flattening a 2D array
flattened_array = array_2d.flatten()
print("Flattened array:", flattened_array)

# Alternatively, use the ravel function
flattened_array = np.ravel(array_2d)
print("Flattened array using ravel:", flattened_array)


Flattened array: [1 2 3 4 5 6 7 8 9]
Flattened array using ravel: [1 2 3 4 5 6 7 8 9]
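
The two approaches print the same values but differ in memory behavior: flatten always returns a copy, while ravel returns a view when it can. A small sketch, assuming a contiguous array called `matrix`:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = matrix.flatten()   # always a new, independent array
flat_view = matrix.ravel()     # a view here, because matrix is contiguous

print(np.shares_memory(matrix, flat_copy))  # False
print(np.shares_memory(matrix, flat_view))  # True

flat_view[0] = 99              # also changes matrix
print(matrix[0, 0])            # 99
print(flat_copy[0])            # 1
```

Prefer flatten when you plan to modify the result and want the original left untouched.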


## Array operations 

### Arithmetic Operations

Arithmetic operations in NumPy can be performed element-wise on arrays.

#### Addition

In [14]:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

addition = a + b
print("Addition:", addition)


Addition: [5 7 9]


#### Subtraction

In [15]:
subtraction = a - b
print("Subtraction:", subtraction)


Subtraction: [-3 -3 -3]


#### Multiplication

In [16]:
multiplication = a * b
print("Multiplication:", multiplication)


Multiplication: [ 4 10 18]


#### Division

In [17]:
division = a / b
print("Division:", division)


Division: [0.25 0.4  0.5 ]


### Broadcasting
Broadcasting allows you to perform arithmetic operations on arrays of different shapes, as long as they are compatible.

#### Broadcasting Rules
NumPy follows these broadcasting rules:

* If the arrays have a different number of dimensions, the shape of the array with fewer dimensions is padded with ones on its left side.
* If the shape of the arrays does not match in any dimension, NumPy tries to stretch the dimensions with size 1 to match the other array's size in that dimension.
* If any dimension sizes still do not match, a broadcasting error is raised.


In [18]:
c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
d = np.array([1, 0, 1])

broadcasted_sum = c + d
print("Broadcasted sum:\n", broadcasted_sum)


Broadcasted sum:
 [[ 2  2  4]
 [ 5  5  7]
 [ 8  8 10]]
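
To see each broadcasting rule in action, here is a short sketch with hypothetical arrays `column` and `row`. The first addition succeeds after padding and stretching; the second fails because the sizes cannot be matched.

```python
import numpy as np

column = np.array([[10], [20], [30]])   # shape (3, 1)
row = np.array([1, 2, 3])               # shape (3,)

# Rule 1: row's shape (3,) is padded on the left to (1, 3)
# Rule 2: (3, 1) and (1, 3) are both stretched to (3, 3)
result = column + row
print(result.shape)  # (3, 3)
print(result)

# Rule 3: shapes (3,) and (4,) cannot be matched, so an error is raised
try:
    row + np.array([1, 2, 3, 4])
except ValueError as error:
    print("Broadcasting failed:", error)
```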


### Element-wise Operations

#### Square Root

In [19]:
sqrt_array = np.sqrt(a)
print("Square root:", sqrt_array)


Square root: [1.         1.41421356 1.73205081]


#### Logarithm

In [20]:
log_array = np.log(a)
print("Natural logarithm:", log_array)


Natural logarithm: [0.         0.69314718 1.09861229]


#### Sine

In [21]:
sin_array = np.sin(a)
print("Sine:", sin_array)


Sine: [0.84147098 0.90929743 0.14112001]


####  Cosine

In [22]:
cos_array = np.cos(a)
print("Cosine:", cos_array)


Cosine: [ 0.54030231 -0.41614684 -0.9899925 ]


### Matrix Operations

NumPy supports various matrix operations, such as dot product, cross product, and transpose.

#### Dot Product:
The dot product, also known as the scalar product, is a mathematical operation between two vectors that results in a scalar. It is computed by multiplying the corresponding components of the two vectors and summing the products.


In [23]:
dot_product = np.dot(a, b)
print("Dot product:", dot_product)


Dot product: 32
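
The definition can be checked by hand. This sketch recomputes the dot product from its components and also shows the `@` operator, which performs the same operation on 1D arrays and matrix multiplication on 2D arrays. The names `manual_dot`, `m`, and `identity` are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Multiply matching components, then add them up: 1*4 + 2*5 + 3*6
manual_dot = np.sum(a * b)
print(manual_dot)        # 32
print(a @ b)             # 32

# For 2D arrays, np.dot and @ perform matrix multiplication
m = np.array([[1, 2], [3, 4]])
identity = np.eye(2, dtype=int)
print(m @ identity)      # same values as m
```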


#### Cross Product:

The cross product, also known as the vector product, is a mathematical operation between two three-dimensional vectors that results in a vector perpendicular to both. Its components can be read off from the formal determinant of a 3x3 matrix whose first row holds the unit vectors and whose second and third rows hold the components of the two vectors.

In [25]:
cross_product = np.cross(a, b)
print("Cross product:", cross_product)


Cross product: [-3  6 -3]
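
Expanding that determinant gives one expression per component. The sketch below writes the three components out by hand (the name `manual_cross` is illustrative) and checks that the result is perpendicular to both inputs.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

manual_cross = np.array([
    a[1] * b[2] - a[2] * b[1],   # x: 2*6 - 3*5 = -3
    a[2] * b[0] - a[0] * b[2],   # y: 3*4 - 1*6 =  6
    a[0] * b[1] - a[1] * b[0],   # z: 1*5 - 2*4 = -3
])
print(manual_cross)              # [-3  6 -3]

# The result is perpendicular to both inputs, so both dot products are 0
print(np.dot(manual_cross, a), np.dot(manual_cross, b))  # 0 0
```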


#### Transpose:

The transpose of a matrix is a new matrix obtained by swapping the rows and columns of the original matrix, so that row i becomes column i.

In [26]:
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

transpose = matrix.T

print("Transpose:\n", transpose)


Transpose:
 [[1 4 7]
 [2 5 8]
 [3 6 9]]


## Array manipulation

### Concatenating Arrays
You can concatenate multiple arrays along a specific axis using np.concatenate:

In [27]:
import numpy as np

array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

# Concatenate along rows (axis=0)
concatenated_rows = np.concatenate((array1, array2), axis=0)
print("Concatenated rows:\n", concatenated_rows)


Concatenated rows:
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


### Splitting Arrays

You can split an array into multiple subarrays using np.split, np.hsplit, or np.vsplit:

In [28]:
# Split the concatenated array into two equal parts
split_arrays = np.split(concatenated_rows, 2, axis=0)
print("Split arrays:", split_arrays)

# Horizontal split (equivalent to np.split with axis=1)
hsplit_arrays = np.hsplit(array1, 3)
print("Horizontal split:", hsplit_arrays)

# Vertical split (equivalent to np.split with axis=0)
vsplit_arrays = np.vsplit(array1, 2)
print("Vertical split:", vsplit_arrays)


Split arrays: [array([[1, 2, 3],
       [4, 5, 6]]), array([[ 7,  8,  9],
       [10, 11, 12]])]
Horizontal split: [array([[1],
       [4]]), array([[2],
       [5]]), array([[3],
       [6]])]
Vertical split: [array([[1, 2, 3]]), array([[4, 5, 6]])]


### Stacking and Unstacking Arrays

#### Stacking Arrays
You can stack arrays vertically using np.vstack or horizontally using np.hstack:

In [29]:
# Vertical stacking
vstacked = np.vstack((array1, array2))
print("Vertical stacking:\n", vstacked)

# Horizontal stacking
array3 = np.array([[1, 4], [2, 5], [3, 6]])
array4 = np.array([[7, 10], [8, 11], [9, 12]])
hstacked = np.hstack((array3, array4))
print("Horizontal stacking:\n", hstacked)


Vertical stacking:
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Horizontal stacking:
 [[ 1  4  7 10]
 [ 2  5  8 11]
 [ 3  6  9 12]]
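
The examples above cover stacking only. As a rough sketch of the reverse direction, the code below first joins two arrays along a new axis with np.stack and then takes them apart again by unpacking. The names `first`, `second`, and `stacked` are illustrative, and unpacking is just one of several possible ways to unstack.

```python
import numpy as np

first = np.array([1, 2, 3])
second = np.array([4, 5, 6])

# np.stack joins arrays along a new axis
stacked = np.stack((first, second), axis=0)
print(stacked.shape)   # (2, 3)

# One way to "unstack": unpack along the first axis
row_a, row_b = stacked
print(row_a, row_b)    # [1 2 3] [4 5 6]

# To unstack along the second axis, transpose first
columns = list(stacked.T)
print(len(columns))    # 3
```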


### Inserting and Deleting Elements
#### Inserting Elements
You can insert elements into an array using np.insert:

In [32]:
array = np.array([1, 2, 3, 4, 5])
inserted = np.insert(array, 2, [10, 20, 30])
print("Inserted elements:", inserted)  # [ 1  2 10 20 30  3  4  5]


Inserted elements: [ 1  2 10 20 30  3  4  5]


#### Deleting Elements
You can delete elements from an array using np.delete:

In [34]:
deleted = np.delete(array, [1, 2])
print("Deleted elements:", deleted)  # [1 4 5] (the elements at indices 1 and 2 are removed)


Deleted elements: [1 4 5]
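
Both functions return a new array and leave the input unchanged. The sketch below illustrates this and shows the `axis` argument on a 2D array; the names `original` and `table` are assumptions made for the example.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

with_insert = np.insert(original, 1, 99)
without_first = np.delete(original, 0)
print(with_insert)     # [ 1 99  2  3  4  5]
print(without_first)   # [2 3 4 5]
print(original)        # unchanged: [1 2 3 4 5]

# With a 2D array, pass axis to remove a whole row or column
table = np.array([[1, 2, 3], [4, 5, 6]])
no_middle_column = np.delete(table, 1, axis=1)
print(no_middle_column)
```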


## Statistical functions 

### Descriptive Statistics (Mean, Median, Mode, Variance, Standard Deviation)

NumPy provides several functions for computing basic descriptive statistics on arrays.

In [36]:
import numpy as np

data = np.array([4, 5, 1, 2, 7, 2, 6, 3, 5, 2])


#### Mean

The mean is the average value of a set of data. It is calculated by adding up all the values in the set and dividing by the total number of values. The mean is sensitive to extreme values and outliers.

In [42]:
mean = np.mean(data)
print("Mean:", mean)  # 3.7


Mean: 3.7


#### Median
The median is the middle value in a set of data when the data is arranged in order from least to greatest. If there is an even number of values in the set, the median is the average of the two middle values. Unlike the mean, the median is largely unaffected by extreme values and outliers.

In [43]:
median = np.median(data)
print("Median:", median)  # 3.5


Median: 3.5


#### Mode

The mode is the value that occurs most frequently in a set of data. A set of data can have more than one mode, or it can have no mode if no value occurs more than once.


While NumPy does not have a built-in function for calculating the mode, you can use the scipy.stats module:

In [44]:
from scipy import stats

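mode = stats.mode(data)
# Newer SciPy versions return scalars for 1D input, older ones return arrays;
# np.atleast_1d makes the indexing below work in both cases
print("Mode:", np.atleast_1d(mode.mode)[0], "Frequency:", np.atleast_1d(mode.count)[0])  # Mode: 2 Frequency: 3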


Mode: 2 Frequency: 3
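
If you prefer not to depend on SciPy, the mode can also be found with NumPy alone. This is one possible sketch based on np.unique; the names `values`, `counts`, and `all_modes` are illustrative.

```python
import numpy as np

data = np.array([4, 5, 1, 2, 7, 2, 6, 3, 5, 2])

# np.unique returns the sorted distinct values and how often each occurs
values, counts = np.unique(data, return_counts=True)

# argmax picks the first highest count, so ties resolve to the smallest value
most_common = values[np.argmax(counts)]
print("Mode:", most_common, "Frequency:", counts.max())  # Mode: 2 Frequency: 3

# All modes, in case several values share the highest count
all_modes = values[counts == counts.max()]
print(all_modes)  # [2]
```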




#### Variance

The variance measures the spread or variability of a set of data. It is calculated by finding the average of the squared differences between each value and the mean. A high variance indicates that the data is spread out, while a low variance indicates that the data is clustered closely around the mean.


In [45]:
variance = np.var(data)
print("Variance:", variance)  # 3.61


Variance: 3.6100000000000003


####  Standard Deviation

The standard deviation is another measure of the spread or variability of a set of data. It is the square root of the variance. A high standard deviation indicates that the data is more spread out, while a low standard deviation indicates that the data is clustered closely around the mean.


In [46]:
std_dev = np.std(data)
print("Standard Deviation:", std_dev)  # 1.9


Standard Deviation: 1.9000000000000001
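
By default, np.var and np.std divide by the number of values (population statistics). The sketch below first reproduces the variance from its definition and then uses the `ddof=1` argument to obtain the sample versions, which divide by n - 1. The variable names are illustrative.

```python
import numpy as np

data = np.array([4, 5, 1, 2, 7, 2, 6, 3, 5, 2])

# Population variance, written out: the mean of the squared deviations
manual_var = np.mean((data - data.mean()) ** 2)
print(np.isclose(manual_var, np.var(data)))   # True

# Sample versions divide by n - 1 instead of n
sample_var = np.var(data, ddof=1)
sample_std = np.std(data, ddof=1)
print("Sample variance:", sample_var)
print("Sample standard deviation:", sample_std)
```

The sample versions are slightly larger than the population versions; the difference shrinks as the number of values grows.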


### Basic Operations on Arrays (sum, max, min, argmax, argmin)

#### sum

The sum of an array is the total of all its elements. This is calculated by adding up all the elements in the array.

In [47]:
array_sum = np.sum(data)
print("Sum:", array_sum)  # 37


Sum: 37


#### max

The max of an array is the largest value in the array. This is calculated by finding the highest value among all the elements in the array.

In [48]:
array_max = np.max(data)
print("Max:", array_max)  # 7


Max: 7


#### min

The min of an array is the smallest value in the array. This is calculated by finding the lowest value among all the elements in the array.

In [49]:
array_min = np.min(data)
print("Min:", array_min)  # 1


Min: 1


#### argmax

The argmax of an array is the index of the element with the highest value in the array. This is useful when you need to find the position of the maximum value in an array.

In [50]:
array_argmax = np.argmax(data)
print("Argmax:", array_argmax)  # 4 (index of the maximum value)


Argmax: 4


#### argmin

The argmin of an array is the index of the element with the lowest value in the array. This is useful when you need to find the position of the minimum value in an array.

In [51]:
array_argmin = np.argmin(data)
print("Argmin:", array_argmin)  # 2 (index of the minimum value)


Argmin: 2
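
All of these functions also accept an `axis` argument, which matters for multi-dimensional arrays. A minimal sketch with a hypothetical 2D array `scores`:

```python
import numpy as np

scores = np.array([[3, 8, 1],
                   [9, 2, 7]])

print(np.sum(scores))             # 30, all elements
print(np.sum(scores, axis=0))     # [12 10  8], one total per column
print(np.max(scores, axis=1))     # [8 9], one maximum per row
print(np.argmax(scores, axis=1))  # [1 0], column index of each row's maximum
print(np.argmax(scores))          # 3, index into the flattened array
```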


##  Random number generation 

### Generating random numbers (uniform, normal, binomial distributions)

NumPy provides a random module for generating random numbers from various probability distributions.

#### Uniform Distribution

To generate random numbers from a uniform distribution, use np.random.rand for values between 0 and 1, or np.random.uniform for a custom range:

In [52]:
import numpy as np

# Generate a single random number between 0 and 1
uniform_random = np.random.rand()
print("Uniform random number:", uniform_random)

# Generate an array of random numbers from a uniform distribution
uniform_array = np.random.rand(5)
print("Uniform random array:", uniform_array)

# Generate random numbers between a custom range
custom_uniform_array = np.random.uniform(5, 10, size=5)
print("Custom uniform random array:", custom_uniform_array)


Uniform random number: 0.32941327240975127
Uniform random array: [0.02959948 0.2213208  0.45456477 0.49110147 0.3578427 ]
Custom uniform random array: [8.64902108 5.26314412 5.06969645 8.31049514 7.38806587]


#### Normal Distribution
To generate random numbers from a normal distribution, use np.random.randn for a standard normal distribution (mean 0, standard deviation 1), or np.random.normal for a custom distribution:

In [54]:
# Generate a single random number from a standard normal distribution
normal_random = np.random.randn()
print("Normal random number:", normal_random)

# Generate an array of random numbers from a standard normal distribution
normal_array = np.random.randn(5)
print("Normal random array:", normal_array)

# Generate random numbers from a custom normal distribution (mean 5, standard deviation 2)
custom_normal_array = np.random.normal(5, 2, size=5)
print("Custom normal random array:", custom_normal_array)


Normal random number: 1.314997099478541
Normal random array: [-0.95727423  0.74239927  0.83080033 -0.64724782  0.01144382]
Custom normal random array: [9.28539906 7.43227066 5.67089528 3.93418276 7.94359399]
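
The binomial distribution mentioned in the section title can be sampled with np.random.binomial. The sketch below models coin flips as an assumed example; the values it prints vary from run to run.

```python
import numpy as np

# Number of heads in 10 fair coin flips, repeated 5 times
heads = np.random.binomial(n=10, p=0.5, size=5)
print("Heads per experiment:", heads)

# With many experiments the average approaches n * p = 5
many = np.random.binomial(n=10, p=0.5, size=100_000)
print("Average heads:", many.mean())
```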


### Setting a Random Seed

To obtain reproducible results, set a random seed using np.random.seed:
 


In [55]:
np.random.seed(42)

random_array = np.random.rand(5)
print("Random array with seed 42:", random_array)


Random array with seed 42: [0.37454012 0.95071431 0.73199394 0.59865848 0.15601864]
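
To confirm that a seed makes results repeatable, you can reset it and draw again. The sketch also shows np.random.default_rng, an alternative that keeps the seed in a local generator object instead of global state. The names `first_run`, `second_run`, and `rng` are illustrative.

```python
import numpy as np

np.random.seed(42)
first_run = np.random.rand(3)

np.random.seed(42)               # resetting the seed restarts the sequence
second_run = np.random.rand(3)
print(np.array_equal(first_run, second_run))  # True

# Alternative: a local Generator object with its own seed
rng = np.random.default_rng(42)
local_draw = rng.random(3)
print(local_draw)                # reproducible, independent of np.random.seed
```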


### Random Sampling
To randomly sample elements from an array, use np.random.choice:

In [56]:
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Sample 5 elements without replacement
sample_no_replacement = np.random.choice(data, 5, replace=False)
print("Sample without replacement:", sample_no_replacement)

# Sample 5 elements with replacement
sample_with_replacement = np.random.choice(data, 5, replace=True)
print("Sample with replacement:", sample_with_replacement)


Sample without replacement: [1 2 6 4 5]
Sample with replacement: [5 2 8 6 2]
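
np.random.choice can also draw elements with unequal probabilities through its `p` argument. The following sketch uses a made-up array of color names and made-up weights purely for illustration.

```python
import numpy as np

np.random.seed(0)
colors = np.array(["red", "green", "blue"])

# p gives the probability of each element; the values must sum to 1
weighted = np.random.choice(colors, size=1000, p=[0.7, 0.2, 0.1])
values, counts = np.unique(weighted, return_counts=True)
print(dict(zip(values, counts)))   # "red" should appear most often

# Sampling every element without replacement yields a random ordering
shuffled = np.random.choice(colors, size=3, replace=False)
print(shuffled)
```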


## Recap of Covered Topics


In this curriculum, we've introduced you to the following NumPy topics:

* Introduction to NumPy
* Installation and importing
* NumPy arrays
* Basic array operations
* Array manipulation
* Basic statistical functions
* Random number generation


Throughout these topics, you've learned about creating and manipulating arrays, performing arithmetic and element-wise operations, computing descriptive statistics, and generating random numbers.

# NumPy Assignment


In this assignment, you'll practice using NumPy to solve a problem. You have been given a dataset that represents the height (in centimeters) of 10 students in a class.



Dataset:

In [57]:
heights = np.array([160, 155, 172, 165, 180, 158, 175, 163, 171, 168])

Your task is to:

* Calculate the mean, median, and standard deviation of the students' heights.


* Create a new array with the heights normalized, i.e., subtract the mean from each height and divide by the standard deviation so that the resulting array has a mean of 0 and a standard deviation of 1.


* Determine the tallest and shortest students in the class using NumPy's argmax and argmin functions.


* Calculate the difference between the tallest and shortest students' heights.


* Randomly select 5 heights from the dataset without replacement.


Complete the assignment by writing Python code using NumPy. Remember to first import NumPy and use its functions to perform the required calculations. Good luck!

Note: The purpose of this assignment is to test your understanding of NumPy concepts covered in this curriculum. Try to complete the assignment without referring back to the curriculum, but if you get stuck, feel free to review the topics and examples provided.

# Solution 

In [58]:
import numpy as np

# Dataset
heights = np.array([160, 155, 172, 165, 180, 158, 175, 163, 171, 168])

# 1. Calculate the mean, median, and standard deviation of the students' heights
mean_heights = np.mean(heights)
median_heights = np.median(heights)
std_dev_heights = np.std(heights)

print(f"Mean: {mean_heights:.2f}, Median: {median_heights}, Standard Deviation: {std_dev_heights:.2f}")

# 2. Create a new array with the heights normalized
normalized_heights = (heights - mean_heights) / std_dev_heights
print("Normalized Heights:", normalized_heights)

# 3. Determine the tallest and shortest students in the class
tallest_student_idx = np.argmax(heights)
shortest_student_idx = np.argmin(heights)
print(f"Tallest student index: {tallest_student_idx}, Shortest student index: {shortest_student_idx}")

# 4. Calculate the difference between the tallest and shortest students' heights
height_diff = heights[tallest_student_idx] - heights[shortest_student_idx]
print(f"Height difference: {height_diff} cm")

# 5. Randomly select 5 heights from the dataset without replacement
np.random.seed(42)  # Optional: for reproducibility
random_selection = np.random.choice(heights, size=5, replace=False)
print("Randomly selected heights:", random_selection)


Mean: 166.70, Median: 166.5, Standard Deviation: 7.54
Normalized Heights: [-0.88891945 -1.55229217  0.70317509 -0.22554673  1.76457144 -1.15426854
  1.10119872 -0.49089581  0.57050054  0.17247691]
Tallest student index: 4, Shortest student index: 1
Height difference: 25 cm
Randomly selected heights: [171 155 158 160 163]
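
As an optional sanity check on the solution, the sketch below confirms that the normalized heights have a mean of 0 and a standard deviation of 1, and shows np.ptp as a shortcut for the difference between the maximum and the minimum. The variable name `normalized` is illustrative.

```python
import numpy as np

heights = np.array([160, 155, 172, 165, 180, 158, 175, 163, 171, 168])
normalized = (heights - heights.mean()) / heights.std()

# Allow for tiny floating-point error when comparing
print(np.isclose(normalized.mean(), 0.0))  # True
print(np.isclose(normalized.std(), 1.0))   # True

# np.ptp (peak to peak) gives max - min directly
print(np.ptp(heights))                     # 25
```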
