---

### 1: Introduction to Numpy and Arrays

---

**Introduction to Numpy**

_Notes:_
Numpy (Numerical Python) is one of the most fundamental packages for numerical computations in Python. It provides support for large multidimensional arrays, as well as an assortment of mathematical functions to operate on these arrays.

In [1]:
# Importing the Numpy module
import numpy as np


_Analogy:_
Think of Numpy as a powerful calculator that not only performs basic arithmetic but can also handle large data sets and matrices.

---

**Creating Arrays**

_Notes:_
Arrays are the main data structure used in Numpy. They resemble Python lists, but every element shares a single data type and the data is stored compactly, which is what makes numerical computation on them fast.

a. **From lists and tuples**
  - **Use case:** You have data in a Python-native format (like lists or tuples) that you want to perform numerical operations on or use with other libraries/tools that accept Numpy arrays.

In [9]:
# Creating a one-dimensional array from a list
arr_from_list = np.array([1, 2, 3, 4])

# Creating a two-dimensional array from a list of lists
# (mixing the int 1 with the string "2" makes Numpy store every element as a string)
arr_2d = np.array([[1, "2"], [3, 4]])

# Creating an array from a tuple
arr_from_tuple = np.array((5, 6, 7, 8))


In [10]:
print(arr_from_list)
print(arr_2d)
print(arr_from_tuple)

[1 2 3 4]
[['1' '2']
 ['3' '4']]
[5 6 7 8]
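
Because an array holds a single data type, Numpy picks one type that fits every input element. The sketch below shows that upcasting and how the `dtype` argument requests a type explicitly. The variable names are illustrative and not part of the original notebook.

```python
import numpy as np

# Integers and floats together are upcast to a float dtype
mixed_numbers = np.array([1, 2.5, 3])
print(mixed_numbers.dtype)    # float64

# A single string among numbers turns every element into a string
mixed_text = np.array([1, "2", 3])
print(mixed_text.dtype.kind)  # U (unicode string)

# Request a type explicitly with the dtype argument
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float)               # [1. 2. 3.]
```

Checking `dtype` right after creating an array is a cheap way to catch accidental string or object arrays before doing arithmetic on them.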


b. **Using Numpy functions**
  - **zeros()**:
    - **Use case:** Initializing arrays for algorithms where the initial values should be zero, e.g., in some iterative algorithms.
  - **ones()**:
    - **Use case:** Initializing arrays for algorithms where the initial values should be one, e.g., for creating masks in image processing.
  - **arange()**:
    - **Use case:** When you need a sequence of numbers with a specific step size, e.g., for creating time steps in simulations.
  - **linspace()**:
    - **Use case:** When you want a specified number of equally spaced points between two values, e.g., evaluating a function at regular intervals.

In [24]:
# Creating an array of zeros
zero_arr = np.zeros(5)

# Creating a 2x3 matrix of ones
ones_matrix = np.ones((2, 3))

# Creating an array with a range of numbers
range_arr = np.arange(5, 10)  # [5, 6, 7, 8, 9]

# Generating linearly spaced numbers between two values
linspace_arr = np.linspace(0, 1, 10)  # 10 evenly spaced values from 0 to 1, both endpoints included


In [26]:
zero_arr
ones_matrix
range_arr
linspace_arr.shape

(10,)
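
The difference between arange() and linspace() is easiest to see side by side: arange() is driven by a step size and excludes the stop value, while linspace() is driven by a point count and includes both endpoints by default. A minimal sketch with illustrative variable names:

```python
import numpy as np

# arange takes a step; the stop value is excluded
stepped = np.arange(0, 10, 2)
print(stepped)        # [0 2 4 6 8]

# linspace takes a count; both endpoints are included by default
five_points = np.linspace(0, 1, 5)
print(five_points)    # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False leaves out the stop value
open_interval = np.linspace(0, 1, 4, endpoint=False)
print(open_interval)  # [0.   0.25 0.5  0.75]
```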

_Analogy:_
Think of creating a Numpy array as organizing a group of data points in rows (like a spreadsheet). The Numpy functions are shortcuts for creating these organized groups without manually entering each data point.

---

**Attributes of Numpy Arrays**

_Notes:_
Each Numpy array has attributes that provide information about its structure and type.

- **Shape**:
  - **Use case:** When you need to know the dimensions of your data. This is especially useful in machine learning, where models require arrays of specific shapes for training (e.g., (samples, features)).
  
- **Number of dimensions (ndim)**:
  - **Use case:** Useful to distinguish between scalars, vectors (1D arrays), matrices (2D arrays), and higher-dimensional structures, especially when writing generalized code that should work for arrays of any dimension.
  
- **Size**:
  - **Use case:** When you need to ascertain the total number of elements in your array, e.g., to check if two arrays have the same size before combining them.
  
- **Data type (dtype)**:
  - **Use case:** Ensuring compatibility with other libraries/tools, optimizing memory usage, or when certain operations require specific data types.

In [27]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Shape of the array: the length of each dimension
print("Shape:", arr.shape)  # (3, 3)

# Number of dimensions
print("Number of Dimensions:", arr.ndim)  # 2

# Total number of elements
print("Size:", arr.size)  # 9

# Data type of the elements
print("Data Type:", arr.dtype)  # int64 (or another integer type, depending on the platform)


Shape: (3, 3)
Number of Dimensions: 2
Size: 9
Data Type: int64
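
To illustrate the memory use case for dtype, the sketch below converts an array to a smaller integer type with astype() and compares the bytes used. The array name `readings` is hypothetical, and the dtype is set explicitly so the numbers do not depend on the platform default.

```python
import numpy as np

# Fix the dtype explicitly so the result does not depend on the platform
readings = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int64)
print(readings.nbytes)   # 72 bytes: 9 elements x 8 bytes each

# The values fit in 8 bits, so a smaller integer type is enough
compact = readings.astype(np.int8)
print(compact.dtype)     # int8
print(compact.nbytes)    # 9 bytes: 9 elements x 1 byte each

# Shape, ndim and size are unchanged by the conversion
print(compact.shape, compact.ndim, compact.size)  # (3, 3) 2 9
```

A smaller dtype only makes sense when every value fits in its range; values outside it would overflow.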



_Analogy:_
If the array is like a spreadsheet, then these attributes are like the specifications of the spreadsheet (number of columns, rows, cell format, etc.).

---


**Basic Operations**

a. **Arithmetic operations**

_Notes:_
You can perform element-wise operations on arrays easily.

- **Arithmetic operations**:
  - **Use case:** Any situation where you want to perform element-wise operations between arrays, like adding two sets of data points or multiplying two arrays element by element. For instance, if you have two datasets representing sales from two different months and want to get their combined sales, you'd use addition.

In [32]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Addition:", a + b)    # [5, 7, 9]
print("Subtraction:", a - b) # [-3, -3, -3]
print("Multiplication:", a * b)  # [4, 10, 18]
print("Division:", a / b)  # [0.25, 0.4, 0.5]


Addition: [5 7 9]
Subtraction: [-3 -3 -3]
Multiplication: [ 4 10 18]
Division: [0.25 0.4  0.5 ]
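
Note that `*` is element-wise and is not matrix multiplication. The sketch below contrasts it with the `@` operator, which computes a dot product or matrix product. The small matrix `m` is an illustrative addition.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# * multiplies element by element
print(a * b)   # [ 4 10 18]

# @ computes the dot product of two 1D arrays: 1*4 + 2*5 + 3*6
print(a @ b)   # 32

m = np.array([[1, 2], [3, 4]])
elementwise = m * m   # [[1, 4], [9, 16]]
matrix_product = m @ m  # [[7, 10], [15, 22]]
print(elementwise)
print(matrix_product)
```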


b. **Broadcasting**

_Notes:_
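Broadcasting is a powerful mechanism that allows Numpy to work with arrays of different shapes. In simpler terms, it lets you perform operations between a smaller array and a larger array by virtually repeating the smaller one to match. The shapes must be compatible: compared from the last dimension backward, each pair of dimensions must be equal, or one of them must be 1.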

- **Broadcasting**:
  - **Use case:** Operations between arrays of different shapes. A common example is when you have data in a 2D array (like a matrix of pixel values in an image) and you want to adjust all values using a 1D array (like adjusting RGB values using a single vector). Another example is scaling features in a machine learning dataset where you subtract the mean (a 1D array) from all rows of a 2D data matrix.

In [33]:
# Broadcasting in action
c = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
d = np.array([1, 0, 1])

# This will add the d array to each row of c
result = c + d
result


array([[ 2,  2,  4],
       [ 5,  5,  7],
       [ 8,  8, 10]])
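
The feature-scaling use case can be sketched as follows: the per-column mean is a 1D array that is broadcast across every row of the 2D data. The dataset is made up for illustration, and the last part shows what happens when shapes are not compatible.

```python
import numpy as np

# Hypothetical dataset: 3 samples (rows) x 2 features (columns)
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# The column means have shape (2,) and are broadcast across all 3 rows
column_means = data.mean(axis=0)
centered = data - column_means
print(column_means)  # [ 2. 20.]
print(centered)

# Shapes (3, 2) and (3,) are incompatible: trailing dimensions 2 and 3 differ
try:
    data + np.array([1.0, 2.0, 3.0])
except ValueError as err:
    print("Broadcasting failed:", err)
```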

_Analogy:_
Imagine you're applying a discount (like subtracting a fixed amount) to all items in a price list. Instead of manually subtracting the discount from each item, broadcasting lets you apply the discount to all items at once.


---

### 2: Advanced Array Manipulation

The essence of these array manipulations is to efficiently prepare, transform, and retrieve data for a myriad of applications. Whether you're performing complex mathematical operations, data analysis, or preparing data for machine learning models, these tools are fundamental in ensuring that data is in the right shape and form.

---

**1. Indexing and Slicing**

**Situations & Use Cases:**
- **Basic Slicing**: When you want to access specific parts of your dataset. E.g., you might only want the first 10 entries from a time series dataset to visualize the initial trend.
  
- **Boolean Indexing**: Useful in filtering data based on a condition. E.g., from an array of grades, you might want to quickly find all grades above 90.
  
- **Fancy Indexing**: When you have specific indices of data you're interested in. E.g., in a simulation, you might only want to analyze the results at specific time intervals given by their indices.

In [41]:
import numpy as np

# Create a simple 1D array
arr = np.arange(10)
print(arr)  # Output: [0 1 2 3 4 5 6 7 8 9]

# Basic indexing and slicing
print(arr[5])    # Output: 5
print(arr[2:5])  # Output: [2 3 4]

# Step in slicing
print(arr[1:8:2])  # Output: [1 3 5 7]

# Indexing a 2D array (analogous to a matrix) with [row, column]
arr2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr2[1, 2])  # Output: 6  (second row, third column)


[0 1 2 3 4 5 6 7 8 9]
5
[2 3 4]
[1 3 5 7]
6


*Note:* Think of slicing as cutting a cake. If you slice from 2 to 5, you're taking pieces 2, 3, and 4. The end index is exclusive.
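
Slices extend to 2D arrays by giving one slice per axis, separated by a comma. The sketch below reuses the 3x3 array from the cell above and also shows that a basic slice is a view onto the original data, not a copy.

```python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr2[0])        # first row: [1 2 3]
print(arr2[:, 1])     # second column: [2 5 8]
print(arr2[:2, 1:])   # first two rows, last two columns: [[2, 3], [5, 6]]

# A basic slice is a view: writing through it changes the original array
first_row = arr2[0]
first_row[0] = 100
print(arr2[0, 0])     # 100
```

Call `.copy()` on a slice when you need data that is independent of the original array.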


In [42]:
# Boolean indexing
print(arr[arr > 5])  # Output: [6 7 8 9]

# Fancy indexing
indices = np.array([1, 3, 5])
print(arr[indices])  # Output: [1 3 5]


[6 7 8 9]
[1 3 5]
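
The grades use case mentioned above can be sketched like this. The grade values are invented for illustration. Conditions are combined with `&` and `|` rather than `and` and `or`, and each condition needs its own parentheses.

```python
import numpy as np

# Hypothetical grades used only for illustration
grades = np.array([72, 95, 88, 91, 60, 99])

# Boolean mask: one True/False per element
high = grades[grades > 90]
print(high)       # [95 91 99]

# Combine conditions with & (and) or | (or); parentheses are required
mid_range = grades[(grades >= 70) & (grades <= 90)]
print(mid_range)  # [72 88]

# Boolean indexing also works for assignment
capped = grades.copy()
capped[capped > 95] = 95
print(capped)     # [72 95 88 91 60 95]
```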


*Big Picture:* These indexing methods give direct, efficient access to an array's data, making operations streamlined and fast.

---

**2. Reshaping Arrays**

**Situations & Use Cases:**
- **Reshape**: 
  - Changing the representation of image data. E.g., an RGB image might be stored as a 3D array, but for certain operations, you'd want to reshape it into a long 1D array.
  - Preparing data for machine learning models. Many models require data in a specific shape.

- **Ravel (Flatten)**: 
  - When you need a flat representation of your data. E.g., when you want to serialize your data or prepare it for a model that requires 1D input.

In [46]:
# Reshape
print(arr)
print("")
arr_reshaped = arr.reshape(2, 5)
print(arr_reshaped)
# Output:
# [[0 1 2 3 4]
#  [5 6 7 8 9]]

# Ravel - Flattening the array
flat = arr_reshaped.ravel()
print(flat)  # Output: [0 1 2 3 4 5 6 7 8 9]


[0 1 2 3 4 5 6 7 8 9]

[[0 1 2 3 4]
 [5 6 7 8 9]]
[0 1 2 3 4 5 6 7 8 9]


*Note:* Always be sure that the dimensions you're reshaping into make sense. For example, a size-10 array can be reshaped into (2, 5) or (5, 2) but not (3, 4).
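
A short sketch of the points above: passing -1 lets Numpy infer one dimension, a mismatched shape raises a ValueError, and ravel() differs from flatten() in whether it copies the data.

```python
import numpy as np

arr = np.arange(10)

# -1 asks Numpy to work out that dimension from the array's size
print(arr.reshape(2, -1).shape)   # (2, 5)
print(arr.reshape(-1, 2).shape)   # (5, 2)

# 3 * 4 = 12 does not match the 10 elements, so this raises ValueError
try:
    arr.reshape(3, 4)
except ValueError as err:
    print("Reshape failed:", err)

# ravel() returns a view when it can; flatten() always returns a copy
grid = arr.reshape(2, 5)
print(np.shares_memory(grid, grid.ravel()))    # True
print(np.shares_memory(grid, grid.flatten()))  # False
```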

---


**3. Stacking and Splitting Arrays**

**Situations & Use Cases:**
- **Vertical and Horizontal Stacking**: 
  - Combining two datasets. E.g., if you have data from two sensors recording the same type of data, but in different arrays, and you want to analyze them together.
  - Adding features (columns) or samples (rows) to an existing dataset.

- **Splitting Arrays**: 
  - Cross-validation in machine learning, where you'd divide a dataset into 'k' chunks.
  - Distributing parts of a dataset to different processors in parallel computing.

In [None]:
# Vertical and Horizontal Stacking
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

vstacked = np.vstack((a, b))
print(vstacked)
# Output:
# [[1 2 3]
#  [4 5 6]]

hstacked = np.hstack((a, b))
print(hstacked)  # Output: [1 2 3 4 5 6]


*Analogy:* Stacking is like stacking Lego blocks on top of each other (vstack) or side-by-side (hstack).
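
For the dataset use case, stacking works on 2D arrays as well. The sketch below adds a sample (row) and a feature (column) to a small made-up dataset. The names `dataset`, `new_sample` and `new_feature` are hypothetical.

```python
import numpy as np

# Hypothetical dataset: 2 samples (rows) x 2 features (columns)
dataset = np.array([[1, 2],
                    [3, 4]])

# Add a new sample: vstack needs matching column counts
new_sample = np.array([5, 6])
more_rows = np.vstack((dataset, new_sample))
print(more_rows.shape)   # (3, 2)

# Add a new feature: hstack needs matching row counts,
# so the 1D feature is turned into a (2, 1) column first
new_feature = np.array([10, 20]).reshape(-1, 1)
more_cols = np.hstack((dataset, new_feature))
print(more_cols)
# [[ 1  2 10]
#  [ 3  4 20]]
```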


In [None]:
# Splitting arrays
x = np.arange(9.0)
np.split(x, 3)
# Output: [array([0., 1., 2.]), array([3., 4., 5.]), array([6., 7., 8.])]


*Note:* An integer passed to split is the number of equal-sized chunks to make. The array length must be divisible by that number; otherwise np.split raises a ValueError.
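
When the array does not divide evenly, as often happens when cutting a dataset into k chunks, np.array_split is a more forgiving alternative. The sketch below also shows splitting at explicit positions by passing a list of indices.

```python
import numpy as np

x = np.arange(10)

# 10 elements cannot be cut into 3 equal parts, so split raises ValueError
try:
    np.split(x, 3)
except ValueError as err:
    print("Split failed:", err)

# array_split allows uneven chunks: sizes 4, 3, 3
folds = np.array_split(x, 3)
print([f.tolist() for f in folds])  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

# A list of indices splits at those positions instead
head, middle, tail = np.split(x, [2, 7])
print(head, middle, tail)  # [0 1] [2 3 4 5 6] [7 8 9]
```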

---

**4. Transposing and Swapping Axes**

**Situations & Use Cases:**
- **Transposing**:
  - Linear algebra operations, such as matrix multiplication, where you need to multiply a matrix by its transpose.
  - Switching between row-major and column-major data layouts. E.g., some libraries or languages might expect data in a different orientation.
  
- **Swapping Axes**:
  - Reordering dimensions in multidimensional arrays. This is commonly seen in data related to images, where different libraries expect channels (like RGB) in different orders.

In [None]:
arr2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
transposed = arr2.T
print(transposed)
# Output:
# [[1 4 7]
#  [2 5 8]
#  [3 6 9]]


*Big Picture:* Transposing is often used in matrix algebra and linear transformations. It swaps rows with columns.
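
For arrays with more than two dimensions, np.swapaxes exchanges two axes and np.transpose reorders all of them. The sketch below uses a hypothetical image array in (height, width, channels) order and also shows the matrix-times-transpose case from the use cases above.

```python
import numpy as np

# Hypothetical image: height 4, width 5, 3 colour channels
image_hwc = np.zeros((4, 5, 3))

# swapaxes exchanges exactly two axes
swapped = np.swapaxes(image_hwc, 0, 2)
print(swapped.shape)     # (3, 5, 4)

# transpose reorders all axes: (height, width, channels) -> (channels, height, width)
image_chw = np.transpose(image_hwc, (2, 0, 1))
print(image_chw.shape)   # (3, 4, 5)

# Multiplying a matrix by its transpose gives a symmetric matrix
m = np.array([[1, 2], [3, 4]])
gram = m @ m.T           # [[5, 11], [11, 25]]
print(gram)
```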