# **NumPy Python Module**

NumPy, short for Numerical Python, is an open-source Python library designed to handle numerical computations efficiently. It provides support for multi-dimensional arrays (like tables and matrices), allowing you to work with large datasets and perform mathematical operations on them quickly. NumPy includes a wide range of mathematical functions, such as calculating averages, sums, and applying complex operations like matrix multiplication or statistical analysis, all with minimal code. It is widely used in scientific computing, data analysis, and machine learning because of its speed and ease in handling large volumes of numerical data, making it an essential tool in these fields.

## Applications

- Data Analysis: NumPy allows you to create and manipulate data (in the form of arrays), filter it, and perform operations like calculating the mean, standard deviation, etc.

- Machine Learning & AI: Libraries like TensorFlow and PyTorch interoperate with NumPy, so arrays are commonly used to prepare input data and to inspect model parameters and output values.

- Array Manipulation: NumPy supports creating, resizing, slicing, indexing, stacking, splitting, and combining arrays.

- Finance & Economics: NumPy is used for financial analysis, including portfolio optimization, risk assessment, time series analysis, and statistical modeling.

- Image & Signal Processing: NumPy helps in processing and analyzing images and signals for various applications.

- Data Visualization: While NumPy itself doesn’t generate visualizations, it works with libraries like Matplotlib and Seaborn to create charts and graphs from numerical data.


## Why is NumPy Faster Than Lists?

| **Aspect**        | **NumPy**                                                                                           | **Python List**                                                                                                  |
|-------------------|-----------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
| **Memory Storage** | NumPy stores all data in one continuous block, making it faster to access and use less memory.                   | Python lists store references to objects, which can slow things down and use more memory.                        |
| **Data Types**     | NumPy arrays have elements of the **same type**, which helps save memory and makes things faster.               | Python lists can hold elements of **different types**, which takes up more memory and makes things slower.      |
| **Operations**     | NumPy can do math on entire arrays at once, making it faster.                                                   | Python lists need loops to do math on each element one by one, which is slower.                                |
| **Efficiency**     | NumPy is written in **C**, making it much faster for numerical tasks.                                          | Python lists are slower because they are handled by Python’s slower byte-code.                                  |
| **Memory Usage**   | NumPy uses less memory because all elements are the same type and stored in one place.                         | Python lists use more memory because each element is a separate object.                                         |
| **Broadcasting**   | NumPy allows you to do operations on arrays of different sizes without copying data, making it faster.         | Python lists can't do this, so operations on different sized lists take longer.                                |
| **Performance**    | NumPy is faster because it uses memory better and does things more efficiently.                                | Python lists are slower because their data is scattered around in memory.                                      |
| **Functionality**  | NumPy has many built-in math functions for arrays, making it perfect for complex calculations.                 | Python lists only have basic functions and can’t handle complex math easily.                                   |
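
A few of the rows above can be checked directly. The sketch below is illustrative only: the variable names are made up for this example, and the exact byte counts for the list depend on the Python build.

```python
import sys
import numpy as np

values = list(range(1000))                 # plain Python list
arr = np.array(values, dtype=np.int64)     # same numbers in one contiguous block

# Operations: the list needs an explicit loop, the array is processed at once
doubled_list = [v * 2 for v in values]
doubled_arr = arr * 2

# Memory usage: the list holds references plus one int object per element
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
array_bytes = arr.nbytes                   # 1000 elements * 8 bytes each

print(doubled_arr[:5])
print("list bytes:", list_bytes, "array bytes:", array_bytes)
```

Both approaches produce the same numbers; the difference is in how much memory is used and where the loop runs.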


In [1]:
import numpy as np

array = np.array([1, 2, 3, 4, 5])
print(array.dtype)  # Output: int64 (the default integer dtype is platform dependent)
array

int64


array([1, 2, 3, 4, 5])

## NumPy ndarray:
- ndarray (short for N-dimensional array) is the **core object** in NumPy.

- It represents a collection of items (elements) that are all of the same type.

- Each element in the ndarray can be accessed using a zero-based index.

- All elements of an ndarray take the same size block in memory.

- The type of the elements is defined by a special object called **dtype** (data type), which specifies how much memory each element takes and what type it is (e.g., integer, float).

## Relationship Between ndarray, dtype, and Array Scalar Types:
- **ndarray**: This is the main object that holds the array of data.

- **dtype**: This is the data type object that defines the type of each element within the ndarray (e.g., int32, float64).

- **Array Scalar Type**: When you extract a single element from an ndarray by indexing (for example, `a[0]`), it is returned as an object of an array scalar type that corresponds to the dtype of the ndarray. Slicing, by contrast, returns another ndarray.
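
The relationship between the three can be seen in a short sketch. The array `a` and the names `element` and `piece` are chosen only for this illustration.

```python
import numpy as np

a = np.array([10, 20, 30], dtype=np.int32)

element = a[1]      # indexing a single position returns an array scalar
piece = a[1:2]      # slicing returns another ndarray, even for one element

print(type(element))             # <class 'numpy.int32'>
print(type(piece))               # <class 'numpy.ndarray'>
print(element.dtype == a.dtype)  # True
```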

### **Creating an ndarray with numpy.array()**
The numpy.array() function is the most common way to create an ndarray. It takes in any object that can expose the array interface or any sequence (like lists, tuples, or nested sequences). Here's the basic syntax:

In [2]:
# syntax - np.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0, like=None)

## Parameters of `numpy.array()`


| **Sr.No.** | **Parameter** | **Description** | **Example** |
|------------|---------------|-----------------|-------------|
| 1 | **`object`** | This is the data (like a list, tuple, or another array) that you want to convert into a NumPy array. It is the most important argument. | `np.array([1, 2, 3, 4])` creates an array from a list. |
| 2 | **`dtype`** | Specifies the desired data type for the array elements (e.g., `int`, `float`). If not specified, NumPy tries to infer the data type. Refer to the official documentation for the list of supported dtypes. | `np.array([1, 2, 3], dtype=np.float32)` creates an array with `float32` type. |
| 3 | **`copy`** | By default, this is `True`, meaning a new copy of the data is always made. With `copy=False`, NumPy tries to avoid copying; this only succeeds when the input is already a compatible array, because a list must always be copied (NumPy 2.x raises a `ValueError` in that case, while 1.x copies silently). | `np.array(existing_array, copy=False)` reuses the memory of `existing_array` instead of copying it. |
| 4 | **`order`** | Specifies how the array is stored in memory. `'C'` is for row-major (C-style), and `'F'` is for column-major (Fortran-style). | `np.array([[1, 2], [3, 4]], order='F')` creates a 2D array in column-major order. |
| 5 | **`subok`** | If `True`, sub-classes of `ndarray` (like masked arrays) are preserved when returning the array. If `False` (default), the result is always a base `ndarray`. | `np.array([1, 2, 3], subok=True)` keeps any subclass of `ndarray`. |
| 6 | **`ndmin`** | Specifies the minimum number of dimensions for the resulting array. If the input has fewer dimensions, extra dimensions are added. | `np.array([1, 2, 3], ndmin=2)` converts the 1D array to a 2D array with shape `(1, 3)`. |
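
The effect of `copy`, `ndmin`, and `dtype` can be sketched as follows. To stay independent of the NumPy version, this sketch uses `np.asarray()` for the no-copy case rather than `copy=False`; the variable names are hypothetical.

```python
import numpy as np

source = np.array([1, 2, 3])

copied = np.array(source)      # copy=True is the default: independent data
aliased = np.asarray(source)   # no copy is needed for an existing ndarray

copied[0] = 100                # does not touch `source`
aliased[1] = 200               # writes through to `source`

print(source)                                       # [  1 200   3]
print(np.shares_memory(source, copied))             # False
print(np.array([1, 2, 3], ndmin=2).shape)           # (1, 3)
print(np.array([1, 2, 3], dtype=np.float32).dtype)  # float32
```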


## Example: Create a One-dimensional Array

In [3]:

a = np.array([1, 2, 3])
a

array([1, 2, 3])

## Example: Create a Multi-dimensional Array

In [4]:

a = np.array([[1, 2], [3, 4]])
print(a)

a = np.array([range(i,i+2) for i in [1,3]])
a

[[1 2]
 [3 4]]


array([[1, 2],
       [3, 4]])

## Example: Specify Minimum Dimensions

In [5]:

a = np.array([1, 2, 3, 4, 5], ndmin=2)
a

array([[1, 2, 3, 4, 5]])

## Example: Specify Data Type

In [6]:

a = np.array([1, 2, 3], dtype=complex)
a

array([1.+0.j, 2.+0.j, 3.+0.j])

# **Indexing Scheme**

The indexing scheme in NumPy determines how elements in an ndarray are located in memory using a combination of shape and strides.

## Shape and Strides in NumPy
**Shape**:

- The shape of an ndarray represents the size of the array along each dimension. It is a tuple of integers.

- Example: For a 2x3 array, the shape would be (2, 3). This means there are 2 rows and 3 columns.

**Strides**:

- Strides refer to the number of bytes you need to move in memory to access the next element in each dimension. It tells NumPy how to step across the dimensions of the array when moving through its elements.

- For instance, in a 2D array, the stride value for each dimension tells you how many bytes to move from one element to the next element in the same row (along the row axis) or to the next element in the same column (along the column axis).
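
As a rough illustration of how shape and strides locate an element, the sketch below computes the byte offset of `a[i, j]` by hand and reads the value back from a raw copy of the buffer. The dtype is fixed to `int64` so that the numbers are the same on every platform.

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)   # [[0 1 2], [3 4 5]]
i, j = 1, 2

# Byte offset of a[i, j] from the start of the data buffer
offset = i * a.strides[0] + j * a.strides[1]

# Read the 8 bytes at that offset from a raw copy of the buffer
raw = a.tobytes()
value = np.frombuffer(raw, dtype=np.int64, count=1, offset=offset)[0]

print(a.strides)         # (24, 8)
print(offset)            # 40
print(value == a[i, j])  # True
```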

# **Row-major and Column-major Orders**

**Row-major Order (C-style):**

- In row-major order, the last index changes the fastest. This means that elements in the same row are stored next to each other in memory.

- For example, in a 2x3 array like:

[[1, 2, 3],
 [4, 5, 6]]

The elements will be stored in memory as:
1, 2, 3, 4, 5, 6.

**Column-major Order (FORTRAN-style):**

- In column-major order, the first index changes the fastest. This means that elements in the same column are stored next to each other in memory.

- For example, in the same 2x3 array:

[[1, 2, 3],
 [4, 5, 6]]

The elements will be stored in memory as:
1, 4, 2, 5, 3, 6.


**Example**

Following is a basic example to demonstrate the usage of the memory layout −


In [7]:


# Creating a 2x3 array in row-major order
a = np.array([[1, 2, 3], [4, 5, 6]])
print(a)
print("Shape:", a.shape)
print("Strides:", a.strides)

[[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Strides: (24, 8)


The shape of the array is (2, 3), indicating it has 2 rows and 3 columns. The strides are (24, 8), meaning that to move to the next row, we need to skip 24 bytes (since each element is an 8-byte integer and there are 3 columns), and to move to the next column, we need to skip 8 bytes (the size of one integer). On platforms where the default integer is 4 bytes, the strides would be (12, 4).
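
For comparison, here is a sketch of the same data stored in column-major order. The names `c_arr` and `f_arr` are illustrative, and the dtype is pinned to `int64` so the strides do not depend on the platform.

```python
import numpy as np

c_arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)              # row-major
f_arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64, order='F')   # column-major

print(c_arr.strides)   # (24, 8)
print(f_arr.strides)   # (8, 16)

# order='K' reads the elements in the order they occur in memory
memory_order_c = c_arr.ravel(order='K').tolist()
memory_order_f = f_arr.ravel(order='K').tolist()

print(memory_order_c)  # [1, 2, 3, 4, 5, 6]
print(memory_order_f)  # [1, 4, 2, 5, 3, 6]
```

Both arrays compare equal element by element; only the layout in memory differs.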

------------------------------------------

# ***Creating Numpy arrays***

In NumPy, you can create arrays using several built-in functions, each of which serves a different purpose depending on the type of array you want to create. Here’s a breakdown of some of the most commonly used functions for creating NumPy arrays:

### 1. **`numpy.array()`**
- Converts a list or other sequence into a NumPy array.
- **Example**: `np.array([1, 2, 3])` turns a list into an array.

### 2. **`numpy.zeros()`**
- Creates an array filled with zeros.
- **Example**: `np.zeros((2, 3))` creates a 2x3 array of zeros.

### 3. **`numpy.ones()`**
- Creates an array filled with ones.
- **Example**: `np.ones((3, 2))` creates a 3x2 array of ones.

### 4. **`numpy.arange()`**
- Creates an array with numbers in a specified range, with a specific step size.
- **Example**: `np.arange(0, 10, 2)` creates an array with numbers from 0 to 10, stepping by 2 (i.e., `[0, 2, 4, 6, 8]`).

### 5. **`numpy.linspace()`**
- Creates an array with evenly spaced numbers over a specified range.
- **Example**: `np.linspace(0, 1, 5)` creates 5 evenly spaced numbers between 0 and 1.

### 6. **`numpy.logspace()`**
- The numpy.logspace() function generates an array with values that are evenly spaced on a log scale.
- **Example**: `np.logspace(1, 10, 10, base=2)` creates an array of 10 values evenly spaced on a logarithmic scale from `2**1` to `2**10` (base 2).

### 7. **`numpy.random.rand()`**
- Creates an array of random numbers between 0 and 1.
- **Example**: `np.random.rand(2, 3)` creates a 2x3 array of random numbers.

### 8. **`numpy.empty()`**
- Creates an array without initializing the values, meaning the array initially contains whatever arbitrary data was already in that memory.
- **Example**: `np.empty((2, 2))` creates an empty 2x2 array with uninitialized values.

### 9. **`numpy.full()`**
- Creates an array filled with a specific value that you provide.
- **Example**: `np.full((3, 3), 7)` creates a 3x3 array filled with the value `7`.


### **Example: Creating 1D & 2D NumPy Arrays Using a List as Input for numpy.array()**

In [8]:


# Creating a 1D array from a list
# syntax - numpy.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0, like=None)

my_list = [1, 2, 3, 4, 5] #my_list - object list
my_array = np.array(my_list)

print("1D Array:", my_array)

# Creating a 2D array from a list of lists
arr = np.array([[1, 2, 3], [4, 5, 6]])

print("2D Array(2x3):\n", arr)


1D Array: [1 2 3 4 5]
2D Array(2x3):
 [[1 2 3]
 [4 5 6]]


### **Example: Creating a NumPy Array with values initialized to zeroes**

In [9]:


# Creating an array of zeros 

# syntax - numpy.zeros(shape, dtype=float, order='C')

# here order decides whether the array is stored in row-major ('C') or column-major ('F') memory layout
arr = np.zeros(5)
print(arr)

[0. 0. 0. 0. 0.]


### **Example: Creating a 1D & 2D NumPy Array with values initialized to one**

In [10]:


# Creating an array of ones 
# syntax - numpy.ones(shape, dtype=None, order='C')
arr = np.ones(3)
print("1D Array:", arr)

# Creating 2D array of ones 
array_2d = np.ones((2, 3), dtype=np.int32, order='F')
print("2D Array(2x3):\n", array_2d)
print("strides for 2d for int32 dtype with Fortran style indexing: ", array_2d.strides)

1D Array: [1. 1. 1.]
2D Array(2x3):
 [[1 1 1]
 [1 1 1]]
strides for 2d for int32 dtype with Fortran style indexing:  (4, 8)


### **Using numpy.arange() Function**

The numpy.arange() function generates a sequence of numbers in a specified range. You can define three parameters:

- start: The starting value of the sequence (defaults to 0 if not provided).

- stop: The end value of the sequence (exclusive, not included in the result).

- step: The interval between consecutive values (defaults to 1 if not provided).

It creates an array with evenly spaced numbers based on the start, stop, and step values.

### **Example for numpy.arange()**

In [11]:


# Providing just the stop value
array1 = np.arange(10)
print("array1:", array1)

# Providing start, stop and step value
array2 = np.arange(1, 10, 2)
print("array2:",array2)

array1: [0 1 2 3 4 5 6 7 8 9]
array2: [1 3 5 7 9]


### **Using numpy.linspace() Function:**
The numpy.linspace() function generates a sequence of evenly spaced numbers over a specified range. Unlike numpy.arange(), which generates numbers based on a specified step size, numpy.linspace() generates a fixed number of points between a given start and stop value.

**Parameters:**
- start: The starting value of the sequence.

- stop: The end value of the sequence (inclusive by default).

- num: The number of samples to generate. This is the total number of equally spaced points between start and stop (defaults to 50).

- endpoint: If True (default), stop is included in the sequence. If False, stop is excluded.

- retstep: If True, returns a tuple of the array and the step size used to generate it.

- dtype: The data type of the returned array.

- axis: The axis in the result along which the numbers are spaced.

In [12]:


# syntax - numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)

# Creating an array of 10 evenly spaced values from 0 to 5
array1 = np.linspace(0, 5, num=10, dtype=np.float16)
print("array1:",array1)


# Creating an array with 5 values from 1 to 2, excluding the endpoint
array2 = np.linspace(1, 2, num=5, endpoint=False)
print("array2:",array2)

# Creating an array and returning the step value
array3, step = np.linspace(0, 10, num=5, retstep=True)
print("array3:",array3)
print("Step size:", step)

array1: [0.     0.5557 1.111  1.667  2.223  2.777  3.334  3.889  4.445  5.    ]
array2: [1.  1.2 1.4 1.6 1.8]
array3: [ 0.   2.5  5.   7.5 10. ]
Step size: 2.5


### Using `numpy.logspace()` Function:

Generates numbers that are evenly spaced on a logarithmic scale, which is useful when your data spans several orders of magnitude (like 10, 100, 1000...).

**Syntax & parameters**
```python
np.logspace(start, stop, num=50, endpoint=True, base=10.0, dtype=None, axis=0)
```

| Parameter | Description |
|----------|-------------|
| `start` | Exponent of the starting value (`base^start`). |
| `stop` | Exponent of the ending value (`base^stop`). |
| `num` | Number of samples to generate (default = 50). |
| `endpoint` | If `True`, include `base**stop`; otherwise exclude it. |
| `base` | Base of the log scale (default = 10). |
| `dtype` | Data type of the resulting array. |
| `axis` | The axis along which the output is stored (for multi-dimensional use). |




In [13]:
# Suppose you measure a microphone's response from 2 Hz to 1024 Hz and want evenly log-spaced (not linear) frequencies, using base 2:
# Frequencies from 2^1 to 2^10 (i.e., 2 to 1024)

frequencies = np.logspace(1, 10, num=10, base=2)
print("Log-spaced frequencies:", frequencies)

Log-spaced frequencies: [   2.    4.    8.   16.   32.   64.  128.  256.  512. 1024.]
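
One way to see what `logspace()` computes is to build the same values by hand: space the exponents linearly, then raise the base to each of them. This is a sketch of the relationship, not a description of NumPy's internal implementation.

```python
import numpy as np

exponents = np.linspace(1, 10, num=10)        # 1, 2, ..., 10
by_hand = 2.0 ** exponents                    # raise the base to each exponent
from_logspace = np.logspace(1, 10, num=10, base=2)

print(np.allclose(by_hand, from_logspace))    # True
```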


### **Using numpy.random.rand() Function**:

The `numpy.random.rand()` function creates an array filled with random values sampled from a uniform distribution over the interval [0, 1). This means that the values generated will lie between 0 and 1 (excluding 1).

### **Example**


In [14]:


# syntax - numpy.random.rand(d0, d1, ..., dn)

# 1. Generating a single random float
random_float = np.random.rand()
print("Single random float:", random_float)

# 2. Generating a 1D array of 5 random floats
array_1d = np.random.rand(5)
print("1D array of random values:", array_1d)

# 3. Generating a 2D array (2 rows, 3 columns) of random floats
array_2d = np.random.rand(2, 3)
print("2D array of random values:\n", array_2d)

# 4. Generating a 3D array (2 matrices of 3x4) of random floats
array_3d = np.random.rand(2, 3, 4)
print("3D array of random values:\n", array_3d)


Single random float: 0.2603166646906496
1D array of random values: [0.07097053 0.16760147 0.88601334 0.83066785 0.92509942]
2D array of random values:
 [[0.48173098 0.45225485 0.55657073]
 [0.47911143 0.44992784 0.77219552]]
3D array of random values:
 [[[0.77844212 0.74015932 0.46742119 0.80845376]
  [0.49726735 0.6288721  0.47225338 0.76884297]
  [0.71903181 0.704893   0.38439389 0.32950579]]

 [[0.77915593 0.45735061 0.74538303 0.93700316]
  [0.94294513 0.9343461  0.96552908 0.42186477]
  [0.05668961 0.50843357 0.05458259 0.28101145]]]
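
The outputs above change on every run. When reproducible values are needed, one option is the `Generator` interface created by `np.random.default_rng()`, which accepts a seed. This is a sketch of an alternative to `rand()`, with an arbitrarily chosen seed.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seed chosen arbitrarily for the example
sample = rng.random((2, 3))            # uniform values in [0, 1), shape (2, 3)

# A second generator with the same seed reproduces the same values
repeat = np.random.default_rng(seed=42).random((2, 3))

print(sample.shape)
print(np.array_equal(sample, repeat))
```

Note that `Generator.random()` takes the shape as a single tuple, whereas `rand()` takes each dimension as a separate argument.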


### **Using numpy.empty() Function**:

- **`numpy.empty()`** creates an array with the specified shape.
- The values inside the array are **uninitialized**: they are whatever happened to be in that memory, so they look arbitrary and should not be treated as random numbers.
- It is useful when you need to create an array **quickly** without filling it with values immediately.
- Unlike `numpy.zeros()` (which fills the array with zeros) or `numpy.ones()` (which fills the array with ones), `numpy.empty()` does not initialize the array.
- It can be slightly faster than `numpy.zeros()` or `numpy.ones()` because it skips the initialization step; it uses the same amount of memory.
- Best used when you plan to **fill** the array with your own data later.

### **Example**

In [15]:


# syntax - numpy.empty(shape, dtype=float, order='C')

empty_array = np.empty((2, 3))
print("array init with random values:\n",empty_array)

array init with random values:
 [[0.48173098 0.45225485 0.55657073]
 [0.47911143 0.44992784 0.77219552]]
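
A typical use of `numpy.empty()` is to allocate the result first and then overwrite every element. The sketch below uses a simple loop and a hypothetical `squares` array to show the pattern.

```python
import numpy as np

n = 5
squares = np.empty(n, dtype=np.int64)   # contents are arbitrary at this point

for i in range(n):
    squares[i] = i * i                  # every slot is written before it is read

print(squares)   # [ 0  1  4  9 16]
```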


### Explanation of `numpy.full()` Function

The `numpy.full()` function is used to create a NumPy array of a specific shape and fill it with a **specified value**. This is useful when you want to initialize an array with a specific number across all its elements.

### **Parameters:**

- **shape**: Specifies the dimensions of the array (e.g., (2, 3) for a 2x3 array).

- **fill_value**: The value with which the array is filled. This value will be assigned to all elements of the array.

- **dtype**: Optional. Specifies the data type of the array elements. The default is None, meaning the data type is inferred from the fill value.

- **order**: Optional. Specifies the memory layout order ('C' for row-major, 'F' for column-major). It is not commonly used unless you have specific requirements.

In [16]:
# syntax - numpy.full(shape, fill_value, dtype=None, order='C')

array1 = np.full((2, 3), 5)
print(array1)

[[5 5 5]
 [5 5 5]]


### **Using `numpy.meshgrid()` Function:**

Creates coordinate matrices from 1D arrays. Essential for:

- Evaluating 2D/3D functions

- Plotting surface or contour plots

- 3D simulations and space mapping

**Syntax and Parameters**
```python
np.meshgrid(*xi, copy=True, sparse=False, indexing='xy')
```

| Parameter   | Description                                                                 |
|-------------|-----------------------------------------------------------------------------|
| `*xi`       | One or more 1D arrays (e.g., `x`, `y`, `z`).                                 |
| `copy`      | If `True`, data is copied. If `False`, a view may be returned.              |
| `sparse`    | If `True`, returns sparse meshgrid (saves memory).                          |
| `indexing`  | `'xy'` (default for plotting), or `'ij'` (matrix-style indexing).           |




In [17]:
# 2D Grid Example:

x = np.arange(1, 4)  # [1, 2, 3]
y = np.arange(1, 3)  # [1, 2]

X, Y = np.meshgrid(x, y)

print("X Grid:\n", X)
print("Y Grid:\n", Y)


X Grid:
 [[1 2 3]
 [1 2 3]]
Y Grid:
 [[1 1 1]
 [2 2 2]]


This represents all (x, y) positions on the grid:
```
(1,1), (2,1), (3,1)
(1,2), (2,2), (3,2)
```
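
The grids become useful once a function is evaluated on them. The sketch below evaluates an example function, f(x, y) = x² + y (chosen only for illustration), on the same grid, and shows that `sparse=True` gives the same result through broadcasting.

```python
import numpy as np

x = np.arange(1, 4)   # [1, 2, 3]
y = np.arange(1, 3)   # [1, 2]

X, Y = np.meshgrid(x, y)
Z = X**2 + Y          # f(x, y) = x^2 + y evaluated at every grid point

# sparse=True returns arrays of shape (1, 3) and (2, 1); broadcasting fills in the rest
Xs, Ys = np.meshgrid(x, y, sparse=True)
Z_sparse = Xs**2 + Ys

print(Z)
print(np.array_equal(Z, Z_sparse))
```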

In [18]:
# 3D Grid Example:

x = np.arange(1, 4)
y = np.arange(1, 3)
z = np.arange(1, 3)

X, Y, Z = np.meshgrid(x, y, z, indexing='ij')

print("X Grid:\n", X)
print("Y Grid:\n", Y)
print("Z Grid:\n", Z)


"""
X varies in x-direction (1 to 3)

Y varies in y-direction (1 to 2)

Z varies in z-direction (1 to 2)

Each element in (X[i,j,k], Y[i,j,k], Z[i,j,k]) forms one 3D point.
"""


X Grid:
 [[[1 1]
  [1 1]]

 [[2 2]
  [2 2]]

 [[3 3]
  [3 3]]]
Y Grid:
 [[[1 1]
  [2 2]]

 [[1 1]
  [2 2]]

 [[1 1]
  [2 2]]]
Z Grid:
 [[[1 2]
  [1 2]]

 [[1 2]
  [1 2]]

 [[1 2]
  [1 2]]]



### To sum it up

In the NumPy module, there are various ways to create NumPy arrays that include basic creation methods, creation by reshaping and modifying data, creation using sequences, and creation using random functions. For further details on the creation functions, please refer to the official NumPy documentation.

----------------------------------------------------------------

# ***Creating NumPy Arrays from Existing Data***

In data science, machine learning, and numerical computing, NumPy arrays form the backbone of efficient data storage and computation. But before you can analyze, transform, or model data, you need to first create arrays from existing sources.

Think of it like this:

> ❝You already have data in various formats—lists, files, buffers, generators—and now you want to transform that into a powerful structure (a NumPy array) that can be sliced, transformed, visualized, or fed into ML models.❞

Following are a few common ways to achieve this:

## 1. **From Python Lists**

Lists are one of the most basic data structures in Python. We often collect data in lists—like scores, names, sensor readings. NumPy can convert these into arrays for fast numerical operations.

In [19]:
## syntax - np.array(object, dtype=None, copy=True, order='K', subok=False, ndmin=0)

# recording daily temperatures as list:
temperatures = [22, 24, 21, 23, 25]

temp_array = np.array(temperatures)

print("Array from list:", temp_array)

# Now you can perform operations like mean, std deviation etc., which are hard with normal lists.


Array from list: [22 24 21 23 25]


## 2. **From Nested Lists**

Nested lists = lists of lists → can represent 2D (or more) data like matrices, grids, tables.

In [20]:
scores = [[85, 90, 78], [92, 88, 84], [70, 75, 80]]
score_array = np.array(scores)

print("2D Array from nested list:\n", score_array)

# You can now access a single score with score_array[1, 2] (row 1, column 2)

2D Array from nested list:
 [[85 90 78]
 [92 88 84]
 [70 75 80]]


## 3. **From Python Tuples**

Tuples are like lists but immutable. You might use tuples for fixed data—like coordinates or RGB color values.

In [21]:
coordinates = (48.8566, 2.3522)
coord_array = np.array(coordinates)

print("Array from tuple:", coord_array)

# Now you can apply math functions (e.g., convert to radians, compute distances).

Array from tuple: [48.8566  2.3522]


## 4. **From Existing NumPy Arrays**

There are 5 submethods here, useful when working with existing data:

  1. `np.copy()`: Full data copy

In [22]:
original = np.array([1, 2, 3])
copy = np.copy(original)
copy[0] = 100
print("Original:", original, "Copied:", copy)

# Use when you want a safe duplicate to modify without affecting the original.


Original: [1 2 3] Copied: [100   2   3]


  2. `np.asarray()`: Converts to array but avoids unnecessary copying

In [23]:
original = np.array([1, 2, 3])
converted = np.asarray(original)
print("Same object?", original is converted)  # True

# Useful when you’re not sure if input is array or list.

Same object? True


  3. `ndarray.view()`: Shares data with a new perspective (like reinterpreting the same bytes as another dtype)

In [24]:
original = np.array([1, 2, 3], dtype=np.int32)
viewed = original.view(dtype=np.float32)
print("Viewed array:", viewed)

# Use this to reinterpret binary data types.


Viewed array: [1.e-45 3.e-45 4.e-45]


  4. `np.reshape()`: Change shape without altering data (covered in the section on manipulating the shape of arrays)

  5. `Slicing`: Get part of the array

In [25]:
original = np.array([10, 20, 30, 40, 50])
sliced = original[1:4]
print("Sliced:", sliced)

# Use this to process or analyze only a subset (e.g., last week’s data).


Sliced: [20 30 40]
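
Keep in mind that a basic slice is a view of the original data, not a copy. The sketch below shows the consequence and how `.copy()` decouples the two; the variable names are illustrative.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

view = original[1:4]           # basic slicing returns a view
view[0] = 99                   # also changes original[1]

independent = original[1:4].copy()
independent[1] = -1            # original is unaffected

print(original)      # [10 99 30 40 50]
print(independent)   # [99 -1 40]
```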


## 5. **Using Python Range Objects**

range() generates sequences efficiently. Convert to array when you want to manipulate or analyze them.



In [26]:
student_ids = range(1, 1001)
id_array = np.array(student_ids)

print("First 10 IDs:", id_array[:10])

# Much faster than writing out a list manually.

First 10 IDs: [ 1  2  3  4  5  6  7  8  9 10]


## 6. **Using np.asarray()**

Use asarray() when input might already be an array. It avoids copying unless necessary.

In [27]:
# syntax - np.asarray(a, dtype=None, order=None)

data = [[1, 2, 3, 4],[5, 6, 7, 8]] # of any dimension 1D/2D etc
safe_array = np.asarray(data)
print(safe_array)

# Efficiently converts to array without redundancy.

[[1 2 3 4]
 [5 6 7 8]]


## 7. **Using np.frombuffer()**

Used when working with binary data or memory buffers. Useful for low-level tasks like image, audio processing.

In [28]:
# syntax - np.frombuffer(buffer, dtype=float, count=-1, offset=0)

byte_data = b'hello'
arr = np.frombuffer(byte_data, dtype='S1')
print("Array from buffer:", arr)

# Each byte becomes an array element.

Array from buffer: [b'h' b'e' b'l' b'l' b'o']
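
For numeric data, the `dtype`, `count`, and `offset` parameters decide how the bytes are interpreted. In the sketch below the bytes are produced with `tobytes()` purely to stand in for data that would normally come from a file or a socket.

```python
import numpy as np

# Pretend these bytes came from a file or a network socket
raw_bytes = np.array([1, 2, 3, 4], dtype=np.int16).tobytes()

samples = np.frombuffer(raw_bytes, dtype=np.int16)
tail = np.frombuffer(raw_bytes, dtype=np.int16, count=2, offset=4)  # skip 2 samples (4 bytes)

print(samples)                  # [1 2 3 4]
print(tail)                     # [3 4]
print(samples.flags.writeable)  # False: the array wraps the immutable bytes object
```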


## 8. **Using np.fromiter()**

Creates array from any iterable (like a generator). Efficient when reading large data streams.

In [29]:
# syntax - np.fromiter(iterable, dtype, count=-1)

def sensor_data():
    for i in range(5):
        yield i * 10

arr = np.fromiter(sensor_data(), dtype=int)
print("Sensor array:", arr)

# Great for data streams: values are consumed one at a time, so no intermediate list is built (the resulting array itself still lives in memory).

Sensor array: [ 0 10 20 30 40]
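
If the number of items is known in advance, passing `count` lets NumPy allocate the output array once instead of growing it. The generator below is a hypothetical stand-in for a real data stream.

```python
import numpy as np

def sensor_data(n):
    # Hypothetical generator standing in for a real data stream
    for i in range(n):
        yield i * 10

readings = np.fromiter(sensor_data(5), dtype=np.int64, count=5)
print(readings)   # [ 0 10 20 30 40]
```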


-----

## ***Manipulating the Shape of Arrays in NumPy***

Several routines are available in NumPy for manipulating the shape of an ndarray. These routines allow you to change the shape without altering the data, making it easier to work with different dimensions. The following functions are used for changing the shape of arrays:

## Frequently Used Functions

### Reshape and Flattening
- `reshape()`, `flatten()`, `ravel()` — Machine learning, data preprocessing.

### Transpose
- `transpose()`, `swapaxes()` — Linear algebra, deep learning.

### Joining Arrays
- `concatenate()`, `vstack()`, `hstack()` — Data preparation, feature engineering.

### Sorting and Searching
- `sort()`, `argsort()`, `argmax()` — Data analysis, optimization tasks.

### Broadcasting
- `broadcast_to()`, `expand_dims()` — Handling arrays of different shapes for operations.

### Set Operations
- `unique()`, `in1d()` (superseded by `isin()` in NumPy 2.0) — Data cleaning, analysis.

The discussion below covers these topics. Other manipulation routines are used less often; if you need them, see the official NumPy user guide.






## **Reshape and Flattening of ndarray**: 

In NumPy, reshaping means changing the structure of an array without modifying its actual data.

### 1. Numpy `reshape()` Function

The `reshape()` function in NumPy is used to change the shape of an array, but the data in the array stays the same. It doesn't change the values inside; it just rearranges them into a new shape based on the dimensions you provide. The total number of elements must remain the same, meaning the number of elements in the original array must match the number of elements in the reshaped array.

### Syntax:
```python
numpy.reshape(arr, newshape, order='C')
```

### Parameters:
- arr: The input array that you want to reshape.

- newshape: A new shape for the array. This can be a single integer or a tuple of integers. The new shape must be compatible with the total number of elements in the original array.

- order: (Optional) This specifies how the array should be read and written:

  'C': Row-major order (default).

  'F': Column-major order.

### Return Value:
The function returns an array with the same data but a different shape. When possible, the result is a view of the original memory rather than a copy.

### Example 1: Basic Reshaping
Let's say you have a 1D array and you want to reshape it into a 2D array.

In [30]:


# Create a 1D array with 6 elements
arr = np.array([1, 2, 3, 4, 5, 6])

# Reshape it into a 2D array with 2 rows and 3 columns
reshaped_arr = arr.reshape(2, 3)  # called as a method, so arr is not passed as the first argument

print("Original Array:")
print(arr)

print("\nReshaped Array (2x3):")
print(reshaped_arr)

Original Array:
[1 2 3 4 5 6]

Reshaped Array (2x3):
[[1 2 3]
 [4 5 6]]
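
Because `reshape()` returns a view whenever it can, writing to the reshaped array may change the original. The sketch below checks this with `np.shares_memory()` and also shows a case where NumPy has to copy.

```python
import numpy as np

arr = np.arange(6)
reshaped = arr.reshape(2, 3)

print(np.shares_memory(arr, reshaped))   # True: reshaped is a view
reshaped[0, 0] = 100
print(arr[0])                            # 100

# Reshaping a transposed (non-contiguous) array forces a copy
flat_copy = reshaped.T.reshape(6)
print(np.shares_memory(arr, flat_copy))  # False
```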


### Example 2: Using -1 to Automatically Calculate One Dimension
If you're unsure about one dimension of the new shape, you can use -1, and NumPy will automatically calculate it for you based on the total number of elements.

In [31]:


# Create a 1D array with 9 elements
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

reshaped_arr = arr.reshape(3, -1)

print("Original Array:")
print(arr)

print("\nReshaped Array (3x3):")
print(reshaped_arr)

Original Array:
[1 2 3 4 5 6 7 8 9]

Reshaped Array (3x3):
[[1 2 3]
 [4 5 6]
 [7 8 9]]


### Example 3: Reshaping with Column-Major Order
You can also reshape the array using a different reading and writing order (column-major order). This is less common but may be useful in some cases.

In [32]:


# Create a 2x3 array
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Reshape it to 3x2 using 'F' for column-major order
reshaped_arr = np.reshape(arr, (3, 2), order='F')

print("Original Array:")
print(arr)

print("\nReshaped Array (3x2) in Column-Major Order:")
print(reshaped_arr)

Original Array:
[[1 2 3]
 [4 5 6]]

Reshaped Array (3x2) in Column-Major Order:
[[1 5]
 [4 3]
 [2 6]]


### 🔁 Error Occurrence while Reshaping Arrays in NumPy

When reshaping arrays in NumPy, you might run into errors — especially if the total number of elements doesn't match the new shape. It's important to make sure the new shape is compatible with the original number of elements.


### ❗ Common Errors While Reshaping

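#### 1. **ValueError: Total size of new array must be unchanged**
This error happens when the total number of elements in the original array doesn't match the product of the dimensions you are trying to reshape to. Older NumPy releases use this wording; recent releases report the same problem with the message shown in item 2.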

**Example:**
```python
import numpy as np
np.arange(10).reshape(3, 3)  # Error: 10 elements can't fit into a 3x3 matrix (needs 9)
```

#### 2. **ValueError: cannot reshape array of size X into shape (Y, Z)**
This means you're trying to reshape an array of size `X` into a shape whose dimensions don't multiply to `X`.

**Example:**
```python
arr = np.arange(10)  # size 10
arr.reshape(2, 5)     # ✅ works
arr.reshape(2, 3)     # ❌ Error: 2x3 = 6, but array has 10 elements
```

#### 3. **TypeError: 'float' object cannot be interpreted as an integer**
This happens if you accidentally pass a non-integer (such as a float) as a dimension. The message names the type of the value that was passed.

**Example:**
```python
arr = np.arange(6)
arr.reshape([2.0, 3])  # ❌ Error: 2.0 is a float, not an int
```
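
A dimension often ends up as a float because it was computed with `/`. A minimal sketch of that situation and one possible fix (converting with `int()`, or using `//` in the first place); the variable names are illustrative:

```python
import numpy as np

arr = np.arange(6)
rows = 6 / 3  # true division returns the float 2.0

try:
    arr.reshape(rows, 3)
    float_accepted = True
except TypeError as e:
    float_accepted = False
    print("TypeError:", e)

# Convert the dimension to an int before reshaping
reshaped = arr.reshape(int(rows), 3)
print(reshaped.shape)  # (2, 3)
```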

### ✅ How to Handle Reshaping Errors

#### ✔️ 1. Check Array Size
Use `.size` to check total elements and ensure the new shape's dimensions multiply to that value.

```python
arr = np.arange(12)
print(arr.size)  # Output: 12
```
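
The size check can be wrapped in a small helper. `can_reshape` below is a hypothetical function written for this tutorial, not part of NumPy, and it assumes the shape contains no `-1` placeholder.

```python
import math
import numpy as np

def can_reshape(arr, new_shape):
    """Hypothetical helper: True if new_shape holds exactly arr.size elements."""
    return math.prod(new_shape) == arr.size

arr = np.arange(12)
print(can_reshape(arr, (3, 4)))  # True:  3 * 4 == 12
print(can_reshape(arr, (5, 2)))  # False: 5 * 2 != 12
```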

#### ✔️ 2. Use `-1` for Unknown Dimension
Let NumPy calculate one of the dimensions automatically:

```python
arr = np.arange(12)
arr.reshape(3, -1)  # Output: shape (3, 4)
```

#### ✔️ 3. Use Try-Except to Catch Errors
Wrap your reshaping in a try-except block to handle errors gracefully:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

try:
    reshaped_arr = arr.reshape((2, 3))
except ValueError as e:
    print("Error Occurred During Reshaping:")
    print(e)
```

**Output:**
```
Error Occurred During Reshaping:
cannot reshape array of size 5 into shape (2,3)
```

---

### 💡 Summary

- Always make sure the new shape's total elements match the original.
- Use `-1` when you're unsure of one dimension.
- Wrap reshaping code in `try-except` blocks to avoid crashes.


### 2. Numpy `flatten()` Function

The `flatten()` method in NumPy converts a multi-dimensional array (2D, 3D, and so on) into a **1D array**.

### What Does flatten() Do?

- Creates a new 1D array: The method takes all elements from the multi-dimensional array and arranges them into a single, 1D array.

- Order of elements: The elements are placed into the 1D array in a specific order, which you can control. By default, it follows row-major order (the way the C programming language lays out arrays in memory), but you can change it based on your requirements.

#### Syntax
```python
ndarray.flatten(order='C')
```
### Parameter Description

`order`: Specifies the order in which the elements are flattened.

- 'C' : Row-major (C-style) order, i.e., flattening happens row by row.

- 'F' : Column-major (Fortran-style) order, i.e., flattening happens column by column.

- 'A' : If the array is Fortran-contiguous in memory, it flattens in column-major order; otherwise, row-major order.

- 'K' : Flatten in the order the elements occur in memory (preserves the original memory layout).
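
The four options are easiest to tell apart on an array whose memory layout differs from its logical row order. The sketch below uses a transposed array for that purpose; the names `base` and `t` are illustrative.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)  # C-contiguous; memory holds 0 1 2 3 4 5
t = base.T                         # shape (3, 2), a Fortran-contiguous view

flat_c = t.flatten(order='C')  # row by row of t
flat_f = t.flatten(order='F')  # column by column of t
flat_a = t.flatten(order='A')  # t is Fortran-contiguous, so column-major is used
flat_k = t.flatten(order='K')  # order in which the elements sit in memory

print(flat_c)  # [0 3 1 4 2 5]
print(flat_f)  # [0 1 2 3 4 5]
print(flat_a)  # [0 1 2 3 4 5]
print(flat_k)  # [0 1 2 3 4 5]
```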

### Example 1: Flattening a 2D Array (Row-Major Order)


In [33]:


array_2d = np.array([[1, 2, 3], [4, 5, 6]])

flattened_arr = array_2d.flatten()  # default order='C' (row-major)
print("Original 2D Array:")
print(array_2d)
print("\nFlattened 1D Array:")
print(flattened_arr)

Original 2D Array:
[[1 2 3]
 [4 5 6]]

Flattened 1D Array:
[1 2 3 4 5 6]


### Example 2: Flattening in Column-Major Order

In [34]:


# Creating a 2D numpy array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Flattening the array in column-major ('F' order)
flattened_array = array_2d.flatten(order='F')

print("Original 2D Array:")
print(array_2d)
print("\nFlattened 1D Array (Column-major):")
print(flattened_array)


Original 2D Array:
[[1 2 3]
 [4 5 6]]

Flattened 1D Array (Column-major):
[1 4 2 5 3 6]


### Example 3: Flattening a 3D Array

In [35]:


# Creating a 3D numpy array
array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# Flattening the array (default 'C' order)
flattened_array = array_3d.flatten()

print("Original 3D Array:")
print(array_3d)
print("Flattened 1D Array:")
print(flattened_array)


Original 3D Array:
[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Flattened 1D Array:
[1 2 3 4 5 6 7 8]


### Example 4: Flattening a Fortran-Contiguous Array

In [36]:


# Creating a 2D numpy array with Fortran-contiguous memory layout
array_fortran = np.asfortranarray([[1, 2, 3], [4, 5, 6]])

# Flattening the array using 'A' order (column-major if Fortran-contiguous)
flattened_array = array_fortran.flatten(order='A')

print("Original Fortran-contiguous 2D Array:")
print(array_fortran)
print("Flattened Array ('A' Order):")
print(flattened_array)


Original Fortran-contiguous 2D Array:
[[1 2 3]
 [4 5 6]]
Flattened Array ('A' Order):
[1 4 2 5 3 6]


### 3. Numpy `ravel()` Function – Flatten Arrays Efficiently

### What is `ravel()`?

The `numpy.ravel()` function converts a **multi-dimensional array into a 1D array**, just like `flatten()`, but it is smarter about **memory**. Whenever possible, `ravel()` returns a **view** (not a copy), meaning it **doesn't duplicate data** unless it has to. As a result, changes made to the raveled array are reflected in the original array whenever a view was returned.

It's a good choice when you want a flat array **without wasting memory** and you don't need to modify the flattened data independently of the original.


### Difference between `ravel()` and `flatten()`

| Feature        | `ravel()`                        | `flatten()`                     |
|----------------|----------------------------------|----------------------------------|
| Return type    | Tries to return a **view**       | Always returns a **copy**       |
| Memory usage   | More memory **efficient**        | Uses **more memory**            |
| Modifying output affects original? | Yes (if it's a view)         | No (it's a separate copy)       |
| Speed          | Faster (usually)                 | Slower (due to copying)         |


### Syntax

```python
numpy.ravel(a, order='C')
```
- `a`: the array to flatten
- `order`: accepts the same values as in `flatten()` (`'C'`, `'F'`, `'A'`, `'K'`)

### Example 1: Modifying the ravel() Result Affects the Original Array (When a View Is Returned)

In [37]:


# Create a 2D numpy array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Use ravel() to flatten — gets a view (shared memory)
ravel_result = np.ravel(array_2d)

# Modify the flattened array
ravel_result[0] = 999

# Print results
print("Modified ravel result:", ravel_result)
print("\nOriginal array after ravel change:")
print(array_2d)

Modified ravel result: [999   2   3   4   5   6]

Original array after ravel change:
[[999   2   3]
 [  4   5   6]]


### Example: Comparison with flatten() - flatten() Does Not Affect the Original Array

In [38]:


# Create a 2D numpy array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])

# Use flatten() to flatten — creates a new copy
flatten_result = array_2d.flatten()

# Modify the flattened array
flatten_result[0] = 111

# Print results
print("Modified flatten result:", flatten_result)
print("\nOriginal array after flatten change:")
print(array_2d)

Modified flatten result: [111   2   3   4   5   6]

Original array after flatten change:
[[1 2 3]
 [4 5 6]]
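
Whether `ravel()` actually returned a view can be checked with `np.shares_memory()`. The sketch below also shows a case where `ravel()` has to fall back to a copy: flattening a transposed array in row-major order.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

view_case = np.shares_memory(a, a.ravel())      # contiguous input: a view
copy_case = np.shares_memory(a, a.flatten())    # flatten always copies
forced_copy = np.shares_memory(a, a.T.ravel())  # a.T is not row-major in memory

print(view_case)    # True
print(copy_case)    # False
print(forced_copy)  # False
```

If you rely on changes propagating back to the original, a check like this is a reasonable safeguard.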


## **Transpose operations**

NumPy's transpose operations switch rows and columns in 2D arrays, and rearrange axes in arrays with more than two dimensions.

### 1. Numpy `transpose()` Function 

**Understanding `numpy.transpose()` in a Beginner-Friendly Way**

When we say we "transpose" an array in NumPy, it means we're flipping its axes.  
But let's put that into a real-world scenario to make it click.


**Real-World Analogy: Rotating a Set of Boxes**

Imagine you run a warehouse and you're stacking boxes with labels like this:

**Boxes (2 rows, 3 columns):**

Row 1: Apple, Banana, Cherry  
Row 2: Date, Fig, Grape

This setup can be seen as a 2D array:

```python
[
  ["Apple", "Banana", "Cherry"],
  ["Date", "Fig", "Grape"]
]
```

Now, suppose someone wants to view the boxes by columns instead of rows, like this:

Column 1: Apple, Date  
Column 2: Banana, Fig  
Column 3: Cherry, Grape

That's what `numpy.transpose()` does. It flips rows into columns and vice versa.

### Syntax
```python
np.transpose(a, axes=None)
```

- `a` is the array you want to transpose.
- `axes` (optional) lets you control how the axes are rearranged. If omitted, the order of the axes is reversed.

### Example 1 - School Timetable

We have a timetable stored with rows = days and columns = periods:

|         |Period 1|Period 2|
|----|----|----|
|Monday |Math|English|
|Tuesday|Physics|Chemistry|

If we want to transpose it to view the classes period-wise, it looks like this:

|         |Monday|Tuesday|
|----|----|----|
|Period 1|Math|Physics|
|Period 2|English|Chemistry|

In [39]:


timetable = np.array([
  ['Math', 'English'],
  ['Physics', 'Chemistry']
])

# To view the table period-wise instead of day-wise, transpose it

transposed_timetable = np.transpose(timetable)

print("Original Timetable (Day-wise):\n", timetable)
print("\nTransposed Timetable (Period-wise):\n", transposed_timetable)

Original Timetable (Day-wise):
 [['Math' 'English']
 ['Physics' 'Chemistry']]

Transposed Timetable (Period-wise):
 [['Math' 'Physics']
 ['English' 'Chemistry']]


### Example 2 - Image Processing (RGB Image) with Explicit Axes

Many libraries store images as (height, width, channels). Deep learning libraries such as PyTorch typically expect (channels, height, width), so the axes need to be reordered.

In [40]:
# Shape: (height=2, width=2, channels=3)
image = np.array([
    [[255, 0, 0], [0, 255, 0]],    # Row 1
    [[0, 0, 255], [255, 255, 0]]   # Row 2
])

# Transpose to (channels, height, width)
# axes=(2, 0, 1): new axis 0 = old axis 2 (channels), new axis 1 = old axis 0 (height), new axis 2 = old axis 1 (width)
transposed_img = np.transpose(image, (2, 0, 1))

print("Original shape:", image.shape)  # (2, 2, 3)
print("Transposed shape:", transposed_img.shape)  # (3, 2, 2)


Original shape: (2, 2, 3)
Transposed shape: (3, 2, 2)
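
The rule behind `axes` is that new axis `i` is old axis `axes[i]`. A small sketch on an arbitrary 2x3x4 array (the names `x` and `y` are illustrative) checks both the resulting shape and where one element ends up:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)
axes = (2, 0, 1)
y = np.transpose(x, axes)

# New axis i takes its length from old axis axes[i]
print(y.shape)  # (4, 2, 3)

# With axes=(2, 0, 1), y[k, i, j] refers to the same element as x[i, j, k]
same_element = y[3, 1, 2] == x[1, 2, 3]
print(same_element)  # True
```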


### 2. Numpy `T` Attribute

**Understanding `ndarray.T` in NumPy (Transpose)**

The `.T` attribute in NumPy is a shortcut for **transposing** an array—and it’s one of the most fundamental operations you’ll use in array manipulation, especially when working with tables, matrices, or multidimensional data.

Think of `.T` as saying: **"flip the shape of this data."**

**What Does "Transpose" Really Mean?**

At its core, transposing means:

- For a **2D array** (like a spreadsheet), it's like swapping **rows** and **columns**.
- For a **3D array** (like a stack of images), `.T` reverses the order of all axes, so an array of shape `(a, b, c)` becomes one of shape `(c, b, a)`.

> The result is a **view**, not a copy, so it's **fast** and **memory-efficient**!

### Real-world Analogy for 2D Arrays

Imagine a classroom grade sheet:

**Original**:
|Names |   Math |  Science|   English|
|---|---|---|---|
|John        |85       |90         |88|
|Alice|       92|       89         |94|

**If we transpose this sheet, we switch rows and columns**:
|Subjects|   John|   Alice|
|---|---|---|
|Math|          85|      92|
|Science|       90|      89|
|English|       88|      94|

### Syntax
```python
ndarray.T
```
That's all there is to it: `.T` is an attribute, so there are no parameters to pass.
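
A quick sketch of how `.T` behaves for different numbers of dimensions, and of the fact that it returns a view. The array names are illustrative.

```python
import numpy as np

vec = np.array([1, 2, 3])
mat = np.arange(6).reshape(2, 3)
cube = np.arange(24).reshape(2, 3, 4)

print(vec.T.shape)   # (3,)      a 1D array is unchanged
print(mat.T.shape)   # (3, 2)    rows and columns swapped
print(cube.T.shape)  # (4, 3, 2) all axes reversed

# .T is a view: writing through it changes the original array
mat.T[0, 1] = 99
print(mat[1, 0])     # 99
```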

### Example 1 - Student Grades (the Scenario Discussed Above)



In [41]:


# Rows = students, Columns = subjects
grades = np.array([
    [85, 90, 88],   # John
    [92, 89, 94],   # Alice
    [78, 85, 80]    # Sam
])

# Transpose to get: Rows = subjects, Columns = students
grades_T = grades.T

print("Original Grades (Students × Subjects):")
print(grades)

print("\nTransposed Grades (Subjects × Students):")
print(grades_T)

Original Grades (Students × Subjects):
[[85 90 88]
 [92 89 94]
 [78 85 80]]

Transposed Grades (Subjects × Students):
[[85 92 78]
 [90 89 85]
 [88 94 80]]


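### Example 2 - Transposing a 4D Image Batch (for ML Library)

`.T` always reverses every axis, so a specific reordering such as channels-first needs the `transpose()` method with explicit axes.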

In [42]:


# 2 images, 2x2 pixels each, with 3 color channels (RGB)
images = np.array([
    [[[255, 0, 0], [0, 255, 0]],
     [[0, 0, 255], [255, 255, 0]]],  # Image 1

    [[[125, 125, 0], [0, 125, 125]],
     [[125, 0, 125], [50, 50, 50]]]   # Image 2
])

print("Original shape (2 images, 2x2 pixels, RGB):", images.shape)

# Transpose for a library that expects channels first: (2, 3, 2, 2)
transposed_images = images.transpose(0, 3, 1, 2)

print("Transposed shape (batch, channels, height, width):", transposed_images.shape)


Original shape (2 images, 2x2 pixels, RGB): (2, 2, 2, 3)
Transposed shape (batch, channels, height, width): (2, 3, 2, 2)


### 3. Numpy `swapaxes()` Function

Imagine you're organizing a multi-layered filing cabinet or stacked boxes where each dimension means something different — for example:

- **Axis 0** → Different years
- **Axis 1** → Different departments
- **Axis 2** → Monthly reports for each department

Now let’s say you're told:
> "Give me the data by month first, then department, then year."

That’s exactly what `swapaxes()` helps with — reordering how you look at the data, without actually moving or copying it.  
It's like rotating the cabinet drawers without touching the files inside.

### Syntax and concept
```python
np.swapaxes(array, axis1, axis2)
```
- It swaps the labels or perspectives of two axes.
- You don’t move data — you just change how you access or iterate over it.
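
The filing-cabinet scenario can be sketched with a small array. The sizes (2 years, 3 departments, 12 months) and the name `reports` are made up for illustration.

```python
import numpy as np

# Hypothetical report IDs: 2 years x 3 departments x 12 months
reports = np.arange(2 * 3 * 12).reshape(2, 3, 12)

# Swap the year axis (0) with the month axis (2)
by_month = np.swapaxes(reports, 0, 2)

print(by_month.shape)  # (12, 3, 2): month, department, year

# The same report is reached with the first and last index exchanged
same_report = by_month[5, 1, 0] == reports[0, 1, 5]
print(same_report)  # True

# No data was copied: both arrays share the same memory
print(np.shares_memory(reports, by_month))  # True
```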

### Example 1 - Sensor Data Example (Smartwatch)

Imagine you're collecting data from a smartwatch accelerometer that reports three values per reading:

- Accelerometer (X, Y, Z)

Over:

- **100 time steps**
- **10 different users**

You store this in a NumPy array shaped like:

```python
# Shape: (users, time_steps, axes)
(10, 100, 3)
```
Meaning:
- Axis 0 → Users

- Axis 1 → Time

- Axis 2 → X, Y, Z sensor values

Suppose you want to process the data per sensor axis, for example to compute the mean X, Y, and Z values over all time steps and users. One way to do that is to bring the sensor-axis dimension (axis 2) to the front.
 

In [43]:
# Simulate the data
data = np.random.randn(10, 100, 3) # users × time × axes (X, Y, Z)

reoriented = np.swapaxes(data, 0, 2)

# Now: axes × time × users
print(reoriented.shape)

(3, 100, 10)


### Example 2 - Image Data Example

Let’s say you're working with a colored image (RGB).  
A typical image might be represented as:

- **Shape**: Channels × Height × Width

Create a Sample Image

(3 channels for R, G, B, 64 pixels height, 64 pixels width)

In [44]:
# Channels × Height × Width
image = np.random.randint(0, 256, (3, 64, 64))  # upper bound is exclusive, so 256 covers 0-255
print("Original shape:", image.shape)

Original shape: (3, 64, 64)


**Problem** - 

Some image processing tools (like OpenCV or TensorFlow) expect images in:

- **Height × Width × Channels** format.

We need to rearrange the axes accordingly.

In [45]:
# Move Channels (axis 0) to the last position with two swaps:
# (C, H, W) -> swap 0 and 2 -> (W, H, C) -> swap 0 and 1 -> (H, W, C)
converted_image = np.swapaxes(np.swapaxes(image, 0, 2), 0, 1)
print("Converted shape:", converted_image.shape)
print("image is ready to be displayed or passed to libraries that expect (Height, Width, Channels) format.")

Converted shape: (64, 64, 3)
image is ready to be displayed or passed to libraries that expect (Height, Width, Channels) format.
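
Because the sample image is square, its shape alone cannot show which axis ended up where. A sketch with a non-square array (the sizes are arbitrary) makes the axis order visible and compares a single `swapaxes()` call with `np.moveaxis()` and `np.transpose()`:

```python
import numpy as np

image = np.zeros((3, 4, 5))  # Channels=3, Height=4, Width=5

swapped = np.swapaxes(image, 0, 2)         # exchanges Channels and Width
moved = np.moveaxis(image, 0, -1)          # moves Channels to the end
permuted = np.transpose(image, (1, 2, 0))  # explicit axis order

print(swapped.shape)   # (5, 4, 3): Width x Height x Channels
print(moved.shape)     # (4, 5, 3): Height x Width x Channels
print(permuted.shape)  # (4, 5, 3): Height x Width x Channels
```

A single swap of axes 0 and 2 leaves height and width exchanged, so `np.moveaxis()` or `np.transpose()` is usually the more direct way to reach (Height, Width, Channels).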


_____

## **Joining Arrays Operations**

In NumPy, joining arrays involves concatenating multiple arrays along specified axes.

### 1. Numpy `concatenate()` Function

**What Is It?**

The numpy.concatenate() function is used to join two or more arrays into a single array along an existing axis. It does not create new axes—it combines the arrays along one of the dimensions they already share.

This is useful in many real-world scenarios:

- Merging batches of sensor readings.

- Combining image data from multiple channels.

- Appending rows or columns to a dataset in machine learning or statistics.

For example, imagine you're collecting temperature data from two cities. Each city's data is a NumPy array. If you want to analyze them together, perhaps to compute combined trends, you'd concatenate the two arrays. Similarly, you may want to combine image data or append new data to a training set.

> **Important:** All input arrays must match in shape except along the specified axis. This is a common source of error.

**Syntax & Parameters**
```python

numpy.concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")
```

| Parameter | Description |
|:----------|:------------|
| `a1, a2, ...` | Arrays to be joined. They must have the same shape except in the dimension you're concatenating along. |
| `axis` | The axis along which to join. Default is 0. If `None`, all arrays are flattened first. |
| `out` | Optional output array to store result. |
| `dtype` | Optional type for the result array. Inferred by default. |
| `casting` | Rules about how data is cast (type-converted). Default is `"same_kind"` for safety. |
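
The shape rule can be seen with two small arrays that share a row count but not a column count. This is a minimal sketch; `a`, `b`, and the flag `mismatch_raised` are illustrative names.

```python
import numpy as np

a = np.ones((2, 2))
b = np.ones((2, 3))

# Joining along axis 1 works: both arrays have 2 rows
joined = np.concatenate((a, b), axis=1)
print(joined.shape)  # (2, 5)

# Joining along axis 0 fails: the column counts (2 vs 3) differ
try:
    np.concatenate((a, b), axis=0)
    mismatch_raised = False
except ValueError as e:
    mismatch_raised = True
    print("ValueError:", e)
```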


### Example 1 - Combining Student Test Scores (Rows and Columns)



In [46]:
# Each row is a student, each column is a subject
core_scores = np.array([[85, 90], [78, 88]])   # Math, Science
new_students = np.array([[80, 89], [76, 82]])  # More student scores

# Electives for the original students
elective_scores = np.array([[75, 85], [82, 79]])  # Art, Music

# Add new students (more rows) → axis=0
full_scores = np.concatenate((core_scores, new_students), axis=0)
print("new students added scores: \n", full_scores)

# Add new subjects (more columns) → axis=1
full_scores = np.concatenate((core_scores, elective_scores), axis=1)
print("\nnew subject added scores: \n",full_scores)

new students added scores: 
 [[85 90]
 [78 88]
 [80 89]
 [76 82]]

new subject added scores: 
 [[85 90 75 85]
 [78 88 82 79]]


### Example 2 - Flatten and Combine Simulation Data (Using axis=None)

You have two simulation result grids. Each is a 2D matrix, but for analysis (e.g. plotting histogram or ML), you want to flatten and join everything into a 1D list.

In [47]:
sim1 = np.array([[1, 2], [3, 4]])
sim2 = np.array([[5, 6], [7, 8]])

# axis=None flattens every input array before joining, so no ravel() call is needed
flat_combined = np.concatenate((sim1, sim2), axis=None)
print(flat_combined)

[1 2 3 4 5 6 7 8]


### Example 3 - Concatenating Arrays with Mixed Dimensions

In NumPy, you can **combine (concatenate) arrays** that start out with different numbers of dimensions, but only after making them compatible. `np.concatenate()` does not broadcast its inputs, so you reshape the lower-dimensional array yourself until its shape matches the others on every axis except the one you are joining along.

To adjust array shapes, you can use:

- `np.reshape()` – to change the overall shape.
- `np.expand_dims()` – to add a new axis (dimension).
- Indexing with `np.newaxis` – to insert a new axis directly, e.g. `arr[np.newaxis, :]`.

These tools help ensure that arrays align correctly along the axis you want to concatenate.

Let’s say you have a 1D array and a 2D array, and you want to combine them **along rows (axis=0)**. You can’t do this directly because their shapes don’t match. But by **expanding the 1D array into 2D**, you can make them compatible.

### ***Adding a New Student to a Gradebook***

Suppose you’re working with student scores in a school. You already have a 2D array where each row represents a student, and each column represents a test score. Now, you want to add scores of a new student, which are in a 1D array.

To add this 1D array as a new row to the 2D gradebook, you need to match the dimensions first.


In [48]:
# Existing gradebook: 3 students, 3 tests each
gradebook = np.array([
    [85, 90, 88],
    [78, 82, 84],
    [92, 95, 91]
])

# New student's scores (1D array)
new_student_scores = np.array([80, 85, 89])

# Expand the 1D array to 2D so it becomes a row
new_student_expanded = np.expand_dims(new_student_scores, axis=0)

# Concatenate the new student to the gradebook
updated_gradebook = np.concatenate((gradebook, new_student_expanded), axis=0)

print("Updated Gradebook:")
print(updated_gradebook)


Updated Gradebook:
[[85 90 88]
 [78 82 84]
 [92 95 91]
 [80 85 89]]
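
`np.expand_dims()` is one of several equivalent ways to turn a 1D array into a single row. A short sketch of three options, using illustrative variable names:

```python
import numpy as np

scores = np.array([80, 85, 89])

row_a = np.expand_dims(scores, axis=0)  # add a new leading axis
row_b = scores.reshape(1, -1)           # 1 row, column count inferred
row_c = scores[np.newaxis, :]           # index with np.newaxis

print(row_a.shape, row_b.shape, row_c.shape)  # (1, 3) (1, 3) (1, 3)
```

Any of the three results can be passed to `np.concatenate()` with `axis=0` alongside a 2D array that has three columns.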


### 2. Numpy `vstack()` Function

The `numpy.vstack()` function is used to **stack multiple arrays vertically**, meaning it joins them row by row into a single array. This is especially useful when dealing with data that is naturally organized in rows, such as time-series logs, datasets where each row is a record, or tabular data like spreadsheets.

Think of it like this:
> Imagine each array is a sheet of paper with data in a row. When you use vstack(), you're placing each sheet on top of the next, forming a taller stack of rows.

**When to Use vstack()?**

- When you are accumulating data row-wise — like adding daily records or hourly sensor readings.

- When you are combining 1D or 2D arrays where each one represents a row or block of rows.

- When working with machine learning or data science where rows represent samples and columns represent features.

**Key requirements:**
- All arrays must have the same number of columns.

- If using 1D arrays, each must have the same length and will be converted to a row.

**Syntax & parameter**
```python
numpy.vstack(tup)
```

`tup`: A sequence (tuple or list) of arrays to stack. 1-D arrays are treated as single rows, and all arrays must match in shape except along axis 0 (rows).

### Example 1 - Logging Sensor Data Over Time

Suppose you're collecting temperature sensor readings every hour from different rooms. Each hourly reading is stored as a 1D array.
Now you want to build a table where:

Rows = time snapshots (1pm, 2pm, 3pm)

Columns = rooms (Room A, B, C)

In [49]:
reading_1pm = np.array([23.5, 24.1, 22.8])  # Sensors: Room A, B, C
reading_2pm = np.array([23.9, 24.3, 23.1])
reading_3pm = np.array([24.2, 24.5, 23.4])

log_table = np.vstack([reading_1pm, reading_2pm, reading_3pm])
log_table

array([[23.5, 24.1, 22.8],
       [23.9, 24.3, 23.1],
       [24.2, 24.5, 23.4]])

### Example 2 - Adding New Rows to an Existing Dataset

You have a 2D dataset of user data. A new user signs up, and you receive their data as a 1D array.

In [50]:
user_data = np.array([[101, 25], [102, 30]])  # [User ID, Age]

# new user
new_user = np.array([103, 27])

# you need to stack the data of new user to user data

user_data = np.vstack([user_data, new_user])
user_data

array([[101,  25],
       [102,  30],
       [103,  27]])

### Example 3 - Mixing 1D and 2D arrays

You can mix 1D and 2D arrays as long as the dimensions are compatible.

In [51]:
a = np.array([1, 2, 3])             # Shape (3,)
b = np.array([[4, 5, 6], [7, 8, 9]])  # Shape (2, 3)

result = np.vstack((a, b))

# NumPy automatically promotes the 1D array into a row vector with shape (1, 3) before stacking.
result

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
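
The promotion step can be written out by hand. The sketch below reproduces the `vstack()` result with `np.atleast_2d()` followed by `np.concatenate()`; it illustrates the idea and is not a copy of NumPy's internal code.

```python
import numpy as np

a = np.array([1, 2, 3])               # shape (3,)
b = np.array([[4, 5, 6], [7, 8, 9]])  # shape (2, 3)

stacked = np.vstack((a, b))

# Promote each input to at least 2D, then join along axis 0
manual = np.concatenate((np.atleast_2d(a), np.atleast_2d(b)), axis=0)

print(np.array_equal(stacked, manual))  # True
print(stacked.shape)                    # (3, 3)
```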

### 3. Numpy `hstack()` Function

**What Is It?**

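`numpy.hstack()` is a NumPy function that stacks arrays **horizontally**, i.e., side by side. For 2D and higher arrays this means joining along axis 1; 1D arrays are simply joined end to end along their only axis.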

This function is commonly used for tasks where you need to:

- Join multiple arrays where each array represents different attributes of the same entities.

- Data enrichment: Adding new feature columns to a dataset.

- Merging structured data: Joining multiple 2D arrays into a larger table of data.

- Combining vectors for modeling: Say you have X_coordinates and Y_coordinates of points; reshape each into a column and use hstack() to merge them into one 2D matrix [[x1, y1], [x2, y2], ...].

**Important notes:**

>All 2D arrays must have the same number of rows. `hstack()` does not broadcast, so a 1D array must be reshaped into a column before it can be stacked next to a 2D array.

>`reshape(-1, 1)` is often necessary to make 1D arrays compatible with 2D horizontal stacking.

**Syntax and Parameters**

```python
numpy.hstack(tup, *, dtype=None, casting='same_kind')
```

| Parameter | Description |
|:----------|:------------|
| `tup` | A sequence (tuple/list) of arrays to be horizontally stacked. All must match in row count (axis 0). |
| `dtype` | Optional. If provided, the resulting array will have this data type. Useful for type consistency. |
| `casting` | Controls how strictly data types can be converted. Commonly used when `dtype` is specified. Default is 'same_kind'. |

### Example 1: Combining Demographic Data

Suppose you’re building a dataset for analysis where:

- One array holds ages of users.

- Another holds salaries.

You want to create a single 2D array where each row is [age, salary].



In [52]:
ages = np.array([25, 30, 45])
salaries = np.array([50000, 60000, 80000])

# Stacking the 1D arrays directly joins them end to end into one long 1D array
user_data = np.hstack((ages, salaries))
print("before reshape", user_data)

# Reshape them to column vectors to allow horizontal stacking
ages = ages.reshape(-1, 1)  # -1 lets NumPy calculate the number of rows
salaries = salaries.reshape(-1, 1)
user_data = np.hstack((ages, salaries))
print("\nafter reshape: ")
user_data



before reshape [   25    30    45 50000 60000 80000]

after reshape: 


array([[   25, 50000],
       [   30, 60000],
       [   45, 80000]])
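
As an alternative to reshaping by hand, `np.column_stack()` treats each 1D array as a column. A brief sketch on the same data, with the reshape-based version shown for comparison:

```python
import numpy as np

ages = np.array([25, 30, 45])
salaries = np.array([50000, 60000, 80000])

# column_stack turns each 1D array into a column before joining
table = np.column_stack((ages, salaries))

# Same result as reshaping to columns and using hstack
via_hstack = np.hstack((ages.reshape(-1, 1), salaries.reshape(-1, 1)))

print(table.shape)                        # (3, 2)
print(np.array_equal(table, via_hstack))  # True
```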

### Example 2: Augmenting Data with Derived Features

Let’s say you have a dataset of test scores and now, you want to add a new column — the average score of each student — calculated from the existing ones.

In [53]:
scores = np.array([[85, 90], [78, 82], [92, 88]])  # Columns: [Math, English]

averages = scores.mean(axis=1).reshape(-1, 1)  # per-student average as a column; mean() is covered later in this tutorial

aug_scores = np.hstack((scores, averages))
print("[Math, English, Average] scores: \n",aug_scores)

[Math, English, Average] scores: 
 [[85.  90.  87.5]
 [78.  82.  80. ]
 [92.  88.  90. ]]


### 4. Numpy `split()` Function

**What Is It?**

The `numpy.split()` function is used to divide an array into multiple sub-arrays along a specific axis. It helps when you want to:

- Split large datasets into smaller, easier-to-handle parts
- Group arrays into batches for processing
- Organize data into different sections

**Imagine this:**  
You have a long list of data coming from a sensor or a server — maybe it's temperature readings every second.  
Instead of working with the whole thing at once, you can use `numpy.split()` to **divide it into smaller chunks** — like one-minute or one-hour parts — and work with them separately!

That's exactly what `numpy.split()` is made for!

**Syntax & Parameters**
```python
numpy.split(ary, indices_or_sections, axis=0)
```

| Parameter             | Description |
|-----------------------|-------------|
| `ary` (array_like)    | The array you want to split. |
| `indices_or_sections` | Decides how the array is split:<br>– If it's an **integer**, the array is split into **equal parts**.<br>– If it's a **1D list**, splits happen at those **index positions**. |
| `axis` (int, optional)| The axis along which to split.<br>Default is `0` (split **row-wise**).<br>Use `axis=1` to split **column-wise**. |
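
With an integer argument, `np.split()` requires the array to divide evenly. When it does not, `np.array_split()` is a related NumPy function that allows unequal parts. A minimal sketch with illustrative names:

```python
import numpy as np

values = np.arange(10)

# 10 elements cannot be divided into 3 equal parts
try:
    np.split(values, 3)
    uneven_raised = False
except ValueError as e:
    uneven_raised = True
    print("ValueError:", e)

# array_split distributes the remainder over the first parts
parts = np.array_split(values, 3)
print([len(p) for p in parts])  # [4, 3, 3]
```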

### Example 1: Splitting Survey Responses

You have collected a flat array of 30 survey answers, and each person answered 5 questions. You want to group these answers by person.



In [54]:
responses = np.arange(30)  # Simulated answers: 0 to 29

# Split into 6 parts, each representing 1 person's 5 responses(6 splits of 5 responses each)
grouped_responses = np.split(responses, 6)

for i, person in enumerate(grouped_responses):
    print(f"Person {i+1}'s responses: {person}")

Person 1's responses: [0 1 2 3 4]
Person 2's responses: [5 6 7 8 9]
Person 3's responses: [10 11 12 13 14]
Person 4's responses: [15 16 17 18 19]
Person 5's responses: [20 21 22 23 24]
Person 6's responses: [25 26 27 28 29]


### Example 2: Separating Data Columns

You receive a 2D array with customer data in 3 columns: [CustomerID, Age, Spending]. You want to separate the ID column, the age column, and the remaining data.

In [55]:
data = np.array([
    [101, 25, 200],
    [102, 30, 150],
    [103, 22, 180],
])

# Split at column indices 1 and 2, giving three single-column pieces
ids, age_col, rest = np.split(data, [1, 2], axis=1)  # ids - customer IDs, age_col - ages, rest - remaining info
print("Customer IDs:\n", ids)
print("Ages:\n", age_col)
print("Remaining Info:\n", rest)

Customer IDs:
 [[101]
 [102]
 [103]]
Ages:
 [[25]
 [30]
 [22]]
Remaining Info:
 [[200]
 [150]
 [180]]


> See also NumPy's **`hsplit()`** and **`vsplit()`**, which follow the same horizontal/vertical convention as the `hstack()` and `vstack()` functions covered earlier.
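
A short sketch of both functions on the customer data from the previous example. The split positions are chosen only for illustration.

```python
import numpy as np

data = np.array([
    [101, 25, 200],
    [102, 30, 150],
    [103, 22, 180],
])

# hsplit cuts between columns: the ID column versus everything else
ids, rest = np.hsplit(data, [1])

# vsplit cuts between rows: the first two customers versus the last one
top, bottom = np.vsplit(data, [2])

print(ids.shape, rest.shape)    # (3, 1) (3, 2)
print(top.shape, bottom.shape)  # (2, 3) (1, 3)
```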

----

## **Sorting and Searching**

NumPy provides efficient functions to sort and search arrays, making data handling and analysis faster and easier.

### 1. Numpy `sort()` Function

The numpy.sort() function is used to sort elements of an array in ascending order by default. This sorting can be performed:

- On a flat (1D) array,

- On a 2D or bigger array — you can choose to sort each row or each column separately by setting the axis parameter.

It is important to know that this function returns a new sorted array and does not modify the original array in-place.

**Syntax & Parameters**
```python
numpy.sort(a, axis=-1, kind=None, order=None)
```
| Parameter | Description |
|:----------|:------------|
| `a` | The array to be sorted. |
| `axis` | Axis along which to sort. Default is -1 (last axis). <br> For 2D arrays: `0` = sort down each column, `1` = sort along each row. |
| `kind` | Sorting algorithm to use: `'quicksort'` (default), `'mergesort'`, `'heapsort'`, `'stable'`. |
| `order` | For structured arrays. Specifies which field(s) to sort by. |

### Example 1 -  Sort Student Marks Row-wise Using a Stable Sort

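You have a record of marks scored by students in three subjects and want to sort each student's scores with a stable algorithm.

Important point:
> `stable` (via `kind='stable'`) ensures that elements that compare equal retain their original relative order. With plain numbers, equal values are indistinguishable, so stability becomes visible mainly with `argsort()` or when sorting records by a key.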

In [56]:
# Rows = Students, Columns = Scores in [Math, Physics, Chemistry]
marks = np.array([
    [75, 90, 75],
    [88, 70, 70],
    [92, 92, 85]
])


sorted_marks = np.sort(marks, axis=1, kind='stable')
# kind='stable' → Ensures that repeated scores like [75, 90, 75] keep the first 75 before the second.

print("Original Marks:\n", marks)
print("\nSorted Marks (Row-wise, stable):\n", sorted_marks)


Original Marks:
 [[75 90 75]
 [88 70 70]
 [92 92 85]]

Sorted Marks (Row-wise, stable):
 [[75 75 90]
 [70 70 88]
 [85 92 92]]


### Example 2 - Sort Product Data (Structured Array) by Name and Then by Price

You're managing product data with fields: name, price, and rating. You want to sort products by name, and also optionally by price when names are the same.

In [57]:
# Structured array: Products with name, price, and rating
products = np.array([
    ('Banana', 40, 4.5),
    ('Apple', 30, 4.7),
    ('Banana', 35, 4.6),
    ('Mango', 50, 4.2)
], dtype=[('name', 'U10'), ('price', int), ('rating', float)])

sorted_name_price = np.sort(products, order=['name', 'price'])
print("Original Product List:\n", products)
print("\nSorted by Name, then Price:\n", sorted_name_price)

Original Product List:
 [('Banana', 40, 4.5) ('Apple', 30, 4.7) ('Banana', 35, 4.6)
 ('Mango', 50, 4.2)]

Sorted by Name, then Price:
 [('Apple', 30, 4.7) ('Banana', 35, 4.6) ('Banana', 40, 4.5)
 ('Mango', 50, 4.2)]


### Example 3: Sort Log Records by Time, Using order and kind='mergesort'

You're analyzing log data where each entry is a record with two fields: `timestamp` and `severity`. You want to sort the logs by the `timestamp` field with a stable algorithm. Note that when `order` names only some fields, NumPy breaks ties using the remaining fields in dtype order, so entries with the same timestamp end up ordered by `severity`.


In [58]:
# Logs as structured array : [timestamp, severity]
logs = np.array([
    (3, 1),  # [timestamp, severity]
    (1, 3),
    (3, 2),
    (2, 1)
], dtype=[('timestamp', int), ('severity', int)])

# Sort by 'timestamp' using stable sort
sorted_logs = np.sort(logs, order='timestamp', kind='mergesort')

print("Original Logs:\n", logs)

print("Sorted Logs:\n", sorted_logs)

Original Logs:
 [(3, 1) (1, 3) (3, 2) (2, 1)]
Sorted Logs:
 [(1, 3) (2, 1) (3, 1) (3, 2)]


### 2. Numpy `argsort()` Function

**What is it?**

It doesn't sort the array itself. Instead, it returns the indices that would sort the array: the first index points to the smallest element, the second to the next smallest, and so on.

This is useful when:

- You want to rearrange another array based on how one array would be sorted.

- You're dealing with complex data and only want the sorted order (not the values).

- You need stable or specific sorting algorithms for consistency or performance.

Simply put, `np.argsort()` tells you how to reorder your data to get it sorted, without actually sorting it for you. It returns the indices of the array in the order you should use to get a sorted version.

**Syntax & Parameters**
```python
numpy.argsort(a, axis=-1, kind=None, order=None)
```
> The parameters have the same meaning as in NumPy's `sort()` function described above.


### Example 1 -  Ranking Students with Tied Scores (using kind='stable')

You want to sort scores but preserve the order of equal elements.

In [59]:
scores = np.array([90, 70, 90, 85])
sort_preserve = np.argsort(scores, kind='stable')

print("Original Scores:", scores)
print("Sorted Indices:", sort_preserve)
print("Scores in Sorted Order:", scores[sort_preserve])

# as you can see in output the original array is not disturbed

Original Scores: [90 70 90 85]
Sorted Indices: [1 3 0 2]
Scores in Sorted Order: [70 85 90 90]


### Example 2 - Sorting Game Scores per Player (using axis=1)

Each row is a player, and you want to find the order of their scores across games.

In [60]:
scores = np.array([
    [20, 10, 30],  # Player 1
    [5, 25, 15]    # Player 2
])

indices = np.argsort(scores, axis=1)
print("Original Scores:\n", scores)
print("Argsort Indices Row-wise:\n", indices)

# Let's sort using the indices to see sorted row values
sorted_scores = np.take_along_axis(scores, indices, axis=1)
print("Sorted Scores Row-wise:\n", sorted_scores)

Original Scores:
 [[20 10 30]
 [ 5 25 15]]
Argsort Indices Row-wise:
 [[1 0 2]
 [0 2 1]]
Sorted Scores Row-wise:
 [[10 20 30]
 [ 5 15 25]]


### Example 3 - E-commerce Products Sorted by Price & Rating (using order)

You want to sort products by price, and then by rating if prices are equal. The sorted result is then reversed so that the most expensive, best-rated products come first.

In [61]:
products = np.array([
    ('Keyboard', 100, 4.5),
    ('Mouse', 50, 4.8),
    ('Monitor', 100, 4.2)
], dtype=[('name', 'U10'), ('price', 'i4'), ('rating', 'f4')])

indices = np.argsort(products, order=('price', 'rating'))
sorted_products = products[indices][::-1] # [::-1] reverses the sorted result to get descending order

print("Original Products:\n", products)
print("Sorted Indices:", indices)
print("Sorted Products in descending:\n", sorted_products)

Original Products:
 [('Keyboard', 100, 4.5) ('Mouse',  50, 4.8) ('Monitor', 100, 4.2)]
Sorted Indices: [1 2 0]
Sorted Products in descending:
 [('Keyboard', 100, 4.5) ('Monitor', 100, 4.2) ('Mouse',  50, 4.8)]


### 3. Numpy `argmax()/argmin()` Function

**What is numpy.argmax()?**

`numpy.argmax()` returns the `index` of the maximum value in an array. If the maximum occurs multiple times, it gives the index of the `first occurrence`.

This is especially useful in tasks where you're not just interested in the `maximum value`, but in `where` that maximum occurs (like choosing a winner based on highest score, or finding the best-performing model/parameter).

**What is numpy.argmin()?**

`numpy.argmin()` returns the index (or indices) of the smallest value in an array. It’s commonly used in situations where you need to identify the position of the lowest score, smallest measurement, or minimum cost, etc.


**Syntax & Parameters of `argmax()` and `argmin()`**
```python
numpy.argmax(a, axis=None, out=None, *, keepdims=<no value>)

numpy.argmin(a, axis=None, out=None, *, keepdims=<no value>)
```

| Parameter | Description |
|:----------|:------------|
| `a` | The input array in which to find the index of the maximum (`argmax`) or minimum (`argmin`) value. |
| `axis` | (Optional) The axis along which to search:<br> - `None` (default): Flattens the array and returns the index of the global max/min.<br> - `0`: Column-wise index.<br> - `1`: Row-wise index. |
| `out` | (Optional) An array where the result is stored. Must be same shape/type as output. |
| `keepdims` | (Optional) If `True`, the reduced axis is kept as a dimension with size 1, making output shape compatible with broadcasting. |
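
With `axis=None` the returned index refers to the flattened array. A minimal sketch of converting it back to a row and column position with `np.unravel_index()`, assuming a hypothetical `sales` array:

```python
import numpy as np

# Hypothetical sales figures: rows = stores, columns = months
sales = np.array([
    [120, 340, 210],
    [560, 130, 480],
])

flat_index = np.argmax(sales)   # index into the flattened array
row, col = np.unravel_index(flat_index, sales.shape)

print("Flat index:", flat_index)
print("Row, column:", (int(row), int(col)))
print("Maximum value:", sales[row, col])
```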

### Example 1 - Selecting the Best Product Based on Ratings

Out of several products, find which product has the highest average rating.


In [62]:
# Rows = Products, Columns = Ratings from different users
ratings = np.array([
    [4.2, 4.5, 4.0],   # Product A
    [3.9, 4.8, 4.1],   # Product B
    [4.5, 4.4, 4.3]    # Product C
])

average_mean = ratings.mean(axis=1)

avg_max = np.argmax(average_mean)

print("Average Ratings:", average_mean)

print("Best Product Index:", avg_max)

Average Ratings: [4.23333333 4.26666667 4.4       ]
Best Product Index: 2


### Example 2 - Classroom Exam – Find Top Scorer in Each Subject

Identify the student with the highest score in each subject.

In [63]:
# Rows = Students, Columns = Subjects
scores = np.array([
    [85, 92, 78],   # Student 1
    [88, 91, 90],   # Student 2
    [75, 94, 85]    # Student 3
])

# Column-wise argmax → axis=0 (which student topped each subject)
topper_indices = np.argmax(scores, axis=0)

print("Topper Indices by Subject:", topper_indices)

Topper Indices by Subject: [1 2 1]


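### Example 3 - Tracking Winning Teams in a Sports League (with out and keepdims)

For each match week, find which team scored the most, store the result in a pre-allocated array, and keep the result two-dimensional.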

In [64]:
# Rows = Match weeks, Columns = Teams
scores = np.array([
    [3, 2, 1],   # Week 1
    [0, 5, 2],   # Week 2
    [1, 1, 4]    # Week 3
])

out_array = np.empty((3, 1), dtype=int)
win_teams = np.argmax(scores, axis=1, out=out_array, keepdims=True)
print("Scores:\n", scores)
print("Winning Team Indices (kept as 2D):\n", win_teams)

Scores:
 [[3 2 1]
 [0 5 2]
 [1 1 4]]
Winning Team Indices (kept as 2D):
 [[0]
 [1]
 [2]]


### Example 4 - Comparing Employee Productivity (using both argmin and argmax)

In [65]:
# Tasks Completed Per Day by Each Employee
# Rows = employees (E1, E2, E3)
# Columns = days (Day1, Day2, Day3, Day4)
tasks = np.array([
    [5, 7, 8, 6],    # Employee 1
    [9, 4, 3, 5],    # Employee 2
    [6, 8, 9, 7]     # Employee 3
])

# total tasks for each row (employee). So we sum across columns (axis=1).
total_tasks = np.sum(tasks, axis=1)
print("Total tasks per employee:", total_tasks)

# Most & Least Productive Employees
most_productive_idx = np.argmax(total_tasks)
least_productive_idx = np.argmin(total_tasks)
print("Most productive employee index:", most_productive_idx)
print("Least productive employee index:", least_productive_idx)

print("Most productive employee's task total:", total_tasks[most_productive_idx])
print("Least productive employee's task total:", total_tasks[least_productive_idx])

# argmax() → Employee 3 (index 2) did the most
# argmin() → Employee 2 (index 1) did the least

Total tasks per employee: [26 21 30]
Most productive employee index: 2
Least productive employee index: 1
Most productive employee's task total: 30
Least productive employee's task total: 21


### 4. Numpy `where()` Function

The numpy.where() function performs element-wise conditional logic on arrays. It can:

- Return indices of array elements that satisfy a condition.

- Select values from two options (x and y) based on whether a condition is True or False.

It serves as a vectorized version of the if-else statement, making it especially powerful for array-based decision-making.

>When called with only a condition, **np.where()** returns a tuple of index arrays, one per dimension. For a 2D array that means one array of row indices and one of column indices for the matching elements.

**Syntax and parameters**
```python
numpy.where(condition, [x, y])
```

| Parameter | Description |
|:----------|:------------|
| `condition` | A boolean array or condition expression. For each element, this condition determines whether to choose a value from `x` or `y`, or to return its index. |
| `x` | Values to use where the condition is `True`. This can be an array, scalar, or expression. |
| `y` | Values to use where the condition is `False`. `condition`, `x`, and `y` must be broadcastable to a common shape. |
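
A minimal sketch of both calling styles on a 1D array, assuming a hypothetical `temps` array of temperatures:

```python
import numpy as np

# Hypothetical daily temperatures in degrees Celsius
temps = np.array([18, 25, 31, 22, 35])

# Condition only: a tuple with one index array per dimension
hot_days = np.where(temps > 30)
print("Indices of hot days:", hot_days[0])

# Condition with x and y: pick element-wise between two choices
capped = np.where(temps > 30, 30, temps)
print("Temperatures capped at 30:", capped)
```

The scalar `30` is broadcast against `temps`, so `x` and `y` do not need to be full arrays.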

### Example - Inventory Management for a Supermarket

You are managing inventory for a supermarket. You have the current stock levels of several products, and you want to do the following:

- Identify which items are out of stock (i.e., stock = 0).

- Label them as 'Restock' if they are out of stock, or 'Sufficient' if they are available.

- Additionally, you want to store the index of the first item that needs restocking in a pre-allocated output array. Note that `np.where()` has no `out` parameter, so the result is copied into the array after the call.

- You want a per-row restock flag that keeps the original number of dimensions. This is done with `np.any(..., keepdims=True)`, since `np.where()` has no `keepdims` parameter.


In [66]:
# Step 1: 3x3 - Inventory data (each item represents quantity in stock)
stock = np.array([[3, 0, 5],
                  [0, 6, 2],
                  [1, 0, 4]])

# Step 2: Label each item as 'Restock' if its stock is 0, otherwise 'Sufficient'
label = np.where(stock==0, 'Restock', 'Sufficient')
print("Stock Matrix:\n", stock)
print("Labels Matrix:\n", label)

# Getting the Indices of Products That Need Restocking
# Step 3: Only the condition, to get indices of out-of-stock items
out_of_stock_indices = np.where(stock==0)
print("\nIndices of out-of-stock products:", out_of_stock_indices)

# Storing index information in a pre-allocated array
# Step 4: Flattened version of stock array to find first 0
flat_stock = stock.flatten()

# Pre-allocated 0-dimensional output array to store the result
output = np.empty((), dtype=int)

# np.where() has no out parameter, so get the index of the first zero
# and then copy it into 'output'
first_zero_index = np.where(flat_stock == 0)[0][0]
output[...] = first_zero_index
# output[...] is NumPy’s way of writing “assign this value to the entire output scalar.”

print("Index of first out-of-stock product in flattened array:", output)


# Maintain Dimensionality with keepdims=True (When Used in Combination)
# Step 5: Check which rows need restocking using any(), preserving shape
restock_rows = np.any(stock == 0, axis=1, keepdims=True)

print("Restock needed (by row):\n", restock_rows)



Stock Matrix:
 [[3 0 5]
 [0 6 2]
 [1 0 4]]
Labels Matrix:
 [['Sufficient' 'Restock' 'Sufficient']
 ['Restock' 'Sufficient' 'Sufficient']
 ['Sufficient' 'Restock' 'Sufficient']]

Indices of out-of-stock products: (array([0, 1, 2]), array([1, 0, 1]))
Index of first out-of-stock product in flattened array: 1
Restock needed (by row):
 [[ True]
 [ True]
 [ True]]


------

## **Broadcasting**

### 1. Numpy `broadcast()` Function

The numpy.broadcast() class mimics NumPy's internal broadcasting rules. It lets you "manually" explore and iterate over broadcasted arrays without creating huge expanded versions in memory.

In simple terms: it lets arrays with different shapes work together as if they had the same shape, without creating a bigger array in memory; broadcasting simulates it.

**Syntax**

```python
numpy.broadcast(*array_like)
```

**Parameters**
- `*array_like`: One or more arrays that you want to broadcast. These arrays must be compatible in shape according to NumPy's broadcasting rules.

**Returns**

Returns a numpy.broadcast object which:

- Has a .shape attribute (the broadcasted shape).

- Can be iterated element-wise.

- Supports .index and .iters to inspect broadcasting steps.
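
A short sketch of inspecting a broadcast object, and of what happens when shapes are not compatible. The arrays `row` and `col` are illustrative assumptions.

```python
import numpy as np

row = np.array([1, 2, 3])       # shape (3,)
col = np.array([[10], [20]])    # shape (2, 1)

b = np.broadcast(row, col)
print("Broadcast shape:", b.shape)
print("Number of elements:", b.size)

# Shapes (2, 3) and (4,) cannot be broadcast together
try:
    np.broadcast(np.ones((2, 3)), np.ones(4))
    compatible = True
except ValueError as err:
    compatible = False
    print("Incompatible shapes:", err)
```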

### Example with explanation

Let’s walk through this with a real-world example, starting from scratch.

Imagine a shop that sells 3 products:

- Product A

- Product B

- Product C

It tracks the prices for 2 days:

In [67]:
# 2 rows for 2 days, 3 columns for 3 products
daily_prices = np.array([
    [100, 200, 300],  # Day 1
    [110, 210, 310]   # Day 2
])


# One discount for each product
discount = np.array([10, 20, 30])


Here we have the daily prices and the discounts, and the store wants to subtract the discount for each product.

Usually you do not need to request broadcasting explicitly; the expression below applies it automatically to complete the operation.

In [68]:
final_prices = daily_prices - discount
final_prices

array([[ 90, 180, 270],
       [100, 190, 280]])

**Background**
```text
daily_prices:
[[100 200 300]
 [110 210 310]]

discount (broadcast version):
[[10 20 30]
 [10 20 30]]

Result after subtraction:
[[ 90 180 270]
 [100 190 280]]
```

But this broadcasted [[10, 20, 30], [10, 20, 30]] was never created in memory — NumPy just pretends it’s there. That saves memory.

>The np.broadcast() class lets you manually see this process.

In [69]:
daily_prices = np.array([
    [100, 200, 300],
    [110, 210, 310]
])

discount = np.array([10, 20, 30])

# Create a broadcast object (no actual array created!)
b = np.broadcast(daily_prices, discount)

print("Shape:", b.shape)  # Output: (2, 3)

# You Can Loop Through the Broadcasted Pairs
for price, disc in b:
    print(f"Original Price: {price}, Discount: {disc}, Final Price: {price - disc}")

Shape: (2, 3)
Original Price: 100, Discount: 10, Final Price: 90
Original Price: 200, Discount: 20, Final Price: 180
Original Price: 300, Discount: 30, Final Price: 270
Original Price: 110, Discount: 10, Final Price: 100
Original Price: 210, Discount: 20, Final Price: 190
Original Price: 310, Discount: 30, Final Price: 280


**To sum it up**

numpy.broadcast() shows how NumPy matches two (or more) arrays of different shapes so they can work together in calculations, without actually copying or expanding any data.

**Imagine this:**
You want to add a small array (like bonus marks for 3 subjects) to a big array (like marks of 4 students in 3 subjects).
But their shapes don’t match exactly.

Instead of making you loop through every row manually, NumPy:

- Virtually stretches the small array to match the big one.

- Lets you do the addition directly, which is fast and memory-efficient.
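
The bonus-marks scenario can be sketched as follows. The values in `marks` and `bonus` are made up for illustration.

```python
import numpy as np

# Hypothetical marks: 4 students (rows) x 3 subjects (columns)
marks = np.array([
    [70, 65, 80],
    [55, 90, 60],
    [88, 72, 95],
    [40, 58, 77],
])

# One bonus value per subject, shape (3,)
bonus = np.array([5, 2, 0])

# bonus is broadcast across all 4 rows; no loop and no tiled copy is needed
final_marks = marks + bonus
print(final_marks)
```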

### 2. Numpy `expand_dims()` Function

The numpy.expand_dims() function **adds a new axis (dimension)** to an existing NumPy array at a specified position.
This is often used when:

- You want to change the shape of an array to make it compatible with broadcasting or machine learning models.

- You need to reshape data for operations like stacking, batch processing, or feeding into functions that require specific input shapes.

**Syntax & Parameters**
```python
numpy.expand_dims(a, axis)
```

| Parameter | Description |
|:----------|:------------|
| `a` | The input array you want to expand. |
| `axis` | The position (index) where the new axis should be added. Can be positive (from beginning) or negative (from end). |

**Returns**

Returns a view of the array with one more dimension (without copying the data).
The shape changes, but the data remains the same.
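
A minimal sketch of how the `axis` argument changes the shape, and a check that the result shares memory with the original array. The `prices` array is an illustrative assumption.

```python
import numpy as np

prices = np.array([100, 200, 300])          # shape (3,)

as_row = np.expand_dims(prices, axis=0)     # shape (1, 3)
as_col = np.expand_dims(prices, axis=-1)    # shape (3, 1)
print(as_row.shape, as_col.shape)

# The result shares memory with the original array (no data is copied)
print("Shares memory:", np.shares_memory(prices, as_col))

# A (3, 1) column and a (3,) row broadcast to a (3, 3) table of differences
diff_table = as_col - prices
print(diff_table)
```

Adding the axis is what makes the two operands broadcast to a full table instead of subtracting element by element.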

### Example scenario where this function is used

**Preparing Data for a Machine Learning Model :**

Imagine you're building a machine learning model to predict stock prices.
Your raw data for one day is stored in a 1D array:

In [70]:
daily_features = np.array([100.5, 101.0, 99.8])  # [Open, High, Low]
daily_features

array([100.5, 101. ,  99.8])

But your ML model expects input in the shape:
(batch_size, timesteps, features) — e.g., (1, 1, 3) for a single day.

In [71]:
# You can reshape your daily_features from shape (3,) to (1, 1, 3) like this:

# Step 1: Add a batch dimension → shape becomes (1, 3)
step1 = np.expand_dims(daily_features, axis=0)

# Step 2: Add a timestep dimension → shape becomes (1, 1, 3)
finalStep = np.expand_dims(step1, axis=1)

print("Final input shape for model:", finalStep.shape)
print("\nFinal array with expanded dimension:", finalStep)

# Now this array can be fed to the model

Final input shape for model: (1, 1, 3)

Final array with expanded dimension: [[[100.5 101.   99.8]]]


**To sum it up**

> "np.expand_dims() adds a new axis to your array at the position you choose — helping you reshape data for ML, broadcasting, and more — without changing the data itself."

- Rely on broadcasting when you just want to perform operations like a + b without modifying shapes; np.broadcast() lets you inspect how the shapes line up.

- Use np.expand_dims() when you need to change the array's shape explicitly (e.g., for ML models or explicit shape matching).

------

## **Set Operations**

NumPy provides functions to perform set-based operations on arrays—like finding unique values, combining arrays (union), finding common elements (intersection), and identifying differences. These are especially helpful when working with distinct or non-repeating values in your data.

### 1. Numpy `unique()` Function

**np.unique()** function is used to find all the unique (non-repeating) elements in a NumPy array. In addition to identifying the unique elements, it can also return:

- The indices where those unique elements first appear in the original array.

- An array of indices that can be used to reconstruct the original array from the unique elements.

- The count of occurrences of each unique element in the original array.

**Syntax & Parameters**
```python
numpy.unique(ar, return_index=False, return_inverse=False, return_counts=False, axis=None)
```

| Parameter | Description |
|:----------|:------------|
| `ar` | Input array. Unless `axis` is specified, it is flattened if it is not already 1-dimensional. |
| `return_index` | If `True`, returns the indices of the first occurrences of the unique values in the original array. |
| `return_inverse` | If `True`, returns the indices to reconstruct the original array from the unique values. |
| `return_counts` | If `True`, returns the number of times each unique item appears. |
| `axis` *(optional)* | If specified, finds unique values along the specified axis (advanced use). |


**Return Values**

Depending on the combination of optional flags (return_index, return_inverse, return_counts), the function returns:

- Just the unique values (if no flags are True)

- Or a tuple including:

    - unique: Sorted unique values

    - index: Indices of first occurrences (if return_index=True)

    - inverse: Indices to reconstruct original (if return_inverse=True)

    - counts: Count of each unique element (if return_counts=True)
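
A minimal 1D sketch with all three flags turned on. The `orders` array is a hypothetical example.

```python
import numpy as np

# Hypothetical product categories ordered by customers
orders = np.array(["tea", "coffee", "tea", "juice", "coffee", "tea"])

values, first_index, inverse, counts = np.unique(
    orders, return_index=True, return_inverse=True, return_counts=True
)

print("Unique values:", values)          # sorted alphabetically
print("First occurrence:", first_index)
print("Inverse:", inverse)
print("Counts:", counts)

# The inverse indices rebuild the original array from the unique values
print("Rebuilt equals original:", np.array_equal(values[inverse], orders))
```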

### Example Scenario - Student Grades Table

Imagine you’re analyzing a small class's test scores:

In [72]:
grades = np.array([
    [90, 80, 70],
    [90, 80, 70],
    [85, 75, 65],
    [90, 80, 70],
    [85, 75, 65]
])

Each row represents a student’s scores in 3 subjects.

You want to know:

- How many unique sets of scores exist (i.e., unique students by marks).

- Which rows (students) are duplicates.

- Optionally, get how many times each set of scores appears.

Let’s use np.unique() with axis=0:

In [73]:
unique_rows, indices, inverse, counts = np.unique(grades, return_index=True, return_counts=True, return_inverse=True, axis=0)

print("Unique Rows:\n", unique_rows)
print("First Occurrence Indices:\n", indices)
print("Inverse Indices:\n", inverse)
print("Counts:\n", counts)

Unique Rows:
 [[85 75 65]
 [90 80 70]]
First Occurrence Indices:
 [2 0]
Inverse Indices:
 [1 1 0 1 0]
Counts:
 [2 3]


| Parameter | Meaning |
|:----------|:--------|
| `axis=0` | Find unique rows (students) in the array. |
| `return_index=True` | Show index of first appearance of each unique row. |
| `return_inverse=True` | Used to reconstruct the original array from unique rows. |
| `return_counts=True` | Show how many times each row appears. |

**Interpretation of output**

**`unique_rows`:**

Two unique score sets:  
```
[85 75 65]
[90 80 70]
```

**`indices`:**  
```
[2 0]
```
Meaning:  
- `[85 75 65]` was first seen at row 2  
- `[90 80 70]` at row 0

**`inverse`:**  
```
[1 1 0 1 0]
```
This means:  
- 1st, 2nd, and 4th students → 2nd unique row (`[90 80 70]`)  
- 3rd and 5th students → 1st unique row (`[85 75 65]`)  

**`counts`:**  

```
[2 3]
```
This means:  
- `[85 75 65]` appears 2 times  
- `[90 80 70]` appears 3 times  

In [74]:
reconstructed = unique_rows[inverse]
reconstructed

array([[90, 80, 70],
       [90, 80, 70],
       [85, 75, 65],
       [90, 80, 70],
       [85, 75, 65]])

### What if We Use axis=1?

Let’s try to find unique columns instead of rows:

In [75]:
unique_cols = np.unique(grades, axis=1)

print("observed grades\n", grades)

print("\n unique col grades\n", unique_cols)

observed grades
 [[90 80 70]
 [90 80 70]
 [85 75 65]
 [90 80 70]
 [85 75 65]]

 unique col grades
 [[70 80 90]
 [70 80 90]
 [65 75 85]
 [70 80 90]
 [65 75 85]]


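With axis=1, np.unique() treats each column as one item: it removes duplicate columns and returns the remaining ones in sorted order. Here all three columns are different, so none is removed and they are only reordered.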


### 2. Numpy `in1d()` Function

The numpy.in1d() function is useful for testing membership, that is, checking if elements from one array (ar1) are also present in another array (ar2). It returns a Boolean array where each element indicates whether the corresponding element from ar1 is found in ar2. This is especially useful for comparing data between two sets, such as checking whether customers in a list received a promotion. Note that np.in1d() is deprecated as of NumPy 2.0 in favor of np.isin(), which takes the same arguments used here.

**Syntax and Parameters**
```python
numpy.in1d(ar1, ar2, assume_unique=False, invert=False)
```

| Parameter       | Type         | Description |
|----------------|--------------|-------------|
| `ar1`           | array_like   | The first input array, whose elements you want to check for membership in `ar2`. Input that is not 1D is flattened. |
| `ar2`           | array_like   | The second array, whose values `ar1` is tested against. It is also treated as 1D. |
| `assume_unique` | bool, optional | If `True`, both `ar1` and `ar2` are assumed to contain no duplicates, making the function faster. Default is `False`. |
| `invert`        | bool, optional | If `True`, the Boolean result is inverted: `True` means the element in `ar1` is **not** in `ar2`. Default is `False`. |


**Return Type:**

The function returns a 1D Boolean array with the same shape as ar1. The value is True if the corresponding element from ar1 is present in ar2, and False otherwise.

### Example Scenario - running an online course

Let’s say we are running an online course, and we have two lists:

- **List of students who paid for the course** (`students_paid`)
- **List of students who registered for the course** (`students_registered`)

We want to check:

- ✅ Which students from the paid list are also on the registration list (i.e., confirmed attendance).
- ❌ Which students are missing from registration, so we can follow up.

We'll use all parameters to:

- Test membership.
- Handle duplicates more efficiently with `assume_unique=True`.
- Invert the result to find students **not registered**.




In [76]:
# List of students who paid
students_paid = np.array(['Alice', 'Bob', 'Charlie', 'David', 'Eva', 'Frank', 'Grace', 'Hannah'])

# List of students who registered
students_registered = np.array(['Charlie', 'Eva', 'Grace', 'Hannah', 'Jack'])

# Check if paid students are registered
paid_registered = np.in1d(students_paid, students_registered)
print("\nPaid & Registered Students:", paid_registered)


Paid & Registered Students: [False False  True False  True False  True  True]




- Charlie, Eva, Grace, and Hannah are registered as indicated by True.

- Alice, Bob, David, and Frank are not found in the registration list, so they get False.

Now, we want to identify the students who paid but did not register (to follow up with them). We can use the invert=True parameter to flip the Boolean array:

In [77]:
paid_not_registered = np.in1d(students_paid, students_registered, invert=True)
print("Paid but Not Registered Students:", paid_not_registered)

Paid but Not Registered Students: [ True  True False  True False  True False False]




- Alice, Bob, David, and Frank are not registered, so they get True in the output (because we inverted the result).

- Charlie, Eva, Grace, and Hannah are registered, so they get False.

If we know that both students_paid and students_registered have no duplicate entries, we can set **`assume_unique=True`** to speed up the operation. This tells NumPy to skip checking for duplicates, making the function faster for large lists.

In [78]:
# Using assume_unique=True to speed up the process
paid_registered_unique = np.in1d(students_paid, students_registered, assume_unique=True)
print("Paid & Registered Students (Unique):", paid_registered_unique)

Paid & Registered Students (Unique): [False False  True False  True False  True  True]




The result remains the same, but with assume_unique=True, NumPy can skip the duplicate check and process the arrays faster, especially when the arrays are large.

### Example with Duplicates (Where assume_unique=False):

If the arrays have duplicates, we can see the effect of not assuming uniqueness.


In [79]:
# Add duplicates to the lists
students_paid_duplicates = np.array(['Alice', 'Bob', 'Charlie', 'Charlie', 'David', 'Eva', 'Frank', 'Frank', 'Grace', 'Hannah'])
students_registered_duplicates = np.array(['Charlie', 'Eva', 'Grace', 'Hannah', 'Jack', 'Charlie'])

# Check if paid students are registered (with duplicates in lists)
paid_registered_with_duplicates = np.in1d(students_paid_duplicates, students_registered_duplicates, assume_unique=False)
print("Paid & Registered Students with Duplicates:", paid_registered_with_duplicates)


Paid & Registered Students with Duplicates: [False False  True  True False  True False False  True  True]




- We have duplicates like Charlie and Frank in students_paid_duplicates, and Charlie in students_registered_duplicates.

- Charlie is correctly marked as registered both times in the output.

- Frank is marked as not registered, even though it appears twice in students_paid_duplicates.


**Summary**

- **Without `invert`**: You get `True` when an element in `ar1` is found in `ar2`.

- **With `invert=True`**: The result is flipped—`True` means the element in `ar1` is **not** found in `ar2`.

- **With `assume_unique=True`**: Faster performance, assuming no duplicates in the arrays.

- **With `assume_unique=False`**: Checks duplicates, and the result reflects whether each occurrence in `ar1` is found in `ar2`.
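
NumPy's documentation recommends `np.isin()` over `np.in1d()` for new code. A minimal sketch of the same membership test with `np.isin()`, using a shortened, illustrative version of the student lists and turning the Boolean mask into actual names:

```python
import numpy as np

students_paid = np.array(['Alice', 'Bob', 'Charlie', 'David', 'Eva'])
students_registered = np.array(['Charlie', 'Eva', 'Jack'])

# Boolean mask: True where a paid student is also registered
mask = np.isin(students_paid, students_registered)

# Use the mask (and its negation) to get the actual names
confirmed = students_paid[mask]
to_follow_up = students_paid[~mask]

print("Confirmed:", confirmed)
print("Follow up:", to_follow_up)
```

Negating the mask with `~` gives the same result as passing `invert=True`.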

-----

# ***Iterating Over Array in NumPy***

Iterating over an array means accessing each element in a systematic way to perform operations like calculations, checks, or modifications. NumPy offers several methods to iterate efficiently, especially over multi-dimensional arrays.

Different ways to iterate over NumPy arrays:

- Using Basic for Loops

- Iterating with Indices

- Using nditer() Iterator

- Flat Iteration


- Controlling Iteration Order

- Broadcasting Iteration

- Vectorized Operations (No explicit loop)

- External Loop

- Modifying Array Values During Iteration


## **Using Basic for Loops**

Python’s native for loop can be used to iterate over NumPy arrays:

In [80]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

for element in arr:
    print(element)

1
2
3
4
5


> When you iterate over a multi-dimensional array using a for loop, you do not get individual values — you get sub-arrays at the next depth.

In [81]:
# 2D Array (Matrix) — Iterating by Rows

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

for row in arr_2d:
    print(row)


[1 2 3]
[4 5 6]
[7 8 9]


In [82]:
# 3D Array — Iterating by 2D Matrices

arr_3d = np.array([[[1, 2, 3],
                    [4, 5, 6]],
                   [[7, 8, 9],
                    [10, 11, 12]]])

for matrix in arr_3d:
    print(matrix)


[[1 2 3]
 [4 5 6]]
[[ 7  8  9]
 [10 11 12]]


In [83]:
# Iterating Over Individual Elements in Multi-Dimensional Arrays

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

for i in range(arr_2d.shape[0]):       # Rows
    for j in range(arr_2d.shape[1]):   # Columns
        print(arr_2d[i, j])


1
2
3
4
5
6
7
8
9


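## **Iterating with Indices**

When you need each element's position as well as its value, loop over index ranges taken from the array's shape: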

In [84]:
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

rows, cols = arr_2d.shape

for i in range(rows):
    for j in range(cols):
        print(f"Element at ({i}, {j}): {arr_2d[i, j]}")


Element at (0, 0): 1
Element at (0, 1): 2
Element at (0, 2): 3
Element at (1, 0): 4
Element at (1, 1): 5
Element at (1, 2): 6
Element at (2, 0): 7
Element at (2, 1): 8
Element at (2, 2): 9
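
As an alternative to nested `range()` loops, `np.ndenumerate()` yields each index together with its value. A minimal sketch, assuming a small illustrative array:

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# np.ndenumerate yields (index_tuple, value) pairs for any number of dimensions
pairs = []
for index, value in np.ndenumerate(arr_2d):
    pairs.append((index, int(value)))
    print(f"Element at {index}: {value}")
```

This avoids writing one nested loop per dimension.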


## **Using np.nditer()**

The nditer() function in NumPy offers a fast and flexible way to loop through each element of a NumPy array, no matter how many dimensions it has. It follows Python’s regular iteration rules, making it easy to access every value in the array one by one.

**Syntax & Parameters**
```python
np.nditer(op, flags=None, op_flags=None, op_dtypes=None, order='K', casting='safe', ...)
```

| Parameter     | Description |
|---------------|-------------|
| `op`          | The array (or arrays) you want to iterate over. |
| `flags`       | Controls how the iteration happens (e.g., `'multi_index'`, `'external_loop'`). |
| `op_flags`    | Tells if you're reading, writing, or both (`'readonly'`, `'readwrite'`, `'writeonly'`). |
| `order`       | Iteration order: `'C'` (row-major), `'F'` (column-major), or `'K'` (default; follows the memory layout as closely as possible). |
| `op_dtypes`   | You can specify data types for iteration if needed. |
| *(multiple arrays)* | Not a separate parameter: when `op` contains several arrays, they must be broadcast-compatible. |


In [85]:
# basic
a = np.array([[1, 2],
                [3, 4]])

for x in np.nditer(a):
    print(x)

# Modifying (writing to) array elements

for x in np.nditer(a, op_flags=['readwrite']):
    x[...] = x * 10

print(a)

# Iterating with Multi-Index (get row, col info)

it = np.nditer(a, flags=['multi_index'])
for x in it:
    print(f"Value: {x}, at index: {it.multi_index}")

# Iterating Two Arrays Together

b = np.array([[11, 21], [31, 41]])

for x, y in np.nditer([a, b]):
    print(x, y)

# Keep in mind: when iterating over two arrays at once, their shapes must be the same or broadcast-compatible; otherwise reshape one of them first

1
2
3
4
[[10 20]
 [30 40]]
Value: 10, at index: (0, 0)
Value: 20, at index: (0, 1)
Value: 30, at index: (1, 0)
Value: 40, at index: (1, 1)
10 11
20 21
30 31
40 41


**Use `nditer()` when:**

- You want to loop through every element of an n-dimensional array.
- You need to modify elements while looping.
- You need to track indices using `multi_index`.
- You are working with multiple arrays together (e.g., adding two arrays).
- You want fine-grained control over the iteration (order, buffering, data types).


## **Flat Iteration**

Flat iteration means going through each value in a multi-dimensional array one by one, just like it's a simple one-dimensional list. It’s helpful when you want to work on every element without worrying about the array's shape or number of dimensions.

In [86]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

for x in np.nditer(arr, flags=['buffered']):
    print(x, end=' ')


1 2 3 4 5 6 
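
Another way to get flat iteration is the array's `flat` attribute. The sketch below is an alternative to the `nditer()` call above, not a replacement for it.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# arr.flat is a 1D iterator over the array in row-major order
flat_values = [int(x) for x in arr.flat]
print(flat_values)

# It also supports indexing and assignment by flat position
arr.flat[4] = 50
print(arr)
```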

## **Controlling Iteration Order (order='C' vs order='F')**

Iteration order affects which direction elements are accessed:

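- 'C' = row-major (like C programming). nditer() itself defaults to 'K', which follows the memory layout and gives the same order as 'C' for a regular array.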

- 'F' = column-major (like Fortran)

In [87]:
arr = np.array([[1, 2], [3, 4]])
print(arr)

print("C-style:")
for x in np.nditer(arr, order='C'):
    print(x)

print("F-style:")
for x in np.nditer(arr, order='F'):
    print(x)


[[1 2]
 [3 4]]
C-style:
1
2
3
4
F-style:
1
3
2
4


## **Modifying Elements During Iteration (op_flags)**

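By default, nditer() doesn’t allow modifications. You must specify op_flags=['readwrite'] to allow writing to the original array.
This was already shown in the nditer() section above. The example below additionally uses a with block, which closes the iterator when the loop is done so that pending changes are written back to the array.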

In [88]:
arr = np.array([1, 2, 3])

with np.nditer(arr, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = x * 2

print(arr)


[2 4 6]


## **Broadcasting While Iterating (Multiple Arrays)**

**Broadcasting iteration** in NumPy means looping over multiple arrays at the same time, even if their shapes are not exactly the same — as long as they are compatible.

NumPy automatically adjusts (or "broadcasts") the shapes so that element-wise operations can be done efficiently, without needing to reshape or align the arrays manually.

#### Example

In the example below, we loop through two arrays `a` and `b` at the same time using the `nditer()` function.

Each corresponding pair of elements from `a` and `b` is added together. Here both arrays have the same shape; arrays with different but compatible shapes are paired in the same way.

In [89]:
a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

for x, y in np.nditer([a, b]):
    print(x + y, end=" ")


11 22 33 
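
To see broadcasting actually take effect, the operands need different shapes. A minimal sketch with an illustrative 1D `row` and 2D `table`; `order='C'` is passed only to make the iteration order explicit.

```python
import numpy as np

row = np.array([1, 2, 3])              # shape (3,)
table = np.array([[10, 20, 30],
                  [40, 50, 60]])       # shape (2, 3)

# nditer broadcasts row against table, so row is reused for each table row
sums = []
for x, y in np.nditer([row, table], order='C'):
    sums.append(int(x + y))

print(sums)
```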

## **External Loop Flag**

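The external_loop flag makes iteration faster by returning one-dimensional chunks instead of individual scalars. For a contiguous array iterated in its memory order, the whole array arrives as a single chunk, as the output below shows.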

In [90]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

for row in np.nditer(arr, flags=['external_loop']):
    print(row)


[1 2 3 4 5 6]

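The chunk size depends on how the requested order lines up with memory. As a small sketch, forcing column-major order on the same row-major array prevents one contiguous pass, so the iterator yields one chunk per column instead.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# order='F' walks down each column; each column becomes its own 1D chunk.
chunks = [chunk.tolist() for chunk in np.nditer(arr, flags=['external_loop'], order='F')]
print(chunks)  # [[1, 4], [2, 5], [3, 6]]
```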

## **Vectorized Operations (Preferred Over Loops When Possible)**

Instead of looping through array elements manually, NumPy is designed to do operations on whole arrays at once.

In [91]:
print("Vectorized Operations:")
arr1 = np.array([1, 2, 3, 4])
result = arr1 * 10
print("Result of multiplication:", result)


arr2 = np.array([10, 20, 30, 40])

# Vectorized addition operation
result = arr1 + arr2


print("Result of addition:", result)


Vectorized Operations:
Result of multiplication: [10 20 30 40]
Result of addition: [11 22 33 44]

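For comparison, here is a rough sketch of the same kind of arithmetic written both as an explicit Python loop and as a single vectorized expression. The combined formula `arr1 * 10 + arr2` is only an illustration; both versions produce identical results, but the vectorized one runs the loop inside NumPy's compiled code.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([10, 20, 30, 40])

# Explicit loop: one element at a time, in Python.
looped = np.empty_like(arr1)
for i in range(arr1.size):
    looped[i] = arr1[i] * 10 + arr2[i]

# Vectorized: the same arithmetic applied to whole arrays.
vectorized = arr1 * 10 + arr2

print(looped)                                # [20 40 60 80]
print(vectorized)                            # [20 40 60 80]
print(np.array_equal(looped, vectorized))    # True
```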

-----

## ***NumPy Indexing and Slicing***

NumPy indexing is used to access or modify elements in an array. It supports:

- Basic Indexing – Accessing elements using integer indices

- Slicing – Extracting a portion of the array using a range

- Advanced Indexing – Using arrays/lists of indices or boolean masks (covered in-depth later)


In [92]:

# 1D Array Example: Marks of 5 students
marks = np.array([86, 98, 100, 65, 75])
print("1D Array:", marks)

# Accessing a single element (3rd student's score)
print("3rd student's score:", marks[2])  # Indexing starts at 0

# Accessing a range (from 2nd to 4th student)
print("Scores from 2nd to 4th student:", marks[1:4])  # index 1 to 3

# Accessing last two scores using negative indexing
print("Last two scores:", marks[-2:])  # from second-last to end

# Accessing every second score (even-indexed positions)
print("Scores at even indices:", marks[::2])  # 0, 2, 4

# Reversing the array
print("Reversed scores:", marks[::-1])

# 2D Array Example: Each row is a student, columns are scores in 3 subjects
scores = np.array([
    [85, 90, 95],   # Student 1
    [78, 88, 92],   # Student 2
    [69, 76, 80],   # Student 3
    [91, 94, 97]    # Student 4
])
print("\n2D Array (Student Scores):\n", scores)

# Accessing a specific element: 2nd student's 3rd subject
print("2nd student's 3rd subject score:", scores[1, 2])  # row 1, col 2

# Accessing a full row: all scores of 3rd student
print("3rd student's scores:", scores[2])  # row 2

# Accessing a full column: scores in 2nd subject
print("Scores in 2nd subject:", scores[:, 1])  # all rows, col 1

# Accessing a submatrix: 2nd and 3rd students, 1st and 2nd subjects
print("Submatrix (2nd–3rd students, 1st–2nd subjects):\n", scores[1:3, 0:2])

# Accessing from a certain row onward (e.g., from 2nd student to last)
print("Scores from 2nd student onward:\n", scores[1:])

# Accessing a specific range: only middle two students and their last subject
print("Middle students' last subject scores:", scores[1:3, 2])

1D Array: [ 86  98 100  65  75]
3rd student's score: 100
Scores from 2nd to 4th student: [ 98 100  65]
Last two scores: [65 75]
Scores at even indices: [ 86 100  75]
Reversed scores: [ 75  65 100  98  86]

2D Array (Student Scores):
 [[85 90 95]
 [78 88 92]
 [69 76 80]
 [91 94 97]]
2nd student's 3rd subject score: 92
3rd student's scores: [69 76 80]
Scores in 2nd subject: [90 88 76 94]
Submatrix (2nd–3rd students, 1st–2nd subjects):
 [[78 88]
 [69 76]]
Scores from 2nd student onward:
 [[78 88 92]
 [69 76 80]
 [91 94 97]]
Middle students' last subject scores: [92 80]

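The examples above only read values, but the same index and slice expressions can appear on the left side of an assignment to modify the array. A short sketch, reusing the `marks` data and a smaller `scores` array:

```python
import numpy as np

marks = np.array([86, 98, 100, 65, 75])

marks[3] = 70          # replace a single element
marks[:2] = [90, 95]   # replace a slice with a sequence of matching length
marks[-1:] = 0         # a scalar is broadcast across the slice
print(marks)           # [ 90  95 100  70   0]

scores = np.array([[85, 90, 95],
                   [78, 88, 92]])
scores[:, 1] += 5      # in-place update of the whole 2nd column
print(scores)
```

After the update, the second column of `scores` holds 95 and 93 while the other columns are unchanged.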

## **Advanced Indexing**

Advanced indexing allows you to select specific elements from a NumPy array using:

- Integer arrays

- Boolean arrays

Unlike basic slicing (like arr[1:4]), which returns a view (i.e., changes affect the original array), advanced indexing returns a copy. This is critical to understand because changes to the result won’t affect the original array.

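The view-versus-copy distinction can be checked directly. This is a minimal sketch using `np.shares_memory` to confirm which result is tied to the original array:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

view = arr[1:4]          # basic slice: shares memory with arr
copy = arr[[1, 2, 3]]    # integer-array index: independent copy

view[0] = -1             # also changes arr[1]
copy[1] = -2             # arr is unaffected

print(arr)                           # [10 -1 30 40 50]
print(np.shares_memory(arr, view))   # True
print(np.shares_memory(arr, copy))   # False
```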
***Integer Indexing***

You use integer arrays (lists or NumPy arrays of integers) to pick specific items by position.

**Syntax:**
```python
result = arr[rows, cols]
```

Here, `rows` and `cols` must be arrays (or lists) of the same shape, or of shapes that broadcast to a common shape. Each pair `(rows[i], cols[i])` selects one element.

In [None]:
x = np.array([[1, 2],
              [3, 4],
              [5, 6]])

# Select elements: (0,0), (1,1), (2,0)
y = x[[0, 1, 2], [0, 1, 0]]

print("Original array:\n", x)
print("Selected elements:", y)


"""
Row 0, Column 0 → 1

Row 1, Column 1 → 4

Row 2, Column 0 → 5
"""


Original array:
 [[1 2]
 [3 4]
 [5 6]]
Selected elements: [1 4 5]


In [None]:
# Selecting elements at the corners of a 4x3 array
x = np.array([[ 0,  1,  2],
              [ 3,  4,  5],
              [ 6,  7,  8],
              [ 9, 10, 11]])

rows = np.array([[0, 0], [3, 3]])
cols = np.array([[0, 2], [0, 2]])

corners = x[rows, cols]
print("Corner elements:\n", corners)

"""
You selected:

(0,0), (0,2), (3,0), and (3,2)
"""


Corner elements:
 [[ 0  2]
 [ 9 11]]


In [95]:
# Index Out of Bounds Error
x = np.array([[0, 1], [2, 3], [4, 65]])
print(x[3, 1])  # This will raise an error


IndexError: index 3 is out of bounds for axis 0 with size 3

***Boolean Indexing***

You use a Boolean array (True/False values) to filter elements.

**Syntax**:
```python
result = arr[condition]
```
Each True in the condition means "include this element".

In [96]:
arr = np.array([10, 20, 30, 40, 50])
mask = np.array([True, False, True, False, True])
print(arr[mask])

[10 30 50]


In [97]:
x = np.array([[ 0,  1,  2],
              [ 3,  4,  5],
              [ 6,  7,  8],
              [ 9, 10, 11]])

print(x[x > 5])  # Only values greater than 5


[ 6  7  8  9 10 11]


In [98]:
a = np.array([np.nan, 1, 2, np.nan, 3, 4, 5])
filtered = a[~np.isnan(a)]  # ~ means NOT
print(filtered)


[1. 2. 3. 4. 5.]


In [99]:
a = np.array([1, 2+6j, 5, 3.5+5j])
print(a[np.iscomplex(a)])


[2. +6.j 3.5+5.j]

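Conditions can also be combined, and a mask can be used on the left side of an assignment. The sketch below assumes the same 4x3 array as earlier; note that each condition needs its own parentheses because `&` and `|` bind more tightly than comparisons.

```python
import numpy as np

x = np.array([[ 0,  1,  2],
              [ 3,  4,  5],
              [ 6,  7,  8],
              [ 9, 10, 11]])

in_range = x[(x > 2) & (x < 9)]             # AND: both conditions hold
even_or_large = x[(x % 2 == 0) | (x > 9)]   # OR: either condition holds
print(in_range)        # [3 4 5 6 7 8]
print(even_or_large)   # [ 0  2  4  6  8 10 11]

# Assigning through a mask modifies the selected elements in place.
clipped = x.copy()
clipped[clipped > 8] = 8
print(clipped[3])      # [8 8 8]
```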

___

## ***NumPy - Array Attributes***

Attributes in NumPy provide information about the array itself, not the data. They describe the array’s shape, data type, memory layout, size, and more.

Unlike methods (which are called with parentheses, like array.sum()), attributes are accessed without parentheses—e.g., array.shape, not array.shape().

✅ **List of All Important NumPy Array Attributes**  
We will cover the following attributes with structured explanations:

| Attribute    | Description                                             |
|--------------|---------------------------------------------------------|
| ndim         | Number of dimensions (axes)                             |
| shape        | Tuple showing the size of each dimension                |
| size         | Total number of elements in the array                   |
| dtype        | Data type of elements in the array                      |
| itemsize     | Size (in bytes) of each item                            |
| nbytes       | Total memory (in bytes) consumed by the array           |
| T            | Transposed view of the array                            |
| data         | Buffer object pointing to the start of the array's data |
| strides      | Steps in bytes to move along each dimension             |
| flags        | Memory layout info (e.g., contiguous, writable)         |
| base         | Reference to the original array (if a view)             |
| real / imag  | Real and imaginary parts (for complex arrays)           |


In [None]:
# Create a 2D array
a = np.array([[1, 2, 3], [4, 5, 6]])

print("Array:\n", a)
print("ndim       :", a.ndim)        # Number of dimensions
print("shape      :", a.shape)       # Tuple of dimensions
print("size       :", a.size)        # Total number of elements
print("dtype      :", a.dtype)       # Data type of elements
print("itemsize   :", a.itemsize)    # Bytes per element
print("nbytes     :", a.nbytes)      # Total memory usage, itemsize*size
print("strides    :", a.strides)     # Byte steps per axis: 24 (3 elements * 8 bytes) to the next row, 8 to the next column
print("T (Transpose):\n", a.T)       # Transpose of array

# Reading these attributes does not modify the original array, as shown below
a

Array:
 [[1 2 3]
 [4 5 6]]
ndim       : 2
shape      : (2, 3)
size       : 6
dtype      : int64
itemsize   : 8
nbytes     : 48
strides    : (24, 8)
T (Transpose):
 [[1 4]
 [2 5]
 [3 6]]


array([[1, 2, 3],
       [4, 5, 6]])

In [107]:
# Advanced attributes

print("flags      :\n", a.flags)      # Memory layout info
print("data       :", a.data)        # Buffer object (memoryview) over the array's data
print("base       :", a.base)        # Original array if this is a view, otherwise None

# Complex array example
b = np.array([1 + 2j, 3 + 4j])
print("Complex Array:", b)
print("Real part :", b.real) # real part of each element
print("Imag part :", b.imag) # imaginary part of each element
print("Complex conjugate :", np.conj(b))
print("angle of complex argument:", np.angle(b, deg=True))


flags      :
   C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False

data       : <memory at 0x0000028B63030A00>
base       : None
Complex Array: [1.+2.j 3.+4.j]
Real part : [1. 3.]
Imag part : [2. 4.]
Complex conjugate : [1.-2.j 3.-4.j]
angle of complex argument: [63.43494882 53.13010235]

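The `strides` and `base` attributes become more informative on views. As a small sketch (the dtype is fixed to `int64` here so the byte counts are the same on every platform), a transpose and a stepped slice reuse the original buffer and only change the strides:

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

t = a.T          # transposed view
s = a[:, ::2]    # view of every second column

print(a.strides)   # (24, 8)
print(t.strides)   # (8, 24): the two axes are swapped
print(s.strides)   # (24, 16): column step doubled
print(a.base is None)              # True: a owns its data
print(t.base is a, s.base is a)    # True True: both are views of a
```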

### NumPy Array Flags

NumPy provides several flags that describe different aspects of the array's memory layout and properties:

| Sr.No. | Attribute & Description                                               |
|--------|----------------------------------------------------------------------|
| 1      | **C_CONTIGUOUS (C)**: The data is in a single, C-style contiguous segment. |
| 2      | **F_CONTIGUOUS (F)**: The data is in a single, Fortran-style contiguous segment. |
| 3      | **OWNDATA (O)**: True if the array owns the memory it uses; False if it borrows the memory from another object (for example, when it is a view). |
| 4      | **WRITEABLE (W)**: The data area can be written to. Setting this to False locks the data, making it read-only. |
| 5      | **ALIGNED (A)**: The data and all elements are aligned appropriately for the hardware. |
| 6      | **WRITEBACKIFCOPY (X)**: This array is a copy of some other array. The base array is updated with the contents of this array when the C-API function `PyArray_ResolveWritebackIfCopy` is called, which must happen before this array is deallocated. |

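A few of these flags can be inspected and changed from Python. The following sketch compares an array with its transposed view and then locks the array by clearing `WRITEABLE`; the helper variable `write_blocked` is only there for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
t = a.T  # a view on the same data

print(a.flags['C_CONTIGUOUS'], a.flags['F_CONTIGUOUS'])   # True False
print(t.flags['C_CONTIGUOUS'], t.flags['F_CONTIGUOUS'])   # False True
print(a.flags['OWNDATA'], t.flags['OWNDATA'])             # True False

# Clearing WRITEABLE makes the array read-only; writes then raise ValueError.
a.flags.writeable = False
write_blocked = False
try:
    a[0, 0] = 99
except ValueError as exc:
    write_blocked = True
    print("Write rejected:", exc)
```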