## What is numpy?

NumPy is a Python library that provides large, multi-dimensional arrays and matrices, along with a collection of mathematical functions that operate on them. NumPy arrays are more efficient than Python lists for numerical computations and are designed to handle large datasets.

NumPy is widely used in data analysis, scientific computing, and machine learning applications.

## Why is numpy important?

NumPy offers several advantages over traditional Python lists:

1. Performance: NumPy arrays are more efficient than Python lists for numerical computations. They use memory more compactly and support vectorized operations, which often lead to faster and more readable code.

2. Flexibility: NumPy arrays can have any number of dimensions, allowing for more complex data structures and operations. They also support different data types, such as integers, floating-point numbers, and complex numbers, although each array holds a single data type.

3. Convenient mathematical functions: NumPy provides a collection of mathematical functions, such as `sin`, `cos`, `exp`, `log`, and `sqrt`, that are applied to whole arrays at once. These functions are implemented in C, which can be significantly faster than calling Python's `math` functions in a loop.

## How can I get started with numpy?

To get started with NumPy, you can follow these steps:

1. Import the NumPy library: Use the `import numpy as np` statement to import the NumPy library and give it a shorter alias, such as `np`.

2. Create NumPy arrays: Use the `np.array()` function to create arrays from Python lists. You can also create arrays using various other functions, such as `np.zeros()`, `np.ones()`, `np.full()`, `np.arange()`, and `np.linspace()`.

3. Perform mathematical operations: Use the NumPy array methods and functions to perform mathematical operations on arrays. Some common operations include addition, subtraction, multiplication, division, and exponentiation.

By following these steps, you can start using NumPy to work with large, multi-dimensional arrays and matrices efficiently.
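The three steps above can be sketched in a few lines. This is a minimal illustration; the variable names (`a`, `zeros`, `steps`, `grid`) are made up for the example.

```python
import numpy as np  # Step 1: import NumPy under the conventional alias

# Step 2: create arrays from a list and with helper constructors
a = np.array([1, 2, 3, 4])
zeros = np.zeros(4)
steps = np.arange(0, 8, 2)        # 0, 2, 4, 6 (stop value 8 is excluded)
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced points from 0 to 1

# Step 3: element-wise arithmetic
print(a + steps)   # [ 1  4  7 10]
print(a * 2)       # [2 4 6 8]
print(a ** 2)      # [ 1  4  9 16]
print(grid)        # [0.   0.25 0.5  0.75 1.  ]
```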

`ndarray` is the core data structure in NumPy, representing a multi-dimensional array of homogeneous data types. It is a powerful and flexible way to store and manipulate numerical data in Python.


In [1]:
import numpy as np

x = np.array([1, 2, 3, 4, 5])
print(x)
print(type(x))

[1 2 3 4 5]
<class 'numpy.ndarray'>


In [2]:
y = [1, 2, 3, 4, 5]
print(y)
print(type(y))

[1, 2, 3, 4, 5]
<class 'list'>


## Can numpy arrays store different data types?

Not really. A NumPy array has a single `dtype`, so every element is stored as the same type. There are workarounds for mixed data, but they come with important caveats.

NumPy supports a variety of data types, including integers, floating-point numbers, and complex numbers. However, it's important to note that:

- If you pass values of mixed types to `np.array()`, NumPy upcasts them to a common type. For example, mixing integers and floats gives a float array, and adding a string turns every element into a string.
- If you need to store heterogeneous data, keep a separate array per data type (for example, one array of integers and another of floating-point numbers), use a structured array, or fall back to `dtype=object`.

Here's an example of what happens when you create a NumPy array from mixed data types:

In [3]:
d = np.array([1, 2, 3, 4, 5, 6.0, '7'])
print(d)

## Although the input mixes integers, a float, and a string, every element of `d` is upcast to a string.

['1' '2' '3' '4' '5' '6.0' '7']


In [4]:
## By design, a NumPy array has one data type for the entire array, which in this case is an integer.
arr = np.array([1, 2, 3, 4, 5], dtype=int)
print(arr.dtype)


# If you mix types, NumPy upcasts everything to a common type. This is still homogeneous.
arr = np.array([1, 2.5, 3])
print(arr)         # [1.  2.5 3. ]
print(arr.dtype)   # float64


# Note: NumPy does not convert strings to numbers automatically in arithmetic; convert them explicitly, e.g. with `.astype(float)`.

## By default, a mix of integers and floats is stored as floating-point numbers. To force a particular type, specify the `dtype` parameter.

## Truly heterogeneous array -> use the `object` dtype. Each element is then a reference to an arbitrary Python object.
arr = np.array([1, "hello", 3.14, [1, 2, 3]], dtype=object)
print(arr)         # [1 'hello' 3.14 list([1, 2, 3])]
print(arr.dtype)   # object

## Downside: object arrays lose the speed of vectorized operations and use more memory.

int64
[1.  2.5 3. ]
float64
[1 'hello' 3.14 list([1, 2, 3])]
object


In [5]:
# If the requirement is a different type per column, use a structured array

dt = np.dtype([
    ("id", np.int32),
    ("price", np.float64),
    ("symbol", "U10")
])

arr = np.array([
    (1, 1.2, "AAPL"),
    (2, 1.5, "GOOG"),
    (3, 0.8, "MSFT")
], dtype=dt)

print(arr)
print(arr.dtype)

[(1, 1.2, 'AAPL') (2, 1.5, 'GOOG') (3, 0.8, 'MSFT')]
[('id', '<i4'), ('price', '<f8'), ('symbol', '<U10')]
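Once a structured array exists, each field can be read like a column. The sketch below rebuilds the same array and shows field access and filtering on a field; the threshold `1.0` is an arbitrary value chosen for the example.

```python
import numpy as np

dt = np.dtype([("id", np.int32), ("price", np.float64), ("symbol", "U10")])
arr = np.array([(1, 1.2, "AAPL"), (2, 1.5, "GOOG"), (3, 0.8, "MSFT")], dtype=dt)

# Access a single field by name; the result is a regular homogeneous array
prices = arr["price"]
print(prices)          # [1.2 1.5 0.8]
print(prices.dtype)    # float64

# Filter records with a boolean mask built from one field
expensive = arr[arr["price"] > 1.0]
print(expensive["symbol"])   # ['AAPL' 'GOOG']
```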


### Performance of NumPy arrays compared to Python lists

In [6]:
## Performance of NumPy arrays compared to Python lists
%timeit np.arange(1, 9)**2
%timeit [i**2 for i in range(1, 9)]

# For numerical work on large arrays, NumPy is faster and more memory efficient than Python lists because its loops run in compiled C.
# With only 8 elements, as here, the fixed overhead of the NumPy calls dominates, so the list comprehension is quicker in the timings below.

637 ns ± 26.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
279 ns ± 14.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
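To see how the comparison changes with input size, the same two expressions can be timed on a much larger range. This is a rough sketch using the standard `timeit` module instead of the `%timeit` magic; the size `n` and the repeat count are arbitrary choices, and the absolute timings depend on the machine, so no expected numbers are shown.

```python
import timeit
import numpy as np

n = 1_000_000  # hypothetical size, large enough that per-call overhead is negligible

# Time 5 runs of each approach; int64 avoids overflow when squaring
numpy_time = timeit.timeit(
    lambda: np.arange(1, n + 1, dtype=np.int64) ** 2, number=5
)
list_time = timeit.timeit(
    lambda: [i ** 2 for i in range(1, n + 1)], number=5
)

print(f"NumPy: {numpy_time:.3f} s, list comprehension: {list_time:.3f} s")

# Both approaches compute the same values
small_np = (np.arange(1, 9, dtype=np.int64) ** 2).tolist()
small_list = [i ** 2 for i in range(1, 9)]
print(small_np == small_list)  # True
```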


In [7]:
arr = np.array([1, 2, 3, 4, 5])
print(arr.ndim)

1


### NumPy Array vs Python List: A Deep Dive
While both look like sequences of data, they function very differently under the hood.

**Python Lists:**
- **Dynamic & Heterogeneous**: Can hold different types (int, float, string) mixed together.
- **Pointer-Based**: A Python list is actually an array of pointers (memory addresses). Each pointer points to a full Python object stored elsewhere in memory.
- **Memory Overhead**: Each simple integer in Python is an object with overhead (reference count, type info, value).

**NumPy Arrays:**
- **Fixed Type & Homogeneous**: All elements must be of the same type (e.g., all `int32`).
- **Contiguous Memory**: Elements are stored side-by-side in a single continuous block of RAM. The array object itself just points to the start of this block and knows the data type and shape.

![SIMD Comparison](simd_comparison.png)

### Why is Contiguous Memory Faster?
1.  **Locality of Reference**: When the CPU fetches data from RAM, it fetches a "cache line" (e.g., 64 bytes). Because NumPy data is contiguous, a single fetch brings in multiple useful elements. With Python lists, the next element might be far away in memory, causing a "Cache Miss" and forcing the CPU to wait.
2.  **No Type Checking**: Since NumPy knows all elements are `int32`, it doesn't need to check the type of every single element during an operation. Python must check the type of every object in the list.
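
The layout described above can be inspected directly. The following sketch compares the raw data size of an `int32` array with a rough estimate of the memory used by an equivalent list; the list figure is an approximation (CPython shares some small integer objects) and varies by platform.

```python
import sys
import numpy as np

values = list(range(1000))
arr = np.array(values, dtype=np.int32)

print(arr.itemsize)               # 4 -> bytes per int32 element
print(arr.nbytes)                 # 4000 -> raw data in one block
print(arr.strides)                # (4,) -> step 4 bytes to reach the next element
print(arr.flags["C_CONTIGUOUS"])  # True

# Rough list estimate: the pointer array plus one Python int object per element
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)
print(list_bytes)                 # platform dependent, several times larger than arr.nbytes
```
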

### SIMD (Single Instruction, Multiple Data)
Modern CPUs have special "Vector Units" that can perform math on multiple numbers at once.
- **SISD (Scalar)**: Standard Python adds numbers one by one: `a[0]+b[0]`, then `a[1]+b[1]`.
- **SIMD (Vector)**: NumPy passes the contiguous block of data to the CPU. The CPU can load, say, 4 or 8 integers into a vector register and add them all effectively in a single clock cycle.

This hardware-level parallelism, together with avoiding per-element interpreter overhead, is a large part of why NumPy can be orders of magnitude faster on large arrays.

## Creating n-dimensional arrays in NumPy
- Pass an `n`-dimensional (nested) list to `np.array()` to create an `n`-dimensional NumPy array.
- To get the number of dimensions of a NumPy array, use its `.ndim` attribute.

In [8]:
## Creating 1D array
arr1 = np.array([1, 2, 3])
print(arr1.ndim)


## Creating 2D array. Every row (inner list) must have the same length. Think of it like a matrix
arr2 = np.array([[1, 2, 3], [4, 5, 7]])
print(arr2.ndim)
print(arr2.dtype)

# Creating a 10D array
arr10 = np.array([[1, 2, 3], [4, 5, 6]], ndmin=10)
print(arr10)
print(arr10.ndim)

1
2
int64
[[[[[[[[[[1 2 3]
         [4 5 6]]]]]]]]]]
10


### Creating different types of Numpy array

In [9]:
## Creating a zero array
zeroarr = np.zeros(shape=(2, 3, 4), dtype=int)
print(zeroarr)


# [   
#     [  
#         [0, 0, 0, 0],
#         [0, 0, 0, 0],
#         [0, 0, 0, 0]
#     ],
#     [
#         [0, 0, 0, 0],
#         [0, 0, 0, 0],
#         [0, 0, 0, 0]
#     ]
# ]


[[[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]

 [[0 0 0 0]
  [0 0 0 0]
  [0 0 0 0]]]


In [10]:
## Creating a ones array
onesarr = np.ones(shape=(2, 3, 4), dtype=int)
print(onesarr)


[[[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]

 [[1 1 1 1]
  [1 1 1 1]
  [1 1 1 1]]]


In [11]:
# Create a full array
fullarr = np.full((2, 2), None) 
print(fullarr)

[[None None]
 [None None]]


In [12]:
# Creating an empty array: the memory is not initialized, so the values are arbitrary (they happen to be zeros here)
emptyarr = np.empty(shape=(2,3), dtype=int)
print(emptyarr)

[[0 0 0]
 [0 0 0]]


In [13]:
## Creating range array -> Always outputs a 1D array. Start is included, stop is excluded
rangearr = np.arange(start=1, stop=5, step=.5, dtype=float)
print(rangearr)


# Create an array of `num` evenly spaced values over a range; both the start and the end of the range are included.
randomarr = np.linspace(1, 10, num=3, dtype=int)
print(randomarr)

[1.  1.5 2.  2.5 3.  3.5 4.  4.5]
[ 1  5 10]


In [14]:
# Both `np.identity()` and `np.eye()` can create an identity matrix, but `np.eye()` gives
# more control (non-square shapes and offset diagonals)
 
identityarr = np.identity(3)
print(identityarr)




identityarr2 = np.eye(3)
print(identityarr2)


# N → number of rows
# M → number of columns (default = N)
# k → diagonal offset
# 0 → main diagonal
# >0 → upper diagonal
# <0 → lower diagonal
identityarr3 = np.eye(M=5, N=4, k=1)
print(identityarr3)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]


In [15]:
## Creating NumPy arrays of random numbers drawn uniformly from [0, 1)

randarr = np.random.rand(4) # Number of samples to generate
print(randarr)

print("=============")

randarr1 = np.random.rand(2, 5)  # Shape of the result: 2 rows, 5 columns
print(randarr1)

[0.46614636 0.19501061 0.58327447 0.8768851 ]
=============
[[0.58727511 0.74076836 0.69606138 0.15228189 0.78071103]
 [0.54723338 0.2956857  0.23314014 0.75842684 0.11218124]]


In [16]:
## Generate random samples from the standard normal distribution (mean 0, standard deviation 1); values are not limited to [-1, 1]

r = np.random.randn(2, 3) ## Shape of the result
print(r)

[[ 1.19470746 -2.38973938 -1.08843624]
 [ 0.21120526  1.85031719  0.74256033]]


In [17]:
## Generate random floats in [0.0, 1.0) -> 1.0 is not included. `ranf` is an alias of `np.random.random_sample`

arr = np.random.ranf((3, 4))
print(arr)

[[0.32187574 0.11406312 0.19996119 0.84016494]
 [0.6200982  0.98142401 0.97353743 0.2025287 ]
 [0.209122   0.84004439 0.82190265 0.56568503]]


In [18]:
## Generate random integers in the range [low, high) -> `high` is not included

rnum = np.random.randint(low=1, high=6, size=(2, 4)) 
rnum

array([[2, 5, 2, 2],
       [3, 4, 1, 4]])
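
The functions above draw from NumPy's global random state, so their output changes on every run. When reproducible numbers are needed, one option is a seeded `Generator` created with `np.random.default_rng()`. The sketch below is an illustration only; the seed `42` is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # the seed value is arbitrary

uniform = rng.random((2, 3))                          # floats in [0.0, 1.0)
normal = rng.standard_normal((2, 3))                  # standard normal samples
integers = rng.integers(low=1, high=6, size=(2, 4))   # integers in [1, 6)

# Re-creating the generator with the same seed reproduces the same numbers
rng_again = np.random.default_rng(seed=42)
same = np.array_equal(uniform, rng_again.random((2, 3)))
print(same)  # True
```
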

## Datatypes in NumPy arrays
- You can specify the datatype of a NumPy array while creating it using the `dtype` parameter.
- You can also change the datatype of an existing NumPy array using the `.astype()` method, which returns a new array.
- NumPy supports various datatypes like `int`, `float`, `complex`, `bool`, `object`, etc.
- In summary, each NumPy array has a single `dtype`, and homogeneous numeric types give the best efficiency and performance. If you need to work with heterogeneous data, consider using separate arrays for each data type, or use `dtype=object` with caution.

In [19]:
# Creating an object array that holds values of different Python types
heterogeneous_array = np.array([1, 2.5, 3+4j, 'Hello'], dtype=object)
print(heterogeneous_array)

print("===========")

# Creating a numpy array with homogeneous int data types
int_array = np.array([1, 2, 3, 4, 5], dtype=int)
print(int_array)

print("===========")

# Creating a numpy array with floating-point numbers
float_array = np.array([1.0, 2.5, 3.3, 4.8, 5.1], dtype=float)
print(float_array)

print("===========")

# Creating a numpy array with complex numbers
complex_array = np.array([1+2j, 3+4j, 5+6j], dtype=complex)
print(complex_array)

print("===========")

# Changing the datatype of an existing numpy array
original_array = np.array([1, 2, 3, 4, 5])
float_array = original_array.astype(float)
print(float_array)


[1 2.5 (3+4j) 'Hello']
===========
[1 2 3 4 5]
===========
[1.  2.5 3.3 4.8 5.1]
===========
[1.+2.j 3.+4.j 5.+6.j]
===========
[1. 2. 3. 4. 5.]


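### Arithmetic Operations in NumPy Arrays
- `a + b` -> `np.add(a, b)`
- `a - b` -> `np.subtract(a, b)`
- `a * b` -> `np.multiply(a, b)`
- `a / b` -> `np.divide(a, b)`
- `a % b` -> `np.mod(a, b)`
- `a ** b` -> `np.power(a, b)`
- `1 / a` -> `np.reciprocal(a)` (equivalent for float arrays; on integer arrays `np.reciprocal` performs integer division)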

In [20]:
# Arithmetic operations on NumPy arrays are element-wise; `*` is the Hadamard (element-wise) product, not matrix multiplication
# The operands must have the same shape, be broadcast-compatible, or be a scalar


a = np.array([1, 2, 3])
b = np.array([7, 8, 9])

result = a + 2
result1 = a + b 
result2 = np.add(a, 5)


print(result)
print(result1)
print(result2)

A = np.random.randint(1, 100, size=(3, 4))
B = np.random.randint(1, 100, size=(3, 4))
resultant = A * B

print(A)
print(B)
print(resultant)


[3 4 5]
[ 8 10 12]
[6 7 8]
[[79 92 51 17]
 [73 46 77 55]
 [80 75 56 31]]
[[27 78  8 80]
 [ 4 39 83 99]
 [54 10 63 48]]
[[2133 7176  408 1360]
 [ 292 1794 6391 5445]
 [4320  750 3528 1488]]


### Arithmetic functions in NumPy

- `np.min(a)`
- `np.max(a)`
- `np.sqrt(a)`
- `np.sin(a)`
- `np.cos(a)`
- `np.cumsum(a)`
- `np.argmin(a)` -> Returns the position of the minimum in `a`
- `np.argmax(a)` -> Returns the position of the maximum in `a`

For multi-dimensional arrays, pass `axis` to reduce along one dimension. In a 2D array, `axis=0` runs down the rows and gives one result per column, while `axis=1` runs across the columns and gives one result per row.

**NOTE**: The outermost dimension of a NumPy array is axis 0, and the axis number increases with nesting depth. For example, in a 3D array, axis 0 represents the outermost dimension, axis 1 represents the middle dimension, and axis 2 represents the innermost dimension.
```
axis=0 → first dimension
axis=1 → second dimension
axis=2 → third dimension
```

In [21]:
arr = np.array([[1, 2, 3], [7, 1, 8]])


# To get the max of each column (reduce along axis 0)
print(np.max(arr, axis=0))


# To get the row position of the max in each column
print(np.argmax(arr, axis=0))

# To get the cumulative sum over the flattened array (no axis given)
print(np.cumsum(arr))
print(np.cumsum(arr))


[7 2 8]
[1 0 1]
[ 1  3  6 13 14 22]
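
To make the `axis` argument concrete, the sketch below reduces the same 2D array along each axis. It assumes nothing beyond the array used in the cell above.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [7, 1, 8]])

col_sums = arr.sum(axis=0)            # collapse the rows: one value per column
row_sums = arr.sum(axis=1)            # collapse the columns: one value per row
print(col_sums)                       # [ 8  3 11]
print(row_sums)                       # [ 6 16]

# keepdims=True keeps the reduced axis with length 1, which is handy for broadcasting
print(arr.sum(axis=1, keepdims=True).shape)   # (2, 1)

# Cumulative sum along each row instead of over the flattened array
row_cumsum = np.cumsum(arr, axis=1)
print(row_cumsum.tolist())            # [[1, 3, 6], [7, 8, 16]]
```
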


### Numpy Shape and reshape

In [22]:
## To get shape of a NumPy array
## Trick:
## Shape -> Look at each level of opening bracket from the outside in, & count the number of items at each level (1, 2, 4)
## Dim -> Number of nested opening brackets

data = [ # 1
    [   # 2
        [1, 2, 3, 4], #4
        [6, 7, 8, 9]
    ]
]

arr = np.array(data)
print(arr.shape) 
print(arr.ndim)

(1, 2, 4)
3


In [23]:
## Creating a Numpy array of `n` dimension

narr = np.array([1, 23, 5, 4, 8, 9, 5, 7, 8, 9, 12, 100],ndmin=3)
print(narr)

[[[  1  23   5   4   8   9   5   7   8   9  12 100]]]


In [24]:
# Reshape the NumPy array. Basically this is like reshaping to a different matrix
# NOTE: While reshaping, the number of elements in the original array must equal the number in the new array, i.e. the product of all
# its dimensions. Here, narr.size == 3 * 2 * 2

narr.reshape(3, 2, 2)

array([[[  1,  23],
        [  5,   4]],

       [[  8,   9],
        [  5,   7]],

       [[  8,   9],
        [ 12, 100]]])
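
Because the element count must stay the same, `reshape` can infer one dimension when it is given as `-1`. The sketch below illustrates this and shows the error raised when the sizes do not match; the array `flat` is made up for the example.

```python
import numpy as np

flat = np.arange(12)   # 12 elements: 0..11

# -1 asks NumPy to work out that dimension from the total size
print(flat.reshape(3, -1).shape)      # (3, 4)
print(flat.reshape(-1, 2, 2).shape)   # (3, 2, 2)

# A shape whose product is not 12 is rejected
try:
    flat.reshape(5, -1)
except ValueError as err:
    print("reshape failed:", err)
```
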

### Broadcasting in NumPy
- Broadcasting is a powerful mechanism that allows NumPy to perform operations on arrays of different shapes.
- When performing operations on arrays of different shapes, NumPy automatically (and virtually, without copying data) expands the smaller array to match the shape of the larger array.
- This allows for efficient computation without the need for explicit replication of data.
- Broadcasting follows a set of rules to determine how the arrays are expanded:
  1. If the arrays have a different number of dimensions, the shape of the smaller array is padded with ones on the left side until both shapes have the same length.
  2. The sizes of the dimensions are compared element-wise, starting from the rightmost dimension. **Two dimensions are compatible if they are equal or if one of them is 1.**
  3. If the dimensions are compatible, the array with size 1 is expanded to match the size of the other array.
  4. If the dimensions are not compatible, a `ValueError` is raised.
- Broadcasting is commonly used in operations such as addition, subtraction, multiplication, and division between arrays of different shapes.
- Reference: https://www.youtube.com/watch?v=P67wiuTx7l0
- It is important to understand the rules of broadcasting to avoid unexpected results and ensure efficient computation in NumPy.

Here's an example of broadcasting in NumPy:

```python
import numpy as np
# Create a 2D array
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
# Create a 1D array
B = np.array([10, 20, 30])
# Add the 2D array and the 1D array using broadcasting
C = A + B
print(C)
```
Output:
```
[[11 22 33]
 [14 25 36]
 [17 28 39]]
```



In [25]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])


B = np.array([
    [10, 20, 30],
])

# Step 1
# A → (3, 3)
# B → (1, 3)

# Step 2 -> Pad the smaller shape with 1s on the LEFT
# Both shapes already have 2 dimensions, so no padding is needed
# (a 1D B of shape (3,) would be padded to (1, 3))

# (3, 3)
# (1, 3)

# Step 3: Apply broadcasting rules
# Dimension 1: 3 vs 1 → ✅ stretch 1 to 3
# Dimension 2: 3 vs 3 → ✅ same

# B becomes:
# [[10, 20, 30],
#  [10, 20, 30],
#  [10, 20, 30]]

print(A+B)

[[11 22 33]
 [14 25 36]
 [17 28 39]]


In [26]:
v1 = np.array([[1, 2, 3]])
v2 = np.array([[1], [2], [3]])
print(v1.ndim, v1.shape)
print(v2.ndim, v2.shape)


# Step 1
# v1 → (1, 3)
# v2 → (3, 1)


# Step 2 -> Pad the smaller shape with 1s on the LEFT
# Both shapes already have 2 dimensions, so no padding is needed

# (1, 3)
# (3, 1)

# Step 3: Apply broadcasting rules
# Dimension 1: 1 vs 3 → ✅ stretch v1 from 1 to 3 rows
# Dimension 2: 3 vs 1 → ✅ stretch v2 from 1 to 3 columns

# v1 stretched   v2 stretched
# [ 1, 2, 3]     [ 1, 1, 1 ]
# [ 1, 2, 3]     [ 2, 2, 2 ]
# [ 1, 2, 3]     [ 3, 3, 3 ]

print(v1 * v2)

2 (1, 3)
2 (3, 1)
[[1 2 3]
 [2 4 6]
 [3 6 9]]
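
The examples so far all use compatible shapes. The sketch below shows the failure case from rule 4 and one common way to fix it by adding an axis; the arrays `a` and `b` are made up for the illustration.

```python
import numpy as np

a = np.ones((2, 3))
b = np.ones((2,))   # padded to (1, 2); the trailing dimensions 3 vs 2 are incompatible

try:
    a + b
except ValueError as err:
    print("Broadcast failed:", err)

# Adding an explicit axis makes the shapes compatible: (2, 3) and (2, 1)
result = a + b[:, np.newaxis]
print(result.shape)   # (2, 3)
```
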


### Numpy Indexing and Slicing
- Numpy indexing and slicing allows you to access and manipulate specific elements or subsets of a NumPy array.
- You can use indexing to access individual elements of an array by specifying their position using square brackets `[]`.
- You can also use slicing to extract a subset of an array by specifying a range of indices using the colon `:` operator.
- Numpy supports various types of indexing, including:
  1. Integer indexing: Accessing elements using integer indices.
  2. Boolean indexing: Accessing elements using boolean arrays. Also used for **filtering** arrays based on conditions.
  3. Fancy indexing: Accessing elements using arrays of indices.
- Slicing can be done using the syntax `array[start:stop:step]`, where `start` is the starting index, `stop` is the ending index (exclusive), and `step` is the step size.
- NumPy indexing and slicing are powerful tools for manipulating and analyzing data in NumPy arrays efficiently.

In [27]:
## Examples of Indexing and Slicing

arr = np.array([[10, 20, 30, 40, 50],
                [60, 70, 80, 90, 100],
                [110, 120, 130, 140, 150]])

print(arr)
# Accessing element at 2nd row and 3rd column
element = arr[1, 2]
print("Element at 2nd row and 3rd column:", element)

# Slicing to get a sub-array (2nd and 3rd rows, 2nd to 4th columns)
sub_array = arr[1:3, 1:4]
print("Sub-array (2nd and 3rd rows, 2nd to 4th columns):")
print(sub_array)

[[ 10  20  30  40  50]
 [ 60  70  80  90 100]
 [110 120 130 140 150]]
Element at 2nd row and 3rd column: 80
Sub-array (2nd and 3rd rows, 2nd to 4th columns):
[[ 70  80  90]
 [120 130 140]]


In [28]:
ndimarr = [
    [
        [
            [1, 2, 3, 4],
        ],
        [ 
            [4, 5, 6, 7],
        ],
    ],
    [
        [
            [7, 8, 9, 10]
        ],
        [
            [11, 12, 13, 14]
        ]
    ]
]

arr = np.array(ndimarr)
print(arr.ndim)
print(arr.shape)

print("=========")
print(arr[1:, :, :, 2:5])

4
(2, 2, 1, 4)
=========
[[[[ 9 10]]

  [[13 14]]]]


In [29]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(arr[2:8:2]) # [start: stop : step]

[3 5 7]
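
The cells above cover integer indexing and slicing. As a complement, here is a small sketch of fancy and boolean indexing on the same 3x5 array; the index lists and the threshold `100` are arbitrary.

```python
import numpy as np

arr = np.array([[10, 20, 30, 40, 50],
                [60, 70, 80, 90, 100],
                [110, 120, 130, 140, 150]])

# Fancy indexing with a list of row indices: rows 0 and 2
rows = arr[[0, 2]]
print(rows.shape)            # (2, 5)

# Paired row and column indices select elements (0, 1) and (2, 4)
picked = arr[[0, 2], [1, 4]]
print(picked)                # [ 20 150]

# Boolean indexing returns a flat array of the matching elements
large = arr[arr > 100]
print(large)                 # [110 120 130 140 150]
```
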


### Iterating over Numpy arrays
- You can iterate over numpy arrays using loops, similar to how you would iterate over Python lists.
- For 1D arrays, you can use a simple for loop to iterate over each element.
- For multi-dimensional arrays, you can use nested loops to iterate over each dimension.
- NumPy's built-in helpers like `np.nditer()` and `np.ndenumerate()` can also be used to iterate over multi-dimensional arrays without nested loops.

In [30]:
# Iterating over Numpy arrays

arr = np.array([[1, 2, 3], [4, 5, 6]])
for row in arr:
    for element in row:
        print(element)


# Note: Applying enumerate here will lose the multi-dimensional position of each element, whereas ndenumerate will retain it.


print("=========")
for idx, x in enumerate(np.nditer(arr)): # lose multi-dimensional position
    print(idx, x)

print("=====")
for idx, x in np.ndenumerate(arr):
    print(idx, x)


1
2
3
4
5
6
=========
0 1
1 2
2 3
3 4
4 5
5 6
=====
(0, 0) 1
(0, 1) 2
(0, 2) 3
(1, 0) 4
(1, 1) 5
(1, 2) 6


### Numpy copy vs view
- In NumPy, both `copy` and `view` are used to create new arrays from existing arrays, but they have different behaviors and use cases.
- A `copy` creates a new array that is a separate copy of the original array. Any changes made to the copy do not affect the original array, and vice versa. This is useful when you want to create a completely independent array that can be modified without affecting the original data.
- A `view`, on the other hand, creates a new array that shares the same data as the original array. Changes made to the view will also affect the original array, and vice versa. This is useful when you want to create a new array that is a different representation of the same data, without the overhead of copying the data.


In [31]:
## Copy vs View in Numpy
arr = np.array([1, 2, 3, 4])
arr_copy = arr.copy()

print("Original Array: ", arr)
print("Copy of the Array:", arr_copy)

print("After modifying original array")
arr[3] = 45

print("Original Array: ", arr)
print("Copy of the Array:", arr_copy)

Original Array:  [1 2 3 4]
Copy of the Array: [1 2 3 4]
After modifying original array
Original Array:  [ 1  2  3 45]
Copy of the Array: [1 2 3 4]


In [32]:
arr = np.array([1, 2, 3, 4])
arr_view = arr.view()

print("Original Array: ", arr)
print("View of the original Array: ", arr_view)

print("After modifying the view")

arr_view[3] = 45

print("Original Array: ", arr)
print("View of the original Array: ", arr_view)

Original Array:  [1 2 3 4]
View of the original Array:  [1 2 3 4]
After modifying the view
Original Array:  [ 1  2  3 45]
View of the original Array:  [ 1  2  3 45]
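
Views are not only created by `.view()`. Basic slicing also returns a view, while fancy indexing returns a copy. The sketch below checks this with `.base` and `np.shares_memory()`; the array contents are made up for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

window = arr[1:4]         # basic slicing returns a view
picked = arr[[1, 2, 3]]   # fancy indexing returns a copy

print(window.base is arr)               # True
print(np.shares_memory(arr, window))    # True
print(np.shares_memory(arr, picked))    # False

# Writing through the view changes the original, but not the copy
window[0] = 99
print(arr)      # [ 1 99  3  4  5  6]
print(picked)   # [2 3 4]
```
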


### Joining and Splitting Numpy arrays

- You can join multiple numpy arrays into a single array using functions like `np.concatenate()`, `np.vstack()`, and `np.hstack()`.
- You can split a numpy array into multiple arrays using functions like `np.split()`, `np.vsplit()`, and `np.hsplit()`.
- Arrays being joined must have matching shapes along every axis except the one they are joined on. When given a number of sections, `np.split()` additionally requires that the array divide evenly along the chosen axis.

#### What is the difference between vstack, hstack, stack and concatenate in NumPy?
- `np.concatenate()` is a general-purpose function that can join arrays along any specified existing axis.
- `np.vstack()` is a specialized function that stacks arrays vertically (row-wise) along the first axis (axis=0).
- `np.hstack()` is another specialized function that stacks arrays horizontally (column-wise) along the second axis (axis=1); for 1D arrays it concatenates along axis=0.
- `np.stack()` is a function that joins arrays along a new axis, creating a new dimension.
- In summary, `concatenate` is a general function for joining arrays, while `vstack`, `hstack`, and `stack` are specialized functions for specific stacking operations.

In [33]:
## Joining Numpy arrays
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([4, 5])

print(np.concatenate([arr1, arr2], axis=0)) # axis=0 is default 

arr3 = np.array([[1, 2], [3, 4]])
arr4 = np.array([[4, 5], [6, 7]])

# [
#     [1, 2],
#     [3, 4]
# ]
# [
#     [4, 5],
#     [6, 7]
# ]

print(np.concatenate([arr3, arr4], axis=0)) # axis=0: joined along the outer dimension (rows are appended)

print(np.concatenate([arr3, arr4], axis=1)) # axis=1: joined along the inner dimension (columns are appended)

[1 2 3 4 4 5]
[[1 2]
 [3 4]
 [4 5]
 [6 7]]
[[1 2 4 5]
 [3 4 6 7]]


In [34]:
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])


# stack -> joins the arrays along a new axis inserted at the position given by `axis`

print(np.stack((arr1, arr2), axis=0))
print(np.stack((arr1, arr2), axis=1))

[[1 2 3 4]
 [5 6 7 8]]
[[1 5]
 [2 6]
 [3 7]
 [4 8]]


In [35]:
arr3 = np.array([1, 2, 3, 4])
arr4 = np.array([5, 6, 7, 8])
arr5 = np.array([[9, 10, 11, 12], [13, 14, 15, 16]])
arr6 = np.array([[131, 141, 151, 161], [171, 181, 191, 121]])

# How vstack works -> Vertical stack i.e. along rows
# Step 1: Converts inputs to 2D arrays
# Step 2: Stacks them along the first axis (rows)
# Step 3: Returns the new stacked array
print(np.vstack((arr3, arr4)))

print("=====")


# How hstack works -> Horizontal stack i.e. along columns
# if arrays are 1D:
#     concatenate along axis=0
# else:
#     concatenate along axis=1

print(np.hstack([arr3, arr4]))

print("=====")


# [
#     [9, 10, 11, 12], 
#     [13, 14, 15, 16]
# ]
# [
#     [131, 141, 151, 161], 
#     [171, 181, 191, 121]
# ]

print(np.hstack((arr5, arr6)))

[[1 2 3 4]
 [5 6 7 8]]
=====
[1 2 3 4 5 6 7 8]
=====
[[  9  10  11  12 131 141 151 161]
 [ 13  14  15  16 171 181 191 121]]


In [36]:
### Splitting Numpy arrays


# `split()` with an integer requires the array length along the axis to be divisible by the number of splits, i.e. arr.shape[axis] % n_splits == 0
# All the resulting sub-arrays have equal size

arr = np.array([1, 2, 3, 4, 5, 6])
n_arr = np.split(arr, 3) # Here 3 is the number of parts to split into
print(n_arr)

print("=====")


arr1 = np.array([[1, 2, 3, 4], [23, 56, 78, 90]])
# [
#     [1, 2, 3, 4], 
#     [23, 56, 78, 90]
# ]

s_arr = np.split(arr1, 2, axis=1) # Split along columns
print(s_arr)


[array([1, 2]), array([3, 4]), array([5, 6])]
=====
[array([[ 1,  2],
       [23, 56]]), array([[ 3,  4],
       [78, 90]])]


In [37]:
# Split array with unequal sizes
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
n_arr = np.array_split(arr, 4) # Here arr.length is not divisible by 4
print(n_arr)


[array([1, 2, 3]), array([4, 5]), array([6, 7]), array([8, 9])]


### Search, Sort, Searchsorted, and Filter in NumPy Arrays
- You can search for specific values in a NumPy array using functions like `np.where()` and `np.searchsorted()`.
- You can sort a NumPy array using functions like `np.sort()` and `np.argsort()`.
- You can filter elements in a NumPy array based on specific conditions using boolean indexing or functions like `np.extract()`.
- `np.where()` -> Returns a tuple of indices where the condition is met. The tuple contains one index array per dimension of the input array.

In [38]:
## Searching in NumPy arrays
arr = np.array([[10, 20, 30, 40, 50], [60, 70, 80, 90, 100]])

result = np.where(arr > 40)
print(result)
print("Row indices of elements greater than 40:", result[0])


fruits = np.array(['apple', 'banana', 'cherry', 'date', 'elderberry'])
selected_fruits = np.where(np.char.str_len(fruits) > 5)
print(selected_fruits)
# print("Fruits with names longer than 5 characters:", fruits[selected_fruits])

(array([0, 1, 1, 1, 1, 1]), array([4, 0, 1, 2, 3, 4]))
Row indices of elements greater than 40: [0 1 1 1 1 1]
(array([1, 2, 4]),)


In [39]:
### `np.searchsorted()` performs a binary search on a sorted array and returns the index where the element should be inserted to maintain order.
## What is side 'left' and 'right'?
#   side='left': Insert the element to the left of existing elements with the same value.
#   side='right': Insert the element to the right of existing elements with the same value.
arr = np.array([10, 20, 30, 40, 50])
left_index = np.searchsorted(arr, 40, side='left') 
right_index = np.searchsorted(arr, 40, side='right') 
print("Index to insert 40 to maintain order:", left_index, right_index)  


Index to insert 40 to maintain order: 3 4


In [40]:
### Sorting Numpy arrays
arr = np.array([64, 34, 25, 12, 22, 11, 90])
sorted_arr = np.sort(arr)
print("Sorted array:", sorted_arr)

arr1 = np.array([[50, 81, 13, 4], [5, 65, 80, 18]])
sorted_arr0 = np.sort(arr1, axis=0)
sorted_arr1 = np.sort(arr1, axis=1)
print("Sorted along axis 0:", sorted_arr0)
print("Sorted along axis 1:", sorted_arr1)


Sorted array: [11 12 22 25 34 64 90]
Sorted along axis 0: [[ 5 65 13  4]
 [50 81 80 18]]
Sorted along axis 1: [[ 4 13 50 81]
 [ 5 18 65 80]]
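
`np.argsort()` is mentioned above but not demonstrated. It returns the indices that would sort an array, which makes it possible to reorder a second array in the same way. The arrays `scores` and `names` below are made up for the illustration.

```python
import numpy as np

scores = np.array([64, 34, 25, 12])
names = np.array(["d", "c", "b", "a"])

order = np.argsort(scores)   # indices that would sort `scores` in ascending order
print(order)                 # [3 2 1 0]
print(scores[order])         # [12 25 34 64]
print(names[order])          # ['a' 'b' 'c' 'd']
```
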


In [41]:
### Filter NumPy arrays by applying a condition that produces a boolean mask, then indexing with that mask
arr = np.array([10, 15, 20, 25, 30, 35, 40])
filter_condition = arr > 25
print(filter_condition) # This is a boolean array
filtered_arr = arr[filter_condition]
print("Filtered array (elements greater than 25):", filtered_arr)


[False False False False  True  True  True]
Filtered array (elements greater than 25): [30 35 40]


### Some additional NumPy functions
- `np.unique()` -> Returns the sorted unique elements of an array.
- `np.clip()` -> Clips (limits) the values in an array to a specified range.
- `np.roll()` -> Rolls (circularly shifts) the elements of an array along a specified axis.
- `np.transpose()` -> Permutes (by default reverses) the dimensions of an array.
- `np.random.shuffle()` -> Randomly shuffles an array in place along its first axis.
- `np.resize()` -> Returns a new array with the specified shape, repeating the data if the new shape is larger.
- `ndarray.flatten()` -> Returns a flattened 1D copy of a multi-dimensional array.
- `np.ravel()` -> Returns a flattened 1D array, as a view of the original data whenever possible.
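
A few of these helpers can be tried on one small array. The sketch below shows `clip`, `roll`, `transpose`, and the copy-versus-view difference between `flatten` and `ravel`; the values are chosen only for illustration.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(np.clip(arr, 2, 5).tolist())        # [[2, 2, 3], [4, 5, 5]]
print(np.roll(arr, 1, axis=1).tolist())   # [[3, 1, 2], [6, 4, 5]]
print(np.transpose(arr).shape)            # (3, 2)

flat_copy = arr.flatten()   # always a copy
flat_view = arr.ravel()     # a view here, because `arr` is contiguous
flat_copy[0] = 100          # does not touch `arr`
flat_view[1] = 200          # writes through to `arr`
print(arr.tolist())         # [[1, 200, 3], [4, 5, 6]]
```
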
  

In [42]:
## Shuffle Numpy arrays
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
print("Original array:", arr)
np.random.shuffle(arr)
print("Shuffled array:", arr)


Original array: [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]
 [13 14 15]]
Shuffled array: [[10 11 12]
 [ 7  8  9]
 [ 1  2  3]
 [13 14 15]
 [ 4  5  6]]


In [43]:
## Unique
arr = np.array([[1, 2, 2], [3, 4, 4,], [4, 5, 6], [3, 4, 4]])
unique_elements = np.unique(arr, axis=0, return_counts=True)
print("Unique elements in the array:", unique_elements)

Unique elements in the array: (array([[1, 2, 2],
       [3, 4, 4],
       [4, 5, 6]]), array([1, 2, 1]))


### What is the difference between resize and reshape in NumPy?
- `reshape` returns an array with the specified shape (a view of the same data when possible) and leaves the original array's shape unchanged. The total number of elements must remain the same.
- `resize` can change the total number of elements. The `ndarray.resize()` method modifies the array in place and fills any new elements with zeros; the `np.resize()` function returns a new array and fills any new elements with repeated copies of the original data. In both cases a smaller shape truncates the data.

In [44]:
## Reshape example
np_var1 = np.array([1, 2, 3, 4, 5, 6])
print(np_var1)


print("Reshaped arrays:")
# Method 1
print(np_var1.reshape(2, 3))

# Method 2
print(np.reshape(np_var1, (2, 3)))

# Original array remains unchanged
print(np_var1)


print("Another reshape example:")

np_var1 = np.array([[1, 2, 3], [4, 5, 6]])
print("Reshaped arrays:")
# Method 1
print(np_var1.reshape(2, 3))

# Method 2
print(np.reshape(np_var1, (3, 2)))

# Original array remains unchanged
print(np_var1)


[1 2 3 4 5 6]
Reshaped arrays:
[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[1 2 3 4 5 6]
Another reshape example:
Reshaped arrays:
[[1 2 3]
 [4 5 6]]
[[1 2]
 [3 4]
 [5 6]]
[[1 2 3]
 [4 5 6]]


In [45]:
## Resize example
np_var2 = np.array([[1, 2, 3], [4, 5, 6]])
print(np_var2)

print("Resized array:")
# Method 1 -> Note: This modifies the original array. This is ndarray.resize() method.
# np_var2.resize((3, 2))


# Method 2 -> Note: This does not modify the original array, and returns a new array instead. This is numpy.resize() function.
resized = np.resize(np_var2, (3, 2))
print(resized)

print("Original array after resizing", np_var2)


[[1 2 3]
 [4 5 6]]
Resized array:
[[1 2]
 [3 4]
 [5 6]]
Original array after resizing [[1 2 3]
 [4 5 6]]


### Insertion in Numpy arrays
- You can insert elements into a numpy array using the `np.insert()` function. 
- The `np.insert()` function allows you to specify the index at which to insert the new element(s), as well as the axis along which to insert them (for multi-dimensional arrays).
- The original array remains unchanged, and a new array with the inserted elements is returned.
- `np.append()` -> Appends values to the end of an array and returns a new array. Without an `axis` argument, both inputs are flattened first.
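
As a brief sketch of the `axis` behaviour described above (the values are illustrative only), compare inserting a full row, appending along an axis, and appending without one:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Insert a whole row before index 1
row_inserted = np.insert(arr, 1, [7, 8, 9], axis=0)
print("Row inserted:\n", row_inserted)

# Append a row: the appended values must have the same number of dimensions as arr
appended_row = np.append(arr, [[7, 8, 9]], axis=0)
print("Row appended:\n", appended_row)

# No axis: both inputs are flattened and the result is 1D
appended_flat = np.append(arr, [7, 8, 9])
print("Appended without axis:", appended_flat)
```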

In [46]:
## Inserting in Numpy arrays
arr = np.array([[1, 2, 3], [4, 5, 6]])
new_arr = np.insert(arr, 2, 60, axis=0) # Insert a new row before index 2 (here, at the end)
# The scalar 60 is broadcast across the new row to match the number of columns

new_arr1 = np.insert(arr, 3, 70, axis=1) # Insert a new column before index 3 (here, at the end)
# The scalar 70 is broadcast down the new column to match the number of rows


print("Array after insertion:\n", new_arr)
print("Original array remains unchanged:\n", arr)


print("Array after insertion:\n", new_arr1)

Array after insertion:
 [[ 1  2  3]
 [ 4  5  6]
 [60 60 60]]
Original array remains unchanged:
 [[1 2 3]
 [4 5 6]]
Array after insertion:
 [[ 1  2  3 70]
 [ 4  5  6 70]]


In [47]:
## Appending in Numpy arrays

items = np.array(['apple', 'banana', 'cherry'])
new_items = np.append(items, 'date')
print("Array after appending an item:", new_items)
print("Original array remains unchanged:", items)

Array after appending an item: ['apple' 'banana' 'cherry' 'date']
Original array remains unchanged: ['apple' 'banana' 'cherry']


### Delete elements from Numpy arrays
- You can delete elements from a numpy array using the `np.delete()` function.
- The `np.delete()` function allows you to specify the index or indices of the elements to be deleted, as well as the axis along which to delete them (for multi-dimensional arrays).
- The original array remains unchanged, and a new array with the specified elements removed is returned.
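
A short sketch of the options mentioned above: deleting several indices at once, deleting without an axis, and (as an alternative that is not part of `np.delete()`) selecting the columns to keep with a boolean mask. The values are illustrative.

```python
import numpy as np

arr = np.array([[10, 20, 30, 40], [50, 60, 70, 80]])

# Delete the 1st and 3rd columns in one call
cols_removed = np.delete(arr, [0, 2], axis=1)
print("Columns 0 and 2 removed:\n", cols_removed)

# No axis: the array is flattened before the element is removed
flat_removed = np.delete(arr, 0)
print("Element 0 removed from flattened array:", flat_removed)

# Alternative: keep columns with a boolean mask instead of deleting
keep = np.array([False, True, False, True])
masked = arr[:, keep]
print("Columns kept by mask:\n", masked)
```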


In [48]:
### Deletion in Numpy arrays
arr = np.array([[10, 20, 30, 40], [50, 60, 70, 80], [90, 100, 110, 120]])
new_arr = np.delete(arr, 1, axis=0) # Delete 2nd row (index 1)
print("Array after deletion:\n", new_arr)
print("Original array remains unchanged:\n", arr)

new_arr1 = np.delete(arr, 2, axis=1) # Delete 3rd column
print("Array after deletion of 3rd column:\n", new_arr1)
print("Original array remains unchanged:\n", arr)

Array after deletion:
 [[ 10  20  30  40]
 [ 90 100 110 120]]
Original array remains unchanged:
 [[ 10  20  30  40]
 [ 50  60  70  80]
 [ 90 100 110 120]]
Array after deletion of 3rd column:
 [[ 10  20  40]
 [ 50  60  80]
 [ 90 100 120]]
Original array remains unchanged:
 [[ 10  20  30  40]
 [ 50  60  70  80]
 [ 90 100 110 120]]


## Matrix in Numpy
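- A matrix in NumPy (`np.matrix`) is a specialized 2D array subclass used for linear algebra.
- You can create one with the `np.matrix()` function. A 2D array created with `np.array()` can hold the same data, but its type is `numpy.ndarray`, not `numpy.matrix`.
- Matrices support various operations such as addition, subtraction, multiplication, and transposition.
- The NumPy documentation no longer recommends `np.matrix`, even for linear algebra; regular 2D arrays are preferred for new code.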

## Difference between Numpy array and Matrix
- Numpy arrays are more general-purpose and can have any number of dimensions, while matrices are specifically 2D.
- Numpy arrays apply the `*` operator element-wise, while matrices treat `*` as matrix multiplication. For arrays, matrix multiplication is written with `@` or `np.dot()`.
- Matrices have additional methods for linear algebra operations, such as `I` for inverse and `H` for conjugate transpose.
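
The points above can be sketched as follows. This is an illustrative example, not a complete comparison; the complex-valued matrix `Z` is made up to show that `H` conjugates as well as transposes.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# On plain arrays, * is element-wise and @ is matrix multiplication
elementwise = a * b
matmul = a @ b
print("a * b:\n", elementwise)
print("a @ b:\n", matmul)

# np.matrix attributes: I is the inverse, H is the conjugate transpose
M = np.matrix([[1, 2], [3, 4]])
inverse = M.I
Z = np.matrix([[1 + 2j, 3], [4, 5 - 1j]])
conj_t = Z.H
print("M.I:\n", inverse)
print("Z.H:\n", conj_t)

# A matrix is always 2D, even when built from a flat list
row = np.matrix([1, 2, 3])
print("Shape of np.matrix([1, 2, 3]):", row.shape)
```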

In [49]:
### Matrix in Numpy

matrix_1 = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Matrix 1:\n", matrix_1)
print("Type  of matrix 1", type(matrix_1))

matrix_2  = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
print("Matrix 2:\n", matrix_2)
print("Type  of matrix 2", type(matrix_2))


Matrix 1:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Type  of matrix 1 <class 'numpy.matrix'>
Matrix 2:
 [[9 8 7]
 [6 5 4]
 [3 2 1]]
Type  of matrix 2 <class 'numpy.ndarray'>


In [50]:
l1 = [[1, 2], [3, 4]]
l2 = [[5, 6], [7, 8]]

A = np.matrix(l1)
B = np.matrix(l2)
C = A * B  # Matrix multiplication
print("Matrix Multiplication Result:\n", C)
D = A.dot(B)
print("Matrix Multiplication using dot Result:\n", D)



a = np.array(l1)
b = np.array(l2)
c = a * b  # Element-wise multiplication
print("Element-wise Multiplication Result:\n", c)


Matrix Multiplication Result:
 [[19 22]
 [43 50]]
Matrix Multiplication using dot Result:
 [[19 22]
 [43 50]]
Element-wise Multiplication Result:
 [[ 5 12]
 [21 32]]


### Matrix functions in Numpy
- `np.transpose()` -> Transposes the matrix.
- `A.T` -> Transpose of matrix A.
- `np.linalg.inv()` -> Computes the inverse of a square matrix.
- `np.linalg.det()` -> Computes the determinant of a square matrix.
- `np.linalg.eig()` -> Computes the eigenvalues and right eigenvectors of a square matrix.
- `np.dot()` -> Performs matrix multiplication for 2D inputs.
- `np.swapaxes()` -> Swaps two axes of an array.
- `np.invert()` -> Computes the bitwise NOT element-wise. It does not invert a matrix; use `np.linalg.inv()` for that.
- NumPy has no `inverse()`, `determinant()`, or `matrix.t()` functions; use `np.linalg.inv()` (or `A.I` on a matrix), `np.linalg.det()`, and `A.T` instead.
- `np.linalg.matrix_power()` -> Raises a square matrix to an integer power. `np.power()` raises each element to a power instead.
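
`np.linalg.eig()` is listed above but not demonstrated in the following cells, and the difference between `np.power()` and `np.linalg.matrix_power()` is easy to miss. A minimal sketch, using a small symmetric matrix chosen for illustration:

```python
import numpy as np

A = np.array([[2, 1], [1, 2]])

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)

# Each column v satisfies A @ v = eigenvalue * v
check = np.allclose(A @ eigenvectors, eigenvectors * eigenvalues)
print("A @ v == lambda * v for every column:", check)

# Element-wise square versus matrix square
elementwise_sq = np.power(A, 2)
matrix_sq = np.linalg.matrix_power(A, 2)
print("np.power(A, 2):\n", elementwise_sq)
print("np.linalg.matrix_power(A, 2):\n", matrix_sq)
```

The order in which `eig` returns the eigenvalues is not guaranteed, so sort them if the code depends on a particular order.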


In [51]:
### Transpose of a matrix - 2D array
l1 = [[7, 8, 1, 3], [9, 10, 13, 14], [11, 12, 21, 22], [80, 90, 100, 35], [45, 55, 65, 75]]
A = np.matrix(l1)
print("Matrix", A)

print("Transpose of the matrix")

## Method 1
AT = np.transpose(A)
print(AT)

## Method 2
AT1 = A.T
print(AT1)

Matrix [[  7   8   1   3]
 [  9  10  13  14]
 [ 11  12  21  22]
 [ 80  90 100  35]
 [ 45  55  65  75]]
Transpose of the matrix
[[  7   9  11  80  45]
 [  8  10  12  90  55]
 [  1  13  21 100  65]
 [  3  14  22  35  75]]
[[  7   9  11  80  45]
 [  8  10  12  90  55]
 [  1  13  21 100  65]
 [  3  14  22  35  75]]


In [52]:
## Transpose of Numpy arrays 3D array

l1 = [[[7, 8, 1, 3], [9, 10, 13, 14]], [[11, 12, 21, 22], [80, 90, 100, 35]], [[45, 55, 65, 75], [19, 29, 39, 49]]]
A = np.array(l1)


print("Original Array:\n", A, )
print("Shape of Original Array:", A.shape)


print("\nTranspose of the Array:\n")

print(A.T)
print(A.T.shape)

# Shape changes from
# (3, 2, 4) → (4, 2, 3)

# Transpose reverses the order of the axes; the data itself is not copied

Original Array:
 [[[  7   8   1   3]
  [  9  10  13  14]]

 [[ 11  12  21  22]
  [ 80  90 100  35]]

 [[ 45  55  65  75]
  [ 19  29  39  49]]]
Shape of Original Array: (3, 2, 4)

Transpose of the Array:

[[[  7  11  45]
  [  9  80  19]]

 [[  8  12  55]
  [ 10  90  29]]

 [[  1  21  65]
  [ 13 100  39]]

 [[  3  22  75]
  [ 14  35  49]]]
(4, 2, 3)
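
`A.T` always reverses all axes. When a different ordering is needed, `np.transpose()` accepts an explicit axis permutation. A small sketch, using an `np.arange` array as stand-in data with the same (3, 2, 4) shape:

```python
import numpy as np

A = np.arange(24).reshape(3, 2, 4)

# Default transpose: axes reversed, (3, 2, 4) -> (4, 2, 3)
reversed_axes = A.T
print("A.T shape:", reversed_axes.shape)

# Explicit permutation: swap only the first two axes, (3, 2, 4) -> (2, 3, 4)
permuted = np.transpose(A, (1, 0, 2))
print("np.transpose(A, (1, 0, 2)) shape:", permuted.shape)

# The element at A[i, j, k] appears at A.T[k, j, i]
same = A[2, 1, 3] == A.T[3, 1, 2]
print("A[2, 1, 3] == A.T[3, 1, 2]:", same)

# The transpose is a view on the same memory, not a copy
shares = np.shares_memory(A, A.T)
print("A.T shares memory with A:", shares)
```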


In [53]:
# Swapaxes in Numpy arrays
l1 = [
        [ # First 3D block, shape (2, 2, 4)
            [
                [7, 8, 1200, 1201], 
                [1, 3, 1300, 1301]
            ],
            [
                [9, 10, 1400, 1401],
                [13, 14, 1500, 1501]
            ]
        ],
        [ # Second 3D block, shape (2, 2, 4)
            [
                [11, 12, 1600, 1601],
                [21, 22, 1700, 1701]
            ],
            [
                [80, 90, 1800, 1801],
                [100, 35, 1900, 1901]
            ]
        ],
        [ # Third 3D block, shape (2, 2, 4)
            [
                [45, 55, 2200, 2201],
                [65, 75, 2300, 2301]
            ],
            [
                [19, 29, 2400, 2401],
                [39, 49, 2500, 2501]
            ]
        ]
    ]
A = np.array(l1)

print("Original Array:\n", A, )
print("Shape of Original Array:", A.shape)

Original Array:
 [[[[   7    8 1200 1201]
   [   1    3 1300 1301]]

  [[   9   10 1400 1401]
   [  13   14 1500 1501]]]


 [[[  11   12 1600 1601]
   [  21   22 1700 1701]]

  [[  80   90 1800 1801]
   [ 100   35 1900 1901]]]


 [[[  45   55 2200 2201]
   [  65   75 2300 2301]]

  [[  19   29 2400 2401]
   [  39   49 2500 2501]]]]
Shape of Original Array: (3, 2, 2, 4)


In [54]:
swapped = np.swapaxes(A, 0, 1)
print("Array after swapping axes 0 and 1:\n", swapped)
print("Shape after swapping axes 0 and 1:", swapped.shape)

Array after swapping axes 0 and 1:
 [[[[   7    8 1200 1201]
   [   1    3 1300 1301]]

  [[  11   12 1600 1601]
   [  21   22 1700 1701]]

  [[  45   55 2200 2201]
   [  65   75 2300 2301]]]


 [[[   9   10 1400 1401]
   [  13   14 1500 1501]]

  [[  80   90 1800 1801]
   [ 100   35 1900 1901]]

  [[  19   29 2400 2401]
   [  39   49 2500 2501]]]]
Shape after swapping axes 0 and 1: (2, 3, 2, 4)


In [55]:
# Inverse of a matrix in Numpy

l1 = [[1, 2], [3, 4]]
A = np.matrix(l1)
print("Original Matrix:\n", A)
inverse_A = np.linalg.inv(A)
print("Inverse of the Matrix:\n", inverse_A)

Original Matrix:
 [[1 2]
 [3 4]]
Inverse of the Matrix:
 [[-2.   1. ]
 [ 1.5 -0.5]]


In [56]:
# Power of a matrix in Numpy
k1 = [[2, 0], [0, 2]]
A = np.matrix(k1)
print("Original Matrix:\n", A)
power_A = np.linalg.matrix_power(A, 3)
print("Matrix raised to the power 3:\n", power_A)
print("Matrix raised to the power 3 using ** operator:\n", A**3)
print("Matrix raised to the power 3 using repeated multiplication:\n", A * A * A)


Original Matrix:
 [[2 0]
 [0 2]]
Matrix raised to the power 3:
 [[8 0]
 [0 8]]
Matrix raised to the power 3 using ** operator:
 [[8 0]
 [0 8]]
Matrix raised to the power 3 using repeated multiplication:
 [[8 0]
 [0 8]]


In [57]:
## Determinant of a matrix in Numpy
m1 = [[4, 3], [6, 3]]
A = np.matrix(m1)
det_A = np.linalg.det(A)
print("Matrix:\n", A)
print("Determinant of the Matrix:", det_A)

Matrix:
 [[4 3]
 [6 3]]
Determinant of the Matrix: -6.0


## Saving and Loading Numpy arrays
- You can save a single array to disk with `np.save()` (a `.npy` file), or several arrays at once with `np.savez()` (a `.npz` archive).
- You can load either file type with `np.load()`.
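
When several arrays go into one `.npz` file, passing them as keyword arguments gives each a readable key. The sketch below assumes two made-up arrays named `weights` and `bias` and writes to a temporary directory so that it leaves no files behind.

```python
import os
import tempfile

import numpy as np

# Hypothetical arrays, used only for illustration
weights = np.array([[4, 3], [6, 3]])
bias = np.array([0.5, -0.5])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_data.npz")

    # Keyword names become the keys inside the archive
    np.savez(path, weights=weights, bias=bias)

    # np.load on a .npz file returns a dict-like object; close it when done
    with np.load(path) as data:
        names = sorted(data.files)
        loaded_weights = data["weights"]
        loaded_bias = data["bias"]

print("Keys in the archive:", names)
print("Loaded weights:\n", loaded_weights)
print("Loaded bias:", loaded_bias)
```

`np.savez_compressed()` takes the same arguments and produces a compressed archive, which may be preferable for large arrays.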

In [58]:
# Saving and Loading NumPy Data
m1 = [[4, 3], [6, 3]]
A = np.matrix(m1)
np.save('array_data.npy', A)  # Save array A to a file named 'array_data.npy'

loaded_A = np.load('array_data.npy')  # Load the array from the file
print("Loaded Array from file:\n", loaded_A)

np.savez('array_data.npz', A)  # Arrays passed positionally are stored under the keys 'arr_0', 'arr_1', ...
loaded_data = np.load('array_data.npz')
print("Loaded Array from .npz file:\n", loaded_data['arr_0'])

Loaded Array from file:
 [[4 3]
 [6 3]]
Loaded Array from .npz file:
 [[4 3]
 [6 3]]


In [59]:
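# Adding a new axis with np.newaxis
a = np.arange(6)
print(a)
a2 = a[np.newaxis, 2]  # Select element 2 and add a leading axis of length 1, giving shape (1,)
print(a2)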



[0 1 2 3 4 5]
[2]
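
The cell above uses `np.newaxis` without comment. As a rough sketch of what it does: `np.newaxis` inserts an axis of length 1 at the position where it appears in the index, which is commonly used to turn a 1D array into a row or column for broadcasting.

```python
import numpy as np

a = np.arange(6)

row = a[np.newaxis, :]     # shape (1, 6)
col = a[:, np.newaxis]     # shape (6, 1)
single = a[np.newaxis, 2]  # shape (1,), same as the cell above
print(row.shape, col.shape, single.shape)

# A (6, 1) column times a (1, 6) row broadcasts to a (6, 6) multiplication table
outer = col * row
print(outer.shape)
```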
