### ndarray

An ndarray is a multi-dimensional array with items of the same type and size.

The number of dimensions and items in an array is defined by its shape, a tuple of N non-negative integers that specify the size of each dimension.

Each ndarray has an associated data-type object (dtype) that specifies the type of the items it stores.

Items in an ndarray can be accessed via indexing and slicing.

The ndarray has various methods and attributes for accessing and manipulating its contents.

Different ndarrays can share the same data, so changes made through one are visible in the other. This happens when an ndarray is a "view" of another ndarray (its "base").
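As a small illustration of the points above, the sketch below inspects the shape and dtype of an array and shows a slice acting as a view of its base. The variable names (`grid`, `base_arr`, `view_arr`) are illustrative only, and the printed dtype name depends on the platform.

```python
import numpy as np

# A 2x3 array: the shape tuple holds one size per dimension
grid = np.array([[1, 2, 3], [4, 5, 6]])
shape = grid.shape             # (2, 3)
ndim = grid.ndim               # 2
dtype_name = grid.dtype.name   # platform dependent, e.g. 'int64'

# Basic slicing returns a view that shares memory with its base
base_arr = np.arange(6)
view_arr = base_arr[1:4]
view_arr[0] = 99               # writes through to base_arr

print(shape, ndim, dtype_name)
print(base_arr)                    # [ 0 99  2  3  4  5]
print(view_arr.base is base_arr)   # True
```

When an independent array is needed, calling `.copy()` on the slice breaks the link to the base.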

In [2]:
import numpy as np

#### Creation

In [7]:
nd1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])  # 2D array with shape (3, 3)
nd2 = np.ones_like(nd1)   # same shape and dtype as nd1, filled with ones
nd3 = np.full((2, 2), 5)  # 2x2 array filled with the value 5

# np.ones() requires explicitly defined dimensions,
# while np.ones_like() replicates the shape and dtype of an existing array.
# The same applies to np.empty()/np.empty_like(), np.zeros()/np.zeros_like(),
# and np.full()/np.full_like().

print(nd1)
print() 
print(nd2)
print() 
print(nd3)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[1 1 1]
 [1 1 1]
 [1 1 1]]

[[5 5]
 [5 5]]
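The difference between the explicit-shape constructors and their `_like` counterparts can be sketched as follows. The array `template` is a made-up example; the last line shows one caveat worth knowing, namely that the fill value is cast to the template's dtype.

```python
import numpy as np

template = np.array([[1, 2, 3], [4, 5, 6]])   # integer dtype, shape (2, 3)

zeros_explicit = np.zeros((2, 3))             # shape given explicitly, float64 by default
zeros_like = np.zeros_like(template)          # shape and integer dtype copied from template
sevens_like = np.full_like(template, 7)       # same shape and dtype, every item set to 7

# Caveat: the fill value is cast to the template's dtype
truncated = np.full_like(template, 2.5)       # items become 2, not 2.5

print(zeros_explicit.dtype, zeros_like.dtype)
print(sevens_like)
print(truncated)
```

Passing `dtype=float` to `np.full_like` is one way to keep a fractional fill value.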


#### Operations on an ndarray

| Operator | Equivalent function |
| --- | --- |
| `+` | `numpy.add(X, Y)` |
| `-` | `numpy.subtract(X, Y)` |
| `*` | `numpy.multiply(X, Y)` |
| `/` | `numpy.divide(X, Y)` |
| `**` | `numpy.power(X, Y)` |
| `%` | `numpy.mod(X, Y)` |
| `//` | `numpy.floor_divide(X, Y)` |
| `@` | `numpy.matmul(X, Y)` |

`@` performs matrix multiplication of its arguments. (The @ operator was introduced in Python 3.5.)

In [10]:
nd1 = np.array([[1,2,3],[4,5,6],[7,8,9]])

print(nd1)
print()
print(nd1 + 5)
print()
print(nd1 % 2)
print()
print(np.matmul(nd1, nd1 % 2))

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[[ 6  7  8]
 [ 9 10 11]
 [12 13 14]]

[[1 0 1]
 [0 1 0]
 [1 0 1]]

[[ 4  2  4]
 [10  5 10]
 [16  8 16]]


#### Other Methods


.astype()
Converts a NumPy array to a specified data type.


.fill()
Fills a NumPy array with a specified scalar value.


.flatten()
Returns a one-dimensional copy of the array, whatever its original number of dimensions.


.reshape()
Returns an array with the same data arranged in a new shape; the total number of items must stay the same.


.tolist()
Converts an array into a nested list of Python scalars.


.transpose()
Reverses or permutes the axes of an ndarray.
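The methods listed above can be tried on a small array. This is a minimal sketch with made-up values; the comments show what each call is expected to return.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

as_float = arr.astype(float)     # new array with float64 items
flat = arr.flatten()             # 1D copy: [1 2 3 4 5 6]
reshaped = arr.reshape(3, 2)     # same items, new shape (3, 2)
transposed = arr.transpose()     # axes reversed, shape (3, 2)
as_list = arr.tolist()           # nested Python list: [[1, 2, 3], [4, 5, 6]]

filled = np.empty((2, 2), dtype=int)
filled.fill(9)                   # works in place and returns None

print(as_float.dtype)            # float64
print(flat)
print(reshaped)
print(transposed)
print(as_list)
print(filled)
```

Note that `reshape` and `transpose` reorder the same six items differently: `reshaped` reads them row by row, while `transposed` swaps rows and columns.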



In [12]:
# 2D array with 3 rows and 4 columns
array_2d = np.array([[1, 2, 3, 4],
                     [5, 6, 7, 8],
                     [9, 10, 11, 12]])

print(array_2d)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [16]:
#min val across the entire array
min_val = np.min(array_2d)

#max val across the entire array
max_val = np.max(array_2d)

# Compute the minimum value along each column (axis=0)
min_value_col = np.min(array_2d, axis=0)

# Compute the maximum value along each column (axis=0)
max_value_col = np.max(array_2d, axis=0)

# Compute the minimum value along each row (axis=1)
min_value_row = np.min(array_2d, axis=1)

# Compute the maximum value along each row (axis=1)
max_value_row = np.max(array_2d, axis=1)

# Display the results
print(f"Min value of the entire array: {min_val}")
print(f"Max value of the entire array: {max_val}")
print(f"Min value along each column: {min_value_col}")
print(f"Max value along each column: {max_value_col}")
print(f"Min value along each row: {min_value_row}")
print(f"Max value along each row: {max_value_row}")

Min value of the entire array: 1
Max value of the entire array: 12
Min value along each column: [1 2 3 4]
Max value along each column: [ 9 10 11 12]
Min value along each row: [1 5 9]
Max value along each row: [ 4  8 12]


#### Converting a Pandas DataFrame to a NumPy ndarray

To convert a Pandas DataFrame into a NumPy ndarray, use the .to_numpy() method, which returns the DataFrame's data as a NumPy array.

#####  When to Convert:
If you need to perform complex numerical or matrix operations (such as using matrix factorization, linear algebra, etc.).

If you want to use the array in machine learning frameworks or scientific libraries that accept NumPy arrays.

If you no longer need the rich functionality of Pandas (like indexing, missing data handling, or column labels), and want to focus on efficient computation.


##### When Not to Convert:
If your data involves mixed types (e.g., numeric and categorical): NumPy arrays hold a single dtype, so mixed columns are typically converted to the generic object dtype, which loses most of NumPy's speed advantage.

If you need to make frequent use of Pandas’ high-level features, such as groupby, filtering, or complex indexing. In this case, keeping the data in a Pandas DataFrame is more beneficial.


#### Key Points:
A Pandas DataFrame is always two-dimensional (rows and columns).

A DataFrame with a single column is still two-dimensional, with shape (n, 1); the one-dimensional Pandas structure is the Series.

Higher dimensions (3D and beyond) aren’t natively supported by DataFrames. They can be simulated with a MultiIndex; the older Panel structure was deprecated and has since been removed. For true multidimensional support, a library such as xarray is usually a better fit.
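The points about mixed types and dimensionality can be checked with a few toy DataFrames. This is a sketch with invented column names (`a`, `b`, `label`), not data from any real source.

```python
import pandas as pd

numeric_df = pd.DataFrame({"a": [1, 2], "b": [0.5, 1.5]})
mixed_df = pd.DataFrame({"a": [1, 2], "label": ["x", "y"]})
single_col_df = pd.DataFrame({"a": [1, 2, 3]})

numeric_arr = numeric_df.to_numpy()   # ints and floats are upcast to float64
mixed_arr = mixed_df.to_numpy()       # numbers and strings fall back to dtype object

print(numeric_arr.dtype)                         # float64
print(mixed_arr.dtype)                           # object
print(single_col_df.shape, single_col_df.ndim)   # (3, 1) 2
print(single_col_df["a"].ndim)                   # 1 (selecting the column gives a Series)
```

An object array stores references to Python objects, so arithmetic on it runs at roughly Python-loop speed; selecting only the numeric columns before converting avoids this.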




In [18]:
import pandas as pd

# Create a simple DataFrame
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9]}

df = pd.DataFrame(data)

# Convert DataFrame to ndarray
ndarray = df.to_numpy()

print("DataFrame:")
print(df)
print("\nConverted ndarray:")
print(ndarray)


DataFrame:
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

Converted ndarray:
[[1 4 7]
 [2 5 8]
 [3 6 9]]


#### array vs. ndarray

In Python, both array and ndarray hold sequences of data, but they belong to different libraries and differ in important ways. Here's a breakdown of the differences:

1. Origin and Libraries:

array: This is part of Python's standard library (array module) and is used for creating arrays of homogeneous data types (i.e., arrays where all elements must be of the same type).

ndarray: This is part of NumPy, a popular library for numerical computing in Python. The ndarray is NumPy's core data structure and is used for handling n-dimensional arrays of homogeneous data types.

2. Capabilities:

array:

Limited functionality compared to ndarray.

Primarily for basic, lower-level array operations (such as iteration, slicing, and element access).

Supports fewer operations for mathematical and vectorized calculations.


ndarray:

Supports a wide range of operations, including element-wise arithmetic, linear algebra, broadcasting, and many other powerful functions.

Built to handle large multi-dimensional arrays efficiently.

Integrates with other libraries for numerical and scientific computing (e.g., SciPy, pandas).

3. Performance:

array: Suitable for smaller-scale arrays or when you don't need advanced mathematical functionality. Numerical work on it is generally slower than with ndarray because it offers no vectorized operations, so element-wise math has to run in a Python loop.

ndarray: Highly optimized for numerical operations, especially with large arrays. NumPy arrays are implemented in C and are much faster for numerical computing.

4. Dimensionality:

array: Strictly one-dimensional; it has no notion of shape or multiple axes.

ndarray: Can represent arrays of any number of dimensions (1D, 2D, 3D, etc.), allowing you to work with complex structures like matrices and tensors.


5. Memory:

array: More memory-efficient for simple arrays of homogeneous data types (such as integers or floats) compared to Python lists.

ndarray: More efficient in terms of memory layout and handling of large datasets. It uses contiguous blocks of memory to store the data, which leads to better cache locality and faster access.


6. Data Type Support:

array: Only supports a limited set of basic types selected by a type code: integers of various widths, floats, and Unicode characters.

ndarray: Supports a wide variety of numerical types, including integers, floats, complex numbers, and even custom data types via structured arrays.

7. Slicing and Indexing:

array: Supports basic slicing and indexing operations, but its capabilities are limited compared to NumPy arrays.

ndarray: Supports advanced indexing, slicing, and fancy indexing. It also supports multidimensional slicing, which is essential when working with matrices or higher-dimensional arrays.



In [21]:
# Using Python array
import array
arr = array.array('d', [1.0, 2.5, 3.7])
print(arr)

# Using NumPy ndarray
import numpy as np
arr_np = np.array([1.0, 2.5, 3.7])
print(arr_np)


array('d', [1.0, 2.5, 3.7])
[1.  2.5 3.7]
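One practical consequence of the capability gap described above is how the `*` operator behaves. The sketch below reuses the same three values; the names `py_arr` and `np_arr` are illustrative.

```python
import array
import numpy as np

py_arr = array.array('d', [1.0, 2.5, 3.7])
np_arr = np.array([1.0, 2.5, 3.7])

repeated = py_arr * 2    # sequence repetition, as with a list: 6 items
doubled = np_arr * 2     # element-wise arithmetic: 3 items

# Element-wise math on array.array needs an explicit loop or comprehension
doubled_py = array.array('d', [x * 2 for x in py_arr])

print(repeated)      # array('d', [1.0, 2.5, 3.7, 1.0, 2.5, 3.7])
print(doubled)       # [2.  5.  7.4]
print(doubled_py)    # array('d', [2.0, 5.0, 7.4])
```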


#### Why Vectorization Is Important in NumPy:

Performance: Vectorized operations are executed in C under the hood, which is significantly faster than Python loops.

Code Simplicity: Vectorized operations simplify the code and make it more readable.

Memory Efficiency: NumPy stores items in a compact, typed buffer rather than as individual Python objects. Chained expressions can still allocate temporary arrays, which in-place operators (such as +=) or a ufunc's out argument can avoid.

### Notes:

1. Scalar and Array Operations

You can perform operations between scalars and arrays, and NumPy automatically broadcasts the scalar value to each element of the array.

2. Mathematical Functions

NumPy provides many built-in vectorized functions (ufuncs) that can be applied directly to arrays, e.g., np.sqrt(arr).

3. Broadcasting (Handling Arrays of Different Shapes)

Broadcasting is a powerful feature of NumPy that allows operations on arrays of different shapes without the need for explicit replication.

4. Conditional Vectorized Operations (Using np.where)

NumPy allows you to apply conditional operations on arrays, which can be vectorized efficiently.

5. Vectorized Aggregations (Sum, Mean, etc.)

You can perform aggregations like sum, mean, and standard deviation on entire arrays or along specific axes without the need for loops.

6. Vectorized Linear Algebra Operations

NumPy also supports vectorized operations for linear algebra, like matrix multiplication.
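Each of the six notes above can be illustrated in a single line. This is a minimal sketch on a made-up 2x3 array; the numbers in the comments refer to the notes.

```python
import numpy as np

arr = np.array([[1.0, 4.0, 9.0],
                [16.0, 25.0, 36.0]])

scaled = arr * 2                          # 1. scalar broadcast to every item
roots = np.sqrt(arr)                      # 2. ufunc applied element-wise
shifted = arr + np.array([10, 20, 30])    # 3. shape (3,) broadcast across both rows
capped = np.where(arr > 10, 10, arr)      # 4. conditional selection without a loop
col_sums = arr.sum(axis=0)                # 5. aggregation along axis 0 (per column)
row_means = arr.mean(axis=1)              # 5. aggregation along axis 1 (per row)
gram = arr @ arr.T                        # 6. matrix product, shape (2, 2)

print(roots)
print(shifted)
print(capped)
print(col_sums, row_means)
print(gram)
```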


In [24]:
#without vectorization, using loops

# Two 1D arrays
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Using loops
result = np.zeros_like(arr1)
for i in range(len(arr1)):
    result[i] = arr1[i] + arr2[i]
print(result)  # Output: [ 6  8 10 12]



#with vectorization

# Two 1D arrays
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Vectorized operation
result = arr1 + arr2
print(result)  # Output: [ 6  8 10 12]



[ 6  8 10 12]
[ 6  8 10 12]


### Problem 1: Calculate Euclidean distance



In [25]:
import numpy as np

# Two points
point1 = np.array([1, 2, 3])
point2 = np.array([4, 5, 6])

# Euclidean distance is the L2 norm of the difference vector
distance = np.linalg.norm(point1 - point2)

print("Euclidean Distance:", distance)

Euclidean Distance: 5.196152422706632


### Problem 2: Find all occurrences of an element in a list

In [32]:
#Using python loops

my_list = [1, 2, 3, 2, 4, 2, 5, 6, 5]
element = 2 

indices = []

for i in range(len(my_list)):
    if my_list[i] == element:
        indices.append(i)

print(indices)




#using enumerate + list comprehension

element2 = 5

indices2 = [i for i, x in enumerate(my_list) if x == element2]

print(indices2)

#using numpy where

element3 = 3

my_array = np.array(my_list)  # keep my_list intact instead of rebinding it
indices3 = np.where(my_array == element3)[0]

print(indices3)



[1, 3, 5]
[6, 8]
[2]


### Problem 3: Access the i-th column of a NumPy multidimensional array



In [36]:
# A 2D array (3x4 matrix)
arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9, 10, 11, 12]])

print(arr[:,1])  #i = 1

[ 2  6 10]


### Problem 4: Convert array of indices to one-hot encoded numpy array
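One possible approach, sketched below, is to index into an identity matrix or to set one entry per row with integer-array indexing. The input `indices` and the value of `num_classes` are assumptions made for the example.

```python
import numpy as np

indices = np.array([0, 2, 1, 2])
num_classes = 3   # assumed known; indices.max() + 1 is a common fallback

# Approach 1: pick rows of the identity matrix
one_hot_eye = np.eye(num_classes, dtype=int)[indices]

# Approach 2: start from zeros and set one entry per row
one_hot = np.zeros((indices.size, num_classes), dtype=int)
one_hot[np.arange(indices.size), indices] = 1

print(one_hot)
```

Both approaches give the same result here; the second avoids building the full identity matrix when the number of classes is large.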

### Problem 5: Convert numpy array to categorical array

In [39]:
# Numerical array
numerical_array = np.array([1, 2, 3, 4, 5])

# Categorical Array
cat_array = numerical_array.astype(str)

print(cat_array)
print(numerical_array)

['1' '2' '3' '4' '5']
[1 2 3 4 5]


### Problem 6: How would you get N-max values in a numpy array?

#### Hint:
Use np.sort() or np.argsort() for small to medium arrays.

Use np.argpartition() for large arrays or when performance is critical.
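The hint can be sketched as follows, with a made-up array `values` and `n = 3`. Note that `np.argpartition` only guarantees that the n largest items land in the last n slots, in no particular order, so they are sorted afterwards if order matters.

```python
import numpy as np

values = np.array([7, 1, 9, 3, 8, 2])
n = 3

# Full sort, then take the last n items (ascending order)
top_sorted = np.sort(values)[-n:]                    # [7 8 9]

# Indices of the n largest items, largest first
top_idx = np.argsort(values)[-n:][::-1]              # [2 4 0]

# Partial ordering: cheaper than a full sort on large arrays
part_idx = np.argpartition(values, -n)[-n:]
top_partitioned = np.sort(values[part_idx])[::-1]    # sort only the n winners: [9 8 7]

print(top_sorted, top_idx, top_partitioned)
```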

### Problem 7: Find local maxima in a 1D and 3D array?
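A possible starting point, assuming a "local maximum" means an interior item strictly greater than its immediate neighbours along each axis (edges and plateaus are ignored). The helper names `local_maxima_1d` and `local_maxima_3d` are hypothetical, not NumPy functions.

```python
import numpy as np

def local_maxima_1d(x):
    """Indices of interior items strictly greater than both neighbours."""
    inner = x[1:-1]
    mask = (inner > x[:-2]) & (inner > x[2:])
    return np.flatnonzero(mask) + 1          # +1 maps back to indices in x

def local_maxima_3d(a):
    """Indices of interior items strictly greater than their 6 face neighbours."""
    c = a[1:-1, 1:-1, 1:-1]
    mask = (
        (c > a[:-2, 1:-1, 1:-1]) & (c > a[2:, 1:-1, 1:-1])
        & (c > a[1:-1, :-2, 1:-1]) & (c > a[1:-1, 2:, 1:-1])
        & (c > a[1:-1, 1:-1, :-2]) & (c > a[1:-1, 1:-1, 2:])
    )
    return np.argwhere(mask) + 1             # +1 on every axis

signal = np.array([1, 3, 2, 5, 4, 4, 6, 0])
peaks_1d = local_maxima_1d(signal)

volume = np.zeros((4, 4, 4))
volume[1, 1, 1] = 5
volume[2, 2, 1] = 3
peaks_3d = local_maxima_3d(volume)

print(peaks_1d)    # [1 3 6]
print(peaks_3d)    # [[1 1 1] [2 2 1]] printed as a 2x3 array
```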

### Problem 8: np.transpose() vs np.einsum() with example
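A brief comparison, sketched on made-up arrays: `np.transpose` only permutes axes, while `np.einsum` describes the permutation with subscript labels and can combine it with other operations such as a sum.

```python
import numpy as np

mat = np.arange(6).reshape(2, 3)
cube = np.arange(24).reshape(2, 3, 4)

# 2D: both swap rows and columns
t_mat = np.transpose(mat)
e_mat = np.einsum('ij->ji', mat)

# 3D: an explicit axis order in transpose maps to a reordered output subscript
t_cube = np.transpose(cube, (1, 0, 2))    # shape (3, 2, 4)
e_cube = np.einsum('ijk->jik', cube)

# einsum can also reduce: dropping an index from the output sums over it
col_sums = np.einsum('ij->j', mat)        # same as mat.sum(axis=0)

print(np.array_equal(t_mat, e_mat))       # True
print(np.array_equal(t_cube, e_cube))     # True
print(col_sums)                           # [3 5 7]
```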

### Problem 9: Reverse a numpy array for 1D and 2D
#### Also practice for 3D

In [40]:
# Example 1D array
arr = np.array([1, 2, 3, 4, 5])

# Reverse the array
reversed_arr = arr[::-1]

print("Original Array:", arr)
print("Reversed Array:", reversed_arr)


Original Array: [1 2 3 4 5]
Reversed Array: [5 4 3 2 1]


In [53]:
import numpy as np

# Example 2D array
arr_2D = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Reverse the array along the rows
reversed_rows = arr_2D[::-1]

print("Original Array:\n", arr_2D)
print("Reversed Rows:\n", reversed_rows)

#along columns
rev_col = arr_2D[:,::-1]
print("Reversed Columns:\n", rev_col)

#both rows and columns
rev_both = arr_2D[::-1,::-1]
print("Reversed both:\n", rev_both)

Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed Rows:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]
Reversed Columns:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]
Reversed both:
 [[9 8 7]
 [6 5 4]
 [3 2 1]]
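For the 3D practice case, the same slicing pattern extends by one axis. The sketch below uses a small made-up array and checks the slices against `np.flip`, which expresses the same reversals by axis.

```python
import numpy as np

arr_3d = np.arange(8).reshape(2, 2, 2)

rev_axis0 = arr_3d[::-1]               # reverse the outermost axis only
rev_last = arr_3d[..., ::-1]           # reverse the innermost axis only
rev_all = arr_3d[::-1, ::-1, ::-1]     # reverse every axis

same_axis0 = np.array_equal(rev_axis0, np.flip(arr_3d, axis=0))
same_all = np.array_equal(rev_all, np.flip(arr_3d))

print(rev_all)
print(same_axis0, same_all)            # True True
```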


### Problem 10: test[:,0] vs test[:,[0]]
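The two expressions select the same column but differ in shape and in whether the result shares memory with the original. A minimal sketch, with `test` as a made-up 2x3 array:

```python
import numpy as np

test = np.array([[1, 2, 3],
                 [4, 5, 6]])

col_1d = test[:, 0]      # integer index drops the axis: shape (2,)
col_2d = test[:, [0]]    # list index keeps the axis: shape (2, 1)

print(col_1d.shape, col_2d.shape)        # (2,) (2, 1)
print(np.shares_memory(test, col_1d))    # True: basic indexing gives a view
print(np.shares_memory(test, col_2d))    # False: a list index triggers a copy
```

If a 2D column is needed without a copy, `test[:, 0:1]` keeps the axis while still using basic slicing.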

### Problem 11: Create subarrays with a sliding window of size 4

##### E.g input_arr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], 
 output =  [[1, 2, 3, 4], 
 [2, 3, 4, 5], 
 [3, 4, 5, 6], 
 [4, 5, 6, 7], 
 [5, 6, 7, 8], 
 [6, 7, 8, 9], 
 [7, 8, 9, 10], 
 [8, 9, 10, 11], 
 [9, 10, 11, 12], 
 [10, 11, 12, 13], 
 [11, 12, 13, 14]]



In [None]:
arr_4 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
window = 4

# Create subarrays with a sliding window of size 4
result = [arr_4[i:i + window] for i in range(len(arr_4) - window + 1)]

print(result)


'''
Explanation:
range(len(arr_4) - window + 1):
Stops at the last index from which a complete subarray of size 4 can still be formed.
arr_4[i:i + window]:
Slices the list from index i up to, but not including, i + window.
The result is a list of overlapping subarrays of size 4.
'''
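A NumPy alternative to the list comprehension is `sliding_window_view`, available in `numpy.lib.stride_tricks` from NumPy 1.20 onward. The sketch below assumes that version or later and rebuilds the same input with `np.arange`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

input_arr = np.arange(1, 15)                   # 1..14
windows = sliding_window_view(input_arr, 4)    # shape (11, 4), read-only view

print(windows.shape)             # (11, 4)
print(windows[0], windows[-1])   # [1 2 3 4] [11 12 13 14]
```

The result is a view onto the original data rather than 11 separate copies, so it stays cheap for long inputs; call `.copy()` if the windows need to be modified.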
