
### **Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?**

**NumPy**, short for Numerical Python, is a powerful library in Python specifically designed for scientific computing and data analysis. Its primary purpose is to enable efficient handling and processing of large arrays and matrices of numerical data. Here’s a breakdown of its core purpose, key features, and advantages in enhancing Python's numerical computing capabilities:

**Purpose of NumPy in Scientific Computing and Data Analysis**

**Efficient Array Operations:** NumPy provides the ndarray object, which represents multi-dimensional arrays and matrices. This is the core structure around which most of its functionality revolves, allowing for fast and efficient operations on large datasets.

**Mathematical Functions:** It includes an extensive set of mathematical functions and operations specifically optimized for array processing, which are essential in fields like physics, engineering, and machine learning.

**Linear Algebra, Fourier Transforms, and Random Number Generation:** NumPy also provides specialized modules (np.linalg, np.fft, and np.random) for linear algebra operations, Fourier analysis, and random number generation, which are often necessary in scientific applications.

**Interfacing with Other Libraries:** Many libraries in the Python scientific stack, such as SciPy, Pandas, and Matplotlib, build on or rely on NumPy arrays, making it the foundational library in Python’s scientific ecosystem.

**Advantages of Using NumPy**

**Performance Optimization:** NumPy’s core operations are implemented in C, so numerical computations run much faster than equivalent Python code working on standard lists. NumPy uses vectorization, which applies operations to entire arrays instead of looping through elements in Python, leading to substantial speedups in computations.

**Memory Efficiency**: Unlike Python lists, which are dynamically typed, NumPy arrays have a fixed data type and are more memory-efficient. This efficiency is crucial when dealing with large datasets, as it reduces the overall memory footprint.

**Broad Array Operations:** With NumPy, users can perform complex array operations like element-wise arithmetic, broadcasting (applying operations on arrays of different shapes), and indexing, which are difficult to handle efficiently in pure Python.

**Vectorized Operations:** By eliminating explicit Python loops, NumPy’s vectorized operations make code cleaner, more readable, and faster. For example, mathematical operations can be directly applied across an entire array without requiring loops, which reduces computational time and simplifies code.

**Easy Integration with Other Languages:** NumPy arrays keep their data in a plain memory buffer that code written in C and Fortran can read and write directly, and these languages are widely used in high-performance computing. This interoperability is crucial for using Python in fields that traditionally relied on lower-level languages for computational efficiency.
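
To make the memory-efficiency point concrete, here is a minimal sketch that compares the raw buffer of an int64 array with a rough estimate of the equivalent Python list. The variable names are illustrative, and the list figure is only an approximation based on `sys.getsizeof` under CPython; exact numbers vary by Python version and platform.

```python
import sys
import numpy as np

n = 1_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Bytes used by the array's data buffer: 1000 elements * 8 bytes each.
array_bytes = np_arr.nbytes

# Rough list footprint: the list's table of references plus every int object.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

print("ndarray buffer:", array_bytes, "bytes")
print("list (approx.):", list_bytes, "bytes")
```

The array figure excludes the small fixed header of the ndarray object itself, which becomes negligible as the array grows.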

**Enhancing Python for Numerical Operations**

In summary, NumPy turns Python from a general-purpose language into a powerful tool for scientific computing and data analysis. Its capabilities enable researchers and data scientists to work with large datasets more efficiently, making it foundational for any numerical work in Python.

### **Compare and contrast the np.mean() and np.average() functions in NumPy. When would you use one over the other?**

**np.mean()** and **np.average()** are NumPy functions used to calculate the central tendency of data, but they serve slightly different purposes and offer different functionalities.
Here’s a detailed comparison and contrast of the two, along with guidance on when to use each:

**np.mean():**

**Purpose:** Computes the simple arithmetic mean of the elements along a specified axis.

**Usage:** np.mean(array, axis=None, dtype=None, out=None, keepdims=False)

**Weighted Calculations:** np.mean() does not support weights; it only calculates the plain mean, in which every element counts equally.

**Performance:** Because it computes only an unweighted mean, it has slightly less work to do than np.average() and is the simpler choice when weights are not needed.

In [None]:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
np.mean(data)


3.0

**np.average()**

**Purpose:** Computes the weighted average if weights are provided; otherwise, it behaves similarly to np.mean().

**Usage:** np.average(array, axis=None, weights=None, returned=False)

**Weighted Calculations:** The key feature of np.average() is its ability to calculate a weighted average when the weights parameter is specified. This allows you to give different elements different levels of importance: for values x and weights w, the result is sum(w * x) / sum(w).

**Return Weights:** It has an optional returned parameter, which, if set to True, will return a tuple with the calculated average and the sum of the weights. This can be useful for further calculations or error-checking.

In [None]:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 1, 1, 5])
np.average(data, weights=weights)


3.7
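
As a small sketch of the `returned` parameter described above, the snippet below reuses the same data and weights and checks the result against the weighted-average formula written out by hand. The names `weight_sum` and `manual` are illustrative only.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 1, 1, 5])

# returned=True yields a tuple: (weighted average, sum of the weights).
avg, weight_sum = np.average(data, weights=weights, returned=True)

# The same quantity computed directly from the definition.
manual = (data * weights).sum() / weights.sum()

print(avg, weight_sum)
print(np.isclose(avg, manual))   # True
```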

**When to Use Each Function**

**Use np.mean():** When you want the plain arithmetic mean of an array or along an axis, especially when weights are irrelevant.

**Use np.average():** When you need to calculate a weighted average. For example, in cases where certain data points have more significance than others (like averaging scores where some assessments have higher importance).
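
The assessment example can be sketched as follows. The scores and weights are made-up illustrative values, with each row holding one student's three assessments and the final assessment counting for half of the grade.

```python
import numpy as np

# Hypothetical data: rows are students, columns are assessments.
scores = np.array([[80, 90, 70],
                   [60, 75, 95]])
weights = np.array([0.2, 0.3, 0.5])   # one weight per assessment (column)

# Plain mean per student: every assessment counts equally.
plain = np.mean(scores, axis=1)

# Weighted average per student: the weights apply along axis 1.
weighted = np.average(scores, axis=1, weights=weights)

print("Plain mean:      ", plain)
print("Weighted average:", weighted)
```

The second student ends up with the higher weighted average because their strongest score falls on the most heavily weighted assessment.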

### **Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.**

Reversing a NumPy array can be done easily using slicing and specific NumPy functions to reverse the array along different axes. Here’s how to reverse a 1D and 2D array along various axes:

1. Reversing a 1D Array
For a 1D array, reversing means flipping the order of elements from the last to the first.

Method: Using Slicing
You can reverse a 1D array by slicing it with [::-1], which steps through the array from the end to the beginning.

In [None]:
import numpy as np

# 1D Array
arr1d = np.array([1, 2, 3, 4, 5])
reversed_arr1d = arr1d[::-1]
print("Original 1D Array:", arr1d)
print("Reversed 1D Array:", reversed_arr1d)


Original 1D Array: [1 2 3 4 5]
Reversed 1D Array: [5 4 3 2 1]


2. Reversing a 2D Array

In a 2D array, you have the option to reverse along:

Rows (axis 0): Reverse the order of the rows.

Columns (axis 1): Reverse the order of the columns.

Entire Array: Reverse both rows and columns.

Method: Using Slicing

Reverse along rows (axis 0): array[::-1, :]

Reverse along columns (axis 1): array[:, ::-1]

Reverse entire array: array[::-1, ::-1]

In [None]:
# 2D Array
arr2d = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Reverse along rows (axis 0)
reversed_rows = arr2d[::-1, :]
print("Reversed Rows:\n", reversed_rows)

# Reverse along columns (axis 1)
reversed_columns = arr2d[:, ::-1]
print("Reversed Columns:\n", reversed_columns)

# Reverse entire array
reversed_entire = arr2d[::-1, ::-1]
print("Reversed Entire Array:\n", reversed_entire)


Reversed Rows:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]
Reversed Columns:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]
Reversed Entire Array:
 [[9 8 7]
 [6 5 4]
 [3 2 1]]


Alternative: Using np.flip()
NumPy’s np.flip() function can also be used for reversing along specified axes:

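np.flip(arr, axis=0): Reverses the order of the rows (axis 0).

np.flip(arr, axis=1): Reverses the order of the columns (axis 1).

np.flip(arr): With no axis given, reverses every axis, so a 2D array has both its rows and its columns reversed.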

In [None]:
# Reverse rows using np.flip
reversed_rows_flip = np.flip(arr2d, axis=0)
print("Reversed Rows using np.flip:\n", reversed_rows_flip)

# Reverse columns using np.flip
reversed_columns_flip = np.flip(arr2d, axis=1)
print("Reversed Columns using np.flip:\n", reversed_columns_flip)

# Reverse entire array using np.flip
reversed_entire_flip = np.flip(arr2d)
print("Reversed Entire Array using np.flip:\n", reversed_entire_flip)


Reversed Rows using np.flip:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]
Reversed Columns using np.flip:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]
Reversed Entire Array using np.flip:
 [[9 8 7]
 [6 5 4]
 [3 2 1]]
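
One detail worth knowing is that both slicing with `[::-1]` and `np.flip()` return views rather than copies. The sketch below, using an illustrative 2x3 array, shows the consequence: writing through the reversed array changes the original, so an explicit `.copy()` is needed when the two must be independent.

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)      # [[0 1 2], [3 4 5]]

# A tuple of axes flips along each listed axis.
flipped = np.flip(arr, axis=(0, 1))
same_as_slicing = np.array_equal(flipped, arr[::-1, ::-1])

# The flipped array shares memory with the original.
is_view = np.shares_memory(arr, flipped)

# Writing through the view changes the original array too.
flipped[0, 0] = 99                    # flipped[0, 0] refers to arr[1, 2]
print(arr)

# Take an explicit copy when the reversed array must be independent.
independent = np.flip(arr, axis=0).copy()
```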


### **How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance**

In NumPy, you can determine the data type of elements in an array using the .dtype attribute, which provides information about the type of data stored in the array. Here’s a closer look at how to check data types in NumPy and why managing data types is crucial for memory efficiency and computational performance.

Determining the Data Type of Elements in a NumPy Array
Using .dtype Attribute

The .dtype attribute of a NumPy array returns the data type of the elements in the array.

In [None]:
import numpy as np
arr = np.array([1, 2, 3])
print(arr.dtype)  # Output: int64 (or int32 depending on the system)


int64


Specifying Data Type When Creating an Array

You can specify the data type explicitly when creating a NumPy array using the dtype parameter, which helps in memory optimization.

In [None]:
arr_float = np.array([1, 2, 3], dtype=np.float32)
print(arr_float.dtype)

float32


Using .astype() for Type Conversion

The .astype() method returns a new array converted to the specified data type, which can help in cases where you need a particular precision or memory optimization. Converting floats to integers discards the fractional part.

In [None]:
arr_int = arr_float.astype(np.int32)
print(arr_int.dtype)  # Output: int32


int32


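Importance of Data Types in Memory Management and Performance

Memory Efficiency

Data Type and Size: Different data types use different amounts of memory. For example, int32 uses 4 bytes per element, whereas int64 uses 8 bytes. By choosing the appropriate data type, you can reduce memory usage, which is essential when working with large datasets.

Impact of Precision: Using a larger data type than necessary wastes memory. For instance, if a dataset contains only small integers, storing it in int8 or int16 rather than int64 can save a significant amount of memory. The trade-off is range: int8 holds only values from -128 to 127, so the chosen type must be wide enough for the data.

Performance: Smaller elements mean less data to move between memory and the CPU, so operations on large arrays generally run faster when the data type is no wider than the data requires.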
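
The size difference can be checked directly with the `itemsize` and `nbytes` attributes. This is a minimal sketch with illustrative values; the final range check is a hand-written guard before downcasting, not a built-in NumPy safety feature.

```python
import numpy as np

values = np.arange(100)               # values 0..99 fit in a signed 8-bit integer

as_int64 = values.astype(np.int64)
as_int8 = values.astype(np.int8)

print(as_int64.itemsize, as_int64.nbytes)   # 8 bytes per element, 800 in total
print(as_int8.itemsize, as_int8.nbytes)     # 1 byte per element, 100 in total

# np.iinfo reports the representable range of an integer type.
int8_range = (np.iinfo(np.int8).min, np.iinfo(np.int8).max)
print(int8_range)                            # (-128, 127)

# Illustrative guard: only downcast when every value fits in the target type.
fits_in_int8 = values.min() >= int8_range[0] and values.max() <= int8_range[1]
print(fits_in_int8)                          # True
```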

### **Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?**

In NumPy, an ndarray (short for N-dimensional array) is the core data structure used for representing multi-dimensional, homogeneous data arrays. Unlike standard Python lists, which are flexible but less efficient for numerical computations, ndarrays are specifically optimized for handling large volumes of numerical data in scientific computing.

**Key Features of ndarrays**

**Homogeneous Data Type:** Every element in an ndarray has the same data type (e.g., int32, float64), which allows for more efficient memory management and faster computations.

**Multi-dimensional:** ndarrays can represent data in multiple dimensions (1D, 2D, 3D, etc.), making it easy to work with complex datasets like matrices and higher-dimensional tensors.

**Fixed Size:** The size of an ndarray is fixed when it is created. Functions that appear to grow an array, such as np.append(), return a new array instead. The shape can still be changed, as long as the total number of elements stays the same.

**Memory Efficiency:** A newly created ndarray stores its data in one contiguous block of memory, which improves data access speed and minimizes memory overhead, especially with large datasets.

**Supports Vectorized Operations:** ndarrays allow vectorized operations, meaning that mathematical operations can be applied to entire arrays at once without explicit loops. This is faster than iterating through elements individually, as you would with a list.

**Advanced Indexing and Slicing:** ndarrays support rich slicing and indexing for selecting, modifying, or transforming specific parts of an array. Basic slices return views that share memory with the original, so no data is copied; indexing with integer or boolean arrays returns a copy.

**Broadcasting:** Broadcasting allows operations between arrays of different shapes by "stretching" smaller arrays to match the shape of larger ones when possible, avoiding explicit looping and simplifying mathematical operations.

**Mathematical and Statistical Functions:** NumPy provides a large suite of mathematical functions optimized for ndarrays, allowing easy calculations of mean, median, standard deviation, and other operations.


In [None]:
import numpy as np

# Creating a 2D ndarray (matrix)
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(array_2d)
print("Shape:", array_2d.shape)
print("Data type:", array_2d.dtype)


[[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Data type: int64
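
A few of the features listed above can be observed directly. The following is a small sketch with illustrative values showing the single shared data type, the difference between a slice (a view) and list-based indexing (a copy), and reshaping without changing the element count.

```python
import numpy as np

# Mixed input is converted to one common dtype (here float64).
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                    # float64

a = np.arange(6)

# Basic slicing returns a view: writes show up in the original.
window = a[1:4]
window[0] = 100
print(a[1])                           # 100

# Indexing with a list of positions returns a copy: the original is untouched.
picked = a[[0, 2]]
picked[0] = -1
print(a[0])                           # 0

# Reshaping changes the shape but not the number of elements.
grid = a.reshape(2, 3)
print(grid.size == a.size)            # True
```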


**Why Use ndarrays Over Python Lists?**

For data-intensive applications, ndarrays offer substantial advantages in terms of speed, memory efficiency, and ease of use. By choosing ndarrays, you benefit from optimized, pre-built numerical operations, efficient memory layout, and sophisticated manipulation capabilities, all of which are critical in scientific computing and data analysis.

### **Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations**

NumPy arrays (ndarrays) offer significant performance benefits over Python lists for large-scale numerical operations. These benefits stem from several underlying optimizations that make ndarrays more efficient for handling large volumes of numerical data. Here’s a breakdown of the key performance benefits and why they make NumPy arrays superior for large-scale numerical tasks:

1. Contiguous Memory Layout and Cache Efficiency
NumPy arrays store elements in contiguous blocks of memory, which makes them cache-friendly. The CPU loads neighbouring elements together, so reading the array sequentially is fast.

In contrast, Python lists store references to objects that can be scattered across memory, so reading each element means following a pointer to a separate location.

2. Fixed Data Type (Homogeneity)
Each element in a NumPy array has the same data type, so the type is stored once for the whole array instead of with each element. This makes it possible to allocate memory more efficiently and to process data without checking the type of every element on every operation.

Python lists are heterogeneous (they can store different types of data within a single list), so each element is a full Python object carrying its own type information, which increases memory consumption and slows down operations that involve large numbers of elements.

3. Reduced Memory Overhead

NumPy arrays are more memory-efficient. For example, a float64 NumPy array uses 8 bytes per element, while a Python list of floats needs a reference in the list plus a separate float object for each element (typically 8 + 24 bytes in 64-bit CPython).

This reduced memory footprint is particularly valuable for large datasets: more of the data fits in cache and RAM, which helps computation speed and makes larger datasets workable.

4. Vectorized Operations (Avoiding Loops)

NumPy supports vectorized operations, meaning operations can be performed on entire arrays at once without the need for explicit loops. This approach leverages low-level, optimized C implementations behind the scenes, resulting in significant speed improvements.
In contrast, Python lists require explicit looping to apply operations on each element, which can be much slower due to the interpreted nature of Python and the overhead of repeatedly calling functions.

In [None]:
import numpy as np

# NumPy array
arr_np = np.array([1, 2, 3, 4, 5])
arr_np = arr_np * 2  # Vectorized multiplication: no loop needed

# Python list
arr_list = [1, 2, 3, 4, 5]
arr_list = [x * 2 for x in arr_list]  # Explicit loop required


5. Broadcasting

Broadcasting allows NumPy to perform operations on arrays of different shapes by "stretching" the smaller array to match the shape of the larger one. The stretched array is never actually built in memory, which avoids an extra allocation and complex looping logic.

Python lists don’t support broadcasting; you need to use explicit loops to adjust and align data structures for operations, leading to slower and more cumbersome code.

In [None]:
# Broadcasting in NumPy
arr = np.array([1, 2, 3])
arr_broadcasted = arr + 10  # Adds 10 to each element without a loop

# No broadcasting for Python lists
arr_list = [1, 2, 3]
arr_broadcasted_list = [x + 10 for x in arr_list]  # Loop required


 Example of Performance Comparison: NumPy Arrays vs. Python Lists

In [None]:
import numpy as np
import time

# Creating a large NumPy array and a large Python list
size = 10**6
arr_np = np.arange(size)
arr_list = list(range(size))

# Timing NumPy array operation (perf_counter is the clock meant for short intervals)
start_time = time.perf_counter()
result_np = arr_np * 2  # Vectorized operation
end_time = time.perf_counter()
print("NumPy array time:", end_time - start_time)

# Timing Python list operation
start_time = time.perf_counter()
result_list = [x * 2 for x in arr_list]  # Loop-based operation
end_time = time.perf_counter()
print("Python list time:", end_time - start_time)


NumPy array time: 0.008945226669311523
Python list time: 0.17708325386047363
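
A single timed run like the one above can be noisy. As a hedged alternative, the sketch below uses the standard-library `timeit` module and keeps the best of several repeats; the helper names `double_numpy` and `double_list` are illustrative, and the absolute timings depend entirely on the machine.

```python
import timeit
import numpy as np

size = 100_000
arr_np = np.arange(size)
arr_list = list(range(size))

def double_numpy():
    return arr_np * 2                   # vectorized

def double_list():
    return [x * 2 for x in arr_list]    # Python-level loop

# Both approaches must agree before their timings are worth comparing.
results_match = double_numpy().tolist() == double_list()

# Keep the best of several repeats to reduce noise from other processes.
runs = 10
best_np = min(timeit.repeat(double_numpy, number=runs, repeat=3)) / runs
best_list = min(timeit.repeat(double_list, number=runs, repeat=3)) / runs

print(f"NumPy: {best_np:.6f} s per call")
print(f"List:  {best_list:.6f} s per call")
```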


In conclusion, for large-scale numerical operations, NumPy arrays vastly outperform Python lists in both speed and memory efficiency due to contiguous memory storage, homogeneous data types, vectorized operations, and low-level optimizations. This is why NumPy is the preferred choice for scientific computing, data analysis, and machine learning tasks in Python.

### **Compare the vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output**

The vstack() and hstack() functions in NumPy are used to stack arrays vertically and horizontally, respectively. Here’s a breakdown of each function, along with examples to demonstrate their usage and outputs.

np.vstack()

Purpose: Stacks arrays in sequence vertically (row-wise).

Usage: np.vstack((array1, array2, ...))

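Requirements: 2D arrays must have the same number of columns (i.e., a matching second dimension); more generally, every dimension except the first must match. 1D arrays must have the same length, and each one becomes a row of the result.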

In [None]:
import numpy as np

# Two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Using vstack to stack them vertically
result = np.vstack((arr1, arr2))
print("Result of vstack:\n", result)


Result of vstack:
 [[1 2 3]
 [4 5 6]]


In [None]:
# Two 2D arrays
arr3 = np.array([[1, 2, 3], [4, 5, 6]])
arr4 = np.array([[7, 8, 9], [10, 11, 12]])

# Using vstack to stack them vertically
result_2d = np.vstack((arr3, arr4))
print("Result of vstack (2D arrays):\n", result_2d)


Result of vstack (2D arrays):
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


np.hstack()

Purpose: Stacks arrays in sequence horizontally (column-wise).

Usage: np.hstack((array1, array2, ...))

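Requirements: 2D arrays must have the same number of rows (i.e., a matching first dimension); more generally, every dimension except the second must match. 1D arrays are simply joined end to end, so their lengths may differ.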

In [None]:
# Two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Using hstack to stack them horizontally
result = np.hstack((arr1, arr2))
print("Result of hstack:\n", result)


Result of hstack:
 [1 2 3 4 5 6]


In [None]:
# Two 2D arrays
arr3 = np.array([[1, 2, 3], [4, 5, 6]])
arr4 = np.array([[7, 8, 9], [10, 11, 12]])

# Using hstack to stack them horizontally
result_2d = np.hstack((arr3, arr4))
print("Result of hstack (2D arrays):\n", result_2d)


Result of hstack (2D arrays):
 [[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]


Use vstack() when you want to add rows, and hstack() when you want to add columns. These functions simplify array manipulations, particularly when working with multi-dimensional data.
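
The shape requirements can be seen with two arrays that have the same number of columns but different numbers of rows. This is a small sketch with illustrative arrays; it also shows that, for 2D inputs, vstack() gives the same result as np.concatenate() along axis 0.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# vstack only needs the column counts to agree.
stacked = np.vstack((a, b))
print(stacked.shape)             # (3, 2)

# For 2D inputs, vstack matches concatenate along axis 0.
same_as_concat = np.array_equal(stacked, np.concatenate((a, b), axis=0))

# hstack needs the row counts to agree, so (2, 2) and (1, 2) do not fit.
try:
    np.hstack((a, b))
    hstack_failed = False
except ValueError as exc:
    hstack_failed = True
    print("hstack failed:", exc)
```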

### **Explain the difference between the fliplr() and flipud() functions in NumPy, including their effects on various array dimensions**

In NumPy, the fliplr and flipud functions are used to flip arrays along different axes:

fliplr (Flip Left to Right): This function flips the array in the left-to-right (horizontal) direction by reversing axis 1, which for a 2D array reverses the order of columns.

flipud (Flip Up to Down): This function flips the array in the up-to-down (vertical) direction by reversing axis 0, which for a 2D array reverses the order of rows.

fliplr requires an array with at least 2 dimensions, while flipud also accepts 1D arrays, so the two behave differently depending on the array dimensions.

np.fliplr()
Purpose: Flips the array horizontally, reversing the order of columns.

Usage: np.fliplr(array)

Requirements: The array must have at least 2 dimensions; passing a 1D array raises a ValueError.



In [None]:
import numpy as np

# 2D Array
arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# Using fliplr to flip horizontally (left to right)
result_fliplr = np.fliplr(arr)
print("Original Array:\n", arr)
print("Array after fliplr:\n", result_fliplr)


Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Array after fliplr:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]


np.flipud()

Purpose: Flips the array vertically, reversing the order of rows.

Usage: np.flipud(array)

Requirements: The array must have at least 1 dimension. On a 1D array, flipud simply reverses the elements.


In [None]:
# Using flipud to flip vertically (up to down)
result_flipud = np.flipud(arr)
print("Original Array:\n", arr)
print("Array after flipud:\n", result_flipud)


Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Array after flipud:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]


Effects on Arrays of Different Dimensions

1D Arrays: fliplr requires at least 2D input and raises a ValueError on a 1D array. flipud accepts 1D input and reverses the elements, because it flips along axis 0.

In [None]:
arr_1d = np.array([1, 2, 3])
# np.fliplr(arr_1d) -> raises ValueError (input must be at least 2-D)
print(np.flipud(arr_1d))  # [3 2 1]


2D Arrays:

fliplr reverses the columns, changing the left-to-right order in each row.

flipud reverses the rows, changing the up-to-down order of rows.

3D Arrays (or Higher):

fliplr always reverses axis 1 and flipud always reverses axis 0, whatever the number of dimensions. For a 3D array of shape (blocks, rows, columns), flipud therefore reverses the order of the 2D blocks, and fliplr reverses the order of the rows inside each block. Neither function touches the last axis; use np.flip(arr, axis=-1) for that.

In [None]:
arr_3d = np.array([[[1, 2, 3], [4, 5, 6]],
                   [[7, 8, 9], [10, 11, 12]]])

print("Original 3D Array:\n", arr_3d)
print("After fliplr:\n", np.fliplr(arr_3d))
print("After flipud:\n", np.flipud(arr_3d))


Original 3D Array:
 [[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
After fliplr:
 [[[ 4  5  6]
  [ 1  2  3]]

 [[10 11 12]
  [ 7  8  9]]]
After flipud:
 [[[ 7  8  9]
  [10 11 12]]

 [[ 1  2  3]
  [ 4  5  6]]]
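
The axis behaviour can be verified by comparing both functions with `np.flip()` on an explicit axis. The arrays below are illustrative, and the flag `fliplr_1d_failed` is just a name used here to record the outcome.

```python
import numpy as np

arr_3d = np.arange(12).reshape(2, 2, 3)

# flipud reverses axis 0; fliplr reverses axis 1.
flipud_matches = np.array_equal(np.flipud(arr_3d), np.flip(arr_3d, axis=0))
fliplr_matches = np.array_equal(np.fliplr(arr_3d), np.flip(arr_3d, axis=1))

# To reverse the last axis of a 3D array, use np.flip with axis=-1.
last_axis_reversed = np.flip(arr_3d, axis=-1)
print(last_axis_reversed[0, 0])       # [2 1 0]

# 1D input: flipud works, fliplr does not.
arr_1d = np.array([1, 2, 3])
print(np.flipud(arr_1d))              # [3 2 1]
try:
    np.fliplr(arr_1d)
    fliplr_1d_failed = False
except ValueError as exc:
    fliplr_1d_failed = True
    print("fliplr on 1D failed:", exc)
```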


### **Discuss the functionality of the array_split() function in NumPy. How does it handle uneven splits?**

In NumPy, the array_split() function is used to split an array into multiple sub-arrays along a specified axis. This method is flexible and can handle cases where the array cannot be evenly split, making it particularly useful when you need to split arrays into chunks of varying sizes.

array_split() Functionality

Syntax: np.array_split(array, indices_or_sections, axis=0)

Parameters:

array: The array to be split.

indices_or_sections: Specifies the number of sections to split the array into or a list of indices where splits should occur.

axis: The axis along which to split the array (default is 0, i.e., split along rows).

Key Features of array_split()

Flexible Splitting:

If indices_or_sections is an integer n, array_split() splits the array into n sections. If the length along the axis is not evenly divisible by n, the first (length % n) sections get one extra element, so the later sections are the shorter ones.

If indices_or_sections is a list of indices, array_split() splits the array at the specified indices, creating sub-arrays based on those splits.

Handles Uneven Splits:

Unlike np.split(), which raises a ValueError when an integer number of sections does not divide the array evenly, array_split() can handle cases where the array size is not a multiple of indices_or_sections. It distributes elements as evenly as possible, with earlier sub-arrays having more elements if needed.

Examples of array_split() Usage

In [1]:
import numpy as np

# 1D Array
arr = np.array([1, 2, 3, 4, 5, 6, 7])

# Splitting into 3 sections (7 elements cannot be divided evenly into 3 parts)
result = np.array_split(arr, 3)
print("Result of array_split with uneven split:\n", result)


Result of array_split with uneven split:
 [array([1, 2, 3]), array([4, 5]), array([6, 7])]


In [2]:
# 2D Array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9],
                   [10, 11, 12]])

# Splitting into 3 sections along rows
result_2d = np.array_split(arr_2d, 3, axis=0)
print("Result of array_split along rows:\n", result_2d)


Result of array_split along rows:
 [array([[1, 2, 3],
       [4, 5, 6]]), array([[7, 8, 9]]), array([[10, 11, 12]])]


In [3]:
# 2D Array
arr_2d = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8]])

# Splitting into 3 sections along columns
result_2d_columns = np.array_split(arr_2d, 3, axis=1)
print("Result of array_split along columns:\n", result_2d_columns)


Result of array_split along columns:
 [array([[1, 2],
       [5, 6]]), array([[3],
       [7]]), array([[4],
       [8]])]


The array_split() function is ideal for splitting arrays flexibly, especially when the array’s size does not allow for equal splits.
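
The contrast with np.split(), and the list-of-indices form, can be sketched as follows. The array and the split points are illustrative values chosen for the example.

```python
import numpy as np

arr = np.arange(7)                     # [0 1 2 3 4 5 6]

# Integer sections: 7 % 3 == 1, so the first sub-array gets the extra element.
parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]
print(sizes)                           # [3, 2, 2]

# List of indices: split before positions 2 and 5.
by_index = np.array_split(arr, [2, 5])
print([p.tolist() for p in by_index])  # [[0, 1], [2, 3, 4], [5, 6]]

# np.split refuses an uneven division into a given number of sections.
try:
    np.split(arr, 3)
    split_failed = False
except ValueError as exc:
    split_failed = True
    print("np.split failed:", exc)
```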

### **Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?**


1. Vectorization

Definition:
Vectorization is the process of performing operations on entire arrays (or vectors) at once, rather than element by element. This approach uses low-level implementations in optimized C and Fortran libraries, enabling much faster execution than if we used Python loops.

How it Works:
In a vectorized operation, a single operation is applied to every element in an array without explicit iteration in Python. The loop still happens, but inside NumPy's compiled code, which on many platforms can also use the CPU's SIMD (Single Instruction, Multiple Data) instructions to process several elements per instruction.

Advantages:

Speed: Operations are done directly in compiled code, avoiding the overhead of Python loops.

Code Simplicity: Vectorized code is usually more readable, concise, and less error-prone.

In [4]:
import numpy as np

# Creating two small arrays
arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([10, 20, 30, 40, 50])

# Vectorized addition
result = arr1 + arr2  # Adds element-wise without loops
print("Result of vectorized addition:", result)


Result of vectorized addition: [11 22 33 44 55]


2. Broadcasting

Definition:

Broadcasting is a mechanism that allows NumPy to perform element-wise operations on arrays whose shapes differ but are compatible. When two such arrays are operated on, NumPy "stretches" or "broadcasts" the smaller array to match the shape of the larger one.

How it Works:
Broadcasting works by virtually expanding the smaller array along the dimensions required, without physically copying data, so that it matches the shape of the larger array. This allows element-wise operations to be performed seamlessly.

Conditions for Broadcasting:

If the arrays have different numbers of dimensions, prepend 1s to the shape of the array with fewer dimensions until both arrays have the same number of dimensions.

Two dimensions are compatible for broadcasting if they are equal or if one of them is 1.

If every pair of dimensions is compatible under these rules, broadcasting occurs; otherwise NumPy raises a ValueError.

Advantages:

Memory Efficiency: No stretched copy of the smaller array is created, because broadcasting expands it only conceptually. The result array is still allocated as usual.

Speed: It avoids the need for manual looping and allows NumPy to perform the operation using optimized code.

In [5]:
arr1 = np.array([[1, 2, 3], [4, 5, 6]])  # Shape (2, 3)
arr2 = np.array([10, 20, 30])            # Shape (3,)

# Broadcasting in action
result = arr1 + arr2
print("Result of broadcasting addition:\n", result)


Result of broadcasting addition:
 [[11 22 33]
 [14 25 36]]
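
The broadcasting rules can be checked without doing any arithmetic. The sketch below assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available, and uses `np.broadcast_to` to show that the stretched array is a view with a zero stride rather than a copy.

```python
import numpy as np

# Shape (3,) is treated as (1, 3), which is compatible with (2, 3).
print(np.broadcast_shapes((2, 3), (3,)))    # (2, 3)

# (3,) against (3, 1): each size-1 dimension stretches, giving (3, 3).
print(np.broadcast_shapes((3,), (3, 1)))    # (3, 3)

# (2,) is treated as (1, 2); 2 versus 3 in the last dimension is incompatible.
try:
    np.broadcast_shapes((2, 3), (2,))
    incompatible = False
except ValueError:
    incompatible = True

# The stretched array is a read-only view: stride 0 along the repeated axis.
row = np.array([10, 20, 30])
stretched = np.broadcast_to(row, (2, 3))
print(stretched.strides[0])                 # 0
print(np.shares_memory(row, stretched))     # True
```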


How Vectorization and Broadcasting Contribute to Efficient Array Operations

Eliminate Explicit Loops: Vectorization and broadcasting remove the need for explicit loops in Python, which are slower due to Python’s interpreter overhead. This directly improves performance, especially for large arrays.

Leverage Optimized Libraries: Both vectorization and broadcasting allow NumPy to use optimized C/Fortran routines that are much faster than equivalent operations performed in pure Python.

Reduce Memory Usage: Broadcasting avoids duplicating the smaller operand, so operations on arrays of different shapes need no temporary stretched copy.

Enable Parallel Processing: Where the hardware and the NumPy build support it, vectorized operations can use the CPU’s SIMD instructions to process several elements at once, further enhancing performance.

In [6]:
# Summary Example

# Here’s an example that combines vectorization and broadcasting to perform complex calculations efficiently.

arr1 = np.array([1, 2, 3])   # Shape (3,)
arr2 = np.array([[10], [20], [30]])  # Shape (3, 1)

# Broadcasting allows addition across mismatched dimensions
result = arr1 + arr2  # Both operands are broadcast to shape (3, 3)
print("Result of broadcasting and vectorized addition:\n", result)


Result of broadcasting and vectorized addition:
 [[11 12 13]
 [21 22 23]
 [31 32 33]]


Here, arr1 (shape (3,)) is treated as shape (1, 3) and arr2 has shape (3, 1), so both are broadcast to a common shape of (3, 3) before the vectorized addition is applied across the whole result.

Conclusion

Vectorization and broadcasting are key to NumPy’s efficiency in handling array operations. They allow for faster, more memory-efficient, and concise code, making NumPy a powerful tool for scientific computing, data analysis, and machine learning applications.

### **Practical Questions**

In [8]:
# Create a 3*3 NumPy array with random integers between 1 and 100, then interchange its rows and columns

import numpy as np

# Create a 3x3 array with random integers between 1 and 100
array = np.random.randint(1, 101, size=(3, 3))
print("Original Array:\n", array)

# Interchange rows and columns by transposing the array
transposed_array = array.T
print("Transposed Array:\n", transposed_array)


Original Array:
 [[22 24 37]
 [18 55 86]
 [ 7 84 22]]
Transposed Array:
 [[22 18  7]
 [24 55 84]
 [37 86 22]]
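
For reproducible output, the same task can be sketched with a seeded generator. Note that this swaps in `np.random.default_rng` and `Generator.integers` in place of the `np.random.randint` call used above; the seed value is arbitrary. The checks also show that `.T` returns a view of the original data rather than a copy.

```python
import numpy as np

rng = np.random.default_rng(seed=42)        # arbitrary seed for repeatable results
array = rng.integers(1, 101, size=(3, 3))   # upper bound is exclusive: values are 1..100

transposed = array.T

# Element [j, i] of the transpose is element [i, j] of the original.
entries_swapped = all(
    transposed[j, i] == array[i, j] for i in range(3) for j in range(3)
)

# .T is a view on the same memory, and matches np.transpose().
is_view = np.shares_memory(array, transposed)
same_as_function = np.array_equal(transposed, np.transpose(array))

print(array)
print(transposed)
```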


In [9]:
# Generate a 1D NumPy array with 10 elements. Reshape it into a 2 * 5 array, then into a 5 * 2 array

import numpy as np

# Generate a 1D array with 10 elements
array_1d = np.arange(10)
print("Original 1D Array:\n", array_1d)

# Reshape it into a 2x5 array
array_2x5 = array_1d.reshape(2, 5)
print("\nReshaped into 2x5 Array:\n", array_2x5)

# Reshape it into a 5x2 array
array_5x2 = array_2x5.reshape(5, 2)
print("\nReshaped into 5x2 Array:\n", array_5x2)


# Explanation
# np.arange(10): Creates a 1D array with 10 elements from 0 to 9.
# array_1d.reshape(2, 5): Reshapes the array into a 2x5 shape.
# array_2x5.reshape(5, 2): Further reshapes the array into a 5x2 shape.

Original 1D Array:
 [0 1 2 3 4 5 6 7 8 9]

Reshaped into 2x5 Array:
 [[0 1 2 3 4]
 [5 6 7 8 9]]

Reshaped into 5x2 Array:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
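
Two related reshape options may be useful here. In this sketch, `-1` asks NumPy to infer one dimension from the total size, and `order="F"` fills the result column by column instead of row by row; a target shape whose size does not equal 10 raises a ValueError.

```python
import numpy as np

a = np.arange(10)

# -1 lets NumPy infer the missing dimension from the total element count.
two_rows = a.reshape(2, -1)
two_cols = a.reshape(-1, 2)
print(two_rows.shape, two_cols.shape)      # (2, 5) (5, 2)

# order="F" fills column by column (Fortran order) instead of row by row.
col_major = a.reshape(2, 5, order="F")
print(col_major)
# [[0 2 4 6 8]
#  [1 3 5 7 9]]

# 3 * 4 = 12 elements cannot be made from 10.
try:
    a.reshape(3, 4)
    bad_shape_failed = False
except ValueError:
    bad_shape_failed = True
```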


In [10]:
# Create a 4 * 4 NumPy array with random float values. Add a border of zeroes around it, resulting in a 6 * 6 array

import numpy as np

# Create a 4x4 array with random float values
array_4x4 = np.random.rand(4, 4)
print("Original 4x4 Array:\n", array_4x4)

# Add a border of zeroes to create a 6x6 array
array_6x6 = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)
print("\n6x6 Array with Zero Border:\n", array_6x6)


# Explanation
# np.random.rand(4, 4): Generates a 4x4 array with random float values in the range [0, 1).
# np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0): Adds a border of width 1 around the original array, filling it with zeroes.
# The values vary from run to run because they are randomly generated, but the result is always a 6x6 array with a border of zeroes around the original 4x4 array.


Original 4x4 Array:
 [[0.17443927 0.14303244 0.1413225  0.01004002]
 [0.31740645 0.2453697  0.82212226 0.01083958]
 [0.58087511 0.70433981 0.80747976 0.57380893]
 [0.73918965 0.9423672  0.54179403 0.65868331]]

6x6 Array with Zero Border:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.17443927 0.14303244 0.1413225  0.01004002 0.        ]
 [0.         0.31740645 0.2453697  0.82212226 0.01083958 0.        ]
 [0.         0.58087511 0.70433981 0.80747976 0.57380893 0.        ]
 [0.         0.73918965 0.9423672  0.54179403 0.65868331 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
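
An alternative way to build the same border, shown here only as a sketch, is to allocate a 6x6 array of zeros and copy the 4x4 values into its interior with a slice. The seeded generator is used so that the comparison with np.pad() is repeatable; the seed value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)     # arbitrary seed for repeatable values
inner = rng.random((4, 4))              # floats in [0, 1)

# Start from zeros and write the original values into the middle.
padded = np.zeros((6, 6))
padded[1:-1, 1:-1] = inner

# np.pad with its default constant mode fills the border with zeros as well.
matches_pad = np.array_equal(padded, np.pad(inner, pad_width=1))
print(matches_pad)                      # True
```

np.pad() remains the more general tool, since it also handles different border widths per side and other fill modes.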
