# ***Theoretical***

In [None]:
import numpy as np

**Q.1 - Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?**

**Ans.-** NumPy, short for Numerical Python, is a fundamental library for scientific computing in Python. It provides powerful tools for handling large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these data structures efficiently. Here’s a breakdown of its purpose and advantages:

**Purpose of NumPy**

1. Array Manipulation: NumPy allows users to create and manipulate arrays with ease. Unlike Python's built-in lists, NumPy arrays are more efficient for numerical computations.

2. Performance: It is designed for performance on large datasets, allowing for operations that are significantly faster than those performed with standard Python data structures.

3. Mathematical Functions: NumPy provides a variety of functions for linear algebra, statistical operations, and random number generation, facilitating complex mathematical computations.

4. Interoperability: It integrates well with other scientific libraries, such as SciPy, Pandas, and Matplotlib, enhancing Python's ecosystem for scientific computing.

**Advantages of NumPy**

1. Efficiency: NumPy arrays use less memory and provide better performance compared to Python lists, especially for large datasets. Operations on NumPy arrays are implemented in C, which speeds up computations.

2. Convenient Syntax: NumPy offers a concise and expressive syntax for performing complex mathematical operations, such as element-wise addition, multiplication, and other array manipulations, without the need for explicit loops.

3. Broadcasting: This feature allows for arithmetic operations between arrays of different shapes, making it easy to perform operations without needing to reshape or expand arrays explicitly.

4. Rich Functionality: NumPy includes functions for reshaping arrays, slicing, aggregating data, and performing statistical analysis, making it a comprehensive tool for data analysis.

5. Support for Multi-dimensional Data: It supports n-dimensional arrays (ndarrays), enabling users to work with higher-dimensional data structures, which is essential in fields like machine learning and image processing.

6. Community and Documentation: NumPy has a strong community and extensive documentation, making it easier for users to find help and resources for their scientific computing needs.

**Enhancing Python's Capabilities**

NumPy enhances Python's numerical capabilities by providing:

- Speed: Operations on NumPy arrays are much faster than those on native Python data structures due to the optimized C implementation.
- Vectorization: This allows users to write code that is not only more readable but also executes faster, as it eliminates the need for explicit loops.
- Integration with C/C++ and Fortran: NumPy exposes a C API and ships the f2py tool, so routines written in C, C++, or Fortran can be wrapped and called from Python, which is beneficial for performance-critical applications.
- Support for Universal Functions (ufuncs): These functions allow element-wise operations on arrays, facilitating efficient computation across large datasets.
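
As a rough illustration of the vectorization and ufunc points above, the sketch below computes the same result once with an explicit Python loop and once with a single NumPy expression. The sample values and variable names are made up for the example.

```python
import math

import numpy as np

# Hypothetical sample data, chosen only for illustration
values = np.array([1.0, 4.0, 9.0, 16.0])

# Explicit Python loop: one interpreted iteration per element
loop_result = [math.sqrt(v) + 1.0 for v in values]

# Vectorized: np.sqrt is a ufunc applied element-wise in compiled code,
# and the scalar 1.0 is broadcast across the whole array
vector_result = np.sqrt(values) + 1.0

print(vector_result)  # [2. 3. 4. 5.]
```

Both versions give the same numbers; the vectorized one keeps the loop inside NumPy's compiled code.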

**Q.2 - Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?**

**Ans.-** The np.mean() and np.average() functions in NumPy both compute the average of an array, but they differ in functionality and flexibility. Here’s a comparison of the two:

**np.mean()**
- Purpose: Computes the arithmetic mean of the elements along a specified axis.
- Syntax: np.mean(a, axis=None, dtype=None, out=None, keepdims=False)

- **Parameters:**
1. a: Input array.
2. axis: Axis along which to compute the mean. If None, computes the mean of the flattened array.
3. dtype: Data type to use for the computation.
4. out: Alternative output array to store the result.
5. keepdims: If True, retains reduced dimensions.
- Use Case: Use np.mean() when you want a straightforward calculation of the mean across a dataset without any additional weighting considerations.

**np.average()**
- Purpose: Computes the weighted average of the elements, allowing for flexibility in assigning different weights to the elements.
- Syntax: np.average(a, axis=None, weights=None, returned=False)
- **Parameters:**

1. a: Input array.
2. axis: Axis along which to compute the average.
3. weights: Optional array of weights for the elements. If provided, the average is computed as the sum of the elements multiplied by their corresponding weights divided by the sum of the weights.
4. returned: If True, returns a tuple of the average and the sum of the weights.
- Use Case: Use np.average() when you need to calculate an average where elements contribute differently based on their assigned weights.

**Key Differences**

**1. Weighting:**

- np.mean() does not consider weights; it simply calculates the arithmetic mean.
- np.average() allows for weights, providing more flexibility for situations where some data points are more significant than others.

**2. Output:**

- Both functions return a single value (or an array) representing the average, but np.average() can return additional information (the sum of the weights) when the returned parameter is set to True.

**3. Performance:**

- Both functions are efficient. Without weights, np.average() delegates to the same mean computation, so any difference is negligible; with weights, it does extra work to multiply by and sum the weights.

**When to Use Each**

**- Use np.mean() when:**

- You want to calculate a simple average without any consideration of weights.
- You are working with uniform data where every entry has equal importance.

**- Use np.average() when:**

- You need to compute a weighted average, where some values should contribute more to the average than others.
- You want to incorporate additional information (e.g., weights) into your calculations.
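
The difference is easiest to see on a small array. The following is a minimal sketch with made-up scores and weights; it also shows the returned=True option described above.

```python
import numpy as np

# Hypothetical scores and weights, for illustration only
scores = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.array([1.0, 2.0, 3.0, 4.0])

plain_mean = np.mean(scores)                    # (1 + 2 + 3 + 4) / 4
unweighted = np.average(scores)                 # no weights: same as np.mean
weighted = np.average(scores, weights=weights)  # (1 + 4 + 9 + 16) / 10

# returned=True also hands back the sum of the weights
avg, weight_sum = np.average(scores, weights=weights, returned=True)

print(plain_mean)  # 2.5
print(weighted)    # 3.0
print(weight_sum)  # 10.0
```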

**Q.3 - Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.**

**Ans.-** **Reversing a 1D Array**

For a 1D array, you can reverse the array by slicing:

In [None]:
import numpy as np

# Create a 1D array
arr_1d = np.array([1, 2, 3, 4, 5])

# Reverse the array
reversed_arr_1d = arr_1d[::-1]

print(reversed_arr_1d)  # Output: [5 4 3 2 1]


[5 4 3 2 1]


**Reversing a 2D Array**

For a 2D array, you can reverse it along different axes using slicing as well.
You can specify the axis you want to reverse along.

Example of Reversing Along Different Axes:

In [None]:
#Reverse along the rows (axis=0):
#This flips the array upside down.

# Create a 2D array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Reverse along the rows (axis=0)
reversed_rows = arr_2d[::-1, :]

print(reversed_rows)

[[7 8 9]
 [4 5 6]
 [1 2 3]]


In [None]:
#Reverse along the columns (axis=1):
#This flips the array left to right.

# Reverse along the columns (axis=1)
reversed_columns = arr_2d[:, ::-1]

print(reversed_columns)


[[3 2 1]
 [6 5 4]
 [9 8 7]]


In [None]:
#Reverse along both axes:
#You can reverse both rows and columns simultaneously.

# Reverse along both axes
reversed_both = arr_2d[::-1, ::-1]

print(reversed_both)



[[9 8 7]
 [6 5 4]
 [3 2 1]]


Summary of Methods

- 1D Array: Use slicing array[::-1] to reverse the array.
- 2D Array:

1. To reverse along rows (upside down): array[::-1, :]
2. To reverse along columns (left to right): array[:, ::-1]
3. To reverse both rows and columns: array[::-1, ::-1]
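
Slicing is not the only option. As a short sketch, np.flip() takes an axis argument and produces the same results as the slices above; the arrays are the ones used in the earlier examples.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# np.flip reverses along the given axis; axis=None (the default) reverses every axis
flip_1d = np.flip(arr_1d)            # same as arr_1d[::-1]
flip_rows = np.flip(arr_2d, axis=0)  # same as arr_2d[::-1, :]
flip_cols = np.flip(arr_2d, axis=1)  # same as arr_2d[:, ::-1]
flip_both = np.flip(arr_2d)          # same as arr_2d[::-1, ::-1]

print(flip_1d)  # [5 4 3 2 1]
```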

**Q.4-  How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.**

**Ans.-** In NumPy, the data type of the elements in an array is an important factor for both memory management and computational performance. You can determine the data type of the elements in a NumPy array using the .dtype attribute.

**Determining the Data Type of a NumPy Array**

You can check the data type of the elements in a NumPy array by accessing the dtype attribute of the array.

In [None]:
import numpy as np

# Create a NumPy array with integers
arr_int = np.array([1, 2, 3, 4])
print(arr_int.dtype)  # Output: int64 (or another integer type depending on your system)

# Create a NumPy array with floating-point numbers
arr_float = np.array([1.0, 2.5, 3.8])
print(arr_float.dtype)  # Output: float64

# Create a NumPy array with complex numbers
arr_complex = np.array([1+2j, 3+4j, 5+6j])
print(arr_complex.dtype)  # Output: complex128


int64
float64
complex128


**Importance of Data Types in Memory Management and Performance**

1. **Memory Efficiency:**

- The choice of data type directly impacts the amount of memory used by the array. For example, a float64 array uses 64 bits per element, while a float32 array uses only 32 bits.
- If your data doesn't require high precision, using smaller data types can save memory. For instance, if you don’t need 64-bit precision for floating-point numbers, using float32 can significantly reduce memory usage.

In [None]:
arr_large = np.array([1.0] * 1000000, dtype=np.float64)  # 64-bit precision
print(arr_large.nbytes)  # Memory used in bytes

arr_small = np.array([1.0] * 1000000, dtype=np.float32)  # 32-bit precision
print(arr_small.nbytes)  # Memory used in bytes


8000000
4000000


2. **Performance Optimization:**

- Smaller data types not only reduce memory usage but can also speed up computation. NumPy's loops are compiled for each data type, so operations on float32 arrays can be faster than on float64 arrays, mainly because half as many bytes move through memory and the caches and more values fit into each SIMD register. The size of the gain depends on the hardware and the operation.
- Vectorization: NumPy operations are vectorized (i.e., operations are applied to all elements of the array without explicit loops). This is often faster with a smaller data type, as there are fewer bytes to process per element.

In [None]:
import time

# Creating two large arrays with different data types
arr_64 = np.ones((1000000,), dtype=np.float64)
arr_32 = np.ones((1000000,), dtype=np.float32)

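# Measure time for a simple operation; perf_counter() is a high-resolution timer,
# and a single run like this gives only a rough indication
start_time = time.perf_counter()
arr_64 + 1
print("64-bit operation time:", time.perf_counter() - start_time)

start_time = time.perf_counter()
arr_32 + 1
print("32-bit operation time:", time.perf_counter() - start_time)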


64-bit operation time: 0.005056619644165039
32-bit operation time: 0.001733541488647461


3. **Precision and Accuracy:**

- Selecting the right data type ensures that your calculations retain the necessary level of precision. For example, using float32 when float64 precision is needed may lead to loss of accuracy in scientific calculations.
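- For calculations involving very small or very large numbers, choosing an appropriate type can prevent rounding errors or overflows. Extended-precision types such as np.longdouble (exposed as np.float128 on some platforms) exist, but their availability and actual precision depend on the platform.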

4. **Interoperability:**

- Data types also play a key role when exchanging data between different libraries or systems. For example, if you're working with image data (which typically uses uint8 or float32), using the wrong data type could lead to errors when interfacing with other libraries (such as OpenCV, PIL, or TensorFlow).
- When loading data from files (e.g., CSVs, HDF5), NumPy may automatically infer the appropriate data type, but you can specify the dtype to ensure compatibility and efficiency.
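
To make the precision point concrete, here is a small sketch of what can go wrong with a type that is too narrow. The specific values are chosen only to trigger the effect.

```python
import numpy as np

# float32 has a 24-bit significand, so not every integer above 2**24 is representable
big = np.float32(2 ** 24)         # 16777216.0
lost = big + np.float32(1.0)      # the +1 is rounded away in float32
kept = np.float64(2 ** 24) + 1.0  # float64 still represents 16777217 exactly

print(lost == big)          # True
print(kept == 2 ** 24 + 1)  # True

# Casting to a narrower integer type with astype() wraps around without an error
wrapped = np.array([300]).astype(np.int8)
print(wrapped)  # [44]  (300 - 256)
```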

**Common Data Types in NumPy**

- Integers: np.int8, np.int16, np.int32, np.int64 (signed integers)
- Unsigned integers: np.uint8, np.uint16, np.uint32, np.uint64
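- Floating-point numbers: np.float16, np.float32, np.float64, np.float128 (platform-dependent; not available on every system)
- Complex numbers: np.complex64, np.complex128, np.complex256 (platform-dependent)
- Boolean: np.bool_
- Object: np.object_ (used for arrays containing arbitrary Python objects; the old np.object alias was removed in NumPy 1.24)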
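
The sizes and limits of these types can be inspected directly. The sketch below uses np.dtype(...).itemsize, np.iinfo(), and np.finfo(); the selection of types is arbitrary.

```python
import numpy as np

# Bytes per element for a few of the types listed above
for name in ("int8", "int32", "float32", "float64", "complex128"):
    print(name, np.dtype(name).itemsize)

# iinfo reports the representable range of an integer type
int8_info = np.iinfo(np.int8)
uint8_info = np.iinfo(np.uint8)
print(int8_info.min, int8_info.max)    # -128 127
print(uint8_info.min, uint8_info.max)  # 0 255

# finfo reports floating-point limits; eps is the gap between 1.0 and the next value
f32_eps = np.finfo(np.float32).eps
f64_eps = np.finfo(np.float64).eps
```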

**Converting Between Data Types**


You can convert between different data types using the .astype() method:

In [None]:
arr_float = np.array([1.5, 2.3, 3.7], dtype=np.float64)

# Convert to integer (truncating the decimal part)
arr_int = arr_float.astype(np.int32)
print(arr_int)  # Output: [1 2 3]
print(arr_int.dtype)  # Output: int32


[1 2 3]
int32


- Memory management: Choosing the right data type can help reduce memory usage. Smaller types (e.g., float32, int32) use less memory than larger types (e.g., float64, int64), which is important for handling large datasets.
- Performance: Smaller data types can also result in faster computations, especially on memory-bound operations, because they require less data to be processed.
- Precision and accuracy: Ensure that the data type chosen provides the necessary precision for your application without introducing errors or inefficiencies.
- Interoperability: Data types ensure compatibility with other libraries or data formats and can be explicitly specified when needed.

**Q.5- Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?**

**Ans.-** In NumPy, an ndarray (short for "n-dimensional array") is the core data structure that represents multi-dimensional, homogeneous arrays of fixed-size elements. It is a central feature of the NumPy library and is optimized for numerical operations, making it far more efficient than standard Python lists, especially for large datasets. Here's a breakdown of the key features of NumPy arrays (ndarrays) and how they differ from Python lists.

**Key Features of ndarray in NumPy**

1. Homogeneous Data:

- Type: All elements of a NumPy array must be of the same type (e.g., all integers, floats, or complex numbers). This ensures that operations on the array are more efficient, as the memory layout is contiguous and optimized for the chosen data type.
- Memory efficiency: Because elements are stored in a continuous block of memory, they use less overhead compared to Python lists, where each element is a separate object.

In [None]:
arr = np.array([1, 2, 3, 4])
print(arr.dtype)  # Output: int64 (or system-specific)


int64


2. Fixed Size:

- Once an ndarray is created, its size (the number of elements) cannot be changed. However, the array can be reshaped (i.e., the dimensions of the array can be changed), but the number of elements remains constant.

In [None]:
arr = np.array([1, 2, 3, 4])
arr = arr.reshape((2, 2))  # Reshaping the array
print(arr)


[[1 2]
 [3 4]]


3. Multidimensional:

- ndarray objects support multi-dimensional arrays, from 1D arrays (vectors) to 2D arrays (matrices) and beyond (3D arrays, nD arrays). This makes NumPy extremely powerful for handling large datasets in fields like machine learning, scientific computing, and image processing.

In [None]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr_2d.shape)  # Output: (2, 3), meaning 2 rows and 3 columns


(2, 3)


4. Vectorized Operations:

- NumPy allows for vectorized operations, which means you can apply mathematical operations to entire arrays at once without needing explicit loops. This is much faster than iterating over lists element-by-element.

In [None]:
arr = np.array([1, 2, 3, 4])
arr_squared = arr ** 2  # Element-wise squaring of the array
print(arr_squared)  # Output: [ 1  4  9 16]


[ 1  4  9 16]


5. Efficient Memory Layout:

- NumPy arrays are stored in contiguous memory blocks, which makes them much faster for large-scale numerical computations. This efficient layout contrasts with Python lists, which are arrays of pointers to objects.
- ndarrays have a dtype that specifies the type of elements, and NumPy arrays store data in a compact, low-overhead format.

6. Broadcasting:

- NumPy arrays support broadcasting, which allows operations between arrays of different shapes as long as the shapes are compatible: comparing dimensions from the last axis backwards, each pair must be equal or one of them must be 1 (missing leading dimensions count as 1). This feature eliminates the need for manual loops or reshaping when performing element-wise operations on arrays of different sizes.

In [None]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([10])
result = arr1 + arr2  # Broadcasting arr2 across arr1
print(result)  # Output: [11 12 13]


[11 12 13]
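
Broadcasting also works across dimensions. As a further sketch with made-up values, a (3, 1) column combined with a (3,) row yields a (3, 3) result, while two shapes with no matching or length-1 dimension raise an error.

```python
import numpy as np

column = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3])             # shape (3,)

# Shapes are compared from the trailing axis: (3, 1) and (3,) broadcast to (3, 3)
grid = column + row
print(grid)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]

# Shapes (3,) and (4,) are incompatible: the lengths differ and neither is 1
try:
    row + np.array([1, 2, 3, 4])
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)  # True
```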


**How ndarray Differs from Python Lists**

1. Homogeneity vs. Heterogeneity:

- NumPy ndarray: Requires all elements to be of the same data type (homogeneous).
- Python list: Can hold elements of different data types (heterogeneous), allowing greater flexibility but with some performance overhead.

In [None]:
# NumPy array
np_arr = np.array([1, 2, 3, 4])  # All elements must be of the same type

# Python list
py_list = [1, 'two', 3.0, [4]]  # Different types in the same list


2. Performance:

- NumPy ndarray: Much more efficient for numerical computations due to its optimized memory layout and vectorized operations. The operations on ndarrays are typically much faster than iterating over Python lists, especially for large datasets.
- Python list: Slower for numerical computations, especially for large datasets, because Python lists are general-purpose containers and don't have the performance optimizations of ndarrays.

3. Memory Layout:

- NumPy ndarray: Stored in a contiguous block of memory, which minimizes memory overhead and improves cache locality.
- Python list: A list is an array of references to objects, each of which can be a different type. This introduces more overhead in terms of memory and processing time.

4. Functionality:

- NumPy ndarray: Supports a wide range of specialized mathematical and logical operations (like element-wise operations, linear algebra, statistics) that can be applied to arrays.
- Python list: Lacks the built-in support for these types of operations. Lists require explicit loops or list comprehensions for similar tasks, making them less efficient for numerical operations.

5. Size:

- NumPy ndarray: The size of a NumPy array is fixed once created, but the shape (dimensionality) can be changed (with reshaping). Memory is contiguous and compact.
- Python list: Lists are dynamic in size and can grow or shrink by adding or removing elements. However, they don't offer the same memory efficiency or performance as NumPy arrays.

6. Support for Multidimensional Data:

- NumPy ndarray: Supports arrays of any number of dimensions (nD), from 1D (vectors) to 2D (matrices) to nD (higher-dimensional tensors).
- Python list: Lists can be used to represent multidimensional data, but they require nested lists (e.g., lists of lists for 2D arrays), which are not as efficient or flexible as NumPy arrays.


**Q.6- Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.**

**Ans.-** NumPy arrays provide significant performance benefits over standard Python lists, especially when dealing with large-scale numerical operations. These benefits stem from the optimized memory layout, vectorized operations, and the implementation of core functionality in compiled C code. Let's dive into the key reasons why NumPy is much faster than Python lists for numerical tasks.

**Key Performance Benefits of NumPy Arrays Over Python Lists**

1. **Homogeneous Data vs. Heterogeneous Data**

- NumPy Arrays: NumPy arrays store elements of the same data type (homogeneous). This allows NumPy to optimize the storage of data by allocating a contiguous block of memory for elements of the same type. It can efficiently perform operations using this predictable memory layout, reducing the overhead caused by type-checking and object reference handling in heterogeneous data structures.

- Python Lists: A Python list is a collection of pointers to objects (each potentially of different data types). This introduces overhead since each element is treated as a separate Python object, and operations on lists require dereferencing these pointers and checking types, which slows down execution.

Example: A list of integers is stored as individual Python objects, each with a memory overhead. In contrast, a NumPy array simply stores the raw data in a contiguous block with no per-element object overhead.

2. **Memory Efficiency**

- NumPy Arrays: NumPy arrays are stored in contiguous blocks of memory, which not only reduces the memory overhead but also allows for better cache locality. This means that when performing operations on NumPy arrays, the CPU cache can be used efficiently, leading to faster execution times.

- Python Lists: Python lists, on the other hand, store references to objects scattered across memory, leading to poor cache locality. As a result, memory accesses can be slower, especially for large datasets, since the CPU cache is not utilized effectively.

Example: A large NumPy array can hold millions of elements in a compact, contiguous memory block, whereas a Python list will have additional memory overhead due to storing references to the elements.

3. **Vectorized Operations**

- NumPy Arrays: NumPy supports vectorization, which allows operations to be applied to entire arrays at once without the need for explicit loops. NumPy achieves this by using optimized C code under the hood, which operates on blocks of memory in parallel or via SIMD (Single Instruction, Multiple Data) operations. This eliminates the need for looping through each element in Python code, which can be slow.

- Python Lists: With Python lists, you typically need to use loops (e.g., for loops) to perform operations on each element. Python's loops are interpreted at runtime and can be slow when performing operations on large datasets.

In [None]:
#Example: If you want to add two arrays element-wise, you can do this in a single operation with NumPy:

import numpy as np
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])
result = arr1 + arr2  # Vectorized operation


In [None]:
#With Python lists, the same operation requires a loop:

arr1 = [1, 2, 3, 4]
arr2 = [5, 6, 7, 8]
result = [a + b for a, b in zip(arr1, arr2)]  # Using a loop


4. **Optimized Low-Level Implementations**

- Python Lists: Python’s dynamic nature and its reliance on references make list operations relatively slow, especially for numerical tasks. When you need to perform arithmetic on lists, Python has to interpret the operation for each individual element.
- NumPy Arrays: NumPy arrays are implemented in C and have low-level bindings to optimized libraries like BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage). These libraries implement highly optimized algorithms for matrix operations, linear algebra, and other numerical tasks, making NumPy a powerful tool for performance-critical applications.
- Benefit: NumPy leverages highly optimized C libraries that provide a significant speed advantage over the more general-purpose Python lists.

5. **Parallelism and Multi-threading**

- Python Lists: Pure-Python list operations run in a single thread under the interpreter, so complex numerical operations use only one CPU core. For parallelism, modules like multiprocessing would need to be employed, which introduces additional complexity.
- NumPy Arrays: NumPy's compiled loops can use vectorized instruction sets (such as SIMD instructions), and linear algebra routines such as matrix multiplication can run on multiple cores when NumPy is linked against a multi-threaded BLAS/LAPACK library. Most element-wise operations still run on a single core.
- Benefit: NumPy arrays make better use of modern hardware than Python lists, through SIMD in element-wise loops and multi-threaded libraries for linear algebra.

6. **Reduced Overhead with Homogeneous Data Types**

- Python Lists: Python lists can store elements of different data types, but this flexibility comes at the cost of performance. Python must handle the dynamic typing and associated overhead, even when working with numerical data.
- NumPy Arrays: NumPy arrays are typed, meaning every element in the array is guaranteed to be of the same type (e.g., all floats). This uniformity allows NumPy to implement more efficient memory access patterns and operations on data, which speeds up numerical computations.
- Benefit: The type homogeneity of NumPy arrays allows for highly optimized memory access and operations, leading to better performance over Python lists, which are more flexible but slower in this regard.

7. **Access Patterns**

- Python Lists: Accessing elements in a Python list involves dereferencing an object, which adds overhead. Moreover, Python lists are not designed with efficient numerical access patterns, leading to suboptimal cache usage when performing large-scale computations.
- NumPy Arrays: NumPy arrays are stored in a contiguous block of memory, and this allows for fast element access. This storage layout is cache-friendly and provides better performance, especially when working with large data arrays.
- Benefit: The contiguous memory layout of NumPy arrays leads to faster memory access and better CPU cache utilization than the more scattered storage used by Python lists.

The performance benefits of NumPy arrays over Python lists are evident when dealing with large-scale numerical operations. NumPy arrays provide:

- More efficient memory usage due to homogeneous data types and contiguous storage.

- Significant speed improvements through vectorized operations and optimized low-level implementations.

- Advanced features like broadcasting, parallelism, and complex mathematical functions that are optimized for performance.
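
These claims can be checked on your own machine. The sketch below compares memory use and run time for a sum of squares; the array size and repeat count are arbitrary, and the timings vary between systems, so no expected numbers are shown.

```python
import sys
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for the sketch
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Memory: the list stores pointers plus one Python int object per element
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = np_arr.nbytes  # 8 bytes per int64 element

# Same computation both ways: sum of squares
list_total = sum(x * x for x in py_list)
array_total = int(np.sum(np_arr * np_arr))

# Time 10 repetitions of each version
list_time = timeit.timeit(lambda: sum(x * x for x in py_list), number=10)
array_time = timeit.timeit(lambda: np.sum(np_arr * np_arr), number=10)

print(f"list:  {list_bytes} bytes, {list_time:.4f} s")
print(f"array: {array_bytes} bytes, {array_time:.4f} s")
```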

**Q.7- Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.**

**Ans.-** In NumPy, the functions vstack() and hstack() are used to stack arrays along different axes. They provide convenient ways to combine arrays either vertically (along rows) or horizontally (along columns).

1. **vstack() – Stack arrays vertically (along rows)**

- The vstack() function stacks arrays vertically (i.e., along the first axis, which is the rows axis).

- This means that it adds rows to the bottom of an array or combines multiple arrays row-wise.

- The arrays to be stacked must have the same number of columns (i.e., the second dimension must be the same).

2. **hstack() – Stack arrays horizontally (along columns)**

- The hstack() function stacks arrays horizontally (i.e., along the second axis, which is the columns axis; 1D arrays are instead concatenated along their only axis).

- This means that it adds columns to the right side of an array or combines multiple arrays column-wise.

- The arrays to be stacked must have the same number of rows (i.e., the first dimension must be the same).


In [20]:
#Using vstack()

# Two 2D arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Stack vertically
result = np.vstack((arr1, arr2))
print(result)


[[1 2]
 [3 4]
 [5 6]
 [7 8]]


In this example, arr1 has shape (2, 2) and arr2 has shape (2, 2). After stacking them vertically, the resulting array has shape (4, 2)—four rows, two columns.

In [21]:
#Using hstack()

# Two 2D arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Stack horizontally
result = np.hstack((arr1, arr2))
print(result)


[[1 2 5 6]
 [3 4 7 8]]


Here, arr1 and arr2 are stacked horizontally. Since both arrays have the same number of rows (2), they are joined along the second axis (columns), producing a result with shape (2, 4)—two rows, four columns.

In [22]:
#Stacking more than two arrays

# Three 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr3 = np.array([7, 8, 9])

# Stack vertically
result_v = np.vstack((arr1, arr2, arr3))
print("Vertical Stack:")
print(result_v)

# Stack horizontally
result_h = np.hstack((arr1, arr2, arr3))
print("\nHorizontal Stack:")
print(result_h)


Vertical Stack:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Horizontal Stack:
[1 2 3 4 5 6 7 8 9]


- Vertical Stack: The 1D arrays are stacked into a 2D array with shape (3, 3).
- Horizontal Stack: The 1D arrays are concatenated into a single 1D array of length 9.

- **vstack()** stacks arrays vertically, adding more rows.
- **hstack()** stacks arrays horizontally, adding more columns.
- Both functions require arrays to have compatible shapes along the axis that is not being stacked. For vstack(), the arrays must have the same number of columns; for hstack(), the arrays must have the same number of rows.
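
For 2D inputs, both functions behave like np.concatenate() along a fixed axis. The following sketch checks that equivalence and shows what happens when the shapes do not match; the array named wide is a made-up example of an incompatible shape.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])  # shape (2, 2)
b = np.array([[5, 6], [7, 8]])  # shape (2, 2)
wide = np.array([[9, 10, 11]])  # shape (1, 3)

# For 2D inputs, vstack/hstack match concatenate along axis 0/1
same_v = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))
same_h = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))
print(same_v, same_h)  # True True

# Mismatched column counts (2 vs 3) cannot be stacked vertically
try:
    np.vstack((a, wide))
    stack_failed = False
except ValueError:
    stack_failed = True
print(stack_failed)  # True
```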

**Q.8-  Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.**

**Ans.-** In NumPy, the functions fliplr() and flipud() are used to flip arrays along different axes, specifically for left-right (horizontal) and up-down (vertical) flips. These functions are used to reverse the order of elements in an array along specific axes.

1. **fliplr() – Flip Left to Right (Horizontal Flip)**

- Effect: This function flips an array left-to-right along its second axis (axis 1, the columns axis).
- For a 2D array, it reverses the order of columns, meaning the first column becomes the last, the second column becomes the second last, and so on.
- The shape of the array is preserved, but the contents within each row are reversed.
- This function requires an array with at least two dimensions; a 1D input raises a ValueError.

2. **flipud() – Flip Up to Down (Vertical Flip)**

- Effect: This function flips an array up-to-down along its first axis (axis 0, the rows axis).
- For a 2D array, it reverses the order of rows, meaning the first row becomes the last row, the second row becomes the second last, and so on.
- The shape of the array is preserved, but the rows are reversed.
- This function works on arrays with one or more dimensions; on a 1D array it simply reverses the elements.





In [24]:
#1. Flipping a 1D Array:
#Input:

arr1d = np.array([1, 2, 3, 4, 5])

#Using fliplr() on a 1D array:

#fliplr() requires at least two dimensions, so calling np.fliplr(arr1d) raises a ValueError.


In [None]:
#Using flipud() on a 1D array:

#flipud() reverses along the first axis (axis 0), so on a 1D array it simply reverses the elements.

flipped_ud = np.flipud(arr1d)
print(flipped_ud)  # Output: [5 4 3 2 1]


In [27]:
#2. Flipping a 2D Array:
#Input:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])


In [28]:
#Using fliplr() on a 2D array:

#This will flip the array horizontally (left-to-right).

flipped_lr = np.fliplr(arr2d)
print(flipped_lr)


[[3 2 1]
 [6 5 4]
 [9 8 7]]


In [None]:
#Effect: Each row is reversed, so the first column becomes the last, the second column becomes the second last, and so on.

In [29]:
#Using flipud() on a 2D array:

#This will flip the array vertically (up-to-down).

flipped_ud = np.flipud(arr2d)
print(flipped_ud)


[[7 8 9]
 [4 5 6]
 [1 2 3]]


In [None]:
#Effect: The rows are reversed, so the first row becomes the last, the second row becomes the second last, and so on.

In [31]:
#3. Flipping a 3D Array:
#For a 3D array, fliplr() and flipud() still reverse along a single axis: flipud() along axis 0 and fliplr() along axis 1.
#All other axes are left untouched.

arr3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]]])

#This is a 3D array with shape (3, 2, 2)


In [32]:
#Using fliplr() on a 3D array:

#fliplr() reverses axis 1. For this (3, 2, 2) array, that swaps the two rows inside each 2D slice arr3d[i].

flipped_lr = np.fliplr(arr3d)
print(flipped_lr)


[[[ 3  4]
  [ 1  2]]

 [[ 7  8]
  [ 5  6]]

 [[11 12]
  [ 9 10]]]


In [33]:
#Using flipud() on a 3D array:

#flipud() reverses axis 0, so the order of the three 2D slices is reversed.

flipped_ud = np.flipud(arr3d)
print(flipped_ud)


[[[ 9 10]
  [11 12]]

 [[ 5  6]
  [ 7  8]]

 [[ 1  2]
  [ 3  4]]]


**Visualizing the Difference:**

- **fliplr()** reverses the order of elements along axis 1; for a 2D array, this reverses the column order in each row.
- **flipud()** reverses the order of elements along axis 0; for a 2D array, this reverses the row order of the entire array.
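
The behavior on 1D input is worth checking directly. This short sketch shows that flipud() reverses a 1D array, that fliplr() rejects it, and that both functions match np.flip() with an explicit axis on 2D input.

```python
import numpy as np

arr1d = np.array([1, 2, 3, 4, 5])
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# flipud reverses axis 0, so a 1D array is simply reversed
print(np.flipud(arr1d))  # [5 4 3 2 1]

# fliplr needs an axis 1, so a 1D input raises ValueError
try:
    np.fliplr(arr1d)
    fliplr_failed = False
except ValueError:
    fliplr_failed = True
print(fliplr_failed)  # True

# On 2D (and higher) input the two functions match np.flip with an explicit axis
same_ud = np.array_equal(np.flipud(arr2d), np.flip(arr2d, axis=0))
same_lr = np.array_equal(np.fliplr(arr2d), np.flip(arr2d, axis=1))
print(same_ud, same_lr)  # True True
```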

**Q.9- Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?**

**Ans.-** The array_split() method in NumPy is used to split an array into multiple sub-arrays. This method is more flexible than the simpler split() method because it allows you to split an array into uneven sub-arrays. It can handle situations where the number of elements in the array isn't evenly divisible by the number of splits, making it particularly useful when you don't require equal-sized sub-arrays.

**Functionality of array_split()**

- Basic Usage: The array_split() method splits an input array into multiple sub-arrays along a specified axis.

- Arguments:
1. ary: The array to be split.

2. indices_or_sections: This can either be:
- An integer, which specifies the number of parts to split the array into (equal in size where possible, otherwise differing by at most one element).
- A sorted sequence of integers, which specifies the indices at which to split the array (i.e., it defines where the array is divided).

3. axis (optional): The axis along which to split the array. The default is 0 (rows for a 2D array).

4. There are no further parameters. In particular, array_split() has no dtype parameter; each sub-array keeps the dtype of the input array.

In [36]:
#Splitting an Array into Equal Parts

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Split into 3 equal parts
result = np.array_split(arr, 3)
print(result)


[array([1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]


In [37]:
#Splitting Based on Specific Indices

# Split at specific indices
result = np.array_split(arr, [3, 5])
print(result)


[array([1, 2, 3]), array([4, 5]), array([6, 7, 8, 9])]


**Handling Uneven Splits**

When the array cannot be evenly split into the specified number of parts (i.e., when the number of elements in the array is not divisible by the number of requested splits), array_split() handles this gracefully by distributing the "extra" elements across the sub-arrays.

**How Does It Handle Uneven Splits?**

- Uneven Distribution: When dividing an array into n parts, NumPy distributes the remainder (extra elements) as evenly as possible across the resulting sub-arrays. This means:
- If the total number of elements in the array is not exactly divisible by the number of splits, some sub-arrays will have one more element than others.
- The remainder is distributed starting from the first sub-array.

For example, if an array has 10 elements and you try to split it into 3 parts:

- Each part should ideally have 10 // 3 = 3 elements.
- However, there is a remainder of 10 % 3 = 1 element, which needs to be distributed. So the first sub-array will contain 4 elements, and the other two will contain 3 elements each.

In [38]:
#Handling Uneven Splits

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Split into 3 parts
result = np.array_split(arr, 3)
print(result)


[array([1, 2, 3, 4]), array([5, 6, 7]), array([ 8,  9, 10])]


In [41]:
#Uneven Splits with a Remainder Greater Than One

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Split into 4 parts (10 % 4 = 2, so the first two parts get one extra element each)
result = np.array_split(arr, 4)
print(result)


[array([1, 2, 3]), array([4, 5, 6]), array([7, 8]), array([ 9, 10])]
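
The two outputs above follow a simple rule: with n elements and k parts, the first n % k sub-arrays hold n // k + 1 elements and the remaining ones hold n // k. The sketch below checks that rule against array_split(); `expected_sizes` is a hypothetical helper written for this illustration, not a NumPy function.

```python
import numpy as np

def expected_sizes(n_elements, n_parts):
    """Sub-array lengths predicted by the remainder rule (illustrative helper)."""
    base, extra = divmod(n_elements, n_parts)
    # The first `extra` parts receive one additional element each
    return [base + 1] * extra + [base] * (n_parts - extra)

for n_parts in (3, 4):
    parts = np.array_split(np.arange(10), n_parts)
    actual = [len(p) for p in parts]
    assert actual == expected_sizes(10, n_parts)
    print(n_parts, actual)
# 3 [4, 3, 3]
# 4 [3, 3, 2, 2]
```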


**Key Points of array_split()**

- Flexibility: Unlike the split() function, which requires the array to be evenly divisible by the number of splits, array_split() allows uneven splits.
- Automatic Remainder Handling: When the array cannot be evenly divided, the remainder (extra elements) is distributed across the sub-arrays as evenly as possible.
- Works on Multiple Dimensions: array_split() can be used on multi-dimensional arrays and can split along any axis (specified by the axis argument).

In short: for an array of length n split into k sections along the chosen axis, array_split() returns n % k sub-arrays of size n // k + 1 followed by the remaining sub-arrays of size n // k.
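
To illustrate the first and third key points, the following sketch contrasts split() with array_split() on the same data and then splits a 2D array along its columns. The array names are made up for this example.

```python
import numpy as np

values = np.arange(10)

# np.split requires an equal division and raises ValueError otherwise
try:
    np.split(values, 3)
    split_failed = False
except ValueError:
    split_failed = True

# array_split accepts the same request and also works along other axes
grid = np.arange(12).reshape(2, 6)
col_blocks = np.array_split(grid, 4, axis=1)   # 6 columns into 4 parts
col_shapes = [block.shape for block in col_blocks]

print(split_failed)   # True
print(col_shapes)     # [(2, 2), (2, 2), (2, 1), (2, 1)]
```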

**Q.10 - Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?**

**Ans.-** In NumPy, vectorization and broadcasting are two key concepts that significantly contribute to the efficiency of array operations, especially when dealing with large datasets. Both concepts allow for fast computation and avoid explicit Python loops, making NumPy a powerful tool for numerical computing.

1. **Vectorization**

**Definition:**

- Vectorization refers to the ability to express operations on arrays without using explicit loops (like for or while loops). Instead of iterating over individual elements, NumPy allows you to perform operations on entire arrays or sub-arrays at once. This is achieved using highly optimized, compiled C code in the background, which makes vectorized operations significantly faster than Python loops.

**How it Works:**

- In NumPy, array operations (such as addition, multiplication, etc.) are vectorized by default. This means you can apply mathematical operations on whole arrays or matrices without writing explicit iteration code.
- On CPUs that support them, NumPy's compiled loops can additionally use SIMD (Single Instruction, Multiple Data) instructions, which apply the same operation to several values at once.

**Performance Benefits:**

- Faster Execution: Since vectorized operations run in compiled C loops (and, for linear algebra, in optimized libraries such as BLAS and LAPACK), they are much faster than manually looping through arrays.
- Cleaner Code: Vectorization leads to concise and readable code, reducing complexity and making the code easier to maintain.
- Reduced Python Overhead: Python loops have significant overhead compared to operations on entire arrays that are executed in compiled code.
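
As a rough illustration of these points, the sketch below computes the same result with an explicit Python loop and with a vectorized expression, and times both. The array size and the formula are arbitrary choices, and the measured times depend on the machine, so no particular speed-up is claimed here.

```python
import time
import numpy as np

values = np.arange(200_000, dtype=np.float64)

# Explicit loop: the interpreter handles every element individually
start = time.perf_counter()
looped = np.empty_like(values)
for i in range(values.size):
    looped[i] = values[i] * 2.0 + 1.0
loop_seconds = time.perf_counter() - start

# Vectorized: one expression, executed element-wise in compiled code
start = time.perf_counter()
vectorized = values * 2.0 + 1.0
vector_seconds = time.perf_counter() - start

assert np.array_equal(looped, vectorized)
print(f"loop: {loop_seconds:.4f} s, vectorized: {vector_seconds:.4f} s")
```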

2. **Broadcasting**

**Definition:**

- Broadcasting is a powerful concept in NumPy that allows arrays of different shapes to be used together in arithmetic operations. Broadcasting automatically adjusts the shapes of arrays to make them compatible for element-wise operations, without needing to explicitly reshape them or use loops.

**How it Works:**

- Broadcasting works by expanding the smaller array across the larger one along the appropriate axis. NumPy automatically applies a set of rules to make the shapes compatible.
- This allows operations to be performed on arrays of different sizes and shapes without explicit replication, which leads to memory efficiency and faster computation.

**Broadcasting Rules:**

NumPy follows these general rules for broadcasting:

- If the arrays have a different number of dimensions, the shape of the array with fewer dimensions is padded with ones on the left side until both shapes have the same length.
- The shapes are then compared dimension by dimension, starting from the trailing (rightmost) dimension. Two dimensions are compatible if they are equal or if one of them is 1; otherwise NumPy raises a ValueError.
- Every dimension of size 1 is stretched to match the other array, so the result takes the larger size in each dimension, and the operation is performed element-wise.
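
The rules can be traced on a small case. In the sketch below, a (3, 1) column and a (4,) row are combined; the names are illustrative, and np.broadcast_shapes() is assumed to be available (it was added in NumPy 1.20).

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])           # shape (4,), padded on the left to (1, 4)

# (3, 1) and (1, 4): every size-1 dimension is stretched, giving (3, 4)
total = column + row
print(total.shape)   # (3, 4)
print(total)
# [[ 1  2  3  4]
#  [11 12 13 14]
#  [21 22 23 24]]

# The resulting shape can be computed without doing any arithmetic
result_shape = np.broadcast_shapes((3, 1), (4,))

# Trailing dimensions 3 and 4 are neither equal nor 1, so this fails
try:
    np.ones((2, 3)) + np.ones(4)
    incompatible = False
except ValueError:
    incompatible = True
```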

**Performance Benefits of Broadcasting:**

- Memory Efficiency: Broadcasting avoids the need to duplicate data when performing operations on arrays of different shapes. Instead of replicating smaller arrays, broadcasting virtually expands them, saving memory.
- Faster Execution: Broadcasting allows for efficient use of array operations on different-shaped arrays without loops. It reduces the need for reshaping or copying data and eliminates Python overhead.
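
The "virtual expansion" can be observed with np.broadcast_to(), which returns a read-only view instead of a copy. This is a sketch of the idea rather than a description of NumPy's internals; the array names and the target shape are arbitrary.

```python
import numpy as np

row = np.arange(4, dtype=np.int64)            # 4 values, 8 bytes each
virtual = np.broadcast_to(row, (1000, 4))     # looks like 1000 rows

# A stride of 0 along axis 0 means every "row" reads the same 4 values
print(virtual.shape, virtual.strides)   # (1000, 4) (0, 8)

# No data was copied: the view shares memory with the original and is read-only
assert np.shares_memory(row, virtual)
assert not virtual.flags.writeable
```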

**Conclusion**

- Vectorization and broadcasting are two essential techniques that help NumPy perform fast, efficient computations on arrays. By using vectorized operations, we avoid the need for slow Python loops, and with broadcasting, we can work with arrays of different shapes without explicitly reshaping or duplicating them.

- Together, these concepts lead to cleaner, faster, and more memory-efficient code, making NumPy a powerful tool for numerical computing, especially when working with large datasets.




# ***Practical***

**Q.1 - Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.**

**Ans.-**

In [45]:
import numpy as np

# Create a 3x3 array with random integers between 1 and 100
arr = np.random.randint(1, 101, size=(3, 3))

# Display the original array
print("Original Array:")
print(arr)

# Interchange rows and columns (Transpose)
transposed_arr = arr.T

# Display the transposed array
print("\nTransposed Array:")
print(transposed_arr)


Original Array:
[[57 62 58]
 [11 56 59]
 [71 92 70]]

Transposed Array:
[[57 11 71]
 [62 56 92]
 [58 59 70]]


- np.random.randint(1, 101, size=(3, 3)) generates a 3x3 array with random integers from 1 to 100 (inclusive).
- .T is a shorthand for the transpose operation, which interchanges rows and columns.
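
A point worth knowing when using .T is that it returns a view, not a copy. The sketch below uses a seeded generator so the run is repeatable; the seed and the variable names are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)           # arbitrary seed for repeatability
grid = rng.integers(1, 101, size=(3, 3))      # upper bound 101 is exclusive

transposed = grid.T

# Element (i, j) of the transpose is element (j, i) of the original
assert transposed[0, 2] == grid[2, 0]

# .T shares memory with the original array; copy() gives an independent array
assert np.shares_memory(grid, transposed)
independent = grid.T.copy()
assert not np.shares_memory(grid, independent)
```

Because of the shared memory, writing to `transposed` would also change `grid`; call copy() when that is not wanted.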

**Q.2 - Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array**

**Ans.-**

In [46]:

# Step 1: Generate a 1D NumPy array with 10 elements
arr = np.arange(1, 11)  # This will create an array with values [1, 2, ..., 10]

# Step 2: Reshape it into a 2x5 array
arr_2x5 = arr.reshape(2, 5)
print("2x5 Array:")
print(arr_2x5)

# Step 3: Reshape it into a 5x2 array
arr_5x2 = arr.reshape(5, 2)
print("\n5x2 Array:")
print(arr_5x2)


2x5 Array:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

5x2 Array:
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]


- np.arange(1, 11) creates a 1D array with elements from 1 to 10.
- arr.reshape(2, 5) reshapes the 1D array into a 2x5 array (2 rows and 5 columns).
- arr.reshape(5, 2) reshapes the array into a 5x2 array (5 rows and 2 columns).
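
As a small extension, one dimension can be given as -1 so that NumPy infers it from the array size, and a shape whose total size does not match raises an error. The names below are illustrative.

```python
import numpy as np

values = np.arange(1, 11)

auto_cols = values.reshape(2, -1)   # -1 is inferred as 5
auto_rows = values.reshape(-1, 2)   # -1 is inferred as 5
print(auto_cols.shape, auto_rows.shape)   # (2, 5) (5, 2)

# 10 elements cannot fill a 3x4 array, so reshape refuses
try:
    values.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
```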

**Q.3 - Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.**

**Ans.-**

In [47]:

# Step 1: Create a 4x4 array with random float values between 0 and 1
arr = np.random.rand(4, 4)
print("Original 4x4 Array:")
print(arr)

# Step 2: Add a border of zeros around the 4x4 array, resulting in a 6x6 array
arr_with_border = np.pad(arr, pad_width=1, mode='constant', constant_values=0)
print("\nArray with Border of Zeros (6x6):")
print(arr_with_border)


Original 4x4 Array:
[[0.47042432 0.1699449  0.09377661 0.9540419 ]
 [0.45575764 0.08680432 0.10953877 0.86541732]
 [0.14029875 0.20347691 0.41334996 0.85636566]
 [0.70578858 0.6296705  0.53680841 0.03212271]]

Array with Border of Zeros (6x6):
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.47042432 0.1699449  0.09377661 0.9540419  0.        ]
 [0.         0.45575764 0.08680432 0.10953877 0.86541732 0.        ]
 [0.         0.14029875 0.20347691 0.41334996 0.85636566 0.        ]
 [0.         0.70578858 0.6296705  0.53680841 0.03212271 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


- np.random.rand(4, 4) generates a 4x4 array of random floats drawn uniformly from the half-open interval [0, 1).

- np.pad(arr, pad_width=1, mode='constant', constant_values=0) adds a border of zeros:

- pad_width=1 specifies that a border of width 1 is added to all sides (top, bottom, left, right).

- mode='constant' indicates that the padding should be filled with constant values.

- constant_values=0 sets the constant value to 0, which fills the border.
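
The same border can also be built by hand, which shows what np.pad() is doing here: allocate a 6x6 array of zeros and copy the 4x4 block into its centre. This is an alternative sketch, not the approach used above; `inner` is a deterministic stand-in for the random array.

```python
import numpy as np

inner = np.arange(16, dtype=float).reshape(4, 4)   # stand-in for the random 4x4 array

# Allocate the larger array, then write the original into the interior
bordered = np.zeros((6, 6))
bordered[1:-1, 1:-1] = inner

# Same result as the np.pad call with a constant border of zeros
padded = np.pad(inner, pad_width=1, mode='constant', constant_values=0)
assert np.array_equal(bordered, padded)
```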

**Q.4 - Using NumPy, create an array of integers from 10 to 60 with a step of 5.**

**Ans.-**

In [48]:

# Create an array from 10 to 60 with a step of 5
arr = np.arange(10, 61, 5)

# Print the array
print(arr)


[10 15 20 25 30 35 40 45 50 55 60]


- np.arange(10, 61, 5):
- 10 is the start value.
- 61 is the end value, but it is exclusive, so the array will stop at 60.
- 5 is the step size, which means the values in the array will increase by 5 starting from 10.
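
For comparison, np.linspace() describes the same sequence by its number of points instead of its step, and includes the end value by default. This is an alternative sketch; the variable names are illustrative.

```python
import numpy as np

by_step = np.arange(10, 61, 5)                       # stop value 61 is excluded
by_count = np.linspace(10, 60, num=11, dtype=int)    # stop value 60 is included

print(by_count)   # [10 15 20 25 30 35 40 45 50 55 60]
assert np.array_equal(by_step, by_count)
```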

**Q.5 - Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.**

**Ans.-**

In [49]:

# Create a NumPy array of strings
arr = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations to each element

# Uppercase
uppercase_arr = np.char.upper(arr)

# Lowercase
lowercase_arr = np.char.lower(arr)

# Title Case (first letter of each word capitalized)
titlecase_arr = np.char.title(arr)

# Capitalize (only the first letter of the string)
capitalize_arr = np.char.capitalize(arr)

# Print the original and transformed arrays
print("Original Array:")
print(arr)

print("\nUppercase Array:")
print(uppercase_arr)

print("\nLowercase Array:")
print(lowercase_arr)

print("\nTitle Case Array:")
print(titlecase_arr)

print("\nCapitalize Array:")
print(capitalize_arr)


Original Array:
['python' 'numpy' 'pandas']

Uppercase Array:
['PYTHON' 'NUMPY' 'PANDAS']

Lowercase Array:
['python' 'numpy' 'pandas']

Title Case Array:
['Python' 'Numpy' 'Pandas']

Capitalize Array:
['Python' 'Numpy' 'Pandas']


- np.char.upper(): Converts all characters in the string to uppercase.
- np.char.lower(): Converts all characters in the string to lowercase.
- np.char.title(): Converts the first character of each word in the string to uppercase and the rest to lowercase.
- np.char.capitalize(): Capitalizes only the first character of the string and converts the rest to lowercase.
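
With single-word strings, title() and capitalize() give identical output, so the difference only shows up on multi-word or mixed-case input. The sketch below uses made-up phrases to show it, and adds np.char.swapcase() as one more transformation.

```python
import numpy as np

phrases = np.array(['hello world', 'nUMPY arrays'])

titled = np.char.title(phrases)            # every word starts with a capital
capitalized = np.char.capitalize(phrases)  # only the first character is capitalized
swapped = np.char.swapcase(phrases)        # upper and lower case are exchanged

print(titled)        # ['Hello World' 'Numpy Arrays']
print(capitalized)   # ['Hello world' 'Numpy arrays']
print(swapped)       # ['HELLO WORLD' 'Numpy ARRAYS']
```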

**Q.6 - Generate a NumPy array of words. Insert a space between each character of every word in the array.**

**Ans.-** To insert a space between each character of every word in a NumPy array, you can use NumPy's vectorized string function np.char.join(). It calls Python's str.join() on every element, so the given separator is placed between the characters of each word.

In [50]:

# Create a NumPy array of words
arr = np.array(['python', 'numpy', 'pandas'])

# Insert a space between each character of every word
# (np.char.join alone is enough; prepending ' ' with np.char.add would add an unwanted leading space)
spaced_arr = np.char.join(' ', arr)

# Print the resulting array
print("Array with spaces between characters:")
print(spaced_arr)


Array with spaces between characters:
['p y t h o n' 'n u m p y' 'p a n d a s']


- np.char.join(): Joins the characters in each word, using a space ' ' as the separator.
- np.char.add(): Concatenates two string arrays element-wise. It is not needed here, because wrapping the result in np.char.add(' ', ...) would prepend a leading space to every word.

**Q.7 - Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.**

**Ans.-**

In [52]:

# Create two 2D NumPy arrays
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([[7, 8, 9], [10, 11, 12]])

# Element-wise addition
addition_result = arr1 + arr2

# Element-wise subtraction
subtraction_result = arr1 - arr2

# Element-wise multiplication
multiplication_result = arr1 * arr2

# Element-wise division
# Note: Make sure no division by zero is involved.
division_result = arr1 / arr2

# Print the results
print("Array 1:")
print(arr1)

print("\nArray 2:")
print(arr2)

print("\nElement-wise Addition:")
print(addition_result)

print("\nElement-wise Subtraction:")
print(subtraction_result)

print("\nElement-wise Multiplication:")
print(multiplication_result)

print("\nElement-wise Division:")
print(division_result)


Array 1:
[[1 2 3]
 [4 5 6]]

Array 2:
[[ 7  8  9]
 [10 11 12]]

Element-wise Addition:
[[ 8 10 12]
 [14 16 18]]

Element-wise Subtraction:
[[-6 -6 -6]
 [-6 -6 -6]]

Element-wise Multiplication:
[[ 7 16 27]
 [40 55 72]]

Element-wise Division:
[[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]


- Addition (arr1 + arr2): Adds corresponding elements of arr1 and arr2.
- Subtraction (arr1 - arr2): Subtracts corresponding elements of arr2 from arr1.
- Multiplication (arr1 * arr2): Multiplies corresponding elements of arr1 and arr2.
- Division (arr1 / arr2): Divides corresponding elements of arr1 by those in arr2.
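
The note about division by zero in the code can be handled explicitly. One option, sketched here with made-up arrays, is np.divide() with the `where` and `out` arguments, so that positions with a zero denominator keep a chosen fallback value (0 in this sketch).

```python
import numpy as np

numerator = np.array([[1.0, 2.0], [3.0, 4.0]])
denominator = np.array([[2.0, 0.0], [0.0, 8.0]])

# Divide only where the denominator is non-zero; other positions keep the value from `out`
safe = np.divide(
    numerator,
    denominator,
    out=np.zeros_like(numerator),
    where=denominator != 0,
)
print(safe)
# [[0.5 0. ]
#  [0.  0.5]]
```

Without this guard, arr1 / arr2 returns inf or nan at those positions and emits a RuntimeWarning.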

**Q.8 - Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.**

**Ans.-**

In [53]:

# Step 1: Create a 5x5 identity matrix
identity_matrix = np.eye(5)

# Step 2: Extract the diagonal elements
diagonal_elements = identity_matrix.diagonal()

# Print the results
print("5x5 Identity Matrix:")
print(identity_matrix)

print("\nDiagonal Elements:")
print(diagonal_elements)


5x5 Identity Matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal Elements:
[1. 1. 1. 1. 1.]


- np.eye(5) creates a 5x5 identity matrix, where the diagonal elements are 1, and all other elements are 0.
- identity_matrix.diagonal() extracts the diagonal elements from the identity matrix.
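
Since every diagonal entry of an identity matrix is 1, the extraction is easier to follow on a matrix with distinct values. The sketch below uses an illustrative 5x5 matrix and also shows np.diag() and the `offset` argument for off-diagonals.

```python
import numpy as np

matrix = np.arange(1, 26).reshape(5, 5)

main_diagonal = np.diag(matrix)                # same values as matrix.diagonal()
upper_diagonal = matrix.diagonal(offset=1)     # first diagonal above the main one

print(main_diagonal)    # [ 1  7 13 19 25]
print(upper_diagonal)   # [ 2  8 14 20]
print(np.trace(matrix)) # 65
```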

**Q.9 - Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.**

**Ans.-**

In [60]:

# Step 1: Generate a NumPy array of 100 random integers between 0 and 1000
arr = np.random.randint(0, 1001, size=100)

# Step 2: Function to check if a number is prime
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n ** 0.5) + 1):
        if n % i == 0:
            return False
    return True

# Step 3: Apply the function to each element in the array and find the primes
primes = arr[np.array([is_prime(x) for x in arr])]

# Step 4: Display the prime numbers
print("Prime numbers in the array:")
print(primes)


Prime numbers in the array:
[293 257 673  41 887 283 773 857  17 577 463 607 263 983 353  31 433 269
 617 599 883 353]


- np.random.randint(0, 1001, size=100) generates 100 random integers between 0 and 1000.
- is_prime() is a helper function that checks if a given number is prime.
- List comprehension is used to apply the is_prime() function to each element in the array. We convert the result into a boolean array (True for primes, False for non-primes) and use it to index the original array, extracting only the prime numbers.
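
Because the values are bounded by 1000, an alternative is to precompute a Sieve of Eratosthenes once and use it as a lookup table, which avoids calling a Python function per element. This swaps in a different technique from the trial division above; `prime_mask` is a hypothetical helper and the seed is arbitrary.

```python
import numpy as np

def prime_mask(limit):
    """Boolean array where mask[k] is True exactly when k is prime, for 0 <= k <= limit."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                       # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if mask[p]:
            mask[p * p::p] = False         # cross out multiples of p in one slice
    return mask

rng = np.random.default_rng(seed=0)        # arbitrary seed for repeatability
numbers = rng.integers(0, 1001, size=100)  # 0 to 1000 inclusive

# Index the lookup table with the numbers themselves to get a boolean mask
primes = numbers[prime_mask(1000)[numbers]]
print(primes)
```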

**Q.10 - Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.**

**Ans.-**

In [63]:

# Step 1: Generate an array of daily temperatures for a month (e.g., 30 days)
# For simplicity, let's generate random temperatures between 0 and 35 degrees Celsius
daily_temperatures = np.random.randint(0, 36, size=30)

# Step 2: Group the days into equal blocks
# 30 is not divisible by 7, so the month cannot be reshaped into complete 7-day weeks.
# As a simplification, the 30 days are grouped into 5 blocks of 6 days, and each row stands in for a "week".
weekly_temperatures = daily_temperatures.reshape(5, 6)  # 5 blocks of 6 days each

# Step 3: Calculate the weekly averages
weekly_averages = np.mean(weekly_temperatures, axis=1)

# Step 4: Display the results
print("Daily Temperatures (30 days):")
print(daily_temperatures)

print("\nWeekly Temperatures (5 weeks, 6 days each):")
print(weekly_temperatures)

print("\nWeekly Averages:")
print(weekly_averages)


Daily Temperatures (30 days):
[31  7 13 18 18 17 30 33  2  7 14 10 19 26 29  7  0 22 30 24 27  6  8  2
 29 13  3 14 16  6]

Weekly Temperatures (5 weeks, 6 days each):
[[31  7 13 18 18 17]
 [30 33  2  7 14 10]
 [19 26 29  7  0 22]
 [30 24 27  6  8  2]
 [29 13  3 14 16  6]]

Weekly Averages:
[17.33333333 16.         17.16666667 16.16666667 13.5       ]
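
The averages above are over 6-day blocks. If true 7-day weeks are wanted, one possible approach is to average the complete weeks and treat the leftover days separately. The sketch below reuses the 30 temperatures printed above; the handling of the two leftover days is an assumption about what the question intends.

```python
import numpy as np

# The 30 daily temperatures from the output above
daily = np.array([31, 7, 13, 18, 18, 17, 30, 33, 2, 7, 14, 10, 19, 26, 29,
                  7, 0, 22, 30, 24, 27, 6, 8, 2, 29, 13, 3, 14, 16, 6])

full_weeks = daily.size // 7                        # 4 complete weeks
weeks = daily[:full_weeks * 7].reshape(full_weeks, 7)
weekly_means = weeks.mean(axis=1)

# The remaining 2 days form a partial week and are averaged on their own
leftover_mean = daily[full_weeks * 7:].mean()

print(np.round(weekly_means, 2))   # [19.14 15.86 19.86 10.71]
print(leftover_mean)               # 11.0
```

For a month whose length is an exact multiple of 7 the leftover slice would be empty, so that case would need a guard before calling mean().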
