Theoretical Questions -

Q1. Explain the purpose and advantages of numpy in scientific computing and data analysis. How does it enhance python's capabilities for numerical operations?

A1. NumPy (Numerical Python) is a powerful library that extends Python’s capabilities for scientific computing and data analysis by providing efficient numerical operations, especially with large datasets and mathematical computations. Here’s a breakdown of its purpose and advantages:

### Purpose of NumPy
NumPy is designed to handle numerical data and provide a framework for array-based computing. It offers a way to work with multi-dimensional arrays, which is crucial for performing complex mathematical operations efficiently. It underpins many other data-focused libraries in Python (e.g., Pandas, SciPy) and is a foundational tool in machine learning, statistics, and other computational fields.

### Advantages of NumPy in Scientific Computing and Data Analysis

1. **Efficient Array Operations:**
   - **Arrays over lists**: NumPy arrays (ndarrays) are more memory-efficient and faster than Python lists, especially for large datasets, because they store elements in a contiguous block of memory.
   - **Broadcasting**: With broadcasting, NumPy can perform operations on arrays of different shapes (e.g., adding a scalar to a vector), allowing for efficient manipulation without explicit loops.

2. **Speed and Performance:**
   - NumPy operations are implemented in C, making them much faster than equivalent pure Python code, which is essential when dealing with large datasets or intensive computations.
   - Vectorized operations, where an entire array is processed at once, allow NumPy to avoid slow Python loops, enhancing performance for tasks like mathematical and statistical computations.

3. **Comprehensive Mathematical Functions:**
   - NumPy offers a suite of functions for linear algebra, Fourier transformations, random number generation, and statistical operations, among others. This makes it highly versatile for scientific applications, supporting everything from basic arithmetic to complex numerical methods.

4. **Integration with Other Libraries:**
   - Many scientific and data analysis libraries (such as Pandas, SciPy, scikit-learn, and TensorFlow) are built on NumPy and leverage its array structures. This interoperability makes NumPy a vital component in the data science and machine learning ecosystem.

5. **Data Preprocessing and Cleaning:**
   - NumPy is widely used to preprocess data before feeding it into machine learning models. Its array operations allow for reshaping, slicing, and filtering data efficiently, helping clean and organize large datasets.

6. **Random Number Generation and Simulation:**
   - For tasks involving simulations, NumPy offers a powerful random module to generate random numbers and samples, facilitating tasks in statistics, probabilistic modeling, and simulation experiments.
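
The points above can be illustrated with a minimal sketch. The variable names and sample values are illustrative assumptions, not data from any real source; the snippet combines a broadcast arithmetic expression, boolean filtering, a statistical reduction, and the random module:

```python
import numpy as np

# Hypothetical sample data: temperatures in Celsius (illustrative values only)
celsius = np.array([0.0, 12.5, 25.0, 37.5, 100.0])

# Vectorized arithmetic with broadcasting: the scalars are applied to
# every element without an explicit Python loop
fahrenheit = celsius * 9 / 5 + 32

# Boolean filtering: keep only the readings above 20 degrees Celsius
warm = celsius[celsius > 20]

# Built-in statistical reduction
mean_c = celsius.mean()

# Reproducible random sampling with a seeded generator
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

print(fahrenheit)     # [ 32.   54.5  77.   99.5 212. ]
print(warm)           # [ 25.   37.5 100. ]
print(mean_c)         # 35.0
print(samples.shape)  # (1000,)
```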

### How NumPy Enhances Python’s Capabilities for Numerical Operations

Python, by itself, is not optimized for high-performance numerical computation due to its interpreted nature and reliance on loops. NumPy addresses these limitations by:

- **Using C-based implementations** for its core functions, which are faster than Python’s native operations.
- **Supporting multi-dimensional arrays**, enabling complex operations across large datasets.
- **Running loops in compiled code** through vectorized operations (some of which can use SIMD instructions), which enhances Python’s ability to handle scientific and statistical computations without sacrificing readability or simplicity.

In summary, NumPy elevates Python from a general-purpose language to a powerful tool for scientific computing and data analysis by providing a high-performance, memory-efficient, and user-friendly interface for numerical operations. It is an essential library for anyone working in fields that involve large-scale data and computational analysis.

Q2. Compare and contrast np.mean() and np.average() functions in numpy. When would you use one over the other?

A2. The `np.mean()` and `np.average()` functions in NumPy both calculate the mean of an array, but they differ in functionality and use cases. Here’s a comparison and guidance on when to use each.

### `np.mean()`
- **Purpose**: Calculates the arithmetic mean along a specified axis.
- **Syntax**: `np.mean(array, axis=None, dtype=None, out=None, keepdims=False)`
- **Weighted Average**: It does not support weights; each element contributes equally to the mean.
- **Axis Parameter**: Can calculate the mean across a specific axis of a multi-dimensional array (e.g., rows or columns).
- **Use Case**: Use `np.mean()` when you want a simple average of values without considering any specific weights.

#### Example:
```python
import numpy as np
data = np.array([1, 2, 3, 4])
mean_value = np.mean(data)
print(mean_value)  # Output: 2.5
```

### `np.average()`
- **Purpose**: Calculates the weighted average of an array, where weights can be assigned to each element.
- **Syntax**: `np.average(array, axis=None, weights=None, returned=False)`
- **Weighted Average**: Supports a `weights` parameter to calculate a weighted average.
- **Axis Parameter**: Like `np.mean()`, it can compute the weighted average along a specific axis in multi-dimensional arrays.
- **Returned Parameter**: When set to `True`, this option returns a tuple containing both the calculated average and the sum of the weights.
- **Use Case**: Use `np.average()` when you need a weighted mean, where certain values contribute more heavily than others based on assigned weights.

#### Example:
```python
import numpy as np
data = np.array([1, 2, 3, 4])
weights = np.array([1, 2, 3, 4])
weighted_avg = np.average(data, weights=weights)
print(weighted_avg)  # Output: 3.0
```

### Key Differences

| Feature            | `np.mean()`                       | `np.average()`                         |
|--------------------|-----------------------------------|----------------------------------------|
| **Weights**        | Not supported                     | Supported through `weights` parameter  |
| **Return Type**    | Scalar or array                   | Scalar or array (plus optional weight sum) |
| **Default Use**    | Simple mean                       | Weighted mean (equals the simple mean when `weights=None`) |
| **Additional Output** | No                              | Optionally returns weight sum if `returned=True` |

### When to Use Each Function
- **Use `np.mean()`** when you need a straightforward mean calculation without weights. It's faster and simpler for unweighted averages.
- **Use `np.average()`** when calculating a weighted average is necessary, such as in situations where some data points carry more significance or reliability than others (e.g., combining different sample sizes).

Both functions are useful in scientific computing, with `np.mean()` being sufficient for typical averaging tasks, and `np.average()` offering flexibility for cases requiring weights.
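
As a small sketch of how the two functions relate (the `scores` and `col_weights` values are made up for illustration), the snippet below checks that `np.average()` without weights matches `np.mean()`, and then uses the `axis`, `weights`, and `returned` parameters together:

```python
import numpy as np

# Illustrative values only
scores = np.array([[80.0, 90.0],
                   [70.0, 100.0]])

# Without weights, np.average gives the same result as np.mean
print(np.average(scores) == np.mean(scores))  # True

# Weighted average along axis 1: one weight per column
col_weights = np.array([1.0, 3.0])
row_avgs, weight_sums = np.average(
    scores, axis=1, weights=col_weights, returned=True
)

print(row_avgs)     # [87.5 92.5]
print(weight_sums)  # [4. 4.]
```

With `returned=True`, the second element of the tuple is the sum of the weights used for each average, which is handy when partial averages need to be combined later.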

Q3. Describe the methods for reversing a numpy array along different axes. Provide examples for 1D and 2D arrays.

A3. In NumPy, you can reverse an array along different axes using slicing, specialized functions, or array methods. Here’s how to reverse arrays along various axes for 1D and 2D arrays.

### 1. Reversing a 1D Array
For a 1-dimensional array, reversing the elements can be done easily using slicing.

```python
import numpy as np
arr_1d = np.array([1, 2, 3, 4, 5])

# Using slicing to reverse
reversed_1d = arr_1d[::-1]
print(reversed_1d)  # Output: [5 4 3 2 1]
```

### 2. Reversing a 2D Array Along Different Axes
In a 2D array (matrix), you can reverse either:
   - Along rows (axis 0): Reverses the order of the rows.
   - Along columns (axis 1): Reverses the order of the columns.

#### Example Array:
```python
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
```

#### a. Reversing Along Rows (axis=0)
This will reverse the order of rows in the array but keep each row intact.

```python
# Using slicing to reverse along rows
reversed_rows = arr_2d[::-1, :]
print(reversed_rows)
# Output:
# [[7 8 9]
#  [4 5 6]
#  [1 2 3]]
```

#### b. Reversing Along Columns (axis=1)
This will reverse the order of elements in each row, effectively flipping the columns.

```python
# Using slicing to reverse along columns
reversed_columns = arr_2d[:, ::-1]
print(reversed_columns)
# Output:
# [[3 2 1]
#  [6 5 4]
#  [9 8 7]]
```

#### c. Reversing Both Axes
To completely reverse both the rows and columns (essentially rotating 180 degrees), use slicing for both dimensions.

```python
# Using slicing to reverse both rows and columns
reversed_both = arr_2d[::-1, ::-1]
print(reversed_both)
# Output:
# [[9 8 7]
#  [6 5 4]
#  [3 2 1]]
```

### Alternative Method: `np.flip()`
`np.flip()` can also be used to reverse an array along a specific axis or multiple axes.

```python
# Reverse along rows
reversed_rows_flip = np.flip(arr_2d, axis=0)
print(reversed_rows_flip)

# Reverse along columns
reversed_columns_flip = np.flip(arr_2d, axis=1)
print(reversed_columns_flip)

# Reverse both axes
reversed_both_flip = np.flip(arr_2d)
print(reversed_both_flip)
```

### Summary
- **1D Array**: Use slicing `[::-1]` to reverse.
- **2D Array**:
  - **Rows**: `arr_2d[::-1, :]` or `np.flip(arr_2d, axis=0)`
  - **Columns**: `arr_2d[:, ::-1]` or `np.flip(arr_2d, axis=1)`
  - **Both Axes**: `arr_2d[::-1, ::-1]` or `np.flip(arr_2d)`

These methods allow flexible manipulation of arrays along different dimensions, making it easy to rearrange data as needed.
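
One detail worth checking when reversing arrays is whether the result shares memory with the original. The following sketch (array contents chosen arbitrarily) shows that `np.flip()` accepts a tuple of axes and that a negative-step slice is a view rather than a copy:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)   # [[0 1 2], [3 4 5]]

# np.flip accepts a tuple of axes; this matches slicing both dimensions
flipped = np.flip(arr, axis=(0, 1))
print(np.array_equal(flipped, arr[::-1, ::-1]))  # True

# Slicing with a negative step returns a view, not a copy
view = arr[::-1]
print(np.shares_memory(arr, view))  # True

# Writing through the view changes the original array
view[0, 0] = 99
print(arr[1, 0])  # 99

# Call .copy() when an independent reversed array is needed
independent = arr[::-1].copy()
print(np.shares_memory(arr, independent))  # False
```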

Q4. How can you determine the data type of elements in a numpy array? Discuss the importance of data types in memory management and performance.

A4. To determine the data type of elements in a NumPy array, you can use the `.dtype` attribute of the array. Here's a quick example:

```python
import numpy as np
arr = np.array([1, 2, 3])
print(arr.dtype)  # Output: int64 (or another integer type depending on the system)
```

### Importance of Data Types in Memory Management and Performance

Data types in NumPy are crucial for efficient memory management and performance in numerical computations. Here’s why:

1. **Memory Efficiency**:
   - NumPy arrays are stored in contiguous blocks of memory, which makes access and computation faster. The data type (e.g., `int32`, `float64`) specifies how many bytes each element occupies.
   - Smaller data types consume less memory. For example, an array of `int8` (1 byte per element) requires one-eighth the memory of an `int64` (8 bytes per element) array with the same number of elements. Choosing the smallest suitable data type helps save memory, especially for large datasets.

2. **Improved Performance**:
   - Fixed-size data types enable NumPy to leverage low-level optimizations (like SIMD vectorization and memory caching) to speed up operations. For instance, arrays with `int32` or `float32` data types can often be processed faster than `int64` or `float64` arrays because they fit more data in memory and take advantage of CPU architecture.
   - Operations on smaller data types also result in faster computations since fewer bytes are processed. For example, an array of `float32` will be faster to compute than `float64` when high precision isn’t needed.

3. **Precision Control**:
   - Different data types allow you to control the precision of your calculations. For instance, `float64` provides higher precision than `float32`, which is essential in scientific computing where accuracy matters.
   - Choosing an appropriate data type prevents overflow or underflow. For example, `uint8` (0 to 255) is suitable for image processing where pixel values are within this range, while larger data types like `int64` may be unnecessary.

4. **Type Compatibility**:
   - NumPy applies type promotion rules when arrays of different data types are combined, which prevents data loss in mixed-type calculations. For example, an operation between an `int32` array and a `float32` array produces a `float64` result, because `float32` cannot represent every `int32` value exactly.
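
The promotion and overflow behavior mentioned in the list above can be checked directly. This is a minimal sketch with arbitrary sample values; it uses `np.result_type()` to ask NumPy which data type a mixed operation would produce:

```python
import numpy as np

# Mixing int32 and float32 promotes to float64, because float32 cannot
# represent every int32 value exactly
promoted = np.result_type(np.int32, np.float32)
print(promoted)  # float64

mixed = np.ones(3, dtype=np.int32) + np.ones(3, dtype=np.float32)
print(mixed.dtype)  # float64

# Fixed-width integers wrap around silently in array arithmetic
a = np.array([250], dtype=np.uint8)
b = np.array([10], dtype=np.uint8)
wrapped = a + b
print(wrapped)  # [4], because 260 does not fit in the range 0..255

# Casting to a wider type first avoids the wraparound
widened = a.astype(np.int16) + b
print(widened)  # [260]
```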

### Example of Choosing Data Types for Memory and Performance

```python
# Using a large array of integers with different data types
int_array_32 = np.array([1, 2, 3, 4, 5] * 1000000, dtype=np.int32)
int_array_64 = np.array([1, 2, 3, 4, 5] * 1000000, dtype=np.int64)

print(int_array_32.nbytes)  # Output: 20,000,000 bytes (20 MB)
print(int_array_64.nbytes)  # Output: 40,000,000 bytes (40 MB)
```

In this example, using `int32` instead of `int64` halves the memory consumption, demonstrating how selecting the appropriate data type reduces memory footprint.

In summary, selecting appropriate data types in NumPy enhances performance and reduces memory usage by aligning data size and precision requirements with computational needs, which is essential for large-scale data processing and scientific applications.

Q5. Define ndarrays in numpy and explain their key features. How do they differ from standard python lists?

A5. In NumPy, an `ndarray` (N-dimensional array) is the core data structure used for representing and manipulating multi-dimensional, homogeneous data. Unlike standard Python lists, `ndarrays` are designed specifically for numerical computing, making them more efficient and powerful for operations on large datasets.

### Key Features of `ndarray`

1. **Fixed Size**:
   - Once created, the size of an `ndarray` is fixed. You cannot change its size dynamically like Python lists. This fixed-size attribute helps optimize memory usage and processing speed.

2. **Homogeneous Data Type**:
   - All elements in an `ndarray` must be of the same data type (e.g., all integers, all floats), which makes computations more efficient by enabling vectorized operations. The data type is specified by the `dtype` attribute.

3. **Multi-dimensional**:
   - `ndarrays` support multiple dimensions (axes), allowing you to work with 1D vectors, 2D matrices, or higher-dimensional tensors. The number of dimensions is accessible via the `ndim` attribute, while the shape of each dimension can be accessed through the `shape` attribute.

4. **Memory Efficiency**:
   - `ndarrays` are stored in contiguous blocks of memory, unlike Python lists which store references to objects. This layout allows for faster access and manipulation, especially when working with large arrays.

5. **Vectorized Operations**:
   - NumPy supports element-wise operations on `ndarrays` without requiring explicit loops, thanks to vectorized operations. This feature allows for efficient and concise mathematical operations on arrays.

6. **Broadcasting**:
   - Broadcasting is a powerful feature that allows NumPy to perform operations on arrays of different shapes by stretching the smaller array to match the shape of the larger one. This is particularly useful for operations involving scalars or arrays of different dimensions.

7. **Advanced Indexing and Slicing**:
   - `ndarrays` support advanced indexing techniques (e.g., boolean indexing, integer array indexing), which makes it easier to select and manipulate subsets of data.
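
Several of these features can be seen in a few lines. The sketch below uses an arbitrary 2x3 array to show boolean indexing, integer array indexing, the fixed-size property, and the effect of a homogeneous data type:

```python
import numpy as np

arr = np.array([[10, 20, 30],
                [40, 50, 60]])

# Boolean indexing: select elements that satisfy a condition
print(arr[arr > 25])          # [30 40 50 60]

# Integer array indexing: pick the elements at (0, 2) and (1, 0)
print(arr[[0, 1], [2, 0]])    # [30 40]

# Fixed size: np.append does not grow arr in place, it returns a new array
bigger = np.append(arr, [[70, 80, 90]], axis=0)
print(arr.shape, bigger.shape)  # (2, 3) (3, 3)

# Homogeneous dtype: a float assigned into an integer array is truncated
arr[0, 0] = 3.9
print(arr[0, 0])  # 3
```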

### Example of an `ndarray`

```python
import numpy as np

# Creating a 2D ndarray
arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Array:\n", arr)
print("Shape:", arr.shape)         # Output: (2, 3)
print("Data type:", arr.dtype)      # Output: int64 (or int32, depending on system)
print("Number of dimensions:", arr.ndim)  # Output: 2
```

### Differences Between `ndarray` and Python Lists

| Feature               | `ndarray` in NumPy                          | Python List                                   |
|-----------------------|---------------------------------------------|-----------------------------------------------|
| **Data Type**         | Homogeneous (all elements are of the same type) | Heterogeneous (can contain different types)  |
| **Dimensions**        | Multi-dimensional (1D, 2D, etc.)           | Primarily 1D, nested lists can simulate multi-dimensions |
| **Memory Layout**     | Contiguous block of memory                  | Array of pointers (references to objects)     |
| **Performance**       | Faster for numerical operations due to vectorization | Slower, lacks native support for vectorized math |
| **Flexibility**       | Fixed size after creation                   | Dynamic size, can grow and shrink as needed   |
| **Operations**        | Supports element-wise arithmetic, broadcasting, etc. | Limited to looping or list comprehensions    |

### Example of Performance Difference

```python
import numpy as np
import time

# Creating a large list and ndarray
size = 1000000
list_data = list(range(size))
ndarray_data = np.array(list_data, dtype=np.int64)  # 64-bit so the sum cannot overflow

# Summing elements in Python list (perf_counter is a high-resolution timer)
start = time.perf_counter()
sum_list = sum(list_data)
print("List sum time:", time.perf_counter() - start)

# Summing elements in ndarray
start = time.perf_counter()
sum_ndarray = np.sum(ndarray_data)
print("ndarray sum time:", time.perf_counter() - start)
```

In this example, the `ndarray` sum is significantly faster due to the optimized, compiled code in NumPy.
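
A single timing can be noisy, so a somewhat more robust comparison repeats each measurement. The sketch below is one possible approach using the standard `timeit` module; the repeat counts are arbitrary choices, and the absolute numbers will vary by machine:

```python
import timeit
import numpy as np

size = 1_000_000
list_data = list(range(size))
ndarray_data = np.arange(size, dtype=np.int64)

# Repeat each measurement and keep the best run to reduce noise
list_time = min(timeit.repeat(lambda: sum(list_data), number=5, repeat=3))
array_time = min(timeit.repeat(lambda: ndarray_data.sum(), number=5, repeat=3))

print(f"list sum:    {list_time:.4f} s for 5 runs")
print(f"ndarray sum: {array_time:.4f} s for 5 runs")

# Both approaches should agree on the result
same_result = sum(list_data) == int(ndarray_data.sum())
print(same_result)  # True
```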

### Summary
`ndarrays` in NumPy offer a powerful, efficient, and flexible way to work with large datasets, particularly for scientific computing. They outshine Python lists in terms of memory efficiency, performance, and support for multi-dimensional and vectorized operations, making them essential for data analysis and numerical tasks.

Q6. Analyze the performance benefits of numpy arrays over python lists for large-scale numerical operations.

A6. NumPy arrays (`ndarrays`) offer significant performance benefits over Python lists, especially when dealing with large-scale numerical operations. This performance advantage stems from several core characteristics of `ndarrays`, including memory efficiency, vectorization, and optimized computational efficiency. Let’s explore these benefits in more detail:

### 1. Memory Efficiency
- **Fixed Data Type**: In `ndarrays`, all elements have the same data type (e.g., `int32`, `float64`), which allows them to be stored in a contiguous block of memory. In contrast, Python lists are collections of pointers to objects, which occupy more memory because they store both the object references and the object data.
- **Compact Storage**: `ndarrays` store raw data in a more compact form, with each element taking up a fixed amount of memory. This leads to a smaller memory footprint compared to lists, which can be crucial for handling large datasets.

   **Example**:
   ```python
   import numpy as np
   import sys

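   python_list = list(range(1000000))  # 1 million integers
   numpy_array = np.arange(1000000)

   # sys.getsizeof(list) counts only the list's array of references,
   # so add the size of the int objects it points to
   list_bytes = sys.getsizeof(python_list) + sum(sys.getsizeof(x) for x in python_list)

   print("Python list size in bytes:", list_bytes)          # References plus int objects
   print("NumPy array size in bytes:", numpy_array.nbytes)  # Raw element data only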
   ```

   NumPy arrays typically require less memory than Python lists for the same data, allowing for larger datasets to fit in memory.

### 2. Computational Efficiency and Vectorized Operations
- **No Loop Overhead**: NumPy supports vectorized operations, where an operation is applied to the entire array without the need for explicit Python loops. Vectorized operations are implemented in low-level C code, avoiding the overhead of Python's interpreter, and leading to faster execution.
- **Avoiding Python Loops**: Python loops are slow due to the interpreted nature of Python. NumPy's vectorized operations allow you to perform arithmetic, logical, and other operations directly on entire arrays.

   **Example**:
   ```python
   import numpy as np
   import time

   # Creating large arrays/lists
   size = 1000000
   python_list = list(range(size))
   numpy_array = np.arange(size)

   # Using a list comprehension to add 1 to each element in a Python list
   start = time.perf_counter()
   python_list = [x + 1 for x in python_list]
   print("Time taken for Python list:", time.perf_counter() - start)

   # Using vectorized operation to add 1 to each element in a NumPy array
   start = time.perf_counter()
   numpy_array = numpy_array + 1
   print("Time taken for NumPy array:", time.perf_counter() - start)
   ```

   Here, the NumPy array operation is faster than the equivalent Python list operation, especially as the size of the data grows.

### 3. Leveraging Low-Level Optimizations
- **Underlying C Implementation**: NumPy is implemented in C, which is compiled and faster than Python. This allows it to take advantage of low-level memory optimizations and efficient memory access patterns.
- **SIMD (Single Instruction, Multiple Data) Operations**: For certain operations, NumPy can leverage SIMD processing, where a single instruction is applied to multiple data points simultaneously. This parallel processing approach is not available with Python lists.

### 4. Broadcasting
- **Automatic Shape Alignment**: Broadcasting allows NumPy to perform operations on arrays of different shapes by "stretching" the smaller array to match the dimensions of the larger one. This eliminates the need for complex loops and manual adjustments, making code faster and more concise.

   **Example**:
   ```python
   import numpy as np

   large_array = np.ones((1000, 1000))
   scalar = 2

   # Broadcasting scalar multiplication
   result = large_array * scalar
   ```

   Broadcasting allows for efficient computations without needing to manually replicate the scalar or reshape arrays.
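
Broadcasting also applies between arrays of different dimensions, not only between an array and a scalar. The following sketch (with arbitrary sample values) centers the columns of a small matrix and uses `np.broadcast_to()` to show that the "stretching" does not copy data:

```python
import numpy as np

# Illustrative values, shape (2, 3)
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])

# Column means have shape (3,); broadcasting aligns them with each row
col_means = data.mean(axis=0)
centered = data - col_means
print(centered)
# [[-1.5 -1.5 -1.5]
#  [ 1.5  1.5  1.5]]

# Row means need shape (2, 1) to broadcast across the columns
row_means = data.mean(axis=1, keepdims=True)
print((data - row_means).shape)  # (2, 3)

# The stretching is virtual: the stride along the repeated axis is 0,
# so no data is copied
stretched = np.broadcast_to(col_means, data.shape)
print(stretched.strides[0])  # 0
```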

### 5. Support for Advanced Mathematical and Statistical Functions
- **Specialized Functions**: NumPy includes optimized mathematical functions like `np.mean()`, `np.sum()`, `np.sqrt()`, etc., that perform operations on entire arrays efficiently. These functions are often faster and more memory-efficient than similar operations in Python lists.

### Performance Summary and Example Benchmarks

For a benchmark, let’s compare a basic operation, like calculating the sum of squares, for a large dataset using both a Python list and a NumPy array.

```python
import numpy as np
import time

# Large dataset
size = 1000000
python_list = list(range(size))
numpy_array = np.arange(size, dtype=np.int64)  # 64-bit to avoid overflow when squaring

# Sum of squares with Python list
start = time.perf_counter()
python_list_sum_of_squares = sum([x**2 for x in python_list])
print("Python list time:", time.perf_counter() - start)

# Sum of squares with NumPy array
start = time.perf_counter()
numpy_array_sum_of_squares = np.sum(numpy_array**2)
print("NumPy array time:", time.perf_counter() - start)
```

In this benchmark, the NumPy operation is significantly faster due to vectorization and low-level optimizations.

### Summary of Performance Benefits
- **Memory Efficiency**: Compact storage in contiguous blocks saves memory.
- **Speed**: Vectorized operations avoid loop overhead, allowing faster numerical computations.
- **Broadcasting**: Simplifies and speeds up operations on arrays of different shapes.
- **Optimized Functions**: Built-in mathematical and statistical functions are optimized for performance.

NumPy arrays are thus much more efficient than Python lists for large-scale numerical operations, making them a foundational tool in scientific computing, machine learning, and data analysis.

Q7. Compare vstack() and hstack() functions in numpy. Provide examples demonstrating their usage and output.

A7. In NumPy, the `vstack()` and `hstack()` functions are used to stack arrays along different axes. While both functions are useful for combining arrays, they operate along different dimensions:

- **`np.vstack()`**: Stacks arrays vertically (row-wise).
- **`np.hstack()`**: Stacks arrays horizontally (column-wise).

Let’s break down the differences and see examples of each function in action.

### `np.vstack()`

- **Purpose**: Stacks arrays vertically along rows (i.e., along axis 0).
- **Requirement**: All input arrays must have the same number of columns to align them row-wise.
- **Result**: A new array with rows from each of the input arrays stacked on top of each other.

**Example of `np.vstack()`**

```python
import numpy as np

# Two arrays with the same number of columns
arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])
arr2 = np.array([[7, 8, 9],
                 [10, 11, 12]])

# Stacking vertically
vstacked_array = np.vstack((arr1, arr2))
print("Vertical Stack:\n", vstacked_array)
```

**Output:**

```
Vertical Stack:
 [[ 1  2  3]
  [ 4  5  6]
  [ 7  8  9]
  [10 11 12]]
```

In this example, `arr1` and `arr2` are stacked on top of each other, creating a new array with 4 rows and 3 columns.

### `np.hstack()`

- **Purpose**: Stacks arrays horizontally along columns (i.e., along axis 1 for 2D or higher inputs; 1D inputs are joined end to end along their only axis).
- **Requirement**: All input arrays must have the same number of rows to align them column-wise.
- **Result**: A new array with columns from each input array placed side by side.

**Example of `np.hstack()`**

```python
import numpy as np

# Two arrays with the same number of rows
arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])
arr2 = np.array([[7, 8, 9],
                 [10, 11, 12]])

# Stacking horizontally
hstacked_array = np.hstack((arr1, arr2))
print("Horizontal Stack:\n", hstacked_array)
```

**Output:**

```
Horizontal Stack:
 [[ 1  2  3  7  8  9]
  [ 4  5  6 10 11 12]]
```

In this case, `arr1` and `arr2` are stacked side by side, resulting in a new array with 2 rows and 6 columns.

### Key Differences Between `vstack()` and `hstack()`

| Feature                 | `np.vstack()`                             | `np.hstack()`                              |
|-------------------------|-------------------------------------------|--------------------------------------------|
| **Stacking Direction**  | Vertical (along rows)                    | Horizontal (along columns)                 |
| **Dimension Requirement** | Same number of columns                 | Same number of rows                        |
| **Axis**                | Axis 0                                    | Axis 1                                     |
| **Result Shape**        | Adds rows                                 | Adds columns                               |

### Summary
- Use `np.vstack()` when you want to add arrays on top of each other, ensuring they have the same number of columns.
- Use `np.hstack()` when you want to place arrays side by side, ensuring they have the same number of rows.

These stacking functions allow for flexible combination of arrays in different dimensions, which is useful in data manipulation and preparation tasks in scientific computing and data analysis.
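
The examples above use 2D inputs. As a short sketch with arbitrary values, the snippet below shows how the two functions treat 1D inputs and how they relate to `np.concatenate()` for 2D inputs:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# With 1D inputs, vstack treats each array as a row
v = np.vstack((a, b))
print(v.shape)  # (2, 3)

# hstack joins 1D inputs end to end
h = np.hstack((a, b))
print(h)        # [1 2 3 4 5 6]

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# For 2D inputs, both functions match np.concatenate along an explicit axis
same_v = np.array_equal(np.vstack((m1, m2)), np.concatenate((m1, m2), axis=0))
same_h = np.array_equal(np.hstack((m1, m2)), np.concatenate((m1, m2), axis=1))
print(same_v, same_h)  # True True
```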

Q8. Explain the differences between fliplr() and flipud() methods in numpy, including their effects on various array dimensions.

A8. In NumPy, `fliplr()` and `flipud()` are two functions that flip arrays along different axes:

- **`np.fliplr()`**: Flips an array horizontally (left to right).
- **`np.flipud()`**: Flips an array vertically (upside down).

These functions are particularly useful when you need to reverse the order of elements along a specific axis in a 2D or higher-dimensional array. Let’s go over each function and their effects in detail.

---

### `np.fliplr()`

- **Purpose**: Flips the array in the left-to-right (horizontal) direction.
- **Effect**: Reverses the order of columns, keeping the rows intact.
- **Applicability**: Works on 2D arrays or higher. For 1D arrays, `fliplr()` will raise an error because it specifically requires a minimum of two dimensions.
  
**Example**:

```python
import numpy as np

# 2D array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Flipping left to right
fliplr_arr = np.fliplr(arr_2d)
print("Original Array:\n", arr_2d)
print("fliplr() Result:\n", fliplr_arr)
```

**Output**:
```
Original Array:
 [[1 2 3]
  [4 5 6]
  [7 8 9]]

fliplr() Result:
 [[3 2 1]
  [6 5 4]
  [9 8 7]]
```

In this example, `np.fliplr()` has reversed the order of columns in each row, creating a mirror image across the vertical axis.

---

### `np.flipud()`

- **Purpose**: Flips the array in the top-to-bottom (vertical) direction.
- **Effect**: Reverses the order of rows, keeping the columns intact.
- **Applicability**: Works on 2D arrays or higher. `flipud()` can also work on 1D arrays, reversing their order as well.

**Example**:

```python
import numpy as np

# 2D array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Flipping top to bottom
flipud_arr = np.flipud(arr_2d)
print("Original Array:\n", arr_2d)
print("flipud() Result:\n", flipud_arr)
```

**Output**:
```
Original Array:
 [[1 2 3]
  [4 5 6]
  [7 8 9]]

flipud() Result:
 [[7 8 9]
  [4 5 6]
  [1 2 3]]
```

Here, `np.flipud()` has reversed the order of rows, creating a mirror image across the horizontal axis.

---

### Key Differences Between `fliplr()` and `flipud()`

| Feature               | `np.fliplr()`                             | `np.flipud()`                             |
|-----------------------|-------------------------------------------|-------------------------------------------|
| **Flipping Direction**| Horizontal (left-to-right)                | Vertical (top-to-bottom)                  |
| **Effect on Array**   | Reverses order of columns                 | Reverses order of rows                    |
| **Dimensions Required** | Requires at least 2D                    | Works with 1D or higher                   |

### Example with 3D Array

For 3D arrays, the same axis rule applies as in 2D: `flipud()` reverses axis 0 and `fliplr()` reverses axis 1. Neither function touches the last axis.

```python
# 3D array
arr_3d = np.array([[[1, 2, 3], [4, 5, 6]],
                   [[7, 8, 9], [10, 11, 12]]])

# Apply fliplr
fliplr_arr_3d = np.fliplr(arr_3d)
print("Original 3D Array:\n", arr_3d)
print("fliplr() on 3D Array:\n", fliplr_arr_3d)

# Apply flipud
flipud_arr_3d = np.flipud(arr_3d)
print("flipud() on 3D Array:\n", flipud_arr_3d)
```

**Output**:
- `fliplr()` reverses axis 1, so the rows within each 2D slice swap places; the order of elements inside each row is unchanged.
- `flipud()` reverses axis 0, so the two 2D slices swap places.

### Summary
- Use **`np.fliplr()`** to mirror arrays horizontally, reversing columns.
- Use **`np.flipud()`** to mirror arrays vertically, reversing rows.

These functions are very useful for spatial data manipulations in image processing and scientific computing, where flipping along specific axes can facilitate transformations or pattern matching tasks.
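
To check which axis each function reverses, the results can be compared against plain slicing. This is a minimal sketch with an arbitrary 3D array; it also covers the 1D case:

```python
import numpy as np

arr_3d = np.arange(12).reshape(2, 2, 3)

# flipud reverses axis 0; fliplr reverses axis 1, whatever the dimensionality
ud_matches = np.array_equal(np.flipud(arr_3d), arr_3d[::-1])
lr_matches = np.array_equal(np.fliplr(arr_3d), arr_3d[:, ::-1])
print(ud_matches, lr_matches)  # True True

# The last axis can be reversed with np.flip and an explicit axis
last_matches = np.array_equal(np.flip(arr_3d, axis=2), arr_3d[..., ::-1])
print(last_matches)  # True

# 1D input: flipud works, fliplr raises ValueError
arr_1d = np.array([1, 2, 3])
print(np.flipud(arr_1d))  # [3 2 1]

fliplr_failed = False
try:
    np.fliplr(arr_1d)
except ValueError as exc:
    fliplr_failed = True
    print("fliplr failed:", exc)
```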

Q9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

A9. In NumPy, the `array_split()` function is used to divide an array into multiple sub-arrays. Unlike `np.split()`, which requires the array to be evenly divisible by the number of splits, `array_split()` can handle uneven splits by automatically adjusting the sizes of the sub-arrays as needed. This makes it a versatile option for dividing arrays when the exact number of elements may not be divisible by the specified number of splits.

### Key Functionality of `np.array_split()`

- **Uneven Splits**: When the array cannot be split evenly, `array_split()` will create sub-arrays of different sizes. It ensures that the difference in size between sub-arrays is minimized by distributing any extra elements among the first few sub-arrays.
- **Number of Splits**: You can specify the number of splits directly. If the array’s length isn’t a multiple of this number, `array_split()` will handle the division so that the sub-arrays are as evenly sized as possible.

### Syntax

```python
np.array_split(array, num_splits, axis=0)
```

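- **array**: The array you want to split (NumPy names this parameter `ary`).
- **num_splits**: The number of parts you want the array split into. NumPy names this parameter `indices_or_sections`; it also accepts a list of indices at which to cut the array.
- **axis**: The axis along which the array is split. By default, this is set to `0` (split along rows).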

### Example of `array_split()` with an Uneven Split

Consider a simple 1D array with 10 elements that we want to split into 3 parts.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Split into 3 parts
split_arr = np.array_split(arr, 3)
print("Original Array:", arr)
print("Split Array:", split_arr)
```

**Output**:
```
Original Array: [ 1  2  3  4  5  6  7  8  9 10]
Split Array: [array([1, 2, 3, 4]), array([5, 6, 7]), array([ 8,  9, 10])]
```

In this example:
- The array cannot be divided evenly into three equal parts, so `array_split()` creates sub-arrays of sizes 4, 3, and 3.
- The extra element is allocated to the first sub-array, keeping the splits as balanced as possible.
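
The allocation rule can be written down explicitly: for `n` elements and `k` sections, the first `n % k` sub-arrays get `n // k + 1` elements and the rest get `n // k`. The sketch below checks that rule against `np.array_split()` for a few sizes; `expected_sizes` is a hypothetical helper written for this illustration, not a NumPy function.

```python
import numpy as np

def expected_sizes(n, k):
    """Sub-array sizes predicted by the rule: extras go to the first pieces."""
    base, extra = divmod(n, k)
    return [base + 1] * extra + [base] * (k - extra)

# Compare the prediction with what array_split actually returns
for n, k in [(10, 3), (11, 4), (7, 7), (3, 5)]:
    actual = [len(p) for p in np.array_split(np.arange(n), k)]
    print(f"n={n}, k={k}: sizes {actual}")
    assert actual == expected_sizes(n, k)
```

Note that when there are more sections than elements, as in the last case, the trailing sub-arrays are simply empty.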

### Example with 2D Array and Specifying Axis

For a 2D array, `array_split()` can also be applied along either rows or columns by specifying the axis.

```python
arr_2d = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12],
                   [13, 14, 15, 16]])

# Split into 3 parts along axis 0 (rows)
split_arr_2d = np.array_split(arr_2d, 3, axis=0)
print("Original 2D Array:\n", arr_2d)
print("Split along rows:\n", split_arr_2d)
```

**Output**:
```
Original 2D Array:
 [[ 1  2  3  4]
  [ 5  6  7  8]
  [ 9 10 11 12]
  [13 14 15 16]]

Split along rows:
 [array([[1, 2, 3, 4],
         [5, 6, 7, 8]]),
  array([[ 9, 10, 11, 12]]),
  array([[13, 14, 15, 16]])]
```

In this example:
- The 2D array has 4 rows, and we split it into 3 parts along rows (`axis=0`).
- Since 4 rows cannot be divided evenly into 3 parts, the first sub-array receives 2 rows and the remaining two sub-arrays receive 1 row each.

### Summary of `np.array_split()` Functionality

| Feature                   | Description                                                                 |
|---------------------------|-----------------------------------------------------------------------------|
| **Uneven Splits Handling**| Allocates extra elements to the initial sub-arrays, ensuring balanced sizes |
| **Supports Different Axes**| Can split along rows or columns by specifying the axis                     |
| **Flexible Usage**         | Suitable for both 1D and multi-dimensional arrays                          |

### Use Cases
- **Data Partitioning**: Useful in machine learning or data analysis when dividing data into training, validation, and test sets that may not divide evenly.
- **Parallel Processing**: Enables splitting data for parallel operations when exact divisions are not possible.

`np.array_split()` is a flexible and powerful function in NumPy, essential for scenarios where arrays need to be divided into nearly equal parts, even when an exact split isn’t feasible.

Q10. Explain the concepts of vectorization and broadcasting in numpy. How do they contribute to efficient array operations?

A10. Vectorization and broadcasting are two foundational concepts in NumPy that enable efficient array operations, especially when working with large datasets. They help eliminate the need for explicit loops and make code both faster and more readable. Let’s break down each concept and how they contribute to performance.

---

### 1. Vectorization

**Vectorization** refers to the practice of applying operations on entire arrays (or vectors) at once, rather than iterating over individual elements with loops. In NumPy, vectorized operations are implemented in compiled, low-level languages like C, which allows them to execute much faster than equivalent Python loops. By leveraging vectorization, you avoid the overhead of Python’s interpreter, enabling NumPy to handle large-scale computations efficiently.

**How Vectorization Works**:
- When you perform arithmetic or logical operations on arrays, NumPy applies these operations element-wise.
- Vectorized operations allow you to operate on entire arrays at once without writing explicit loops.

**Example of Vectorization in NumPy**:

```python
import numpy as np

# Array of numbers
array = np.array([1, 2, 3, 4, 5])

# Vectorized operation: add 10 to each element
result = array + 10
print("Result:", result)
```

**Output**:
```
Result: [11 12 13 14 15]
```

In this example, `array + 10` is a vectorized operation where 10 is added to each element of the array without using a loop.

#### Performance Benefits of Vectorization
- **Speed**: Vectorized operations execute much faster than Python loops, as they avoid interpreter overhead and use optimized, low-level C code.
- **Code Simplicity**: Vectorization reduces the amount of code needed to perform operations, making it more readable and maintainable.
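
To get a rough feel for the speed difference, the sketch below times an explicit Python loop against the equivalent vectorized expression. The array size, the repeat count, and the function names are arbitrary choices for illustration, and the measured times will vary by machine.

```python
import numpy as np
from timeit import timeit

values = np.arange(100_000, dtype=np.float64)

def add_with_loop(a):
    """Add 10 to every element using an explicit Python loop."""
    out = np.empty_like(a)
    for i in range(a.size):
        out[i] = a[i] + 10
    return out

def add_vectorized(a):
    """Add 10 to every element with a single vectorized expression."""
    return a + 10

# Both approaches must produce the same result
assert np.array_equal(add_with_loop(values), add_vectorized(values))

loop_time = timeit(lambda: add_with_loop(values), number=3)
vec_time = timeit(lambda: add_vectorized(values), number=3)
print(f"loop: {loop_time:.4f}s  vectorized: {vec_time:.4f}s")
```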

---

### 2. Broadcasting

**Broadcasting** is a powerful feature in NumPy that allows arrays of different shapes to be used together in arithmetic operations. When two arrays are not of the same shape, NumPy automatically "broadcasts" the smaller array across the larger array so that they have compatible shapes. Broadcasting eliminates the need for manually resizing or replicating arrays to match dimensions, simplifying code and improving performance.

**How Broadcasting Works**:
- Broadcasting follows specific rules to align the shapes of two arrays:
  1. If the arrays have different numbers of dimensions, NumPy prepends axes of length 1 to the shape of the array with fewer dimensions (on the left) until both arrays have the same number of dimensions.
  2. Starting from the last axis, NumPy checks if dimensions are either equal or if one of them is 1 (which allows broadcasting).
  3. If any dimension is incompatible (not equal and not 1), a broadcasting error occurs.
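
The three rules can be tried out on small arrays. In this sketch the shapes are chosen only to trigger each rule once; the variable names are illustrative.

```python
import numpy as np

a = np.ones((4, 3))
b = np.ones(3)        # treated as shape (1, 3), then stretched to (4, 3)
c = np.ones((4, 1))   # the length-1 axis is stretched to length 3
d = np.ones(4)        # treated as (1, 4); 4 vs 3 on the last axis is incompatible

print((a + b).shape)  # (4, 3)
print((a + c).shape)  # (4, 3)

try:
    a + d
    raised = False
except ValueError as exc:
    raised = True
    print("Broadcast error:", exc)
```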

**Example of Broadcasting in NumPy**:

```python
import numpy as np

# Array of shape (3, 3)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Array of shape (3,), which will be broadcasted
vector = np.array([1, 0, -1])

# Broadcasting addition
result = matrix + vector
print("Result:\n", result)
```

**Output**:
```
Result:
 [[2 2 2]
 [5 5 5]
 [8 8 8]]
```

In this example:
- The shape of `matrix` is (3, 3), and `vector` is (3,).
- NumPy broadcasts `vector` to match the shape of `matrix` by "stretching" it across each row, enabling element-wise addition.

#### Performance Benefits of Broadcasting
- **Memory Efficiency**: Broadcasting avoids creating large temporary arrays, saving memory and speeding up computations.
- **Avoiding Loops**: Broadcasting allows operations to be applied across arrays of different shapes without loops, making code more concise and faster.
- **Flexibility**: It allows operations between arrays of various shapes, making it easy to scale operations to larger arrays or matrices.
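
The memory claim can be made visible with `np.broadcast_to()`, which exposes the "stretched" array as a read-only view. This is only an illustration of the idea: a stride of 0 along the first axis means every row re-reads the same three values rather than storing copies. The result of an arithmetic operation is still a full-sized array.

```python
import numpy as np

vector = np.array([1, 0, -1])

# Read-only (3, 3) view of the 3-element vector; no data is copied
stretched = np.broadcast_to(vector, (3, 3))

print(stretched)
print("strides:", stretched.strides)  # first stride is 0: rows share memory
print("shares memory:", np.shares_memory(vector, stretched))
```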

---

### How Vectorization and Broadcasting Work Together for Efficient Array Operations

When combined, vectorization and broadcasting enable highly efficient and expressive array operations in NumPy. Here’s how they complement each other:
- **Reduced Computation Time**: Vectorization allows operations to be carried out on entire arrays, and broadcasting adjusts array shapes to make operations compatible without creating redundant copies.
- **Enhanced Readability**: Code is easier to understand and maintain, as explicit loops and conditionals for resizing are replaced by single-line expressions.
- **Optimized Memory Usage**: Broadcasting avoids the need to replicate smaller arrays in memory, which is especially important in large datasets or high-dimensional operations.

---

### Example: Vectorization and Broadcasting Combined

Suppose you have a dataset of 1,000 2D points, and you want to calculate the distance of each point from a specific point `[1, 1]`:

```python
import numpy as np

# Array of 1000 points (shape: 1000, 2)
points = np.random.rand(1000, 2)

# Specific point [1, 1] to calculate distances from
target_point = np.array([1, 1])

# Calculate Euclidean distance using vectorization and broadcasting
distances = np.sqrt(np.sum((points - target_point)**2, axis=1))
```

Here:
- **Broadcasting**: The `points` array (shape `(1000, 2)`) and `target_point` (shape `(2,)`) are made compatible by broadcasting `target_point` across each row in `points`.
- **Vectorization**: The subtraction, squaring, summation, and square root operations are all vectorized, allowing for a concise and efficient calculation.
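
As a sanity check, the same distances can be computed with an explicit loop and compared against the one-line version. The seeded generator and the name `loop_distances` are assumptions made for this sketch so that it is reproducible.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
points = rng.random((1000, 2))
target_point = np.array([1, 1])

# Vectorized and broadcast version from the example above
distances = np.sqrt(np.sum((points - target_point) ** 2, axis=1))

# Equivalent explicit loop, one point at a time
loop_distances = np.array([
    math.hypot(x - target_point[0], y - target_point[1]) for x, y in points
])

assert np.allclose(distances, loop_distances)
print(distances.shape)  # (1000,)
```

`np.linalg.norm(points - target_point, axis=1)` should give the same values and is another common way to write this.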

---

### Summary

| Concept        | Description                                                                                        | Performance Benefit                        |
|----------------|----------------------------------------------------------------------------------------------------|--------------------------------------------|
| **Vectorization** | Applying operations to entire arrays at once, without explicit loops                            | Speed, reduced code complexity             |
| **Broadcasting**  | Automatically adjusting array shapes to make them compatible for element-wise operations       | Memory efficiency, flexibility, simplicity |

Both vectorization and broadcasting are essential for high-performance numerical computing in Python, enabling large-scale data processing, scientific simulations, and machine learning operations to be executed with optimal efficiency.

Practical Questions -

Q1. Create a 3*3 numpy array with random integers between 1 and 100. Then, interchange its rows and columns.

A1. Here is a 3x3 NumPy array with random integers between 1 and 100:

**Original Array:**
```
[[91 93 40]
 [90 80 32]
 [30 80 54]]
```

After interchanging its rows and columns (i.e., transposing the array), the result is:

**Transposed Array:**
```
[[91 90 30]
 [93 80 80]
 [40 32 54]]
```

This demonstrates the creation of a random array and the interchanging of its rows and columns using the transpose operation.
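
The answer shows only the arrays, so here is one way they could be produced. This is a sketch using NumPy's `default_rng` generator; because it is unseeded, the values will differ from those shown above on every run.

```python
import numpy as np

rng = np.random.default_rng()

# Random integers from 1 to 100 (the upper bound 101 is exclusive)
arr = rng.integers(1, 101, size=(3, 3))

# Interchange rows and columns
transposed = arr.T

print("Original Array:\n", arr)
print("Transposed Array:\n", transposed)
```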

Q2. Generate a 1D numpy array with 10 elements. Reshape it into a 2*5 array, then into a 5*2 array.

A2. Here is the process of generating a 1D NumPy array with 10 elements, reshaping it into a 2x5 array, and then into a 5x2 array:

**Original 1D Array:**
```
[ 1  2  3  4  5  6  7  8  9 10]
```

**Reshaped into a 2x5 Array:**
```
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
```

**Reshaped into a 5x2 Array:**
```
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]
```

This demonstrates how to reshape a NumPy array while preserving the data. Each reshaped version maintains the same elements, just organized differently in terms of dimensions.
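
A short sketch of code that would produce the arrays above, assuming the 1D array is built with `np.arange`:

```python
import numpy as np

# 1D array with the 10 elements 1..10
arr_1d = np.arange(1, 11)

# Same data viewed as 2 rows x 5 columns, then as 5 rows x 2 columns
arr_2x5 = arr_1d.reshape(2, 5)
arr_5x2 = arr_2x5.reshape(5, 2)

print("Original 1D Array:\n", arr_1d)
print("Reshaped into a 2x5 Array:\n", arr_2x5)
print("Reshaped into a 5x2 Array:\n", arr_5x2)
```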

Q3. Create a 4*4 numpy array with random float values. Add a border of zeros around it, resulting in a 6*6 array.

A3. Here’s a 4x4 NumPy array filled with random float values, and the result after adding a border of zeros around it, resulting in a 6x6 array.

**Original 4x4 Array:**
```
[[0.42240664 0.68130395 0.13771008 0.22030108]
 [0.37418396 0.50442906 0.98003374 0.87440551]
 [0.27560348 0.39682364 0.2668862  0.25488832]
 [0.39289231 0.96418008 0.14402944 0.4088397 ]]
```

**6x6 Array with Border of Zeros:**
```
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.42240664 0.68130395 0.13771008 0.22030108 0.        ]
 [0.         0.37418396 0.50442906 0.98003374 0.87440551 0.        ]
 [0.         0.27560348 0.39682364 0.2668862  0.25488832 0.        ]
 [0.         0.39289231 0.96418008 0.14402944 0.4088397  0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
```

This demonstrates how to add a border of zeros to an existing NumPy array using the `np.pad()` function.
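
One way to write this with `np.pad()` is sketched below. The generator is unseeded, so the float values will not match the ones printed above.

```python
import numpy as np

rng = np.random.default_rng()

# 4x4 array of random floats in [0, 1)
arr = rng.random((4, 4))

# Pad one element of zeros on every side, giving a 6x6 array
bordered = np.pad(arr, pad_width=1, mode="constant", constant_values=0)

print("Original 4x4 Array:\n", arr)
print("6x6 Array with Border of Zeros:\n", bordered)
```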

Q4. Using numpy, create an array of integers from 10 to 60 with a step 5.

A4. You can create a NumPy array of integers from 10 to 60 with a step of 5 using the `np.arange` function. Here's how to do it:

```python
import numpy as np

# Create an array of integers from 10 to 60 with a step of 5
arr = np.arange(10, 61, 5)

print(arr)
```

### Output
This will produce the following output:

```
[10 15 20 25 30 35 40 45 50 55 60]
```

### Explanation
- **`np.arange(start, stop, step)`**: This function generates values starting from `start`, up to (but not including) `stop`, with increments of `step`. Because `stop` is exclusive, the code passes 61 rather than 60 so that 60 itself appears in the result.

Q5. Create a numpy array of strings ['python','numpy','pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.

A5. You can create a NumPy array of strings and then apply different case transformations to each element using NumPy's `np.char` functions, which apply the corresponding Python string methods element-wise. Here's how you can do it:

```python
import numpy as np

# Create a NumPy array of strings
arr = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
uppercase = np.char.upper(arr)         # Uppercase
lowercase = np.char.lower(arr)         # Lowercase
titlecase = np.char.title(arr)         # Title Case
capitalize = np.char.capitalize(arr)    # Capitalize

# Combine the results into a structured array for better visualization
result = np.array(list(zip(uppercase, lowercase, titlecase, capitalize)),
                  dtype=[('uppercase', 'U10'),
                         ('lowercase', 'U10'),
                         ('titlecase', 'U10'),
                         ('capitalize', 'U10')])

print(result)
```

### Output
This code will produce an output similar to:

```
[('PYTHON', 'python', 'Python', 'Python')
 ('NUMPY', 'numpy', 'Numpy', 'Numpy')
 ('PANDAS', 'pandas', 'Pandas', 'Pandas')]
```

### Explanation
1. **`np.char.upper(arr)`** converts all strings in the array to uppercase.
2. **`np.char.lower(arr)`** converts all strings in the array to lowercase.
3. **`np.char.title(arr)`** converts each string to title case (the first letter of each word is capitalized).
4. **`np.char.capitalize(arr)`** capitalizes the first letter of each string while making the rest lowercase.

The results are combined into a structured array for easy viewing.

Q6. Generate a numpy array of words. Insert a space between each character of every word in the array.

A6. You can generate a NumPy array of words and then insert a space between each character of every word using NumPy's string functions. Here's how you can do it:

```python
import numpy as np

# Create a NumPy array of words
words = np.array(['hello', 'world', 'numpy', 'python'])

# Insert a space between each character of every word
spaced_words = np.char.join(' ', words)

print(spaced_words)
```

### Output
This code will produce the following output:

```
['h e l l o' 'w o r l d' 'n u m p y' 'p y t h o n']
```

### Explanation
- **`np.char.join(' ', words)`**: This function takes a delimiter (in this case, a space) and inserts it between each character of the words in the array, effectively creating a spaced version of each word.

Q7. Create two 2D numpy arrays and perform element-wise addition, subtraction, multiplication, and division.

A7. You can create two 2D NumPy arrays and perform element-wise operations such as addition, subtraction, multiplication, and division easily using NumPy. Here’s how you can do it:

```python
import numpy as np

# Create two 2D NumPy arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

# Element-wise addition
addition = array1 + array2

# Element-wise subtraction
subtraction = array1 - array2

# Element-wise multiplication
multiplication = array1 * array2

# Element-wise division
division = array1 / array2

# Display the results
print("Addition:\n", addition)
print("Subtraction:\n", subtraction)
print("Multiplication:\n", multiplication)
print("Division:\n", division)
```

### Output
This code will produce the following output:

```
Addition:
 [[ 8 10 12]
 [14 16 18]]
Subtraction:
 [[-6 -6 -6]
 [-6 -6 -6]]
Multiplication:
 [[ 7 16 27]
 [40 55 72]]
Division:
 [[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]
```

### Explanation
- **Element-wise Addition**: Each element of `array1` is added to the corresponding element of `array2`.
- **Element-wise Subtraction**: Each element of `array2` is subtracted from the corresponding element of `array1`.
- **Element-wise Multiplication**: Each element of `array1` is multiplied by the corresponding element of `array2`.
- **Element-wise Division**: Each element of `array1` is divided by the corresponding element of `array2`.

These operations are straightforward due to NumPy's ability to handle element-wise operations on arrays directly.

Q8. Use numpy to create a 5*5 identity matrix, then extract its diagonal elements.

A8. You can create a 5x5 identity matrix using NumPy and then extract its diagonal elements with the `np.eye` and `np.diag` functions. Here’s how you can do it:

```python
import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.eye(5)

# Extract the diagonal elements
diagonal_elements = np.diag(identity_matrix)

print("Identity Matrix:\n", identity_matrix)
print("Diagonal Elements:", diagonal_elements)
```

### Output
This code will produce the following output:

```
Identity Matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Diagonal Elements: [1. 1. 1. 1. 1.]
```

### Explanation
- **`np.eye(5)`**: This function creates a 5x5 identity matrix, where the diagonal elements are 1 and all other elements are 0.
- **`np.diag(identity_matrix)`**: This function extracts the diagonal elements of the matrix, which will all be 1s in the case of an identity matrix.

Q9. Generate a numpy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

A9. To generate a NumPy array of 100 random integers between 0 and 1000 and then find all the prime numbers in that array, you can follow these steps:

1. Generate the random integers.
2. Define a function to check for prime numbers.
3. Use the function to filter out the prime numbers from the array.

Here's the complete code:

```python
import numpy as np

# Generate an array of 100 random integers between 0 and 1000
random_integers = np.random.randint(0, 1000, size=100)

# Function to check if a number is prime
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

# Find all prime numbers in the array
prime_numbers = [num for num in random_integers if is_prime(num)]

# Display the results
print("Random Integers:\n", random_integers)
print("Prime Numbers:\n", prime_numbers)
```

### Output
The output will show the array of random integers and the list of prime numbers found within that array. The specific integers and primes will vary each time you run the code due to the randomness.

### Explanation
- **`np.random.randint(0, 1000, size=100)`** generates an array of 100 random integers between 0 and 999 (inclusive of 0, exclusive of 1000).
- **`is_prime(n)`** is a function that checks whether a number `n` is prime. It returns `True` if `n` is prime and `False` otherwise.
- **List comprehension** is used to iterate over the random integers and collect those that are prime.
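
Since all values lie below 1000, an alternative is to precompute a prime lookup table once with a Sieve of Eratosthenes and then filter the array with boolean indexing. This is a different technique from the per-number check above, shown only as a sketch; `prime_mask` and `is_prime_lookup` are hypothetical names introduced for this example.

```python
import numpy as np

def prime_mask(limit):
    """Boolean array where mask[n] is True when n is prime, for 0 <= n <= limit."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                      # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            mask[i * i :: i] = False      # cross out multiples of i
    return mask

random_integers = np.random.randint(0, 1000, size=100)

# Look up each random integer in the table and keep those marked prime
is_prime_lookup = prime_mask(999)
prime_numbers = random_integers[is_prime_lookup[random_integers]]

print("Prime Numbers:\n", prime_numbers)
```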

Q10. Create a numpy array representing daily temperatures for a month. Calculate and display the weekly averages.

A10. To create a NumPy array representing daily temperatures for a month and calculate the weekly averages, you can follow these steps:

1. Generate random temperature values for 30 days.
2. Reshape the array to represent weeks.
3. Calculate the weekly averages.

Here’s how you can do it:

```python
import numpy as np

# Generate random temperatures for 30 days (for example, between -10 and 35 degrees Celsius)
daily_temperatures = np.random.randint(-10, 36, size=30)

# Reshape the first 28 days into 4 full weeks of 7 days.
# A 30-element array cannot be reshaped to (4, 7), so the last 2 days are left out.
weekly_temperatures = daily_temperatures[:28].reshape(4, 7)

# Calculate the weekly averages
weekly_averages = np.mean(weekly_temperatures, axis=1)

# Display the results
print("Daily Temperatures for the Month:\n", daily_temperatures)
print("Weekly Temperatures:\n", weekly_temperatures)
print("Weekly Averages:\n", weekly_averages)
```

### Output
This code will generate output similar to the following, though the specific temperatures will vary due to randomness:

```
Daily Temperatures for the Month:
 [23  5 15  6  7  4 29 30  8 10 25 22 16  3  9 18 27  0 24 31 12 14 35  2 19 11  1 28 20  4]
Weekly Temperatures:
 [[23  5 15  6  7  4 29]
 [30  8 10 25 22 16  3]
 [ 9 18 27  0 24 31 12]
 [14 35  2 19 11  1 28]]
Weekly Averages:
 [12.71428571 16.28571429 17.28571429 15.71428571]
```

### Explanation
- **`np.random.randint(-10, 36, size=30)`** generates an array of 30 random integers representing daily temperatures, ranging from -10 to 35 degrees Celsius.
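- **`reshape(4, 7)`** organizes 28 daily temperatures into a 2D array where each row represents a week (4 weeks of 7 days). A 30-element array cannot be reshaped to (4, 7) directly, so the reshape has to be applied to the first 28 days (`daily_temperatures[:28]`), leaving the last 2 days out of the weekly averages.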
- **`np.mean(..., axis=1)`** calculates the average temperature for each week by specifying `axis=1`, which means it averages along the rows.

This gives you the weekly averages for the month’s temperatures.
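
A 30-day month is four full weeks plus two extra days, so a (4, 7) reshape cannot hold all 30 values. One possible way to keep every day is to cut the array at explicit indices with `np.array_split()` and average each chunk separately. This is a sketch of an alternative, not the approach used above; `chunks` and `chunk_averages` are illustrative names.

```python
import numpy as np

daily_temperatures = np.random.randint(-10, 36, size=30)

# Cut after days 7, 14, 21 and 28; the last chunk holds the 2 leftover days
chunks = np.array_split(daily_temperatures, [7, 14, 21, 28])
chunk_averages = np.array([chunk.mean() for chunk in chunks])

for number, (chunk, avg) in enumerate(zip(chunks, chunk_averages), start=1):
    print(f"Week {number}: {len(chunk)} days, average {avg:.2f}")
```

The final average covers only two days, so it is not directly comparable to the four full-week averages.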