### Theoretical Questions

#### 1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

**Purpose:**
NumPy, short for Numerical Python, is a fundamental package for scientific computing in Python. Its primary purpose is to provide support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently.

**Advantages:**
- **Performance**: NumPy's core is implemented in C, and each array stores elements of a single type in a contiguous block of memory, which makes arrays more compact and faster to process than Python lists.
- **Convenience**: Provides a wide range of mathematical functions and operations for array manipulation, making it easier to perform complex calculations.
- **Functionality**: Supports advanced operations such as linear algebra, Fourier transforms, and random number generation.
- **Interoperability**: Works seamlessly with other scientific libraries such as SciPy, Pandas, and Matplotlib, enhancing the capabilities for scientific computing and data analysis.

**Enhancements to Python:**
- **Array Operations**: Enables element-wise operations and batch processing, which are faster and more efficient than iterating over lists.
- **Memory Efficiency**: Uses less memory due to its fixed data type and contiguous memory allocation, which reduces overhead.
- **Broadcasting**: Allows operations on arrays of different shapes without the need for explicit loops, making the code more intuitive and less error-prone.

*Real-life example:* 
In machine learning, NumPy is often used to hold large datasets and to perform the matrix operations that underlie model training and optimization. For instance, computing the dot product of large matrices with `np.dot()` or the `@` operator is far faster than the equivalent nested Python loops.
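
The points above can be sketched in a few lines. This is a minimal illustration, not a benchmark; the variable names and values (`prices`, `a`, `b`) are made up for the example. It contrasts a list comprehension with the equivalent vectorized expression and then computes a small matrix product.

```python
import numpy as np

# Illustrative data (values chosen arbitrarily for this sketch)
prices = [10.0, 20.0, 30.0]
prices_arr = np.array(prices)

# Pure Python: an explicit loop (list comprehension) over the elements
discounted_list = [p * 0.9 for p in prices]

# NumPy: one vectorized expression, executed in compiled code
discounted_arr = prices_arr * 0.9

# Matrix product of a 2x3 and a 3x2 matrix
a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
b = np.arange(6).reshape(3, 2)   # [[0, 1], [2, 3], [4, 5]]
product = a @ b                  # same result as np.dot(a, b)
print(product)
```

Both approaches give the same discounted values; the difference is that the NumPy version never loops over elements in Python.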

#### 2. Compare and contrast `np.mean()` and `np.average()` functions in NumPy. When would you use one over the other?

**Comparison:**
- **`np.mean()`**: Calculates the arithmetic mean along the specified axis. It is straightforward and does not allow for weighted calculations.
- **`np.average()`**: Computes the weighted average, allowing for different weights for different elements. If no weights are provided, it behaves like `np.mean()`.

**Usage:**
- Use `np.mean()` when you need a simple average of an array.
- Use `np.average()` when dealing with datasets where certain elements have more significance (weight) than others. For example, calculating the average score of a student where different assignments have different weights.

*Real-life example:* 
In a grading system, if you have exams with different weights, you would use `np.average()` to calculate the final grade, considering the weight of each exam.
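
A small sketch of the grading example, assuming three hypothetical exam scores and weights that are not part of the original assignment:

```python
import numpy as np

# Hypothetical exam scores and weights (illustrative values only)
scores = np.array([80.0, 90.0, 70.0])
weights = np.array([0.2, 0.3, 0.5])

simple_mean = np.mean(scores)                        # (80 + 90 + 70) / 3
unweighted_avg = np.average(scores)                  # no weights: same as np.mean
weighted_avg = np.average(scores, weights=weights)   # sum(w * x) / sum(w)

print(simple_mean, unweighted_avg, weighted_avg)
```

Here the weighted result is lower than the simple mean because the lowest score carries the largest weight.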

#### 3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

**Methods:**
- **1D Array**: Use slicing `[::-1]` to reverse the array.

In [1]:
import numpy as np
arr = np.array([1, 2, 3, 4])
reversed_arr = arr[::-1]  # Output: [4, 3, 2, 1]
reversed_arr

array([4, 3, 2, 1])

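- **2D Array**: Apply `[::-1]` to the axis you want to reverse: `arr[::-1, :]` reverses the row order, `arr[:, ::-1]` reverses the column order, and `arr[::-1, ::-1]` reverses both.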

In [2]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
reversed_arr_2d = arr_2d[::-1, ::-1]  # Reverses both rows and columns
reversed_arr_2d  # Output: [[9, 8, 7], [6, 5, 4], [3, 2, 1]]

array([[9, 8, 7],
       [6, 5, 4],
       [3, 2, 1]])

*Real-life example:* 
Reversing an image matrix for image processing tasks like flipping an image horizontally or vertically.
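
The cell above reverses both axes at once. As a sketch of the single-axis cases, the snippet below reverses only the rows or only the columns and checks that `np.flip()` with an explicit `axis` gives the same result:

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

rows_reversed = arr_2d[::-1, :]    # reverse row order only (vertical flip)
cols_reversed = arr_2d[:, ::-1]    # reverse column order only (horizontal flip)

# np.flip with an explicit axis gives the same results
assert np.array_equal(rows_reversed, np.flip(arr_2d, axis=0))
assert np.array_equal(cols_reversed, np.flip(arr_2d, axis=1))

print(rows_reversed)
print(cols_reversed)
```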

#### 4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.

**Determining Data Type:**
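Use the `dtype` attribute of the array (for example, `arr.dtype`) to find the data type of its elements. The default integer type depends on the platform and NumPy version, so the example below may print `int64` instead of `int32`.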

In [3]:
arr = np.array([1, 2, 3])
print(arr.dtype)  # Output: int32

int32


**Importance:**
- **Memory Management**: Different data types consume different amounts of memory. Using the appropriate type can save memory and optimize performance.
- **Performance**: Operations on fixed-type arrays are faster due to optimized low-level implementations and reduced type checking.

*Real-life example:* 
Using `float32` instead of `float64` for image data can reduce memory usage and improve performance in image processing tasks.
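
To make the memory difference concrete, here is a minimal sketch that assumes a hypothetical 1000x1000 single-channel image and compares its footprint in the two float types:

```python
import numpy as np

# A hypothetical 1000x1000 single-channel image, all zeros
image64 = np.zeros((1000, 1000), dtype=np.float64)
image32 = image64.astype(np.float32)   # convert to a smaller float type

print(image64.dtype, image64.itemsize, image64.nbytes)  # 8 bytes per element
print(image32.dtype, image32.itemsize, image32.nbytes)  # 4 bytes per element
```

Halving the element size halves the array's memory, at the cost of reduced floating-point precision.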

#### 5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

**Definition:**
An ndarray is a multi-dimensional, homogeneous array of fixed-size items.

**Key Features:**
- **Homogeneous**: All elements are of the same type.
- **N-dimensional**: Can have multiple dimensions (e.g., 2D for matrices, 3D for tensors).
- **Indexing and Slicing**: Supports multi-axis slicing, boolean masks, and integer-array (fancy) indexing, none of which plain Python lists offer.
- **Broadcasting**: Allows arithmetic operations on arrays of different shapes, making code more efficient and concise.

**Differences from Python Lists:**
- **Performance**: NumPy arrays are faster due to contiguous memory allocation and lower-level optimizations.
- **Functionality**: Supports a wider range of mathematical operations and functions.
- **Memory**: More memory-efficient due to fixed data types and contiguous memory storage.

*Real-life example:* 
NumPy ndarrays are extensively used in scientific computing and machine learning for handling large datasets and performing complex mathematical computations.
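
A short sketch of the differences listed above, using small made-up values: the same operator behaves differently on a list and an ndarray, and an ndarray coerces mixed inputs to a single dtype.

```python
import numpy as np

py_list = [1, 2, 3]
nd = np.array([1, 2, 3])

print(py_list * 2)   # list repetition: [1, 2, 3, 1, 2, 3]
print(nd * 2)        # element-wise multiplication: [2 4 6]

# A list can hold mixed types; an ndarray coerces to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)   # float64

# ndarrays expose their dimensionality and shape
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim, matrix.shape)   # 2 (2, 3)
```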

#### 6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

**Performance Benefits:**
- **Speed**: NumPy arrays are implemented in C and are much faster for numerical operations due to optimized low-level operations.
- **Memory Efficiency**: Use less memory due to fixed data types and contiguous memory storage, which reduces overhead.
- **Vectorization**: Allows for batch operations, reducing the need for explicit loops and increasing efficiency.

*Real-life example:* 
Performing matrix multiplications in machine learning algorithms is significantly faster with NumPy arrays compared to Python lists, which can dramatically reduce training time for large models.
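
One way to observe the speed difference is a rough timing with the standard-library `timeit` module. This is only a sketch: the array size and repeat count are arbitrary choices, and the measured times depend on the machine and NumPy build, so no specific numbers are claimed here.

```python
import timeit
import numpy as np

n = 100_000
data_list = list(range(n))
data_arr = np.arange(n, dtype=np.int64)

def squares_list():
    # Pure-Python loop over list elements
    return sum(x * x for x in data_list)

def squares_numpy():
    # Vectorized: multiply and sum in compiled code
    return int(np.sum(data_arr * data_arr))

t_list = timeit.timeit(squares_list, number=10)
t_numpy = timeit.timeit(squares_numpy, number=10)
print(f"list: {t_list:.4f}s  numpy: {t_numpy:.4f}s")
```

Both functions return the same sum of squares; only the time taken differs.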

#### 7. Compare `vstack()` and `hstack()` functions in NumPy. Provide examples demonstrating their usage and output.

**Comparison:**
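- **`vstack()`**: Stacks arrays vertically (row-wise, along axis 0). 1D inputs of length N are first treated as rows of shape (1, N).
- **`hstack()`**: Stacks arrays horizontally (column-wise, along axis 1). 1D inputs are simply concatenated end to end.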

In [4]:
# Usage
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Vertical stack
vstacked = np.vstack((arr1, arr2))
print(vstacked) # Output: [[1, 2, 3], [4, 5, 6]]

print()
# Horizontal stack
hstacked = np.hstack((arr1, arr2))
print(hstacked) # Output: [1, 2, 3, 4, 5, 6]

[[1 2 3]
 [4 5 6]]

[1 2 3 4 5 6]


*Real-life example:* 
Stacking feature vectors in machine learning datasets for model training. For instance, stacking multiple samples vertically to create a dataset matrix.
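
The example above uses 1D inputs. As a complementary sketch with two small 2D arrays (values chosen for illustration), the resulting shapes show which axis each function extends:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6], [7, 8]])   # shape (2, 2)

v = np.vstack((a, b))   # rows of b appended below a -> shape (4, 2)
h = np.hstack((a, b))   # columns of b appended to the right of a -> shape (2, 4)

print(v.shape, h.shape)
print(h)
```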

#### 8. Explain the differences between `fliplr()` and `flipud()` methods in NumPy, including their effects on various array dimensions.

**Differences:**
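- **`fliplr()`**: Flips an array horizontally (left to right) by reversing the order of columns (axis 1). The input must be at least 2D.
- **`flipud()`**: Flips an array vertically (upside down) by reversing the order of rows (axis 0). It also works on 1D arrays, where it simply reverses the elements.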

In [5]:
# Effects
arr_2d = np.array([[1, 2], [3, 4], [5, 6]])

# Horizontal flip
flipped_lr = np.fliplr(arr_2d)
print(flipped_lr)
# Output: [[2, 1], [4, 3], [6, 5]]

print()

# Vertical flip
flipped_ud = np.flipud(arr_2d)
print(flipped_ud)
# Output: [[5, 6], [3, 4], [1, 2]]

[[2 1]
 [4 3]
 [6 5]]

[[5 6]
 [3 4]
 [1 2]]


*Real-life example:* 
Image processing tasks where flipping an image horizontally or vertically is required, such as creating mirror images or performing data augmentation.
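
The effect on other dimensions can be sketched as follows. The helper names (`fliplr_failed`, `same_ud`, `same_lr`) are introduced only for this illustration.

```python
import numpy as np

arr_1d = np.array([1, 2, 3])

# flipud works on 1D input because it only needs axis 0
print(np.flipud(arr_1d))   # [3 2 1]

# fliplr needs axis 1, so a 1D array is rejected
try:
    np.fliplr(arr_1d)
    fliplr_failed = False
except ValueError:
    fliplr_failed = True

# On a 3D array, flipud reverses axis 0 and fliplr reverses axis 1
arr_3d = np.arange(8).reshape(2, 2, 2)
same_ud = np.array_equal(np.flipud(arr_3d), arr_3d[::-1])
same_lr = np.array_equal(np.fliplr(arr_3d), arr_3d[:, ::-1])
print(fliplr_failed, same_ud, same_lr)
```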

#### 9. Discuss the functionality of the `array_split()` method in NumPy. How does it handle uneven splits?

**Functionality:**
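`array_split()` splits an array into multiple sub-arrays along a given axis (axis 0 by default). The second argument is either the number of sections or a list of indices at which to split.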

**Handling Uneven Splits:**
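When the array cannot be evenly divided, `array_split()` distributes the elements as evenly as possible: for an array of length `l` split into `n` sections, the first `l % n` sub-arrays have `l // n + 1` elements and the rest have `l // n`. Unlike `np.split()`, it does not raise an error for an uneven division.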

In [6]:
arr = np.array([1, 2, 3, 4, 5])
split_arr = np.array_split(arr, 3)
print(split_arr)
# Output: [array([1, 2]), array([3, 4]), array([5])]

[array([1, 2]), array([3, 4]), array([5])]


*Real-life example:* 
Dividing a dataset into batches or cross-validation folds when the number of samples is not evenly divisible, so that the subset sizes differ by at most one element.
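
A brief sketch contrasting `np.array_split()` with `np.split()` on a 7-element array (the array and the `split_failed` flag are illustrative):

```python
import numpy as np

arr = np.arange(7)   # 7 elements cannot be divided evenly into 3 parts

parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]
print(sizes)   # [3, 2, 2]

# np.split requires an exact division and raises ValueError otherwise
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)
```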

#### 10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

**Vectorization:**
Refers to the ability to perform element-wise operations on arrays without explicit loops, using optimized low-level code.

**Broadcasting:**
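Allows operations on arrays of different but compatible shapes. NumPy compares the shapes from the trailing dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The smaller array is then virtually stretched to the larger shape without copying its data.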

**Contribution to Efficiency:**
- **Speed**: Operations are performed at C speed, avoiding Python loops and making the code much faster.
- **Memory**: Reduces the need for additional memory allocations and avoids unnecessary data replication.

*Real-life example:* 
In neural network computations, vectorized operations and broadcasting are used to efficiently process large datasets and matrices, significantly speeding up training and inference times.
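
Both ideas can be seen in a minimal sketch. The feature matrix `X` below is a made-up example; centering its columns uses a vectorized subtraction in which a shape `(2,)` array broadcasts against a shape `(3, 2)` array.

```python
import numpy as np

# Hypothetical feature matrix: 3 samples (rows) x 2 features (columns)
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

col_means = X.mean(axis=0)    # shape (2,)
centered = X - col_means      # (3, 2) - (2,) broadcasts across rows

# A column vector and a row vector broadcast to a full grid
col = np.arange(3).reshape(3, 1)   # shape (3, 1)
row = np.arange(4).reshape(1, 4)   # shape (1, 4)
grid = col + row                   # shape (3, 4)

print(centered)
print(grid.shape)
```

No explicit loop appears anywhere, and `col_means` is never physically tiled to match `X`.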

### Practical Exercises

#### 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

**Explanation:**
- Create a 3x3 NumPy array with random integers between 1 and 100 using `np.random.randint()`.
- Use `np.transpose()` (or the `.T` attribute) to interchange the rows and columns of the array.

In [7]:
import numpy as np

# Step 1: Create a 3x3 NumPy array with random integers between 1 and 100
array_3x3 = np.random.randint(1, 101, size=(3, 3))
print("Original 3x3 Array:\n", array_3x3)

# Step 2: Transpose the array to interchange its rows and columns
transposed_array = np.transpose(array_3x3)
print("Transposed Array (Rows and Columns Interchanged):\n", transposed_array)

Original 3x3 Array:
 [[ 92  25  18]
 [ 61 100  76]
 [  5  15  96]]
Transposed Array (Rows and Columns Interchanged):
 [[ 92  61   5]
 [ 25 100  15]
 [ 18  76  96]]


#### 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

**Explanation:**
- Use `np.arange()` to create the array.
- Use `reshape()` to reshape the array into a 2x5 array and then into a 5x2 array.

In [8]:
import numpy as np

# Step 1: Generate a 1D NumPy array with 10 elements
array_1d = np.arange(1, 11)
print("1D Array with 10 elements:\n", array_1d)

# Step 2: Reshape the array into a 2x5 array
array_2x5 = array_1d.reshape(2, 5)
print("Reshaped to 2x5 Array:\n", array_2x5)

# Step 3: Reshape the 2x5 array into a 5x2 array
array_5x2 = array_2x5.reshape(5, 2)
print("Reshaped to 5x2 Array:\n", array_5x2)

1D Array with 10 elements:
 [ 1  2  3  4  5  6  7  8  9 10]
Reshaped to 2x5 Array:
 [[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
Reshaped to 5x2 Array:
 [[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]


#### 3. Generate a 4x4 random matrix with values between 0 and 1. Add a border of zeros around it, resulting in a 6x6 array.

**Explanation:**
- Use `np.random.rand()` to generate the matrix.
- Use `np.pad()` to add a border of zeros.

In [9]:
# Generate a 4x4 random matrix
arr = np.random.rand(4, 4)
print("Generated Array:\n", arr)

print()
# Add a border of zeros
arr_with_border = np.pad(arr, pad_width=1, mode='constant', constant_values=0)
print("Array with Border:\n", arr_with_border)

Generated Array:
 [[0.61358442 0.76032607 0.86463691 0.05799793]
 [0.77000466 0.39261373 0.13652629 0.27230951]
 [0.32690507 0.93560414 0.10268371 0.43417463]
 [0.63997547 0.4600669  0.7992294  0.94936807]]

Array with Border:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.61358442 0.76032607 0.86463691 0.05799793 0.        ]
 [0.         0.77000466 0.39261373 0.13652629 0.27230951 0.        ]
 [0.         0.32690507 0.93560414 0.10268371 0.43417463 0.        ]
 [0.         0.63997547 0.4600669  0.7992294  0.94936807 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


#### 4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

**Explanation:**
- Use `np.arange()` to create the array with a specified step.

In [10]:
# Create an array of integers from 10 to 60 with a step of 5
arr = np.arange(10, 61, 5)
print("Array from 10 to 60 with step 5:\n", arr)

Array from 10 to 60 with step 5:
 [10 15 20 25 30 35 40 45 50 55 60]


#### 5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case) to each element.

**Explanation:**
- Use `np.char` functions for string manipulations.

In [11]:
# Create an array of strings
arr = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
uppercase = np.char.upper(arr)
lowercase = np.char.lower(arr)
titlecase = np.char.title(arr)

print("Uppercase:\n", uppercase)
print("Lowercase:\n", lowercase)
print("Titlecase:\n", titlecase)

Uppercase:
 ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase:
 ['python' 'numpy' 'pandas']
Titlecase:
 ['Python' 'Numpy' 'Pandas']


#### 6. Generate a NumPy array of words. Insert a space between each character of every word in the array.

**Explanation:**
- Use `np.char.join()` to insert spaces.

In [12]:
# Create an array of words
arr = np.array(['hello', 'world'])

# Insert a space between each character
spaced_arr = np.char.join(' ', arr)
print("Array with spaces between characters:\n", spaced_arr)

Array with spaces between characters:
 ['h e l l o' 'w o r l d']


#### 7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

**Explanation:**
- Use basic arithmetic operations for element-wise calculations.

In [13]:
# Create two 2D arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# Perform element-wise operations
addition = arr1 + arr2
subtraction = arr1 - arr2
multiplication = arr1 * arr2
division = arr1 / arr2

print("Addition:\n", addition)
print("Subtraction:\n", subtraction)
print("Multiplication:\n", multiplication)
print("Division:\n", division)

Addition:
 [[ 6  8]
 [10 12]]
Subtraction:
 [[-4 -4]
 [-4 -4]]
Multiplication:
 [[ 5 12]
 [21 32]]
Division:
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]


#### 8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

**Explanation:**
- Use `np.eye()` to create the identity matrix.
- Use `np.diag()` to extract diagonal elements.

In [14]:
# Create a 5x5 identity matrix
identity_matrix = np.eye(5)

# Extract diagonal elements
diagonal_elements = np.diag(identity_matrix)

print("Identity Matrix:\n", identity_matrix)
print("Diagonal Elements:\n", diagonal_elements)

Identity Matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Diagonal Elements:
 [1. 1. 1. 1. 1.]


#### 9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

**Explanation:**
- Use `np.random.randint()` to generate the array.
- Implement a function to check for prime numbers.

In [15]:
# Generate a NumPy array of 100 random integers between 0 and 1000
arr = np.random.randint(0, 1001, 100)

# Function to check if a number is prime
def is_prime(num):
    if num < 2:
        return False
    for i in range(2, int(np.sqrt(num)) + 1):
        if num % i == 0:
            return False
    return True

# Find prime numbers in the array
primes = np.array([num for num in arr if is_prime(num)])
print("Prime Numbers in the Array:\n", primes)

Prime Numbers in the Array:
 [523 239 439 167  59 199 857 971 211  59 827 641 307 479 883 127]
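
As an alternative sketch (not the approach used in the cell above), the per-element trial division can be replaced by a Sieve of Eratosthenes lookup table combined with boolean-mask indexing. The function name `primes_mask_up_to` and the seeded generator are assumptions made for this illustration.

```python
import numpy as np

def primes_mask_up_to(limit):
    """Return a boolean array where mask[k] is True if k is prime."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                      # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            mask[i * i::i] = False        # cross out multiples of i
    return mask

rng = np.random.default_rng(0)            # seeded only to make the sketch repeatable
arr = rng.integers(0, 1001, size=100)     # 100 integers in [0, 1000]

is_prime_lookup = primes_mask_up_to(1000)
primes = arr[is_prime_lookup[arr]]        # boolean-mask indexing, no Python loop over arr
print(primes)
```

The sieve is built once for the whole value range, so checking each array element becomes a simple table lookup.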


#### 10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

**Explanation:**
- Use `np.random.rand()` to generate random temperatures.
- Use `np.pad()`, `reshape()`, and `np.nanmean()` to calculate weekly averages, including the final partial week.

**Steps:**
1. **Generate daily temperatures:** Creates an array of random temperatures for 30 days.
2. **Calculate padding length:** Determines how many values need to be added to make the array length a multiple of 7.
3. **Pad the array:** Pads the array with `NaN` values.
4. **Reshape the array:** Reshapes the array into weeks, now that its length is a multiple of 7.
5. **Calculate weekly averages:** Uses `np.nanmean` to compute the mean, ignoring `NaN` values.

This approach ensures that the array can be reshaped without errors and calculates accurate weekly averages even if the number of days is not a perfect multiple of 7.

In [16]:
import numpy as np

# Generate a NumPy array of daily temperatures for a month
daily_temperatures = np.random.rand(30) * 30  # Random temperatures between 0 and 30

# Calculate the number of days to pad to make the length a multiple of 7
padding_length = (7 - len(daily_temperatures) % 7) % 7

# Pad the array with NaNs to handle partial weeks properly
padded_temperatures = np.pad(daily_temperatures, (0, padding_length), constant_values=np.nan)

# Reshape the array to have weeks as rows
reshaped_temperatures = padded_temperatures.reshape(-1, 7)

# Calculate weekly averages, ignoring NaNs
weekly_averages = np.nanmean(reshaped_temperatures, axis=1)

print("Daily Temperatures:\n", daily_temperatures)
print("\n Padded Temperatures:\n", padded_temperatures)
print("\n Weekly Averages:\n", weekly_averages)


Daily Temperatures:
 [21.42229909 23.38007456  1.62071232 24.27363528 13.56514933 21.49515487
  7.87442151 27.07496669 19.49048802  6.8413274  13.36489369 17.16211408
 25.03537818 13.74051282 25.84158408  5.34035636 25.61019358 22.67053897
  1.07695161 25.21573453 15.22903569 13.68361997 12.54653703 15.87455698
 29.90917837 27.45895463 24.70402903 11.65658808 28.34947189  6.27929139]

 Padded Temperatures:
 [21.42229909 23.38007456  1.62071232 24.27363528 13.56514933 21.49515487
  7.87442151 27.07496669 19.49048802  6.8413274  13.36489369 17.16211408
 25.03537818 13.74051282 25.84158408  5.34035636 25.61019358 22.67053897
  1.07695161 25.21573453 15.22903569 13.68361997 12.54653703 15.87455698
 29.90917837 27.45895463 24.70402903 11.65658808 28.34947189  6.27929139
         nan         nan         nan         nan         nan]

 Weekly Averages:
 [16.23306385 17.52995441 17.28348497 19.40478058 17.31438164]


### End of Assignment!