## NumPy Assignment
**Student Name:** Twesigye John Davidson

## Instructions:
- Please complete the questions below.
- For each cell, provide a clear written explanation and code examples.
- Please comment your code indicating how you approached the problem.

# What is NumPy, and what are some of its key features?

NumPy, which stands for Numerical Python, is a fundamental library for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of high-level mathematical functions to operate on these arrays efficiently.

**Key Features:**

* **N-dimensional Array Object (ndarray):** This is the core of NumPy. The `ndarray` is a faster and more memory-efficient data structure than Python's built-in lists, especially for numerical operations.
* **Broadcasting:** This feature allows NumPy to perform arithmetic operations on arrays of different shapes. It provides a powerful mechanism for vectorizing array operations, which simplifies code and improves performance.
* **Mathematical and Statistical Functions:** NumPy contains a comprehensive library of mathematical functions for performing element-wise operations, as well as functions for linear algebra, Fourier analysis, and random number generation.
* **Integration with Other Libraries:** NumPy serves as the foundation for many other scientific and data analysis libraries in Python, such as SciPy, Pandas, and Matplotlib.
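
As a minimal sketch of the features above, the following compares an explicit Python loop with a single vectorized `ndarray` expression. The `prices` and `quantities` values are hypothetical sample data chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data for illustration
prices = [10.0, 20.0, 30.0]
quantities = [1, 2, 3]

# Pure-Python approach: loop over the paired elements
totals_list = [p * q for p, q in zip(prices, quantities)]

# NumPy approach: one vectorized expression on two ndarrays
prices_arr = np.array(prices)
quantities_arr = np.array(quantities)
totals_arr = prices_arr * quantities_arr

print(totals_list)        # [10.0, 40.0, 90.0]
print(totals_arr)         # the same values, stored in an ndarray
print(totals_arr.dtype)   # all elements of an ndarray share one dtype
```

Both approaches give the same numbers; the NumPy version expresses the whole calculation without a loop.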

# How do you create a NumPy array using Python's built-in range() function?

Python's `range()` function does not return a NumPy array by itself, but `np.array()` accepts any sequence. You can pass a `range` object to it directly, as in `np.array(range(10))`, or convert the range to a list first and then pass that list to the `np.array()` constructor.

However, the more direct and standard method in NumPy is to use the `np.arange()` function. It takes the same `start`, `stop`, and `step` arguments as Python's `range()` but returns a NumPy array instead of a `range` object.
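
The sketch below, using arbitrary illustrative values, shows that `range()` and `np.arange()` accept the same `start`, `stop`, and `step` arguments, and that `np.arange()` additionally accepts a non-integer step.

```python
import numpy as np

# Same start, stop, and step passed to both functions
from_range = np.array(list(range(2, 20, 4)))
from_arange = np.arange(2, 20, 4)

print(from_range)    # [ 2  6 10 14 18]
print(from_arange)   # [ 2  6 10 14 18]

# Unlike range(), np.arange() also works with a fractional step
halves = np.arange(0, 2, 0.5)
print(halves)        # [0.  0.5 1.  1.5]
```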

In [None]:
import numpy as np

# Method 1: Using Python's range() and converting it to a NumPy array
# First, create a list using range()
python_list = list(range(10))
# Then, convert the list into a NumPy array
numpy_array_from_list = np.array(python_list)
print(f"Array from list: {numpy_array_from_list}")

# Method 2: The more direct NumPy approach using np.arange()
# This is generally preferred as it's more efficient
numpy_array_from_arange = np.arange(10)
print(f"Array from np.arange(): {numpy_array_from_arange}")

# What is the difference between a scalar value and a vector in NumPy? Explain with a code example.

In the context of NumPy, the main difference between a scalar and a vector relates to their dimensions.

* A **scalar** is a single number, essentially a zero-dimensional array. It has a shape of `()` and represents a single value.
* A **vector** is a one-dimensional array of numbers. It has a specific length and its shape is represented as `(n,)`, where `n` is the number of elements in the vector.
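
As a small illustrative sketch (the values are arbitrary), indexing one position of a vector gives back a zero-dimensional NumPy scalar, and a scalar can be combined with every element of a vector in a single expression.

```python
import numpy as np

vector = np.array([2, 4, 6])

# Picking one element out of a vector yields a scalar (0 dimensions)
first = vector[0]
print(np.ndim(first))    # 0
print(np.ndim(vector))   # 1

# A scalar is applied to every element of the vector
scaled = 3 * vector
print(scaled)            # [ 6 12 18]
```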

In [None]:
import numpy as np

# Creating a scalar
# A scalar is a 0-dimensional array in NumPy
scalar = np.array(15)
print(f"Scalar value: {scalar}")
print(f"Shape of the scalar: {scalar.shape}")
print(f"Number of dimensions: {scalar.ndim}")

print("-"*30)

# Creating a vector
# A vector is a 1-dimensional array
vector = np.array([2, 4, 6, 8, 10])
print(f"Vector: {vector}")
print(f"Shape of the vector: {vector.shape}")
print(f"Number of dimensions: {vector.ndim}")

# How do you calculate the mean of a NumPy array using the mean() function? Give examples.

You can calculate the mean (or average) of a NumPy array by using the `np.mean()` function. This function computes the arithmetic mean of all the elements in the array. You can also apply it to a specific axis of a multi-dimensional array.
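
To make the definition concrete, this sketch checks that the mean is the sum of the elements divided by their count. It also uses the `keepdims=True` argument, which keeps the reduced axis with length 1; the sample array is illustrative.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6]])

# The arithmetic mean is the sum of the elements divided by their count
manual_mean = arr2d.sum() / arr2d.size
print(np.isclose(manual_mean, np.mean(arr2d)))   # True

# The array method arr.mean() is equivalent to np.mean(arr)
row_means = arr2d.mean(axis=1)

# keepdims=True keeps the averaged axis as a dimension of length 1
row_means_col = arr2d.mean(axis=1, keepdims=True)
print(row_means.shape)       # (2,)
print(row_means_col.shape)   # (2, 1)
```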

In [1]:
import numpy as np

# Example 1: Mean of a 1D array
arr1d = np.array([10, 20, 30, 40, 50])
mean_1d = np.mean(arr1d)
print(f"1D Array: {arr1d}")
print(f"Mean of the 1D array: {mean_1d}")

print("-"*30)

# Example 2: Mean of a 2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"2D Array:\n{arr2d}")

# Calculate the mean of all elements in the 2D array
total_mean = np.mean(arr2d)
print(f"Overall mean of the 2D array: {total_mean}")

# Calculate the mean along a specific axis
# axis=0 calculates the mean of each column
mean_axis0 = np.mean(arr2d, axis=0)
print(f"Mean along axis 0 (columns): {mean_axis0}")

# axis=1 calculates the mean of each row
mean_axis1 = np.mean(arr2d, axis=1)
print(f"Mean along axis 1 (rows): {mean_axis1}")

1D Array: [10 20 30 40 50]
Mean of the 1D array: 30.0
------------------------------
2D Array:
[[1 2 3]
 [4 5 6]]
Overall mean of the 2D array: 3.5
Mean along axis 0 (columns): [2.5 3.5 4.5]
Mean along axis 1 (rows): [2. 5.]


# What is broadcasting in NumPy, and how can it be useful?

Broadcasting is a powerful mechanism in NumPy that allows arithmetic operations to be performed on arrays of different shapes. When performing an operation between two arrays, NumPy compares their shapes dimension by dimension. If the shapes are compatible, it "broadcasts" the smaller array across the larger one, treating it as if it were repeated to match the larger shape.

**Usefulness:**
Broadcasting is incredibly useful because it avoids the need to create explicit copies of data to match array shapes, which leads to more efficient memory usage. It also allows for more concise and readable code, as it vectorizes operations and eliminates the need for explicit loops.
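
One common use, sketched here with hypothetical measurement data, is centering each column of a table by subtracting the column means. The array of means has shape `(2,)` and is broadcast across all three rows without any loop or manual copying.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) by 2 features (columns)
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

col_means = data.mean(axis=0)   # shape (2,), one mean per column
centered = data - col_means     # (3, 2) - (2,) broadcasts across the rows

print(centered)
# [[ -1. -10.]
#  [  0.   0.]
#  [  1.  10.]]
print(centered.mean(axis=0))    # each column now averages to 0
```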

In [None]:
import numpy as np

# Create a 2D array (matrix)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Create a 1D array (vector)
vector = np.array([10, 20, 30])

# Without broadcasting, we would need to tile the vector to match the matrix shape
# Tiled vector would look like:
# [[10, 20, 30],
#  [10, 20, 30],
#  [10, 20, 30]]

# With broadcasting, NumPy handles this automatically
# The vector is 'broadcast' across each row of the matrix
result = matrix + vector

print("Matrix:")
print(matrix)
print("\nVector:")
print(vector)
print("\nResult of matrix + vector (with broadcasting):")
print(result)

# How can you slice a NumPy array to extract a subarray?

Slicing in NumPy is similar to slicing Python lists, but it can be extended to multiple dimensions. The syntax for slicing is `start:stop:step`, where `start` is the starting index (inclusive), `stop` is the ending index (exclusive), and `step` is the increment.

For a multi-dimensional array, you provide slices for each dimension, separated by commas.
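
The following sketch shows the `step` part of the syntax and negative indices. It also illustrates a point worth knowing: a basic slice is a view of the original data, so writing to the slice changes the original array unless you call `.copy()`. The array names are illustrative.

```python
import numpy as np

arr = np.arange(10)

every_other = arr[::2]     # start and stop omitted, step of 2
reversed_arr = arr[::-1]   # a negative step walks backwards
last_three = arr[-3:]      # a negative index counts from the end

print(every_other)    # [0 2 4 6 8]
print(reversed_arr)   # [9 8 7 6 5 4 3 2 1 0]
print(last_three)     # [7 8 9]

# A basic slice is a view: writing to it changes the original array
grid = np.zeros((3, 4), dtype=int)
block = grid[:2, 1:]
block[:] = 1
print(grid.sum())     # 6, because six cells of grid were set to 1

# .copy() gives an independent array instead
independent = grid[:2, 1:].copy()
independent[:] = 9
print(grid.max())     # still 1
```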

In [None]:
import numpy as np

# Slicing a 1D array
arr1d = np.arange(10) # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(f"Original 1D array: {arr1d}")

# Get elements from index 2 up to (but not including) index 5
slice1d = arr1d[2:5]
print(f"Slice from index 2 to 5: {slice1d}")

print("-"*30)

# Slicing a 2D array
arr2d = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
print(f"Original 2D array:\n{arr2d}")

# Slice to get the first two rows and columns from index 1 to the end
# For rows, we use :2 (from start up to index 2)
# For columns, we use 1: (from index 1 to the end)
slice2d = arr2d[:2, 1:]
print(f"\nSubarray (first 2 rows, columns 1-end):\n{slice2d}")

# What are some of the available functions for performing element-wise operations on NumPy arrays?

NumPy provides a large set of universal functions (ufuncs) that perform element-wise operations on arrays. These are highly optimized and much faster than performing the equivalent operations in a Python loop.

Some common element-wise functions include:
* **Arithmetic:** `np.add`, `np.subtract`, `np.multiply`, `np.divide`
* **Trigonometric:** `np.sin`, `np.cos`, `np.tan`
* **Exponents and Logarithms:** `np.exp`, `np.log`, `np.log10`
* **Square Root:** `np.sqrt`
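
As a brief sketch of the trigonometric, exponential, and logarithmic functions listed above (the input values are arbitrary), each function is applied to every element of its input array:

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])
sines = np.sin(angles)      # approximately [0, 1, 0]

values = np.array([1.0, 10.0, 100.0])
logs = np.log10(values)     # [0., 1., 2.]

# np.log is the natural logarithm, so it undoes np.exp
roundtrip = np.log(np.exp(values))

print(np.round(sines, 6))
print(logs)
print(np.allclose(roundtrip, values))   # True
```

Because floating-point results are approximate, `np.allclose` is used for the comparison instead of `==`.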

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

# Element-wise addition
addition = np.add(arr1, arr2) # Same as arr1 + arr2
print(f"Element-wise addition: {addition}")

# Element-wise subtraction
subtraction = np.subtract(arr2, arr1) # Same as arr2 - arr1
print(f"Element-wise subtraction: {subtraction}")

# Element-wise multiplication
multiplication = np.multiply(arr1, arr2) # Same as arr1 * arr2
print(f"Element-wise multiplication: {multiplication}")

# Element-wise square root
sqrt_arr1 = np.sqrt(arr1)
print(f"Element-wise square root of arr1: {sqrt_arr1}")

# How do you reshape a NumPy array to have a different shape?

You can reshape a NumPy array using the `np.reshape()` function or the `reshape()` method of an array object. Reshaping allows you to change the dimensions of an array without changing its data. The total number of elements in the new shape must be the same as the original array.
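
The sketch below illustrates the element-count requirement: asking for a shape that needs a different number of elements raises a `ValueError`. It also shows that, for a contiguous array like this one, the reshaped result is a view onto the same data.

```python
import numpy as np

arr = np.arange(12)

# 12 elements cannot fill a 5x3 grid (15 slots), so NumPy raises ValueError
try:
    arr.reshape(5, 3)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(f"Reshape failed: {err}")

# For this contiguous array, reshape returns a view of the same data
grid = arr.reshape(3, 4)
grid[0, 0] = -1
print(arr[0])   # -1, the original array sees the change
```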

In [None]:
import numpy as np

# Create a 1D array with 12 elements
arr = np.arange(12)
print(f"Original array (shape {arr.shape}):\n{arr}")

# Reshape the array into a 3x4 matrix (3 rows, 4 columns)
reshaped_arr_1 = arr.reshape(3, 4)
print(f"\nReshaped to 3x4 (shape {reshaped_arr_1.shape}):\n{reshaped_arr_1}")

# Reshape the array into a 4x3 matrix (4 rows, 3 columns)
reshaped_arr_2 = np.reshape(arr, (4, 3))
print(f"\nReshaped to 4x3 (shape {reshaped_arr_2.shape}):\n{reshaped_arr_2}")

# One dimension can be -1, which means NumPy will infer its size
# For example, reshape to have 2 rows and automatically calculate columns
reshaped_arr_3 = arr.reshape(2, -1)
print(f"\nReshaped to 2 rows (shape {reshaped_arr_3.shape}):\n{reshaped_arr_3}")

# How do you perform matrix multiplication on two NumPy arrays using the dot() function?

Matrix multiplication can be performed using the `np.dot()` function or the `@` operator (for Python 3.5+). For two matrices to be multiplied, the number of columns in the first matrix must be equal to the number of rows in the second matrix.

If `A` is an `m x n` matrix and `B` is an `n x p` matrix, their matrix product will be an `m x p` matrix.
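
To show where the `m x p` shape comes from, this sketch computes the product with explicit loops and compares it to `np.dot()`. Each entry of the result is the sum of products of one row of `A` with one column of `B`. The loop version is for illustration only; in practice you would call `np.dot()` or use `@`.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # m x n = 2 x 3
B = np.array([[7, 8], [9, 10], [11, 12]])   # n x p = 3 x 2

m, n = A.shape
n_b, p = B.shape
assert n == n_b, "columns of A must equal rows of B"

# Entry (i, j) is the sum over k of A[i, k] * B[k, j]
manual = np.zeros((m, p), dtype=A.dtype)
for i in range(m):
    for j in range(p):
        manual[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

print(manual)
# [[ 58  64]
#  [139 154]]
print(np.array_equal(manual, np.dot(A, B)))   # True
```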

In [None]:
import numpy as np

# Create a 2x3 matrix
matrix_a = np.array([[1, 2, 3], [4, 5, 6]]) # Shape: (2, 3)

# Create a 3x2 matrix
matrix_b = np.array([[7, 8], [9, 10], [11, 12]]) # Shape: (3, 2)

print(f"Matrix A (2x3):\n{matrix_a}")
print(f"\nMatrix B (3x2):\n{matrix_b}")

# Perform matrix multiplication using np.dot()
dot_product = np.dot(matrix_a, matrix_b)
print(f"\nResult of dot product (2x2):\n{dot_product}")

# The @ operator can also be used for the same purpose
matmul_product = matrix_a @ matrix_b
print(f"\nResult using @ operator (2x2):\n{matmul_product}")

# How can you use the where() function to apply a condition to a NumPy array?

The `np.where()` function is very powerful for conditional logic on arrays. In its three-argument form, `np.where(condition, x, y)`, it returns elements from `x` where the `condition` is `True`, and elements from `y` where the `condition` is `False`. When called with only a `condition`, it returns a tuple of index arrays marking where the condition is `True`.

This allows you to create a new array based on the values of another array without writing explicit loops.
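
As a short sketch with hypothetical exam scores, the three-argument form can produce labels or cap values. Here `x` and `y` are either constants or the original array.

```python
import numpy as np

# Hypothetical exam scores for illustration
scores = np.array([35, 72, 50, 18, 90])

# Choose a label for each element based on the condition
labels = np.where(scores >= 50, "pass", "fail")
print(labels)   # ['fail' 'pass' 'pass' 'fail' 'pass']

# Cap values above 70 and keep the rest unchanged
capped = np.where(scores > 70, 70, scores)
print(capped)   # [35 70 50 18 70]
```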

In [None]:
import numpy as np

arr = np.arange(1, 11)
print(f"Original array: {arr}")

# Example 1: Replace numbers based on a condition
# Let's replace all even numbers with -1 and keep odd numbers as they are
# The condition is arr % 2 == 0
# If true, use -1. If false, use the original value from arr.
new_arr = np.where(arr % 2 == 0, -1, arr)
print(f"Array with even numbers replaced: {new_arr}")

# Example 2: Find the indices where a condition is met
# If only the condition is provided, np.where() returns a tuple of indices
indices = np.where(arr > 5)
print(f"\nIndices of elements greater than 5: {indices[0]}")
print(f"Elements greater than 5: {arr[indices]}")

# What is the difference between the flatten() and ravel() functions in NumPy?

Both `flatten()` and `ravel()` are used to convert a multi-dimensional array into a 1D array. The key difference between them lies in how they handle memory:

* **`flatten()`:** This function always returns a **copy** of the original array's data. This means that any modifications made to the new flattened array will not affect the original array.

* **`ravel()`:** This function returns a **view** of the original array whenever possible. A view is a reference to the original data, so modifying the raveled array may also modify the original array. When a view is not possible, for example when the array is not stored contiguously in memory, `ravel()` returns a copy instead. It's generally faster and more memory-efficient because it avoids copying data if it doesn't have to.
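
One way to check whether you received a view or a copy is `np.shares_memory()`. The sketch below also shows a case where `ravel()` cannot return a view: a transposed array is not laid out row by row in memory, so flattening it in row order requires a copy.

```python
import numpy as np

original = np.arange(6).reshape(2, 3)

flat_copy = original.flatten()
flat_view = original.ravel()
print(np.shares_memory(original, flat_copy))   # False: flatten() copies
print(np.shares_memory(original, flat_view))   # True: ravel() returned a view

# Flattening the transpose in row order needs a copy, even with ravel()
transposed = original.T
flat_from_transpose = transposed.ravel()
print(np.shares_memory(original, flat_from_transpose))   # False
```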

In [None]:
import numpy as np

# Create an original 2D array
original_array = np.array([[1, 2], [3, 4]])
print(f"Original array:\n{original_array}\n")

# Using flatten()
flattened_array = original_array.flatten()
flattened_array[0] = 99 # Modify the flattened array
print(f"Flattened array after modification: {flattened_array}")
print(f"Original array is unchanged:\n{original_array}\n")

print("-"*30)

# Using ravel()
raveled_array = original_array.ravel()
raveled_array[0] = 99 # Modify the raveled array
print(f"Raveled array after modification: {raveled_array}")
print(f"Original array is now changed:\n{original_array}")

# How do you use NumPy's advanced indexing capabilities to select specific elements from an array?

Advanced indexing in NumPy refers to several sophisticated ways of selecting elements, which go beyond simple slicing. The two main types are integer array indexing and boolean array indexing.

1.  **Integer Array Indexing:** You can pass a list or NumPy array of integers to select specific elements. For a 1D array, this allows you to pick elements at desired indices. For a 2D array, you can provide a list of row indices and a list of column indices to select specific points.

2.  **Boolean Array Indexing:** You can use a boolean array (of the same shape as the original array) to select elements. The elements corresponding to `True` values in the boolean array are selected, while those corresponding to `False` are ignored.
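
The following sketch, using hypothetical sensor readings, combines two conditions in one boolean mask and uses a mask on the left side of an assignment. It also shows that integer array indexing returns a copy and not a view.

```python
import numpy as np

# Hypothetical sensor readings for illustration
readings = np.array([4, -2, 7, -9, 12, 3])

# Combine conditions with & (and) or | (or); each condition needs parentheses
mid_range = readings[(readings > 0) & (readings < 10)]
print(mid_range)   # [4 7 3]

# A boolean mask on the left side of an assignment updates matching elements
cleaned = readings.copy()
cleaned[cleaned < 0] = 0
print(cleaned)     # [ 4  0  7  0 12  3]

# Integer array indexing returns a copy, so the original is not affected
picked = readings[[0, 2]]
picked[0] = 999
print(readings[0])   # 4
```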

In [None]:
import numpy as np

arr = np.arange(10, 20) # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
print(f"Original array: {arr}\n")

# --- Integer Array Indexing ---
indices_to_select = [1, 4, 7]
selected_elements = arr[indices_to_select]
print(f"Selected elements using integer array: {selected_elements}")

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"\n2D Array:\n{arr2d}\n")
# Select elements at (row 0, col 1), (row 1, col 2), and (row 2, col 0)
selected_points = arr2d[[0, 1, 2], [1, 2, 0]]
print(f"Selected points from 2D array: {selected_points}")

print("-"*30)

# --- Boolean Array Indexing ---
# Create a boolean condition
is_even = (arr % 2 == 0)
print(f"Boolean mask for even numbers: {is_even}")

# Use the boolean array to select elements
even_numbers = arr[is_even]
print(f"Selected even numbers: {even_numbers}")

# How can you use NumPy's broadcasting rules to perform operations on arrays with different shapes?

NumPy's broadcasting rules allow operations on arrays with different but compatible shapes. Compatibility is determined by two rules:

1.  **Rule 1:** If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
2.  **Rule 2:** If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.

If these conditions are met, the arrays are compatible for broadcasting. If any dimension has mismatched sizes and neither is 1, a `ValueError` is raised.
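
The rules above can be sketched as a small function that works on shape tuples only. `broadcast_shape` is a hypothetical helper written for illustration and is not part of NumPy; its result is cross-checked against the shape NumPy actually produces.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shape tuples."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with ones on its left side
    padded_a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    padded_b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for dim_a, dim_b in zip(padded_a, padded_b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            # Rule 2: a dimension of size 1 is stretched to match the other
            result.append(dim_b)
        else:
            # Sizes differ and neither is 1: the shapes are incompatible
            raise ValueError(f"shapes {shape_a} and {shape_b} cannot be broadcast")
    return tuple(result)

print(broadcast_shape((3, 3), (3,)))   # (3, 3)
print(broadcast_shape((4, 1), (3,)))   # (4, 3)

# Cross-check the second case against NumPy itself
actual = (np.ones((4, 1)) + np.ones(3)).shape
print(actual)                          # (4, 3)
```

Calling `broadcast_shape((2, 3), (4,))` raises a `ValueError`, just as adding arrays of those shapes would.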

In [None]:
import numpy as np

# Let's add a 2D array (3x3) and a 1D array (3,)
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
B = np.array([10, 20, 30])

print(f"Shape of A: {A.shape}")
print(f"Shape of B: {B.shape}\n")

# A has shape (3, 3). B has shape (3,).
# Rule 1: B has fewer dimensions, so its shape is padded on the left: (3,) -> (1, 3).
# Rule 2: NumPy then compares the shapes (3, 3) and (1, 3) from right to left:
# - Last dimension: 3 and 3. They match.
# - First dimension: 3 and 1. B is stretched from 1 to 3.
# In effect, B is broadcast so that it is added to each row of A.

result = A + B
print("Result of A + B:")
print(result)

### Project: Array Statistics Calculator
**Description:**

In this project, you will create a program that allows a user to enter a list of numbers, and then calculates and displays various statistics about those numbers using NumPy.

**Requirements:**

- The program should prompt the user to enter a list of numbers separated by commas.
- The program should use NumPy to convert the input into a 1D NumPy array.
- The program should calculate and display the following statistics:
  - The mean of the numbers
  - The median of the numbers
  - The standard deviation of the numbers
  - The maximum and minimum values of the numbers
- The program should use appropriate NumPy functions to calculate the statistics.
- The program should display the statistics with appropriate labels.
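
One possible way to organize these requirements is a function that takes the input text as an argument, which makes it easy to test without typing input. This is only a sketch: `summarize` and its `ddof` parameter are hypothetical names introduced here. Note that `np.std()` computes the population standard deviation by default (`ddof=0`); passing `ddof=1` gives the sample standard deviation.

```python
import numpy as np

def summarize(text, ddof=0):
    """Hypothetical helper: parse comma-separated numbers and return statistics."""
    # float() ignores surrounding spaces, so "1, 3" parses correctly
    values = np.array([float(part) for part in text.split(",")])
    return {
        "Mean": np.mean(values),
        "Median": np.median(values),
        "Standard Deviation": np.std(values, ddof=ddof),
        "Maximum": np.max(values),
        "Minimum": np.min(values),
    }

stats = summarize("1, 3, 5, 6")
for label, value in stats.items():
    print(f"{label}: {value}")
```

Non-numeric input makes `float()` raise a `ValueError`, which the caller can catch to show an error message.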

### Sample Output
```
Enter a list of numbers separated by commas: 2, 5, 7, 3, 1, 9
Statistics for the input array:
Mean: 4.5
Median: 4.0
Standard Deviation: 2.9154759474226504
Maximum: 9
Minimum: 1
```

In [4]:
import numpy as np

# Prompt the user for input
user_input = input("Enter a list of numbers separated by commas: ")

# Approach:
# 1. Split the input string by the comma to get a list of number strings.
# 2. Convert each number string into a floating-point number.
# 3. Convert the list of numbers into a 1D NumPy array.
# 4. Use NumPy functions to calculate the required statistics.
# 5. Print the results in a formatted way.
# A try-except block is used to handle potential errors if the user enters non-numeric input.

try:
    # Step 1 & 2: Split the string and convert to a list of floats
    str_list = user_input.split(',')
    num_list = [float(num) for num in str_list]
    
    # Step 3: Convert the list into a NumPy array
    data_array = np.array(num_list)
    
    # Step 4: Calculate the statistics using NumPy functions
    mean_val = np.mean(data_array)
    median_val = np.median(data_array)
    std_dev_val = np.std(data_array)
    max_val = np.max(data_array)
    min_val = np.min(data_array)
    
    # Step 5: Display the results with appropriate labels
    print("\nStatistics for the input array: " + user_input)
    print(f"Mean: {mean_val}")
    print(f"Median: {median_val}")
    print(f"Standard Deviation: {std_dev_val}")
    print(f"Maximum: {max_val}")
    print(f"Minimum: {min_val}")

except ValueError:
    # Handle cases where the input is not a valid list of numbers
    print("\nError: Please enter only numbers, separated by commas.")


Statistics for the input array: 1, 3, 5, 6
Mean: 3.75
Median: 4.0
Standard Deviation: 1.920286436967152
Maximum: 6.0
Minimum: 1.0
