1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

Answer:
NumPy (Numerical Python) is a powerful library designed for scientific computing and data analysis. It provides an efficient multi-dimensional array (ndarray) and various mathematical functions optimized for numerical computations.

Advantages of NumPy:

Efficient Memory Usage: NumPy arrays store elements of a single type in one compact buffer, so they typically use far less memory than Python lists holding the same numbers.

Fast Computation: Operations on NumPy arrays are faster due to optimized C and Fortran libraries.

Vectorization: Expresses operations on whole arrays at once, so the element-wise loop runs in compiled code rather than in Python.

Broadcasting: Allows operations on arrays of different shapes without manual reshaping.

Comprehensive Functions: Provides functions for linear algebra, Fourier transforms, and random number generation.

Interoperability: Works seamlessly with other scientific libraries like SciPy and pandas.
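
As a rough illustration of the memory and vectorization points above, the sketch below compares a Python list with an equivalent array. The variable names and the size of 1,000 elements are arbitrary choices for illustration, and the list figure is only approximate because object sizes vary by Python version and platform.

```python
import sys
import numpy as np

n = 1000  # arbitrary illustrative size
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects; count both parts.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
# An array stores the raw values in one buffer: n * 8 bytes for int64.
array_bytes = arr.nbytes

print("List (approx. bytes):", list_bytes)
print("Array data (bytes):", array_bytes)

# Vectorized arithmetic replaces the explicit Python loop.
squares_list = [x * x for x in py_list]
squares_arr = arr * arr
print(np.array_equal(squares_arr, squares_list))
```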

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

Answer:

np.mean(): Computes the arithmetic mean of an array along a specified axis, with every element counted equally.

np.average(): Computes a weighted average when a weights argument is supplied; without weights it returns the same result as np.mean().

When to use which?

Use np.mean() when all elements contribute equally to the average.

Use np.average() when elements have different levels of importance (weights).

Example:

In [1]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(np.mean(arr))  # Output: 3.0 (simple average: 15 / 5)
print(np.average(arr, weights=[1, 2, 3, 4, 5]))  # Output: 3.666... (weighted: 55 / 15)


3.0
3.6666666666666665
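
To make the relationship between the two functions concrete, here is a minimal sketch that checks the unweighted case and reproduces the weighted result by hand. The names `weights`, `weight_sum`, and `manual` are illustrative only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 3, 4, 5])

# Without weights, np.average gives the same result as np.mean.
print(np.average(arr) == np.mean(arr))

# returned=True also returns the sum of the weights used.
avg, weight_sum = np.average(arr, weights=weights, returned=True)
print(avg, weight_sum)

# The weighted average written out by hand: sum(w * x) / sum(w).
manual = (arr * weights).sum() / weights.sum()
print(np.isclose(avg, manual))
```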


3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

Answer: Reversing can be done using np.flip() or slicing [::-1].

Example:

In [2]:
#For 1D Arrays:
arr = np.array([1, 2, 3, 4, 5])
print(arr[::-1])  # Output: [5 4 3 2 1]

#For 2D Arrays:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(np.flip(arr, axis=0))  # axis=0: reverses the order of the rows
print(np.flip(arr, axis=1))  # axis=1: reverses the order of the columns




[5 4 3 2 1]
[[4 5 6]
 [1 2 3]]
[[3 2 1]
 [6 5 4]]
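
The same 2D reversals can also be written with slicing, which the answer mentions only for the 1D case. The sketch below is one way to do it; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

rows_reversed = arr[::-1, :]   # same result as np.flip(arr, axis=0)
cols_reversed = arr[:, ::-1]   # same result as np.flip(arr, axis=1)
both_reversed = np.flip(arr)   # with no axis given, every axis is reversed

print(rows_reversed)
print(cols_reversed)
print(both_reversed)

# Slicing returns a view, so no data is copied.
print(np.shares_memory(arr, rows_reversed))
```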


4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.

Answer: Use the array's dtype attribute to determine its data type, as the cell at the end of this answer shows.

Importance of Data Types:

Memory Efficiency: Choosing appropriate data types saves memory (e.g., int8 uses less memory than int64).

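Performance Optimization: Smaller data types move less data through memory and cache, so operations on large arrays are often faster, although the gain depends on the operation and the hardware.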

Precision Control: Using float32 instead of float64 can save space if high precision isn't needed.

In [3]:
arr = np.array([1, 2, 3])
print(arr.dtype)  # Output: int32 (or int64 depending on system)

int64
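
The memory claims above can be checked directly with the itemsize and nbytes attributes. This is a small sketch with arbitrary sample values; the variable names are illustrative.

```python
import numpy as np

values = [1, 2, 3, 4]
small = np.array(values, dtype=np.int8)
large = np.array(values, dtype=np.int64)

print(small.itemsize, small.nbytes)  # bytes per element, total bytes
print(large.itemsize, large.nbytes)

# astype returns a converted copy; the original array is unchanged.
as_float32 = large.astype(np.float32)
print(as_float32.dtype, as_float32.nbytes)

# Small integer types trade memory for a limited value range.
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
```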


5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

Answer: An ndarray (N-dimensional array) is the core data structure in NumPy, offering high-performance multi-dimensional storage and operations.

Key Features:

Homogeneous: All elements must be of the same data type.

Efficient: Data lives in a single typed memory buffer (contiguous by default), which allows fast access.

Supports Broadcasting: Enables operations on different shaped arrays.

Optimized Computation: Uses vectorization instead of loops.

Differences from Python Lists:

Python lists can store elements of different types, whereas NumPy arrays require a uniform data type.

NumPy arrays occupy less memory and offer faster computations due to efficient memory handling.

Operations on NumPy arrays are vectorized, eliminating the need for loops, whereas Python lists require explicit iteration.

NumPy provides built-in functions for complex mathematical operations, which are not available for Python lists.

In summary, NumPy ndarrays are optimized for performance, making them ideal for large-scale numerical and scientific computations.
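
A short sketch of two of the differences listed above: how the `*` operator behaves on each type, and how NumPy enforces a single data type. The sample values are arbitrary.

```python
import numpy as np

py_list = [1, 2, 3]
arr = np.array(py_list)

# The * operator repeats a list but multiplies an array element-wise.
print(py_list * 2)
print(arr * 2)

# Mixed numeric input is upcast to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)
```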

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

Answer: NumPy arrays outperform Python lists due to:

Memory Efficiency: NumPy arrays use contiguous blocks, reducing overhead.

Vectorized Operations: Eliminates explicit loops, leveraging low-level optimizations.

Lower Overhead: The work is done by optimized compiled routines, avoiding per-element interpreter and type-checking costs.

Example (Time Comparison):

In [4]:
import numpy as np
import time

size = 1000000
list1 = list(range(size))
list2 = list(range(size))

arr1 = np.array(list1)
arr2 = np.array(list2)

# Python list addition (perf_counter is a monotonic, high-resolution timer)
start = time.perf_counter()
result = [x + y for x, y in zip(list1, list2)]
print("Python List Time:", time.perf_counter() - start)

# NumPy array addition
start = time.perf_counter()
result = arr1 + arr2
print("NumPy Array Time:", time.perf_counter() - start)


Python List Time: 0.07027387619018555
NumPy Array Time: 0.013423919677734375
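
A single timed run can be noisy. One possible refinement, sketched here with the standard-library timeit module, repeats each measurement and keeps the best time. The helper functions `add_lists` and `add_arrays` and the smaller array size are assumptions made for this illustration, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

size = 100_000  # smaller than above so repeated runs stay quick
list1 = list(range(size))
arr1 = np.arange(size)

def add_lists():
    # Element-wise addition with an explicit Python loop
    return [x + y for x, y in zip(list1, list1)]

def add_arrays():
    # Element-wise addition as one vectorized operation
    return arr1 + arr1

# Run each function 10 times per trial, 3 trials, and keep the fastest trial.
list_time = min(timeit.repeat(add_lists, number=10, repeat=3))
array_time = min(timeit.repeat(add_arrays, number=10, repeat=3))
print(f"List:  {list_time:.5f} s for 10 runs")
print(f"Array: {array_time:.5f} s for 10 runs")
```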


7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

Answer:

vstack(): Stacks arrays vertically (row-wise).

hstack(): Stacks arrays horizontally (column-wise).

Example:

In [5]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(np.vstack((a, b)))  # Stacks row-wise
print(np.hstack((a, b)))  # Stacks column-wise


[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
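
The example above uses 2D inputs. With 1D inputs the two functions behave differently, as this sketch shows; it also checks that, for 2D arrays, they match np.concatenate along axis 0 and axis 1. The names `x`, `y`, `v`, and `h` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

v = np.vstack((x, y))  # each 1D array becomes a row of a 2D result
h = np.hstack((x, y))  # 1D arrays are joined end to end

print(v.shape)
print(h.shape)

# For 2D inputs these match concatenate along axis 0 and axis 1.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0)))
print(np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1)))
```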


8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.

Answer:

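fliplr(arr): Flips the array left to right by reversing axis 1 (the columns). The input must have at least two dimensions.

flipud(arr): Flips the array upside down by reversing axis 0 (the rows). It also works on 1D arrays, where it simply reverses the elements.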

In [6]:
arr = np.array([[1, 2], [3, 4]])

print(np.fliplr(arr))  # [[2, 1], [4, 3]]
print(np.flipud(arr))  # [[3, 4], [1, 2]]


[[2 1]
 [4 3]]
[[3 4]
 [1 2]]
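
To see how the two functions treat other dimensions, the following sketch tries them on a 1D and a 3D array. The array contents are arbitrary.

```python
import numpy as np

one_d = np.array([1, 2, 3])
print(np.flipud(one_d))  # works: reverses axis 0

try:
    np.fliplr(one_d)     # needs at least two dimensions
except ValueError as exc:
    print("fliplr failed:", exc)

# On higher-dimensional arrays only one axis is reversed.
cube = np.arange(8).reshape(2, 2, 2)
print(np.array_equal(np.fliplr(cube), np.flip(cube, axis=1)))
print(np.array_equal(np.flipud(cube), np.flip(cube, axis=0)))
```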


9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

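Answer: np.array_split(arr, sections) splits an array into the requested number of sub-arrays. Unlike np.split(), it does not raise an error when the array cannot be divided evenly: for an array of length l split into n sections, the first l % n sub-arrays get l // n + 1 elements and the remaining ones get l // n, so the later parts are the shorter ones.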

Example:

In [7]:
arr = np.array([1, 2, 3, 4, 5])
print(np.array_split(arr, 3))  # [array([1, 2]), array([3, 4]), array([5])]


[array([1, 2]), array([3, 4]), array([5])]
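
For contrast, this sketch splits a 7-element array into 3 parts with both np.array_split and np.split. The array length is an arbitrary choice that does not divide evenly.

```python
import numpy as np

arr = np.arange(7)

# 7 elements into 3 parts: one part of 3, then two parts of 2.
parts = np.array_split(arr, 3)
print([p.tolist() for p in parts])

try:
    np.split(arr, 3)  # np.split insists on equal-sized parts
except ValueError as exc:
    print("np.split failed:", exc)
```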


10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

Answer:

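Vectorization: Applies an operation to a whole array at once, so the element-wise loop runs in compiled code instead of an explicit Python loop.

Broadcasting: Enables operations on arrays of different but compatible shapes by virtually stretching dimensions of size 1 (or missing leading dimensions) to match, without copying the data.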

Example:

In [8]:
arr = np.array([1, 2, 3])
print(arr * 2)  # Vectorized multiplication

matrix = np.array([[1, 2], [3, 4]])
vector = np.array([1, 2])
print(matrix + vector)  # Broadcasting


[2 4 6]
[[2 4]
 [4 6]]
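
A slightly larger sketch of the broadcasting rule: a column of shape (3, 1) combined with a row of shape (4,) yields a (3, 4) result, while shapes whose trailing dimensions disagree raise an error. The values are arbitrary.

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3, 4])       # shape (4,)

table = col + row                  # both are stretched to shape (3, 4)
print(table.shape)
print(table)

try:
    np.ones((2, 3)) + np.ones(2)   # trailing sizes 3 and 2 do not match
except ValueError as exc:
    print("Broadcast failed:", exc)
```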


**PRACTICAL QUESTIONS**:

In [9]:
import numpy as np

# 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.
arr = np.random.randint(1, 101, (3, 3))
transposed_arr = arr.T
print("Original Array:\n", arr)
print("Transposed Array:\n", transposed_arr)

# 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.
arr_1d = np.arange(10)
arr_2x5 = arr_1d.reshape(2, 5)
arr_5x2 = arr_1d.reshape(5, 2)
print("2x5 Array:\n", arr_2x5)
print("5x2 Array:\n", arr_5x2)

# 3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.
arr_4x4 = np.random.rand(4, 4)
bordered_arr = np.pad(arr_4x4, pad_width=1, mode='constant', constant_values=0)
print("Original 4x4 Array:\n", arr_4x4)
print("6x6 Array with Zero Border:\n", bordered_arr)

# 4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.
arr_step = np.arange(10, 61, 5)
print("Array with step 5:\n", arr_step)

# 5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations.
arr_strings = np.array(['python', 'numpy', 'pandas'])
upper_case = np.char.upper(arr_strings)
lower_case = np.char.lower(arr_strings)
title_case = np.char.title(arr_strings)
print("Uppercase:\n", upper_case)
print("Lowercase:\n", lower_case)
print("Title Case:\n", title_case)

# 6. Generate a NumPy array of words. Insert a space between each character of every word in the array.
arr_words = np.array(['hello', 'world', 'numpy'])
spaced_words = np.char.join(" ", arr_words)
print("Words with spaces:\n", spaced_words)

# 7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
sum_arr = arr1 + arr2
diff_arr = arr1 - arr2
prod_arr = arr1 * arr2
div_arr = arr1 / arr2
print("Addition:\n", sum_arr)
print("Subtraction:\n", diff_arr)
print("Multiplication:\n", prod_arr)
print("Division:\n", div_arr)

# 8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.
identity_matrix = np.eye(5)
diagonal_elements = np.diag(identity_matrix)
print("Identity Matrix:\n", identity_matrix)
print("Diagonal Elements:\n", diagonal_elements)

# 9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(np.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

rand_arr = np.random.randint(0, 1001, 100)
prime_numbers = np.array([num for num in rand_arr if is_prime(num)])
print("Random Array:\n", rand_arr)
print("Prime Numbers:\n", prime_numbers)

# 10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.
daily_temperatures = np.random.randint(15, 35, 30)  # 30 random daily temperatures from 15 to 34
# 30 days do not divide into 7-day weeks, so each "week" here is a 5-day block (6 blocks).
weekly_avg = np.mean(daily_temperatures.reshape(6, 5), axis=1)
print("Daily Temperatures:\n", daily_temperatures)
print("Weekly Averages:\n", weekly_avg)


Original Array:
 [[61 16 89]
 [32 10 28]
 [21 51 73]]
Transposed Array:
 [[61 32 21]
 [16 10 51]
 [89 28 73]]
2x5 Array:
 [[0 1 2 3 4]
 [5 6 7 8 9]]
5x2 Array:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
Original 4x4 Array:
 [[0.12814863 0.92058149 0.05731921 0.18594069]
 [0.41986571 0.8195217  0.36885691 0.28702492]
 [0.45387892 0.6908331  0.86722528 0.79124484]
 [0.95048477 0.51642057 0.08991721 0.5441124 ]]
6x6 Array with Zero Border:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.12814863 0.92058149 0.05731921 0.18594069 0.        ]
 [0.         0.41986571 0.8195217  0.36885691 0.28702492 0.        ]
 [0.         0.45387892 0.6908331  0.86722528 0.79124484 0.        ]
 [0.         0.95048477 0.51642057 0.08991721 0.5441124  0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
Array with step 5:
 [10 15 20 25 30 35 40 45 50 55 60]
Uppercase:
 ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase:
 ['python' 'numpy' 'pandas']
Title Case:
 [