In [5]:
import numpy as np

THEORY QUESTIONS

1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. 
How does it enhance Python's capabilities for numerical operations?

In [2]:
#NumPy is a fundamental package for scientific computing in Python.
#It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently.
#NumPy enhances Python by storing numerical data in compact, typed arrays and by running operations on them in compiled code, both of which are faster than working with Python’s built-in lists.
#This makes it well suited to data analysis on large datasets, where performance matters.
#The core advantage lies in array-based operations and broadcasting, which allow vectorized computations that need fewer explicit loops and give more readable, faster code.

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

In [3]:
#np.mean() calculates the arithmetic mean along a specified axis or the entire array.
#It does not consider any weights. 
#On the other hand, np.average() allows for weighted averages, meaning you can specify the weight of each element, which is particularly useful when different data points have varying significance.
#You would use np.mean() for a simple mean calculation where all values are considered equally, while np.average() is preferable when working with datasets where different values contribute differently to the overall mean.
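
The difference is easiest to see on a small example. The following is a minimal sketch; the `scores` and `weights` values are made up for illustration and are not part of the assignment.

```python
import numpy as np

# Hypothetical data: three scores, the last one counting twice as much.
scores = np.array([80.0, 90.0, 100.0])
weights = np.array([1.0, 1.0, 2.0])

plain_mean = np.mean(scores)                        # (80 + 90 + 100) / 3
unweighted_avg = np.average(scores)                 # no weights: same result as np.mean
weighted_avg = np.average(scores, weights=weights)  # (80*1 + 90*1 + 100*2) / 4

print(plain_mean, unweighted_avg, weighted_avg)     # 90.0 90.0 92.5
```

Without the `weights` argument the two functions agree, so `np.average()` only becomes necessary once the weights differ.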

3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

In [8]:
#To reverse a 1D array, you can use slicing: arr[::-1].
#For 2D arrays, reversing along the rows (axis 0) is done using arr[::-1, :], and along the columns (axis 1) with arr[:, ::-1].
#Example for 1D:
arr = np.array([1, 2, 3, 4])
reversed_arr = arr[::-1]
print(reversed_arr)

[4 3 2 1]


In [9]:
arr = np.array([[1, 2], [3, 4]])
reversed_rows = arr[::-1, :]
reversed_cols = arr[:, ::-1]
print(reversed_rows)
print(reversed_cols)

[[3 4]
 [1 2]]
[[2 1]
 [4 3]]
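
As an alternative to slicing, `np.flip()` takes the axis as an argument. The sketch below checks that it agrees with the slicing expressions used above for the same 2x2 array.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

# np.flip reverses along the given axis and matches the slicing forms.
assert np.array_equal(np.flip(arr, axis=0), arr[::-1, :])   # rows reversed
assert np.array_equal(np.flip(arr, axis=1), arr[:, ::-1])   # columns reversed

# With no axis argument, np.flip reverses every axis at once.
print(np.flip(arr))
# [[4 3]
#  [2 1]]
```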


4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.

In [10]:
#You can determine the data type of a NumPy array’s elements using the .dtype attribute. 
#Data types matter for memory management because every element of an array has the same fixed-size type and the elements are stored in one contiguous block, so the dtype directly determines the array's memory footprint.
#For example, using int32 instead of int64 halves memory consumption for large datasets and can speed up operations through better cache use, provided the values fit in the smaller type.
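
A short sketch of how to inspect the dtype and its memory cost. The array size of one million elements is an arbitrary choice for illustration.

```python
import numpy as np

arr64 = np.arange(1_000_000, dtype=np.int64)
arr32 = arr64.astype(np.int32)   # safe here: every value fits in 32 bits

# dtype, bytes per element, total bytes of the data buffer
print(arr64.dtype, arr64.itemsize, arr64.nbytes)  # int64 8 8000000
print(arr32.dtype, arr32.itemsize, arr32.nbytes)  # int32 4 4000000
```

Converting to a smaller type is only appropriate when the values are known to fit; integers that exceed the range of the target type do not convert correctly.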

5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

In [11]:
#NumPy's ndarray is an n-dimensional array that supports advanced mathematical operations. 
#Unlike Python lists, which can hold elements of mixed types and require looping for element-wise operations, ndarrays are homogeneous, allowing for efficient, element-wise operations without the need for explicit loops. 
#They support vectorized operations and broadcasting, and they are more memory-efficient due to their fixed data types and contiguous storage. 
#ndarrays also provide numerous methods for statistical and mathematical calculations directly on the data.
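
The contrast with lists shows up even for a simple multiplication. The snippet below is an illustrative sketch using made-up values.

```python
import numpy as np

values = [1, 2, 3]
arr = np.array(values)

print(values * 2)   # [1, 2, 3, 1, 2, 3]  (list repetition)
print(arr * 2)      # [2 4 6]             (element-wise multiplication)

# Mixed numeric inputs are coerced to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# Dimensionality, shape, and statistics are available directly on the array.
print(arr.ndim, arr.shape, arr.mean())  # 1 (3,) 2.0
```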

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

In [12]:
#NumPy arrays provide substantial performance benefits over Python lists because they are optimized for numerical operations. 
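#Python lists are dynamically typed: a list holds references to separately allocated objects scattered across memory, so each element access involves following a pointer and checking the object's type, which slows large-scale operations. 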
#NumPy arrays, being homogeneously typed and stored in contiguous memory blocks, allow faster computations due to vectorization, efficient memory usage, and reduced overhead of looping.
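
One way to check this claim on a given machine is a small timing comparison. This is a rough sketch, not a rigorous benchmark: the array size, the repeat count, and the helper names `square_list` and `square_array` are all arbitrary choices, and the measured times vary by hardware.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)   # int64 so the squares cannot overflow

def square_list():
    # One interpreted multiplication per element.
    return [x * x for x in py_list]

def square_array():
    # A single vectorized multiplication over the whole array.
    return np_arr * np_arr

# Both approaches must produce the same values.
assert square_list() == square_array().tolist()

list_time = timeit.timeit(square_list, number=20)
array_time = timeit.timeit(square_array, number=20)
print(f"list comprehension: {list_time:.4f} s, NumPy: {array_time:.4f} s")
```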

7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

In [13]:
#vstack() stacks arrays vertically (row-wise), while hstack() stacks them horizontally (column-wise).
#Example:
a = np.array([1, 2])
b = np.array([3, 4])
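print(np.vstack((a, b)))  # shape (2, 2): each input becomes a row
print(np.hstack((a, b)))  # shape (4,): inputs are joined end to end

[[1 2]
 [3 4]]
[1 2 3 4]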

8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.

In [14]:
#fliplr() flips an array horizontally, reversing the order of columns, while flipud() flips it vertically, reversing the order of rows.
arr = np.array([[1, 2], [3, 4]])
print(np.fliplr(arr))  # columns reversed
print(np.flipud(arr))  # rows reversed

[[2 1]
 [4 3]]
[[3 4]
 [1 2]]
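
The two functions also differ in which dimensions they accept: `flipud()` works on 1D input, while `fliplr()` needs at least two dimensions. The sketch below illustrates this with made-up arrays and shows the equivalent slicing for a 3D array.

```python
import numpy as np

vec = np.array([1, 2, 3])
print(np.flipud(vec))   # [3 2 1]

# fliplr needs a second axis to reverse, so 1D input is rejected.
try:
    np.fliplr(vec)
except ValueError as exc:
    print("fliplr on 1D input:", exc)

# On higher-dimensional arrays, flipud reverses axis 0 and fliplr reverses axis 1.
cube = np.arange(8).reshape(2, 2, 2)
assert np.array_equal(np.flipud(cube), cube[::-1, ...])
assert np.array_equal(np.fliplr(cube), cube[:, ::-1, ...])
```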

9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

In [15]:
#array_split() splits an array into multiple sub-arrays.
#If the array cannot be split evenly, the method distributes the extra elements among the first few sub-arrays.
arr = np.array([1, 2, 3, 4, 5])
np.array_split(arr, 3)  

[array([1, 2]), array([3, 4]), array([5])]
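
For comparison, `np.split()` refuses an uneven split instead of distributing the remainder. A minimal sketch with an arbitrary 10-element array:

```python
import numpy as np

arr = np.arange(10)

# 10 elements into 4 parts: the first 10 % 4 = 2 parts get one extra element.
parts = np.array_split(arr, 4)
print([len(p) for p in parts])   # [3, 3, 2, 2]

# np.split requires an exact division and raises otherwise.
try:
    np.split(arr, 4)
except ValueError as exc:
    print("np.split:", exc)
```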

10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

In [16]:
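#Vectorization refers to applying an operation to an entire array without an explicit Python loop; the iteration runs inside NumPy's compiled code rather than in the interpreter, which is where the speed comes from. 
#Broadcasting allows operations on arrays of different but compatible shapes by treating the smaller array as if it were stretched to match the larger array’s dimensions, without actually copying its data.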
#Both techniques lead to more concise, faster, and memory-efficient code as they minimize explicit iteration and take advantage of low-level optimizations.
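
A small sketch of both ideas together. The arrays are made up for illustration; the nested loop at the end is only there to show what the single broadcast expression replaces.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,): broadcast across both rows
col = np.array([[100], [200]])     # shape (2, 1): broadcast across all columns

print(matrix + row)
# [[11 22 33]
#  [14 25 36]]
print(matrix + col)
# [[101 102 103]
#  [204 205 206]]

# Explicit-loop equivalent of matrix + row.
looped = np.empty_like(matrix)
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[1]):
        looped[i, j] = matrix[i, j] + row[j]
assert np.array_equal(looped, matrix + row)
```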

PRACTICAL QUESTIONS

In [19]:
#1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.
arr = np.random.randint(1, 101, size=(3, 3))
transposed_arr = arr.T
print(transposed_arr)

[[20 54 18]
 [90 70 80]
 [64  2 85]]


In [21]:
#2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.
arr = np.arange(10)
arr_2x5 = arr.reshape(2, 5)
arr_5x2 = arr_2x5.reshape(5, 2)
print(arr_2x5)
print(arr_5x2)

[[0 1 2 3 4]
 [5 6 7 8 9]]
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


In [22]:
#3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.
arr = np.random.rand(4, 4)
arr_with_border = np.pad(arr, pad_width=1, mode='constant', constant_values=0)
print(arr_with_border)

[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.38136427 0.49242378 0.27401079 0.41329759 0.        ]
 [0.         0.00977041 0.94821609 0.4263655  0.86050246 0.        ]
 [0.         0.0484136  0.05841703 0.69417661 0.81250167 0.        ]
 [0.         0.92736595 0.65878714 0.75854463 0.59041272 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


In [24]:
#4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.
arr = np.arange(10, 61, 5)
arr

array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60])

In [25]:
#5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.
arr = np.array(['python', 'numpy', 'pandas'])
upper = np.char.upper(arr)
lower = np.char.lower(arr)
title = np.char.title(arr)
print(upper, lower, title)

['PYTHON' 'NUMPY' 'PANDAS'] ['python' 'numpy' 'pandas'] ['Python' 'Numpy' 'Pandas']


In [26]:
#6. Generate a NumPy array of words. Insert a space between each character of every word in the array.
arr = np.array(['hello', 'world'])
spaced = np.char.join(' ', arr)
print(spaced)

['h e l l o' 'w o r l d']


In [28]:
#7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.
arr1 = np.random.randint(1, 10, size=(2, 2))
arr2 = np.random.randint(1, 10, size=(2, 2))
add = arr1 + arr2
subtract = arr1 - arr2
multiply = arr1 * arr2
divide = arr1 / arr2
print(add)
print(subtract)
print(multiply)
print(divide)

[[16  7]
 [10 17]]
[[ 0 -3]
 [-8 -1]]
[[64 10]
 [ 9 72]]
[[1.         0.4       ]
 [0.11111111 0.88888889]]


In [29]:
#8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.
identity_matrix = np.eye(5)
diagonal = np.diag(identity_matrix)
print(diagonal)

[1. 1. 1. 1. 1.]


In [30]:
#9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.
arr = np.random.randint(0, 1000, 100)
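# Guard with x > 1: without it, 0 and 1 pass the all() test vacuously and are reported as primes.
primes = arr[np.vectorize(lambda x: x > 1 and all(x % i != 0 for i in range(2, int(np.sqrt(x)) + 1)))(arr)]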
print(primes)

[383 631 827 643 991  71 379 409 349 557 311 673 199 137 449 337 137 691
 587]
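
Another option is to precompute a prime lookup table with a sieve of Eratosthenes and index it with the array. This is a sketch of a different technique from the cell above; the helper name `primes_mask` and the fixed seed are assumptions made so the example is reproducible.

```python
import numpy as np

def primes_mask(limit):
    """Boolean array where mask[k] is True when k is prime, for 0 <= k < limit."""
    mask = np.ones(limit, dtype=bool)
    mask[:2] = False                      # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            mask[i * i::i] = False        # cross out multiples of i
    return mask

rng = np.random.default_rng(0)            # seed chosen arbitrarily
arr = rng.integers(0, 1000, size=100)     # values in [0, 1000)

# Look up each value in the table and keep the ones marked prime.
primes = arr[primes_mask(1000)[arr]]
print(primes)
```

The sieve is built once for the whole value range, so the per-element work reduces to an array lookup.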


In [33]:
#10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.
temps = np.random.randint(20, 40, size=30)
weekly_temps = np.array_split(temps, 5)  # five equal 6-day blocks, not 7-day calendar weeks
weekly_avg = [np.mean(week) for week in weekly_temps]

for i, avg in enumerate(weekly_avg, 1):
    print(f"Week {i} average temperature: {avg:.2f}°C")


Week 1 average temperature: 30.17°C
Week 2 average temperature: 30.17°C
Week 3 average temperature: 30.83°C
Week 4 average temperature: 25.83°C
Week 5 average temperature: 29.50°C
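
The cell above divides the 30 days into five blocks of six days each. If 7-day weeks are wanted instead, one possible sketch is to split at explicit day indices, which leaves a final partial week of two days. The seed is an assumption made for reproducibility.

```python
import numpy as np

rng = np.random.default_rng(0)             # seed chosen arbitrarily
temps = rng.integers(20, 40, size=30)      # 30 daily temperatures in [20, 40)

# Split points at days 7, 14, 21, 28 give chunks of 7, 7, 7, 7, and 2 days.
weeks = np.split(temps, np.arange(7, temps.size, 7))
weekly_avg = [week.mean() for week in weeks]

for i, (week, avg) in enumerate(zip(weeks, weekly_avg), 1):
    print(f"Week {i} ({week.size} days) average temperature: {avg:.2f}°C")
```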
