THEORY QUESTIONS

1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
enhance Python's capabilities for numerical operations?

=>NumPy is a powerful library for numerical computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a variety of mathematical functions to operate on these arrays. The advantages of NumPy in scientific computing and data analysis include:

Efficient Storage and Performance: NumPy arrays store their elements in contiguous blocks of memory with a single fixed data type, which allows efficient access and manipulation of large datasets. A Python list, by contrast, stores references to separately allocated objects, which costs more memory per element.

Vectorized Operations: NumPy supports element-wise operations on arrays without needing explicit loops, enabling concise and faster computations compared to standard Python loops. This feature is known as vectorization and greatly enhances performance for numerical tasks.

Broad Functionality: NumPy provides a wide array of mathematical functions for operations like linear algebra, statistics, Fourier analysis, and random sampling.

Interoperability: It integrates well with other libraries in the Python ecosystem, such as SciPy, Pandas, and Matplotlib, making it essential for data analysis, machine learning, and scientific research.
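As a rough illustration of the vectorization point above, the sketch below computes the same sum of squares with an explicit Python loop and with a single NumPy expression. The variable names are illustrative and not part of the original answer.

```python
import numpy as np

values = list(range(1, 6))          # plain Python list: [1, 2, 3, 4, 5]
arr = np.array(values)              # same data as a NumPy array

# Pure-Python approach: explicit loop over every element
loop_result = 0
for v in values:
    loop_result += v * v

# NumPy approach: square all elements at once, then sum
vectorized_result = int(np.sum(arr ** 2))

print(loop_result, vectorized_result)  # both are 55
```

The two results are identical; the difference is that the NumPy version pushes the per-element work into compiled code.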

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the
other?

=>Both np.mean() and np.average() are used to compute the average of an array, but there are key differences:

np.mean(): It simply calculates the arithmetic mean (sum of elements divided by the number of elements) along the specified axis.


np.average(): It is more flexible and accepts an optional weights array, which gives different weights to elements when calculating the mean. Without weights, it returns the same result as np.mean().


When to Use:
Use np.mean() when you just need the simple arithmetic mean.
Use np.average() when you need to calculate a weighted average or specify which elements should have more influence on the result.
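A small sketch of the difference, assuming three made-up scores where the last one should count double. The `scores` and `weights` values are hypothetical and chosen only to keep the arithmetic easy to check.

```python
import numpy as np

scores = np.array([80.0, 90.0, 100.0])
weights = np.array([1, 1, 2])   # hypothetical weights: the last score counts double

simple_mean = np.mean(scores)                        # (80 + 90 + 100) / 3 = 90.0
unweighted_avg = np.average(scores)                  # no weights: same as np.mean
weighted_avg = np.average(scores, weights=weights)   # (80 + 90 + 200) / 4 = 92.5

print(simple_mean, unweighted_avg, weighted_avg)
```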

3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D
arrays.

=>To reverse a NumPy array along a specific axis, use a slice with a step of -1 (::-1) on that axis. np.flip(arr, axis=...) gives the same result.

In [2]:
#1D array
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
reversed_arr = arr[::-1]
print(reversed_arr)


[5 4 3 2 1]


In [3]:
#2D array
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
reversed_rows = arr2d[::-1, :]
print(reversed_rows)


[[7 8 9]
 [4 5 6]
 [1 2 3]]
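The 2D example above reverses only the row order. The following sketch extends it to the column axis and to both axes at once, and checks that np.flip produces the same arrays as the slicing form.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

reversed_cols = arr2d[:, ::-1]        # reverse column order within each row
reversed_both = arr2d[::-1, ::-1]     # reverse rows and columns

# np.flip gives the same results by axis number
assert np.array_equal(np.flip(arr2d, axis=1), reversed_cols)
assert np.array_equal(np.flip(arr2d), reversed_both)   # axis=None flips every axis

print(reversed_cols)
print(reversed_both)
```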


4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
in memory management and performance.

=>To determine the data type of a NumPy array, use the dtype attribute:

arr = np.array([1, 2, 3, 4])

print(arr.dtype)

Importance of Data Types:
Memory Management: Different data types consume different amounts of memory. For example, an int32 array uses 4 bytes per element, half of what an int64 array uses. Choosing the smallest data type that can still hold the data's range can significantly reduce memory usage.
Performance: Operations on smaller data types can be faster because less data has to move through memory and cache.
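To make the memory point concrete, the sketch below builds the same 1000 values as int64 and as int32 and inspects the itemsize and nbytes attributes. The dtype is set explicitly here because the default integer type can differ between platforms and NumPy versions.

```python
import numpy as np

arr64 = np.arange(1000, dtype=np.int64)
arr32 = arr64.astype(np.int32)    # safe here: all values fit in 32 bits

print(arr64.dtype, arr64.itemsize, arr64.nbytes)   # int64 8 8000
print(arr32.dtype, arr32.itemsize, arr32.nbytes)   # int32 4 4000
```

Downcasting is only safe when every value fits in the smaller type; values outside its range would not be preserved.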


5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

=>An ndarray (N-dimensional array) is the central data structure in NumPy. Key features include:

Multidimensional: Can represent arrays with more than one dimension.
Homogeneous: All elements of an ndarray have the same data type.
Efficient Memory Usage: Memory is allocated in a contiguous block, reducing overhead and enabling fast access.
Vectorized Operations: Supports element-wise operations and broadcasting, which are much faster than standard Python loops.

Differences from Python Lists:
Homogeneity: NumPy arrays require all elements to have the same type, while Python lists can store different data types.
Performance: NumPy arrays are optimized for performance with large datasets, whereas Python lists are slower and less memory efficient for numerical tasks.
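The differences listed above can be seen in a few lines. This is a minimal sketch with made-up values: it shows a list keeping mixed types while the array upcasts to one dtype, and the `*` operator meaning element-wise multiplication for arrays but repetition for lists.

```python
import numpy as np

mixed_list = [1, 2.5, 3]            # a list keeps each element's own type
arr = np.array(mixed_list)          # NumPy upcasts everything to one dtype

print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'int']
print(arr.dtype)                                # float64

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim, matrix.shape)                # 2 (2, 3)

# Arithmetic is element-wise on arrays, but repetition on lists
doubled_row = matrix[0] * 2
repeated_list = [1, 2, 3] * 2
print(doubled_row)       # [2 4 6]
print(repeated_list)     # [1, 2, 3, 1, 2, 3]
```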

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

=>Memory Efficiency: NumPy arrays are stored in contiguous memory blocks, which makes them more memory efficient and allows for faster element access compared to Python lists, which store references to objects.

Speed: NumPy uses optimized C and Fortran libraries under the hood, which means it can perform complex operations much faster than Python lists, especially for large datasets.

Vectorization: NumPy eliminates the need for explicit Python loops over elements. The loop runs inside compiled code instead, where it avoids per-element interpreter overhead and can take advantage of SIMD instructions on the CPU, which speeds up computation.
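One way to check the speed claim on a given machine is to time both approaches with the standard timeit module. The sketch below is only a measurement harness; the array size and repeat count are arbitrary choices, and the actual timings depend on hardware and NumPy version, so no numbers are quoted here.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n)

def list_double():
    # Pure-Python version: one interpreter step per element
    return [x * 2 for x in py_list]

def numpy_double():
    # NumPy version: a single vectorized multiplication
    return np_arr * 2

# Both approaches produce the same values
assert list_double() == numpy_double().tolist()

list_time = timeit.timeit(list_double, number=20)
numpy_time = timeit.timeit(numpy_double, number=20)
print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy expression:   {numpy_time:.4f} s")
```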

7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and
output.

=>np.vstack(): Stacks arrays vertically. It takes a sequence of arrays and stacks them row-wise.

In [4]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = np.vstack((arr1, arr2))
print(result)



[[1 2 3]
 [4 5 6]]


In [5]:
#np.hstack(): Stacks arrays horizontally, placing them side by side; for 1D inputs this concatenates them into one longer 1D array
result = np.hstack((arr1, arr2))
print(result)


[1 2 3 4 5 6]
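The contrast between the two functions is easier to see with 2D inputs, where vstack adds rows and hstack adds columns. A short sketch with two illustrative 2x2 arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

v = np.vstack((a, b))   # rows of b appended below a
h = np.hstack((a, b))   # columns of b appended to the right of a

print(v.shape)  # (4, 2)
print(h.shape)  # (2, 4)
print(h)
```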


8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various
array dimensions.

In [6]:
#np.fliplr(): Flips an array left to right by reversing the column order (axis 1); the input must have at least 2 dimensions
arr = np.array([[1, 2, 3], [4, 5, 6]])
flipped_lr = np.fliplr(arr)
print(flipped_lr)



[[3 2 1]
 [6 5 4]]


In [7]:
#np.flipud(): Flips an array upside down by reversing the row order (axis 0); it also works on 1D arrays
flipped_ud = np.flipud(arr)
print(flipped_ud)



[[4 5 6]
 [1 2 3]]
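The question also asks about other array dimensions. The sketch below tries both functions on a 1D array and on a 3D array; the `vec` and `cube` arrays are illustrative. On the 1D input, flipud works while fliplr raises a ValueError because there is no axis 1 to reverse.

```python
import numpy as np

vec = np.array([1, 2, 3])

print(np.flipud(vec))        # [3 2 1]; axis 0 is the only axis

try:
    np.fliplr(vec)           # needs an axis 1, which a 1D array lacks
except ValueError as exc:
    fliplr_error = str(exc)
    print("fliplr failed:", fliplr_error)

cube = np.arange(8).reshape(2, 2, 2)
# On higher-dimensional arrays only one axis is reversed
assert np.array_equal(np.fliplr(cube), cube[:, ::-1, :])
assert np.array_equal(np.flipud(cube), cube[::-1, :, :])
```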


9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

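=>The array_split() method splits an array into multiple sub-arrays. Unlike np.split(), it does not require the array to divide evenly: for an array of length l split into n sections, the first l % n sub-arrays get l // n + 1 elements and the remaining ones get l // n elements.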

In [8]:
arr = np.array([1, 2, 3, 4, 5])
result = np.array_split(arr, 3)
print(result)



[array([1, 2]), array([3, 4]), array([5])]
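For comparison, the sketch below splits a 10-element array into 3 parts with array_split and then attempts the same with np.split, which raises a ValueError when the division is not exact.

```python
import numpy as np

arr = np.arange(10)

parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]
print(sizes)                 # [4, 3, 3]: 10 % 3 = 1 part gets the extra element

try:
    np.split(arr, 3)         # np.split insists on equal-sized parts
    split_failed = False
except ValueError as exc:
    split_failed = True
    print("np.split failed:", exc)
```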


10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array
operations?


In [9]:
#Vectorization: Refers to the ability to apply operations on entire arrays without explicit loops. This is possible because NumPy internally loops over elements in a highly optimized way, offering substantial performance gains.
arr = np.array([1, 2, 3])
result = arr * 2
print(result)


[2 4 6]


In [10]:
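#Broadcasting: Allows NumPy to perform operations on arrays of different but compatible shapes. Shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1; the size-1 dimension is then stretched to match the other array without copying data.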
arr1 = np.array([1, 2, 3])
arr2 = np.array([10])
result = arr1 + arr2
print(result)


[11 12 13]
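Broadcasting is not limited to a one-element array. As a further sketch, a column of shape (3, 1) and a row of shape (3,) combine into a 3x3 table, while shapes that do not line up raise an error. The arrays here are illustrative.

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,), treated as (1, 3)

table = column + row                   # both are stretched to shape (3, 3)
print(table)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]

# Incompatible shapes raise an error instead of being stretched
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    broadcast_failed = False
except ValueError as exc:
    broadcast_failed = True
    print("broadcast failed:", exc)
```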


PRACTICAL QUESTIONS

1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.



In [11]:
import numpy as np


array_3x3 = np.random.randint(1, 101, (3, 3))


array_transposed = array_3x3.T

print("Original Array:\n", array_3x3)
print("Transposed Array:\n", array_transposed)


Original Array:
 [[38 18 18]
 [87 48  7]
 [85 45 67]]
Transposed Array:
 [[38 87 85]
 [18 48 45]
 [18  7 67]]


2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.



In [12]:
array_1d = np.arange(10)
array_2x5 = array_1d.reshape(2, 5)
array_5x2 = array_1d.reshape(5, 2)

print("1D Array:\n", array_1d)
print("Reshaped to 2x5:\n", array_2x5)
print("Reshaped to 5x2:\n", array_5x2)


1D Array:
 [0 1 2 3 4 5 6 7 8 9]
Reshaped to 2x5:
 [[0 1 2 3 4]
 [5 6 7 8 9]]
Reshaped to 5x2:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.



In [13]:
array_4x4 = np.random.rand(4, 4)


array_with_border = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)

print("Original 4x4 Array:\n", array_4x4)
print("Array with Border:\n", array_with_border)


Original 4x4 Array:
 [[0.28835409 0.19829801 0.70112146 0.71371041]
 [0.17288744 0.21170141 0.0500673  0.08040602]
 [0.54185452 0.17193496 0.06681339 0.1043398 ]
 [0.34430556 0.72909152 0.39333529 0.94481419]]
Array with Border:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.28835409 0.19829801 0.70112146 0.71371041 0.        ]
 [0.         0.17288744 0.21170141 0.0500673  0.08040602 0.        ]
 [0.         0.54185452 0.17193496 0.06681339 0.1043398  0.        ]
 [0.         0.34430556 0.72909152 0.39333529 0.94481419 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [14]:

array_step_5 = np.arange(10, 61, 5)

print("Array from 10 to 60 with step 5:\n", array_step_5)


Array from 10 to 60 with step 5:
 [10 15 20 25 30 35 40 45 50 55 60]


5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations
(uppercase, lowercase, title case, etc.) to each element.

In [15]:

string_array = np.array(['python', 'numpy', 'pandas'])


uppercase = np.char.upper(string_array)
lowercase = np.char.lower(string_array)
titlecase = np.char.title(string_array)
capitalize = np.char.capitalize(string_array)

print("Uppercase:\n", uppercase)
print("Lowercase:\n", lowercase)
print("Titlecase:\n", titlecase)
print("Capitalize:\n", capitalize)


Uppercase:
 ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase:
 ['python' 'numpy' 'pandas']
Titlecase:
 ['Python' 'Numpy' 'Pandas']
Capitalize:
 ['Python' 'Numpy' 'Pandas']


6. Generate a NumPy array of words. Insert a space between each character of every word in the array.



In [16]:
words_array = np.array(['hello', 'world', 'numpy'])
words_with_spaces = np.char.join(' ', words_array)

print("Words with spaces between characters:\n", words_with_spaces)


Words with spaces between characters:
 ['h e l l o' 'w o r l d' 'n u m p y']


7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.



In [18]:
array_a = np.array([[1, 2], [3, 4]])
array_b = np.array([[5, 6], [7, 8]])

addition = np.add(array_a, array_b)
subtraction = np.subtract(array_a, array_b)
multiplication = np.multiply(array_a, array_b)
division = np.divide(array_a, array_b)

print("Addition:\n", addition)
print("Subtraction:\n", subtraction)
print("Multiplication:\n", multiplication)
print("Division:\n", division)


Addition:
 [[ 6  8]
 [10 12]]
Subtraction:
 [[-4 -4]
 [-4 -4]]
Multiplication:
 [[ 5 12]
 [21 32]]
Division:
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]


8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.



In [19]:
identity_matrix = np.eye(5)
diagonal_elements = np.diag(identity_matrix)

print("Identity Matrix:\n", identity_matrix)
print("Diagonal Elements:\n", diagonal_elements)


Identity Matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Diagonal Elements:
 [1. 1. 1. 1. 1.]


9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in
this array.

In [21]:
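from sympy import primerange

random_integers = np.random.randint(0, 1001, 100)  # upper bound is exclusive, so 1000 can occur
primes_up_to_1000 = list(primerange(0, 1001))

# np.intersect1d returns the sorted, unique values common to both inputs,
# so a prime that occurs more than once in the array is listed only once.
prime_numbers_in_array = np.intersect1d(random_integers, primes_up_to_1000)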

print("Random Integers:\n", random_integers)
print("Prime Numbers in the Array:\n", prime_numbers_in_array)


Random Integers:
 [480 589 384 189 715 381 362 897 330 249 915 215 847 133 973 998 900 165
 718 763 904 558 745 519 706 943 188  21 709 712 356 392 651 988 331 727
 687 319 821  58 165  91 454 854 817 612 414 511 744 956 497  36  85 719
 367 372 691  42 171 463 498 289  54  56 996 812 902 380 645  41 732 462
 474 213 671  97 101 309 617 863 946 727 632 157 973 462 349 567 402 480
 732  37 879 555 223 385 180 905 145 108]
Prime Numbers in the Array:
 [ 37  41  97 101 157 223 331 349 367 463 617 691 709 719 727 821 863]
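If repeated primes should be kept, in their original order, a boolean mask built with np.isin is one alternative to np.intersect1d. The sketch below also replaces the sympy dependency with a small Sieve of Eratosthenes; `primes_up_to` is a hypothetical helper written for this example, and `sample` is a fixed array rather than random data so the result can be checked by hand.

```python
import numpy as np

def primes_up_to(limit):
    """Return all primes <= limit using a Sieve of Eratosthenes (illustrative helper)."""
    is_prime = np.ones(limit + 1, dtype=bool)
    is_prime[:2] = False                       # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p::p] = False         # cross out multiples of p
    return np.flatnonzero(is_prime)

sample = np.array([727, 480, 37, 727, 91, 2, 1000])   # fixed sample, not random
mask = np.isin(sample, primes_up_to(1000))
primes_kept = sample[mask]
print(primes_kept)   # [727  37 727   2]
```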


10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly
averages.

In [22]:
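temperatures = np.random.uniform(10, 35, 30)
# reshape(5, 6) groups the 30 days into five 6-day blocks, so each "weekly"
# average below covers 6 days rather than a 7-day calendar week.
weekly_averages = temperatures.reshape(5, 6).mean(axis=1)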

print("Daily Temperatures:\n", temperatures)
print("Weekly Averages:\n", weekly_averages)


Daily Temperatures:
 [10.63940827 29.58983244 19.87820149 30.87471483 19.88331018 18.03605471
 25.55394401 13.0701899  14.5233466  13.79483356 10.13158238 16.62297959
 30.51633862 22.51286396 12.43362882 10.21313889 12.09277248 25.04916136
 32.64740296 23.22991687 20.41240992 30.47603435 14.27506387 19.29522459
 18.24595033 16.52875086 14.96710649 18.42518898 33.63103557 31.89326146]
Weekly Averages:
 [21.48358699 15.61614601 18.80298402 23.38934209 22.28188228]
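If the averages should follow 7-day weeks instead, one possible approach is to split the array at every seventh index, which leaves a shorter final group for days 29 and 30. This is a sketch under that assumption; it uses placeholder temperatures (1.0 to 30.0) instead of random values so the means are easy to verify.

```python
import numpy as np

temperatures = np.arange(1.0, 31.0)     # placeholder data: day 1 -> 1.0, ..., day 30 -> 30.0

# Split before days 8, 15, 22 and 29 (indices 7, 14, 21, 28)
weeks = np.split(temperatures, [7, 14, 21, 28])
week_lengths = [len(week) for week in weeks]
weekly_means = np.array([week.mean() for week in weeks])

print(week_lengths)    # [7, 7, 7, 7, 2]
print(weekly_means)    # means are 4.0, 11.0, 18.0, 25.0 and 29.5
```

Whether the two leftover days count as a fifth "week" or are dropped is a choice the original question leaves open.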
