1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
enhance Python's capabilities for numerical operations?




Answer - Purpose of NumPy

Efficient Data Storage and Manipulation - NumPy provides the ndarray object, an N-dimensional array for storing large datasets efficiently. Unlike Python's built-in lists, NumPy arrays are homogeneous: every element has the same data type, so the values can be packed into a single typed memory buffer.

Mathematical and Statistical Functions - NumPy includes a wide range of mathematical, statistical, and linear algebra functions that operate on arrays.

Advantages of NumPy

Speed - NumPy operations are generally faster than equivalent operations in pure Python. This is because the core array routines are implemented in compiled C code (with some C++), so the per-element work runs outside the Python interpreter.

Memory Efficiency - NumPy arrays are more memory-efficient than Python lists because they store elements of the same type, reducing the overhead associated with dynamic typing in Python.

Multi-Dimensional Arrays: NumPy’s support for multi-dimensional arrays allows for complex data representations like matrices, tensors, and higher-dimensional data structures, which are crucial in fields like machine learning, physics, and engineering.

Integration with Other Libraries: NumPy is the foundation for many other scientific computing libraries in Python, such as SciPy, pandas, Matplotlib, and scikit-learn.


Enhancing Python's Capabilities

High-Performance Operations: NumPy operations on arrays are significantly faster than operations on Python lists or using loops.

Ease of Use: It simplifies complex mathematical operations, making Python more accessible for scientific computing.

Cross-Language Integration: NumPy serves as a bridge between Python and lower-level languages like C/C++, enabling the use of high-performance libraries in Python.
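
As a small illustration of the mathematical, statistical, and linear algebra functions mentioned above, the sketch below applies a few of them to whole arrays without any explicit loop. The sample values and variable names are illustrative assumptions, not part of the original answer.

```python
import numpy as np

samples = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(samples.mean(), samples.std())   # statistics without explicit loops

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([[0.0, 1.0],
              [1.0, 0.0]])
print(a @ b)                 # matrix product
print(np.linalg.det(a))      # determinant from the linear algebra module
```

Each call operates on the entire array at once, which is the style the rest of these answers rely on.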

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the
other?

np.mean() - Computes the arithmetic mean (average) of the elements along the specified axis.

Usage - np.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>)

When you need to calculate the simple arithmetic mean of an array, and all elements should contribute equally.
When you do not need to consider different weights for the elements.
It is straightforward and avoids the small extra overhead of np.average() when no weighting is needed.

In [None]:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
mean_value = np.mean(data)
print(mean_value)


3.0


np.average() - Computes the weighted average of the elements along the specified axis. If no weights are provided, it behaves like np.mean().

Usage - np.average(a, axis=None, weights=None, returned=False)

When you need to calculate a weighted average, where different elements should contribute differently based on their importance.
When you require both the weighted average and the sum of the weights (pass returned=True to receive them as a tuple).
When you want more flexibility in how the average is computed.

In [None]:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 1, 1, 2, 2])
weighted_average = np.average(data, weights=weights)
print(weighted_average)


3.4285714285714284
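
The two remaining behaviours described for np.average() can be checked with a short sketch: with no weights it matches np.mean(), and with returned=True it also hands back the sum of the weights. The data reuse the example above; the names avg and weight_sum are illustrative.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 1, 1, 2, 2])

# Without weights, np.average() gives the same result as np.mean()
print(np.average(data) == np.mean(data))

# returned=True yields a tuple: (weighted average, sum of weights)
avg, weight_sum = np.average(data, weights=weights, returned=True)
print(avg, weight_sum)
```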


3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D
arrays.

Answer - Reversing a NumPy array can be achieved in different ways, depending on whether you want to reverse the array along a specific axis or the entire array. Below are the methods to reverse a NumPy array along different axes, with examples for both 1D and 2D arrays.

Reversing a 1D Array - A 1D array is essentially a list of elements. Reversing it means reversing the order of elements.

In [None]:
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])

reversed_1d = arr_1d[::-1]
print(reversed_1d)


[5 4 3 2 1]


Reversing a 2D Array - A 2D array is essentially a matrix, so it can be reversed along either axis.

Axis 0: Reverses the order of the rows (flips the array vertically), e.g. arr[::-1, :].
Axis 1: Reverses the order of the columns (flips the array horizontally), e.g. arr[:, ::-1].

In [None]:

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

reversed_2d_axis0 = arr_2d[::-1, :]
print(reversed_2d_axis0)


[[7 8 9]
 [4 5 6]
 [1 2 3]]
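
The example above covers axis 0 only. A minimal sketch of the axis 1 case, plus np.flip() as an alternative to slicing, might look like this (the variable names are illustrative):

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Reverse along axis 1 (column order) with slicing
reversed_axis1 = arr_2d[:, ::-1]

# np.flip reverses along the given axis; with no axis it reverses all axes
flipped_axis1 = np.flip(arr_2d, axis=1)
flipped_both = np.flip(arr_2d)

print(reversed_axis1)
print(flipped_both)
```

Slicing with a negative step and np.flip() both return views of the original data, so no elements are copied.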


4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
in memory management and performance

Answer - In NumPy, you can determine the data type of the elements in an array using the dtype attribute. The dtype attribute returns an object describing the type of the elements in the array.


In [None]:
import numpy as np

arr = np.array([1, 2, 3])

print(arr.dtype)

arr_float = np.array([1.1, 2.2, 3.3])

print(arr_float.dtype)


int64
float64


Memory Management
Fixed Size and Memory Efficiency - NumPy arrays are homogeneously typed, meaning all elements in the array have the same data type. This allows NumPy to allocate a fixed amount of memory for each element, leading to more efficient memory usage compared to Python lists, which store references to separate Python objects of possibly different types.
For example, an int32 data type uses 32 bits (4 bytes) per element, while an int64 uses 64 bits (8 bytes). Choosing the appropriate data type based on the range of values in your data can save significant memory.

Performance Optimization
Vectorization - Because every element has the same fixed-size type and the data normally sit in one contiguous block of memory, NumPy can apply an operation to the whole array inside a compiled loop instead of the Python interpreter. This is much faster than applying the operation element by element in a Python loop.
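
The int32 versus int64 comparison can be verified directly with the itemsize and nbytes attributes. This is a small sketch; the array length of 1000 is an arbitrary choice for illustration.

```python
import numpy as np

arr64 = np.arange(1000, dtype=np.int64)
arr32 = arr64.astype(np.int32)  # values 0..999 fit comfortably in int32

print(arr64.itemsize, arr64.nbytes)  # bytes per element, total bytes
print(arr32.itemsize, arr32.nbytes)
```

Halving the element size halves the memory used by the data buffer, provided the values fit in the smaller type.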

5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

An ndarray in NumPy is the central data structure for numerical computing in Python. It represents a multidimensional, homogeneous array of fixed-size items.

Key Features of ndarrays in NumPy are

Homogeneity - All elements in an ndarray must be of the same data type (e.g., all integers, all floats). This homogeneity allows for more efficient memory usage and faster computations.

Multidimensional - An ndarray can have any number of dimensions (1D, 2D, 3D, etc.). This flexibility makes it suitable for representing a variety of data structures, such as vectors, matrices, and tensors.

Fixed Size - The size of an ndarray is determined at the time of its creation. Functions that appear to grow it, such as np.append(), return a new array rather than resizing in place. This is different from Python lists, which are dynamic in size.

Efficient Memory Management - ndarrays are normally stored in contiguous memory blocks, which allows for efficient access and manipulation of data, particularly in operations involving large datasets.

Other Features - Vectorized operations, advanced indexing and slicing, and interoperability with C/C++ and Fortran.

Differences Between ndarrays and Python Lists are

Data Type Homogeneity:

Python Lists: Can contain elements of different data types (e.g., integers, floats, strings).


ndarrays: All elements must be of the same data type, leading to more efficient memory usage.

Performance:

Python Lists: Iterating through and performing operations on large lists can be slow due to the overhead of dynamic typing and the lack of contiguous memory storage.
ndarrays: Operations are much faster due to the homogeneous nature of the data and the use of contiguous memory, which allows for optimizations like vectorization.
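
The contrast between lists and ndarrays can be seen in a few lines. The sketch below is only illustrative; the values and names are assumptions chosen to show homogeneity, the shape attributes, and the different meaning of "+".

```python
import numpy as np

mixed_list = [1, 2.5, 3]          # a list keeps each element's own type
arr = np.array(mixed_list)        # the array upcasts everything to one dtype

print([type(x).__name__ for x in mixed_list])
print(arr.dtype)

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim, matrix.shape, matrix.size)

# List "+" concatenates, while array "+" adds element by element
print([1, 2, 3] + [4, 5, 6])
print(np.array([1, 2, 3]) + np.array([4, 5, 6]))
```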

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations

Answer - Memory Efficiency

Contiguous Memory Allocation: NumPy arrays are stored in contiguous blocks of memory, unlike Python lists, which are arrays of pointers to objects. This contiguous allocation minimizes memory overhead and makes data access more efficient.

Fixed Data Type: Since all elements in a NumPy array are of the same data type, the memory required for each element is fixed, allowing NumPy to utilize memory more efficiently.

Vectorized Operations

Element-Wise Operations: NumPy allows for vectorized operations, where a single expression is applied to every element of the array inside compiled code. This eliminates the need for explicit loops in Python, which can be slow.

SIMD (Single Instruction, Multiple Data): Under the hood, NumPy can leverage SIMD instructions, which process several data elements with a single CPU instruction, significantly speeding up computation.

Optimized C and Fortran Integration

Underlying Implementation: NumPy's core is implemented in C, and its linear algebra routines rely on optimized BLAS and LAPACK libraries. This means that even complex operations are executed with the efficiency of low-level languages, far outpacing Python's native performance.
Avoidance of Python Overheads: Since the per-element work is done outside the Python interpreter, NumPy avoids the overhead of dynamic type checks and object handling on every element.
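
One rough way to observe the difference is to time the same element-wise operation on a list and on an array. This is a sketch under stated assumptions: the size n is arbitrary, a single run is not a rigorous benchmark, and the measured times depend on the machine, so no timings are quoted here.

```python
import time
import numpy as np

n = 1_000_000  # illustrative size
values_list = [float(i) for i in range(n)]
values_arr = np.arange(n, dtype=np.float64)

start = time.perf_counter()
doubled_list = [v * 2.0 for v in values_list]   # interpreted, per element
list_seconds = time.perf_counter() - start

start = time.perf_counter()
doubled_arr = values_arr * 2.0                  # one call into compiled code
numpy_seconds = time.perf_counter() - start

print(f"list comprehension: {list_seconds:.4f} s")
print(f"NumPy expression:   {numpy_seconds:.4f} s")
```

For more reliable numbers, the standard library's timeit module repeats each statement many times and reports the best run.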

7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and
output.

Answer - vstack() Function - The vstack() function stacks arrays along the vertical axis (row-wise). It essentially places arrays on top of each other, increasing the number of rows in the resulting array.

Axis: Stacks along the first axis (axis=0). 1D arrays of shape (N,) are first treated as rows of shape (1, N).
Requirements: The arrays being stacked must have the same number of columns (i.e., their shapes must match in all dimensions except the first).

In [None]:
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

result_vstack = np.vstack((array1, array2))

print("Array 1:")
print(array1)
print("\nArray 2:")
print(array2)
print("\nResult of vstack:")
print(result_vstack)


Array 1:
[[1 2 3]
 [4 5 6]]

Array 2:
[[ 7  8  9]
 [10 11 12]]

Result of vstack:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


hstack() Function - The hstack() function stacks arrays along the horizontal axis (column-wise). It essentially appends arrays side by side, increasing the number of columns in the resulting array.

Axis: Stacks along the second axis (axis=1), except for 1D arrays, which are concatenated along their only axis.
Requirements: For arrays with two or more dimensions, the shapes must match in all dimensions except the second (i.e., 2D arrays must have the same number of rows). 1D arrays may have any length.

In [None]:
import numpy as np

array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

result_hstack = np.hstack((array1, array2))

print("Array 1:")
print(array1)
print("\nArray 2:")
print(array2)
print("\nResult of hstack:")
print(result_hstack)


Array 1:
[[1 2 3]
 [4 5 6]]

Array 2:
[[ 7  8  9]
 [10 11 12]]

Result of hstack:
[[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]
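
Both examples above use 2D inputs. With 1D inputs the two functions behave quite differently, which the following sketch illustrates (x and y are illustrative names):

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

v = np.vstack((x, y))   # each 1D input becomes a row
h = np.hstack((x, y))   # 1D inputs are joined end to end

print(v.shape, h.shape)
print(v)
print(h)
```

Here vstack() produces a 2D array with two rows, while hstack() returns a single longer 1D array.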


8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various
array dimensions.




Answer - The differences between the fliplr() and flipud() functions in NumPy are as follows.

fliplr() - Flips the array horizontally (left to right), reversing the order of columns, i.e. the entries along axis 1.
Only applicable to arrays with at least two dimensions.
The shape is unchanged; within each row, the elements appear in reverse order.
Example: Transforms [[1, 2, 3]] to [[3, 2, 1]].

flipud() - Flips the array vertically (up to down), reversing the order of rows, i.e. the entries along axis 0.
Applicable to arrays with at least one dimension.
The shape is unchanged; the rows appear in reverse order.
Example: Transforms [[1], [2], [3]] to [[3], [2], [1]].

Effects on Various Array Dimensions -

1D Arrays - fliplr() will raise a ValueError because it requires at least two dimensions.
flipud() will reverse the order of elements.

2D Arrays - fliplr() will reverse the order of columns.
flipud() will reverse the order of rows.

3D and Higher-Dimensional Arrays - fliplr() will reverse the order of entries along the second axis (axis 1).
flipud() will reverse the order of entries along the first axis (axis 0).
All other axes are left unchanged.

In [None]:
#Example of fliplr()

import numpy as np

array = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Applying fliplr()
flipped_array = np.fliplr(array)

print("Original Array:")
print(array)
print("\nArray after fliplr():")
print(flipped_array)



#Example flipud()
import numpy as np

array = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Applying flipud()
flipped_array = np.flipud(array)

print("Original Array:")
print(array)
print("\nArray after flipud():")
print(flipped_array)



Original Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Array after fliplr():
[[3 2 1]
 [6 5 4]
 [9 8 7]]
Original Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Array after flipud():
[[7 8 9]
 [4 5 6]
 [1 2 3]]
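
The 1D and 3D cases are not covered by the 2D example above. The sketch below checks them; the array names vec and cube are illustrative, and the comparisons against slicing are simply one way to confirm which axis each function reverses.

```python
import numpy as np

vec = np.array([1, 2, 3])
print(np.flipud(vec))          # a 1D array is simply reversed

try:
    np.fliplr(vec)             # needs at least two dimensions
except ValueError as exc:
    print("fliplr on 1D failed:", exc)

cube = np.arange(24).reshape(2, 3, 4)
# For 3D input, flipud reverses axis 0 and fliplr reverses axis 1
print(np.array_equal(np.flipud(cube), cube[::-1, :, :]))
print(np.array_equal(np.fliplr(cube), cube[:, ::-1, :]))
```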


9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

Answer - The array_split() function in NumPy splits an array into multiple sub-arrays. Unlike np.split(), it does not require the array to divide evenly into the requested number of sections.

When an uneven split is required, array_split() distributes the remainder across the first few sub-arrays: for an array of length l split into n sections, the first l % n sub-arrays have l // n + 1 elements and the rest have l // n.

The function is flexible and allows splitting based on the number of sections or specific indices.

The output is always a list of sub-arrays, with sizes adjusted based on the splitting criteria.

In [None]:
import numpy as np

# Creating an array
array = np.array([1, 2, 3, 4, 5, 6])

# Splitting into 3 sub-arrays (even split)
sub_arrays = np.array_split(array, 3)

print("Original Array:")
print(array)
print("\nSub-arrays after even split:")
for sub_array in sub_arrays:
    print(sub_array)


Original Array:
[1 2 3 4 5 6]

Sub-arrays after even split:
[1 2]
[3 4]
[5 6]


In [None]:
#uneven split
import numpy as np

# Creating an array
array = np.array([1, 2, 3, 4, 5, 6, 7])

# Splitting into 3 sub-arrays (uneven split)
sub_arrays = np.array_split(array, 3)

print("Original Array:")
print(array)
print("\nSub-arrays after uneven split:")
for sub_array in sub_arrays:
    print(sub_array)


Original Array:
[1 2 3 4 5 6 7]

Sub-arrays after uneven split:
[1 2 3]
[4 5]
[6 7]
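
Two further points from the answer can be sketched briefly: splitting at specific indices, and the contrast with np.split(), which refuses an uneven division. The index list [2, 5] is an arbitrary choice for illustration.

```python
import numpy as np

array = np.array([1, 2, 3, 4, 5, 6, 7])

# Passing a list of indices splits before positions 2 and 5
parts = np.array_split(array, [2, 5])
print([p.tolist() for p in parts])

# np.split requires an equal division and raises otherwise
try:
    np.split(array, 3)
except ValueError as exc:
    print("np.split failed:", exc)

# Sizes chosen by array_split for 7 elements in 3 sections
print([len(p) for p in np.array_split(array, 3)])
```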


10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array
operations?

Answer - Vectorization refers to applying operations to entire arrays (or vectors) without explicit Python loops. Instead of iterating over individual elements in Python, a vectorized operation hands the whole array to a compiled routine that processes every element.
This approach leverages low-level, highly optimized C and Fortran libraries to perform these operations, resulting in significant speed improvements over traditional Python loops.

Performance: Vectorized operations are much faster than looping through array elements in pure Python because they bypass the Python interpreter and use pre-compiled, low-level code.
Code Simplicity: Vectorization leads to more concise and readable code. Instead of writing complex loops, you can perform operations with simple expressions.

Broadcasting allows NumPy to perform arithmetic operations on arrays of different shapes by virtually "stretching" the smaller array along its dimensions of size 1 (or its missing leading dimensions) so that the shapes become compatible.
When operating on two arrays, NumPy compares their shapes dimension by dimension, starting with the trailing (rightmost) dimensions and working backward. Two dimensions are compatible if they are equal or if one of them is 1; otherwise a ValueError is raised.

Rules of Broadcasting:

If the arrays do not have the same rank (number of dimensions), the shape of the smaller-rank array is padded with ones on its left side.
If the sizes of the dimensions match or if one of them is 1, then broadcasting can proceed.
The arrays are implicitly expanded to match each other in size.


Contributions to Efficient Array Operations
Performance and Speed:

Vectorization and broadcasting eliminate the need for explicit loops, significantly reducing the computational overhead associated with Python's for-loops. This is particularly important when dealing with large datasets.

Memory Efficiency:

Broadcasting allows operations to be performed without the need to create large intermediate arrays. Instead of replicating data, broadcasting treats the smaller array as if it had the same shape as the larger one, without actually copying its elements.


In [None]:
import numpy as np

# Creating two arrays
array1 = np.array([1, 2, 3, 4])
array2 = np.array([5, 6, 7, 8])

# Vectorized operation (element-wise addition)
result = array1 + array2

print("Result of vectorized addition:")
print(result)




import numpy as np

# Creating an array
array = np.array([1, 2, 3, 4])

# Broadcasting with a scalar value
result = array + 10

print("Result of broadcasting with scalar:")
print(result)



Result of vectorized addition:
[ 6  8 10 12]
Result of broadcasting with scalar:
[11 12 13 14]
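
The scalar case above is the simplest form of broadcasting. A slightly fuller sketch of the rules, using two arrays of different shapes and one incompatible pair, might look like this (the array names and values are illustrative assumptions):

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])           # shape (4,), treated as (1, 4)

result = column + row                  # broadcast to shape (3, 4)
print(result.shape)
print(result)

# broadcast_to returns a read-only view; no data is copied
stretched = np.broadcast_to(row, (3, 4))
print(stretched.strides[0])            # stride 0 along the repeated axis

# Trailing dimensions 3 and 4 are unequal and neither is 1
try:
    np.ones(3) + np.ones(4)
except ValueError as exc:
    print("incompatible shapes:", exc)
```

The zero stride shows that the "stretched" rows all point at the same underlying data, which is why broadcasting saves memory.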


PRACTICAL QUESTIONS

1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

In [None]:
import numpy as np

array = np.random.randint(1, 101, size=(3, 3))

transposed_array = np.transpose(array)

print("Original Array:")
print(array)

print("\nTransposed Array (Rows and Columns Interchanged):")
print(transposed_array)


Original Array:
[[73 65 77]
 [50 64 17]
 [ 5 44 18]]

Transposed Array (Rows and Columns Interchanged):
[[73 50  5]
 [65 64 44]
 [77 17 18]]


2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

In [None]:
import numpy as np

# Generate a 1D array with 10 elements
array_1d = np.arange(1, 11)

# Reshape the array into a 2x5 array
array_2x5 = array_1d.reshape((2, 5))

# Reshape the 2x5 array into a 5x2 array
array_5x2 = array_2x5.reshape((5, 2))

print("Original 1D Array:")
print(array_1d)

print("\nReshaped into 2x5 Array:")
print(array_2x5)

print("\nReshaped into 5x2 Array:")
print(array_5x2)


Original 1D Array:
[ 1  2  3  4  5  6  7  8  9 10]

Reshaped into 2x5 Array:
[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]

Reshaped into 5x2 Array:
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]


3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.

In [None]:
import numpy as np

# 4x4 array with random float values
array_4x4 = np.random.rand(4, 4)

# Add a border of zeros around the array, resulting in a 6x6 array
array_6x6 = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)

print("Original 4x4 Array:")
print(array_4x4)

print("\n6x6 Array with Zero Border:")
print(array_6x6)


Original 4x4 Array:
[[0.47937559 0.54303535 0.56546041 0.51119577]
 [0.24991475 0.38039743 0.06561386 0.50092908]
 [0.06188333 0.81589247 0.2659624  0.18137275]
 [0.35551966 0.08788322 0.77766755 0.25858102]]

6x6 Array with Zero Border:
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.47937559 0.54303535 0.56546041 0.51119577 0.        ]
 [0.         0.24991475 0.38039743 0.06561386 0.50092908 0.        ]
 [0.         0.06188333 0.81589247 0.2659624  0.18137275 0.        ]
 [0.         0.35551966 0.08788322 0.77766755 0.25858102 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [1]:
import numpy as np

array = np.arange(10, 61, 5)  # stop is exclusive, so 61 is needed to include 60
print(array)


[10 15 20 25 30 35 40 45 50 55 60]


5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations
(uppercase, lowercase, title case, etc.) to each element.

In [2]:
import numpy as np

# array of strings
array = np.array(['python', 'numpy', 'pandas'])

uppercase_array = np.char.upper(array)
lowercase_array = np.char.lower(array)
titlecase_array = np.char.title(array)
capitalize_array = np.char.capitalize(array)

print("Original array:", array)
print("Uppercase:", uppercase_array)
print("Lowercase:", lowercase_array)
print("Title case:", titlecase_array)
print("Capitalize:", capitalize_array)


Original array: ['python' 'numpy' 'pandas']
Uppercase: ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase: ['python' 'numpy' 'pandas']
Title case: ['Python' 'Numpy' 'Pandas']
Capitalize: ['Python' 'Numpy' 'Pandas']


6. Generate a NumPy array of words. Insert a space between each character of every word in the array.

In [4]:
import numpy as np

# array of words
words_array = np.array(['python', 'numpy', 'pandas'])

space = np.char.join(' ', words_array)

print(space)


['p y t h o n' 'n u m p y' 'p a n d a s']


7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.




In [5]:
import numpy as np

# Create two 2D arrays
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

# addition
addition = np.add(array1, array2)

# subtraction
subtraction = np.subtract(array1, array2)

# multiplication
multiplication = np.multiply(array1, array2)

# division
division = np.divide(array1, array2)

# Print the results
print("Array 1:\n", array1)
print("Array 2:\n", array2)
print("\nElement-wise addition:\n", addition)
print("\nElement-wise subtraction:\n", subtraction)
print("\nElement-wise multiplication:\n", multiplication)
print("\nElement-wise division:\n", division)


Array 1:
 [[1 2 3]
 [4 5 6]]
Array 2:
 [[ 7  8  9]
 [10 11 12]]

Element-wise addition:
 [[ 8 10 12]
 [14 16 18]]

Element-wise subtraction:
 [[-6 -6 -6]
 [-6 -6 -6]]

Element-wise multiplication:
 [[ 7 16 27]
 [40 55 72]]

Element-wise division:
 [[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]


8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements

In [6]:
import numpy as np

# 5x5 identity matrix
identity_matrix = np.eye(5)

diagonal_elements = np.diag(identity_matrix)

print("5x5 Identity Matrix:\n", identity_matrix)
print("\nDiagonal Elements:", diagonal_elements)


5x5 Identity Matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal Elements: [1. 1. 1. 1. 1.]


9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in
this array.

In [7]:
import numpy as np

# check if a number is prime
def is_prime(num):
    if num < 2:
        return False
    for i in range(2, int(np.sqrt(num)) + 1):
        if num % i == 0:
            return False
    return True

random_array = np.random.randint(0, 1001, 100)  # upper bound is exclusive, so 1001 allows 1000

prime_numbers = np.array([num for num in random_array if is_prime(num)])

print("Random Array:\n", random_array)
print("\nPrime Numbers:\n", prime_numbers)


Random Array:
 [916 853 485 254 259 745 769 633 775 887 302  28 791 144 151 743 598 603
 912 545 884 842 334 936 704 594  66 493  69 657 599 632  87 516 767 830
 442 923 296  68 308 951 992 755 986 937 342  74 242 820 162 279 326 835
 496 128 422 364  87 107 359  38 348 468 155 335 107 102 879 221 125 682
 346 340 644 312 140  23 507 818 200 736 758  16 876 826 268 355 604 496
 180 466 445 587 179 410 961 331 323 929]

Prime Numbers:
 [853 769 887 151 743 599 937 107 359 107  23 587 179 331 929]
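
As an alternative to testing each number in a Python loop, the same task can be sketched with a Sieve of Eratosthenes stored as a boolean array, which is then indexed by the random array itself. This is a different technique from the solution above, not a correction of it; the names limit and is_prime_table, the use of np.random.default_rng, and the seed are assumptions made for illustration.

```python
import numpy as np

limit = 1000  # largest value that can appear in the array

# Sieve of Eratosthenes: is_prime_table[k] is True when k is prime
is_prime_table = np.ones(limit + 1, dtype=bool)
is_prime_table[:2] = False
for p in range(2, int(limit ** 0.5) + 1):
    if is_prime_table[p]:
        is_prime_table[p * p::p] = False

rng = np.random.default_rng(0)  # seed chosen only for repeatability
random_array = rng.integers(0, limit + 1, size=100)

# Index the table with the whole array to get a boolean mask
prime_numbers = random_array[is_prime_table[random_array]]
print(prime_numbers)
```

The sieve is built once, so checking all 100 values becomes a single indexing operation.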


10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly
averages.

In [11]:
import numpy as np

daily_temperatures = np.random.randint(15, 35, 30)

weekly_temperatures = daily_temperatures[:28].reshape(4, 7)
remaining_days = daily_temperatures[28:]

weekly_averages = weekly_temperatures.mean(axis=1)

print("Daily Temperatures:\n", daily_temperatures)
print("\nWeekly Temperatures (reshaped):\n", weekly_temperatures)
print("\nWeekly Averages:\n", weekly_averages)

if remaining_days.size > 0:
    remaining_average = remaining_days.mean()
    print("\nRemaining Days Temperatures:", remaining_days)
    print("Average for Remaining Days:", remaining_average)


Daily Temperatures:
 [28 34 21 20 16 30 17 32 26 27 17 32 28 33 22 19 22 31 30 26 23 24 17 15
 17 31 21 31 15 32]

Weekly Temperatures (reshaped):
 [[28 34 21 20 16 30 17]
 [32 26 27 17 32 28 33]
 [22 19 22 31 30 26 23]
 [24 17 15 17 31 21 31]]

Weekly Averages:
 [23.71428571 27.85714286 24.71428571 22.28571429]

Remaining Days Temperatures: [15 32]
Average for Remaining Days: 23.5
