1.  Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

NumPy (Numerical Python) is a powerful library in Python designed for numerical computing and data analysis. It provides support for handling large multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these data structures efficiently.

Key Purposes of NumPy:

Efficient Array Operations: NumPy offers highly optimized operations on arrays and matrices, which are much faster than Python’s built-in data structures like lists.

Scientific and Mathematical Functions: It provides a wide range of mathematical functions, such as linear algebra, statistical analysis, Fourier transforms, and random number generation.

Data Storage and Manipulation: NumPy allows for efficient storage, manipulation, and retrieval of large datasets, making it essential in data science and machine learning.

Bridge to Lower-Level Libraries: NumPy's core is implemented in C, and it exposes a C API (plus the f2py tool for Fortran) so that compiled C, C++, and Fortran routines can operate directly on array data, enabling faster computations.

Foundation for Other Libraries: Many popular libraries like Pandas, SciPy, and Scikit-learn are built on top of NumPy.

Advantages of NumPy:

Performance and Speed:

NumPy arrays use contiguous blocks of memory and are more compact than Python lists.

Operations are vectorized, avoiding the need for explicit loops, which leads to faster computations.

Ease of Use:

Simple syntax and functions for complex mathematical operations.
Supports broadcasting, allowing operations on arrays of different shapes.

Memory Efficiency:

Consumes less memory compared to traditional Python data structures.
Provides the dtype feature to define the data type of elements, optimizing memory usage.

Interoperability:

Can easily integrate with data from other sources like text files, CSVs, and databases.

Interoperates with other Python libraries and external tools for scientific computing.

Extensive Functionality:

Supports slicing, indexing, and matrix manipulations.
Provides tools for reshaping, transposing, and aggregating data efficiently.

Error Handling and Debugging:

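Raises descriptive exceptions for invalid array operations (for example, shape mismatches) and lets you control how floating-point errors such as division by zero are reported through np.seterr() and np.errstate(), making debugging easier.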

How NumPy Enhances Python’s Capabilities:

Vectorization: NumPy replaces slow Python loops with fast array operations.

Element-wise Operations: Supports element-wise addition, subtraction,
multiplication, and division.

Advanced Indexing and Slicing: Enables sophisticated indexing techniques for accessing and manipulating data.

Mathematical Functions: Includes an extensive library of functions for statistical analysis, random sampling, and complex mathematical computations.

Broadcasting: Allows operations on arrays with different shapes, reducing the need for explicit iteration.

Example Comparison:

Python List vs. NumPy Array for Element-wise Addition:

# Using Python lists

a = [1, 2, 3, 4]

b = [5, 6, 7, 8]

result = [x + y for x, y in zip(a, b)]

print(result)  # Output: [6, 8, 10, 12]

# Using NumPy arrays

import numpy as np

a = np.array([1, 2, 3, 4])

b = np.array([5, 6, 7, 8])

result = a + b

print(result)  # Output: [ 6  8 10 12]
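
The built-in mathematical functions listed above (statistics, linear algebra, random number generation) can be sketched in a few lines. The sample values, the matrix A, and the seed below are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
samples = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Statistics: mean and (population) standard deviation.
mean = samples.mean()
std = samples.std()

# Linear algebra: solve the system A @ x = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)    # x is approximately [2., 3.]

# Random numbers: a seeded generator gives reproducible draws.
rng = np.random.default_rng(seed=0)
draws = rng.integers(low=1, high=7, size=5)   # five dice rolls in 1..6

print(mean, std, x, draws)
```

Each call works on whole arrays, so no explicit Python loop is needed for any of these steps.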






2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

1. np.mean()

Purpose: Calculates the arithmetic mean (average) of the elements in an array.

Syntax: np.mean(a, axis=None, dtype=None, out=None, keepdims=False)

Key Features:

Arithmetic Mean: Computes the sum of elements divided by the number of elements.

Axis Support: Can compute the mean along a specified axis.

No Weights: Does not support weighted averages.

Simplicity: Focuses solely on the arithmetic mean.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

mean_val = np.mean(arr)

print(mean_val)  # Output: 3.0

2. np.average()

Purpose: Computes the average of the elements in an array, optionally weighted. Without weights it returns the same value as np.mean().

Syntax: np.average(a, axis=None, weights=None, returned=False)

Key Features:

Weighted Average: Allows calculation using weights to assign different
importance to elements.

Axis Support: Can compute the average along a specified axis.

Weights Parameter: Takes a weights argument to compute the weighted mean.

Return of Sum of Weights: Can return the sum of the weights if returned=True.

Example (With Weights):

arr = np.array([1, 2, 3, 4, 5])

weights = np.array([1, 1, 2, 2, 3])  # Heavier weight on larger numbers

weighted_avg = np.average(arr, weights=weights)

print(weighted_avg)  # Output: 3.5555... (32 / 9)

When to Use One Over the Other:

Use np.mean() when:

You want the simple arithmetic mean.

No weights are involved in your calculation.

You need simplicity and fast performance.

Use np.average() when:

You need to compute a weighted average.

Different elements contribute differently to the final average.

You want to retrieve both the average and the sum of weights.
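
A short sketch of the points above: without weights the two functions agree, and returned=True gives back the sum of weights alongside the average. The manual formula is included only as a cross-check.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 1, 2, 2, 3])

# Without weights, np.average() gives the same result as np.mean().
plain_mean = np.mean(arr)
plain_avg = np.average(arr)

# returned=True yields a tuple: (weighted average, sum of weights).
weighted_avg, weight_sum = np.average(arr, weights=weights, returned=True)

# The same value computed by hand: sum(a * w) / sum(w).
manual = (arr * weights).sum() / weights.sum()

print(plain_mean, plain_avg)      # 3.0 3.0
print(weighted_avg, weight_sum)   # approximately 3.5556 and 9.0
print(manual)
```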




4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.

Determining the Data Type of Elements in a NumPy Array

In NumPy, every array has a fixed data type (dtype) that determines the type of elements it can hold. You can determine the data type of elements using various attributes and functions.

Methods to Check Data Type:
Using dtype Attribute: The dtype attribute of a NumPy array provides the data type of its elements.

import numpy as np

arr = np.array([1, 2, 3])

print(arr.dtype)  # Output: int64 (or int32 depending on the system)

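Using type() Function: type(arr) returns the container type (numpy.ndarray), not the element type. Apply it to a single element, for example type(arr[0]), to see the NumPy scalar type (such as numpy.int64).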

Using np.result_type(): Determines the resulting data type when performing operations on arrays.

dtype_result = np.result_type(arr, 2.5)

print(dtype_result)  # Output: float64 (because of mixing int and float)

Using arr.astype() for Conversion and Verification: Convert the array to a different data type and verify the type.


float_arr = arr.astype(float)
print(float_arr.dtype)  # Output: float64

Importance of Data Types in Memory Management and Performance

Memory Efficiency:

Compact Storage: NumPy arrays store data in contiguous memory blocks, making them more memory-efficient than Python lists.

Fixed Size: The dtype determines the size of each element (e.g., int32 uses 4 bytes, float64 uses 8 bytes).

Example:

arr_int = np.array([1, 2, 3], dtype=np.int32)

arr_float = np.array([1, 2, 3], dtype=np.float64)

print(arr_int.nbytes)  # Output: 12 (3 elements * 4 bytes)

print(arr_float.nbytes)  # Output: 24 (3 elements * 8 bytes)

Performance Optimization:

Faster Operations: Operations on NumPy arrays are vectorized, and the use of low-level implementations in C/C++ ensures faster computations.

Avoiding Type Conversion Overhead: Explicitly specifying the data type avoids automatic type conversions, which can slow down computations.

Example:

large_arr = np.arange(1_000_000, dtype=np.int32)  # 4 bytes per element (about 4 MB)

large_arr_float = np.arange(1_000_000, dtype=np.float64)  # 8 bytes per element (about 8 MB)

Numerical Precision and Accuracy:

Choosing an appropriate data type ensures precision in scientific computing.

Example: Using float32 instead of float64 halves memory use, but float32 carries only about 7 significant decimal digits (versus about 16 for float64), so calculations can lose precision.

Cross-Platform Compatibility:

Fixed-size data types ensure consistency across different platforms, avoiding issues with platform-specific integer or float sizes.
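
The precision and size trade-offs described above can be checked directly. This is a minimal sketch; the value 2**24 is picked because it is the point where float32 can no longer represent every integer.

```python
import numpy as np

# float32 has a 24-bit significand, so 2**24 + 1 is not representable.
big32 = np.float32(2**24)
lost = (big32 + np.float32(1)) == big32     # True: the +1 is rounded away

big64 = np.float64(2**24)
kept = (big64 + np.float64(1)) == big64     # False: float64 keeps it

# Integer dtypes have hard limits that np.iinfo reports.
int8_max = np.iinfo(np.int8).max            # 127
int32_max = np.iinfo(np.int32).max          # 2147483647

# itemsize is the number of bytes used per element.
sizes = {name: np.dtype(name).itemsize
         for name in ("int8", "int32", "float32", "float64")}

print(lost, kept, int8_max, int32_max, sizes)
```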



5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

An ndarray (N-dimensional array) is the core data structure in NumPy, representing a multidimensional, homogeneous array of fixed-size items. It is optimized for numerical operations and provides a more efficient way to store and manipulate large datasets compared to Python's built-in lists.

Key Features of ndarray:

Homogeneous Data:

All elements in an ndarray must have the same data type (dtype), ensuring consistency and efficient memory usage.

Multidimensional Support:

ndarray supports multiple dimensions (1D, 2D, 3D, etc.), making it suitable for complex data structures like matrices and tensors.

Example:

import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

print(arr_2d.shape)  # Output: (2, 3)

Efficient Memory Usage:

Data is stored in contiguous blocks of memory, allowing for faster access and manipulation compared to Python lists.

Vectorized Operations:

Supports element-wise operations without explicit loops, leveraging optimized C-based computations.

Example:

arr = np.array([1, 2, 3])

result = arr * 2

print(result)  # Output: [2 4 6]

Indexing and Slicing:

Offers powerful indexing and slicing capabilities similar to Python lists but extended to multiple dimensions.

print(arr_2d[0, 1])  # Output: 2 (element at row 0, column 1)

Broadcasting:

Allows operations on arrays of different shapes without explicitly reshaping them.

Example:

arr_1d = np.array([1, 2, 3])

arr_2d = np.array([[1], [2], [3]])

print(arr_1d + arr_2d)

# Output:

# [[2 3 4]

#  [3 4 5]

#  [4 5 6]]
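
To make the contrast with standard Python lists concrete, the sketch below applies the same operators to a list and to an ndarray. The variable names are illustrative only.

```python
import numpy as np

py_list = [1, 2, 3]
arr = np.array([1, 2, 3])

# The * operator repeats a list but multiplies an array element-wise.
list_times_two = py_list * 2                  # [1, 2, 3, 1, 2, 3]
arr_times_two = arr * 2                       # array([2, 4, 6])

# The + operator concatenates lists but adds arrays element-wise.
list_plus = py_list + [10, 20, 30]            # [1, 2, 3, 10, 20, 30]
arr_plus = arr + np.array([10, 20, 30])       # array([11, 22, 33])

# A list can mix types; an array upcasts everything to one common dtype.
mixed_list = [1, 2.5, 3]
mixed_arr = np.array(mixed_list)              # all elements become float64

print(list_times_two, arr_times_two)
print(list_plus, arr_plus)
print([type(v).__name__ for v in mixed_list], mixed_arr.dtype)
```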






6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

Performance Benefits of NumPy Arrays Over Python Lists for Large-Scale Numerical Operations

NumPy arrays (ndarray) are optimized for numerical computations and offer significant performance advantages over Python lists, particularly for large datasets. These benefits stem from differences in memory management, data storage, and the way operations are executed.

Key Reasons for Performance Benefits:

1. Memory Efficiency

Contiguous Memory Allocation:

NumPy arrays are stored in contiguous memory blocks, allowing for faster access and manipulation.

Python lists are arrays of pointers, each pointing to a separate memory location, which increases memory overhead.

Fixed Data Type:

NumPy arrays are homogeneous, meaning all elements have the same data type (dtype). This reduces memory usage compared to Python lists, which can store heterogeneous data types.

import numpy as np
import sys

arr = np.arange(1_000_000, dtype=np.int32)

py_list = list(range(1_000_000))

print(arr.nbytes)        # Output: 4000000 bytes (4 bytes per int32)

print(sys.getsizeof(py_list))  # Output: about 8 MB on 64-bit CPython for the pointer array alone; the int objects it points to take additional memory

2. Vectorized Operations (No Loops Required)

NumPy performs element-wise operations using highly optimized C and C++ libraries under the hood.

Python lists require explicit loops for similar operations, which are slower due to Python’s interpreted nature.

3. Low-Level Implementation

NumPy uses efficient low-level implementations in C and Fortran, avoiding the overhead of Python’s high-level abstractions.

Operations are executed in compiled code, bypassing Python's slower interpreter.

4. Broadcasting

NumPy supports broadcasting, allowing operations on arrays of different shapes without explicit loops or reshaping.

Python lists require nested loops for similar functionality, making operations slower and more memory-intensive.

5. Built-in Mathematical Functions

NumPy provides optimized functions for mathematical operations like np.sum(), np.mean(), etc., which are faster than equivalent manual implementations in Python lists.
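
One way to observe these differences is a small timing comparison with the standard timeit module. This is a rough sketch, not a rigorous benchmark: the array size and repeat count are arbitrary choices, and absolute timings depend on the machine.

```python
import timeit
import numpy as np

n = 1_000_000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

def square_list():
    # Pure-Python loop: one interpreter step per element.
    return [x * x for x in py_list]

def square_array():
    # Vectorized: the loop runs in compiled code.
    return arr * arr

# Both approaches must agree before their timings are worth comparing.
assert square_list()[:5] == square_array()[:5].tolist()

list_time = timeit.timeit(square_list, number=5)
array_time = timeit.timeit(square_array, number=5)

print(f"list comprehension: {list_time:.4f} s for 5 runs")
print(f"NumPy array:        {array_time:.4f} s for 5 runs")
```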



7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

1. vstack() (Vertical Stack)

Purpose: Combines arrays along the vertical axis (row-wise stacking).

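Shape Requirement: Arrays must have the same shape along every axis except the first (for 2D arrays, the same number of columns). 1D arrays of length N are treated as rows of shape (1, N).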

import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

result = np.vstack((arr1, arr2))

print(result)

[[1 2 3]
 [4 5 6]]

2. hstack() (Horizontal Stack)

Purpose: Combines arrays along the horizontal axis (column-wise stacking).

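Shape Requirement: Arrays must have the same shape along every axis except the second (for 2D arrays, the same number of rows). 1D arrays are simply concatenated end to end.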

arr3 = np.array([[1], [2], [3]])

arr4 = np.array([[4], [5], [6]])

result = np.hstack((arr3, arr4))

print(result)

[[1 4]
 [2 5]
 [3 6]]
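
The following sketch contrasts the two functions on 1D and 2D inputs and shows what happens when shapes do not match. The array names are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1D inputs: vstack adds a new row axis, hstack concatenates end to end.
v = np.vstack((a, b))     # shape (2, 3)
h = np.hstack((a, b))     # shape (6,)

# 2D inputs: vstack joins along axis 0, hstack along axis 1.
m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])
v2 = np.vstack((m1, m2))  # shape (4, 2)
h2 = np.hstack((m1, m2))  # shape (2, 4)

# For 2D inputs these match np.concatenate with an explicit axis.
same_v = np.array_equal(v2, np.concatenate((m1, m2), axis=0))
same_h = np.array_equal(h2, np.concatenate((m1, m2), axis=1))

# Mismatched shapes are rejected with a ValueError.
try:
    np.vstack((a, np.array([1, 2])))
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

print(v.shape, h.shape, v2.shape, h2.shape, same_v, same_h)
print("mismatch rejected:", mismatch_error is not None)
```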




8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.

1. fliplr() (Flip Left to Right)

Purpose: Reverses the elements in each row, flipping the array horizontally.
Axis Affected: Operates on axis 1 (columns), equivalent to arr[:, ::-1].

Applicable to: Arrays with at least two dimensions. For 3D and higher arrays, only axis 1 is reversed.
1D Arrays: Not supported. np.fliplr() raises a ValueError because a 1D array has no axis 1.

import numpy as np

arr_2d = np.array([[1, 2, 3],

                   [4, 5, 6],

                   [7, 8, 9]])

result = np.fliplr(arr_2d)

print("fliplr result:\n", result)

fliplr result:
 [[3 2 1]
  [6 5 4]
  [9 8 7]]

2. flipud() (Flip Up to Down)

Purpose: Reverses the order of the rows, flipping the array vertically.
Axis Affected: Operates on axis 0 (rows), equivalent to arr[::-1].

Applicable to: Arrays with one or more dimensions.

Effect on 1D Arrays: Reverses the entire array.

result = np.flipud(arr_2d)

print("flipud result:\n", result)

flipud result:
 [[7 8 9]
  [4 5 6]
  [1 2 3]]
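
The behaviour on other array dimensions can be verified with a short sketch. The 1D and 3D arrays below are made-up inputs, and the slicing expressions are shown only as equivalents for comparison.

```python
import numpy as np

arr_1d = np.array([1, 2, 3])
arr_3d = np.arange(8).reshape(2, 2, 2)

# flipud reverses axis 0, so a 1D array is simply reversed.
flipped_1d = np.flipud(arr_1d)              # array([3, 2, 1])

# fliplr needs an axis 1 and rejects 1D input.
try:
    np.fliplr(arr_1d)
    fliplr_error = None
except ValueError as exc:
    fliplr_error = exc

# On a 3D array, each function reverses exactly one axis.
ud_matches = np.array_equal(np.flipud(arr_3d), arr_3d[::-1])
lr_matches = np.array_equal(np.fliplr(arr_3d), arr_3d[:, ::-1])

# np.flip with an explicit axis is the general form of both.
flip_matches = np.array_equal(np.fliplr(arr_3d), np.flip(arr_3d, axis=1))

print(flipped_1d, fliplr_error is not None, ud_matches, lr_matches, flip_matches)
```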



9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

Functionality of array_split() in NumPy

array_split() is a NumPy function that splits an array into multiple sub-arrays. It is more flexible than split() because it can handle uneven splits when the array cannot be divided evenly into the specified number of sub-arrays.

Syntax:

numpy.array_split(ary, indices_or_sections, axis=0)

ary: The array to split.

indices_or_sections: The number of sections or specific indices at which to split.

axis: The axis along which to split the array (default is 0, i.e., rows).

Handling Uneven Splits

Unlike split(), which raises an error for uneven splits, array_split() distributes the remainder across the first few sections, making them slightly larger.

Example 1: Even Split

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

result = np.array_split(arr, 3)

print(result)  # Output: [array([1, 2]), array([3, 4]), array([5, 6])]

Example 2: Uneven Split

arr = np.array([1, 2, 3, 4, 5])

result = np.array_split(arr, 3)

print(result)  # Output: [array([1, 2]), array([3, 4]), array([5])]

Example 3: Splitting Along a Different Axis (2D Array)

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

result = np.array_split(arr_2d, 2, axis=1)

print(result)  # Output: two sub-arrays, [[1 2] [4 5]] with shape (2, 2) and [[3] [6]] with shape (2, 1)
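
The difference from split() on an uneven request can be seen side by side. This is a minimal sketch using a 10-element array as an assumed input.

```python
import numpy as np

arr = np.arange(10)   # 10 elements cannot be divided evenly into 3

# array_split: the first (10 % 3) = 1 section gets one extra element.
parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]          # [4, 3, 3]

# split: the same request is rejected.
try:
    np.split(arr, 3)
    split_error = None
except ValueError as exc:
    split_error = exc

# Passing a list of indices splits at those positions instead.
by_index = np.array_split(arr, [2, 7])   # arr[:2], arr[2:7], arr[7:]

print(sizes, split_error is not None, [p.tolist() for p in by_index])
```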




1. Vectorization

Vectorization refers to the process of applying operations simultaneously on entire arrays or vectors without the need for explicit loops. NumPy achieves vectorization using optimized C-based implementations, making operations much faster compared to pure Python loops.

Benefits of Vectorization

Speed: Eliminates the overhead of Python loops, leading to faster execution.

Conciseness: Reduces complex loop-based code to simple one-liners.

Optimization: Uses low-level optimizations for numerical computations.

import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Vectorized operation

result = arr * 2

print(result)  # Output: [ 2  4  6  8 10]
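
Vectorization also covers conditional logic, not just arithmetic. The sketch below replaces a loop with np.where; the temperature readings and the 28-degree threshold are hypothetical values used only for illustration.

```python
import numpy as np

temps_c = np.array([12.5, 30.1, 25.0, 8.3, 33.7])  # hypothetical readings

# Loop version: one Python-level iteration per element.
labels_loop = []
for t in temps_c:
    labels_loop.append("hot" if t > 28 else "mild")

# Vectorized version: the comparison and selection run over the whole array.
labels_vec = np.where(temps_c > 28, "hot", "mild")

# Arithmetic vectorizes the same way: Celsius to Fahrenheit in one expression.
temps_f = temps_c * 9 / 5 + 32

print(labels_loop)
print(labels_vec.tolist())
print(temps_f)
```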

2. Broadcasting

Broadcasting allows NumPy to perform operations on arrays of different shapes by expanding the smaller array to match the shape of the larger array without creating new copies. This avoids memory overhead and ensures efficient computation.

Rules of Broadcasting

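Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension; a shape with fewer dimensions is treated as if it were padded with 1s on the left.

Two dimensions are compatible when they are equal or when one of them is 1.

A dimension of size 1 is "stretched" to match the other array's size along that axis; if any pair of dimensions is incompatible, NumPy raises a ValueError.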

arr1 = np.array([[1, 2, 3], [4, 5, 6]])

arr2 = np.array([1, 2, 3])

result = arr1 + arr2

print(result)

# Output:

# [[2 4 6]

#  [5 7 9]]
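
The compatibility rules and the "no new copies" claim can both be inspected directly. This sketch assumes NumPy 1.20 or later, where np.broadcast_shapes is available; the shapes are arbitrary examples.

```python
import numpy as np

# np.broadcast_shapes reports the result shape without allocating data.
shape_ok = np.broadcast_shapes((2, 3), (3,))       # (2, 3)
shape_col = np.broadcast_shapes((3, 1), (1, 4))    # (3, 4)

# Trailing dimensions 3 and 2 are neither equal nor 1, so this fails.
try:
    np.broadcast_shapes((2, 3), (2,))
    shape_error = None
except ValueError as exc:
    shape_error = exc

# broadcast_to returns a read-only view; the repeated axis has stride 0,
# which is why no copy of the data is needed.
row = np.array([1, 2, 3])
view = np.broadcast_to(row, (2, 3))
shares_memory = np.shares_memory(row, view)
first_stride = view.strides[0]

print(shape_ok, shape_col, shape_error is not None, shares_memory, first_stride)
```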






import numpy as np

# 1. Create a 3x3 array with random integers between 1 and 100 and interchange rows and columns.
array_3x3 = np.random.randint(1, 101, (3, 3))
transposed_array = array_3x3.T
print("Original 3x3 Array:\n", array_3x3)
print("Transposed Array:\n", transposed_array)



Original 3x3 Array:
 [[47 78 85]
 [22 32 55]
 [85 62 29]]
Transposed Array:
 [[47 22 85]
 [78 32 62]
 [85 55 29]]


# 2. Generate a 1D array with 10 elements and reshape it to 2x5 and 5x2.
array_1d = np.arange(10)
array_2x5 = array_1d.reshape(2, 5)
array_5x2 = array_1d.reshape(5, 2)
print("Original 1D Array:\n", array_1d)
print("Reshaped to 2x5:\n", array_2x5)
print("Reshaped to 5x2:\n", array_5x2)



Original 1D Array:
 [0 1 2 3 4 5 6 7 8 9]
Reshaped to 2x5:
 [[0 1 2 3 4]
 [5 6 7 8 9]]
Reshaped to 5x2:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


# 3. Create a 4x4 array with random floats and add a zero border to make it 6x6.
array_4x4 = np.random.rand(4, 4)
array_6x6 = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)
print("Original 4x4 Array:\n", array_4x4)
print("6x6 Array with Zero Border:\n", array_6x6)

Original 4x4 Array:
 [[0.89855397 0.04094959 0.36463625 0.2481901 ]
 [0.66579663 0.94663098 0.79904224 0.1506431 ]
 [0.38528672 0.45643471 0.7877111  0.30629718]
 [0.97802411 0.26571091 0.48179899 0.61370285]]
6x6 Array with Zero Border:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.89855397 0.04094959 0.36463625 0.2481901  0.        ]
 [0.         0.66579663 0.94663098 0.79904224 0.1506431  0.        ]
 [0.         0.38528672 0.45643471 0.7877111  0.30629718 0.        ]
 [0.         0.97802411 0.26571091 0.48179899 0.61370285 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


# 4. Create an array of integers from 10 to 60 with a step of 5.
array_step = np.arange(10, 65, 5)
print("Array from 10 to 60 with step of 5:\n", array_step)

Array from 10 to 60 with step of 5:
 [10 15 20 25 30 35 40 45 50 55 60]


# 5. Create an array of strings and apply case transformations.
array_strings = np.array(['python', 'numpy', 'pandas'])
upper_case = np.char.upper(array_strings)
lower_case = np.char.lower(array_strings)
title_case = np.char.title(array_strings)
print("Uppercase:\n", upper_case)
print("Lowercase:\n", lower_case)
print("Title Case:\n", title_case)

Uppercase:
 ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase:
 ['python' 'numpy' 'pandas']
Title Case:
 ['Python' 'Numpy' 'Pandas']


# 6. Insert a space between each character of every word in the array.
spaced_words = np.char.join(" ", array_strings)
print("Spaced Words:\n", spaced_words)


Spaced Words:
 ['p y t h o n' 'n u m p y' 'p a n d a s']


# 7. Perform element-wise operations on two 2D arrays.
array_a = np.array([[1, 2], [3, 4]])
array_b = np.array([[5, 6], [7, 8]])
addition = array_a + array_b
subtraction = array_a - array_b
multiplication = array_a * array_b
division = array_a / array_b
print("Addition:\n", addition)
print("Subtraction:\n", subtraction)
print("Multiplication:\n", multiplication)
print("Division:\n", division)

Addition:
 [[ 6  8]
 [10 12]]
Subtraction:
 [[-4 -4]
 [-4 -4]]
Multiplication:
 [[ 5 12]
 [21 32]]
Division:
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]



# 9. Generate an array of 100 random integers and find prime numbers.
random_integers = np.random.randint(0, 1000, 100)

def is_prime(n):
    """Return True if n is prime, using trial division up to sqrt(n)."""
    if n < 2:
        return False
    for i in range(2, int(np.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True

# Keep only the values that pass the primality test (duplicates are preserved).
prime_numbers = np.array([num for num in random_integers if is_prime(num)])
print("Prime Numbers:\n", prime_numbers)

Prime Numbers:
 [619 193  47 701 373 251 163 929 599 877 977 523  67 829  71 859 263]
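
As an alternative to testing each number in a Python loop, the primality check itself can be vectorized with a Sieve of Eratosthenes stored as a boolean mask. This is a sketch of a different technique, not the approach used in the cell above; the helper name primes_below and the seed are assumptions introduced for illustration.

```python
import numpy as np

def primes_below(limit):
    """Return a boolean mask where mask[k] is True when k is prime (hypothetical helper)."""
    mask = np.ones(limit, dtype=bool)
    mask[:2] = False                      # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            mask[i * i::i] = False        # cross out multiples of i
    return mask

rng = np.random.default_rng(seed=42)      # seeded only for reproducibility
random_integers = rng.integers(0, 1000, size=100)

# Index the mask with the array itself to test every value at once.
is_prime_mask = primes_below(1000)
prime_numbers = random_integers[is_prime_mask[random_integers]]

print("Prime Numbers:\n", prime_numbers)
```

The sieve is built once for the whole value range, so each of the 100 lookups is a single array index rather than a trial-division loop.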


# 10. Create an array of daily temperatures for a month and calculate weekly averages.
daily_temps = np.random.randint(15, 35, 30)
# Average the four complete 7-day weeks; the last 2 days do not form a full week.
weekly_avg = daily_temps[:28].reshape(4, 7).mean(axis=1)
print("Daily Temperatures:\n", daily_temps)
print("Weekly Averages:\n", weekly_avg)

Daily Temperatures:
 [17 24 33 26 27 32 30 29 23 33 23 31 32 24 17 21 21 24 28 29 25 34 18 23
 15 15 25 23 15 31]
Weekly Averages:
 [27.         27.85714286 23.57142857 21.85714286]
