# Theoretical

Q1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

NumPy is a fundamental library in Python for scientific computing and data analysis, providing efficient and powerful tools for numerical operations. Its purpose and advantages are as follows:

Purpose:

* Array and Matrix Operations:

 NumPy's primary function is to provide efficient and flexible implementations for storing and manipulating large arrays and matrices of numerical data. It offers a wide range of functions for performing mathematical operations on these arrays, such as addition, subtraction, multiplication, division, and element-wise operations.

* Linear Algebra:

 NumPy includes a comprehensive set of tools for linear algebra, including matrix inversion, determinant calculation, eigenvalue and eigenvector analysis, and solving linear systems of equations. These capabilities are essential for many scientific and engineering applications.

* Random Number Generation:

 NumPy provides functions for generating various types of random numbers, including uniform, normal, Poisson, and binomial distributions. This is crucial for simulations, statistical analysis, and machine learning tasks.

* Fourier Transforms:

 NumPy offers efficient implementations of Fourier transforms, which are widely used in signal processing, image processing, and scientific data analysis.

Advantages:

1. Performance:

Vectorization:

NumPy operations are vectorized, meaning that they operate on whole arrays at once rather than through explicit loops in Python. This leads to significant performance improvements, as vectorized operations are executed in compiled C code rather than interpreted Python.

Memory Efficiency:

 NumPy arrays consume less memory and have a more compact internal representation than Python lists.

2. Broadcasting:

Broadcasting is a technique that allows NumPy to perform arithmetic operations on arrays of different shapes and sizes without explicitly reshaping them. This makes it easier to write concise and efficient code for operations that involve arrays of different dimensions.

3. Convenience:

Array Manipulation:

 NumPy provides a rich set of functions for creating, reshaping, and manipulating arrays. Operations like slicing, indexing, and reshaping are straightforward and efficient.

Integration with Other Libraries:

 Many scientific and data analysis libraries rely on NumPy arrays as their fundamental data structure. This allows for easy interoperability and data exchange between different libraries.

4. Mathematical and Statistical Operations:

Comprehensive Library:

 NumPy includes a wide array of mathematical functions, including operations for linear algebra (e.g., matrix multiplication, eigenvalues), statistical analysis (e.g., mean, median, standard deviation), and random number generation.

High-Level Functions:

 Functions like numpy.dot for dot products, numpy.linalg for linear algebra operations, and numpy.fft for fast Fourier transforms are optimized for performance and ease of use.

5. Scientific and Engineering Applications: NumPy is widely used in fields such as physics, engineering, and finance due to its ability to handle large datasets and perform complex numerical computations efficiently.

How it enhances Python's capabilities for numerical operations:

* Efficient Data Storage:

 NumPy arrays provide a more efficient way to store numerical data compared to Python lists, leading to faster operations.

* Optimized Numerical Functions:

 NumPy's functions are optimized for numerical computations, often implemented in C or Fortran for maximum performance.

* Linear Algebra Tools:

 NumPy's linear algebra capabilities enable complex mathematical operations to be performed efficiently.

* Random Number Generation:

NumPy's random number generation functions are essential for simulations and statistical analysis.

* Fourier Transforms:

 NumPy's Fourier transform functions are crucial for signal and image processing.
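The capability areas listed above can be sketched in a few lines. This is a minimal illustration, not a benchmark; the variable names, the seed, and the sample values are assumptions chosen for the example.

```python
import numpy as np

# Vectorized arithmetic: one expression replaces an explicit Python loop
prices = np.array([10.0, 20.0, 30.0])
with_tax = prices * 1.2

# Linear algebra: solve the system A @ x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Random number generation: 1000 draws from a standard normal distribution
rng = np.random.default_rng(seed=42)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# Fourier transform: a sine wave with 5 cycles over 100 samples
signal = np.sin(2 * np.pi * 5 * np.arange(100) / 100)
spectrum = np.abs(np.fft.rfft(signal))
dominant_bin = int(np.argmax(spectrum))  # index of the strongest frequency

print(with_tax, x, samples.shape, dominant_bin)
```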

Q2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

NumPy provides two functions, np.mean() and np.average(), that both calculate the mean of an array. The difference appears when a weighted average is needed: np.average() can calculate both the arithmetic mean and a weighted average, whereas np.mean() calculates only the arithmetic mean. The basic use of each function, and how they differ, is shown below.

* np.mean()

np.mean() calculates the arithmetic mean of the given array, optionally along a specified axis. Let's see the code implementation.

Example:

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
mean_value = np.mean(arr)
print(mean_value)

3.0


* np.average()

np.average() is more flexible: it calculates the weighted average if weights are passed; if not, it returns the same value as np.mean(), that is, the arithmetic mean.

Example :

In [None]:
arr = np.array([1, 2, 3, 4, 5])
weights = np.array([0.1, 0.2, 0.3, 0.2, 0.2])
weighted_average = np.average(arr, weights=weights)
print(weighted_average)

3.2


* Compare np.average() and np.mean()

np.mean() accepts several arguments, such as dtype, out, and where, that are not available in np.average().

If the weights option is specified, np.average() computes a weighted mean, which np.mean() cannot do. np.average() can also return the sum of the weights through returned=True.

np.mean() accepts a boolean mask through its where argument (NumPy 1.20 and later) and computes the mean over the selected elements only. np.average() has no such argument, so it always averages over every element along the chosen axis. For masked arrays, np.ma.average() is the mask-aware weighted counterpart.
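The comparison can be checked with a short sketch. It assumes NumPy 1.20 or later for the where argument of np.mean(); the weights are made up for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# np.mean with a boolean `where` mask: only elements greater than 2 contribute
mean_selected = np.mean(arr, where=arr > 2)

# np.average with weights; returned=True also gives back the sum of the weights
weights = np.array([1, 1, 1, 1, 6])  # illustrative weights, not from the text above
weighted_avg, weight_sum = np.average(arr, weights=weights, returned=True)

# Without weights, both functions return the arithmetic mean
same_result = np.mean(arr) == np.average(arr)

print(mean_selected, weighted_avg, weight_sum, same_result)
```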

Q3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

NumPy is a general-purpose array-processing package that provides a high-performance multidimensional array object and tools for working with these arrays. Let's discuss how we can reverse a NumPy array.

* Reversing NumPy Arrays Along Different Axes

In NumPy, reversing an array involves flipping its elements along a particular axis. This is often used for various operations, such as reversing the order of elements in a sequence or performing specific mathematical calculations.

* Methods for Reversing Arrays:

1. Using the [::-1] Slicing Syntax:

This is the most common and concise way to reverse an array.

The [::-1] slice specifies that all elements should be taken, but in reverse order.

Example:

In [None]:
# 1D array
arr1d = np.array([1, 2, 3, 4, 5])
reversed_1d = arr1d[::-1]
print(reversed_1d)

# 2D array
arr2d = np.array([[1, 2, 3],
                 [4, 5, 6]])
reversed_2d = arr2d[::-1, :]
print(reversed_2d)

[5 4 3 2 1]
[[4 5 6]
 [1 2 3]]


2. Using the flip() Function:

The np.flip() function provides more flexibility for reversing along specific axes.

It takes the array and an optional axis argument. If no axis is specified, it reverses along all axes.

Example:

In [None]:
# 1D array
arr1d = np.array([1, 2, 3, 4, 5])
reversed_1d = np.flip(arr1d)
print(reversed_1d)

# 2D array
arr2d = np.array([[1, 2, 3],
                 [4, 5, 6]])
reversed_2d_axis0 = np.flip(arr2d, axis=0)
reversed_2d_axis1 = np.flip(arr2d, axis=1)
print(reversed_2d_axis0)
print(reversed_2d_axis1)

[5 4 3 2 1]
[[4 5 6]
 [1 2 3]]
[[3 2 1]
 [6 5 4]]


* Both methods achieve the same result: reversing the array along the specified axis.

* The [::-1] slicing syntax is generally more concise for simple reversals.

* The np.flip() function offers more flexibility, especially for higher-dimensional arrays and reversing along multiple axes.

* Choose the method that best suits your specific requirements and coding style.
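Reversing along several axes at once can be sketched with either method. The sketch also checks that a reversed slice is a view on the original data rather than a copy, which matters if the result is modified afterwards.

```python
import numpy as np

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6]])

# Reverse both axes: a tuple of axes for np.flip, or one slice per axis
both_flip = np.flip(arr2d, axis=(0, 1))
both_slice = arr2d[::-1, ::-1]

# Reverse only the columns with slicing (same result as np.flip(arr2d, axis=1))
cols_slice = arr2d[:, ::-1]

# Reversed slices are views: they share memory with the original array
is_view = np.shares_memory(arr2d, both_slice)

# Call .copy() when an independent reversed array is needed
independent = both_slice.copy()

print(both_flip)
print(cols_slice)
print(is_view)
```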

Q4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.




* Determining Data Types in NumPy Arrays

NumPy arrays are homogeneous data structures, meaning all elements within an array must have the same data type. This data type determines how the elements are stored in memory and how operations are performed on them.

Here are the primary methods to determine the data type of elements in a NumPy array:

1. Using the dtype Attribute:

The dtype attribute of a NumPy array directly provides the data type.

Example:

In [None]:
arr = np.array([1, 2, 3, 4, 5])
print(arr.dtype)

int64


2. Using the type() Function (Not Recommended):

While technically possible, using type() on a NumPy array will return the numpy.ndarray type, which doesn't reveal the underlying element data type.

Example:

In [None]:
arr = np.array([1, 2, 3, 4, 5])
print(type(arr))

<class 'numpy.ndarray'>


Importance of Data Types in Memory Management and Performance:

Understanding and choosing the appropriate data type for your NumPy arrays is crucial for efficient memory management and optimal performance:

* Memory Usage:

 Different data types occupy varying amounts of memory. For example, a float64 element requires twice as much memory as an int32 element. By selecting the smallest data type that can accurately represent your values, you can significantly reduce memory consumption.

* Computational Efficiency:

 NumPy is highly optimized for performing operations on arrays of the same data type. When elements have the same data type, the underlying algorithms can be more efficient, leading to faster computations.

* Data Integrity:

 Using the correct data type helps prevent data loss or unexpected behavior due to incorrect value representation. For instance, storing float values in an integer array discards the fractional part, and very large integers (beyond 2**53) stored in a float64 array can no longer be represented exactly.

Common Data Types in NumPy:

Integer : int8, int16, int32, int64

Floating-point : float16, float32, float64

Complex : complex64, complex128

Boolean : bool

String : str_ and bytes_ (fixed-width), object (for variable-length Python strings)
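The effect of the data type on memory and on stored values can be inspected directly. A small sketch, with array contents chosen only for illustration:

```python
import numpy as np

values = np.arange(1000)

# Same 1000 values stored with different integer widths
as_int64 = values.astype(np.int64)
as_int16 = values.astype(np.int16)

# nbytes = number of elements * itemsize (bytes per element)
print(as_int64.itemsize, as_int64.nbytes)  # 8 bytes per element
print(as_int16.itemsize, as_int16.nbytes)  # 2 bytes per element

# Converting floats to integers drops the fractional part (truncates toward zero)
truncated = np.array([1.7, -1.7]).astype(np.int32)
print(truncated)

# The representable range of an integer type
int8_max = np.iinfo(np.int8).max

# String arrays use a fixed-width Unicode dtype sized to the longest element
words = np.array(['python', 'numpy'])
print(words.dtype.kind, words.dtype.itemsize)  # 'U', 4 bytes per character
```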

Q5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

* ndarrays in NumPy

NumPy arrays, also known as ndarrays (n-dimensional arrays), are the fundamental data structure in the NumPy library. They provide a powerful and efficient way to represent and manipulate numerical data in Python.

* Key Features of ndarrays:

1. Homogeneity:

 All elements within an ndarray must have the same data type (e.g., int, float, bool). This ensures efficient memory management and optimized operations.

2. Fixed Size:

 The size of an ndarray is fixed once it's created. This allows for efficient memory allocation and indexing.

3. Multi-Dimensionality:

 ndarrays can be multi-dimensional, representing data in various shapes such as 1D arrays (vectors), 2D arrays (matrices), and higher-dimensional arrays.

4. Vectorized Operations:

 NumPy provides vectorized operations, which allow you to perform operations on entire arrays element-wise without explicit loops. This significantly improves performance.

5. Broadcasting:

 NumPy supports broadcasting, which allows arrays of different shapes to be compatible in arithmetic operations. This simplifies calculations and makes code more concise.

6. Indexing and Slicing:

 Powerful indexing and slicing mechanisms enable you to access and manipulate specific elements or subsets of an ndarray.

7. Shape and Size:

 ndarrays have attributes like shape and size that provide information about their dimensions and the total number of elements.

* Differences from Standard Python Lists

1.  Homogeneity:

 Python lists can contain elements of different data types, while ndarrays require all elements to have the same type.

2. Performance:

 ndarrays are typically much faster than Python lists for numerical operations due to their optimized implementation and vectorization.

3. Fixed Size:

 Python lists can be dynamically resized, while the size of an ndarray is fixed after creation.

4. Multi-Dimensionality:

 While Python lists can be nested to represent multi-dimensional data, ndarrays are specifically designed for efficient handling of multi-dimensional arrays.

5. Vectorized Operations:
 Python lists require explicit loops for most numerical operations, while ndarrays support vectorized operations for faster computations.
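A few of these differences can be seen side by side in a short sketch (the values are illustrative):

```python
import numpy as np

py_list = [1, 2, 3]
nd = np.array(py_list)

# The same operator means different things
list_times_two = py_list * 2   # list repetition
array_times_two = nd * 2       # element-wise multiplication

# Homogeneity: mixed inputs are converted to one common dtype
mixed = np.array([1, 2.5, True])

# Shape information comes as attributes
matrix = np.arange(6).reshape(2, 3)
info = (matrix.ndim, matrix.shape, matrix.size)

# Fixed size: np.append builds a new array and leaves the original unchanged
appended = np.append(nd, 4)

print(list_times_two, array_times_two)
print(mixed.dtype, info)
print(nd.size, appended.size)
```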

Q6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

NumPy arrays offer significant performance advantages over Python lists for large-scale numerical operations due to several key factors:

1. Memory Efficiency:

 * Contiguous Storage: NumPy arrays store elements in contiguous memory locations, allowing for more efficient memory access and faster operations.

 * Optimized Data Types: NumPy provides a variety of data types specifically designed for numerical operations, such as int32, float64, and complex128. These data types are often more compact and efficient than Python's built-in int and float types.

2. Vectorized Operations:

 * Element-wise Operations: NumPy supports vectorized operations, which allow you to perform operations on entire arrays element-wise without explicit loops. This eliminates the overhead of Python's loop constructs and can significantly improve performance.

 * Optimized Implementations: NumPy's vectorized operations are often implemented in highly optimized C or Fortran code, providing a significant speedup compared to Python's interpreted code.

3. Broadcasting:

 * Automatic Shape Compatibility: NumPy's broadcasting mechanism allows arrays of different shapes to be compatible in arithmetic operations. This simplifies calculations and avoids unnecessary data copying, leading to performance improvements.

4. Optimized Libraries:

 * BLAS and LAPACK: NumPy leverages highly optimized libraries like BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra Package) for linear algebra operations. These libraries are compiled, and tuned implementations such as OpenBLAS or Intel MKL use hand-optimized kernels, which can provide substantial performance gains, especially for large matrices.

5. Reduced Overhead:

 * Interpreter Overhead: NumPy's operations are often implemented in compiled code, reducing the overhead of Python's interpreter. This can be particularly beneficial for computationally intensive tasks.

Example:

In [None]:
import time

# Create large lists and arrays
python_list = list(range(10000000))
numpy_array = np.arange(10000000)

# Time the sum operation for both lists and arrays
start_time = time.perf_counter()  # perf_counter is the clock intended for timing short intervals
python_sum = sum(python_list)
end_time = time.perf_counter()
python_time = end_time - start_time

start_time = time.perf_counter()
numpy_sum = np.sum(numpy_array)
end_time = time.perf_counter()
numpy_time = end_time - start_time

print("Python list time:", python_time)
print("NumPy array time:", numpy_time)

Python list time: 0.08187603950500488
NumPy array time: 0.014528989791870117
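A single timing run can be noisy. A slightly more robust sketch uses the standard timeit module and takes the best of several repeats; the array size and repeat counts here are arbitrary assumptions, and the actual timings depend on the machine.

```python
import timeit
import numpy as np

n = 100_000
python_list = list(range(n))
numpy_array = np.arange(n, dtype=np.int64)  # explicit dtype so squares cannot overflow

def square_list():
    # Element-by-element loop executed by the Python interpreter
    return [x * x for x in python_list]

def square_array():
    # One vectorized multiplication executed in compiled code
    return numpy_array * numpy_array

# Best of 3 repeats, each running the function 10 times
list_time = min(timeit.repeat(square_list, number=10, repeat=3))
array_time = min(timeit.repeat(square_array, number=10, repeat=3))

print("Python list time:", list_time)
print("NumPy array time:", array_time)
```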


Q7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

Comparing vstack() and hstack() in NumPy

vstack() and hstack() are two essential functions in NumPy used to vertically and horizontally stack arrays, respectively.

vstack():

* Vertically stacks arrays: Appends arrays row-wise.
* Requires arrays with the same number of columns.

Example:

In [None]:
array1 = np.array([[1, 2],
                  [3, 4]])

array2 = np.array([[5, 6],
                  [7, 8]])

stacked_array = np.vstack((array1, array2))
print(stacked_array)

[[1 2]
 [3 4]
 [5 6]
 [7 8]]


hstack():

* Horizontally stacks arrays: Appends arrays column-wise.
* Requires arrays with the same number of rows.

Example:

In [None]:
array1 = np.array([[1, 2],
                  [3, 4]])

array2 = np.array([[5, 6],
                  [7, 8]])

stacked_array = np.hstack((array1, array2))
print(stacked_array)

[[1 2 5 6]
 [3 4 7 8]]


Key Differences:

* vstack() stacks arrays vertically, requiring the same number of columns.

* hstack() stacks arrays horizontally, requiring the same number of rows.
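With 1D inputs the two functions behave differently from the 2D case, which a short sketch makes visible. It also checks the equivalence with np.concatenate for 2D arrays.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# vstack treats each 1D array as a row, giving a 2D result
v = np.vstack((a, b))
# hstack joins 1D arrays end to end, giving a longer 1D result
h = np.hstack((a, b))
print(v.shape, h.shape)

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# For 2D arrays, the two functions match concatenation along axis 0 and axis 1
same_v = np.array_equal(np.vstack((m1, m2)), np.concatenate((m1, m2), axis=0))
same_h = np.array_equal(np.hstack((m1, m2)), np.concatenate((m1, m2), axis=1))
print(same_v, same_h)
```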

Q8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.





fliplr()

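* Purpose: Flips an array in the left-right direction, that is, along axis 1 (the column axis).
* Effect on array dimensions:
 * For a 2D array, reverses the order of elements in each row.
 * For higher-dimensional arrays, only axis 1 is reversed; a 1D array raises a ValueError because it has no axis 1.
 * Does not affect the shape of the array.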

Example:

In [None]:
array = np.array([[1, 2, 3],
                 [4, 5, 6]])

flipped_array = np.fliplr(array)
print(flipped_array)

[[3 2 1]
 [6 5 4]]


flipud()

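* Purpose: Flips an array in the up-down direction, that is, along axis 0 (the first axis).
* Effect on array dimensions:
 * For a 2D array, reverses the order of rows in the array.
 * Also works on 1D arrays, where it simply reverses the elements.
 * Does not affect the shape of the array.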

Example:

In [None]:
array = np.array([[1, 2, 3],
                 [4, 5, 6]])

flipped_array = np.flipud(array)
print(flipped_array)

[[4 5 6]
 [1 2 3]]


Key Differences:

* fliplr() is useful for reversing the order of elements within each row of an array.
* flipud() is useful for reversing the order of rows in an array.
* Both methods do not affect the overall shape of the array.
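The behavior on other dimensionalities can be checked with a small sketch using a 3D and a 1D array (shapes chosen only for illustration):

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)

# fliplr reverses axis 1; flipud reverses axis 0; the shape stays (2, 3, 4)
lr = np.fliplr(cube)
ud = np.flipud(cube)
print(np.array_equal(lr, cube[:, ::-1, :]))  # same as reversing axis 1
print(np.array_equal(ud, cube[::-1]))        # same as reversing axis 0

vec = np.arange(4)

# flipud accepts a 1D array and reverses it
print(np.flipud(vec))

# fliplr needs at least two dimensions
try:
    np.fliplr(vec)
    fliplr_accepts_1d = True
except ValueError:
    fliplr_accepts_1d = False
print(fliplr_accepts_1d)
```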

Q9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

The array_split() method in NumPy is used to split an array into a specified number of sub-arrays. It's a versatile tool for dividing data into smaller, more manageable chunks.

Functionality:

* Input: Takes an array and the desired number of splits as input.
* Output: Returns a list of sub-arrays, where each sub-array has approximately the same size.
* Handling uneven splits: If the array cannot be evenly divided, the first few sub-arrays will be larger than the rest.

Example:

In [None]:
array = np.arange(10)
subarrays = np.array_split(array, 3)
print(subarrays)

[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]


How array_split() handles uneven splits:

* For an array of length l split into n sections, the method computes the quotient l // n and the remainder l % n.
* The first l % n sub-arrays receive l // n + 1 elements each.
* The remaining sub-arrays receive l // n elements each, so the sizes never differ by more than one.

Example with uneven splits:

In [None]:
array = np.arange(11)
subarrays = np.array_split(array, 4)
print(subarrays)

[array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]), array([ 9, 10])]


As you can see, the first three sub-arrays have 3 elements each, while the last one has 2. This is because 11 elements cannot be evenly divided into 4 parts: 11 // 4 = 2 with a remainder of 3, so the first three sub-arrays each receive one extra element.

In conclusion, the array_split() method provides a convenient way to split arrays in NumPy, and it handles uneven splits by ensuring that the first few sub-arrays are slightly larger than the rest.
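The size rule can be written out and compared with what array_split() returns. The helper expected_sizes is a hypothetical name introduced for this sketch, not a NumPy function; the sketch also shows that np.split() raises an error for the same uneven request.

```python
import numpy as np

def expected_sizes(length, sections):
    """Sizes of the sub-arrays for a given length and number of sections (illustrative helper)."""
    base, extra = divmod(length, sections)
    # The first `extra` sub-arrays get one additional element
    return [base + 1] * extra + [base] * (sections - extra)

array = np.arange(11)
actual_sizes = [len(part) for part in np.array_split(array, 4)]
print(actual_sizes, expected_sizes(11, 4))

# np.split requires an exact division and raises ValueError otherwise
try:
    np.split(array, 4)
    split_raised = False
except ValueError:
    split_raised = True
print(split_raised)
```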

Q10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?


Vectorization :

* Definition:

 Vectorization is a technique in NumPy that allows you to perform operations on entire arrays without explicit loops. Instead of iterating over elements one by one, NumPy applies the operation to the entire array at once.

* Benefits:

 * Efficiency: Vectorized operations are significantly faster than equivalent Python loops, especially for large arrays. NumPy leverages highly optimized C code for these operations.
 * Readability: Vectorized code is often more concise and easier to understand than loop-based code.
 * Memory Efficiency: Vectorized operations work directly on the array's compact, typed buffer. Chained expressions can still allocate temporary arrays, which in-place operators (such as +=) or the out argument of ufuncs can avoid.

Example:

In [None]:
import numpy as np
# Create two arrays
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Vectorized addition
result = a + b
print(result)

[5 7 9]


Broadcasting :

* Definition:

 Broadcasting is a set of rules in NumPy that allows arrays of different shapes to be compatible for arithmetic operations. NumPy automatically expands the smaller array to match the shape of the larger array before performing the operation.
* Rules:
 * Shape Compatibility: Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension. Two dimensions are compatible when they are equal or when one of them is 1; a missing leading dimension is treated as 1.
 * Expansion: A dimension of size 1 (or a missing dimension) is stretched to match the other array. The stretching is virtual: the values are reused without actually being copied in memory.
* Benefits:
 * Flexibility: Broadcasting enables you to perform operations on arrays of different sizes without explicit reshaping or looping.
 * Efficiency: Broadcasting can often be more efficient than manual reshaping or looping, especially for large arrays.

Example:

In [None]:
a = np.array([1, 2, 3])
b = np.array([[4], [5], [6]])

# Broadcasting
result = a * b
print(result)

[[ 4  8 12]
 [ 5 10 15]
 [ 6 12 18]]


Combined Benefits of Vectorization and Broadcasting :

* Efficiency: Combining vectorization and broadcasting can lead to highly efficient array operations, especially for large datasets.
* Readability: The code becomes more concise and easier to understand.
* Flexibility: Broadcasting allows you to work with arrays of different shapes without explicit reshaping or looping.
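A common use of both ideas together is subtracting the column means from a 2D array. The sketch below compares an explicit loop with the broadcast expression; it assumes NumPy 1.20 or later for np.broadcast_shapes, and the data values are made up.

```python
import numpy as np

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0]])
col_means = data.mean(axis=0)  # shape (3,)

# Loop version: every element is handled by the Python interpreter
centered_loop = np.empty_like(data)
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        centered_loop[i, j] = data[i, j] - col_means[j]

# Vectorized version: shape (2, 3) minus shape (3,) broadcasts row by row
centered = data - col_means
print(centered)

# The shape that broadcasting would produce, without doing any arithmetic
result_shape = np.broadcast_shapes((3,), (3, 1))
print(result_shape)

# broadcast_to returns a read-only view; a stride of 0 means rows are reused, not copied
expanded = np.broadcast_to(col_means, data.shape)
print(expanded.strides)
```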

# Practical

In [None]:
import numpy as np

Q 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

In [None]:
array = np.random.randint(1, 101, (3, 3))

interchanged_array = array.T

print("Original array:")
print(array)

print("\nInterchanged array:")
print(interchanged_array)

Original array:
[[73 31 78]
 [45 49 77]
 [85 92 12]]

Interchanged array:
[[73 45 85]
 [31 49 92]
 [78 77 12]]


Q 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

In [None]:
arr = np.arange(10)

# 2x5 array
arr_2x5 = arr.reshape(2, 5)
print("2x5 array:\n", arr_2x5)

# 5x2 array
arr_5x2 = arr.reshape(5, 2)
print("5x2 array:\n", arr_5x2)

2x5 array:
 [[0 1 2 3 4]
 [5 6 7 8 9]]
5x2 array:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


Q 3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.


In [None]:
arr = np.random.rand(4, 4)

arr_with_border = np.pad(arr, pad_width=1, mode='constant', constant_values=0)

print("4x4 array with random float values:\n", arr)
print("\n6x6 array with a border of zeros:\n", arr_with_border)

4x4 array with random float values:
 [[0.89987697 0.40694717 0.06346448 0.21469006]
 [0.05600603 0.40871521 0.46372695 0.9205368 ]
 [0.15324445 0.37374971 0.29537574 0.2861751 ]
 [0.61584341 0.63464886 0.6234998  0.86364476]]

6x6 array with a border of zeros:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.89987697 0.40694717 0.06346448 0.21469006 0.        ]
 [0.         0.05600603 0.40871521 0.46372695 0.9205368  0.        ]
 [0.         0.15324445 0.37374971 0.29537574 0.2861751  0.        ]
 [0.         0.61584341 0.63464886 0.6234998  0.86364476 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


Q 4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [None]:
arr = np.arange(10, 61, 5)
print(arr)

[10 15 20 25 30 35 40 45 50 55 60]


Q 5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.

In [None]:
arr = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
arr_uppercase = np.char.upper(arr)
arr_lowercase = np.char.lower(arr)
arr_titlecase = np.char.title(arr)

print("Original array:", arr)
print("Uppercase:", arr_uppercase)
print("Lowercase:", arr_lowercase)
print("Title case:", arr_titlecase)

Original array: ['python' 'numpy' 'pandas']
Uppercase: ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase: ['python' 'numpy' 'pandas']
Title case: ['Python' 'Numpy' 'Pandas']


Q 6. Generate a NumPy array of words. Insert a space between each character of every word in the array.

In [None]:
words = np.array(['python', 'numpy', 'pandas'])
words_spaced = np.char.join(' ', words)
print(words_spaced)

['p y t h o n' 'n u m p y' 'p a n d a s']


Q 7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

In [None]:
# Create two 2D NumPy arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# addition
result_add = arr1 + arr2
print("Addition:\n", result_add)

# subtraction
result_subtract = arr1 - arr2
print("Subtraction:\n", result_subtract)

# multiplication
result_multiply = arr1 * arr2
print("Multiplication:\n", result_multiply)

# division
result_divide = arr1 / arr2
print("Division:\n", result_divide)

Addition:
 [[ 6  8]
 [10 12]]
Subtraction:
 [[-4 -4]
 [-4 -4]]
Multiplication:
 [[ 5 12]
 [21 32]]
Division:
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]


Q 8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

In [None]:
identity_matrix = np.eye(5)
diagonal_elements = np.diag(identity_matrix)
print("Identity matrix:\n", identity_matrix)
print("Diagonal elements:", diagonal_elements)

Identity matrix:
 [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Diagonal elements: [1. 1. 1. 1. 1.]


Q 9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

In [None]:
def is_prime(num):
  if num <= 1:
    return False
  if num <= 3:
    return True
  if num % 2 == 0 or num % 3 == 0:
    return False
  i = 5
  while i * i <= num:
    if num % i == 0 or num % (i + 2) == 0:
      return False
    i += 6
  return True

random_numbers = np.random.randint(0, 1001, 100)
prime_numbers = [num for num in random_numbers if is_prime(num)]
print("Prime numbers in the array:", prime_numbers)

Prime numbers in the array: [809, 829, 397, 659, 811, 563, 61, 293, 563, 137, 457]
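As an alternative to testing each number in a Python loop, the primes can be found with a boolean lookup table built by a sieve of Eratosthenes. This is a sketch of a different technique, not the approach used above; the function name primes_mask_up_to and the seed are assumptions for the example.

```python
import numpy as np

def primes_mask_up_to(limit):
    """Boolean array where mask[n] is True exactly when n is prime, for 0 <= n <= limit."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if mask[p]:
            # Cross out every multiple of p, starting at p * p
            mask[p * p::p] = False
    return mask

rng = np.random.default_rng(seed=0)
random_numbers = rng.integers(0, 1001, size=100)

# Look up each random number in the table, then keep the ones marked prime
is_prime_lookup = primes_mask_up_to(1000)
prime_numbers = random_numbers[is_prime_lookup[random_numbers]]
print("Prime numbers in the array:", prime_numbers)
```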


Q 10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

In [None]:
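daily_temperatures = np.random.randint(60, 90, 30)

# A 30-day month has 4 full 7-day weeks plus 2 leftover days
weekly_averages = daily_temperatures[:28].reshape(4, 7).mean(axis=1)
leftover_average = daily_temperatures[28:].mean()

print("Weekly averages (4 full weeks):")
print(weekly_averages)
print("Average of the 2 leftover days:", leftover_average)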
