**Theoretical Questions**

1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
enhance Python's capabilities for numerical operations?

Answer:
NumPy (Numerical Python) is a fundamental library for scientific computing and data analysis in Python. Its main purpose is to provide support for large, multi-dimensional arrays and matrices, along with a vast collection of mathematical functions to operate on these arrays efficiently. Here’s how it enhances Python’s capabilities for numerical operations:

* Efficient Array Operations: NumPy arrays (ndarrays) are much faster and more memory-efficient than Python lists, especially for large datasets. They allow element-wise operations, enabling efficient batch processing without loops.

* Vectorization: NumPy enables vectorized operations, where complex calculations are performed on entire arrays without writing explicit loops, making the code more concise and faster.

* Mathematical and Statistical Functions: NumPy provides an extensive suite of mathematical, statistical, and linear algebra functions, which are essential for scientific computations.

* Interoperability with Other Libraries: Many scientific libraries (like SciPy, Pandas, and TensorFlow) are built on top of or closely integrate with NumPy, leveraging its fast array computations.

* Memory Management: NumPy stores array elements in contiguous memory blocks using fixed-size data types. This reduces per-element overhead and makes better use of the CPU cache, leading to faster performance.

Overall, NumPy’s capabilities make Python a powerful tool for handling large data sets and performing complex calculations, supporting efficient and scalable scientific computing and data analysis.
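As a small illustration of the points above, the following sketch contrasts a plain Python loop with the equivalent vectorized NumPy expression. The variable names and sample values are chosen only for this example.

```python
import numpy as np

values = list(range(1, 6))

# Pure-Python approach: an explicit loop (here a list comprehension).
squares_list = [v * v for v in values]

# NumPy approach: one vectorized expression applied to the whole array.
arr = np.array(values)
squares_arr = arr * arr

print(squares_list)  # [1, 4, 9, 16, 25]
print(squares_arr)   # [ 1  4  9 16 25]

# Built-in statistical functions work on the whole array at once.
print(arr.mean())    # 3.0
```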

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the
other?

Answer:
Both np.mean() and np.average() are used in NumPy to calculate the central tendency of data, but they have subtle differences in functionality and use cases:

1. Basic Functionality:

* np.mean(): Calculates the arithmetic mean of the elements along a specified axis. It computes the simple average without considering any weights.
* np.average(): Calculates a weighted average if weights are provided; otherwise, it returns the same result as np.mean().

2. Parameters and Usage:

np.mean():

* Syntax: np.mean(array, axis=None, dtype=None, out=None)
* No support for weights; simply takes an array and computes its mean.
* Useful when you need an unweighted mean and don’t require additional flexibility with weights.

np.average():

* Syntax: np.average(array, axis=None, weights=None, returned=False)
* Accepts an additional weights parameter, allowing each element to have a different level of importance in the mean calculation.
* Can return a tuple with the average and the sum of weights if returned=True.

3. Example Comparison:

```
import numpy as np

data = np.array([1, 2, 3, 4])
weights = np.array([1, 2, 3, 4])

# Using np.mean()
mean_result = np.mean(data)  # Output: 2.5

# Using np.average() without weights
average_no_weights = np.average(data)  # Output: 2.5

# Using np.average() with weights
weighted_average = np.average(data, weights=weights)  # Output: 3.0
```

4. When to Use Each:

* Use np.mean(): When you need a straightforward, unweighted average of your data.
* Use np.average(): When you need a weighted average or when the weights of data points are relevant to your calculation (e.g., combining data from different sources with varying importance).

In summary, np.mean() is simpler and suitable for typical mean calculations, while np.average() offers more flexibility with weights, making it preferable when weighting is needed.
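The axis and returned parameters of np.average() can be sketched as follows. The scores and weights are hypothetical values invented for this example (rows as students, columns as exams).

```python
import numpy as np

# Hypothetical data: each row is a student, each column an exam.
scores = np.array([[80, 90],
                   [70, 100]])
exam_weights = np.array([1, 3])  # second exam counts three times as much

# Weighted average per student (axis=1), also returning the sum of weights.
avg, weight_sum = np.average(scores, axis=1, weights=exam_weights, returned=True)

print(avg)         # [87.5 92.5]
print(weight_sum)  # [4. 4.]
```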

3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D
arrays.

Answer:
To reverse a NumPy array, you can use slicing or np.flip():

1D Array:

Use slicing [::-1] to reverse the array.
```
arr_1d = np.array([1, 2, 3, 4, 5])
reversed_1d = arr_1d[::-1]  # Output: [5, 4, 3, 2, 1]
```

2D Array:

Reverse Rows: array[::-1, :]

Reverse Columns: array[:, ::-1]

Reverse Both: array[::-1, ::-1]

```
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
reversed_rows = arr_2d[::-1, :]      # Reverses rows
reversed_columns = arr_2d[:, ::-1]   # Reverses columns
reversed_both = arr_2d[::-1, ::-1]   # Reverses both
```

Using np.flip(): Reverses along a specific axis or all axes.

```
flipped = np.flip(arr_2d, axis=0)      # Reverse the order of rows (axis 0)
flipped_both = np.flip(arr_2d)         # Reverse along all axes
```

Summary: Use slicing for quick reversal, or np.flip() for clear axis-specific control.
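One detail worth checking is that both approaches return views rather than copies. The sketch below, using a small sample array chosen for this example, shows that np.flip() and slicing give the same result and share memory with the original array.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

flipped_cols = np.flip(arr_2d, axis=1)
sliced_cols = arr_2d[:, ::-1]

# Both approaches produce the same reversed columns.
print(np.array_equal(flipped_cols, sliced_cols))  # True

# The flipped result is a view on the original data, not a copy.
print(np.shares_memory(arr_2d, flipped_cols))     # True

# Call .copy() when an independent array is needed.
independent = flipped_cols.copy()
print(np.shares_memory(arr_2d, independent))      # False
```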








4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
in memory management and performance.

Answer:
To determine the data type of elements in a NumPy array, you can use the .dtype attribute:

```
import numpy as np

arr = np.array([1, 2, 3])
print(arr.dtype)  # Output: int64 (or another integer type depending on your system)
```


Importance of Data Types in Memory Management and Performance
1. Memory Efficiency:

* Each data type in NumPy is associated with a specific amount of memory. For example, int32 uses 4 bytes, while float64 uses 8 bytes.
* Choosing the appropriate data type can save memory, especially for large datasets. For instance, using int8 instead of int64 for small integers reduces memory consumption significantly.

2. Performance Optimization:

* Data types affect computational speed. Operations on smaller data types (e.g., float32) are often faster than on larger types (e.g., float64) because less data moves through memory and more elements fit in the CPU cache, although the gain depends on the operation and the hardware.
* Using the smallest data type that can accurately represent your data can enhance performance, especially in high-volume computations common in scientific computing.

3. Avoiding Overflow and Precision Errors:

* Selecting the right data type helps avoid overflow (e.g., in integer operations) and precision issues (e.g., in floating-point calculations).
* For instance, if your calculations require high precision, you should use float64 rather than float32 to reduce rounding errors.
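The memory and overflow points above can be checked with a short sketch. The array size and sample values are arbitrary choices for illustration.

```python
import numpy as np

n = 1_000_000
as_int64 = np.zeros(n, dtype=np.int64)
as_int8 = np.zeros(n, dtype=np.int8)

# itemsize is bytes per element; nbytes is the total size of the data buffer.
print(as_int64.itemsize, as_int64.nbytes)  # 8 8000000
print(as_int8.itemsize, as_int8.nbytes)    # 1 1000000

# int8 can only hold values from -128 to 127, so sums can wrap around.
small = np.array([100, 120], dtype=np.int8)
wrapped = small + small
print(wrapped)                             # [-56 -16]

# Converting to a wider type before the operation avoids the overflow.
widened = small.astype(np.int16) + small
print(widened)                             # [200 240]
```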


5.  Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

Answer:

In NumPy, an ndarray (n-dimensional array) is the primary data structure used for numerical computations. It's a multi-dimensional, fixed-size array with elements of the same data type.

Key Features of ndarrays:
1. Homogeneous Elements: All elements have the same data type, which enhances performance and memory efficiency.
2. Multi-dimensional: Supports multiple dimensions (1D, 2D, etc.), useful for handling complex data structures like matrices.
3. Fast and Memory-efficient: NumPy arrays are stored in contiguous memory, making them faster and more memory-efficient than lists.
4. Vectorized Operations: Allows element-wise operations without loops, enabling efficient mathematical computations.
5. Broadcasting: Automatically expands dimensions to perform operations on arrays of different shapes.


Difference from Python Lists:
1. Performance: ndarrays are faster due to contiguous memory storage and support for vectorized operations.
2. Fixed Size and Type: Unlike lists, they have a fixed size and require all elements to be the same type.
3. Built-in Mathematical Operations: NumPy provides extensive functions for array manipulation and mathematical operations that lists lack.
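A brief sketch of these differences, using small sample values picked for this example:

```python
import numpy as np

py_list = [1, 2, 3]
nd = np.array(py_list)

# The * operator repeats a list but multiplies an ndarray element-wise.
print(py_list * 2)  # [1, 2, 3, 1, 2, 3]
print(nd * 2)       # [2 4 6]

# A list can hold mixed types; an ndarray converts everything to one dtype.
coerced = np.array([1, 2.5, 3])
print(coerced.dtype)  # float64

# ndarrays carry shape and dimensionality information.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim, matrix.shape)  # 2 (2, 3)
```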

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations

Answer:
NumPy arrays provide significant performance benefits over Python lists for large-scale numerical operations due to the following reasons:

1. Contiguous Memory Allocation:

* NumPy arrays store their element values in one contiguous memory block, allowing the CPU to access elements quickly. A Python list instead stores pointers to separate objects that may be scattered across memory, so each access requires an extra lookup.
* This reduces memory overhead and improves data retrieval speed.

2. Homogeneous Data Type:

* All elements in a NumPy array have the same data type, which reduces the need for type checking during operations, leading to faster execution.
* Python lists can contain elements of different types, which adds overhead during iteration and arithmetic operations.

3. Vectorized Operations:

* NumPy supports vectorized operations, meaning you can perform arithmetic directly on entire arrays without explicit loops, leveraging optimized C and Fortran code underneath.
* For instance, adding two large arrays element-wise is significantly faster in NumPy than using a loop in Python.

4. Broadcasting:

* NumPy's broadcasting allows operations on arrays of different shapes without manually resizing them, reducing the complexity and time taken for certain operations.

5. Efficient Memory Usage:

* NumPy arrays use less memory compared to lists, as they don’t store type or pointer information for each element separately.
* This is crucial for large datasets, where memory efficiency directly affects performance.
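A rough way to observe the speed difference is to time element-wise addition with the standard timeit module. This is only a sketch: the array size and repetition count are arbitrary, and the measured times depend on the machine, so no expected timings are shown.

```python
import timeit
import numpy as np

n = 100_000
list_a = list(range(n))
list_b = list(range(n))
arr_a = np.arange(n)
arr_b = np.arange(n)

def add_lists():
    # Element-wise addition with an explicit Python-level loop.
    return [x + y for x, y in zip(list_a, list_b)]

def add_arrays():
    # The same addition as a single vectorized operation.
    return arr_a + arr_b

list_time = timeit.timeit(add_lists, number=20)
array_time = timeit.timeit(add_arrays, number=20)

print(f"Python lists: {list_time:.4f} s")
print(f"NumPy arrays: {array_time:.4f} s")
```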

7.  Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and
output

Answer:

In NumPy, vstack() and hstack() are functions used to stack arrays along different axes:

1. np.vstack():

* Stacks arrays vertically (row-wise).
* Takes arrays with the same number of columns and joins them along the first axis (rows). 1D arrays of length N are treated as rows of shape (1, N).

2. np.hstack():

* Stacks arrays horizontally (column-wise).
* Takes arrays with the same number of rows and joins them along the second axis (columns). 1D arrays are simply concatenated end to end.

Examples

```
import numpy as np

# Define two sample arrays
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
```

Using np.vstack()
Stacks arr1 and arr2 vertically, adding rows.

```
vstack_result = np.vstack((arr1, arr2))
print("Vertical Stack (vstack):\n", vstack_result)
# Output:
# [[1 2]
#  [3 4]
#  [5 6]
#  [7 8]]
```


Using np.hstack()
Stacks arr1 and arr2 horizontally, adding columns.


```
hstack_result = np.hstack((arr1, arr2))
print("Horizontal Stack (hstack):\n", hstack_result)
# Output:
# [[1 2 5 6]
#  [3 4 7 8]]
```


Key Differences
* vstack(): Extends the array by adding rows; arrays must have the same number of columns.
* hstack(): Extends the array by adding columns; arrays must have the same number of rows.
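The behavior with 1D inputs differs from the 2D case and is easy to overlook. A minimal sketch, with sample arrays chosen for this example:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# vstack treats each 1D array as a row, so the result is 2D.
v = np.vstack((x, y))
print(v.shape)  # (2, 3)

# hstack concatenates 1D arrays end to end, so the result stays 1D.
h = np.hstack((x, y))
print(h)        # [1 2 3 4 5 6]

# For 2D inputs, both functions match np.concatenate along axis 0 or 1.
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
print(np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0)))  # True
print(np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1)))  # True
```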


8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various
array dimensions

Answer:
In NumPy, fliplr() and flipud() are functions used to reverse the order of elements in arrays along specific axes:

np.fliplr():

* Stands for "flip left to right".
* Reverses the order of columns (horizontal flip) in a 2D array; for higher-dimensional arrays it reverses the second axis (axis=1).
* Requires an array with at least two dimensions; a 1D input raises a ValueError.

Effect:

Each row remains the same, but the columns within each row are reversed.

```
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

result = np.fliplr(arr)
print("fliplr:\n", result)
# Output:
# [[3 2 1]
#  [6 5 4]
#  [9 8 7]]
```

np.flipud():

* Stands for "flip up to down".
* Reverses the order of rows (vertical flip) in a 2D array, or the first axis (axis=0) for higher-dimensional arrays.
* Works on arrays with one or more dimensions; a 1D array is simply reversed.


Effect:

Each column remains the same, but the rows are reversed.

```
result = np.flipud(arr)
print("flipud:\n", result)
# Output:
# [[7 8 9]
#  [4 5 6]
#  [1 2 3]]
```
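The effect on other array dimensions can be sketched as follows, using a 1D vector and a small 3D array made up for this example:

```python
import numpy as np

vec = np.array([1, 2, 3])

# flipud works on a 1D array and simply reverses it.
print(np.flipud(vec))  # [3 2 1]

# fliplr needs at least two dimensions, so a 1D input raises ValueError.
try:
    np.fliplr(vec)
    fliplr_error = None
except ValueError as exc:
    fliplr_error = str(exc)
print("fliplr on 1D failed:", fliplr_error is not None)  # fliplr on 1D failed: True

# On a 3D array, flipud reverses axis 0 and fliplr reverses axis 1.
cube = np.arange(8).reshape(2, 2, 2)
print(np.array_equal(np.flipud(cube), np.flip(cube, axis=0)))  # True
print(np.array_equal(np.fliplr(cube), np.flip(cube, axis=1)))  # True
```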

9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

Answer:

The array_split() method in NumPy is used to divide an array into multiple sub-arrays. This method allows for flexible splitting of arrays, making it useful for tasks where data needs to be segmented for analysis or processing.

Functionality of array_split()

Syntax:

```
numpy.array_split(ary, indices_or_sections, axis=0)
```

Parameters:

* ary: The input array to be split.
* indices_or_sections:
  * An integer specifying the number of sections to split the array into. The sections are equal in size when the length divides evenly, and nearly equal otherwise.
  * Alternatively, a sorted 1D array of indices specifying where to split the array.
* axis: The axis along which to split the array. The default is 0 (vertical split for 2D arrays).

Handling Uneven Splits

When the array cannot be split evenly, array_split() handles the remainder by distributing the extra elements among the resulting sub-arrays. Here’s how it works:

If you specify an integer for indices_or_sections, NumPy will split the array into that many parts. If the array length is not perfectly divisible by the specified number, the leading sub-arrays get one extra element: for an array of length l split into n sections, the first l % n sub-arrays have l // n + 1 elements and the rest have l // n.

If you provide specific indices for splitting, it will cut the array at those indices, regardless of whether the resulting sub-arrays are of equal size.

Example Usage
Here’s an example illustrating the use of array_split():

```
import numpy as np

# Create an example array
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# Split into 3 parts
split_even = np.array_split(arr, 3)
print("Split into 3 parts:")
for i, sub_arr in enumerate(split_even):
    print(f"Part {i+1}: {sub_arr}")

# Split using specific indices
split_indices = np.array_split(arr, [2, 5])  # Split at indices 2 and 5
print("\nSplit using specific indices:")
for i, sub_arr in enumerate(split_indices):
    print(f"Sub-array {i+1}: {sub_arr}")
```

Output

Split into 3 parts:

Part 1: [1 2 3]

Part 2: [4 5 6]

Part 3: [7 8 9]

Split using specific indices:

Sub-array 1: [1 2]

Sub-array 2: [3 4 5]

Sub-array 3: [6 7 8 9]
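To see the uneven case directly, the sketch below splits a 10-element array (a length chosen for this example) into 3 parts and contrasts array_split() with np.split(), which requires an even division.

```python
import numpy as np

arr = np.arange(1, 11)  # 10 elements: 1 through 10

# 10 is not divisible by 3, so the first 10 % 3 = 1 part gets an extra element.
parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]
print(sizes)  # [4, 3, 3]
for p in parts:
    print(p)

# np.split() raises an error when the division is not exact.
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print(split_failed)  # True
```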

10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array
operations?

Answer:
Vectorization and broadcasting are two powerful concepts in NumPy that enhance the efficiency of array operations, making computations faster and more intuitive.

Vectorization

Definition: Vectorization refers to the ability to perform operations on entire arrays or large datasets without the need for explicit loops in Python. Instead of iterating through elements one by one, NumPy allows operations to be applied directly to whole arrays at once.

Benefits:

* Performance: Operations are executed in optimized, low-level C code, which is significantly faster than Python loops.
* Simplicity: Code is more concise and readable, reducing complexity in array manipulation.

Example: Instead of using a loop to add two arrays element-wise, you can directly use:

```
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Vectorized addition
result = a + b
print(result)  # Output: [5 7 9]
```

Broadcasting

Definition: Broadcasting is a technique that allows NumPy to perform operations on arrays of different shapes and sizes. When the shapes of the arrays are compatible, NumPy automatically expands the smaller array to match the shape of the larger one during the operation.

Rules for Broadcasting:

* If the arrays have different numbers of dimensions, the shape of the smaller-dimensional array is padded with ones on the left until both shapes are the same.
* The sizes of the dimensions are compared element-wise, starting from the trailing (rightmost) dimension. Two dimensions are compatible when:
  * they are equal, or
  * one of them is 1 (which allows for expansion).

Benefits:

* Flexibility: Allows operations between arrays of different shapes without needing to manually reshape them.
* Efficiency: Reduces memory overhead because the smaller array is not physically copied out to the larger shape. NumPy reuses the existing data during the operation, and only the result is allocated as a new array.

Example: Adding a scalar to a 1D array automatically broadcasts the scalar to each element of the array:

```
c = np.array([1, 2, 3])
scalar = 5

# Broadcasting the scalar addition
result_broadcast = c + scalar
print(result_broadcast)  # Output: [6 7 8]
```

Or for two arrays of different shapes:

```
d = np.array([[1], [2], [3]])  # Shape (3, 1)
e = np.array([4, 5, 6])        # Shape (3,)

# Broadcasting to add
result_broadcast_multi = d + e
print(result_broadcast_multi)
# Output:
# [[5 6 7]
#  [6 7 8]
#  [7 8 9]]
```
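The broadcasting rules and the "no copy" behavior can be checked with a short sketch. It assumes a NumPy version that provides np.broadcast_shapes (1.20 or later); the shapes are sample values for illustration.

```python
import numpy as np

# Shapes (3, 1) and (3,) are compatible: (3,) is treated as (1, 3).
print(np.broadcast_shapes((3, 1), (3,)))  # (3, 3)

# Shapes (3, 2) and (3,) are not: the trailing sizes 2 and 3 differ.
try:
    np.broadcast_shapes((3, 2), (3,))
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)  # True

# broadcast_to exposes the expanded array as a read-only view.
row = np.array([4, 5, 6])
expanded = np.broadcast_to(row, (3, 3))
print(expanded.strides[0])               # 0 (each "row" reuses the same memory)
print(np.shares_memory(row, expanded))   # True
```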


**Practical Questions:**

1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

In [2]:
import numpy as np
a=np.random.randint(1,101,size=(3,3))
b=a.T
print("original array")
print(a)
print("transposed array")
print(b)

original array
[[ 8  3 68]
 [26 67 95]
 [82 43 99]]
transposed array
[[ 8 26 82]
 [ 3 67 43]
 [68 95 99]]


2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

In [3]:
import numpy as np
a=np.random.randint(1,11,size=(10))
print(a)
b=a.reshape(2,5)
print(b)
c=b.reshape(5,2)
print(c)

[ 4  3  7  9 10 10  2  8  4  8]
[[ 4  3  7  9 10]
 [10  2  8  4  8]]
[[ 4  3]
 [ 7  9]
 [10 10]
 [ 2  8]
 [ 4  8]]


3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.

In [4]:
import numpy as np
a=np.random.rand(4,4)
print(a)
b=np.pad(a,pad_width=1,mode='constant',constant_values=0)
print(b)

[[0.56267405 0.68805937 0.73161797 0.47929254]
 [0.7335037  0.33086599 0.18867488 0.61028296]
 [0.31293938 0.20276326 0.02937456 0.28094606]
 [0.65203472 0.25941178 0.60090147 0.16539991]]
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.56267405 0.68805937 0.73161797 0.47929254 0.        ]
 [0.         0.7335037  0.33086599 0.18867488 0.61028296 0.        ]
 [0.         0.31293938 0.20276326 0.02937456 0.28094606 0.        ]
 [0.         0.65203472 0.25941178 0.60090147 0.16539991 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


4. Using NumPy, create an array of integers from 10 to 60 with a step of 5

In [5]:
import numpy as np
a=np.arange(10,61,5)
print(a)

[10 15 20 25 30 35 40 45 50 55 60]


5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations
(uppercase, lowercase, title case, etc.) to each element.

In [6]:
import numpy as np
a=np.array(['python', 'numpy', 'pandas'])
print(a)
b=np.char.upper(a)
print(b)
c=np.char.lower(a)
print(c)
d=np.char.title(a)
print(d)

['python' 'numpy' 'pandas']
['PYTHON' 'NUMPY' 'PANDAS']
['python' 'numpy' 'pandas']
['Python' 'Numpy' 'Pandas']


6. Generate a NumPy array of words. Insert a space between each character of every word in the array.

In [7]:
import numpy as np
a=np.array(['python', 'numpy', 'pandas'])
b=np.char.join(' ',a)
print(b)


['p y t h o n' 'n u m p y' 'p a n d a s']


7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

In [8]:
import numpy as np
a=np.random.randint(1,10,size=(3,3))
b=np.random.randint(1,10,size=(3,3))
print(a)
print(b)
c=a+b
print(c)
d=a-b
print(d)
e=a*b
print(e)
f=a/b
print(f)

[[6 5 8]
 [3 8 4]
 [3 4 1]]
[[3 1 3]
 [6 1 4]
 [2 5 9]]
[[ 9  6 11]
 [ 9  9  8]
 [ 5  9 10]]
[[ 3  4  5]
 [-3  7  0]
 [ 1 -1 -8]]
[[18  5 24]
 [18  8 16]
 [ 6 20  9]]
[[2.         5.         2.66666667]
 [0.5        8.         1.        ]
 [1.5        0.8        0.11111111]]


8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

In [10]:
import numpy as np
a=np.eye(5,dtype=int)
print(a)
b=np.diag(a)
print(b)

[[1 0 0 0 0]
 [0 1 0 0 0]
 [0 0 1 0 0]
 [0 0 0 1 0]
 [0 0 0 0 1]]
[1 1 1 1 1]


9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in
this array.

In [11]:
import numpy as np
a=np.random.randint(0,1001,size=(100))
for i in a:
    if i>1:
        # Trial division: test every divisor from 2 up to sqrt(i)
        for j in range(2,int(i**0.5)+1):
            if i%j==0:
                break
        else:
            # No divisor found, so i is prime
            print(i)

313
241
3
919
2
727
389
367
379
607
2
719
907
421
619
401
883
761
641
313
607


10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly
averages.

In [15]:
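import numpy as np
# Simulate 28 daily temperatures (four full weeks) drawn uniformly from 0 to 50
a=np.random.uniform(0,50,size=(28))
# One row per week, one column per day
b=a.reshape(4,7)
# Average across each row to get the weekly means
c=np.mean(b,axis=1)
print(c)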


[15.34120327 28.17890329 19.53675682 17.92007955]
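The reshape approach needs a day count that is a multiple of 7. For a 30-day month, one possible variation is to cut the data every 7 days with np.array_split() and average each piece, leaving a shorter final week. The temperatures below are placeholder values (0 through 29) chosen so the result is easy to verify by hand.

```python
import numpy as np

# Placeholder data: 30 "temperatures" with values 0.0 through 29.0.
temps = np.arange(30, dtype=float)

# Split at days 7, 14, 21 and 28, giving four full weeks and a 2-day remainder.
weeks = np.array_split(temps, range(7, len(temps), 7))
week_lengths = [len(w) for w in weeks]
weekly_means = [float(w.mean()) for w in weeks]

print(week_lengths)  # [7, 7, 7, 7, 2]
print(weekly_means)  # [3.0, 10.0, 17.0, 24.0, 28.5]
```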
