**Theoretical Questions**

**1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
enhance Python's capabilities for numerical operations?**

Ans:

NumPy (Numerical Python) is a library for working with arrays and mathematical operations in Python. Its primary purpose is to provide support for large, multi-dimensional arrays and matrices, along with a wide range of high-performance mathematical functions to manipulate them.

Purpose:


1. Numerical Computing: NumPy is designed for efficient numerical computation, providing data structures and operations for scientific computing, data analysis, and machine learning.
2. Array Operations: It provides support for large, multi-dimensional arrays and matrices, enabling vectorized operations and reducing the need for loops.
3. Mathematical Functions: NumPy offers a wide range of high-performance mathematical functions, including basic arithmetic, trigonometric, exponential, and statistical functions.

Advantages:



1. Speed: NumPy's optimized C code and vectorized operations provide significant speedups compared to Python's built-in data structures.
2. Memory Efficiency: NumPy arrays require less memory than Python lists, making them ideal for large datasets.
3. Convenience: NumPy's array-based operations simplify complex numerical computations.
4. Integration: NumPy integrates seamlessly with other popular scientific computing libraries in Python, such as Pandas, SciPy, and Matplotlib.

Enhancing Python's Capabilities:

1. Multi-Dimensional Arrays: NumPy introduces multi-dimensional arrays, enabling complex numerical computations.
2. Vectorized Operations: NumPy's array-based operations replace loops, improving performance and readability.
3. Matrix Operations: NumPy provides efficient matrix multiplication, decomposition, and other linear algebra operations.
4. Statistical Functions: NumPy offers a range of statistical functions, including mean, median, standard deviation, and correlation.
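
As a rough illustration of the vectorization point above, the sketch below computes the same element-wise product twice: once with a plain Python loop and once with a single NumPy expression. The sample prices and quantities are made up for the example.

```python
import numpy as np

# Hypothetical sample data: unit prices and quantities for five items.
prices = [2.5, 4.0, 1.25, 10.0, 3.0]
quantities = [4, 1, 8, 2, 5]

# Pure-Python version: an explicit loop over paired elements.
totals_loop = [p * q for p, q in zip(prices, quantities)]

# NumPy version: one vectorized expression, no explicit loop.
prices_arr = np.array(prices)
quantities_arr = np.array(quantities)
totals_vec = prices_arr * quantities_arr

print(totals_loop)
print(totals_vec)
print(totals_vec.sum())  # 59.0
```

Both versions produce the same numbers; the NumPy form is shorter and runs its loop in compiled code.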

**2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the
other?**

Ans:

np.mean() and np.average() are two commonly used functions in NumPy for calculating the central tendency of a dataset. While they appear similar, there are subtle differences between them.

np.mean()

np.mean() calculates the arithmetic mean of an array, which is the sum of all elements divided by the number of elements.

np.average()

np.average() calculates the weighted average of an array. The weights are optional; without them, the result equals that of np.mean().

Key differences:

1. Weights: np.average() allows for optional weights, whereas np.mean() does not.
2. Axis: Both functions accept an optional axis argument; when it is omitted, each computes its result over the flattened array.
3. Returned value: np.mean() returns only the mean. np.average() returns the average and, when called with returned=True, also the sum of weights as a second value.

Usage:

1. Use np.mean():
    - When calculating the simple arithmetic mean.
    - When working with unweighted data.
2. Use np.average():
    - When calculating weighted averages.
    - When working with data that has varying importance or probabilities.
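
A minimal sketch of the difference, using made-up scores and weights: without weights the two functions agree, and np.average() only diverges once weights are supplied.

```python
import numpy as np

scores = np.array([80.0, 90.0, 100.0])
weights = np.array([1.0, 1.0, 2.0])  # hypothetical weights: last score counts double

simple_mean = np.mean(scores)                        # (80 + 90 + 100) / 3
unweighted_avg = np.average(scores)                  # same as np.mean()
weighted_avg = np.average(scores, weights=weights)   # (80 + 90 + 2*100) / 4

# returned=True additionally gives back the sum of the weights.
avg, weight_sum = np.average(scores, weights=weights, returned=True)

print(simple_mean, unweighted_avg)  # 90.0 90.0
print(weighted_avg, weight_sum)     # 92.5 4.0

# Both functions take an optional axis argument.
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
col_means = np.mean(matrix, axis=0)                       # mean of each column
row_avgs = np.average(matrix, axis=1, weights=[1.0, 3.0])  # weighted per row
print(col_means)
print(row_avgs)
```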

**3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D
arrays.**

Ans:

Reversing a NumPy array can be achieved using the following methods:

1. np.flip()

Reverses the elements of an array along the specified axis.

2. np.flipud()

Reverses the elements of an array along the 0th axis (vertical flip).

3. np.fliplr()

Reverses the elements of an array along the 1st axis (horizontal flip). The input must be at least 2D.

4. Slicing (array[::-1])

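Reverses the array along its first axis with array[::-1], or along any chosen axis by placing ::-1 in that axis's position (e.g., array[:, ::-1] for the columns of a 2D array).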




# Examples:

# 1D Array:

In [4]:

import numpy as np

# Create a 1D array
arr_1d = np.array([1, 2, 3, 4, 5])
print(arr_1d)

[1 2 3 4 5]


In [5]:
# Reverse using np.flip()
reversed_arr_1d = np.flip(arr_1d)
print(reversed_arr_1d)

[5 4 3 2 1]


In [6]:
# Reverse using slicing
reversed_arr_1d_slice = arr_1d[::-1]
print(reversed_arr_1d_slice)

[5 4 3 2 1]


# 2D Array:

In [8]:
# Create a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_2d

array([[1, 2, 3],
       [4, 5, 6]])

In [9]:
# Reverse along 0th axis (vertical flip) using np.flipud()
reversed_arr_2d_vert = np.flipud(arr_2d)
print(reversed_arr_2d_vert)

[[4 5 6]
 [1 2 3]]


In [10]:
# Reverse along 1st axis (horizontal flip) using np.fliplr()
reversed_arr_2d_horiz = np.fliplr(arr_2d)
print(reversed_arr_2d_horiz)

[[3 2 1]
 [6 5 4]]


In [11]:
# Reverse using np.flip() along 0th axis
reversed_arr_2d = np.flip(arr_2d, axis=0)
print(reversed_arr_2d)

[[4 5 6]
 [1 2 3]]


In [12]:
# Reverse using slicing along 0th axis
reversed_arr_2d_slice = arr_2d[::-1, :]
print(reversed_arr_2d_slice)

[[4 5 6]
 [1 2 3]]


**4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
in memory management and performance.**

Ans:

Data types play a crucial role in:

1. Memory Management: Efficient memory allocation and usage depend on the chosen data type.

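2. Performance: Smaller data types move less data through memory and CPU caches, and operations on arrays with matching data types avoid implicit type conversion.

    - Mixing data types (e.g., int32 with float64) forces NumPy to upcast before computing.
    - Using smaller data types (e.g., int32 instead of int64) reduces memory usage, at the cost of a smaller representable range.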

3. Precision and Accuracy: Choosing the right data type ensures accurate calculations.

    - Floating-point data types (float32, float64) for decimal numbers.
    - Integer data types (int32, int64) for whole numbers.

4. Compatibility: Ensuring data type consistency when interacting with other libraries or languages.

Common NumPy Data Types:

- Integer: int8, int16, int32, int64
- Unsigned Integer: uint8, uint16, uint32, uint64
- Floating Point: float32, float64
- Complex: complex64, complex128
- Boolean: bool_
- String: S (fixed-length byte strings), U (fixed-length Unicode strings), object (arbitrary Python objects, including variable-length strings)


We can determine the data type of elements in a NumPy array using its dtype attribute:

In [13]:
#array.dtype



import numpy as np
array = np.array([1, 2, 3])
print(array.dtype)

int64
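
To make the memory argument concrete, the following sketch (array contents chosen arbitrarily) stores the same 1000 values with two integer types and compares their footprint through the itemsize and nbytes attributes. It also shows NumPy picking a common type for mixed input.

```python
import numpy as np

values = np.arange(1000)  # default integer dtype depends on the platform

# The same values stored with a 2-byte and an 8-byte integer type.
small = values.astype(np.int16)
large = values.astype(np.int64)

print(small.dtype, small.itemsize, small.nbytes)  # int16 2 2000
print(large.dtype, large.itemsize, large.nbytes)  # int64 8 8000

# Mixed integer/float input is upcast to a single floating-point dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64

# A dtype can also be requested explicitly at creation time.
explicit = np.array([1, 2, 3], dtype=np.float32)
print(explicit.dtype)  # float32
```

The int16 version uses a quarter of the memory, but it can only hold values from -32768 to 32767, so the choice is a trade-off between footprint and range.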


**5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?**

Ans:

In NumPy, an ndarray (N-dimensional array) is a multi-dimensional collection of values of the same data type stored in a contiguous block of memory. ndarrays are the core data structure in NumPy, providing efficient and flexible numerical computation capabilities.

Key Features of ndarrays:

1. Multi-dimensionality: ndarrays can have any number of dimensions, allowing for representation of vectors, matrices, and higher-dimensional data.
2. Homogeneous data type: All elements in an ndarray must have the same data type, ensuring efficient memory usage and computation.
3. Contiguous memory allocation: ndarrays store elements in contiguous memory locations, enabling fast access and manipulation.
4. Vectorized operations: ndarrays support element-wise operations, eliminating the need for loops.
5. Broadcasting: ndarrays can be combined in operations with arrays of different but compatible shapes.

Comparison with Standard Python Lists:

|  | NumPy ndarrays | Python Lists |
| --- | --- | --- |
| Data Type | Homogeneous | Heterogeneous |
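| Memory Layout | Typically a contiguous block of raw values | Array of references to separately allocated objects |
| Performance | Optimized for numerical computations | General-purpose |
| Dimensionality | Multi-dimensional (shape enforced) | One-dimensional; nested lists only emulate higher dimensions |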
| Vectorized Operations | Supported | Not supported |
| Memory Usage | Efficient | Less efficient |
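
A short sketch of how these differences show up in practice (all values are illustrative): the same operators mean different things for lists and arrays, and arrays coerce their elements to one dtype.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

# + and * on lists concatenate and repeat; on arrays they act element-wise.
list_sum = py_list + py_list   # [1, 2, 3, 1, 2, 3]
arr_sum = np_arr + np_arr      # [2 4 6]
list_twice = py_list * 2       # [1, 2, 3, 1, 2, 3]
arr_twice = np_arr * 2         # [2 4 6]

# A list keeps each element's own type; an array upcasts to a common dtype.
mixed_list = [1, 2.5, 3]
mixed_arr = np.array(mixed_list)
print([type(x).__name__ for x in mixed_list])  # ['int', 'float', 'int']
print(mixed_arr.dtype)                         # float64

# An array knows its shape and number of dimensions.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.shape, matrix.ndim)               # (2, 3) 2
```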

**6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.**

Ans:

NumPy arrays offer significant performance benefits over Python lists for large-scale numerical operations due to their design and implementation.

Reasons for Performance Benefits:

1. Contiguous Memory Allocation: NumPy arrays store elements in contiguous memory locations, enabling fast access and manipulation.
2. Vectorized Operations: NumPy arrays support element-wise operations, eliminating the need for loops.
3. Homogeneous Data Type: NumPy arrays ensure all elements have the same data type, reducing memory usage and improving computation efficiency.
4. Compiled C Code: NumPy's core operations are implemented in compiled C code, providing a significant speedup.
5. Cache Locality: Because elements sit next to each other in memory, sequential access makes good use of CPU caches and minimizes cache misses.

Performance Comparison:

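The asymptotic complexity of these operations is the same for both structures. NumPy's advantage is a much smaller constant factor, because the work runs in compiled code over raw values instead of in the interpreter over Python objects.

| Operation | NumPy Array | Python List |
| --- | --- | --- |
| Element-wise Addition | O(n), one vectorized call in C | O(n), interpreted loop or comprehension |
| Element-wise Multiplication | O(n), one vectorized call in C | O(n), interpreted loop or comprehension |
| Matrix Multiplication | O(n^3), typically delegated to an optimized BLAS routine for floating-point data | O(n^3), nested Python loops |
| Sorting | O(n log n) on raw values | O(n log n) (Timsort) on Python objects |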
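
One way to see the difference is to time the same element-wise addition both ways. The sketch below uses the standard timeit module; the array size and repeat count are arbitrary choices, and the measured times depend on the machine and NumPy build, so no specific numbers are claimed here.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary problem size for the comparison
py_list = list(range(n))
np_arr = np.arange(n)


def add_lists():
    # Interpreted loop: one Python-level addition per element.
    return [x + x for x in py_list]


def add_arrays():
    # Vectorized: a single call that loops in compiled code.
    return np_arr + np_arr


list_time = timeit.timeit(add_lists, number=20)
array_time = timeit.timeit(add_arrays, number=20)

print(f"list comprehension: {list_time:.4f} s")
print(f"NumPy vectorized:   {array_time:.4f} s")
```

Both functions do one addition per element; the gap between the two timings comes from per-element interpreter overhead, not from a difference in the amount of work.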

**7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and
output.**

Ans:

np.vstack() and np.hstack() are two essential functions in NumPy for stacking arrays vertically and horizontally, respectively.

np.vstack()

np.vstack() stacks arrays vertically (row-wise). 1D arrays of length N are treated as rows of shape (1, N).

Syntax: np.vstack((array1, array2, ...))

np.hstack()

np.hstack() stacks arrays horizontally (column-wise). 1D arrays are simply concatenated end to end.

Syntax: np.hstack((array1, array2, ...))

In [14]:
# Create arrays
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
print(array1)
print(array2)

[1 2 3]
[4 5 6]


In [15]:
# Vertical stacking using np.vstack()
vertical_stack = np.vstack((array1, array2))
print("Vertical Stack:\n", vertical_stack)

Vertical Stack:
 [[1 2 3]
 [4 5 6]]


In [16]:
# Horizontal stacking using np.hstack()
horizontal_stack = np.hstack((array1, array2))
print("\nHorizontal Stack:\n", horizontal_stack)


Horizontal Stack:
 [1 2 3 4 5 6]
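
The 1D case above hides part of the picture, so here is a small additional sketch with 2D inputs (the arrays a and b are made up for illustration). With 2D arrays, np.vstack() adds rows and np.hstack() adds columns, matching np.concatenate() along axis 0 and axis 1.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

v = np.vstack((a, b))  # rows of b are appended below a
h = np.hstack((a, b))  # columns of b are appended to the right of a

print(v.shape)  # (4, 2)
print(v)
print(h.shape)  # (2, 4)
print(h)

# For 2D inputs these are equivalent to concatenating along axis 0 and 1.
same_v = np.array_equal(v, np.concatenate((a, b), axis=0))
same_h = np.array_equal(h, np.concatenate((a, b), axis=1))
print(same_v, same_h)  # True True
```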


**8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various
array dimensions.**

Ans:

np.fliplr() and np.flipud() are two methods in NumPy used for flipping arrays. The primary difference between them lies in the direction of flipping.

np.fliplr()

- Flips the array horizontally (left-right).
- Reverses the order of columns.

np.flipud()

- Flips the array vertically (up-down).
- Reverses the order of rows.

Effects on Array Dimensions

| Method | 1D Array | 2D Array | 3D Array |
| --- | --- | --- | --- |
| np.fliplr() | Raises ValueError (input must be at least 2D) | Reverses column order (axis=1) | Flips 2nd dimension (axis=1) |
| np.flipud() | Reverses elements | Reverses row order (axis=0) | Flips 1st dimension (axis=0) |

In [18]:
# 1D Array
array_1d = np.array([1, 2, 3, 4, 5])
print("Original 1D Array:", array_1d)

# Use np.flip for a 1D array
flipped_1d_array = np.flip(array_1d)
print("Flipped 1D Array:", flipped_1d_array)

Original 1D Array: [1 2 3 4 5]
Flipped 1D Array: [5 4 3 2 1]


In [19]:
# 2D Array
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("\nOriginal 2D Array:\n", array_2d)
print("Flipped 2D Array (fliplr):\n", np.fliplr(array_2d))
print("Flipped 2D Array (flipud):\n", np.flipud(array_2d))


Original 2D Array:
 [[1 2 3]
 [4 5 6]]
Flipped 2D Array (fliplr):
 [[3 2 1]
 [6 5 4]]
Flipped 2D Array (flipud):
 [[4 5 6]
 [1 2 3]]


In [20]:
# 3D Array
array_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print("\nOriginal 3D Array:\n", array_3d)
print("Flipped 3D Array (fliplr):\n", np.fliplr(array_3d))
print("Flipped 3D Array (flipud):\n", np.flipud(array_3d))


Original 3D Array:
 [[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
Flipped 3D Array (fliplr):
 [[[3 4]
  [1 2]]

 [[7 8]
  [5 6]]]
Flipped 3D Array (flipud):
 [[[5 6]
  [7 8]]

 [[1 2]
  [3 4]]]
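
As a small sketch of the 1D case (variable names are illustrative), the snippet below shows np.flipud() reversing a 1D array while np.fliplr() rejects it, since it needs at least two dimensions. It also checks that on a 3D array the two functions match np.flip() with axis=0 and axis=1.

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5])

# flipud works on 1D input: it reverses along axis 0.
flipped_ud = np.flipud(array_1d)
print(flipped_ud)  # [5 4 3 2 1]

# fliplr needs an axis 1, so a 1D array raises ValueError.
try:
    np.fliplr(array_1d)
    fliplr_error = None
except ValueError as exc:
    fliplr_error = str(exc)
print("fliplr on 1D raised:", fliplr_error)

# On higher-dimensional input they are np.flip along axis 0 and axis 1.
array_3d = np.arange(8).reshape(2, 2, 2)
ud_matches = np.array_equal(np.flipud(array_3d), np.flip(array_3d, axis=0))
lr_matches = np.array_equal(np.fliplr(array_3d), np.flip(array_3d, axis=1))
print(ud_matches, lr_matches)  # True True
```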


**9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?**

Ans:

Functionality:

np.array_split(ary, indices_or_sections, axis=0)

- ary: The input array.
- indices_or_sections: Integer or array of indices.
    - If integer, splits the array into indices_or_sections parts of equal or nearly equal size.
    - If array of indices, splits the array at these indices.
- axis: Axis along which to split (default=0).

Handling Uneven Splits:

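When indices_or_sections is an integer n and the array length l is not divisible by n, np.array_split() does not raise an error (unlike np.split()). Instead, it returns l % n sub-arrays of size l//n + 1 followed by the remaining sub-arrays of size l//n, so the leading sub-arrays receive the extra elements.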

In [21]:
# Create an array
ary = np.arange(10)
print("Original Array:", ary)

Original Array: [0 1 2 3 4 5 6 7 8 9]


In [22]:
# Split into 3 nearly equal parts (10 elements cannot be divided evenly)
sub_arrays = np.array_split(ary, 3)
print(sub_arrays)

[array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]


In [23]:
# Split at specific indices
sub_arrays = np.array_split(ary, [3, 7])
print(sub_arrays)

[array([0, 1, 2]), array([3, 4, 5, 6]), array([7, 8, 9])]


In [24]:
# Split 2D array along axis=1
ary_2d = np.arange(12).reshape(3, 4)
sub_arrays = np.array_split(ary_2d, 2, axis=1)
print(sub_arrays)

[array([[0, 1],
       [4, 5],
       [8, 9]]), array([[ 2,  3],
       [ 6,  7],
       [10, 11]])]
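
To contrast the uneven-split behaviour with np.split(), here is a brief sketch that splits 10 elements into 4 sections. The choice of 4 sections is arbitrary; it is simply a count that does not divide 10.

```python
import numpy as np

ary = np.arange(10)

# array_split tolerates an uneven division: 10 % 4 = 2 leading sub-arrays
# get 10 // 4 + 1 = 3 elements, the remaining two get 10 // 4 = 2 elements.
parts = np.array_split(ary, 4)
sizes = [len(p) for p in parts]
print(sizes)  # [3, 3, 2, 2]

# np.split insists on an exact division and raises ValueError otherwise.
try:
    np.split(ary, 4)
    split_error = None
except ValueError as exc:
    split_error = str(exc)
print("np.split raised:", split_error)
```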


**10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array
operations?**

Ans:

Vectorization and broadcasting are fundamental concepts in NumPy that enable efficient array operations.

Vectorization

Vectorization refers to the ability of NumPy to perform operations on entire arrays at once, without the need for loops.

Advantages

1. Faster execution: Vectorized operations are typically faster than looping over individual elements.
2. Concise code: Vectorization reduces the amount of code needed for array operations.

Broadcasting

Broadcasting allows arrays with different but compatible shapes to be combined element-wise, without manually resizing them first.

Rules for Broadcasting

1. If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
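2. If the shapes of the two arrays do not match in a particular dimension, the array with size one in that dimension is stretched (conceptually replicated) to match the size of the other array.
3. If the sizes differ in any dimension and neither is one, the shapes are incompatible and NumPy raises a ValueError.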

Advantages

1. Flexible array operations: Broadcasting enables operations between arrays with different shapes.
2. Efficient memory usage: Broadcasting avoids materializing repeated copies of the smaller array; the stretching is virtual.

Contribution to Efficient Array Operations

Vectorization and broadcasting contribute to efficient array operations in the following ways:

1. Reduced looping: Vectorization eliminates the need for explicit loops.
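2. Optimized memory access: Broadcasting avoids allocating and copying expanded versions of the smaller operand.
3. Low-level optimization: Because the loop runs in compiled code over typed data, it can use SIMD instructions, and some linear algebra routines may run on multi-threaded BLAS libraries, depending on how NumPy was built.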
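
The broadcasting rules can be traced on a small example. In the sketch below (shapes and values are chosen only for illustration), a (3, 1) column and a (4,) row combine into a (3, 4) result, np.broadcast_to() exposes the stretched view, and an incompatible pair raises an error.

```python
import numpy as np

column = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3, 4])          # shape (4,), treated as (1, 4)

# (3, 1) and (1, 4) broadcast to (3, 4): each size-1 axis is stretched.
result = column + row
print(result.shape)  # (3, 4)
print(result)

# broadcast_to returns a read-only view; a stride of 0 on the first axis
# means every row reuses the same memory, so nothing is copied.
stretched = np.broadcast_to(row, (3, 4))
print(stretched.strides[0])  # 0

# Shapes (3, 4) and (3,) clash on the trailing axis (4 vs 3), so this fails.
try:
    np.ones((3, 4)) + np.ones(3)
    broadcast_error = None
except ValueError as exc:
    broadcast_error = str(exc)
print("incompatible shapes raised:", broadcast_error)
```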

**Practical Questions:**

In [1]:
import numpy as np

In [4]:
#1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.
arr=np.random.randint(1,101,(3,3)) # upper bound is exclusive, so 101 includes 100
print(arr)
print (arr.T)

[[61 14 36]
 [84 93 83]
 [36  9 98]]
[[61 84 36]
 [14 93  9]
 [36 83 98]]


In [5]:
#2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.
arr=np.random.randint(1,18,10)
print(arr)
print(arr.reshape(2,5))
print(arr.reshape(5,2))

[ 3 17  6 14  8  9  8 14 16 14]
[[ 3 17  6 14  8]
 [ 9  8 14 16 14]]
[[ 3 17]
 [ 6 14]
 [ 8  9]
 [ 8 14]
 [16 14]]


In [12]:
#3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.
arr=np.random.rand(4,4)
print("original array:\n",arr)
print("0 bordered array:\n",np.pad(arr,1))

original array:
 [[0.30233659 0.24334486 0.48089437 0.30108761]
 [0.67017662 0.43435438 0.98349486 0.51228441]
 [0.54416831 0.08074167 0.49794926 0.00102187]
 [0.41141023 0.3967688  0.21245583 0.03633103]]
0 bordered array:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.30233659 0.24334486 0.48089437 0.30108761 0.        ]
 [0.         0.67017662 0.43435438 0.98349486 0.51228441 0.        ]
 [0.         0.54416831 0.08074167 0.49794926 0.00102187 0.        ]
 [0.         0.41141023 0.3967688  0.21245583 0.03633103 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


In [19]:
#4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.
arr=np.arange(10,61,5) # stop is exclusive, so 61 includes 60
print(arr)

[10 15 20 25 30 35 40 45 50 55 60]


In [22]:
#5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations
#(uppercase, lowercase, title case, etc.) to each element.
arr=np.array(['python', 'numpy', 'pandas'])
print(f"original array:{arr}")
print(f'uppercase: {np.char.upper(arr)}')
print(f'lowercase: {np.char.lower(arr)}')
print(f'titlecase: {np.char.title(arr)}')


original array:['python' 'numpy' 'pandas']
uppercase: ['PYTHON' 'NUMPY' 'PANDAS']
lowercase: ['python' 'numpy' 'pandas']
titlecase: ['Python' 'Numpy' 'Pandas']


In [23]:
#6. Generate a NumPy array of words. Insert a space between each character of every word in the array.
arr=np.array(['python', 'numpy', 'pandas'])
print(np.char.join(' ',arr))

['p y t h o n' 'n u m p y' 'p a n d a s']


In [25]:
#7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.
arr1=np.random.randint(1,10,(3,3))
arr2=np.random.randint(1,15,(3,3))
print(f"arr1:\n{arr1}\narr2:\n{arr2}")
print(f"arr1+arr2:\n{arr1+arr2}")
print(f"arr1-arr2:\n{arr1-arr2}")
print(f"arr1*arr2:\n{arr1*arr2}")
print(f"arr1/arr2:\n{arr1/arr2}")

arr1:
[[2 5 8]
 [3 6 9]
 [4 7 3]]
arr2:
[[ 6  4 10]
 [14  5  8]
 [ 9  8  8]]
arr1+arr2:
[[ 8  9 18]
 [17 11 17]
 [13 15 11]]
arr1-arr2:
[[ -4   1  -2]
 [-11   1   1]
 [ -5  -1  -5]]
arr1*arr2:
[[12 20 80]
 [42 30 72]
 [36 56 24]]
arr1/arr2:
[[0.33333333 1.25       0.8       ]
 [0.21428571 1.2        1.125     ]
 [0.44444444 0.875      0.375     ]]


In [30]:
#8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.
arr=np.eye(5)
print(f"arr:\n{arr}")
print(f"diagonal elements:\n{np.diag(arr)}")

arr:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
diagonal elements:
[1. 1. 1. 1. 1.]


In [37]:
#9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in
#this array.
arr=np.random.randint(0,1000,100)
print(f"arr:\n{arr}")
def is_prime(n):
  # 0 and 1 are not prime; 2 is the smallest prime
  if n<2:
    return False
  # trial division by every candidate up to the square root of n
  for i in range(2,int(np.sqrt(n))+1):
    if n%i==0:
      return False
  return True
print(f"prime numbers:\n{list(filter(is_prime,arr))}")


arr:
[428 442 570 596 732 836 257  55 731 210 408 358 982 452 313 839 249  68
 582 597  47 543 561  10 853 960 583 367 993 997 405 341 590 200 138 369
 472  76 775 119 755 655 189 732 996 827 230 190  84 665 890 453   4 589
 741 886 149 514 372 781   4 623  37 711 612 257  89 272 940 728 529 747
 722 628 205 994 282 728 876 779  95 152 272 482 609 405 785 192 128  31
 714 997 235 918 483 353 569 624 109 480]
prime numbers:
[257, 313, 839, 47, 853, 367, 997, 827, 149, 37, 257, 89, 31, 997, 353, 569, 109]
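
As an alternative to testing each element in a Python loop, a Sieve of Eratosthenes can build a boolean lookup table once and then select the primes with array indexing. This is only a sketch of that different technique, not the approach used in the cell above; the helper name primes_mask and the sample values are hypothetical.

```python
import numpy as np


def primes_mask(limit):
    """Return a boolean array where mask[k] is True iff k is prime (0 <= k <= limit)."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False  # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            # Cross out every multiple of i, starting at i*i.
            mask[i * i :: i] = False
    return mask


# Hypothetical sample values in the range 0..999.
sample = np.array([0, 1, 2, 4, 31, 257, 341, 997])

mask = primes_mask(999)
# Index the lookup table with the values, then keep the flagged ones.
primes_in_sample = sample[mask[sample]]
print(primes_in_sample)  # [  2  31 257 997]
```

The sieve pays off when many values share the same upper bound, since the table is built once and each lookup is a single index operation.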


In [47]:
#10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly
#averages.
daily_temp=np.random.randint(30,40,size=28) # Assuming a 28-day month (four full weeks)
print(f"Daily temperature °C:\n{daily_temp}")
weekly_temp=daily_temp.reshape(4,7)
print(f"Weekly temperature °C:\n{weekly_temp}")
print(f"Weekly average temperature °C:\n{np.mean(weekly_temp,axis=1)}")

Daily temperature °C:
[31 38 30 30 39 38 34 38 30 39 37 32 39 33 39 35 33 32 36 31 36 35 30 38
 35 32 34 35]
Weekly temperature °C:
[[31 38 30 30 39 38 34]
 [38 30 39 37 32 39 33]
 [39 35 33 32 36 31 36]
 [35 30 38 35 32 34 35]]
Weekly average temperature °C:
[34.28571429 35.42857143 34.57142857 34.14285714]
