In [4]:
'''
Theoretical Questions

1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
enhance Python's capabilities for numerical operations?

Purpose of NumPy in Scientific Computing and Data Analysis
NumPy (Numerical Python) is a powerful library in Python designed to support efficient numerical computations. It provides tools for working with large, multi-dimensional arrays and matrices, along with a vast collection of mathematical functions to operate on these arrays. NumPy is fundamental for scientific computing, data analysis, machine learning, and many other domains due to its efficient handling of data and array-based calculations.

Advantages of NumPy
Efficient Array and Matrix Operations:

NumPy provides the ndarray object, which is much more efficient for storing and processing large amounts of numerical data compared to Python’s native lists. It supports multi-dimensional arrays and performs operations on these arrays without the need for explicit loops, leading to faster code execution.
Vectorized Operations:

One of the biggest advantages of NumPy is that it allows vectorized operations, meaning mathematical functions can be applied to entire arrays at once, rather than element by element. This leads to cleaner, more concise code and significantly faster execution due to internal optimizations and avoidance of Python-level loops.
Example: Instead of looping through an array to add a scalar to each element, you can write:
'''
import numpy as np
arr = np.array([1, 2, 3, 4])
arr = arr + 5  # Adds 5 to each element in the array

'''
Memory Efficiency:

NumPy arrays are stored in contiguous blocks of memory, unlike Python lists, which store pointers to the actual data. This leads to more efficient use of memory, especially for large datasets. NumPy also allows you to specify data types (e.g., int32, float64), giving control over the precision and memory footprint of the arrays.
Broad Mathematical Functionality:

NumPy includes a wide range of built-in mathematical functions such as linear algebra operations, Fourier transforms, random number generation, statistical functions, and more. This makes it a comprehensive tool for numerical computations in scientific computing.
Integration with Other Libraries:

NumPy underpins or interoperates with many other Python libraries, such as pandas (for data manipulation), SciPy (for scientific computing), Matplotlib (for data visualization), TensorFlow (for machine learning), and others. These libraries either build on NumPy's array structure or convert to and from it, allowing seamless integration and consistent data handling.
Broadcasting:

NumPy supports broadcasting, which allows operations on arrays of different shapes and sizes without requiring explicit looping or resizing of arrays. This is particularly useful in element-wise operations between arrays of different dimensions, enhancing both flexibility and efficiency.
Example:
'''
a = np.array([1, 2, 3])
b = np.array([[1], [2], [3]])
result = a + b  # Broadcasting allows this element-wise addition

'''
Data Handling and Transformation:

NumPy offers convenient ways to reshape, slice, index, and filter arrays. You can easily manipulate array dimensions, select subarrays, and perform operations along specific axes, making data handling more flexible.
Example:
'''
arr = np.array([[1, 2, 3], [4, 5, 6]])
reshaped_arr = arr.reshape(3, 2)  # Reshape a 2x3 array into a 3x2 array

'''
High Performance:

NumPy operations are implemented in C, which makes them significantly faster than Python's built-in operations on lists. For large-scale numerical problems, NumPy offers performance close to compiled languages like C or Fortran, while allowing you to write code in Python.
Handling Large Datasets:

NumPy is particularly useful when dealing with large datasets, as it can handle large arrays efficiently in both memory and computation. Operations that would be slow and memory-intensive in pure Python become manageable using NumPy.
How NumPy Enhances Python’s Capabilities for Numerical Operations
Optimized for Numerical Calculations:

While Python itself is not optimized for numerical computations, NumPy provides a highly optimized interface for performing mathematical operations, including those on large datasets. This allows Python, traditionally a general-purpose language, to excel in scientific computing.
Extending Python's Data Structures:

Python lists, while versatile, are not efficient for numerical operations. NumPy extends Python's capability by introducing the ndarray, a powerful data structure designed specifically for numerical data, supporting multi-dimensional arrays with a variety of data types.
Reduction in Code Complexity:

Without NumPy, performing operations on arrays would require manual looping, explicit memory management, and type handling. NumPy abstracts these complexities, enabling concise, readable code that is easier to maintain.


Example:

Without NumPy (sum of two lists):
'''
x = [1, 2, 3]
y = [4, 5, 6]
z = [x[i] + y[i] for i in range(len(x))]

'''With NumPy (vectorized sum):
'''
import numpy as np
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
z = x + y


#Cross-Language Compatibility:

#NumPy arrays can be easily shared with other languages like C, C++, and Fortran. This makes it possible to write high-performance functions in these languages and call them from Python, with NumPy arrays as inputs and outputs.
#Scalability for Big Data and Machine Learning:

#NumPy's array model is the reference point for more advanced numerical libraries (like TensorFlow, PyTorch) used in big data and machine learning. These libraries use their own tensor types, which follow NumPy-style array semantics and convert to and from NumPy arrays, and they add scalable computation on GPUs and distributed systems.
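As a small, hedged illustration of the vectorization and broadcasting points in question 1, the sketch below checks that a broadcast sum gives the same result as an explicit nested loop. The names `broadcast` and `looped` are illustrative only and do not appear in the original answer.

```python
import numpy as np

a = np.array([1, 2, 3])          # shape (3,)
b = np.array([[1], [2], [3]])    # shape (3, 1)

# Broadcasting stretches both operands to a common shape of (3, 3).
broadcast = a + b

# The same result built with explicit Python loops, for comparison only.
looped = np.empty((3, 3), dtype=broadcast.dtype)
for i in range(3):
    for j in range(3):
        looped[i, j] = b[i, 0] + a[j]

print(broadcast.shape)                      # (3, 3)
print(np.array_equal(broadcast, looped))    # True
```

The loop version needs nine Python-level additions, while the broadcast version is a single expression evaluated in compiled code.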




In [7]:
''' 2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

Comparison of np.mean() and np.average() in NumPy
Both np.mean() and np.average() are used for calculating the central tendency of data in NumPy, but they differ in functionality, particularly when handling weights.

np.mean()
Purpose: Computes the arithmetic mean (average) of the elements along a specified axis or the entire array.
Syntax: np.mean(arr, axis=None, dtype=None, out=None, keepdims=False)
Weighted Mean: Does not support weighted averages; it calculates the unweighted mean.
Use Case: Use np.mean() when you want the simple arithmetic mean and when weights are not involved.
Example:
'''
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
mean_value = np.mean(arr)
print(mean_value)  # Output: 3.0

'''
In this example, the mean is calculated by summing the elements (1+2+3+4+5 = 15) and dividing by the number of elements (5), resulting in 3.0.

np.average()
Purpose: Computes the weighted average of the array elements. If no weights are specified, it defaults to calculating the unweighted mean, similar to np.mean().
Syntax: np.average(arr, axis=None, weights=None, returned=False)
weights: An array of weights. If provided, it calculates a weighted mean.
returned: If True, it also returns the sum of weights along with the average.
Weighted Mean: Supports weighted averages by providing a weights parameter.
Use Case: Use np.average() when you need to calculate a weighted average, where some elements contribute more to the final result than others.


Example (Without Weights):
'''
arr = np.array([1, 2, 3, 4, 5])
average_value = np.average(arr)
print(average_value)  # Output: 3.0

# This behaves just like np.mean() in this case because no weights are provided.

#Example (With Weights):

weights = np.array([1, 2, 3, 4, 5])  # Higher weights for larger numbers
weighted_avg = np.average(arr, weights=weights)
print(weighted_avg)  # Output: 3.6666666666666665


#When to Use One Over the Other
#Use np.mean():

#When you want the simple arithmetic mean.
#When no weights are involved.
#For clarity and simplicity in most cases where weighted averages are not required.
#Use np.average():

#When you need a weighted average.
#When you want the option to return the sum of the weights along with the average.
#When you have different levels of importance for different elements in your array, and you want these to be reflected in the calculation.



3.0
3.0
3.6666666666666665
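To make the weighted result above easier to verify, here is a minimal sketch that recomputes it by hand as sum(w * x) / sum(w) and also uses the `returned=True` option. The variable names `manual`, `avg`, and `weight_sum` are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 3, 4, 5])

# Weighted average by hand: sum(w * x) / sum(w) = 55 / 15
manual = (arr * weights).sum() / weights.sum()

# returned=True also gives back the sum of the weights
avg, weight_sum = np.average(arr, weights=weights, returned=True)

print(np.isclose(manual, avg))                     # True
print(np.isclose(np.average(arr), np.mean(arr)))   # True: no weights means a plain mean
```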


In [9]:
'''
3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

Reversing a NumPy array can be done in multiple ways depending on whether it's a 1D array or a multi-dimensional (e.g., 2D) array. NumPy provides slicing techniques and built-in functions to reverse arrays along different axes.

Methods for Reversing a NumPy Array
1. Reversing a 1D Array
In a 1D array, reversing is simple and can be done using Python slicing.

Method: Use slicing with a step of -1:

arr[::-1]: This slices the array from the end to the start, effectively reversing it.

Example
'''
import numpy as np

# Create a 1D NumPy array
arr = np.array([1, 2, 3, 4, 5])

# Reverse the array
reversed_arr = arr[::-1]

print("Original Array:", arr)
print("Reversed Array:", reversed_arr)

'''
output
Original Array: [1 2 3 4 5]
Reversed Array: [5 4 3 2 1]

2. Reversing a 2D Array Along Rows (Axis=0)
In a 2D array, reversing along rows means flipping the rows while keeping the columns unchanged. This is equivalent to reversing the array along the first axis (axis=0).

Method: Slice the array along the rows using arr[::-1, :].

'''

# Create a 2D NumPy array
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Reverse the array along rows
reversed_rows = arr_2d[::-1, :]

print("Original Array:\n", arr_2d)
print("Reversed Along Rows:\n", reversed_rows)

'''
Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed Along Rows:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]

3. Reversing a 2D Array Along Columns (Axis=1)
Reversing along the columns means flipping the elements in each row, leaving the rows in the same order. This corresponds to reversing along the second axis (axis=1).

Method: Slice the array along the columns using arr[:, ::-1].

Example:
'''
# Reverse the array along columns
reversed_columns = arr_2d[:, ::-1]

print("Original Array:\n", arr_2d)
print("Reversed Along Columns:\n", reversed_columns)

'''
output

Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed Along Columns:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]

4. Reversing a 2D Array Along Both Axes
To reverse both the rows and the columns of a 2D array (flip the array completely), you can slice the array in both dimensions.

Method: Slice both rows and columns using arr[::-1, ::-1].

Example:
'''
# Reverse the array along both axes
reversed_both = arr_2d[::-1, ::-1]

print("Original Array:\n", arr_2d)
print("Reversed Along Both Axes:\n", reversed_both)


#output
#Original Array:
# [[1 2 3]
# [4 5 6]
# [7 8 9]]
#Reversed Along Both Axes:
# [[9 8 7]
# [6 5 4]
# [3 2 1]]



Original Array: [1 2 3 4 5]
Reversed Array: [5 4 3 2 1]
Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed Along Rows:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]
Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed Along Columns:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]
Original Array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed Along Both Axes:
 [[9 8 7]
 [6 5 4]
 [3 2 1]]
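The slicing forms in this cell have a function equivalent, `np.flip`, which takes the axis as an argument. The sketch below, offered as an illustration rather than part of the original answer, checks that the two approaches agree and that a reversed slice is a view on the same data.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# np.flip with an explicit axis matches the slicing forms.
rows_flipped = np.flip(arr_2d, axis=0)   # same as arr_2d[::-1, :]
cols_flipped = np.flip(arr_2d, axis=1)   # same as arr_2d[:, ::-1]
both_flipped = np.flip(arr_2d)           # no axis given: every axis is flipped

print(np.array_equal(rows_flipped, arr_2d[::-1, :]))     # True
print(np.array_equal(cols_flipped, arr_2d[:, ::-1]))     # True
print(np.array_equal(both_flipped, arr_2d[::-1, ::-1]))  # True

# Slicing with a negative step returns a view, not a copy.
print(np.shares_memory(arr_2d, arr_2d[::-1, :]))         # True
```

Because the reversed slice is a view, writing into it also changes the original array; call `.copy()` when an independent array is needed.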


In [13]:
'''
4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.

Determining the Data Type of Elements in a NumPy Array
In NumPy, you can easily determine the data type of the elements in an array using the dtype attribute. This attribute gives you the specific data type that the array is using, which could be integer, float, complex, or other types.
'''
import numpy as np

# Create a NumPy array
arr = np.array([1, 2, 3])

# Check the data type of the elements
print(arr.dtype)  # Output: int64 (or another integer type depending on the platform)

#Methods to Determine Data Types in NumPy
#dtype attribute: Returns the data type of the array elements.


arr.dtype
#np.result_type(): Determines the resulting data type if you perform operations between arrays or scalars.

result_dtype = np.result_type(np.array([1]), np.array([1.0]))
print(result_dtype)  # Output: float64

#arr.astype(): Converts the array to a new data type and returns a new array; check the dtype attribute of the result to confirm the conversion.


float_arr = arr.astype(float)
print(float_arr.dtype)  # Output: float64

'''
Importance of Data Types in Memory Management and Performance
Choosing the right data type is crucial for memory management and performance in NumPy, especially when dealing with large datasets or computationally heavy applications.

1. Memory Efficiency
Smaller Data Types Save Memory: NumPy allows you to specify the data type explicitly (e.g., int8, int16, float32), which lets you use only the necessary amount of memory for each element. This can drastically reduce memory usage, especially for large arrays.

Example:

int32 uses 4 bytes per element, while int64 uses 8 bytes.
Using int32 instead of int64 for a large array can save substantial memory.
'''
arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.nbytes)  # Output: 12 (3 elements * 4 bytes each)
'''
Larger Data Types Use More Memory: If you use unnecessarily large data types (e.g., using float64 when float32 is sufficient), you may waste memory.

2. Performance Optimization
Faster Computation with Smaller Types: Smaller data types (like int32 or float32) allow NumPy to perform calculations more quickly because they require fewer resources and less memory bandwidth. This is important in scientific computing, where even minor optimizations can lead to significant speed improvements when working with large datasets.

Trade-off Between Precision and Speed: Choosing a smaller data type like float32 can result in faster computations but with reduced precision. In some cases, such as machine learning, the loss in precision is acceptable for the performance gain. In other cases, such as scientific simulations, precision may be critical, so you may prefer float64 or higher.

Example:

'''
arr_float32 = np.array([1.1, 2.2, 3.3], dtype=np.float32)
arr_float64 = np.array([1.1, 2.2, 3.3], dtype=np.float64)

print(arr_float32.dtype)  # Output: float32
print(arr_float64.dtype)  # Output: float64

'''
Vectorized Operations: NumPy runs array operations in precompiled, optimized C loops. The choice of data type affects how efficiently these loops execute. Operations on smaller data types (like int32 or float32) can often be completed faster than operations on int64 or float64, because less data moves through memory and more elements fit in each SIMD register.

3. Avoiding Overflow and Underflow
Overflow: If the data type is too small, operations on large values may lead to overflow. For example, int8 stores values from -128 to 127, so integer array arithmetic that goes past 127 wraps around to negative values without raising an error, which produces silently wrong results.

Example:

'''
arr = np.array([127], dtype=np.int8)
arr += 1
print(arr)  # Output: [-128] (overflow: 127 + 1 wraps around)
'''
Underflow: Similarly, a floating-point type that is too small (like float16) cannot represent very small magnitudes, so such values lose precision or underflow to zero.

4. Type Compatibility in Operations
Mixed-Type Operations: When performing operations between arrays of different data types, NumPy will automatically upcast to the higher precision type to prevent loss of information. This ensures accuracy but can result in increased memory usage and slower performance.

Example:

'''
arr_int = np.array([1, 2, 3], dtype=np.int32)
arr_float = np.array([1.1, 2.2, 3.3], dtype=np.float32)

result = arr_int + arr_float
print(result.dtype)  # Output: float64 (int32 + float32 is upcast to float64, which can hold every int32 value exactly)

#5. Type-Specific Operations
#Some operations are more efficient or only possible with certain data types. For example, complex numbers use the complex64 or complex128 types, which are necessary for calculations involving imaginary numbers.


int64
float64
float64
12
float32
float64
[-128]
float64
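The memory and promotion points in this cell can be checked directly. The sketch below is an illustration, not part of the original answer: it prints the per-element and total size for several dtypes, the value range of int8, and the promotion result for int32 with float32, which matches the float64 recorded in the cell output.

```python
import numpy as np

# The same number of elements stored with different element widths.
for dtype in (np.int8, np.int32, np.int64, np.float32, np.float64):
    arr = np.zeros(1000, dtype=dtype)
    print(arr.dtype, arr.itemsize, arr.nbytes)

# Representable range of an integer type.
info = np.iinfo(np.int8)
print(info.min, info.max)        # -128 127

# Promotion rule behind the mixed-type example.
promoted = np.result_type(np.int32, np.float32)
print(promoted)                  # float64
```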


In [None]:
'''
5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

Definition of ndarray in NumPy
An ndarray (N-dimensional array) in NumPy is the primary data structure for storing and manipulating numerical data. It represents a grid of values (elements) that can have any number of dimensions (axes), such as 1D, 2D, 3D, or higher-dimensional arrays.

Example:
'''
import numpy as np

# Creating a 2D ndarray (matrix)
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)
'''
Key Features of ndarray
Multidimensionality:

The ndarray supports multiple dimensions (axes). You can create 1D arrays (vectors), 2D arrays (matrices), or even higher-dimensional arrays.
The number of dimensions is given by the ndim attribute.
Each dimension's size is stored in the shape attribute.
Example:
'''

arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr.ndim)    # Output: 2 (2D array)
print(arr.shape)   # Output: (2, 3) (2 rows, 3 columns)

'''
Homogeneous Data:

Unlike Python lists, all elements in an ndarray must be of the same data type. This ensures memory efficiency and faster computation.
The data type (dtype) of the array is stored in the dtype attribute.
Example:
'''
arr = np.array([1, 2, 3], dtype=np.float32)
print(arr.dtype)  # Output: float32

'''
Element-wise Operations:

ndarray supports element-wise arithmetic operations and mathematical functions. Operations like addition, subtraction, multiplication, and division are applied to each element without the need for loops.
Example:
'''
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
result = arr1 + arr2
print(result)  # Output: [5 7 9]

'''
Efficient Memory Management:

NumPy arrays store data in contiguous blocks of memory, which allows for efficient access and modification.
They are more memory-efficient than Python lists, which store references to objects.
Broadcasting:

NumPy allows operations on arrays of different shapes through broadcasting. Smaller arrays can be "broadcast" across larger arrays to make operations feasible without explicitly resizing them.

Example:

'''
arr1 = np.array([1, 2, 3])
arr2 = np.array([[1], [2], [3]])
result = arr1 + arr2  # Broadcasting: shapes (3,) and (3, 1) combine into a (3, 3) result
print(result)
# Output:
# [[2 3 4]
#  [3 4 5]
#  [4 5 6]]
'''
Indexing and Slicing:

You can access and modify elements in an ndarray using indexing and slicing, similar to Python lists. However, in multi-dimensional arrays, you can use multiple indices or slices to target specific elements or subarrays.
Example:
'''
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr[1, 2])  # Output: 6 (access element at row 1, column 2)
print(arr[:, 1])  # Output: [2 5] (access all rows in column 1)
'''
Mathematical Functions and Aggregation:

NumPy provides a wide range of built-in mathematical functions (like sin, cos, exp, etc.) and aggregation functions (like sum, mean, max, min, etc.) that operate directly on ndarray objects.
Example:
'''
arr = np.array([1, 2, 3, 4])
print(np.sum(arr))  # Output: 10
print(np.mean(arr)) # Output: 2.5
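To make the contrast with standard Python lists concrete, the following sketch (an illustration with made-up variable names) applies the same operators to lists and to arrays and shows how an array settles on a single dtype.

```python
import numpy as np

lst1, lst2 = [1, 2, 3], [4, 5, 6]
arr1, arr2 = np.array(lst1), np.array(lst2)

print(lst1 + lst2)    # [1, 2, 3, 4, 5, 6]  (lists concatenate)
print(arr1 + arr2)    # [5 7 9]             (arrays add element-wise)
print(lst1 * 2)       # [1, 2, 3, 1, 2, 3]  (lists repeat)
print(arr1 * 2)       # [2 4 6]             (arrays scale each element)

# A list can mix types; an array converts everything to one common dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)    # float64
```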


In [16]:
'''
6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

Performance Benefits of NumPy Arrays over Python Lists for Large-Scale Numerical Operations
When working with large datasets or performing heavy numerical computations, NumPy arrays (ndarray) offer significant performance advantages over Python lists. These benefits arise from the way NumPy handles memory, computations, and optimizations. Let’s break down the key reasons why NumPy arrays are superior to Python lists in terms of performance for large-scale numerical operations.

1. Memory Efficiency
Contiguous Memory Layout: NumPy arrays store data in a contiguous block of memory, which allows efficient access to elements. Python lists, on the other hand, store references to objects, meaning that the actual data could be scattered in memory.

Homogeneous Data: NumPy arrays are homogeneously typed, meaning all elements must be of the same type (e.g., int32, float64). This allows NumPy to allocate memory more efficiently and avoid the overhead associated with Python lists, which can hold heterogeneous data.

Smaller Memory Footprint: Due to the homogeneous nature and contiguous memory layout, NumPy arrays consume far less memory compared to Python lists, especially when dealing with large datasets.

Example:
'''

import numpy as np

# NumPy array
arr = np.arange(1000, dtype=np.int32)
print(arr.nbytes)  # Output: 4000 bytes (1000 elements * 4 bytes)

# Python list
lst = list(range(1000))
print(lst.__sizeof__())  # Output: e.g. 8040 in the run recorded here; this counts only the list's own pointer array, not the int objects it refers to

'''
2. Vectorization and Element-Wise Operations
Vectorized Operations: NumPy arrays support vectorized operations, which means operations are applied element-wise to the entire array without needing explicit loops. This is done using highly optimized C and Fortran libraries under the hood. In contrast, Python lists require explicit loops for element-wise operations, which results in slower performance due to the overhead of Python’s interpreted nature.

Avoidance of Python Loop Overhead: In NumPy, operations like addition, subtraction, or multiplication can be performed across entire arrays without using for loops, reducing the overhead and making the code more efficient.

Example:

'''
import numpy as np

arr = np.array([1, 2, 3])
result = arr + arr  # Element-wise addition
print(result)  # Output: [2 4 6]

# With Python lists, you'd need a loop
lst = [1, 2, 3]
result = [x + x for x in lst]
print(result)  # Output: [2, 4, 6]
'''

3. Optimized Memory Access and CPU Cache Utilization
Cache-Friendly Data Layout: NumPy arrays are stored in contiguous memory blocks, which allows better utilization of the CPU cache. This reduces the time required for data access and improves computational speed. Python lists, on the other hand, store object references that are scattered throughout memory, leading to inefficient cache utilization.

Reduced Overhead for Data Access: Accessing elements in a NumPy array is faster because of the uniform data type and memory layout. With Python lists, there’s additional overhead in dereferencing pointers to access the actual data.

Example: Operations involving sequential access of data in a NumPy array (like summing values) are significantly faster compared to Python lists, as shown below:

'''
import time
import numpy as np

# NumPy array
arr = np.arange(1000000)
start = time.perf_counter()  # perf_counter() is a monotonic, high-resolution timer
np.sum(arr)
print("NumPy sum time:", time.perf_counter() - start)

# Python list
lst = list(range(1000000))
start = time.perf_counter()
sum(lst)
print("Python list sum time:", time.perf_counter() - start)
'''
4. Low-Level Optimizations in NumPy
Written in C: NumPy's core is implemented in C and compiled ahead of time, so it benefits from low-level optimizations. Operations on NumPy arrays run as native machine code instead of interpreted Python bytecode, resulting in significant performance gains.

Leverage of SIMD Instructions: NumPy can take advantage of Single Instruction, Multiple Data (SIMD) operations, where a single CPU instruction operates on multiple data points simultaneously. This parallelization boosts performance, especially for operations like matrix multiplication, element-wise computations, and reductions (sum, mean, etc.).

5. Broadcasting
No Need for Manual Reshaping or Looping: NumPy supports broadcasting, allowing arrays with different shapes to be used together in arithmetic operations without explicit loops or resizing. Broadcasting makes it easier to work with arrays of different shapes and improves the efficiency of operations without copying large amounts of data.

Example:
'''

import numpy as np

# Broadcasting a 1D array over a 2D array
arr1 = np.array([1, 2, 3])
arr2 = np.array([[10], [20], [30]])
result = arr1 + arr2
print(result)
# Output:
# [[11 12 13]
#  [21 22 23]
#  [31 32 33]]
'''
6. Built-in Functions and Libraries for Numerical Operations
Extensive Set of Functions: NumPy provides a wide range of built-in mathematical functions that are optimized for performance, such as np.sum(), np.mean(), np.sin(), and np.exp(). These functions are implemented using low-level optimized code, making them far faster than manually implementing similar functionality with Python lists.

Aggregation and Reduction: Functions like np.sum(), np.prod(), np.mean(), and np.max() can compute aggregate statistics across entire arrays, or along a chosen axis, in an optimized manner. With Python lists, only a few of these exist as built-ins (sum(), max(), min()); the rest need a manual loop or a standard-library helper such as math.prod() or statistics.mean(), and none of them work along an axis of nested lists.

Example:

'''
arr = np.array([1, 2, 3, 4, 5])
print(np.sum(arr))  # Output: 15

# With Python lists
lst = [1, 2, 3, 4, 5]
print(sum(lst))  # Output: 15
'''
7. Handling Large Datasets
Scalability: NumPy arrays are designed to handle large datasets efficiently. Python lists may become prohibitively slow and memory-intensive when working with large-scale data due to their internal representation.

Efficient Use of Disk and Memory: When dealing with large datasets, NumPy allows the use of memory-mapped files (np.memmap) to avoid loading entire datasets into memory at once, improving performance and reducing memory usage.

Example:

'''
# Creating a large array (1 billion float32 elements, about 4 GB on disk) with np.memmap; mode='w+' creates or overwrites data.dat
large_array = np.memmap('data.dat', dtype='float32', mode='w+', shape=(1000000000,))

#8. Concurrency and Parallelism
#Multithreading: Linear algebra routines such as matrix multiplication are delegated to BLAS/LAPACK libraries, which are often multithreaded and can use multiple cores. Most element-wise operations, such as addition and multiplication, run in a single thread.

#Releasing Python’s Global Interpreter Lock (GIL): Many NumPy operations release the GIL while they execute compiled C or Fortran code. This lets several Python threads run NumPy computations at the same time on multi-core systems.


4000
8040
[2 4 6]
[2, 4, 6]
NumPy sum time: 0.0013556480407714844
Python list sum time: 0.00856471061706543
[[11 12 13]
 [21 22 23]
 [31 32 33]]
15
15
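The memory comparison in section 1 of this answer can be made more complete by also counting the integer objects that the list refers to. The sketch below assumes CPython, where `sys.getsizeof` reports the size of a single object; the names `list_container` and `list_total` are illustrative.

```python
import sys
import numpy as np

n = 1000
arr = np.arange(n, dtype=np.int32)
lst = list(range(n))

# Size of the list object itself: a header plus one pointer per element.
list_container = sys.getsizeof(lst)

# Add the size of every int object the list points to (CPython-specific).
list_total = list_container + sum(sys.getsizeof(x) for x in lst)

print("array data bytes:", arr.nbytes)             # 4000
print("list container bytes:", list_container)
print("list + int objects bytes:", list_total)
```

The exact list figures depend on the Python version and platform, so no expected values are given for them; the point is only that the list total is several times the array's 4000 bytes.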


In [19]:
'''
7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

In NumPy, the functions vstack() and hstack() are used to stack (combine) arrays along different axes. These functions are very useful when you want to merge or concatenate arrays either vertically or horizontally.

1. vstack(): Vertical Stacking
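Purpose: vstack() stacks arrays vertically, row-wise. It concatenates arrays along the vertical axis (axis 0); 1D arrays of length N are first treated as rows of shape (1, N).
Shape Compatibility: The arrays being stacked must have the same shape along every axis except the first, which for 2D arrays means the same number of columns.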
Example:
'''
import numpy as np

# Two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Vertical stacking
result = np.vstack((arr1, arr2))
print(result)
'''
Output:
[[1 2 3]
 [4 5 6]]
Here, the two 1D arrays (arr1 and arr2) are stacked on top of each other to form a 2D array.

Example with 2D arrays:
'''
arr1 = np.array([[1, 2, 3]])
arr2 = np.array([[4, 5, 6]])

result = np.vstack((arr1, arr2))
print(result)
'''
Output:

[[1 2 3]
 [4 5 6]]
In this case, two 2D arrays with the same number of columns are stacked vertically.

2. hstack(): Horizontal Stacking
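Purpose: hstack() stacks arrays horizontally, column-wise. It concatenates arrays along the second axis (axis 1), except for 1D arrays, which are joined along their only axis (axis 0).
Shape Compatibility: The arrays being stacked must match along every axis except the one being joined. For 2D arrays this means the same number of rows; 1D arrays can have any lengths.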
Example:

'''
import numpy as np

# Two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Horizontal stacking
result = np.hstack((arr1, arr2))
print(result)
'''
Output:


[1 2 3 4 5 6]
Here, two 1D arrays (arr1 and arr2) are concatenated side-by-side into a single 1D array.

Example with 2D arrays:
'''
arr1 = np.array([[1], [2], [3]])
arr2 = np.array([[4], [5], [6]])

result = np.hstack((arr1, arr2))
print(result)

#Output:
#[[1 4]
# [2 5]
# [3 6]]
#In this case, two 2D arrays with the same number of rows are stacked horizontally.

#Key Differences:
#vstack(): Stacks arrays vertically, row-wise. The arrays must have the same number of columns.
#hstack(): Stacks arrays horizontally, column-wise. 2D arrays must have the same number of rows; 1D arrays are simply joined end to end.

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[1 2 3 4 5 6]
[[1 4]
 [2 5]
 [3 6]]
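Both functions are thin conveniences over `np.concatenate`. The sketch below, added as an illustration with made-up names (`m1`, `m2`, `v_same`, `h_same`), shows the resulting shapes for 1D inputs and checks the equivalence for 2D inputs.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1D inputs: vstack promotes each to a row, hstack joins them end to end.
print(np.vstack((a, b)).shape)   # (2, 3)
print(np.hstack((a, b)).shape)   # (6,)

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

# 2D inputs: both functions match np.concatenate along the named axis.
v_same = np.array_equal(np.vstack((m1, m2)), np.concatenate((m1, m2), axis=0))
h_same = np.array_equal(np.hstack((m1, m2)), np.concatenate((m1, m2), axis=1))
print(v_same, h_same)            # True True
```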


In [21]:
'''
8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.

In NumPy, the fliplr() and flipud() methods are used to reverse the elements of an array along specific axes. The key difference between these two methods lies in the direction in which they flip the array:

1. fliplr() (Flip Left to Right)
Purpose: fliplr() reverses the order of the elements along the horizontal axis (i.e., left to right). This method only works on 2D arrays or arrays with more dimensions where at least the second axis exists (axis 1).

Effect on 2D Arrays: It flips the elements along the horizontal axis, so the columns are reversed.

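Effect on Higher-Dimensional Arrays: It still operates on axis 1 only. For a 3D array, this reverses the order of the rows inside each 2D block; the innermost axis (axis 2) is left untouched, so the values within each row keep their order.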

Example:
'''
import numpy as np

# 2D array
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Applying fliplr
result = np.fliplr(arr)
print(result)
'''
Output:

[[3 2 1]
 [6 5 4]]
Here, each row's elements are flipped from left to right.

Example with a 3D array:
'''
arr = np.array([[[1, 2, 3],
                 [4, 5, 6]],

                [[7, 8, 9],
                 [10, 11, 12]]])

# Applying fliplr
result = np.fliplr(arr)
print(result)
'''
Output:

[[[ 4  5  6]
  [ 1  2  3]]

 [[10 11 12]
  [ 7  8  9]]]
In this 3D example, fliplr() reverses the second axis (axis 1), so the two rows inside each 2D block swap places while the values within each row keep their order.

2. flipud() (Flip Up to Down)
Purpose: flipud() reverses the order of the elements along the vertical axis (i.e., top to bottom). It works on arrays of any dimension from 1D upward and always reverses the first axis (axis 0).

Effect on 2D Arrays: It flips the rows vertically, so the rows are reversed.

Effect on Higher-Dimensional Arrays: It reverses the first axis (axis 0), flipping the top-to-bottom order of the elements.

Example:
'''
import numpy as np

# 2D array
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

# Applying flipud
result = np.flipud(arr)
print(result)
'''
Output:

[[4 5 6]
 [1 2 3]]
In this case, the rows are flipped vertically.

Example with a 3D array:
'''
arr = np.array([[[1, 2, 3],
                 [4, 5, 6]],

                [[7, 8, 9],
                 [10, 11, 12]]])

# Applying flipud
result = np.flipud(arr)
print(result)

#Output:


#[[[ 7  8  9]
#  [10 11 12]]

# [[ 1  2  3]
#  [ 4  5  6]]]
#In this 3D example, the flipud() function reverses the elements along the first axis (axis 0), flipping the slices vertically.

#Key Differences:
#fliplr() flips arrays left to right along the horizontal axis (axis 1), reversing the order of columns in each row.
#flipud() flips arrays up to down along the vertical axis (axis 0), reversing the order of rows in the array.


[[3 2 1]
 [6 5 4]]
[[[ 4  5  6]
  [ 1  2  3]]

 [[10 11 12]
  [ 7  8  9]]]
[[4 5 6]
 [1 2 3]]
[[[ 7  8  9]
  [10 11 12]]

 [[ 1  2  3]
  [ 4  5  6]]]
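As the recorded outputs show, fliplr() and flipud() correspond to reversing axis 1 and axis 0 with slicing. The sketch below is an illustration (the names `vec` and `fliplr_1d_failed` are made up) that checks this on a 3D array and shows how each function treats a 1D input.

```python
import numpy as np

arr = np.arange(1, 13).reshape(2, 2, 3)

# fliplr reverses axis 1; flipud reverses axis 0.
print(np.array_equal(np.fliplr(arr), arr[:, ::-1]))   # True
print(np.array_equal(np.flipud(arr), arr[::-1]))      # True

vec = np.array([1, 2, 3])
print(np.flipud(vec))            # [3 2 1]

# fliplr needs an axis 1, so a 1D input is rejected with a ValueError.
try:
    np.fliplr(vec)
    fliplr_1d_failed = False
except ValueError:
    fliplr_1d_failed = True
print(fliplr_1d_failed)          # True
```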


In [22]:
'''
9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

The array_split() function in NumPy is used to split an array into multiple sub-arrays. One of its key features is its ability to handle uneven splits, unlike the split() function, which requires the array to be evenly divisible by the number of sections. This makes array_split() more flexible when dealing with arrays that cannot be divided into equal parts.

Functionality of array_split()
Purpose: Splits an array into multiple sub-arrays along a specified axis.
Parameters:
ary: The input array to split.
indices_or_sections: Either an integer (number of sections to split the array into) or a list of indices (defining specific split points).
axis: The axis along which to split the array (defaults to 0).
Uneven Splits Handling
When the array cannot be divided evenly, array_split() ensures that all sub-arrays are as equal in size as possible.
If an integer n is provided, the function splits the array into n parts. If the array's length l is not divisible by n, the first l % n sub-arrays get l // n + 1 elements and the remaining sub-arrays get l // n elements.
Example of Uneven Split:
'''
import numpy as np

# Array with 10 elements
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Splitting into 3 parts (uneven)
result = np.array_split(arr, 3)
for sub_arr in result:
    print(sub_arr)
'''
Output:

[1 2 3 4]
[5 6 7]
[ 8  9 10]


In this example, the array of size 10 is split into 3 parts. Since 10 is not evenly divisible by 3, the first sub-array contains 4 elements, while the remaining two sub-arrays contain 3 elements each.

Example of Even Split:
'''
arr = np.array([1, 2, 3, 4, 5, 6])

# Splitting into 2 even parts
result = np.array_split(arr, 2)
for sub_arr in result:
    print(sub_arr)

'''
Output:


[1 2 3]
[4 5 6]
Here, the array size (6) is divisible by 2, so the array_split() function divides it evenly into two parts.

Handling Uneven Split with a List of Indices
You can also provide a list of specific indices to define how the array should be split.

Example:
'''
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Splitting at specific indices
result = np.array_split(arr, [3, 6, 8])
for sub_arr in result:
    print(sub_arr)
'''
Output:


[1 2 3]
[4 5 6]
[7 8]
[ 9 10]
Here, the array is split at indices 3, 6, and 8, resulting in four sub-arrays of varying lengths.

Key Features of array_split():
Flexible Splitting: Works with both integer divisions and custom index lists.
Handles Uneven Splits: Automatically adjusts for uneven splits, ensuring that the sub-arrays have as equal a number of elements as possible.
Works on Different Axes: By changing the axis parameter, you can split the array along different dimensions (useful for multi-dimensional arrays).
'''
#

[1 2 3 4]
[5 6 7]
[ 8  9 10]
[1 2 3]
[4 5 6]
[1 2 3]
[4 5 6]
[7 8]
[ 9 10]
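
The uneven-split behaviour can be verified directly. The sketch below assumes the documented rule that, for an array of length l split into n sections, the first l % n sub-arrays have l // n + 1 elements and the rest have l // n. It also contrasts array_split() with split() and shows a split along axis=1. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(1, 11)   # 10 elements
n_sections = 3

parts = np.array_split(arr, n_sections)
sizes = [len(part) for part in parts]

# Sizes predicted by the rule: the first (l % n) parts get one extra element
extra = len(arr) % n_sections
base = len(arr) // n_sections
expected_sizes = [base + 1] * extra + [base] * (n_sections - extra)
print(sizes, expected_sizes)

# np.split rejects the same request because 10 is not divisible by 3
try:
    np.split(arr, n_sections)
    split_failed = False
except ValueError:
    split_failed = True
print("np.split raised ValueError:", split_failed)

# Splitting a 2D array column-wise with axis=1 (5 columns into 2 blocks)
matrix = np.arange(10).reshape(2, 5)
column_blocks = np.array_split(matrix, 2, axis=1)
print([block.shape for block in column_blocks])
```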


In [29]:
'''
10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?


Vectorization in NumPy
Vectorization refers to the process of performing operations on entire arrays (or vectors) in one step, rather than applying operations element-by-element through loops. It takes advantage of low-level optimizations, often utilizing CPU instructions (SIMD — Single Instruction, Multiple Data) to perform operations faster.

Purpose: Vectorization allows you to express operations in a concise and readable manner, making the code faster and more efficient. It removes the need for explicit Python loops, reducing overhead.
Key Benefit: Significant performance improvements for large-scale data processing.
Example of Vectorization:
Without vectorization (using a Python loop):

'''
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Squaring each element using a loop
squared = []
for i in arr:
    squared.append(i ** 2)
print(squared)

#With vectorization (using NumPy operations):


squared = arr ** 2  # Element-wise squaring
print(squared)
'''
Output:

[1, 4, 9, 16, 25]
[ 1  4  9 16 25]
The loop builds a Python list, while the vectorized version returns a NumPy array. The vectorized operation is not only more concise but also much faster, especially for large arrays.

Broadcasting in NumPy
Broadcasting is a powerful feature in NumPy that allows operations between arrays of different shapes by automatically expanding the smaller array along certain axes to match the dimensions of the larger array. This eliminates the need to manually reshape arrays and allows operations that wouldn’t otherwise be possible due to dimensional mismatch.

Purpose: To enable arithmetic operations (addition, multiplication, etc.) on arrays of different shapes by "broadcasting" the smaller array to the shape of the larger array.
Key Benefit: It allows you to write concise and efficient code without needing to explicitly reshape or tile arrays.
Rules of Broadcasting:
If the arrays differ in number of dimensions, the shape of the array with fewer dimensions is padded with ones on its left side until both shapes have the same length.
The shapes are then compared dimension by dimension: two dimensions are compatible when they are equal or when one of them is 1. If any pair of dimensions is incompatible, NumPy raises a ValueError.
Wherever an array has size 1 in a dimension, it is "stretched" (virtually repeated) to match the other array's size in that dimension.
Example of Broadcasting:
'''
import numpy as np

# 2D array (3x3)
arr1 = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# 1D array (length 3)
arr2 = np.array([1, 2, 3])

# Broadcasting: arr2 is "broadcast" to match the shape of arr1
result = arr1 + arr2
print(result)
'''
Output:


[[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]
In this example:

arr2 (shape: (3,)) is broadcast across the rows of arr1 (shape: (3, 3)), so it is added to every row without explicitly reshaping or tiling arr2.
Example with Scalars:
Broadcasting also works with scalars:

'''
arr = np.array([1, 2, 3, 4])

# Scalar 5 is broadcast to each element of the array
result = arr + 5
print(result)
'''
Output:


[6 7 8 9]
Here, the scalar 5 is broadcast across all elements of the array.

Performance Benefits of Vectorization and Broadcasting
Speed: Both vectorization and broadcasting allow NumPy to perform operations in C or Fortran behind the scenes, bypassing slower Python loops. This makes computations significantly faster, especially for large datasets.

Memory Efficiency: Broadcasting avoids creating temporary expanded copies of the smaller operand. It "virtually" stretches the array without physically replicating its data; only the result array is allocated.

Conciseness: Operations that require multiple lines of code using loops can often be expressed in a single line with vectorization or broadcasting, making the code more readable and maintainable.
'''
'''
'''


[1, 4, 9, 16, 25]
[ 1  4  9 16 25]
[[ 2  4  6]
 [ 5  7  9]
 [ 8 10 12]]
[6 7 8 9]


'\n'
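
The broadcasting rules and the "virtual stretching" described in this answer can be inspected without doing any arithmetic. This is a minimal sketch that assumes NumPy 1.20 or later (for np.broadcast_shapes); the variable names are illustrative.

```python
import numpy as np

# Shapes are aligned from the right; missing or size-1 dimensions are stretched
shape_a = np.broadcast_shapes((3, 3), (3,))
shape_b = np.broadcast_shapes((4, 1), (1, 5))
print(shape_a, shape_b)

# Dimensions that differ and are not 1 cannot be broadcast together
try:
    np.ones((3, 3)) + np.ones(4)
    compatible = True
except ValueError:
    compatible = False
print("(3, 3) with (4,) compatible:", compatible)

# broadcast_to returns a read-only view: the row is repeated with a stride of 0
row = np.array([1, 2, 3])
stretched = np.broadcast_to(row, (3, 3))
print(stretched.strides[0])               # 0 bytes between consecutive rows
print(np.shares_memory(row, stretched))   # the view reuses row's buffer
print(stretched.flags.writeable)
```

A stride of 0 along the first axis means every "row" of the stretched array points at the same three stored values, which is why no data is replicated.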

In [30]:
#Practical Questions

#1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns

#Here's how to create a 3x3 NumPy array with random integers between 1 and 100 and then interchange its rows and columns using NumPy in Python:

import numpy as np

# Create a 3x3 NumPy array with random integers between 1 and 100
array_3x3 = np.random.randint(1, 101, size=(3, 3))

# Interchange rows and columns (transpose the array)
interchanged_array = np.transpose(array_3x3)

# Alternatively, you can also use np.swapaxes
# interchanged_array = np.swapaxes(array_3x3, 0, 1)

print("Original Array:")
print(array_3x3)
print("\nInterchanged Array:")
print(interchanged_array)


Original Array:
[[87 67 23]
 [59 47 85]
 [35  2 82]]

Interchanged Array:
[[87 59 35]
 [67 47  2]
 [23 85 82]]
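
As a complement, the `.T` attribute gives the same result as np.transpose() for a 2D array, and the transpose is a view rather than a copy. This sketch uses a seeded np.random.default_rng() generator only so that reruns are reproducible; the seed and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)          # seeded only for reproducibility
matrix = rng.integers(1, 101, size=(3, 3))   # upper bound 101 is exclusive

transposed = matrix.T                        # shorthand for np.transpose(matrix)
print(np.array_equal(transposed, np.transpose(matrix)))

# The transpose is a view: no data is copied
print(np.shares_memory(matrix, transposed))

# Element (i, j) of the transpose is element (j, i) of the original
print(transposed[0, 2] == matrix[2, 0])
```

Because the result is a view, writing into `transposed` also changes `matrix`; call `.copy()` when an independent array is needed.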


In [31]:
#2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array

#Here’s how to generate a 1D NumPy array with 10 elements, reshape it into a 2x5 array, and then into a 5x2 array:

import numpy as np

# Generate a 1D NumPy array with 10 elements
array_1d = np.arange(10)

# Reshape it into a 2x5 array
array_2x5 = array_1d.reshape(2, 5)

# Reshape the 2x5 array into a 5x2 array
array_5x2 = array_2x5.reshape(5, 2)

print("1D Array:")
print(array_1d)
print("\nReshaped to 2x5 Array:")
print(array_2x5)
print("\nReshaped to 5x2 Array:")
print(array_5x2)


1D Array:
[0 1 2 3 4 5 6 7 8 9]

Reshaped to 2x5 Array:
[[0 1 2 3 4]
 [5 6 7 8 9]]

Reshaped to 5x2 Array:
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
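
Two details of reshape() are worth a short sketch: a dimension given as -1 is inferred from the array size, and reshaping a contiguous array returns a view of the same data. The variable names below are illustrative.

```python
import numpy as np

values = np.arange(10)

# -1 asks NumPy to infer the missing dimension from the total size
two_rows = values.reshape(2, -1)
five_rows = values.reshape(-1, 2)
print(two_rows.shape, five_rows.shape)

# Reshaping this contiguous array returns a view, so writes are shared
two_rows[0, 0] = 99
print(values[0])

# A shape whose product differs from the array size is rejected
try:
    values.reshape(3, 4)
    reshaped = True
except ValueError:
    reshaped = False
print("reshape(3, 4) succeeded:", reshaped)
```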


In [34]:
#3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array

#Create a 4x4 NumPy array with random float values, then add a border of zeros around it with np.pad to obtain a 6x6 array:

import numpy as np

# Create a 4x4 array with random float values
array_4x4 = np.random.rand(4, 4)

# Add a border of zeros around the array
array_6x6 = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)

print("4x4 Array with Random Float Values:")
print(array_4x4)

print("\n6x6 Array with Border of Zeros:")
print(array_6x6)


#Explanation:
#Creating the 4x4 Array: np.random.rand(4, 4) generates a 4x4 array with random float values drawn uniformly from the half-open interval [0, 1).
#Padding with Zeros: np.pad is used to add a border of zeros around the original array. The pad_width argument specifies how many zeros to add around each side, and mode='constant' with constant_values=0 indicates that the padding value should be zero.


4x4 Array with Random Float Values:
[[0.18700302 0.6759908  0.42353422 0.99053495]
 [0.51791678 0.26271557 0.01726268 0.24790173]
 [0.23358641 0.77105929 0.08237672 0.5828363 ]
 [0.64709714 0.12140001 0.07957015 0.10635872]]

6x6 Array with Border of Zeros:
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.18700302 0.6759908  0.42353422 0.99053495 0.        ]
 [0.         0.51791678 0.26271557 0.01726268 0.24790173 0.        ]
 [0.         0.23358641 0.77105929 0.08237672 0.5828363  0.        ]
 [0.         0.64709714 0.12140001 0.07957015 0.10635872 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
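
An alternative to np.pad is to allocate the 6x6 array of zeros first and copy the 4x4 values into its interior with slicing. This is a sketch of that approach, not a replacement for the code above; the seeded generator and the names `inner` and `bordered` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded only for reproducibility
inner = rng.random((4, 4))            # floats in [0, 1)

# Allocate the bordered array, then fill everything except the outer ring
bordered = np.zeros((6, 6))
bordered[1:-1, 1:-1] = inner

# np.pad pads with constant zeros by default, so both results match
matches_pad = np.array_equal(bordered, np.pad(inner, pad_width=1))
print(bordered.shape, matches_pad)
```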


In [35]:
#4. Using NumPy, create an array of integers from 10 to 60 with a step of 5

#Use NumPy's np.arange() function to create an array of integers from 10 to 60 with a step of 5:


import numpy as np

# Create an array of integers from 10 to 60 with a step of 5
array = np.arange(10, 61, 5)

print("Array of integers from 10 to 60 with a step of 5:")
print(array)

#Explanation:
#np.arange(start, stop, step): This function generates values starting from start (10) up to (but not including) stop (61) with the specified step (5).
#The output will include integers: 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, and 60.

Array of integers from 10 to 60 with a step of 5:
[10 15 20 25 30 35 40 45 50 55 60]
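
The exclusive stop value is the usual source of off-by-one errors with np.arange(). The sketch below contrasts it with np.linspace(), which takes the number of points instead of a step and includes the endpoint by default. The variable names are illustrative.

```python
import numpy as np

stepped = np.arange(10, 61, 5)

# 11 evenly spaced points from 10 to 60, endpoint included by default
spaced = np.linspace(10, 60, num=11, dtype=int)
print(np.array_equal(stepped, spaced))

# With stop=60 the value 60 itself is excluded
without_endpoint = np.arange(10, 60, 5)
print(without_endpoint)
```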


In [36]:
#5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element

import numpy as np

# Create a NumPy array of strings
string_array = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
uppercase_array = np.char.upper(string_array)
lowercase_array = np.char.lower(string_array)
titlecase_array = np.char.title(string_array)
capitalize_array = np.char.capitalize(string_array)

# Print the results
print("Original Array:")
print(string_array)

print("\nUppercase:")
print(uppercase_array)

print("\nLowercase:")
print(lowercase_array)

print("\nTitle Case:")
print(titlecase_array)

print("\nCapitalize:")
print(capitalize_array)

#Explanation:
#Creating the Array: np.array(['python', 'numpy', 'pandas']) creates a NumPy array of strings.
#Case Transformations:
#np.char.upper() converts all characters in the array to uppercase.
#np.char.lower() converts all characters to lowercase.
#np.char.title() converts the first character of each word to uppercase and the rest to lowercase.
#np.char.capitalize() converts the first character of each string to uppercase and the rest of the string to lowercase.

Original Array:
['python' 'numpy' 'pandas']

Uppercase:
['PYTHON' 'NUMPY' 'PANDAS']

Lowercase:
['python' 'numpy' 'pandas']

Title Case:
['Python' 'Numpy' 'Pandas']

Capitalize:
['Python' 'Numpy' 'Pandas']
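
With single-word strings, title case and capitalize give identical results, as the output above shows. The difference only appears with multi-word or mixed-case input. The sketch below uses two made-up phrases (not from the exercise) and also shows np.char.swapcase().

```python
import numpy as np

# Hypothetical multi-word, mixed-case inputs chosen to expose the difference
phrases = np.array(['hello world', 'nUMPY arrays'])

titled = np.char.title(phrases)            # every word starts uppercase
capitalized = np.char.capitalize(phrases)  # only the first character is uppercase
swapped = np.char.swapcase(phrases)        # upper <-> lower for each character

print(titled)
print(capitalized)
print(swapped)
```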


In [37]:
#6. Generate a NumPy array of words. Insert a space between each character of every word in the array

import numpy as np

# Create a NumPy array of words
words_array = np.array(['python', 'numpy', 'pandas'])

# Insert a space between each character of every word
spaced_words_array = np.array([' '.join(word) for word in words_array])

# Print the results
print("Original Array:")
print(words_array)

print("\nArray with Spaces Between Characters:")
print(spaced_words_array)

#Explanation:
#Creating the Array: np.array(['python', 'numpy', 'pandas']) creates a NumPy array of words.
#Inserting Spaces:
#The list comprehension [' '.join(word) for word in words_array] iterates over each word in the array and joins its characters with a space in between.
#The result is converted back to a NumPy array.

Original Array:
['python' 'numpy' 'pandas']

Array with Spaces Between Characters:
['p y t h o n' 'n u m p y' 'p a n d a s']
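
The same result can be obtained without a Python-level list comprehension by using np.char.join(), which joins the characters of each element with the given separator. This is a minimal sketch comparing both approaches; the variable names are illustrative.

```python
import numpy as np

words = np.array(['python', 'numpy', 'pandas'])

# List-comprehension version, as in the cell above
spaced_loop = np.array([' '.join(word) for word in words])

# np.char.join applies the separator between the characters of each element
spaced_char = np.char.join(' ', words)

print(spaced_char)
print(spaced_loop.tolist() == spaced_char.tolist())
```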


In [38]:
#7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

import numpy as np

# Create two 2D NumPy arrays
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

# Perform element-wise operations
addition = array1 + array2
subtraction = array1 - array2
multiplication = array1 * array2
division = array1 / array2

# Print the results
print("Array 1:")
print(array1)

print("\nArray 2:")
print(array2)

print("\nElement-wise Addition:")
print(addition)

print("\nElement-wise Subtraction:")
print(subtraction)

print("\nElement-wise Multiplication:")
print(multiplication)

print("\nElement-wise Division:")
print(division)

#Explanation:
#Creating the Arrays:

#array1 is initialized as a 2D array with values [[1, 2, 3], [4, 5, 6]].
#array2 is initialized as a 2D array with values [[7, 8, 9], [10, 11, 12]].
#Element-wise Operations:

#Addition: array1 + array2
#Subtraction: array1 - array2
#Multiplication: array1 * array2
#Division: array1 / array2 (true division, so the result is a float array even for integer inputs)

Array 1:
[[1 2 3]
 [4 5 6]]

Array 2:
[[ 7  8  9]
 [10 11 12]]

Element-wise Addition:
[[ 8 10 12]
 [14 16 18]]

Element-wise Subtraction:
[[-6 -6 -6]
 [-6 -6 -6]]

Element-wise Multiplication:
[[ 7 16 27]
 [40 55 72]]

Element-wise Division:
[[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]
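
Each arithmetic operator corresponds to a NumPy function (np.add, np.subtract, np.multiply, np.divide), and the division operator always produces floats. The sketch below checks these equivalences and contrasts true division with floor division and remainder; the short names `a` and `b` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[7, 8, 9], [10, 11, 12]])

# Operators are shorthand for the corresponding universal functions
print(np.array_equal(a + b, np.add(a, b)))
print(np.array_equal(a - b, np.subtract(a, b)))
print(np.array_equal(a * b, np.multiply(a, b)))
print(np.array_equal(a / b, np.divide(a, b)))

# True division yields floats; floor division and remainder keep integers
true_div = a / b
floor_div = b // a
remainder = b % a
print(true_div.dtype, floor_div.dtype)
print(floor_div)
print(remainder)
```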


In [40]:
#8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements

import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.eye(5)

# Extract the diagonal elements
diagonal_elements = np.diagonal(identity_matrix)

# Print the results
print("5x5 Identity Matrix:")
print(identity_matrix)

print("\nDiagonal Elements:")
print(diagonal_elements)

#Explanation:
#Creating the Identity Matrix: np.eye(5) generates a 5x5 identity matrix, where all the diagonal elements are 1, and all other elements are 0.
#Extracting Diagonal Elements: np.diagonal(identity_matrix) retrieves the diagonal elements of the identity matrix.

5x5 Identity Matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal Elements:
[1. 1. 1. 1. 1.]
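
A few related functions are often used alongside np.eye() and np.diagonal(). The sketch below shows np.identity() as an equivalent constructor for square matrices, np.diag() for both extracting and building a diagonal, and the `offset` parameter of np.diagonal(). The variable names are illustrative.

```python
import numpy as np

eye_matrix = np.eye(5)

# np.identity(n) builds the same square identity matrix
print(np.array_equal(eye_matrix, np.identity(5)))

# np.diag extracts the diagonal from a 2D array and builds a matrix from a 1D array
diag_values = np.diag(eye_matrix)
rebuilt = np.diag(diag_values)
print(np.array_equal(rebuilt, eye_matrix))

# The trace is the sum of the main diagonal
print(np.trace(eye_matrix))

# offset=1 selects the diagonal just above the main one (all zeros here)
upper_diagonal = np.diagonal(eye_matrix, offset=1)
print(upper_diagonal)
```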


In [41]:
#9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

import numpy as np

# Function to check if a number is prime
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

# Generate a NumPy array of 100 random integers between 0 and 1000
random_integers = np.random.randint(0, 1001, size=100)

# Find all prime numbers in the array
prime_numbers = np.array([num for num in random_integers if is_prime(num)])

# Print the results
print("Array of Random Integers:")
print(random_integers)

print("\nPrime Numbers in the Array:")
print(prime_numbers)

#Explanation:
#Prime Checking Function: The is_prime(n) function checks if a number is prime by testing divisibility from 2 up to the square root of n.
#Generating Random Integers: np.random.randint(0, 1001, size=100) generates an array of 100 random integers between 0 and 1000 inclusive (the upper bound 1001 is exclusive).
#Finding Prime Numbers: A list comprehension keeps only the prime numbers from the random integers, and the result is converted to a NumPy array.

Array of Random Integers:
[918 750 398 435 868 872   8 558 656  35 445 538 556 120 752 924 640 325
  27 447 267 427 533 570 445 846 650 572 396 682 747 281 366 958 652 983
 124 703 420 517 703 934 992 965 637 262 170  41 685 992 141  36 314 643
 949 838 160 549 293 922 193 447 999 874 494 827 147 250 864 457 917 541
 846  39 300 508 335 344 716 793 156 186 915 428 163 528 275 574 178 369
 934 593 737 494 704 187 352  19 959 848]

Prime Numbers in the Array:
[281 983  41 643 293 193 827 457 541 163 593  19]
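
Because the values are bounded by 1000, an alternative is to precompute a boolean prime mask with a Sieve of Eratosthenes and use it to index the array, which avoids calling a Python function per element. This swaps in a different technique from the trial division used above; the helper name `primes_mask` and the seeded generator are assumptions for illustration.

```python
import numpy as np

def primes_mask(limit):
    """Return a boolean array where mask[k] is True when k is prime (0 <= k <= limit)."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                      # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if mask[p]:
            mask[p * p::p] = False        # cross out multiples of p
    return mask

rng = np.random.default_rng(seed=0)       # seeded only for reproducibility
values = rng.integers(0, 1001, size=100)  # upper bound 1001 is exclusive

mask = primes_mask(1000)
primes_found = values[mask[values]]       # keep entries whose value is prime

print(primes_found)
```

The mask is built once and can be reused for any number of arrays whose values stay within the same limit.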


In [46]:
#10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

import numpy as np

# Create a NumPy array representing daily temperatures for a month (30 days)
daily_temperatures = np.random.uniform(15, 35, size=30)

# Calculate the weekly averages (4 complete weeks + remaining days)
num_weeks = len(daily_temperatures) // 7  # Calculate the number of complete weeks
weekly_averages = np.mean(daily_temperatures[:num_weeks * 7].reshape(-1, 7), axis=1)

# Handle the remaining days
remaining_days = daily_temperatures[num_weeks * 7:]
if len(remaining_days) > 0:
    remaining_average = np.mean(remaining_days)
    weekly_averages = np.append(weekly_averages, remaining_average)

# Print the results
print("Daily Temperatures for the Month:")
print(daily_temperatures)

print("\nWeekly Averages:")
print(weekly_averages)

#Explanation:
#Import NumPy: The numpy library is imported for numerical operations.

#Generate Daily Temperatures: A NumPy array of 30 random float values (representing daily temperatures) is created, ranging from 15 to 35 degrees Celsius.

#Calculate Complete Weeks: The number of complete weeks in 30 days is determined, which is 4 weeks (28 days).

#Compute Weekly Averages:

#The first 28 days are reshaped into a 4x7 array, and the average temperature for each week is calculated.
#Handle Remaining Days: The last 2 days (if any) are averaged and added to the weekly averages.

#Output: The daily temperatures and the calculated weekly averages are printed.

Daily Temperatures for the Month:
[25.07667419 19.56533863 18.62627651 22.92392736 20.36436532 15.93122588
 17.53829383 18.08295521 28.00048054 25.56401485 25.32510183 30.75490658
 33.29895463 16.30099554 30.43326811 20.93137881 18.55265291 34.39866542
 29.72564387 18.76013114 19.41593091 33.08692703 34.7439123  23.52182193
 32.86588185 24.72455021 17.90971794 16.76173027 29.24923265 16.42778269]

Weekly Averages:
[20.00372882 25.33248703 24.60252445 26.23064879 22.83850767]
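
The same weekly grouping can be expressed with np.array_split() by passing the split points 7, 14, 21 and 28, which handles the leftover days without a separate branch. This is a sketch of that alternative; the seeded generator and the variable names are assumptions, so the numbers differ from the output above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)        # seeded only for reproducibility
temps = rng.uniform(15, 35, size=30)

# Split points at every 7th index: [7, 14, 21, 28]
split_points = np.arange(7, len(temps), 7)
weeks = np.array_split(temps, split_points)

# One mean per chunk; the last chunk holds the 2 remaining days
week_lengths = [len(week) for week in weeks]
weekly_means = np.array([week.mean() for week in weeks])

print(week_lengths)
print(weekly_means)
```

Note that the final value averages only two days, so it is not directly comparable to the four full-week averages.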
