### NumPy Assignment ###

In [None]:
#Q1 Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
#  enhance Python's capabilities for numerical operations?

'''
NumPy is a fundamental library for scientific computing and data analysis in Python. Its purpose is to provide support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays. Here’s how NumPy enhances Python's capabilities for numerical operations:

Purpose of NumPy
Efficient Array Handling: NumPy introduces the ndarray (n-dimensional array) object, which is a fast and flexible container for large datasets. It allows for efficient storage and manipulation of numerical data.
Mathematical Functions: It provides a wide range of mathematical functions for operations on arrays, including linear algebra operations, statistical functions, and Fourier transforms.
Interoperability: NumPy is compatible with a variety of other libraries in the scientific Python ecosystem (like SciPy, pandas, and scikit-learn), enabling smooth integration and data exchange.

Advantages of NumPy
Performance: NumPy's core is implemented in C, and its linear algebra routines call optimized BLAS/LAPACK libraries, so array operations are typically much faster than equivalent loops over Python lists. This speed comes from contiguous, typed storage and from vectorized operations, which apply an operation to an entire array in compiled code.
Memory Efficiency: NumPy arrays are more compact than Python lists. They use less memory and allow for efficient data storage and retrieval, which is crucial when working with large datasets.
Convenient Array Operations: With NumPy, you can perform complex mathematical operations on arrays with simple and concise syntax. For example, element-wise operations, broadcasting, and aggregation functions are straightforward and intuitive.
Broadcasting: This feature lets NumPy combine arrays of different but compatible shapes without explicit loops, by treating a dimension of size 1 as if it were repeated to match the other array. It simplifies code and avoids making copies of the smaller array.
Standardized Data Types: NumPy supports a wide range of data types, including integers, floats, and complex numbers. This standardization ensures consistent behavior and accurate calculations across different operations.
Integration with C/C++/Fortran: NumPy provides tools for interfacing with code written in lower-level languages. This allows for the development of performance-critical components in C or Fortran while leveraging NumPy for higher-level operations.
Rich Ecosystem: NumPy’s functionality forms the foundation for many other scientific and analytical libraries in Python. Libraries such as pandas (for data manipulation), SciPy (for additional scientific computations), and scikit-learn (for machine learning) all rely on NumPy arrays for their data structures.

How NumPy Enhances Python’s Capabilities
Vectorization: By replacing explicit loops with vectorized operations, NumPy allows for more readable and efficient code. For instance, adding two arrays element-wise is done with a simple + operator instead of manually iterating through elements.
Array Manipulation: NumPy offers robust functions for reshaping, slicing, and aggregating arrays. These capabilities simplify data preprocessing and manipulation tasks in scientific computing.
Mathematical Computations: NumPy’s extensive library of mathematical functions, such as trigonometric functions, statistical functions, and linear algebra routines, provides a powerful toolkit for numerical analysis.

Overall, NumPy significantly extends Python’s capabilities for numerical computing by providing high-performance array operations, memory-efficient data structures, and a rich set of mathematical functions. Its design allows for concise, efficient, and readable code, making it an essential tool for scientific computing and data analysis.

'''
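
The memory and vectorization claims above can be checked with a small sketch. The array size `n` and the way the list's footprint is estimated (container plus each int object) are assumptions chosen for illustration; exact byte counts for the list vary by Python version.

```python
import sys
import numpy as np

n = 1_000  # illustrative size, not from the original answer
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Bytes used by the array's data buffer: n elements * 8 bytes each.
array_bytes = np_arr.nbytes

# A list stores pointers to separate int objects, so count the list
# container plus every int object it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

print("NumPy data buffer:", array_bytes, "bytes")
print("Python list (container + int objects):", list_bytes, "bytes")

# Vectorized arithmetic: one expression instead of an explicit loop.
doubled_loop = [2 * x for x in py_list]
doubled_vec = 2 * np_arr
print("Same result:", np.array_equal(doubled_vec, doubled_loop))
```

The array buffer is exactly `n * 8` bytes for `int64`, while the list total is larger because every element is a full Python object.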

In [None]:
#Q2 Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the
# other?

'''
np.mean()
Purpose:
Computes the arithmetic mean (average) of array elements along a specified axis.
Syntax : - numpy.mean(a, axis=None, dtype=None, out=None, keepdims=False)
Parameters:
a: Input array.
axis: Axis or axes along which the mean is computed. Default is None, which means the mean is computed over all elements.
dtype: Data type to use for the computation. Default is None.
out: Alternative output array in which to place the result.
keepdims: If True, the reduced axes are retained in the result as dimensions with size one.

np.mean() calculates the sum of the array elements and divides by the number of elements, with no additional weighting or special considerations.

np.average()
Purpose:
Computes the weighted average of array elements, with an optional weighting factor for each element.
Syntax :- numpy.average(a, axis=None, weights=None, returned=False)
Parameters:
a: Input array.
axis: Axis or axes along which the average is computed. Default is None.
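weights: Array of weights associated with the values in a. It must either have the same shape as a or, when axis is given, be 1-D with length equal to the size of a along that axis. If provided, each element contributes to the average in proportion to its weight.
returned: If True, the function returns a tuple (average, sum_of_weights) instead of only the average.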

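np.average() calculates the weighted average, where the weights array specifies the importance or frequency of each element. If weights is None, every element is weighted equally and the result equals np.mean().

When to use which: use np.mean() for a plain arithmetic mean, particularly when you need its dtype or out parameters. Use np.average() when elements should contribute unequally (for example, grades weighted by credit hours) or when you also need the sum of the weights via returned=True.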

'''
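
A short sketch of the difference described above. The `scores` and `weights` values are made up for illustration.

```python
import numpy as np

scores = np.array([80.0, 90.0, 100.0])
weights = np.array([1.0, 1.0, 2.0])  # hypothetical: the last score counts double

plain_mean = np.mean(scores)                     # (80 + 90 + 100) / 3
unweighted = np.average(scores)                  # no weights: same as np.mean
weighted = np.average(scores, weights=weights)   # (80 + 90 + 2*100) / 4

# returned=True also gives back the sum of the weights.
weighted_again, weight_sum = np.average(scores, weights=weights, returned=True)

print("np.mean:", plain_mean)
print("np.average without weights:", unweighted)
print("np.average with weights:", weighted)
print("sum of weights:", weight_sum)
```

With these inputs the plain mean is 90.0 and the weighted average is 92.5, because the highest score is counted twice.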

In [None]:
#Q3 Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

'''
1D Array: Reversal is done using slicing with [::-1].
2D Array:
Reversing Rows (Axis 0): Use [::-1, :] to reverse the rows while keeping columns unchanged.
Reversing Columns (Axis 1): Use [:, ::-1] to reverse the columns while keeping rows unchanged.
Examples are as follows:
'''

In [7]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

reversed_arr = arr[::-1]

print("Original array:", arr)
print("Reversed array:", reversed_arr)

Original array: [1 2 3 4 5]
Reversed array: [5 4 3 2 1]


In [8]:
import numpy as np


arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

reversed_axis0 = arr_2d[::-1, :]

reversed_axis1 = arr_2d[:, ::-1]

print("Original array:\n", arr_2d)
print("Reversed along axis 0 (rows):\n", reversed_axis0)
print("Reversed along axis 1 (columns):\n", reversed_axis1)

Original array:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Reversed along axis 0 (rows):
 [[7 8 9]
 [4 5 6]
 [1 2 3]]
Reversed along axis 1 (columns):
 [[3 2 1]
 [6 5 4]
 [9 8 7]]
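
Besides slicing, `np.flip()` reverses an array along a chosen axis. The sketch below reuses the same 3x3 values and compares `np.flip()` with the slicing approach; it is an additional option, not a replacement for the method shown above.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

flip_rows = np.flip(arr_2d, axis=0)   # same as arr_2d[::-1, :]
flip_cols = np.flip(arr_2d, axis=1)   # same as arr_2d[:, ::-1]
flip_both = np.flip(arr_2d)           # axis=None reverses every axis

print("Flipped along axis 0:\n", flip_rows)
print("Flipped along axis 1:\n", flip_cols)
print("Flipped along both axes:\n", flip_both)

# Like slicing, np.flip returns a view, so no data is copied.
print("Shares memory with original:", np.shares_memory(flip_rows, arr_2d))
```

Because the result is a view, writing into `flip_rows` would also change `arr_2d`; call `.copy()` when an independent array is needed.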


In [None]:
#Q5 How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
#   in memory management and performance.
'''
To determine the data type of elements in a NumPy array, use the .dtype attribute of the array. For example:
import numpy as np
arr = np.array([1, 2, 3])
print(arr.dtype)

Importance of Data Types
Memory Management: Different data types consume different amounts of memory. Choosing the right type (e.g., int8 vs. int64) helps optimize memory usage, which is crucial for handling large datasets efficiently.
Performance: Data types affect computational speed. Smaller types (e.g., float32) can lead to faster operations compared to larger types (e.g., float64). This choice can improve performance by enabling more efficient data processing and leveraging optimized low-level operations.

Selecting the appropriate data type balances memory usage and performance needs in numerical computations.
'''

In [9]:
import numpy as np

arr = np.array([1, 2, 3])
print(arr.dtype) 

int64
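
To make the memory point concrete, the sketch below stores the same 100 values as `int64` and as `int8` and compares `itemsize` and `nbytes`. It also shows the trade-off of a small type: `int8` only holds -128 to 127, and array arithmetic wraps around silently when that range is exceeded. The values used are illustrative.

```python
import numpy as np

values = np.arange(100)

as_int64 = values.astype(np.int64)
as_int8 = values.astype(np.int8)   # safe here: all values fit in -128..127

print("int64:", as_int64.itemsize, "bytes per element,", as_int64.nbytes, "bytes total")
print("int8: ", as_int8.itemsize, "byte per element,", as_int8.nbytes, "bytes total")

# The representable range of a type is available through np.iinfo.
int8_max = np.iinfo(np.int8).max
print("Largest int8 value:", int8_max)

# Integer array arithmetic wraps around on overflow instead of raising.
wrapped = np.array([127], dtype=np.int8) + np.array([1], dtype=np.int8)
print("127 + 1 as int8:", wrapped)
```

The `int8` copy uses one eighth of the memory, but the last line shows why the smaller type is only appropriate when the data is known to stay within its range.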


In [None]:
#Q6 Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.
'''
1. Memory Efficiency
Fixed Size and Type: NumPy arrays store data in contiguous blocks of memory with a fixed size and type for each element. This fixed size and type allow NumPy to use less overhead and access memory more efficiently compared to Python lists, which store objects of varying sizes and types and have additional overhead for each element.
Compact Storage: NumPy arrays use a more compact memory layout, which reduces the memory footprint compared to Python lists that store references to Python objects, each of which has additional overhead.

2. Performance Optimization
Vectorization: NumPy employs vectorized operations, which apply an operation to an entire array at once. The loop over elements runs inside compiled code rather than in the Python interpreter. In contrast, Python lists require an explicit loop or comprehension, where every iteration pays the cost of interpreted bytecode and dynamic type checks.
Broadcasting: NumPy supports broadcasting, a technique that allows operations on arrays of different shapes without requiring explicit looping or copying. This enables efficient computation by applying operations across arrays in a memory-efficient manner.

3. Efficient Computation
Low-Level Implementation: NumPy's array operations are implemented in compiled C code, which runs much faster than an equivalent loop executed by the Python interpreter. Because elements are stored contiguously with a single type, these routines can also use cache-friendly memory access patterns.
Pre-compiled Functions: Many mathematical and statistical functions in NumPy are pre-compiled and optimized, providing faster execution compared to equivalent Python code that would require manual implementation of algorithms.

4. Parallelization
Underlying Libraries: NumPy often uses underlying libraries like BLAS (Basic Linear Algebra Subprograms) and LAPACK (Linear Algebra PACKage) that are optimized for performance and may take advantage of multi-core processors to parallelize computations.
'''


In [10]:
#Python List operation
import time

# Large-scale operation on a Python list
data = list(range(1000000))

start_time = time.perf_counter()  # perf_counter() is the clock intended for timing short intervals
squared = [x**2 for x in data]  # Using a list comprehension
end_time = time.perf_counter()

print("Python list operation time:", end_time - start_time)

Python list operation time: 0.3497123718261719


In [11]:
#NumPy array operation
import numpy as np
import time

# Large-scale operation on a NumPy array
data_np = np.arange(1000000)

start_time = time.perf_counter()  # perf_counter() is the clock intended for timing short intervals
squared_np = data_np**2  # Using a vectorized operation
end_time = time.perf_counter()

print("NumPy array operation time:", end_time - start_time)

NumPy array operation time: 0.0032553672790527344
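
A single timed run can be noisy, so a more repeatable comparison can be sketched with the standard `timeit` module, which runs each statement several times. The size `n`, the repeat counts, and the explicit `dtype=np.int64` are assumptions made here; the explicit dtype guards against overflow on platforms where the default integer is 32 bits. Actual timings depend on the machine, so none are quoted.

```python
import timeit
import numpy as np

n = 100_000  # illustrative size
data = list(range(n))
data_np = np.arange(n, dtype=np.int64)  # 64-bit so the squares cannot overflow

# Take the best of several repeats to reduce the effect of background noise.
list_time = min(timeit.repeat(lambda: [x**2 for x in data], number=5, repeat=3))
numpy_time = min(timeit.repeat(lambda: data_np**2, number=5, repeat=3))

print("List comprehension (best of 3, 5 runs each):", list_time)
print("NumPy vectorized   (best of 3, 5 runs each):", numpy_time)

# Both approaches must produce the same numbers for the comparison to be fair.
same_result = np.array_equal(np.array([x**2 for x in data]), data_np**2)
print("Results match:", same_result)
```

Checking that both versions give identical results is worth doing before comparing speed, since a faster but wrong answer (for example from integer overflow) is not a valid benchmark.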


In [None]:
#Q7 Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

'''
In NumPy, vstack() and hstack() are functions used for stacking arrays along different axes. 
They are useful for combining arrays into a larger array in a vertical or horizontal direction. 
numpy.vstack()
Purpose: Stack arrays in sequence vertically (row-wise).
Operation: Combines arrays along the vertical axis (axis 0). The arrays need to have the same shape along all but the first axis. 1D arrays of length N are first treated as rows of shape (1, N).

numpy.hstack()
Purpose: Stack arrays in sequence horizontally (column-wise).
Operation: Combines arrays along the horizontal axis (axis 1). The arrays need to have the same shape along all but the second axis. For 1D inputs, hstack() concatenates along the first (and only) axis instead.

Examples of both are as follows :
'''


In [14]:
import numpy as np

# Define two 2D arrays
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([[7, 8, 9], [10, 11, 12]])

# Stack arrays vertically and horizontally
result_vstack = np.vstack((arr1, arr2))
result_hstack = np.hstack((arr1, arr2))

print("Array 1:\n", arr1)
print("Array 2:\n", arr2)
print("Stacked Vertically using vstack :\n", result_vstack)
print("Stacked Horizontally using hstack :\n", result_hstack)

Array 1:
 [[1 2 3]
 [4 5 6]]
Array 2:
 [[ 7  8  9]
 [10 11 12]]
Stacked Vertically using vstack :
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
Stacked Horizontally using hstack :
 [[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]
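
The two functions also behave differently on 1D input, which the 2D example does not show. A minimal sketch with made-up arrays `a` and `b`:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked_v = np.vstack((a, b))   # each 1D array becomes a row
stacked_h = np.hstack((a, b))   # 1D arrays are joined end to end

print("vstack shape:", stacked_v.shape)
print(stacked_v)
print("hstack shape:", stacked_h.shape)
print(stacked_h)

# For 2D inputs both functions are shorthand for np.concatenate.
m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])
same_v = np.array_equal(np.vstack((m1, m2)), np.concatenate((m1, m2), axis=0))
same_h = np.array_equal(np.hstack((m1, m2)), np.concatenate((m1, m2), axis=1))
print("vstack == concatenate(axis=0):", same_v)
print("hstack == concatenate(axis=1):", same_h)
```

Here `vstack` yields a (2, 3) array while `hstack` yields a flat array of length 6.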


In [None]:
#Q8  Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.
'''
The fliplr() and flipud() methods in NumPy are used to reverse the order of elements in arrays along specific axes.

numpy.fliplr()
Purpose: Flip (reverse) the elements of an array along the left-right (horizontal) axis.
Operation: Reverses the order of columns in each row.

Effect on Various Dimensions:
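1D Arrays: fliplr() is not applicable to 1D arrays because it flips along axis 1, which a 1D array does not have. It raises a ValueError if applied to a 1D array.
2D Arrays: Reverses the columns within each row. This is effectively a horizontal flip.
Higher-Dimensional Arrays: Reverses axis 1 and leaves all other axes unchanged.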

numpy.flipud()
Purpose: Flip (reverse) the elements of an array along the up-down (vertical) axis.
Operation: Reverses the order of rows.

Effect on Various Dimensions:

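1D Arrays: flipud() works on 1D arrays and simply reverses the order of the elements, because axis 0 is the only axis. It requires an array with at least one dimension.
2D Arrays: Reverses the rows of the array. This is effectively a vertical flip.
Higher-Dimensional Arrays: Reverses axis 0 and leaves all other axes unchanged.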

Examples are as follows:
'''

In [16]:
import numpy as np

# Define a 2D array
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

result_fliplr = np.fliplr(arr_2d)
result_flipud = np.flipud(arr_2d)

print("Original array:\n", arr_2d)
print("After fliplr:\n", result_fliplr)
print("After flipud:\n", result_flipud)

Original array:
 [[1 2 3]
 [4 5 6]]
After fliplr:
 [[3 2 1]
 [6 5 4]]
After flipud:
 [[4 5 6]
 [1 2 3]]
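
The behaviour on other dimensions can be checked directly. In this sketch, `flipud()` reverses a 1D array, `fliplr()` rejects it with a ValueError, and on a 3D array the two functions act on axis 0 and axis 1 respectively. The arrays `arr_1d` and `cube` are illustrative.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4])

# flipud needs at least one dimension, so a 1D array is simply reversed.
flipped_1d = np.flipud(arr_1d)
print("flipud on 1D:", flipped_1d)

# fliplr needs at least two dimensions and raises ValueError otherwise.
fliplr_failed = False
try:
    np.fliplr(arr_1d)
except ValueError as exc:
    fliplr_failed = True
    print("fliplr on 1D raised ValueError:", exc)

# On a 3D array, flipud reverses axis 0 and fliplr reverses axis 1.
cube = np.arange(8).reshape(2, 2, 2)
ud_matches = np.array_equal(np.flipud(cube), cube[::-1, :, :])
lr_matches = np.array_equal(np.fliplr(cube), cube[:, ::-1, :])
print("flipud(cube) == cube[::-1, :, :]:", ud_matches)
print("fliplr(cube) == cube[:, ::-1, :]:", lr_matches)
```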


In [None]:
#Q9 Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

'''
The numpy.array_split() method is a versatile function for dividing an array into multiple sub-arrays. It is particularly useful when you need to split large arrays into smaller chunks for processing or analysis. 

Functionality of array_split()
Purpose:
To split an array into multiple sub-arrays along a specified axis.
Syntax: numpy.array_split(ary, indices_or_sections, axis=0)

Handling Uneven Splits
When the length of the array along the specified axis is not evenly divisible by the requested number of sections, array_split() still succeeds, whereas numpy.split() raises an error:
Integer Argument:
When you provide an integer n to indices_or_sections, array_split() divides the array into n sub-arrays. For an axis of length l that does not divide evenly, the first l % n sub-arrays have size l//n + 1 and the remaining ones have size l//n, so the sizes differ by at most one element.
1D Array of Indices:
If you provide a list of indices, array_split() will split the array at those indices. The resulting sub-arrays will be determined by these index positions.
'''

In [9]:
import numpy as np

arr = np.arange(10)  # array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

result = np.array_split(arr, 3)
result1 = np.array_split(arr, [2,4, 6])

print("Original array:", arr)
print("Split into 3 sections:", result)
print("Split at indices [2, 4, 6]:", result1)

Original array: [0 1 2 3 4 5 6 7 8 9]
Split into 3 sections: [array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8, 9])]
Split at indices [2, 4, 6]: [array([0, 1]), array([2, 3]), array([4, 5]), array([6, 7, 8, 9])]
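
To see how uneven splits are sized, the sketch below splits 10 elements into 4 parts and contrasts `np.array_split()` with `np.split()`, which insists on equal sections. The choice of 10 elements and 4 sections is only for illustration.

```python
import numpy as np

arr = np.arange(10)

# 10 elements into 4 parts: 10 % 4 = 2 parts get an extra element.
parts = np.array_split(arr, 4)
sizes = [len(p) for p in parts]
print("Sub-array sizes from array_split:", sizes)

# np.split requires an exact division and raises ValueError otherwise.
split_failed = False
try:
    np.split(arr, 4)
except ValueError as exc:
    split_failed = True
    print("np.split raised ValueError:", exc)
```

The larger sub-arrays come first, so the sizes here are 3, 3, 2, 2.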


In [None]:
#Q10 Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?
'''
Vectorization
Concept:
Vectorization refers to the process of replacing explicit loops with array operations that apply functions or operations to entire arrays at once. Instead of iterating over elements one by one using loops, vectorized operations are applied in a way that utilizes low-level optimizations.

Benefits:
Performance: Vectorized operations are implemented in low-level languages like C or Fortran, which can be optimized for performance and run much faster than equivalent Python loops.
Code Simplicity: Code becomes more concise and easier to read, as operations are expressed in a more natural mathematical form.
Avoids Python Overhead: By operating on whole arrays at once, vectorized operations reduce the overhead of Python loops and function calls.

Broadcasting
Concept:
Broadcasting is a technique that allows NumPy to perform arithmetic operations on arrays of different shapes in a way that extends the smaller array to match the shape of the larger one without actually replicating data. This allows for operations on arrays with different dimensions.

Rules of Broadcasting:
If arrays have different ranks (number of dimensions), the shape of the smaller array is padded with ones on the left side until both shapes are the same.
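The shapes are then compared dimension by dimension, starting from the trailing (rightmost) dimension. For each dimension, the sizes must either be the same or one of them must be 1; otherwise NumPy raises a ValueError.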
If a dimension of size 1 is encountered, it is broadcasted (stretched) to match the other dimension’s size.

Benefits:
Efficiency: Broadcasting avoids the need for explicit loops and data replication, leading to more memory-efficient and faster computations.
Flexibility: Allows for operations on arrays of different shapes and dimensions without requiring reshaping or manual manipulation.

Examples are as follows :
'''

In [11]:
import numpy as np

#Vectorization
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

result = a + b

print("Result of vectorized addition:", result)

#broadcasting
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
arr_1d = np.array([10, 20, 30])

result = arr_2d + arr_1d

print("Original 2D array:\n", arr_2d)
print("1D array:\n", arr_1d)
print("Result after broadcasting:\n", result)

Result of vectorized addition: [ 6  8 10 12]
Original 2D array:
 [[1 2 3]
 [4 5 6]]
1D array:
 [10 20 30]
Result after broadcasting:
 [[11 22 33]
 [14 25 36]]
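
The broadcasting rules can be traced on a case where both operands are stretched, and on a case where the shapes are incompatible. The arrays `col` and `row` are illustrative; `np.broadcast(...).shape` is used only to report the shape NumPy computes.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,), treated as (1, 3)

# (3, 1) and (1, 3) broadcast to (3, 3): each size-1 dimension is stretched.
result_shape = np.broadcast(col, row).shape
table = col + row
print("Broadcast shape:", result_shape)
print(table)

# Shapes (2, 3) and (2,) fail: the trailing dimensions are 3 and 2,
# which are neither equal nor 1.
error_raised = False
try:
    np.ones((2, 3)) + np.ones((2,))
except ValueError as exc:
    error_raised = True
    print("Incompatible shapes raised ValueError:", exc)
```

The failing case could be made to work by reshaping the second operand to (2, 1), which makes the size-2 dimension line up with the rows.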


## Practical Questions ##

In [16]:
# 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns

import numpy as np
arr=np.random.randint(1,101,(3,3))  # the upper bound is exclusive, so 101 allows 100
print(f" Original array :\n {arr}")
print(f" Transposed array :\n {arr.T}")

 Original array :
 [[60 32 26]
 [43 22 40]
 [66 10 57]]
 Transposed array :
 [[60 43 66]
 [32 22 10]
 [26 40 57]]


In [22]:
# 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

import numpy as np
arr1=np.arange(10)
print(f"Original array : \n {arr1}")

print(f"Array with 2x5 size :\n {arr1.reshape(2,5)}")
print(f"Array with 5x2 size :\n {arr1.reshape(5,2)}")


Original array : 
 [0 1 2 3 4 5 6 7 8 9]
Array with 2x5 size :
 [[0 1 2 3 4]
 [5 6 7 8 9]]
Array with 5x2 size :
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


In [30]:
# 3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array
import numpy as np

array_4x4 = np.random.rand(4, 4)

print("Original 4x4 Array:\n", array_4x4)

array_6x6 = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)

print("\n6x6 Array with border of zeros:\n", array_6x6)

Original 4x4 Array:
 [[0.6502521  0.41059869 0.91971631 0.06673692]
 [0.79449254 0.95180654 0.64139426 0.79307491]
 [0.80178915 0.53592664 0.2184583  0.04774569]
 [0.1267599  0.452806   0.54638946 0.58450088]]

6x6 Array with border of zeros:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.6502521  0.41059869 0.91971631 0.06673692 0.        ]
 [0.         0.79449254 0.95180654 0.64139426 0.79307491 0.        ]
 [0.         0.80178915 0.53592664 0.2184583  0.04774569 0.        ]
 [0.         0.1267599  0.452806   0.54638946 0.58450088 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
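
An equivalent way to build the border, shown here only as an alternative sketch, is to allocate the 6x6 array of zeros first and copy the 4x4 block into its interior. A fixed `inner` array is assumed instead of random values so the result is reproducible.

```python
import numpy as np

inner = np.arange(16, dtype=float).reshape(4, 4)  # stand-in for the random 4x4 array

# Allocate the bordered array, then fill everything except the outer ring.
bordered = np.zeros((6, 6))
bordered[1:-1, 1:-1] = inner

# np.pad with a width of 1 and the default constant value 0 gives the same array.
matches_pad = np.array_equal(bordered, np.pad(inner, pad_width=1))
print(bordered)
print("Matches np.pad:", matches_pad)
```

`np.pad` is the more general tool (different widths per side, other fill modes), while slice assignment makes the layout of the result explicit.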


In [35]:
# 4 Using NumPy, create an array of integers from 10 to 60 with a step of 5.
import numpy as np
arr3=np.arange(10,61,5)
arr3

array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60])

In [36]:
# 5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations(uppercase, lowercase, title case, etc.) to each element.

import numpy as np

# Create a NumPy array of strings
array = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
uppercase_array = np.char.upper(array)
lowercase_array = np.char.lower(array)
titlecase_array = np.char.title(array)
capitalize_array = np.char.capitalize(array)

print("Original Array:\n", array)
print("\nUppercase Array:\n", uppercase_array)
print("\nLowercase Array:\n", lowercase_array)
print("\nTitlecase Array:\n", titlecase_array)
print("\nCapitalize Array:\n", capitalize_array)

Original Array:
 ['python' 'numpy' 'pandas']

Uppercase Array:
 ['PYTHON' 'NUMPY' 'PANDAS']

Lowercase Array:
 ['python' 'numpy' 'pandas']

Titlecase Array:
 ['Python' 'Numpy' 'Pandas']

Capitalize Array:
 ['Python' 'Numpy' 'Pandas']


In [37]:
# 6 Generate a NumPy array of words. Insert a space between each character of every word in the array.

import numpy as np

# Create a NumPy array of words
words_array = np.array(['python', 'numpy', 'pandas'])

# Insert a space between each character of every word
spaced_array = np.char.join(' ', words_array)

print("Original Array:\n", words_array)
print("\nArray with spaces between characters:\n", spaced_array)

Original Array:
 ['python' 'numpy' 'pandas']

Array with spaces between characters:
 ['p y t h o n' 'n u m p y' 'p a n d a s']


In [38]:
# 7 Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

import numpy as np


array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

addition_result = array1 + array2
subtraction_result = array1 - array2
multiplication_result = array1 * array2
division_result = array1 / array2

print("Array 1:\n", array1)
print("\nArray 2:\n", array2)
print("\nElement-wise Addition:\n", addition_result)
print("\nElement-wise Subtraction:\n", subtraction_result)
print("\nElement-wise Multiplication:\n", multiplication_result)
print("\nElement-wise Division:\n", division_result)

Array 1:
 [[1 2 3]
 [4 5 6]]

Array 2:
 [[ 7  8  9]
 [10 11 12]]

Element-wise Addition:
 [[ 8 10 12]
 [14 16 18]]

Element-wise Subtraction:
 [[-6 -6 -6]
 [-6 -6 -6]]

Element-wise Multiplication:
 [[ 7 16 27]
 [40 55 72]]

Element-wise Division:
 [[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]


In [40]:
# 8 Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

import numpy as np

# Step 1: Create a 5x5 identity matrix
identity_matrix = np.eye(5,dtype="int")

print("5x5 Identity Matrix:\n", identity_matrix)

diagonal_elements = identity_matrix.diagonal()
print("\nDiagonal Elements:\n", diagonal_elements)

5x5 Identity Matrix:
 [[1 0 0 0 0]
 [0 1 0 0 0]
 [0 0 1 0 0]
 [0 0 0 1 0]
 [0 0 0 0 1]]

Diagonal Elements:
 [1 1 1 1 1]


In [41]:
# 9  Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

import numpy as np

# Step 1: Generate a NumPy array of 100 random integers between 0 and 1000
# (the upper bound of randint is exclusive, so 1001 allows 1000)
array = np.random.randint(0, 1001, size=100)

print("Original Array:\n", array)

def is_prime(n):
    if n <= 1:
        return False
    if n <= 3:
        return True
    if n % 2 == 0 or n % 3 == 0:
        return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

primes = np.array([num for num in array if is_prime(num)])

print("\nPrime Numbers in the Array:\n", primes)

Original Array:
 [363 122 863  90 223 426 253 953 803 962 739 608 582 650 298 981 681 503
  48 458 401 958 408 228 823  53 143 837 721 836 238  36 656 828 576 689
 424 289 560 130 642   7 938 758 632 694 832  75 150 342 791   5  31 800
 587 901 329 758 349 991  14 575 997 563 947 610 394 247  30 183  11 284
 581 599 414 311 404 295 364 119 781 947 235 836 603 909 664 591 535 574
 855 560 497 459 528 142 735 343 669 560]

Prime Numbers in the Array:
 [863 223 953 739 503 401 823  53   7   5  31 587 349 991 997 563 947  11
 599 311 947]
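
Since all values fall in a known range, the per-element primality test can also be replaced by a lookup table built once with the Sieve of Eratosthenes. This is a different technique from the trial division used above, sketched under the assumption that the maximum value is 1000. The helper name `primes_mask` and the seeded generator are introduced here for illustration.

```python
import numpy as np

def primes_mask(limit):
    """Return a boolean array where mask[k] is True when k is prime (0 <= k <= limit)."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                      # 0 and 1 are not prime
    for i in range(2, int(limit**0.5) + 1):
        if mask[i]:
            mask[i * i::i] = False        # cross out multiples of i in one slice
    return mask

rng = np.random.default_rng(0)            # seeded so the run is reproducible
values = rng.integers(0, 1001, size=100)  # upper bound is exclusive

mask = primes_mask(1000)
primes = values[mask[values]]             # look up every value in the table at once

print("Values:\n", values)
print("Primes among them:\n", primes)
```

The sieve costs a little up front but turns the filtering step into a single indexed lookup, which pays off when many values from the same range are tested.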


In [51]:
# 10 Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

import numpy as np

daily_temperatures = np.random.randint(20, 30, size=28)

print("Daily Temperatures for the Month:\n", daily_temperatures)

weekly_temperatures = daily_temperatures.reshape(4, 7)

print("\nWeekly Temperatures:\n", weekly_temperatures)

weekly_averages = np.mean(weekly_temperatures, axis=1)

print("\nWeekly Averages:\n", weekly_averages)

Daily Temperatures for the Month:
 [24 25 23 29 22 29 28 24 26 25 27 27 28 29 22 28 25 23 24 24 20 29 25 21
 21 23 25 23]

Weekly Temperatures:
 [[24 25 23 29 22 29 28]
 [24 26 25 27 27 28 29]
 [22 28 25 23 24 24 20]
 [29 25 21 21 23 25 23]]

Weekly Averages:
 [25.71428571 26.57142857 23.71428571 23.85714286]
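
The reshape into (4, 7) works because 28 is a multiple of 7. For a 30-day month, one possible approach (an assumption here, not part of the original solution) is to average the complete weeks and report the leftover days separately. The seeded generator is used so the run is reproducible.

```python
import numpy as np

rng = np.random.default_rng(42)
temps = rng.integers(20, 30, size=30)     # 30 daily temperatures in the range 20..29

n_full = (temps.size // 7) * 7            # number of days that form complete weeks

full_weeks = temps[:n_full].reshape(-1, 7)
weekly_avg = full_weeks.mean(axis=1)      # one average per complete week

leftover = temps[n_full:]                 # days that do not fill a week
leftover_avg = leftover.mean()

print("Weekly averages (complete weeks):", weekly_avg)
print("Leftover days:", leftover, "average:", leftover_avg)
```

Whether the leftover days should be reported separately, merged into the last week, or dropped depends on how the averages will be used.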
