NUMPY ASSIGNMENT

**QUES 1**: Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

NumPy, short for Numerical Python, is a fundamental library for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these data structures.

Purpose of NumPy

1. Efficient Array Handling: NumPy introduces the ndarray object, a powerful n-dimensional array that allows for efficient storage and manipulation of numerical data.

2. Mathematical Functions: It offers a wide range of mathematical functions to perform operations on arrays, such as linear algebra, statistical analysis, and Fourier transforms.

3. Integration with Other Libraries: NumPy serves as the foundation for many other scientific libraries in Python, such as SciPy, Pandas, and Matplotlib, enabling seamless integration.

Advantages of NumPy

1. Performance: NumPy arrays are more efficient than Python lists for numerical work. Their core operations are implemented in C, allowing for fast execution. Vectorized operations replace explicit Python loops, which speeds up computations significantly.

2. Convenient Syntax: The syntax for manipulating arrays is straightforward and intuitive. This makes it easy to write and read code, especially for mathematical and statistical operations.

3. Broadcasting: NumPy supports broadcasting, which allows for arithmetic operations between arrays of different shapes. This feature simplifies code and enhances performance by eliminating the need for manual expansion of arrays.

Enhancements to Python's Capabilities

1. Numerical Operations: NumPy allows for high-level operations that are optimized for performance. For example, matrix multiplications, element-wise operations, and more can be executed with simple, readable syntax.

2. Scientific Computing: By providing support for complex numerical tasks, NumPy enables Python to be a competitive choice for scientific computing alongside languages like MATLAB and R.

3. Data Analysis: With its powerful array operations, NumPy serves as the backbone for data manipulation in libraries like Pandas, making it easier to clean, transform, and analyze large datasets.
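The points above about readable syntax, element-wise operations, and matrix multiplication can be illustrated with a short sketch. The array names and values here are made up purely for illustration.

```python
import numpy as np

# Hypothetical data, chosen only to illustrate the syntax
prices = np.array([10.0, 20.0, 30.0])
quantities = np.array([1, 2, 3])

totals = prices * quantities    # element-wise product, no explicit loop
discounted = prices * 0.9       # the scalar is applied to every element

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
product = A @ B                 # matrix multiplication

print("Totals:", totals)
print("Discounted:", discounted)
print("Matrix product:\n", product)
```

Each operation is a single expression, and the looping happens inside NumPy's compiled code rather than in Python.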

**QUES 2**: Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

In NumPy, both np.mean() and np.average() are used to compute averages, but they have some differences in functionality and use cases. Here’s a comparison:

np.mean()

* Purpose: Computes the arithmetic mean of the values in an array.

* Syntax: np.mean(a, axis=None, dtype=None, out=None, keepdims=False)

* Parameters:

  - a: Input array.

  - axis: Specifies the axis along which to compute the mean. If None, it computes the mean of the flattened array.

  - dtype: Data type to use in the computation.

  - out: An alternative output array to place the result.

  - keepdims: If True, the reduced axes are left in the result as dimensions with size one.

* Use Case: Use np.mean() when you simply want to find the mean value of an array or along a specific axis without needing weights.


np.average()

* Purpose: Computes the average of an array, optionally weighting each element.

* Syntax: np.average(a, axis=None, weights=None, returned=False)

* Parameters:

  - a: Input array.

  - axis: Similar to np.mean(), specifies the axis along which to compute the average.

  - weights: An array of weights associated with each element. If not specified, all elements are weighted equally, so the result matches np.mean().

  - returned: If True, it returns a tuple of the average and the sum of the weights.

When to Use Each

**Use np.mean() when:**

You want a straightforward average of all elements or along a specific axis.

You do not require weights in your computation.

**Use np.average() when:**

You need to account for weights that affect the average calculation.

You want to get both the weighted average and the sum of the weights (by setting returned=True).
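A small sketch of the difference follows. The scores and credit weights are hypothetical values picked for illustration; only np.mean() and np.average() with the parameters described above are used.

```python
import numpy as np

# Hypothetical exam scores and credit weights (illustrative values only)
scores = np.array([80.0, 90.0, 70.0])
credits = np.array([1.0, 3.0, 2.0])

plain_mean = np.mean(scores)        # (80 + 90 + 70) / 3
unweighted = np.average(scores)     # no weights given, so same as np.mean

# returned=True gives back (weighted average, sum of weights)
weighted, weight_sum = np.average(scores, weights=credits, returned=True)

print("np.mean:", plain_mean)
print("np.average without weights:", unweighted)
print("np.average with weights:", weighted)
print("Sum of weights:", weight_sum)
```

The weighted result is (80*1 + 90*3 + 70*2) / 6, so the score with the largest weight pulls the average toward itself.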


**QUES 3**: Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

Reversing a NumPy array can be accomplished using various methods, depending on the desired axis. Here’s how to do it for both 1D and 2D arrays.

In [None]:
# Reversing a 1D Array

import numpy as np

# Create a 1D array
arr_1d = np.array([10, 20, 30, 40, 50])

# Reverse the array using slicing
reversed_1d = arr_1d[::-1]

print("Original 1D array:", arr_1d)
print("Reversed 1D array:", reversed_1d)


Original 1D array: [10 20 30 40 50]
Reversed 1D array: [50 40 30 20 10]


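Reversing a 2D Array

For a 2D array, you can reverse along either axis using slicing: arr[::-1] reverses the order of the rows (axis 0), and arr[:, ::-1] reverses the order of the columns (axis 1). np.flip(arr, axis=...) gives the same results.

Reversing Along the Rows (Axis 0): This will reverse the order of the rows.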

In [None]:
# Create a 2D array
arr_2d = np.array([[10, 20, 30],
                   [40, 50, 60],
                   [70, 80, 90]])

# Reverse the array along the rows (axis 0)
reversed_rows = arr_2d[::-1]

print("Original 2D array:\n", arr_2d)
print("Reversed along rows (axis 0):\n", reversed_rows)


Original 2D array:
 [[10 20 30]
 [40 50 60]
 [70 80 90]]
Reversed along rows (axis 0):
 [[70 80 90]
 [40 50 60]
 [10 20 30]]
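Reversing Along the Columns (Axis 1): The same 2D array can also be reversed along axis 1. The sketch below rebuilds the array so it runs on its own, and shows np.flip() as an alternative to slicing.

```python
import numpy as np

arr_2d = np.array([[10, 20, 30],
                   [40, 50, 60],
                   [70, 80, 90]])

# Reverse the order of the columns (axis 1) with slicing
reversed_cols = arr_2d[:, ::-1]

# np.flip reverses along the named axis; with no axis it reverses every axis
flipped_cols = np.flip(arr_2d, axis=1)
flipped_both = np.flip(arr_2d)

print("Reversed along columns (axis 1):\n", reversed_cols)
print("Reversed along both axes:\n", flipped_both)
```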


**QUES 4**: How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.


In NumPy, you can determine the data type of the elements in an array using the .dtype attribute. This attribute returns a numpy.dtype object, which describes the type of elements in the array.

Importance of Data Types

Memory Management:

Size Optimization: Different data types have different memory requirements. For instance, an int32 takes 4 bytes per element, while an int64 takes 8 bytes. Choosing the right type can lead to significant memory savings, especially with large datasets.

Storage Efficiency: Using smaller data types (e.g., float32 instead of float64) for data that doesn't require high precision can help manage memory usage effectively.

Performance:

Speed of Operations: Operations on smaller data types often execute faster because they move less data through memory. For example, calculations with float32 can be faster than with float64 on large arrays, since each element is half the size.

Vectorization: NumPy's ability to perform vectorized operations (performing operations on whole arrays at once) is influenced by data types. Ensuring the appropriate data type can improve the efficiency of these operations.

Type Safety:

Ensuring that data is stored in the correct format helps prevent errors in computations. For example, mixing integer and floating-point types can lead to unintentional data type promotions (adding an int64 array to a float32 array produces a float64 array), affecting performance and results.

Interoperability:

When interfacing with other libraries or systems (like databases, APIs, or machine learning frameworks), having the right data type can facilitate smooth data exchange and avoid conversion issues.

In [None]:
import numpy as np

# Create a NumPy array
arr = np.array([10, 20, 30, 40])

# Determine the data type
data_type = arr.dtype

print("Array:", arr)
print("Data type:", data_type)


Array: [10 20 30 40]
Data type: int64
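The memory and type-promotion points above can be checked directly. This is a minimal sketch with made-up arrays; itemsize and nbytes are standard ndarray attributes.

```python
import numpy as np

# One million integers stored with two different integer types
values_64 = np.arange(1_000_000, dtype=np.int64)
values_32 = values_64.astype(np.int32)   # safe here: every value fits in 32 bits

print("int64:", values_64.itemsize, "bytes per element,", values_64.nbytes, "bytes total")
print("int32:", values_32.itemsize, "bytes per element,", values_32.nbytes, "bytes total")

# Mixing array dtypes promotes the result to a type that can hold both
ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([0.5, 0.5, 0.5], dtype=np.float32)
promoted = ints + floats
print("int64 + float32 gives:", promoted.dtype)
```

Halving the element size halves the total memory, which is why an explicit dtype can matter for large arrays.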


**QUES 5**: Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?


In NumPy, ndarrays (short for "n-dimensional arrays") are the core data structure used for handling large datasets. They are designed to be more efficient and versatile than standard Python lists, especially for numerical computations and scientific computing.

Key Features of ndarrays

Homogeneous Data:

All elements in an ndarray are of the same data type (e.g., all integers, floats, etc.), which allows for optimized memory usage and faster operations.

Multi-dimensional:

Ndarrays can have any number of dimensions, allowing for the representation of vectors (1D), matrices (2D), and higher-dimensional data.

Efficient Memory Layout:

Ndarrays are typically stored in a single contiguous block of memory, leading to better cache performance and faster access times compared to Python lists, which store references to objects that can be scattered throughout memory.

Vectorized Operations:

NumPy supports element-wise operations and mathematical functions that can be applied directly to ndarrays without the need for explicit loops, making code more concise and efficient.


Differences from Standard Python Lists

Data Type Homogeneity:

Ndarrays: All elements must be of the same data type.

Python Lists: Can hold elements of different data types (e.g., integers, strings, objects).

Performance:

Ndarrays: Offer better performance for numerical operations due to optimized storage and execution.

Python Lists: Generally slower for numerical operations since they require iteration and type-checking.

Memory Efficiency:

Ndarrays: More memory-efficient because of contiguous storage and the fixed data type.

Python Lists: Use more memory because they store references to objects and have overhead for each element.


Multi-dimensional Support:

Ndarrays: Naturally support multi-dimensional structures (2D, 3D, etc.).

Python Lists: Can be nested to create multi-dimensional structures, but this is less efficient and more cumbersome.
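The differences listed above show up in a few lines of code. This is an illustrative sketch with made-up values, not a benchmark.

```python
import numpy as np

py_list = [1, 2.5, "three"]     # a list can mix types freely
arr = np.array([1, 2.5, 3])     # NumPy picks one dtype for every element

print("List element types:", [type(x).__name__ for x in py_list])
print("Array dtype:", arr.dtype)

# The * operator means different things for the two containers
repeated = [1, 2, 3] * 2               # list repetition
doubled = np.array([1, 2, 3]) * 2      # element-wise multiplication
print("List * 2:", repeated)
print("Array * 2:", doubled)

# Nested lists need chained indexing; an ndarray indexes all axes at once
nested = [[1, 2, 3], [4, 5, 6]]
matrix = np.array(nested)
print("nested[1][2]:", nested[1][2], "| matrix[1, 2]:", matrix[1, 2])
print("matrix shape:", matrix.shape)
```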

**QUES 6**. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.


NumPy arrays offer several performance benefits over standard Python lists, particularly for large-scale numerical operations. The main areas are:

Memory Efficiency

Speed of Operations

Built-in Functions and Broadcasting

Less Overhead in Execution

Parallel Processing Capabilities

Handling Large Datasets

Memory Efficiency

Contiguous Memory Allocation: NumPy arrays are stored in contiguous blocks of memory. This allows for more efficient use of cache memory and reduces overhead, while Python lists store references to objects that can be scattered throughout memory, leading to higher memory usage.

Fixed Data Types: NumPy arrays require all elements to be of the same data type, allowing for optimized storage. In contrast, Python lists can hold objects of varying types, which incurs additional overhead.

Speed of Operations

Vectorized Operations: NumPy is designed to perform operations on entire arrays at once (vectorization) using optimized C and Fortran libraries. This leads to substantial performance improvements, especially for operations involving large datasets.

Built-in Functions and Broadcasting

Rich Library of Functions: NumPy provides a vast array of built-in functions for mathematical operations, linear algebra, statistical analysis, etc., which are optimized for performance.

Broadcasting: NumPy allows for operations between arrays of different shapes through broadcasting, enabling complex operations without the need for explicit replication of data. This saves both time and memory.

Less Overhead in Execution

Reduced Function Call Overhead: Operations on NumPy arrays involve far fewer Python-level function calls than equivalent operations on Python lists, because a single NumPy call runs a loop over the whole array in a low-level language like C. This contributes to faster execution times.
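One way to see these effects is to time the same computation both ways. The sketch below uses the standard-library timeit module. The array size and repeat count are arbitrary choices, and the actual timings depend on the machine, so no specific numbers are claimed here.

```python
import timeit
import numpy as np

n = 1_000_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

def square_list():
    # Pure-Python loop: one interpreter step per element
    return [x * x for x in py_list]

def square_array():
    # Single NumPy call: the loop runs in compiled code
    return np_array * np_array

list_time = timeit.timeit(square_list, number=5)
array_time = timeit.timeit(square_array, number=5)

print(f"List comprehension: {list_time:.4f} s for 5 runs")
print(f"NumPy array:        {array_time:.4f} s for 5 runs")
```

On typical hardware the NumPy version is expected to be considerably faster, but the ratio varies with array size and dtype.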

**QUES 7**: Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.


In NumPy, vstack() and hstack() are functions used to stack arrays vertically and horizontally, respectively. They are helpful for combining arrays along specific axes.

np.vstack()

Purpose: Stacks arrays in sequence vertically (row-wise).

Input: The arrays must have the same shape along every axis except the first (for 2D arrays, the same number of columns).

In [None]:
import numpy as np

# Create two 2D arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

# Stack arrays vertically
vstack_result = np.vstack((array1, array2))

print("Array 1:\n", array1)
print("Array 2:\n", array2)
print("Result of vstack:\n", vstack_result)


Array 1:
 [[1 2 3]
 [4 5 6]]
Array 2:
 [[ 7  8  9]
 [10 11 12]]
Result of vstack:
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]


np.hstack()

Purpose: Stacks arrays in sequence horizontally (column-wise).

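Input: The arrays must have the same shape along every axis except the second (for 2D arrays, the same number of rows). 1D arrays of any length can be stacked.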

In [None]:
# Stack arrays horizontally
hstack_result = np.hstack((array1, array2))

print("Result of hstack:\n", hstack_result)


Result of hstack:
 [[ 1  2  3  7  8  9]
 [ 4  5  6 10 11 12]]
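The two functions also behave differently on 1D input, which is worth seeing next to the 2D case. A minimal sketch with made-up arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

stacked_v = np.vstack((a, b))   # each 1D array becomes a row: shape (2, 3)
stacked_h = np.hstack((a, b))   # 1D arrays are joined end to end: shape (6,)

print("vstack of 1D arrays:\n", stacked_v, "shape:", stacked_v.shape)
print("hstack of 1D arrays:\n", stacked_h, "shape:", stacked_h.shape)
```

So vstack() raises the dimensionality of 1D inputs, while hstack() keeps them one-dimensional.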


**QUES 8**: Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.


In NumPy, fliplr() and flipud() are functions used to flip arrays in specific directions. Here’s a detailed comparison of the two:

np.fliplr()

Purpose: Flips an array left to right (horizontally).

Effect: Reverses the order of elements along axis 1. For a 2D array, it reverses the order of columns. The input must have at least two dimensions.

np.flipud()

Purpose: Flips an array upside down (vertically).

Effect: Reverses the order of elements along axis 0. For a 2D array, it reverses the order of rows. A 1D array is simply reversed.

Differences Between fliplr() and flipud()

Direction of Flip:

fliplr(): Flips the array horizontally (left to right).

flipud(): Flips the array vertically (upside down).


Effect on Dimensions:

2D Arrays: Both methods are most commonly used with 2D arrays, where:

fliplr() changes the order of columns.

flipud() changes the order of rows.


In [None]:
import numpy as np

# Create a 2D array
array_2d = np.array([[12, 52, 83],
                     [14, 58, 65],
                     [17, 84, 94]])

# Flip the array left to right
flipped_lr = np.fliplr(array_2d)

print("Original Array:\n", array_2d)
print("Flipped Left to Right:\n", flipped_lr)


Original Array:
 [[12 52 83]
 [14 58 65]
 [17 84 94]]
Flipped Left to Right:
 [[83 52 12]
 [65 58 14]
 [94 84 17]]


In [None]:
# Flip the array upside down
flipped_ud = np.flipud(array_2d)

print("Flipped Upside Down:\n", flipped_ud)


Flipped Upside Down:
 [[17 84 94]
 [14 58 65]
 [12 52 83]]
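For arrays that are not 2D, the two functions behave differently. The sketch below uses made-up 1D and 3D arrays to show this: flipud() works on 1D input, fliplr() raises a ValueError on it, and for 3D input each function still flips a single axis.

```python
import numpy as np

vector = np.array([1, 2, 3, 4])
flipped_vector = np.flipud(vector)   # axis 0 is the only axis, so this reverses it
print("flipud on 1D:", flipped_vector)

fliplr_failed = False
try:
    np.fliplr(vector)                # a 1D array has no axis 1 to flip
except ValueError as exc:
    fliplr_failed = True
    print("fliplr on 1D raised ValueError:", exc)

cube = np.arange(8).reshape(2, 2, 2)
# For 3D input, flipud reverses axis 0 and fliplr reverses axis 1
print("flipud matches cube[::-1]:", np.array_equal(np.flipud(cube), cube[::-1]))
print("fliplr matches cube[:, ::-1]:", np.array_equal(np.fliplr(cube), cube[:, ::-1]))
```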


**QUES 9**: Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

The array_split() function in NumPy is used to divide an array into multiple sub-arrays. It offers more flexibility than split() by allowing the array to be split into uneven parts.

Functionality of array_split()

Basic Syntax

np.array_split(ary, indices_or_sections, axis=0)

ary: The input array to be split.

indices_or_sections: This can be an integer (number of splits) or an array of indices indicating where to split the array.

axis: The axis along which to split the array (default is 0, which splits a 2D array by rows).

Key Features

Uneven Splits: Unlike split(), which raises an error when an integer number of sections does not divide the array evenly, array_split() can handle situations where the number of elements is not perfectly divisible by the number of splits requested. For an array of length l split into n sections, the first l % n sub-arrays have size l // n + 1 and the rest have size l // n.

Returns a List: The function returns a list of sub-arrays.

Handles Multi-dimensional Arrays: You can specify the axis along which to split, making it versatile for multi-dimensional arrays.
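The uneven-split behavior described above can be seen on a small array. This is an illustrative sketch. The array and the split points are arbitrary.

```python
import numpy as np

data = np.arange(10)                 # 10 elements: 0..9

# 10 is not divisible by 3, so the first sub-array gets the extra element
parts = np.array_split(data, 3)
for i, part in enumerate(parts):
    print(f"Part {i}: {part}")

# np.split refuses the same request
split_failed = False
try:
    np.split(data, 3)
except ValueError as exc:
    split_failed = True
    print("np.split raised ValueError:", exc)

# Splitting at explicit indices instead of into a number of sections
head, middle, tail = np.array_split(data, [2, 7])
print("head:", head, "middle:", middle, "tail:", tail)
```

With 10 elements and 3 sections, 10 % 3 = 1, so one sub-array has 4 elements and the other two have 3.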

**QUES 10**. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

Vectorization

Vectorization refers to the process of applying operations on entire arrays rather than using explicit loops to perform operations element by element. In NumPy, vectorized operations are implemented using optimized C code, making them much faster than equivalent Python loops.

Benefits:

Performance: Vectorized operations leverage low-level optimizations and are generally much faster than Python loops. This is particularly important when working with large datasets.

Simplicity: Code becomes cleaner and more readable, as operations can be expressed in a concise manner without explicit iteration.


Broadcasting

Broadcasting is a technique that allows NumPy to perform arithmetic operations on arrays of different shapes. It enables automatic expansion of the smaller array's shape to match that of the larger array without creating copies of the data.


Alignment of Dimensions: When performing operations on arrays of different shapes, NumPy compares the shapes from the last dimension backward. If dimensions are equal or one of them is 1, broadcasting can occur.

Expansion: If the shapes are compatible, NumPy "broadcasts" the smaller array across the larger array.


In [None]:
import numpy as np

# Create two large arrays
a = np.random.rand(1000000)
b = np.random.rand(1000000)

# Vectorized addition
c = a + b  # This adds the two arrays element-wise

# Traditional loop (less efficient)
c_loop = np.empty_like(a)
for i in range(len(a)):
    c_loop[i] = a[i] + b[i]


In [None]:
# Create a 1D array and a 2D array
a = np.array([10, 20, 30])        # Shape (3,)
b = np.array([[100], [200], [300]])  # Shape (3, 1)

# Broadcasting stretches a across the rows and b across the columns,
# so shapes (3,) and (3, 1) combine into a (3, 3) result
result = a + b

print("Array a:\n", a)
print("Array b:\n", b)
print("Result of broadcasting:\n", result)


Array a:
 [10 20 30]
Array b:
 [[100]
 [200]
 [300]]
Result of broadcasting:
 [[110 120 130]
 [210 220 230]
 [310 320 330]]
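The shape-compatibility rule described above (compare dimensions from the last one backward; each pair must be equal or contain a 1) can be tried on a few shapes. The arrays below are made up for illustration.

```python
import numpy as np

matrix = np.ones((4, 3))
row = np.array([1.0, 2.0, 3.0])         # shape (3,): trailing 3 matches 3
column = np.arange(4.0).reshape(4, 1)   # shape (4, 1): the 1 stretches to 3
bad = np.array([1.0, 2.0])              # shape (2,): 2 vs 3, neither is 1

row_result = matrix + row
column_result = matrix + column
print("matrix + row shape:", row_result.shape)
print("matrix + column shape:", column_result.shape)

incompatible = False
try:
    matrix + bad
except ValueError as exc:
    incompatible = True
    print("Incompatible shapes:", exc)
```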


**Practical Questions:**

---



QUES 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

In [None]:
import numpy as np

# Create a 3x3 array with random integers between 1 and 100
array = np.random.randint(1, 101, size=(3, 3))

# Print the original array
print("Original Array:\n", array)

# Interchange rows and columns (transpose the array)
transposed_array = array.T

# Print the transposed array
print("Transposed Array:\n", transposed_array)


Original Array:
 [[74 66 77]
 [86 68 96]
 [19 57 43]]
Transposed Array:
 [[74 86 19]
 [66 68 57]
 [77 96 43]]


QUES 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

In [None]:
import numpy as np

# Generate a 1D NumPy array with 10 elements
array_1d = np.arange(10)  # This creates an array with values [0, 1, 2, ..., 9]

# Print the original 1D array
print("Original 1D Array:\n", array_1d)

# Reshape the 1D array into a 2x5 array
array_2x5 = array_1d.reshape(2, 5)
print("\nReshaped to 2x5 Array:\n", array_2x5)

# Reshape the 1D array into a 5x2 array
array_5x2 = array_1d.reshape(5, 2)
print("\nReshaped to 5x2 Array:\n", array_5x2)


Original 1D Array:
 [0 1 2 3 4 5 6 7 8 9]

Reshaped to 2x5 Array:
 [[0 1 2 3 4]
 [5 6 7 8 9]]

Reshaped to 5x2 Array:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


QUES 3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.

In [2]:
import numpy as np

# Create a 4x4 array with random float values
array_4x4 = np.random.rand(4, 4)

# Add a border of zeros around it
array_6x6 = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)

print("4x4 Array:")
print(array_4x4)
print("\n6x6 Array with border of zeros:")
print(array_6x6)


4x4 Array:
[[0.11690286 0.47181873 0.14567717 0.12075625]
 [0.65713634 0.15210343 0.98292236 0.14132544]
 [0.42125629 0.97497161 0.74884726 0.77926795]
 [0.98356362 0.85718844 0.55602223 0.62095756]]

6x6 Array with border of zeros:
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.11690286 0.47181873 0.14567717 0.12075625 0.        ]
 [0.         0.65713634 0.15210343 0.98292236 0.14132544 0.        ]
 [0.         0.42125629 0.97497161 0.74884726 0.77926795 0.        ]
 [0.         0.98356362 0.85718844 0.55602223 0.62095756 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


QUES 4: Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [3]:
import numpy as np

# Create an array of integers from 10 to 60 with a step of 5
array = np.arange(10, 61, 5)

print(array)


[10 15 20 25 30 35 40 45 50 55 60]


QUES 5: Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.



In [4]:
import numpy as np

# Create a NumPy array of strings
array = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
uppercase_array = np.char.upper(array)
lowercase_array = np.char.lower(array)
titlecase_array = np.char.title(array)
capitalize_array = np.char.capitalize(array)

# Print the results
print("Original Array:")
print(array)
print("\nUppercase Array:")
print(uppercase_array)
print("\nLowercase Array:")
print(lowercase_array)
print("\nTitle Case Array:")
print(titlecase_array)
print("\nCapitalized Array:")
print(capitalize_array)


Original Array:
['python' 'numpy' 'pandas']

Uppercase Array:
['PYTHON' 'NUMPY' 'PANDAS']

Lowercase Array:
['python' 'numpy' 'pandas']

Title Case Array:
['Python' 'Numpy' 'Pandas']

Capitalized Array:
['Python' 'Numpy' 'Pandas']


QUES 6. Generate a NumPy array of words. Insert a space between each character of every word in the array.


In [5]:
import numpy as np

# Create a NumPy array of words
words_array = np.array(['hello', 'physicswallah', 'data analytics ', 'batch'])

# Insert a space between each character of every word
spaced_words_array = np.char.join(' ', words_array)

# Print the results
print("Original Array:")
print(words_array)
print("\nArray with spaces between characters:")
print(spaced_words_array)


Original Array:
['hello' 'physicswallah' 'data analytics ' 'batch']

Array with spaces between characters:
['h e l l o' 'p h y s i c s w a l l a h' 'd a t a   a n a l y t i c s  '
 'b a t c h']


QUES 7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

In [7]:
import numpy as np

# Create two 2D NumPy arrays
array1 = np.array([[4, 2, 7],
                    [3, 5, 9]])

array2 = np.array([[7, 8, 9],
                    [10, 11, 12]])

# Perform element-wise operations
addition = array1 + array2
subtraction = array1 - array2
multiplication = array1 * array2
division = array1 / array2

# Print the results
print("Array 1:")
print(array1)

print("\nArray 2:")
print(array2)

print("\nElement-wise Addition:")
print(addition)

print("\nElement-wise Subtraction:")
print(subtraction)

print("\nElement-wise Multiplication:")
print(multiplication)

print("\nElement-wise Division:")
print(division)


Array 1:
[[4 2 7]
 [3 5 9]]

Array 2:
[[ 7  8  9]
 [10 11 12]]

Element-wise Addition:
[[11 10 16]
 [13 16 21]]

Element-wise Subtraction:
[[-3 -6 -2]
 [-7 -6 -3]]

Element-wise Multiplication:
[[ 28  16  63]
 [ 30  55 108]]

Element-wise Division:
[[0.57142857 0.25       0.77777778]
 [0.3        0.45454545 0.75      ]]


QUES 8: Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.


In [8]:
import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.eye(5)

# Extract the diagonal elements
diagonal_elements = np.diagonal(identity_matrix)

# Print the results
print("5x5 Identity Matrix:")
print(identity_matrix)

print("\nDiagonal Elements:")
print(diagonal_elements)


5x5 Identity Matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal Elements:
[1. 1. 1. 1. 1.]


QUES 9: Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

In [9]:
import numpy as np

# Generate an array of 100 random integers between 0 and 1000
random_integers = np.random.randint(0, 1001, size=100)

# Function to check if a number is prime
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

# Find all prime numbers in the array
prime_numbers = [num for num in random_integers if is_prime(num)]

# Print the results
print("Random Integers Array:")
print(random_integers)

print("\nPrime Numbers:")
print(prime_numbers)


Random Integers Array:
[ 402  560  465  942  948  655  233   15  722  148  107  899  571  491
  287  901  587  623  656  806  403  962  551  901 1000  588  795  744
  946  765  614  942  951  929   90    1  576  592  428  219   44  895
  174  398  623  680  237  730  367  983  613  707  316  701  172  738
  189  936   33  163  733   74  589  288  335  683   36  343   25  205
  974   94  172   37  861  552  614  305  154  745  204  488  604   97
  883  829  172  972  939  374  164  972  701   60  802  528  947  614
  928  805]

Prime Numbers:
[233, 107, 571, 491, 587, 929, 367, 983, 613, 701, 163, 733, 683, 37, 97, 883, 829, 701, 947]
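An alternative that keeps the result as a NumPy array is to build a boolean lookup table with the Sieve of Eratosthenes and index into it. This is a sketch of a different technique from the trial-division function above. The helper name prime_mask and the seed value are assumptions made for this example.

```python
import numpy as np

def prime_mask(limit):
    """Return a boolean array where mask[k] is True when k is prime (0 <= k <= limit)."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                        # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if mask[p]:
            mask[p * p::p] = False          # cross out multiples of p
    return mask

rng = np.random.default_rng(seed=0)         # arbitrary seed, for repeatability
random_integers = rng.integers(0, 1001, size=100)

# Look up each value in the table, then keep only the primes
is_prime_lookup = prime_mask(1000)
prime_numbers = random_integers[is_prime_lookup[random_integers]]

print("Prime Numbers:")
print(prime_numbers)
```

The sieve is computed once for the whole range, so checking 100 values (or a million) becomes a single indexing operation.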


QUES 10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

In [1]:
import numpy as np

# Create a NumPy array representing daily temperatures for 30 days
# For this example, we'll generate random temperatures between 15 and 35 degrees Celsius
daily_temperatures = np.random.uniform(15, 35, size=30)

# Calculate the weekly averages
# Use the first 28 days (4 full weeks), reshape to (4, 7) so each row is one week,
# and take the mean of each row (axis=1); the last 2 days are not part of a full week
weekly_averages = daily_temperatures[:28].reshape(4, 7).mean(axis=1)

# Print the results
print("Daily Temperatures for 30 Days:")
print(daily_temperatures)

print("\nWeekly Averages:")
print(weekly_averages)


Daily Temperatures for 30 Days:
[22.40445964 17.45673358 31.51156495 20.12784283 30.55340505 25.23611055
 26.57662894 30.27923443 15.14000831 18.34956821 31.77852382 23.82914442
 33.38760608 32.47600956 34.17923677 21.98451132 28.2666166  31.41510726
 31.89030953 16.74152386 15.28195302 25.81522013 33.86128883 19.54762023
 30.55146118 34.73800603 16.2607487  23.99934536 24.27567369 23.53081187]

Weekly Averages:
[24.8381065  26.46287069 25.67989405 26.39624149]
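The calculation above leaves out days 29 and 30. If the partial fifth week should be included, one possible approach is to split at every seventh day with np.array_split() and average each chunk. The sketch below generates its own data with an arbitrary seed so that it runs independently.

```python
import numpy as np

rng = np.random.default_rng(seed=0)          # arbitrary seed, for repeatability
daily_temperatures = rng.uniform(15, 35, size=30)

# Split after days 7, 14, 21 and 28; the final chunk holds the 2 leftover days
weeks = np.array_split(daily_temperatures, np.arange(7, 30, 7))
weekly_averages = np.array([week.mean() for week in weeks])

for i, (week, avg) in enumerate(zip(weeks, weekly_averages), start=1):
    print(f"Week {i} ({len(week)} days): {avg:.2f}")
```

The first four averages match the reshape-based calculation, and the fifth covers only the two remaining days.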
