
# **Theoretical Questions:**

1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
enhance Python's capabilities for numerical operations?

ANS  The purpose and advantages of NumPy in scientific computing and data analysis:

Purpose:

NumPy is the fundamental package for scientific computing with Python. It provides support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Advantages:

Efficiency: NumPy's core is implemented in C, making it significantly faster than native Python code for numerical operations. This is crucial when dealing with large datasets or complex computations.

Multi-dimensional arrays: NumPy's ndarray (n-dimensional array) is a powerful data structure for storing and manipulating large datasets efficiently. It allows for vectorized operations, eliminating the need for explicit loops, further improving performance.

Broadcasting: NumPy's broadcasting feature simplifies operations between arrays of different shapes, enabling concise and efficient code.

Mathematical functions: NumPy provides a vast collection of mathematical functions, covering areas like linear algebra, Fourier transforms, random number generation, and more, eliminating the need for writing custom implementations.

How NumPy enhances Python's capabilities for numerical operations:

NumPy provides the foundation for performing efficient numerical computations in Python. Its ndarray object and optimized functions significantly outperform Python's built-in lists for numerical tasks. Without NumPy, scientific computing and data analysis in Python would be much slower and less convenient.
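
As a small illustration of the vectorized style described above, the sketch below converts a few temperatures from Celsius to Fahrenheit, first with a plain Python list and then with an ndarray. The variable names and sample values are made up for this example.

```python
import numpy as np

celsius_list = [0.0, 25.0, 100.0]  # hypothetical sample values

# Plain Python: a list comprehension visits each element one at a time
fahrenheit_list = [c * 9 / 5 + 32 for c in celsius_list]

# NumPy: the same arithmetic is written once and applied to the whole array
celsius_array = np.array(celsius_list)
fahrenheit_array = celsius_array * 9 / 5 + 32

print(fahrenheit_list)
print(fahrenheit_array)
```

Both versions compute the same values; the NumPy version simply moves the loop out of Python and into compiled code.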

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the others?

ANS   

Similarities:

Both functions calculate the average (central tendency) of an array of numbers.
By default, they compute the arithmetic mean.


Differences:

np.average() can handle weighted averages using the weights parameter. This makes np.average() more versatile.
np.mean() does not have a weights parameter. It always calculates the unweighted arithmetic mean.

When to use which function?

Use np.mean() for calculating the simple arithmetic mean of an array.
Use np.average() when you need to calculate a weighted average, where different elements of the array have different importances.


In [1]:
import numpy as np

data = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 3, 2, 1])

mean = np.mean(data)
weighted_average = np.average(data, weights=weights)

print(f"Mean: {mean}")
print(f"Weighted Average: {weighted_average}")

Mean: 3.0
Weighted Average: 3.0


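In this example, mean is the arithmetic mean of data (3.0). weighted_average is calculated by multiplying each element in data by its corresponding weight in weights, summing the results, and dividing by the sum of the weights: (1*1 + 2*2 + 3*3 + 4*2 + 5*1) / 9 = 27 / 9 = 3.0. The two results coincide here only because the weights are symmetric around the middle element; asymmetric weights would pull the weighted average away from the plain mean.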

In summary: If you need a simple average, use np.mean(). If you need a weighted average, use np.average().
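
To see the two functions actually disagree, here is a minimal sketch with asymmetric weights. The scores and weights are hypothetical values chosen for this illustration; `returned=True` asks np.average() to also return the sum of the weights.

```python
import numpy as np

scores = np.array([80, 90, 100])   # hypothetical exam scores
weights = np.array([1, 1, 2])      # assumption: the last exam counts double

plain_mean = np.mean(scores)
weighted, weight_sum = np.average(scores, weights=weights, returned=True)

print("Mean:", plain_mean)
print("Weighted Average:", weighted)
print("Sum of Weights:", weight_sum)
```

Here the plain mean is 90, while the weighted average is (80 + 90 + 2*100) / 4 = 92.5, because the highest score carries more weight.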


3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

ANS      Reversing a 1D NumPy Array

You can reverse a 1D NumPy array using the np.flip() function or array slicing. Here's how:

In [2]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Using np.flip()
reversed_arr = np.flip(arr)
print(reversed_arr)  # Output: [5 4 3 2 1]

# Using array slicing
reversed_arr = arr[::-1]
print(reversed_arr)

[5 4 3 2 1]
[5 4 3 2 1]


Reversing a 2D NumPy Array


For a 2D NumPy array, you can reverse it along different axes using np.flip() with the axis argument. Here's an example:

In [3]:
import numpy as np

arr = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Reverse along axis 0 (rows)
reversed_arr_rows = np.flip(arr, axis=0)
print(reversed_arr_rows)

# Reverse along axis 1 (columns)
reversed_arr_cols = np.flip(arr, axis=1)
print(reversed_arr_cols)

# Reverse along both axes
reversed_arr_both = np.flip(arr)
print(reversed_arr_both)


[[7 8 9]
 [4 5 6]
 [1 2 3]]
[[3 2 1]
 [6 5 4]
 [9 8 7]]
[[9 8 7]
 [6 5 4]
 [3 2 1]]


In summary:

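np.flip() reverses elements along the specified axis, or along all axes if no axis is given.

Array slicing with [::-1] reverses a 1D array. In a multidimensional array, place it in the position of the axis to reverse, for example arr[:, ::-1] for axis 1.

In a 2D array, flipping along axis=0 reverses the order of the rows (top to bottom), and flipping along axis=1 reverses the order of the columns (left to right).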
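
The slicing form for 2D arrays is mentioned above but not shown, so here is a short sketch of it. It reuses the same 3x3 array and checks each slice against the corresponding np.flip() call.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

rows_reversed = arr[::-1, :]     # reverse along axis 0
cols_reversed = arr[:, ::-1]     # reverse along axis 1
both_reversed = arr[::-1, ::-1]  # reverse along both axes

# Each slice should match the equivalent np.flip() call
print(np.array_equal(rows_reversed, np.flip(arr, axis=0)))
print(np.array_equal(cols_reversed, np.flip(arr, axis=1)))
print(np.array_equal(both_reversed, np.flip(arr)))
```
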

4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
in memory management and performance.

ANS    Determining the data type of elements in a NumPy array, and the importance of data types in memory management and performance:

Determining the Data Type

To determine the data type of elements in a NumPy array, you can use the dtype attribute.

In [4]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(arr.dtype)

int64


Importance of Data Types

Memory Management: Data types play a crucial role in memory management. Different data types require different amounts of memory to store. Choosing the appropriate data type can help reduce the memory footprint of your arrays, especially when dealing with large datasets.

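For example, if you're working with integers that fall within a specific range, you can use a smaller data type like int16 or int32 instead of the default integer type (int64 on most 64-bit platforms) to save memory. Values outside the range of the smaller type cannot be represented correctly, so check the range first.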


Performance:

Data types also impact performance. NumPy's optimized operations are often designed to work most efficiently with specific data types. Using the correct data type can help speed up your calculations.

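For example, arithmetic on a float32 array moves half as much memory as the same arithmetic on a float64 array, which can make memory-bound operations faster at the cost of reduced precision. Whether integer or floating-point arithmetic is faster depends on the operation and the hardware, so it is worth measuring rather than assuming.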


In summary, understanding and selecting the appropriate data types for your NumPy arrays is crucial for efficient memory management and optimal performance in your numerical computations.
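
The memory effect of the data type can be checked directly with the itemsize and nbytes attributes. The sketch below is an illustration with made-up data: it stores the same 1000 values as int64 and as int16 and compares their footprints.

```python
import numpy as np

values = np.arange(1000, dtype=np.int64)
small = values.astype(np.int16)   # safe here: 0..999 fits in the int16 range

# itemsize is bytes per element, nbytes is the total size of the data buffer
print(values.dtype, values.itemsize, values.nbytes)
print(small.dtype, small.itemsize, small.nbytes)

# np.iinfo reports the representable range of an integer type
print(np.iinfo(np.int16).min, np.iinfo(np.int16).max)
```

The int16 copy holds the same values in a quarter of the memory, which is only valid because every value lies within the int16 range.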

5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

ANS   Definition of ndarrays:

ndarrays (n-dimensional arrays) are the fundamental data structure in NumPy. They are homogeneous, multidimensional arrays used to store and manipulate large collections of numerical data efficiently.

Key Features of ndarrays:

Homogeneous:

All elements in an ndarray must be of the same data type, ensuring efficient storage and manipulation.


Multidimensional:

ndarrays can have any number of dimensions, allowing them to represent data in various forms, such as vectors, matrices, and tensors.


Fixed Size:

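Once created, the number of elements in an ndarray is fixed. Functions that appear to grow an array, such as np.append(), allocate a new array and copy the data into it, which keeps the memory layout of each array simple and compact.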


Element-wise Operations:

NumPy provides functions for performing element-wise operations on ndarrays, eliminating the need for explicit loops and improving performance.


Broadcasting:

Broadcasting enables operations between arrays of different shapes, making code more concise and efficient.


Differences from Standard Python Lists:

Data Type:

ndarrays are homogeneous (elements of the same data type), while Python lists can be heterogeneous (elements of different data types).


Memory Efficiency:

ndarrays store elements contiguously in memory, leading to better performance and memory efficiency than Python lists, where elements can be scattered in memory.


Performance:

NumPy operations on ndarrays are significantly faster than equivalent operations on Python lists due to optimized algorithms and C implementation.


Functionality:

NumPy provides a wide range of mathematical and scientific functions specifically designed for ndarrays, offering functionality not readily available for Python lists.


In summary, ndarrays are the core of NumPy, offering efficient storage and manipulation of numerical data for scientific computing. Their key features (homogeneity, multidimensionality, fixed size, element-wise operations, and broadcasting) make them a powerful tool for data analysis and numerical computations, and a far better fit for numerical work than standard Python lists.
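
A few of these differences are easy to see in code. The following sketch uses small made-up values to contrast how the `+` operator behaves on lists and on arrays, and to show how mixed input is converted to a single data type.

```python
import numpy as np

py_list = [1, 2, 3]
np_array = np.array([1, 2, 3])

list_sum = py_list + py_list      # lists: + concatenates
array_sum = np_array + np_array   # ndarrays: + adds element by element

mixed = np.array([1, 2.5, 3])     # ints are upcast so all elements share one dtype
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(list_sum)
print(array_sum)
print(mixed.dtype)
print(matrix.ndim, matrix.shape)
```
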

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

ANS   An analysis of the performance benefits of NumPy arrays over Python lists for large-scale numerical operations:

NumPy arrays offer significant performance advantages over Python lists for large-scale numerical operations due to the following reasons:

Contiguous Memory Allocation:

NumPy arrays store data in contiguous memory blocks, meaning elements are stored next to each other in memory. This allows for efficient access and manipulation of data, especially during vectorized operations. Python lists, on the other hand, store pointers to memory locations where the elements are stored, potentially leading to scattered memory allocation and slower access times.

Data Type Homogeneity:

NumPy arrays are homogeneous, meaning they store elements of the same data type. This allows for optimized storage and faster computations compared to Python lists, which can hold elements of different data types.

Vectorized Operations:

NumPy provides vectorized operations that allow mathematical functions to be applied to entire arrays without explicit loops. This eliminates the overhead of Python loops, leading to significant performance gains for large-scale computations.

Optimized C Implementation:

NumPy's core operations are implemented in C, a compiled language, making them significantly faster than equivalent operations performed in Python, an interpreted language. This low-level implementation contributes to the efficiency of NumPy arrays for numerical tasks.

Here's an example illustrating the performance difference:

In [5]:
import numpy as np
import time

# Create a large list and NumPy array
size = 1000000
python_list = list(range(size))
numpy_array = np.arange(size)

# Time the execution of a simple operation (e.g., addition)
# perf_counter() is a high-resolution clock intended for timing short code
start_time = time.perf_counter()
result_list = [x + 5 for x in python_list]
end_time = time.perf_counter()
print("Python list time:", end_time - start_time)

start_time = time.perf_counter()
result_array = numpy_array + 5
end_time = time.perf_counter()
print("NumPy array time:", end_time - start_time)

Python list time: 0.11676955223083496
NumPy array time: 0.009762763977050781


The exact timings depend on the machine and vary between runs, but in the run shown above the NumPy version is roughly 12 times faster than the list comprehension.

In summary, NumPy arrays offer substantial performance improvements for large-scale numerical operations due to their contiguous memory allocation, data type homogeneity, vectorized operations, and optimized C implementation. These features contribute to NumPy's efficiency and its widespread use in scientific computing and data analysis tasks.
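
A single timed run can be noisy. One possible refinement, sketched below, is to use the standard library's timeit module, which repeats each operation several times. The helper function names, the array size, and the repeat count are arbitrary choices for this illustration, and the timings will differ from machine to machine.

```python
import timeit
import numpy as np

size = 100_000
python_list = list(range(size))
numpy_array = np.arange(size)

def add_with_list():
    """Add 5 to every element using a list comprehension."""
    return [x + 5 for x in python_list]

def add_with_numpy():
    """Add 5 to every element using a vectorized NumPy operation."""
    return numpy_array + 5

# Run each version 20 times and report the total time in seconds
list_seconds = timeit.timeit(add_with_list, number=20)
numpy_seconds = timeit.timeit(add_with_numpy, number=20)

print("Python list time:", list_seconds)
print("NumPy array time:", numpy_seconds)
```
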

7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and
output.

ANS    A comparison of vstack() and hstack() functions in NumPy with examples:

vstack()

Purpose: Stacks arrays vertically (row-wise).


Input: Takes a sequence of arrays that have the same shape along every axis except the first. 1D arrays must have the same length and are treated as single rows.


Output: Returns a new array with the input arrays stacked vertically.

In [6]:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = np.vstack((a, b))
print(c)

[[1 2 3]
 [4 5 6]]


hstack()

Purpose: Stacks arrays horizontally (column-wise).


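Input: Takes a sequence of arrays that have the same shape along every axis except the second. 1D arrays can have different lengths, because they are simply joined end to end.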


Output: Returns a new array with the input arrays stacked horizontally.

In [7]:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = np.hstack((a, b))
print(c)

[1 2 3 4 5 6]


In summary:

vstack() stacks arrays vertically, increasing the number of rows.

hstack() stacks arrays horizontally, increasing the number of columns (for 1D inputs, it joins them end to end into a longer 1D array).
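
The examples above use 1D inputs, where hstack() just produces a longer 1D array. The sketch below repeats the comparison with two small 2D arrays (made-up values) so that the row and column growth is visible, and shows that incompatible shapes raise a ValueError.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

v = np.vstack((a, b))  # rows of b go below rows of a: shape (4, 2)
h = np.hstack((a, b))  # columns of b go to the right of a: shape (2, 4)

print(v.shape)
print(h.shape)

# Stacking arrays whose column counts differ is an error
try:
    np.vstack((a, np.array([1, 2, 3])))
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print("Mismatch raised:", mismatch_error is not None)
```
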

8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various
array dimensions.

ANS  fliplr()

Purpose: Reverses the order of elements along axis 1 (left to right).

Effect on array dimensions:

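1D arrays: Not supported. fliplr() requires at least two dimensions and raises a ValueError for a 1D array; use np.flip() or flipud() instead.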

2D arrays: Flips the entries in each row in the left/right direction. Columns are preserved but appear in reverse order.

N-dimensional arrays (N > 2): Equivalent to np.flip(array, axis=1). It flips along the second axis.

Example:

In [8]:
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
flipped_arr = np.fliplr(arr_2d)
print(flipped_arr)

[[3 2 1]
 [6 5 4]]


flipud()

Purpose: Reverses the order of elements along axis 0 (up to down).

Effect on array dimensions:

1D arrays: Reverses the elements. Unlike fliplr(), flipud() accepts 1D input.

2D arrays: Flips the entries in each column in the up/down direction. Rows are preserved but appear in reverse order.

N-dimensional arrays (N > 2): Equivalent to np.flip(array, axis=0). It flips along the first axis.

Example:

In [9]:
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
flipped_arr = np.flipud(arr_2d)
print(flipped_arr)

[[4 5 6]
 [1 2 3]]


In summary:

fliplr() flips an array horizontally (left to right).

flipud() flips an array vertically (up to down).

For arrays with more than two dimensions, consider using np.flip() with the axis argument for more control over the flipping direction.
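
The behavior on 1D and 3D input can be verified with a short sketch. The arrays here are made-up examples; the 3D checks compare fliplr() and flipud() against np.flip() on axis 1 and axis 0 respectively.

```python
import numpy as np

arr_1d = np.array([1, 2, 3])
arr_3d = np.arange(24).reshape(2, 3, 4)

# flipud() works on 1D input because it only needs axis 0
flipped_1d = np.flipud(arr_1d)
print(flipped_1d)

# fliplr() needs axis 1, so 1D input raises a ValueError
try:
    np.fliplr(arr_1d)
    fliplr_1d_failed = False
except ValueError:
    fliplr_1d_failed = True
print("fliplr on 1D raised ValueError:", fliplr_1d_failed)

# On 3D input, each function matches np.flip() on its own axis
print(np.array_equal(np.fliplr(arr_3d), np.flip(arr_3d, axis=1)))
print(np.array_equal(np.flipud(arr_3d), np.flip(arr_3d, axis=0)))
```
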

9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

ANS    Functionality:

The array_split() method in NumPy is used to split an array into multiple sub-arrays. It provides more flexibility than the split() method by handling uneven splits gracefully. This means that if the array cannot be divided evenly into the desired number of sub-arrays, array_split() will distribute the remaining elements as evenly as possible among the sub-arrays.

Handling Uneven Splits:

Here's how array_split() handles uneven splits:

Divisible Case:

If the array's size is perfectly divisible by the specified number of splits, array_split() will create sub-arrays of equal size.

Indivisible Case:

If the array's size is not divisible by the number of splits, array_split() will distribute the remaining elements in a way that minimizes the size difference between sub-arrays.

The first r sub-arrays will have size s+1.

The remaining n-r sub-arrays will have size s, where:

n is the number of sub-arrays requested

s is floor(len(array) / n) (the size of each sub-array if the split was even)

r is len(array) % n (the remainder when dividing the array's size by the number of sub-arrays)

Example:

In [15]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Splitting into 3 sub-arrays
sub_arrays = np.array_split(arr, 3)


In this example, array_split() splits the array arr into 3 sub-arrays. Since the array size (8) is not divisible by 3, the remaining elements (2) are distributed among the sub-arrays, resulting in sub-arrays with sizes 3, 3, and 2.
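
The cell above does not print anything, so here is a small sketch that displays the pieces and their sizes. It also splits a hypothetical 10-element array into 4 parts and shows that np.split() refuses the same uneven split that array_split() accepts.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

sub_arrays = np.array_split(arr, 3)
sizes = [len(part) for part in sub_arrays]
print(sub_arrays)
print("Sizes:", sizes)

# A second case: 10 elements into 4 parts
sizes_10_into_4 = [len(part) for part in np.array_split(np.arange(10), 4)]
print("Sizes:", sizes_10_into_4)

# np.split() requires an even division and raises ValueError otherwise
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True
print("np.split raised ValueError:", split_failed)
```

In both cases the larger sub-arrays come first and the smaller ones follow.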

In summary, array_split() is a flexible method for splitting arrays in NumPy. Its ability to handle uneven splits makes it a valuable tool for tasks where you need to divide data into specific groups, even if the data size doesn't allow for perfect division.

10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

ANS Vectorization

Vectorization is the process of applying operations to entire arrays instead of individual elements. NumPy achieves this through optimized, pre-compiled C code, eliminating the need for explicit Python loops. This leads to significant performance improvements, especially for large arrays.

Broadcasting

Broadcasting is a powerful mechanism in NumPy that allows operations between arrays of different shapes, as long as certain rules are satisfied. It eliminates the need to create temporary arrays or explicitly reshape arrays to perform element-wise operations. This leads to more concise and efficient code.

Contribution to Efficient Array Operations

Reduced Execution Time:

Vectorization and broadcasting significantly reduce execution time by leveraging optimized C code and eliminating Python loops.

Improved Memory Efficiency:

Broadcasting avoids creating unnecessary copies of data, leading to better memory utilization, especially for large datasets.

Concise Code:

Broadcasting allows for more concise and readable code by simplifying operations between arrays of different shapes.

Enhanced Flexibility:

Broadcasting provides flexibility in performing operations on arrays with compatible dimensions, enabling a wider range of calculations.

Example:

In [16]:
import numpy as np

a = np.array([1, 2, 3])
b = 2

result = a * b  # Broadcasting: b is stretched to match the shape of a
print(result)

[2 4 6]


In this example, broadcasting allows the scalar b to be multiplied with each element of the array a without explicit looping. This demonstrates how broadcasting simplifies and optimizes array operations.

In summary, vectorization and broadcasting are fundamental concepts in NumPy that contribute to efficient array operations. They significantly improve performance, memory efficiency, code conciseness, and flexibility when working with numerical data in Python.
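
Broadcasting also applies between arrays, not only between an array and a scalar. NumPy compares shapes starting from the last dimension, and two dimensions are compatible when they are equal or when one of them is 1. The sketch below uses made-up values to show a row and a column being broadcast across a 2x3 array, plus an incompatible case.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)
column = np.array([[100], [200]])     # shape (2, 1)

plus_row = matrix + row        # row is added to every row of matrix
plus_column = matrix + column  # column is added to every column of matrix

print(plus_row)
print(plus_column)

# Shapes (2, 3) and (2,) do not line up on the last dimension
try:
    matrix + np.array([1, 2])
    incompatible_failed = False
except ValueError:
    incompatible_failed = True
print("Incompatible shapes raised ValueError:", incompatible_failed)
```
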

# **Practical Questions:**

1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

In [17]:
import numpy as np

# Create a 3x3 array with random integers between 1 and 100
array = np.random.randint(1, 101, size=(3, 3))

# Interchange rows and columns using transpose()
transposed_array = array.transpose()

print("Original Array:")
print(array)
print("\nTransposed Array:")
print(transposed_array)

Original Array:
[[88 72 18]
 [26 90 28]
 [13 86 51]]

Transposed Array:
[[88 26 13]
 [72 90 86]
 [18 28 51]]


2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

In [18]:
import numpy as np

# Generate a 1D array with 10 elements
arr = np.arange(10)

# Reshape into a 2x5 array
arr_2x5 = arr.reshape(2, 5)

# Reshape into a 5x2 array
arr_5x2 = arr.reshape(5, 2)

print("Original Array:")
print(arr)
print("\n2x5 Array:")
print(arr_2x5)
print("\n5x2 Array:")
print(arr_5x2)

Original Array:
[0 1 2 3 4 5 6 7 8 9]

2x5 Array:
[[0 1 2 3 4]
 [5 6 7 8 9]]

5x2 Array:
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]
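
As a side note, reshape() can infer one dimension when it is given as -1, and it raises a ValueError when the requested shape does not match the number of elements. A minimal sketch with the same 10-element array:

```python
import numpy as np

arr = np.arange(10)

# -1 asks NumPy to work out that dimension from the array size
inferred_rows = arr.reshape(-1, 5)
inferred_cols = arr.reshape(5, -1)
print(inferred_rows.shape)
print(inferred_cols.shape)

# 10 elements cannot fill a 3x4 array
try:
    arr.reshape(3, 4)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print("reshape(3, 4) raised ValueError:", reshape_failed)
```
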


3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.

In [19]:
import numpy as np

# Create a 4x4 array with random float values
array_4x4 = np.random.rand(4, 4)

# Create a 6x6 array filled with zeros
array_6x6 = np.zeros((6, 6))

# Insert the 4x4 array into the center of the 6x6 array
array_6x6[1:5, 1:5] = array_4x4

print("Original 4x4 Array:")
print(array_4x4)
print("\n6x6 Array with Zero Border:")
print(array_6x6)

Original 4x4 Array:
[[0.9829749  0.98887505 0.75110203 0.8978966 ]
 [0.25983339 0.35575499 0.80076747 0.51766777]
 [0.53728076 0.48017029 0.46974686 0.31919023]
 [0.74296646 0.55228356 0.51955065 0.15801866]]

6x6 Array with Zero Border:
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.9829749  0.98887505 0.75110203 0.8978966  0.        ]
 [0.         0.25983339 0.35575499 0.80076747 0.51766777 0.        ]
 [0.         0.53728076 0.48017029 0.46974686 0.31919023 0.        ]
 [0.         0.74296646 0.55228356 0.51955065 0.15801866 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]
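
An alternative to filling a zero array by hand is np.pad(), which adds the border in one call. This is a sketch of that alternative rather than a replacement for the approach above; the check at the end confirms that the interior still matches the original array.

```python
import numpy as np

array_4x4 = np.random.rand(4, 4)

# pad_width=1 adds one element on every side; constant_values sets the fill value
padded = np.pad(array_4x4, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)
print(np.array_equal(padded[1:5, 1:5], array_4x4))
```
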


4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [20]:
import numpy as np

array = np.arange(10, 61, 5)
print(array)

[10 15 20 25 30 35 40 45 50 55 60]


5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations
(uppercase, lowercase, title case, etc.) to each element.

In [21]:
import numpy as np

# Create the NumPy array
arr = np.array(['python', 'numpy', 'pandas'])

# Apply case transformations
uppercase = np.char.upper(arr)
lowercase = np.char.lower(arr)
titlecase = np.char.title(arr)

# Print the results
print("Original Array:", arr)
print("Uppercase:", uppercase)
print("Lowercase:", lowercase)
print("Titlecase:", titlecase)

Original Array: ['python' 'numpy' 'pandas']
Uppercase: ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase: ['python' 'numpy' 'pandas']
Titlecase: ['Python' 'Numpy' 'Pandas']


6. Generate a NumPy array of words. Insert a space between each character of every word in the array.

In [22]:
import numpy as np

# Create a NumPy array of words
words = np.array(['hello', 'world', 'python'])

# Insert a space between each character
result = np.char.join(' ', words)

print("Original Array:", words)
print("Result:", result)

Original Array: ['hello' 'world' 'python']
Result: ['h e l l o' 'w o r l d' 'p y t h o n']


7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

In [23]:
import numpy as np

# Create two 2D arrays
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[7, 8, 9], [10, 11, 12]])

# Perform element-wise operations
addition = array1 + array2
subtraction = array1 - array2
multiplication = array1 * array2
division = array1 / array2

# Print the results
print("Array 1:", array1)
print("Array 2:", array2)
print("\nAddition:", addition)
print("Subtraction:", subtraction)
print("Multiplication:", multiplication)
print("Division:", division)

Array 1: [[1 2 3]
 [4 5 6]]
Array 2: [[ 7  8  9]
 [10 11 12]]

Addition: [[ 8 10 12]
 [14 16 18]]
Subtraction: [[-6 -6 -6]
 [-6 -6 -6]]
Multiplication: [[ 7 16 27]
 [40 55 72]]
Division: [[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]


8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

In [24]:
import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.identity(5)

# Extract diagonal elements
diagonal_elements = np.diag(identity_matrix)

print("Identity Matrix:")
print(identity_matrix)
print("\nDiagonal Elements:", diagonal_elements)

Identity Matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal Elements: [1. 1. 1. 1. 1.]


9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in
this array.

In [25]:
import numpy as np

def is_prime(n):
  """Returns True if n is a prime number, False otherwise."""
  if n <= 1:
    return False
  for i in range(2, int(n**0.5) + 1):
    if n % i == 0:
      return False
  return True

# Generate a NumPy array of 100 random integers between 0 and 1000
random_integers = np.random.randint(0, 1001, size=100)

# Find and display all prime numbers in the array
prime_numbers = [num for num in random_integers if is_prime(num)]

print("Random Integers:", random_integers)
print("\nPrime Numbers:", prime_numbers)

Random Integers: [348 411 269 357  22  21  71 676 606 539 396 354 144  32 590 243 815 957
 698  35 127 575  39 592 421 314 278 636 636 826 118 968 277  74 689 584
 920 216 185  20 984 475 566 645 767 326 341 429  35 526 324 214 388 166
 769 426 277 198   7 647 159 542 723 887 441 676 228 289 101 811 230 849
 851 619 676 691 572  80 619 956 553 304 750 460 947 345 971 616 691 569
 166 733 983 821 401 399 775 960 404 553]

Prime Numbers: [269, 71, 127, 421, 277, 769, 277, 7, 647, 887, 101, 811, 619, 691, 619, 947, 971, 691, 569, 733, 983, 821, 401]
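
Because the values are bounded by 1000, another option is to precompute which numbers are prime with a sieve of Eratosthenes and then select primes with a boolean mask. This is a sketch of that alternative technique; the function name `prime_mask_up_to` is a hypothetical helper introduced here.

```python
import numpy as np

def prime_mask_up_to(limit):
    """Return a boolean array where index i is True when i is prime."""
    is_prime = np.ones(limit + 1, dtype=bool)
    is_prime[:2] = False  # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            is_prime[i * i::i] = False  # cross out multiples of i
    return is_prime

random_integers = np.random.randint(0, 1001, size=100)

# Look up each random integer in the sieve, then keep the ones marked prime
mask = prime_mask_up_to(1000)[random_integers]
prime_numbers = random_integers[mask]

print("Prime Numbers:", prime_numbers)
```

The result is a NumPy array rather than a Python list, and the primality test is a single indexing operation instead of a loop per element.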


10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly
averages.

In [27]:
import numpy as np

# Assume 30 days in the month
daily_temperatures = np.random.randint(20, 35, size=30)  # Example temperatures (20 to 34)

# Group the first 28 days into 4 full weeks of 7 days each;
# the 2 remaining days form a partial week that is averaged separately
days_per_week = 7
num_full_weeks = len(daily_temperatures) // days_per_week
full_days = num_full_weeks * days_per_week
weekly_temperatures = daily_temperatures[:full_days].reshape(num_full_weeks, days_per_week)
remaining_days = daily_temperatures[full_days:]

# Calculate weekly averages
weekly_averages = np.mean(weekly_temperatures, axis=1)
print("Daily Temperatures:", daily_temperatures)
print("\nWeekly Temperatures:", weekly_temperatures)
print("\nWeekly Averages:", np.round(weekly_averages, 2))
print("Average of Remaining Days:", np.mean(remaining_days))

Daily Temperatures: [28 34 29 34 32 31 26 20 23 23 34 30 21 20 21 22 32 22 26 27 23 24 30 28
 30 31 25 24 29 22]

Weekly Temperatures: [[28 34 29 34 32 31 26]
 [20 23 23 34 30 21 20]
 [21 22 32 22 26 27 23]
 [24 30 28 30 31 25 24]]

Weekly Averages: [30.57 24.43 24.71 27.43]
Average of Remaining Days: 25.5
