In [None]:
#THEORETICAL QUESTIONS

In [None]:
# 1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it
# enhance Python's capabilities for numerical operations?

# NumPy (Numerical Python) is a fundamental library for scientific computing and data analysis in Python.
# It provides powerful tools and functionalities for working with arrays and matrices, enabling efficient numerical operations.

# Here's a breakdown of its purpose and advantages:

# a. Efficient Array Operations:
#    - NumPy's core strength lies in its `ndarray` (n-dimensional array) object.
#    - Unlike Python lists, which can store heterogeneous data types, NumPy arrays store homogeneous data, allowing for efficient storage and processing.
#    - NumPy enables element-wise operations on arrays without the need for explicit loops, significantly improving performance for numerical calculations.
#    - It facilitates vectorized operations, which operate on entire arrays simultaneously, resulting in concise and faster code.

# b. Mathematical and Statistical Functions:
#    - NumPy provides a comprehensive collection of mathematical functions (e.g., trigonometric, logarithmic, exponential, linear algebra) optimized for array operations.
#    - It allows for statistical calculations, such as mean, median, standard deviation, and correlation, on arrays effortlessly.

# c. Broadcasting:
#    - Broadcasting enables operations between arrays with different shapes and sizes.
#    - It automatically expands the smaller array to match the shape of the larger one, allowing element-wise calculations without explicitly replicating data.
#    - This significantly reduces the need for manual array manipulation and enhances code readability.

# d. Integration with Other Libraries:
#    - NumPy serves as a cornerstone for many data science and machine learning libraries in Python, including Pandas, SciPy, and scikit-learn.
#    - It facilitates seamless data exchange and integration between these libraries.

# How NumPy enhances Python's capabilities for numerical operations:

# - Performance: NumPy leverages optimized C code for array operations, making it significantly faster than using native Python lists or loops.
# - Conciseness: It enables expressing complex numerical computations in a more concise and readable manner through vectorization and broadcasting.
# - Functionality: NumPy provides a vast array of tools and functions specifically designed for numerical operations, expanding Python's capabilities in this domain.
# - Ecosystem: It acts as a fundamental building block for a large ecosystem of scientific computing and data analysis libraries in Python.

# In summary, NumPy is a crucial component for performing efficient numerical computations and data analysis in Python.
# Its array operations, mathematical functions, broadcasting, and integration with other libraries empower researchers, data scientists, and engineers to handle large datasets and perform sophisticated calculations with ease.
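
# As a rough illustration of the vectorization point above, the sketch below computes the same
# sum of squares with a plain Python loop and with one NumPy expression. The names `values`,
# `loop_total`, and `vectorized_total` are made up for this example.

```python
import numpy as np

values = np.arange(1, 1001)  # integers 1..1000

# Plain Python: explicit loop over every element
loop_total = 0
for v in values.tolist():
    loop_total += v * v

# NumPy: the element-wise square and the sum both run in compiled code
vectorized_total = int(np.sum(values * values))
```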


In [None]:
# 2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?


# np.mean() and np.average() are both used to calculate the average of an array in NumPy, but they differ in how they handle weights.

# np.mean():
# - Calculates the arithmetic mean of the array elements.
# - It treats all elements equally without considering weights.

# np.average():
# - Calculates the weighted average of the array elements when a `weights` argument is given.
# - Without weights, it returns the same result as np.mean().
# - The weights control how much each element contributes to the overall average.

# Here's a comparison:

# | Feature     | np.mean()                       | np.average()                                     |
# |-------------|---------------------------------|--------------------------------------------------|
# | Weights     | No weights (all elements equal) | Optional `weights` argument                      |
# | Calculation | Arithmetic mean                 | Weighted average (arithmetic mean if no weights) |
# | Use case    | Simple average of all elements  | Average that reflects each element's importance  |


# When to use np.mean():

# - When you want the simple average of all elements in the array.
# - When you don't need to consider the relative importance of different elements.

# When to use np.average():

# - When you want a weighted average, where some elements have more influence than others.
# - For example, in calculating a grade average where different assignments have different weights.
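
# A minimal sketch of the grade example above. The scores and weights are invented for
# illustration; only np.mean() and np.average() with its `weights` argument are NumPy's own.

```python
import numpy as np

scores = np.array([80.0, 90.0, 100.0])   # hypothetical assignment scores
weights = np.array([0.2, 0.3, 0.5])      # hypothetical weights that sum to 1

simple_mean = np.mean(scores)                           # (80 + 90 + 100) / 3 = 90
unweighted_average = np.average(scores)                 # no weights given, so same as np.mean
weighted_average = np.average(scores, weights=weights)  # 0.2*80 + 0.3*90 + 0.5*100 = 93
```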




In [None]:
# 3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

import numpy as np

# Reversing a 1D array
arr_1 = np.array([1, 2, 3, 4, 5])
reversed_arr_1 = np.flip(arr_1)
print(reversed_arr_1)  # Output: [5 4 3 2 1]

# Reversing a 2D array along axis 0 (rows)
arr_2 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
reversed_arr_2_axis0 = np.flip(arr_2, axis=0)
print(reversed_arr_2_axis0) #Here the array is reversed along axis 0


# Reversing a 2D array along axis 1 (columns)
reversed_arr_2_axis1 = np.flip(arr_2, axis=1)
print(reversed_arr_2_axis1) #Here the array is reversed along axis 1

# Reversing a 2D array along both axes
reversed_arr_2_both = np.flip(arr_2)
print(reversed_arr_2_both) #Here the array is reversed along both axes
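
# Slicing with a negative step is another common way to reverse along an axis. The sketch
# below uses a separate, made-up array named `grid` and produces no printed output.

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

rows_reversed = grid[::-1, :]      # same result as np.flip(grid, axis=0)
cols_reversed = grid[:, ::-1]      # same result as np.flip(grid, axis=1)
both_reversed = grid[::-1, ::-1]   # same result as np.flip(grid)

# Like np.flip, slicing returns a view: it shares memory with the original array
shares_memory = np.shares_memory(grid, rows_reversed)
```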

[5 4 3 2 1]
[[7 8 9]
 [4 5 6]
 [1 2 3]]
[[3 2 1]
 [6 5 4]
 [9 8 7]]
[[9 8 7]
 [6 5 4]
 [3 2 1]]


In [None]:
# 4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types
# in memory management and performance.


# Determining Data Type:

# You can determine the data type of elements in a NumPy array using the `dtype` attribute.

import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
print(arr1.dtype)  # Output: int64 (platform-dependent; may be int32, e.g. on Windows with NumPy versions before 2.0)


# Importance of Data Types:

# Data types play a crucial role in memory management and performance in NumPy:

# a. Memory Allocation:

# - Each data type has a specific size in memory (e.g., int32 occupies 4 bytes per element).
# - When you create a NumPy array, the system allocates memory based on the specified data type and the number of elements.
# - Choosing the appropriate data type allows for efficient memory usage and reduces unnecessary memory consumption.

# b. Performance:

# - Data types impact the speed of computations performed on the array.
# - NumPy's optimized routines are tailored to specific data types, resulting in faster execution.
# - Using a smaller data type (e.g., int16 instead of int64) when the range of values allows it can improve performance, especially for large arrays.

# c. Accuracy:

# - Data types determine the precision with which values are stored.
# - Floating-point data types (e.g., float32, float64) allow for storing fractional numbers with varying levels of precision.
# - Choosing the appropriate data type ensures that calculations maintain the desired level of accuracy.

# Example:

# For example, if you are dealing with pixel values in an image, you might choose the uint8 data type because pixel values generally range from 0 to 255.
# Using this data type instead of int64 will significantly reduce memory usage and potentially improve performance when processing the image.


# Summary:

# Data types in NumPy are essential for efficient memory management and performance.
# By choosing the right data type, you can optimize storage, improve the speed of computations, and enhance the accuracy of your numerical analysis and data processing tasks.
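
# To make the memory point concrete, this sketch compares the storage used by the same
# 100x100 array as int64 and as uint8. The array names are invented for the example, and the
# last line shows one caveat of small types: out-of-range results wrap around silently.

```python
import numpy as np

pixels_int64 = np.zeros((100, 100), dtype=np.int64)
pixels_uint8 = pixels_int64.astype(np.uint8)

bytes_int64 = pixels_int64.nbytes   # 10_000 elements * 8 bytes = 80_000
bytes_uint8 = pixels_uint8.nbytes   # 10_000 elements * 1 byte  = 10_000

# Caveat: uint8 holds 0..255, so 250 + 10 wraps around to 4 instead of raising an error
wrapped = np.array([250], dtype=np.uint8) + np.uint8(10)
```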


int64


In [None]:
# 5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?


# ndarrays (N-dimensional arrays) are the fundamental data structures in NumPy.
# They are homogeneous multidimensional arrays that store elements of the same data type.

# Key Features:

# a. Homogeneous Data Type: Unlike Python lists, which can store different data types in a single list, ndarrays require all elements to have the same data type. This allows for efficient storage and processing.

# b. Fixed Size: Once created, the size of an ndarray is fixed. You can't directly append or remove elements like you can with lists.

# c. Vectorized Operations: ndarrays support element-wise operations, meaning operations are performed on entire arrays simultaneously, eliminating the need for explicit loops.

# d. Broadcasting: ndarrays enable operations between arrays of different shapes and sizes. The smaller array is automatically "broadcasted" to match the shape of the larger array, allowing for efficient element-wise calculations.

# e. Memory Efficiency: NumPy arrays are typically more memory-efficient than Python lists, especially for large datasets.

# f. Multidimensional: ndarrays can have multiple dimensions (e.g., 1D, 2D, 3D, etc.). This allows them to represent matrices, tensors, and other complex data structures.

# How they differ from standard Python lists:

# | Feature           | ndarray                                   | Python list                                  |
# |-------------------|-------------------------------------------|----------------------------------------------|
# | Data Type         | Homogeneous (same data type)              | Heterogeneous (different data types allowed) |
# | Size              | Fixed                                     | Dynamic (can grow/shrink)                    |
# | Operations        | Vectorized, element-wise                  | Iterative, element-by-element                |
# | Memory Efficiency | More efficient for large datasets         | Less efficient for large datasets            |
# | Multidimensional  | Supports multiple dimensions (N-D)        | Only through nested lists (no native N-D)    |
# | Performance       | Generally faster for numerical operations | Slower for numerical operations              |
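
# A short sketch of three of the differences listed above: homogeneous dtype, element-wise
# arithmetic, and fixed size. All variable names are hypothetical.

```python
import numpy as np

mixed_list = [1, "two", 3.0]             # a list can hold different types
upcast = np.array([1, 2.5, 3])           # ints and floats are converted to one dtype (float64)

doubled_list = [1, 2, 3] * 2             # list: repetition -> [1, 2, 3, 1, 2, 3]
doubled_array = np.array([1, 2, 3]) * 2  # ndarray: element-wise -> [2, 4, 6]

# "Fixed size": np.append returns a new array and leaves the original unchanged
base = np.array([1, 2, 3])
extended = np.append(base, 4)
```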



In [None]:
# 6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

# For large numerical workloads, NumPy arrays offer significant performance advantages over Python lists:

# - Contiguous memory: NumPy arrays store their data in a contiguous block of memory, which allows faster access and reduces memory overhead.
# - Broadcasting: NumPy automatically aligns arrays of compatible shapes during operations, which leads to more concise code.
# - Specialized data types: fixed-size types such as numpy.int64 and numpy.float32 are optimized for arithmetic operations and memory usage.

# Conclusion:
# NumPy offers a substantial performance advantage over Python lists for large-scale numerical operations.
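
# One way to check the claim on your own machine is a small timing sketch like the one below.
# It uses the standard-library timeit module; the array size and repeat count are arbitrary
# choices, and no specific timings are claimed because they vary by hardware.

```python
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_array = np.arange(n)

# Same computation two ways: double every element
list_result = [x * 2 for x in py_list]
array_result = np_array * 2

# Total seconds for 20 runs of each approach
list_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=20)
array_time = timeit.timeit(lambda: np_array * 2, number=20)
```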


In [None]:
# 7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and
# output.

import numpy as np

# vstack()

# vstack() is used to stack arrays vertically, meaning it combines arrays along the row axis (axis=0).
# This results in a new array with rows from the input arrays concatenated together.

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

vertical_stack = np.vstack((arr1, arr2))
print("vstack output:\n", vertical_stack)


# hstack()

# hstack() is used to stack arrays horizontally (column-wise).
# For 2D arrays it concatenates along the column axis (axis=1); 1D arrays are simply joined end to end into one longer 1D array.

arr3 = np.array([[1], [2], [3]])
arr4 = np.array([[4], [5], [6]])

horizontal_stack = np.hstack((arr3, arr4))
print("hstack output:\n", horizontal_stack)
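
# A brief sketch of how the two functions treat 1D inputs, and how they relate to
# np.concatenate for 2D inputs. The arrays `a`, `b`, and `m` are made up for this example.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

flat = np.hstack((a, b))       # 1D inputs are joined end to end -> shape (6,)
stacked = np.vstack((a, b))    # 1D inputs become rows -> shape (2, 3)

# For 2D inputs the two functions match np.concatenate on axis 1 and axis 0
m = np.array([[1, 2], [3, 4]])
h_equiv = np.array_equal(np.hstack((m, m)), np.concatenate((m, m), axis=1))
v_equiv = np.array_equal(np.vstack((m, m)), np.concatenate((m, m), axis=0))
```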


vstack output:
 [[1 2 3]
 [4 5 6]]
hstack output:
 [[1 4]
 [2 5]
 [3 6]]


In [None]:
# 8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various
# array dimensions.

import numpy as np

# fliplr() and flipud() are NumPy functions used to flip arrays along their horizontal and vertical axes, respectively.

# fliplr() (Flip Left-Right):

#   - Flips the array along the horizontal axis (axis=1).
#   - It reverses the order of columns in the array.
#   - For a 2D array, it mirrors the array along its vertical centerline.

# flipud() (Flip Up-Down):

#   - Flips the array along the vertical axis (axis=0).
#   - It reverses the order of rows in the array.
#   - For a 2D array, it mirrors the array along its horizontal centerline.

# Example:

arr = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Flipping horizontally (fliplr)
flipped_lr = np.fliplr(arr)
print("Flipped Left-Right:\n", flipped_lr)


# Flipping vertically (flipud)
flipped_ud = np.flipud(arr)
print("Flipped Up-Down:\n", flipped_ud)


# Effects on various array dimensions:

#   - 1D Arrays:
#     - flipud() reverses the order of elements in a 1D array.
#     - fliplr() raises a ValueError on a 1D array because it requires at least two dimensions.

#   - 2D Arrays:
#     - fliplr() reverses the columns.
#     - flipud() reverses the rows.

#   - Higher Dimensional Arrays:
#     - The flipping effect applies along the specified axis (axis=1 for fliplr(), axis=0 for flipud()).
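
# A small check of how both functions behave on 1D and 3D inputs. The names `one_d` and
# `cube` are hypothetical, and nothing is printed here.

```python
import numpy as np

one_d = np.array([1, 2, 3])
flipped_1d = np.flipud(one_d)          # works: reverses the only axis -> [3, 2, 1]

try:
    np.fliplr(one_d)                   # fliplr needs an axis 1, which a 1D array lacks
    fliplr_failed = False
except ValueError:
    fliplr_failed = True

# On a 3D array, fliplr and flipud match np.flip along axis 1 and axis 0
cube = np.arange(8).reshape(2, 2, 2)
same_as_axis1 = np.array_equal(np.fliplr(cube), np.flip(cube, axis=1))
same_as_axis0 = np.array_equal(np.flipud(cube), np.flip(cube, axis=0))
```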


Flipped Left-Right:
 [[3 2 1]
 [6 5 4]
 [9 8 7]]
Flipped Up-Down:
 [[7 8 9]
 [4 5 6]
 [1 2 3]]


In [None]:
# 9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

import numpy as np

# array_split() Functionality:

# The array_split() method in NumPy is used to split an array into multiple sub-arrays along a specified axis.
# It allows you to divide an array into a desired number of sub-arrays, even if the array cannot be divided evenly.


# Example: Even Split

arr = np.arange(10)
sub_arrays = np.array_split(arr, 5)
print("Even split into 5 sub-arrays:", sub_arrays)

# Example: Uneven Split

arr = np.arange(11)
sub_arrays = np.array_split(arr, 3)
print("\nUneven split into 3 sub-arrays:", sub_arrays)

# Handling Uneven Splits:

# If the array cannot be divided into equal-sized sub-arrays, array_split() gives the leftover elements to the first sub-arrays, one each.
# For an array of length l split into n sections, the first l % n sub-arrays have size l // n + 1 and the rest have size l // n.

# In summary, the array_split() method provides flexibility for splitting arrays, even when the desired number of splits does not result in equal-sized sub-arrays.
# It intelligently distributes the remaining elements to ensure that the entire array is included in the resulting sub-arrays.
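
# The sketch below checks the sub-array sizes for an 11-element array split three ways, and
# contrasts this with np.split(), which rejects uneven splits. Variable names are illustrative.

```python
import numpy as np

parts = np.array_split(np.arange(11), 3)
sizes = [len(p) for p in parts]   # 11 % 3 = 2 parts of size 4, then 1 part of size 3

# np.split, by contrast, insists on an even division
try:
    np.split(np.arange(11), 3)
    split_failed = False
except ValueError:
    split_failed = True
```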


Even split into 5 sub-arrays: [array([0, 1]), array([2, 3]), array([4, 5]), array([6, 7]), array([8, 9])]

Uneven split into 3 sub-arrays: [array([0, 1, 2, 3]), array([4, 5, 6, 7]), array([ 8,  9, 10])]


In [None]:
# 10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array
# operations?


# Vectorization:

# Vectorization is a technique in NumPy that lets you perform operations on entire arrays without explicit Python loops.
# Instead of iterating through elements in the interpreter, a vectorized operation runs the loop inside NumPy's compiled code.

# Example:

import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([6, 7, 8, 9, 10])

# Element-wise addition using vectorization
result = arr1 + arr2
print("Vectorized addition:", result)


# Broadcasting:

# Broadcasting is a set of rules that lets NumPy operate on arrays whose shapes differ but are compatible.
# Shapes are compared from the trailing dimension backwards; two dimensions are compatible when they are equal or one of them is 1.
# The smaller array is then treated as if it were stretched to the larger shape, enabling element-wise calculations without explicit replication of data.

# Example:

arr3 = np.array([[1, 2, 3],
                [4, 5, 6]])
b = 2

# Broadcasting multiplication
result = arr3 * b
print("Broadcasting multiplication:", result)

# How They Contribute to Efficiency:

# a. Performance:
#    - Vectorized operations and broadcasting are significantly faster than using explicit loops in Python.
#    - NumPy leverages optimized C code for these operations, making them highly efficient.

# b. Conciseness:
#    - Vectorization and broadcasting enable expressing complex numerical operations in a more concise and readable manner.
#    - They eliminate the need for manual element-wise iteration, leading to cleaner code.

# c. Memory Efficiency:
#    - Broadcasting avoids unnecessary data replication, which can consume significant memory, particularly for large arrays.
#    - It allows for efficient calculations without creating intermediate arrays unnecessarily.

# In Summary:

# Vectorization and broadcasting are fundamental techniques in NumPy that enable efficient array operations.
# They contribute to improved performance, code conciseness, and memory efficiency, making NumPy a powerful tool for numerical computations and data analysis.
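
# As a sketch of broadcasting between two arrays (rather than an array and a scalar), the
# example below adds a (3, 1) column to a (3,) row. np.broadcast_to is used only to show that
# the expansion reuses memory; the names `column`, `row`, and `table` are made up.

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,)

# Shapes (3, 1) and (3,) broadcast to (3, 3); neither input is copied first
table = column + row

# np.broadcast_to exposes the "virtual" expansion: the stretched axis has stride 0,
# meaning the same memory is reused rather than replicated
expanded = np.broadcast_to(row, (3, 3))
row_axis_stride = expanded.strides[0]
```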


Vectorized addition: [ 7  9 11 13 15]
Broadcasting multiplication: [[ 2  4  6]
 [ 8 10 12]]


In [None]:
#PRACTICAL QUESTIONS

In [None]:
# 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns

import numpy as np

# Create a 3x3 NumPy array with random integers between 1 and 100
# (the upper bound of randint is exclusive, so 101 is needed to include 100)
arr = np.random.randint(1, 101, size=(3, 3))

print("Original Array:\n", arr)

# Interchange rows and columns
arr_transposed = arr.transpose()

print("\nArray with Interchanged Rows and Columns:\n", arr_transposed)


Original Array:
 [[46 10  2]
 [83 13 71]
 [24 90 65]]

Array with Interchanged Rows and Columns:
 [[46 83 24]
 [10 13 90]
 [ 2 71 65]]


In [None]:
# 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

import numpy as np

# Generate a 1D NumPy array with 10 elements
arr = np.arange(10)

# Reshape it into a 2x5 array
arr_2x5 = arr.reshape(2, 5)
print("2x5 array:", arr_2x5)

# Reshape it into a 5x2 array
arr_5x2 = arr.reshape(5, 2)
print("5x2 array:", arr_5x2)


2x5 array: [[0 1 2 3 4]
 [5 6 7 8 9]]
5x2 array: [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


In [None]:
# 3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.

import numpy as np

# Create a 4x4 NumPy array with random float values
arr = np.random.rand(4, 4)

# Add a border of zeros around it
bordered_arr = np.pad(arr, pad_width=1, mode='constant', constant_values=0)

print("Original Array:\n", arr)
print("\nArray with Zero Border:\n", bordered_arr)


Original Array:
 [[0.14068932 0.70860776 0.53907099 0.60367167]
 [0.04085621 0.04401192 0.2071238  0.29784333]
 [0.67056354 0.91264677 0.19260197 0.97306402]
 [0.11156798 0.8769091  0.7776981  0.83806379]]

Array with Zero Border:
 [[0.         0.         0.         0.         0.         0.        ]
 [0.         0.14068932 0.70860776 0.53907099 0.60367167 0.        ]
 [0.         0.04085621 0.04401192 0.2071238  0.29784333 0.        ]
 [0.         0.67056354 0.91264677 0.19260197 0.97306402 0.        ]
 [0.         0.11156798 0.8769091  0.7776981  0.83806379 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


In [1]:
# 4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

import numpy as np

# np.arange excludes the stop value, so use 61 to include 60
array_with_step = np.arange(10, 61, 5)
print(array_with_step)


[10 15 20 25 30 35 40 45 50 55 60]


In [2]:
# 5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations
# (uppercase, lowercase, title case, etc.) to each element.

import numpy as np

# Create a NumPy array of strings
string_array = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations

# Uppercase
uppercase_array = np.char.upper(string_array)
print("Uppercase:", uppercase_array)

# Lowercase
lowercase_array = np.char.lower(string_array)
print("Lowercase:", lowercase_array)

# Titlecase
titlecase_array = np.char.title(string_array)
print("Titlecase:", titlecase_array)

# Swapcase
swapcase_array = np.char.swapcase(string_array)
print("Swapcase:", swapcase_array)


Uppercase: ['PYTHON' 'NUMPY' 'PANDAS']
Lowercase: ['python' 'numpy' 'pandas']
Titlecase: ['Python' 'Numpy' 'Pandas']
Swapcase: ['PYTHON' 'NUMPY' 'PANDAS']


In [3]:
# 6. Generate a NumPy array of words. Insert a space between each character of every word in the array

import numpy as np

words = np.array(['hello', 'world', 'numpy'])

# Insert a space between each character of every word
spaced_words = np.char.join(' ', words)

print(spaced_words)


['h e l l o' 'w o r l d' 'n u m p y']


In [5]:
# 7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

import numpy as np

# Create two 2D NumPy arrays
arr1 = np.array([[1, 2, 3],
                 [4, 5, 6]])
arr2 = np.array([[7, 8, 9],
                 [10, 11, 12]])

# Element-wise addition
addition_result = arr1 + arr2
print("Element-wise Addition:", addition_result)

# Element-wise subtraction
subtraction_result = arr1 - arr2
print("Element-wise Subtraction:", subtraction_result)

# Element-wise multiplication
multiplication_result = arr1 * arr2
print("Element-wise Multiplication:", multiplication_result)

# Element-wise division
division_result = arr1 / arr2
print("Element-wise Division:", division_result)


Element-wise Addition: [[ 8 10 12]
 [14 16 18]]
Element-wise Subtraction: [[-6 -6 -6]
 [-6 -6 -6]]
Element-wise Multiplication: [[ 7 16 27]
 [40 55 72]]
Element-wise Division: [[0.14285714 0.25       0.33333333]
 [0.4        0.45454545 0.5       ]]


In [6]:
# 8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements

import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.identity(5)

# Extract the diagonal elements
diagonal_elements = np.diag(identity_matrix)

print("Identity Matrix:", identity_matrix)
print("Diagonal Elements:", diagonal_elements)


Identity Matrix: [[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
Diagonal Elements: [1. 1. 1. 1. 1.]


In [22]:
# 9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

import numpy as np

# Generate a NumPy array of 100 random integers between 0 and 1000
# (randint excludes the upper bound, so 1001 is used to allow 1000)
random_integers = np.random.randint(0, 1001, size=100)

def is_prime(n):
  if n <= 1:
    return False
  if n <= 3:
    return True
  if n % 2 == 0 or n % 3 == 0:
    return False
  i = 5
  while i * i <= n:
    if n % i == 0 or n % (i + 2) == 0:
      return False
    i = i + 6
  return True

# Find and display all prime numbers in this array
prime_numbers = [num for num in random_integers if is_prime(num)]
print("Prime numbers in the array:", prime_numbers)




Prime numbers in the array: [83, 997, 941, 467, 241, 761, 809, 719, 941, 929, 13, 193, 863, 773, 439, 151, 193, 457, 491, 43, 283, 53]


In [12]:
# 10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly
# averages.

import numpy as np

# Create a NumPy array representing daily temperatures for a month (30 days)
daily_temperatures = np.random.randint(15, 40, size=30)

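# Calculate weekly averages in 7-day chunks
# (30 is not a multiple of 7, so the final chunk averages only the 2 leftover days)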
weekly_averages = []
for i in range(0, 30, 7):
  week_temperatures = daily_temperatures[i:i+7]
  weekly_average = np.mean(week_temperatures)
  weekly_averages.append(weekly_average)

print("Daily Temperatures:", daily_temperatures)
print("Weekly Averages:", weekly_averages)


Daily Temperatures: [35 37 19 35 24 17 28 35 34 28 17 37 38 20 37 26 30 36 25 39 25 32 29 39
 28 29 33 27 33 27]
Weekly Averages: [27.857142857142858, 29.857142857142858, 31.142857142857142, 31.0, 30.0]
