# Q1  Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

In [1]:
# Advantages of numpy in scientific computing and data analysis

#1 High-Performance Arrays: NumPy's N-dimensional arrays offer significant performance benefits compared to Python's built-in lists.
# These arrays are optimized for numerical operations, leading to substantial speedups, especially for large datasets.

#2 Broad Functionality: NumPy provides a comprehensive set of functions for performing various mathematical operations,
# including linear algebra, Fourier transforms, and random number generation. This rich functionality eliminates the need for manual
# implementation of common algorithms, saving time and effort.

#3 Integration with Other Libraries: NumPy seamlessly integrates with other scientific computing libraries like SciPy, Matplotlib,
# and Pandas. This interoperability enables powerful data analysis and visualization workflows.

#4 Memory Efficiency: NumPy's arrays are stored in contiguous memory blocks, promoting efficient memory access and reducing overhead.
# This memory efficiency is crucial for handling large datasets.

#5 Vectorization: NumPy supports vectorized operations, allowing you to perform operations on entire arrays without explicit loops.
# This vectorization often results in significant performance improvements and cleaner code.



# Enhances Python's Capabilities:

#1 Efficient Numerical Operations: NumPy's optimized array operations provide a substantial speedup compared to Python's built-in data structures.

#2 Mathematical Functions: NumPy's extensive library of mathematical functions simplifies complex calculations and data analysis tasks.

#3 Linear Algebra and Statistics: NumPy offers functions for linear algebra, statistical analysis, and other numerical computations,
# making it a versatile tool for scientific research.

#4 Data Manipulation: NumPy's arrays provide a flexible and efficient way to manipulate and transform data, including reshaping,
# slicing, and indexing.

#5 Integration with Other Libraries: NumPy's compatibility with other scientific computing libraries enables powerful data analysis
# and visualization workflows.
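
A minimal sketch of the vectorization point above, assuming a small illustrative array (the names `values`, `loop_result`, and `vectorized_result` are made up for this example). It computes the same sum of squares once with an explicit Python loop and once with a single NumPy expression.

```python
import numpy as np

values = np.arange(1, 6)  # array([1, 2, 3, 4, 5])

# Explicit Python loop: one interpreted iteration per element
loop_result = 0
for v in values:
    loop_result += int(v) ** 2

# Vectorized: squaring and summing both run inside NumPy's compiled code
vectorized_result = int(np.sum(values ** 2))

print(loop_result, vectorized_result)  # 55 55
```

Both approaches give the same answer; the vectorized form is shorter and, for large arrays, usually much faster.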


# Q2 Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

In [2]:
# np.mean()

# Purpose: Calculates the arithmetic mean of an array.
# Weights: Assumes all elements have equal weight.
# Usage: np.mean(array)

# np.average()

# Purpose: Calculates the weighted average of an array; with weights=None it returns the same value as np.mean().
# Weights: Allows you to specify a weight for each element.
# Usage: np.average(array, weights=weights)

# When to use np.mean():

# When all elements have equal weight.
# For simple arithmetic mean calculations.

# When to use np.average():

# When elements have different weights.
# For weighted average calculations.
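
The comparison above can be sketched with a small example. The `scores` and `weights` values are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
import numpy as np

scores = np.array([1.0, 2.0, 3.0, 4.0])
weights = np.array([4.0, 3.0, 2.0, 1.0])  # hypothetical weights

plain_mean = np.mean(scores)                    # (1 + 2 + 3 + 4) / 4
unweighted = np.average(scores)                 # no weights: same as np.mean
weighted = np.average(scores, weights=weights)  # sum(w * x) / sum(w) = 20 / 10

print(plain_mean, unweighted, weighted)  # 2.5 2.5 2.0
```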

# Q3 Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays

In [3]:
# Methods for Reversing NumPy Arrays:

# np.flip(array, axis=None):
# Reverses the order of elements along the specified axis.
# If axis is not specified, it reverses along all axes.

# array[::-1]:
# Reverses the order of elements along the first axis (rows in a 2D array).

# array[:, ::-1]:
# Reverses the order of elements along the second axis (columns in a 2D array).

# 1D example
import numpy as np
array1d = np.array([1, 2, 3, 4])
reversed_array1d = array1d[::-1]
print(reversed_array1d)

# 2D example
array2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Reversing along the first axis (rows)
reversed_rows = array2d[::-1]

print(reversed_rows) 

# Reversing along the second axis (columns)
reversed_cols = array2d[:, ::-1]

print(reversed_cols)  

# Reversing along both axes
reversed_both = np.flip(array2d)

print(reversed_both)  

[4 3 2 1]
[[4 5 6]
 [1 2 3]]
[[3 2 1]
 [6 5 4]]
[[6 5 4]
 [3 2 1]]
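
As a small sketch of how `np.flip` with an explicit `axis` relates to the slicing forms used above (the array name `grid` is an assumption for this example):

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

rows_reversed = np.flip(grid, axis=0)  # same result as grid[::-1]
cols_reversed = np.flip(grid, axis=1)  # same result as grid[:, ::-1]
both_reversed = np.flip(grid)          # same result as grid[::-1, ::-1]

print(np.array_equal(rows_reversed, grid[::-1]))        # True
print(np.array_equal(cols_reversed, grid[:, ::-1]))     # True
print(np.array_equal(both_reversed, grid[::-1, ::-1]))  # True
```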


# Q4 How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance

In [4]:
# Determining Data Types:

# NumPy provides several methods to ascertain the data type of elements within an array:
    
    
# dtype Attribute:

# Directly access the dtype attribute of the array:
array = np.array([1, 2, 3, 4])
data_type = array.dtype
data_type

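# type() Function:

# type() on a single element returns its Python class. Here array1 is a plain Python list,
# so the element is a built-in int; for a NumPy array, type(array[0]) returns a NumPy scalar type such as np.int32 or np.int64.
array1=[1,2,3]
element=array1[0]
data_type1=type(element)
data_type1


# np.info() Function:

# np.info() prints documentation for an object. For an ndarray it reports shape, strides, itemsize, and dtype;
# for the Python int below it prints the help text of int.

np.info(element)
np.info(array)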

# Importance of Data Types:

# Data types play a crucial role in NumPy arrays for memory management and performance:

#1 Memory Allocation:

#1) NumPy allocates memory for elements based on their data types. Smaller data types, like int8 or float32, require less memory per element than larger ones like int64 or float64.
#2) By using appropriate data types, you can optimize memory usage and avoid unnecessary overhead.


#2 Performance:

#1) NumPy's optimized operations are often tailored to specific data types. Using the correct data type can significantly improve the speed of calculations.
#2) For example, operations on float32 arrays move half as many bytes as the same operations on float64 arrays, which often makes memory-bound computations faster.


#3 Data Integrity:

# Choosing the right data type ensures that your data is stored and manipulated accurately. For instance, storing values outside the range of int8 (-128 to 127) causes overflow: depending on the operation and NumPy version, the values wrap around silently or an error is raised.

int([x]) -> integer
int(x, base=10) -> integer

Convert a number or string to an integer, or return 0 if no arguments
are given.  If x is a number, return x.__int__().  For floating point
numbers, this truncates towards zero.

If x is not a number or if base is given, then x must be a string,
bytes, or bytearray instance representing an integer literal in the
given base.  The literal can be preceded by '+' or '-' and be surrounded
by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
Base 0 means to interpret the base from the string as an integer literal.
>>> int('0b100', base=0)
4
class:  ndarray
shape:  (4,)
strides:  (4,)
itemsize:  4
aligned:  True
contiguous:  True
fortran:  True
data pointer: 0x24f27256970
byteorder:  little
byteswap:  False
type: int32
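
To make the memory point concrete, here is a short sketch assuming a 1000-element integer array (the names `big` and `small` are illustrative). It compares per-element and total sizes for two dtypes and queries the valid range of int8 with `np.iinfo`.

```python
import numpy as np

big = np.arange(1000, dtype=np.int64)
small = big.astype(np.int16)  # safe here: every value from 0 to 999 fits in int16

print(big.itemsize, big.nbytes)      # 8 8000
print(small.itemsize, small.nbytes)  # 2 2000

# np.iinfo reports the representable range of an integer dtype
int8_range = (int(np.iinfo(np.int8).min), int(np.iinfo(np.int8).max))
print(int8_range)  # (-128, 127)
```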


# Q5 Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

In [5]:

# ndarrays (n-dimensional arrays) are the fundamental data structure in NumPy. They provide a powerful and efficient way to represent
# and manipulate numerical data in Python. Unlike standard Python lists, ndarrays are optimized for numerical operations and offer
# several key advantages.



# Key Features of NumPy ndarrays:
# 1 Homogeneous Data Type: Unlike Python lists, which can contain elements of different data types, ndarrays must have all elements
# of the same data type. This ensures efficient memory usage and optimized numerical operations.

# 2 Fixed Size: The size of an ndarray is fixed once it's created. This allows for efficient memory allocation and indexing.
# While Python lists can dynamically change size, ndarrays are more suitable for large, fixed-size datasets.

# 3 Multi-Dimensional: ndarrays can represent data in multiple dimensions, from one-dimensional vectors to higher-dimensional
# matrices and tensors. This flexibility makes them suitable for a wide range of applications, including image processing, machine learning, and scientific computing.

# 4 Vectorized Operations: NumPy provides vectorized operations that perform operations on entire arrays element-wise, without
# the need for explicit loops. This significantly improves performance and readability.

# 5 Broadcasting: NumPy's broadcasting mechanism allows arrays of different shapes to be compatible for arithmetic operations.
# This simplifies code and avoids manual looping.

# 6 Memory Efficiency: ndarrays are stored in contiguous memory blocks, which enables efficient memory access and optimized
# calculations.

# 7 Integration with Other Libraries: NumPy seamlessly integrates with other scientific computing libraries like SciPy,
# Matplotlib, and Pandas, making it a versatile tool for data analysis and visualization.
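
A brief sketch contrasting a list with an ndarray on three of the features listed above: element-wise arithmetic, a single shared dtype, and multiple dimensions. The variable names are assumptions for illustration.

```python
import numpy as np

py_list = [1, 2, 3]
nd = np.array(py_list)

print(py_list * 2)  # [1, 2, 3, 1, 2, 3]  (list repetition)
print(nd * 2)       # [2 4 6]             (element-wise multiplication)

# Mixed int/float input is converted to one common dtype
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64

# The same data structure handles more than one dimension
matrix = np.arange(6).reshape(2, 3)
print(matrix.ndim, matrix.shape)  # 2 (2, 3)
```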

# Q6 Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations

In [6]:
# 1. Data Type Homogeneity and Memory Efficiency:

# Homogeneous Data Type: NumPy arrays require all elements to have the same data type, which allows for more efficient memory allocation and access. Python lists, on the other hand, can store elements of various data types, leading to overhead in memory management.
# Contiguous Memory Layout: NumPy arrays are stored in contiguous memory blocks, enabling faster access and arithmetic operations. A Python list stores a contiguous array of pointers, but the objects they point to can be scattered across memory, resulting in slower access times.

# 2. Vectorized Operations:
    
# Element-wise Operations: NumPy provides vectorized operations that perform operations on entire arrays at once, without the need for explicit loops. This eliminates the overhead of Python's interpreter and leverages the efficiency of compiled C code.
# Performance Boost: Vectorized operations can significantly improve performance, especially for large datasets. Python lists, relying on interpreted loops, are generally slower for numerical computations.
    
# 3. Optimized Algorithms:
    
# Optimized Libraries: NumPy leverages highly optimized algorithms and libraries written in C or Fortran for many numerical operations. These algorithms are often tailored for specific data types and hardware architectures, providing substantial performance gains.
# Numerical Precision: Some NumPy routines also improve numerical accuracy; for example, np.sum uses pairwise summation in many cases, which accumulates less rounding error than a naive running total.
    
# 4. Broadcasting:
    
# Automatic Shape Inference: NumPy's broadcasting mechanism allows arrays of different shapes to be compatible for arithmetic operations. This eliminates the need for manual reshaping or looping, simplifying code and improving performance.
# Efficient Operations: Broadcasting can enable efficient operations on arrays of different sizes, without requiring explicit element-wise loops.
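
A rough sketch of the memory-efficiency argument above, assuming CPython and 1000 integers. The list total is an estimate that adds the size of the list object to the size of each int object it references; exact byte counts vary by Python version and platform.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
nd = np.arange(n, dtype=np.int64)

# A list holds pointers; each int is a separate Python object with its own header
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An int64 array stores 8 bytes per element in one contiguous buffer
array_bytes = nd.nbytes

print(array_bytes)               # 8000
print(list_bytes > array_bytes)  # True on CPython
```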

# Q7  Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output

In [7]:
# Comparing vstack() and hstack() in NumPy
# vstack()

# Vertical stacking: Stacks arrays vertically, row-wise.
# Input: A sequence of arrays with compatible shapes (same number of columns).
# Output: A new array with the stacked arrays as rows.

# Example:
array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

stacked_array = np.vstack((array1, array2))
print(stacked_array)

# hstack()
# Horizontal stacking: Stacks arrays horizontally, column-wise.
# Input: A sequence of arrays with compatible shapes (same number of rows).
# Output: A new array with the stacked arrays as columns.

array1 = np.array([[1, 2], [3, 4]])
array2 = np.array([[5, 6], [7, 8]])

stacked_array = np.hstack((array1, array2))
print(stacked_array)

[[1 2]
 [3 4]
 [5 6]
 [7 8]]
[[1 2 5 6]
 [3 4 7 8]]
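
The examples above use 2D inputs. As a supplementary sketch (with assumed names `a` and `b`), this is how the two functions treat 1D inputs: `vstack` turns each array into a row, while `hstack` joins them end to end.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

v = np.vstack((a, b))  # each 1D array becomes one row
h = np.hstack((a, b))  # 1D arrays are concatenated end to end

print(v.shape)  # (2, 3)
print(h.shape)  # (6,)
print(h)        # [1 2 3 4 5 6]
```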


# Q8 Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.

In [8]:
# fliplr()
# Purpose: Flips the array left and right, reversing the order of elements along the second axis (axis=1, the columns).
# Effect: Reverses the order of columns in a 2D array; in higher-dimensional arrays it still reverses axis 1. The input must be at least 2D.
array = np.array([[1, 2, 3],
                 [4, 5, 6]])

flipped_array = np.fliplr(array)
print(flipped_array)
# fliplr() reverses the order along the second axis (columns).



# flipud()
# Purpose: Flips the array up and down, reversing the order of elements along the first axis (axis=0, the rows).
# Effect: Reverses the order of rows in a 2D array; for a 1D array it reverses the elements, and in higher-dimensional arrays it still reverses axis 0.
array = np.array([[1, 2, 3],
                 [4, 5, 6]])

flipped_array = np.flipud(array)
print(flipped_array) 
# flipud() reverses the order along the first axis (rows).

[[3 2 1]
 [6 5 4]]
[[4 5 6]
 [1 2 3]]
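
A short sketch of how the two functions map onto slicing, and how they behave on a 1D array. The names `m`, `v`, and `fliplr_1d_ok` are assumptions for this example.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.array_equal(np.fliplr(m), m[:, ::-1]))  # True: fliplr reverses axis 1
print(np.array_equal(np.flipud(m), m[::-1]))     # True: flipud reverses axis 0

v = np.array([1, 2, 3])
flipped_v = np.flipud(v)  # a 1D array has an axis 0, so flipud works
print(flipped_v)          # [3 2 1]

# fliplr needs an axis 1, so a 1D input is rejected
try:
    np.fliplr(v)
    fliplr_1d_ok = True
except ValueError:
    fliplr_1d_ok = False
print(fliplr_1d_ok)  # False
```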


# Q9  Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits

In [9]:
# Functionality:

# Splits an array: Divides an array into a specified number of sub-arrays.
# Handles uneven splits: If the array cannot be evenly divided, the first (length % sections) sub-arrays each receive one extra element, so sub-array sizes differ by at most one.
# Returns a list: Returns a list of sub-arrays


# Syntax
# np.array_split(ary, indices_or_sections, axis=0)


# ary: The array to be split.
# indices_or_sections: The number of sub-arrays to create, or a sequence of indices at which to split.
# axis: The axis along which to split the array (default is 0, which splits rows).

# Example: Even split

arr = np.array([1, 2, 3, 4, 5, 6])
split_arr = np.array_split(arr, 3)
print(split_arr)

# Example Uneven Split

arr = np.array([1, 2, 3, 4, 5, 6, 7])
split_arr = np.array_split(arr, 3)
print(split_arr)

# The array_split() method is flexible and can handle both even and uneven splits.

# The axis parameter allows you to specify the axis along which to split the array.
# The function returns a list of sub-arrays.
# If the split is uneven, the extra elements go to the first sub-arrays, one each (here the sizes are 3, 2, 2).



[array([1, 2]), array([3, 4]), array([5, 6])]
[array([1, 2, 3]), array([4, 5]), array([6, 7])]
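
A further sketch of the uneven case, assuming a 10-element array split into 4 parts. It also contrasts `np.array_split` with `np.split`, which rejects a split that does not divide evenly. The names `parts`, `sizes`, and `split_ok` are illustrative.

```python
import numpy as np

parts = np.array_split(np.arange(10), 4)
sizes = [len(p) for p in parts]
print(sizes)  # [3, 3, 2, 2]  (10 % 4 == 2, so the first two parts get one extra element)

# np.split requires an equal division and raises ValueError otherwise
try:
    np.split(np.arange(10), 4)
    split_ok = True
except ValueError:
    split_ok = False
print(split_ok)  # False
```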


# Q10 Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations

In [10]:
# Vectorization and Broadcasting in NumPy
# Vectorization and broadcasting are two fundamental concepts in NumPy that significantly enhance the efficiency of array operations. They allow for element-wise operations on entire arrays without the need for explicit loops, leading to substantial performance gains.

# Vectorization
# Definition: Vectorization is the process of performing operations on entire arrays element-wise, rather than using loops to iterate over each element individually.
# Benefits:
# Performance: Vectorized operations are typically much faster than their loop-based counterparts, as they leverage optimized algorithms implemented in compiled languages.
# Readability: Vectorized code is often more concise and easier to understand than equivalent loop-based code.
# Memory efficiency: Vectorized operations can sometimes be more memory-efficient than loops, especially for large arrays.


# x = np.array([1, 2, 3])
# y = np.array([4, 5, 6])

# Vectorized addition
# result = x + y
# print(result) 


# Broadcasting:
# Definition: Broadcasting is a mechanism in NumPy that allows arrays of different shapes to be compatible for arithmetic operations. It virtually stretches the smaller array to match the shape of the larger array, without copying data, before performing the operation.

# Rules:
# Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension.
# Two dimensions are compatible when they are equal or when one of them is 1; a dimension of size 1 is stretched to match the other.
# If the arrays have different numbers of dimensions, the shape of the smaller one is padded with 1s on the left.

import numpy as np
x = np.array([1, 2, 3])          # shape (3,)
y = np.array([[4], [5], [6]])    # shape (3, 1)

# Broadcasting: shapes (3,) and (3, 1) are broadcast to (3, 3)
result = x * y
print(result)

[[ 4  8 12]
 [ 5 10 15]
 [ 6 12 18]]
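
One more sketch of the shape-compatibility idea, using assumed arrays `row` and `col`. A (3,) array combines with a (2, 1) array to give a (2, 3) result, while shapes (3,) and (2,) are incompatible and raise an error.

```python
import numpy as np

row = np.array([1, 2, 3])     # shape (3,)
col = np.array([[10], [20]])  # shape (2, 1)

total = row + col             # shapes (3,) and (2, 1) broadcast to (2, 3)
print(total.shape)  # (2, 3)
print(total)
# [[11 12 13]
#  [21 22 23]]

# Trailing dimensions 3 and 2 are neither equal nor 1, so this fails
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```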


# Practical Questions

# Q1 Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns

In [11]:
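arr=np.random.randint(1,101,size=(3,3))  # the upper bound is exclusive, so 101 allows the value 100
print(arr)

# Interchange rows and columns (transpose)
print(arr.T)

[[63 82 46]
 [18 43 23]
 [27 61 51]]
[[63 18 27]
 [82 43 61]
 [46 23 51]]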

# Q2 Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array

In [12]:
arr1=np.array([1,2,3,4,5,6,7,8,9,10])
print(arr1.reshape(2,5))
print(arr1.reshape(5,2))

[[ 1  2  3  4  5]
 [ 6  7  8  9 10]]
[[ 1  2]
 [ 3  4]
 [ 5  6]
 [ 7  8]
 [ 9 10]]

# Q3 Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.

In [13]:
arr2=np.random.rand(4,4)
bordered_arr = np.pad(arr2, 1)
print(bordered_arr)

[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.29640922 0.96571799 0.38763007 0.22798166 0.        ]
 [0.         0.62768007 0.85260601 0.87808434 0.29265071 0.        ]
 [0.         0.4136439  0.33377584 0.87582461 0.24096133 0.        ]
 [0.         0.37685061 0.54496972 0.92112346 0.27015829 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


# Q4 Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [14]:
arr3=np.arange(10,61,5)  # the stop value is exclusive, so 61 is needed to include 60

In [15]:
arr3

array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60])

# Q5  Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.

In [16]:
arr4=np.array(['python','numpy','pandas'])
arr4

array(['python', 'numpy', 'pandas'], dtype='<U6')

In [17]:
np.char.upper(arr4)

array(['PYTHON', 'NUMPY', 'PANDAS'], dtype='<U6')

In [18]:
np.char.lower(arr4)

array(['python', 'numpy', 'pandas'], dtype='<U6')

In [19]:
np.char.capitalize(arr4)

array(['Python', 'Numpy', 'Pandas'], dtype='<U6')
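
Two further case transformations, sketched under the assumption that `np.char.title` and `np.char.swapcase` are acceptable additions to the ones shown above. The `phrase` array is a made-up input that shows where `title` and `capitalize` differ.

```python
import numpy as np

names = np.array(['python', 'numpy', 'pandas'])

titled = np.char.title(names)      # first letter of every word upper-cased
swapped = np.char.swapcase(titled) # upper and lower case exchanged

print(titled)   # ['Python' 'Numpy' 'Pandas']
print(swapped)  # ['pYTHON' 'nUMPY' 'pANDAS']

# With multi-word strings, capitalize and title give different results
phrase = np.array(['data science'])
print(np.char.capitalize(phrase))  # ['Data science']
print(np.char.title(phrase))       # ['Data Science']
```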

# Q6 Generate a NumPy array of words. Insert a space between each character of every word in the array

In [20]:
words = np.array(['python', 'numpy', 'pandas'])
spaced_words = np.char.join(' ', words)
print(spaced_words)

['p y t h o n' 'n u m p y' 'p a n d a s']


# Q7 Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

In [21]:
arr6=np.array([[1,2,3,4],[5,6,7,8]])
arr7=np.array([[1,2,3,4],[5,6,7,8]])

In [22]:
arr6+arr7

array([[ 2,  4,  6,  8],
       [10, 12, 14, 16]])

In [23]:
arr6-arr7

array([[0, 0, 0, 0],
       [0, 0, 0, 0]])

In [24]:
arr6*arr7

array([[ 1,  4,  9, 16],
       [25, 36, 49, 64]])

In [25]:
arr6/arr7

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])

# Q8  Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

In [26]:
identity_matrix = np.eye(5)
identity_matrix
diagonal_elements = identity_matrix.diagonal()
diagonal_elements

array([1., 1., 1., 1., 1.])

# Q9 Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

In [27]:
# Function to check prime
def is_prime(num):
    if num <= 1:
        return False
    if num <= 3:
        return True
    if num % 2 == 0 or num % 3 == 0:
        return False
    i = 5
    while i * i <= num:
        if num % i == 0 or num % (i + 2) == 0:
            return False
        i += 6
    return True

# Generate 100 random integers between 0 and 1000 (the upper bound of randint is exclusive)
random_numbers = np.random.randint(0, 1001, size=100)

# Keep only the elements of the array that are prime
prime_numbers = [int(n) for n in random_numbers if is_prime(n)]

# Displaying the prime numbers
print("Prime numbers in the array:", prime_numbers)
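
As an alternative to testing each number separately, the primes can be found with a boolean lookup table built by the sieve of Eratosthenes. This is a sketch of a different technique, not the approach used above; the helper name `primes_mask_up_to` and the fixed `sample` values are assumptions chosen so the result can be checked by hand.

```python
import numpy as np

def primes_mask_up_to(limit):
    """Return a boolean array where mask[n] is True when n is prime."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False  # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if mask[p]:
            mask[p * p :: p] = False  # cross out multiples of p
    return mask

sample = np.array([0, 1, 2, 15, 17, 91, 97, 1000])  # fixed values for illustration
is_prime_lookup = primes_mask_up_to(1000)

# Index the lookup table with the array itself to get a boolean mask
primes_in_sample = sample[is_prime_lookup[sample]]
print(primes_in_sample)  # [ 2 17 97]
```

The sieve is built once for the whole value range, so filtering any array of integers in that range becomes a single indexing operation.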


# Q10 Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages

In [28]:
# Create a NumPy array representing daily temperatures for a month (assuming 30 days)
daily_temperatures = np.random.randint(60, 90, 30)

# Calculate weekly averages: the first 28 days form four full 7-day weeks
weekly_averages = np.mean(daily_temperatures[:28].reshape(4, 7), axis=1)

# The remaining 2 days form a partial fifth week
partial_week_average = np.mean(daily_temperatures[28:])

# Display the weekly averages
print("Weekly averages:", weekly_averages)
print("Partial week average (days 29-30):", partial_week_average)
