# Assignment: NumPy

# Theoretical Questions:

# 1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?

### -> NumPy (Numerical Python) is a powerful library in Python primarily used for numerical operations, scientific computing, and data analysis. Its core purpose is to enable efficient manipulation of large, multi-dimensional arrays and matrices, along with the mathematical functions needed to operate on these data structures.

##  Purpose of NumPy:
### N-Dimensional Array (ndarray): The primary data structure in NumPy is the ndarray, which allows for the creation and manipulation of multi-dimensional arrays with ease.
### Broadcasting: NumPy supports broadcasting, which simplifies arithmetic operations on arrays of different shapes.
### Vectorization: Operations on arrays in NumPy are highly optimized and executed in a vectorized manner, significantly speeding up mathematical computations by avoiding loops in Python.
### Linear Algebra Operations: NumPy provides extensive support for linear algebra functions, including matrix multiplication, eigenvalues, and vector norms.
### Random Sampling and Statistical Functions: It includes tools for generating random numbers and performing statistical operations on data.
### Interoperability with C/C++ and Fortran: NumPy allows integration with low-level languages like C, C++, and Fortran, improving performance when handling very large datasets.

## Advantages of NumPy
### Performance:

### Efficient memory use: NumPy arrays are more memory-efficient than traditional Python lists because they store data contiguously and use fixed data types.
### Speed: NumPy operations typically run much faster than equivalent pure-Python code because the element-wise work is done in compiled C. The loop still exists, but it runs inside NumPy rather than in the Python interpreter, which is where the performance gain comes from.
### Convenient Data Structures: The ndarray structure enables the efficient handling of multi-dimensional arrays and matrices, making it ideal for complex scientific computing tasks.
### Mathematical Functions: NumPy offers a comprehensive set of mathematical operations, including matrix manipulations, element-wise operations, and broadcasting. These operations are optimized to work seamlessly on entire arrays at once, which reduces the need for writing explicit loops.
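As a minimal sketch of the loop-free style described above, the snippet below squares ten integers once with a Python list comprehension and once with a single array expression. The variable names and sample data are illustrative assumptions, not part of the assignment.

```python
import numpy as np

# Hypothetical sample data: the integers 0..9
values = np.arange(10)

# Pure-Python approach: an explicit loop over a list
squares_loop = [v * v for v in values.tolist()]

# NumPy approach: one vectorized expression, evaluated in compiled code
squares_vec = values * values

print(squares_loop)
print(squares_vec)
```

Both approaches produce the same numbers; the vectorized form simply avoids per-element work in the interpreter.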


# 2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?

### Both np.mean() and np.average() in NumPy are used to compute the central tendency of a dataset, but they differ in weighting and functionality. Use np.mean() when every element should count equally; use np.average() when some elements should contribute more than others through a weights argument. Here is a detailed comparison of the two:

## 1. np.mean()
### Purpose:

### The np.mean() function computes the arithmetic mean (average) of the elements along a specified axis or of the entire array if no axis is provided.
### Usage:

### Syntax: np.mean(a, axis=None, dtype=None, out=None, keepdims=False)
### Example:

In [2]:
import numpy as np

In [4]:
a = np.array([1,2,3,4,5])
c = np.mean(a)

In [6]:
print(c)

3.0


## 2. np.average()
### Purpose:

### The np.average() function can also compute the arithmetic mean, but with the added capability to compute a weighted average if weights are provided.
### Usage:

### Syntax: np.average(a, axis=None, weights=None, returned=False)
### Example:

In [9]:
a = np.array([1,2,3,4,5])
c = np.average(a, weights = [1,2,3,4,5])

In [11]:
print(c)

3.6666666666666665
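To make the choice between the two functions concrete, here is a small sketch with made-up exam scores and credit weights (both are assumptions for illustration). Without weights the two functions agree; with weights only np.average() applies them, and `returned=True` also gives back the sum of the weights.

```python
import numpy as np

scores = np.array([70.0, 80.0, 90.0])   # hypothetical exam scores
credits = np.array([1, 1, 2])           # hypothetical weights

plain = np.mean(scores)                  # every score counts equally
unweighted = np.average(scores)          # same result when weights is omitted
weighted, weight_sum = np.average(scores, weights=credits, returned=True)

print(plain, unweighted)     # 80.0 80.0
print(weighted, weight_sum)  # 82.5 4.0
```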


# 3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.

## 1. Reversing a 1D Array
### For a 1D array, reversing is simple and can be done using slicing. You specify the step size of -1 in the slice to reverse the elements.
### Example:

In [14]:
arr1 = np.array([1,2,3,4,5])
arr1[::-1]

array([5, 4, 3, 2, 1])

## 2. Reversing a 2D Array
### In a 2D array, you can reverse along different axes:

### Reverse rows (axis 0)
### Reverse columns (axis 1)
### Reverse both axes (complete reversal)

In [22]:
arr2 = np.random.randint(2,6,(3,3))
arr2

array([[3, 5, 2],
       [3, 5, 3],
       [4, 4, 3]], dtype=int32)

In [25]:
# reverse rows
arr2[::-1,:]

array([[4, 4, 3],
       [3, 5, 3],
       [3, 5, 2]], dtype=int32)

In [26]:
# reverse columns
arr2[:,::-1]

array([[2, 5, 3],
       [3, 5, 3],
       [3, 4, 4]], dtype=int32)

In [27]:
# reverse both axes
arr2[::-1,::-1]

array([[3, 4, 4],
       [3, 5, 3],
       [2, 5, 3]], dtype=int32)
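Besides slicing, NumPy also provides np.flip(), which takes the axis to reverse as an argument. The sketch below uses a small fixed array (an assumption, chosen so the results are reproducible) and checks that each call matches the equivalent slice.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

rows_reversed = np.flip(grid, axis=0)   # same as grid[::-1, :]
cols_reversed = np.flip(grid, axis=1)   # same as grid[:, ::-1]
both_reversed = np.flip(grid)           # axis=None reverses every axis

print(rows_reversed)
print(cols_reversed)
print(both_reversed)
```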

# 4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.

## Determining the Data Type of Elements in a NumPy Array
### To determine the data type of elements in a NumPy array, you can use the dtype attribute, which provides information about the type of data stored in the array.

### Example:

In [28]:
arr = np.array([1,2,3,4])
arr.dtype

dtype('int64')

## Importance of Data Types in Memory Management and Performance
### Data types (or dtype in NumPy) play a crucial role in how NumPy arrays are stored in memory and how efficiently operations can be performed on them. Here's why they matter:

## 1. Memory Management
### Each data type in NumPy has a specific memory requirement. For instance:

### int8: 1 byte (8 bits) per element
### int32: 4 bytes (32 bits) per element
### float64: 8 bytes (64 bits) per element

## 2. Performance
### The data type affects how efficiently NumPy can perform operations. NumPy is optimized for performance by using fixed data types and contiguous memory allocation, which enables it to perform operations in a vectorized way and interact efficiently with lower-level languages like C.

### Faster Computations: Smaller data types (e.g., int8, float32) use fewer bytes per element, so operations on them move less data through memory and can be faster than the same operations on int64 or float64, which matters most for large datasets. The trade-off is a narrower value range (int8 holds only -128 to 127) or lower precision (float32 versus float64).
### Efficient Use of CPU Caches: Smaller data types allow more data to fit into the CPU cache, reducing memory access times and improving computational efficiency.
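The per-element sizes listed above can be checked directly with the itemsize and nbytes attributes. This is a minimal sketch; the array length of one million is an arbitrary assumption.

```python
import numpy as np

n = 1_000_000  # arbitrary length chosen for illustration

as_float64 = np.zeros(n, dtype=np.float64)
as_float32 = as_float64.astype(np.float32)   # astype returns a converted copy
as_int8 = np.zeros(n, dtype=np.int8)

# itemsize is bytes per element, nbytes is the size of the whole data buffer
print(as_float64.itemsize, as_float64.nbytes)  # 8 8000000
print(as_float32.itemsize, as_float32.nbytes)  # 4 4000000
print(as_int8.itemsize, as_int8.nbytes)        # 1 1000000
```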

# 5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?

### An ndarray (N-dimensional array) is the core data structure in NumPy. It is a powerful, flexible container for multi-dimensional homogeneous data, meaning all elements in the array are of the same data type. An ndarray can have any number of dimensions (1D, 2D, 3D, etc.), making it useful for a wide range of applications in scientific computing, data analysis, and machine learning.

## Key Features of NumPy ndarray
## Homogeneous Data Types:

### All elements in an ndarray have the same data type (e.g., integers, floats, booleans). This enables efficient memory use and fast computations.
### The data type of an ndarray is specified using the dtype attribute, and NumPy supports a variety of types such as int32, float64, bool, etc.
## N-Dimensional:

### ndarray can handle multi-dimensional data. You can create arrays of arbitrary dimensions, such as 1D (vector), 2D (matrix), or higher dimensions like 3D, 4D, etc.
### The shape attribute describes the dimensions of the array (e.g., for a 2D array with 3 rows and 4 columns, shape would be (3, 4)).
## Efficient Memory Management:

### ndarray stores data in contiguous memory blocks, which allows efficient access and manipulation of array elements. This makes NumPy arrays significantly faster than Python lists for numerical computations.
### The memory footprint of an ndarray is fixed and smaller compared to Python lists because the data is stored compactly without additional overhead.
## Broadcasting:

### NumPy arrays support broadcasting, which allows you to perform arithmetic operations on arrays of different shapes without explicitly reshaping them. This simplifies array operations, such as adding a scalar to a matrix or element-wise operations between arrays of different sizes.
## Mathematical Functions:

### NumPy provides a wide range of mathematical functions (e.g., trigonometric, exponential, statistical) that can be applied directly to ndarray objects.
## Shape Manipulation:

### ndarray allows you to reshape, transpose, and flatten arrays easily using reshape(), the T attribute, and ravel(). This flexibility makes it easier to handle multi-dimensional data.

## How ndarray Differs from Standard Python Lists

### NumPy ndarray is optimized for numerical computations with homogeneous, multi-dimensional data, offering efficient memory usage, fast performance, and support for vectorized operations.
### Python lists, on the other hand, are more flexible with heterogeneous data, but they are less efficient for large-scale numerical or scientific computing tasks.
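The difference shows up even in simple expressions. In the sketch below (sample values are assumptions), the same `*` operator repeats a list but multiplies an array element by element, and mixing an int with a float in an array upcasts everything to one common dtype.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

repeated = py_list * 2    # list repetition: [1, 2, 3, 1, 2, 3]
doubled = np_arr * 2      # element-wise arithmetic: [2 4 6]

mixed_list = [1, 2.5, "three"]   # a list can hold any mix of types
upcast = np.array([1, 2.5])      # an array stores one dtype; the int becomes a float

print(repeated)
print(doubled)
print(upcast.dtype)   # float64
```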

# 6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.

## Performance Benefits of NumPy Arrays Over Python Lists for Large-Scale Numerical Operations

### NumPy arrays (or ndarray) are optimized for high-performance numerical operations, whereas Python lists are more general-purpose and flexible but much slower for numerical tasks. Below are key reasons why NumPy arrays outperform Python lists when it comes to large-scale numerical operations:

## 1. Memory Efficiency :

### Fixed Data Type (Homogeneous): In NumPy arrays, all elements have the same data type, allowing for more compact storage in contiguous memory blocks. This reduces the memory overhead compared to Python lists, where each element is a Python object with additional metadata (e.g., type information and references).

### Contiguous Memory Layout: A newly created NumPy array stores its element values side by side in one memory block. A Python list stores only references contiguously; the objects they point to can live anywhere on the heap. The array's layout improves cache locality, so sequential access to its elements is faster.
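A rough way to see the memory overhead is to compare the array's data buffer with the list plus the integer objects it references. This is only a sketch: sys.getsizeof() values depend on the Python version and platform, so no exact figure for the list is claimed here.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# List cost: the list object (its array of references) plus each int object
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# Array cost: n elements of 8 bytes each in the data buffer
array_bytes = np_arr.nbytes

print(list_bytes, array_bytes)
```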

## 2. Basic Slicing Returns Views, Not Copies
### Memory Views: Basic slicing of a NumPy array (for example arr[1:4]) returns a view of the original data rather than a copy, so no new data buffer is allocated. Advanced indexing with integer or boolean arrays, by contrast, returns a copy. Slicing a Python list always creates a new list (a shallow copy of the references), which uses more memory and processing time for large lists.

### Efficient Element Access: Due to the contiguous memory layout and low-level optimizations, accessing elements in a NumPy array is faster than accessing elements in a Python list, especially in large datasets.
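The view-versus-copy behaviour of slicing can be demonstrated in a few lines. In this sketch (variable names are illustrative), writing through a basic slice changes the original array, while writing to an integer-array index result or to a list slice does not.

```python
import numpy as np

arr = np.arange(6)            # [0 1 2 3 4 5]

view = arr[1:4]               # basic slice: shares memory with arr
view[0] = 99                  # also changes arr[1]

fancy = arr[[0, 2]]           # integer-array indexing: an independent copy
fancy[0] = -1                 # arr is unaffected

lst = [0, 1, 2, 3, 4, 5]
lst_slice = lst[1:4]          # list slicing builds a new list
lst_slice[0] = 99             # lst is unaffected

print(arr)                            # [ 0 99  2  3  4  5]
print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, fancy))   # False
print(lst)                            # [0, 1, 2, 3, 4, 5]
```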

## 3. Support for Mathematical Functions
### Efficient Math Operations: NumPy comes with built-in mathematical functions (e.g., trigonometric, statistical, linear algebra) that are optimized for use with arrays. These functions are written in C and Fortran, making them far faster than similar operations performed using loops in Python lists.

### Bulk Operations: NumPy allows for bulk mathematical operations on entire arrays (e.g., sum, mean, standard deviation) in a single command, without needing to iterate over individual elements. Python lists require manual iteration, which slows down these operations significantly.

# 7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.

## In NumPy, the vstack() and hstack() functions are used to stack arrays along different axes. While both functions combine arrays, they do so in different orientations:

### 1. vstack() (Vertical Stack): Stacks arrays vertically (row-wise), increasing the number of rows.
### 2. hstack() (Horizontal Stack): Stacks arrays horizontally (column-wise), increasing the number of columns.

## 1. vstack() (Vertical Stack)
### Purpose: Stack arrays vertically (row-wise). 2D inputs must have the same number of columns (i.e., the same second dimension); 1D inputs of equal length are each treated as a single row.
### Example:

In [29]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

result_vstack = np.vstack((arr1, arr2))
print(result_vstack)

[[1 2 3]
 [4 5 6]]


## 2. hstack() (Horizontal Stack)

### Purpose: Stack arrays horizontally (column-wise). 2D inputs must have the same number of rows (i.e., the same first dimension); 1D inputs are simply concatenated end to end and may have different lengths.
### Example:

In [30]:
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

result_hstack = np.hstack((arr1, arr2))
print(result_hstack)

[1 2 3 4 5 6]
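The examples above use 1D inputs, where hstack() just concatenates. With 2D inputs the row-wise versus column-wise distinction is easier to see. The two small matrices below are assumptions chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

stacked_v = np.vstack((a, b))   # shape (4, 2): b's rows are placed below a's
stacked_h = np.hstack((a, b))   # shape (2, 4): b's columns are placed right of a's

print(stacked_v)
print(stacked_h)
```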


# 8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions. 

## Key Differences between fliplr() and flipud()

## 1. fliplr() (Flip Left-Right)
### Purpose: fliplr() reverses the elements of a 2D array horizontally (along the columns), flipping the array from left to right. This means that the first column becomes the last column, the second column becomes the second-to-last column, and so on.

### Effect: fliplr() requires an array with at least two dimensions and always reverses the second axis (axis 1), leaving the other axes unchanged. Passing a 1D array raises a ValueError.

## 2. flipud() (Flip Up-Down)
### Purpose: flipud() reverses the elements of a 2D array vertically (along the rows), flipping the array from top to bottom. This means the first row becomes the last row, the second row becomes the second-to-last row, and so on.

### Effect: flipud() works on arrays with one or more dimensions and always reverses the first axis (axis 0). In a 2D array it reverses the order of the rows, and in a 1D array it reverses the elements.
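A short sketch of both functions on a 2D and a 1D array follows. The sample arrays and the `fliplr_failed` flag are illustrative assumptions; the flag only records that fliplr() rejects 1D input.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
vector = np.array([1, 2, 3])

left_right = np.fliplr(matrix)   # reverses axis 1: [[3 2 1], [6 5 4]]
up_down = np.flipud(matrix)      # reverses axis 0: [[4 5 6], [1 2 3]]
vector_ud = np.flipud(vector)    # a 1D array has only axis 0: [3 2 1]

try:
    np.fliplr(vector)            # needs at least 2 dimensions
    fliplr_failed = False
except ValueError:
    fliplr_failed = True

print(left_right, up_down, vector_ud, fliplr_failed, sep="\n")
```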

# 9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?

## array_split() in NumPy
### The array_split() function in NumPy is used to split an array into multiple sub-arrays. Unlike its counterpart, split(), array_split() can handle uneven splits by dividing the array into sections of varying sizes if the array cannot be split evenly.

### Syntax : numpy.array_split(ary, indices_or_sections, axis=0)

## How array_split() Handles Uneven Splits:
### When the input array cannot be split into equal parts, array_split() ensures that all sub-arrays are as equal in size as possible. It distributes the remainder evenly among the earlier splits.

### If the number of elements in the array is not divisible by the number of sections:
### Some sub-arrays will have more elements than others.
### The function creates sub-arrays of nearly equal size, and the first few sub-arrays will have one extra element.
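The uneven-split rule can be seen with 10 elements divided into 3 sections: the first sub-array gets the one extra element. The sketch also shows, as a contrast, that split() raises a ValueError for the same request. The `split_failed` flag is an illustrative assumption.

```python
import numpy as np

data = np.arange(10)                 # 10 elements: 0..9
parts = np.array_split(data, 3)      # 10 is not divisible by 3
sizes = [len(p) for p in parts]      # [4, 3, 3]

try:
    np.split(data, 3)                # split() insists on equal sections
    split_failed = False
except ValueError:
    split_failed = True

for p in parts:
    print(p)
print(sizes, split_failed)
```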

# 10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

## Vectorization in NumPy

### Vectorization refers to the process of applying operations to entire arrays (or vectors) element-wise without the need for explicit loops.

### Example:

In [31]:
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

result = arr1 + arr2
print(result)

[ 6  8 10 12]


## Broadcasting in NumPy

### Broadcasting is a powerful feature in NumPy that allows operations on arrays of different shapes, treating smaller arrays as if they have the same shape as the larger one. Broadcasting enables arithmetic operations between arrays of unequal sizes, without making copies of the smaller arrays.

### Example:

In [32]:
arr1 = np.array([[1, 2, 3], 
                 [4, 5, 6], 
                 [7, 8, 9]])

arr2 = np.array([10, 20, 30])
result = arr1 + arr2
print(result)

[[11 22 33]
 [14 25 36]
 [17 28 39]]
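Broadcasting compares shapes from the trailing dimension backwards; two dimensions are compatible when they are equal or one of them is 1. The sketch below (sample arrays are assumptions) adds a row vector and a column vector to the same matrix, then shows a shape pair that cannot be broadcast.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)    # shape (2, 3): [[0 1 2], [3 4 5]]
row = np.array([10, 20, 30])           # shape (3,): stretched across both rows
col = np.array([[100], [200]])         # shape (2, 1): stretched across all columns

plus_row = matrix + row                # [[10 21 32], [13 24 35]]
plus_col = matrix + col                # [[100 101 102], [203 204 205]]

try:
    matrix + np.array([1, 2])          # shape (2,) does not match trailing size 3
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

print(plus_row)
print(plus_col)
print(broadcast_failed)
```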


# Practical Questions:

# 1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.

In [34]:
# creating a NumPy array with random integers between 1 and 100 (randint's upper bound is exclusive)
arr1 = np.random.randint(1,101,(3,3))
print(arr1)

[[99 12 70]
 [29 56  2]
 [93 86 52]]


In [35]:
# interchanging rows and columns (transpose)
result = arr1.T
print(result)

[[99 29 93]
 [12 56 86]
 [70  2 52]]


# 2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.

In [36]:
arr1 = np.random.randint(1,10,10)
print(arr1)

[9 5 5 8 5 6 5 9 1 3]


In [38]:
# reshaping it into 2x5
np.reshape(arr1,(2,5))

array([[9, 5, 5, 8, 5],
       [6, 5, 9, 1, 3]], dtype=int32)

In [39]:
# reshaping it into 5x2
np.reshape(arr1,(5,2))

array([[9, 5],
       [5, 8],
       [5, 6],
       [5, 9],
       [1, 3]], dtype=int32)

# 3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array

In [43]:
# creating 4x4 numpy array with random float values
arr1 = np.random.rand(4,4)
print(arr1)

[[0.44208242 0.82610602 0.20344164 0.31750372]
 [0.33981478 0.35024198 0.06845173 0.61908783]
 [0.21208082 0.61292038 0.60520384 0.16267945]
 [0.12264141 0.00838475 0.56165937 0.12898745]]


In [45]:
# adding a border of zeros around it.
bordered_arr = np.pad(arr1, pad_width=1, mode='constant', constant_values=0)
print(bordered_arr)

[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.44208242 0.82610602 0.20344164 0.31750372 0.        ]
 [0.         0.33981478 0.35024198 0.06845173 0.61908783 0.        ]
 [0.         0.21208082 0.61292038 0.60520384 0.16267945 0.        ]
 [0.         0.12264141 0.00838475 0.56165937 0.12898745 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


# 4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.

In [47]:
arr1 = np.arange(10,61,5)  # stop is exclusive, so use 61 to include 60
print(arr1)

[10 15 20 25 30 35 40 45 50 55 60]


# 5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.

In [48]:
# create numpy array of string
arr1 = np.array(["python","numpy","pandas"])
print(arr1)

['python' 'numpy' 'pandas']


In [50]:
# transforming strings in uppercase
result = np.char.upper(arr1)
print(result)

['PYTHON' 'NUMPY' 'PANDAS']


In [51]:
# transforming strings in lowercase
result = np.char.lower(arr1)
print(result)

['python' 'numpy' 'pandas']


In [52]:
# transforming strings in title case
result = np.char.title(arr1)
print(result)

['Python' 'Numpy' 'Pandas']


# 6. Generate a NumPy array of words. Insert a space between each character of every word in the array.

In [53]:
# create a word array
array_word = np.array(["Saif","Shubham","Ankit","Karan","Anurag"])

# creating a function to insert a space between the characters of one word
def insert_space(word):
    return ' '.join(word)

result = np.vectorize(insert_space)(array_word)
print(result)

['S a i f' 'S h u b h a m' 'A n k i t' 'K a r a n' 'A n u r a g']
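As an alternative sketch, np.char.join() can do the same thing without a helper function: it joins the characters of each string in the array with the given separator. The shortened word list is an assumption for brevity.

```python
import numpy as np

words = np.array(["Saif", "Karan"])   # shortened sample list

# Joins the characters of every element with a single space
spaced = np.char.join(" ", words)

print(spaced)
```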


# 7. Create two 2D NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.

In [56]:
# create 2D numpy array
arr1 = np.random.randint(1,10,(3,3))
arr2 = np.random.randint(1,10,(3,3))
print(arr1)
print(arr2)

[[2 9 3]
 [9 2 7]
 [3 6 2]]
[[9 9 4]
 [2 6 1]
 [3 2 5]]


In [57]:
#addition of arrays
result = arr1 + arr2
print(result)

[[11 18  7]
 [11  8  8]
 [ 6  8  7]]


In [58]:
#subtraction of arrays
result = arr1 - arr2
print(result)

[[-7  0 -1]
 [ 7 -4  6]
 [ 0  4 -3]]


In [59]:
#multiplication of arrays
result = arr1 * arr2
print(result)

[[18 81 12]
 [18 12  7]
 [ 9 12 10]]


In [60]:
#division of arrays
result = arr1 / arr2
print(result)

[[0.22222222 1.         0.75      ]
 [4.5        0.33333333 7.        ]
 [1.         3.         0.4       ]]


# 8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.

In [65]:
# create 5x5 identity matrix
identity_matrix = np.eye(5)
print("identity matrix:")
print(identity_matrix)

# extract diagonal elements
diagonal_element = np.diagonal(identity_matrix)
print("diagonal element:")
print(diagonal_element)

identity matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
diagonal element:
[1. 1. 1. 1. 1.]


# 9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.

In [67]:
# create a NumPy array of 100 random integers between 0 and 1000 (randint's upper bound is exclusive)
arr1 = np.random.randint(0,1001,size = 100)

# create a function to check prime number
def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

prime_numbers = np.array([num for num in arr1 if is_prime(num)])
print(prime_numbers)

[839  11 443 881 863 809 701 983 967 479 353 397 367 359 251 919 443  67]
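A different technique from the per-number check above is a Sieve of Eratosthenes built as a boolean mask, which can then be used to index the array directly. This is a sketch under assumptions: `primes_mask` is a hypothetical helper name, and the fixed sample array replaces the random data so the result is reproducible.

```python
import numpy as np

def primes_mask(limit):
    """Return a boolean array where mask[k] is True when k is prime (0 <= k <= limit)."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                       # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            mask[i * i :: i] = False       # cross out multiples of i
    return mask

sample = np.array([0, 1, 2, 9, 11, 25, 97, 999])   # fixed sample values
mask = primes_mask(1000)

# Look up each sample value in the mask, then keep the ones marked prime
primes_found = sample[mask[sample]]
print(primes_found)   # [ 2 11 97]
```

Building the mask once is useful when many values in the same range need to be tested.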


# 10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

In [84]:
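# create a NumPy array of daily temperatures for a 28-day month (4 full weeks)
arr1 = np.random.randint(1,35,size = 28)

# calculating weekly averages: each row of the reshaped array is one week
arr2 = arr1.reshape(4,7)
result = np.mean(arr2, axis = 1)  # axis=1 averages the 7 days within each row
print(result)  # four values, one per week; they vary between runs because the data is random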
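The role of the axis argument is easier to verify with fixed data. In this sketch the temperatures are simply 1 through 28 (an assumption, so the averages can be checked by hand): axis=1 averages within each week, while axis=0 averages each weekday across the four weeks.

```python
import numpy as np

temps = np.arange(1, 29, dtype=float)   # hypothetical readings: 1.0 .. 28.0
weeks = temps.reshape(4, 7)             # one row per week, one column per weekday

weekly_avg = weeks.mean(axis=1)         # 4 values, one per week
weekday_avg = weeks.mean(axis=0)        # 7 values, one per weekday

print(weekly_avg)    # [ 4. 11. 18. 25.]
print(weekday_avg)   # [11.5 12.5 13.5 14.5 15.5 16.5 17.5]
```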
