<a href="https://colab.research.google.com/github/srujany/python-basics/blob/main/numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

1. Explain the purpose and advantages of NumPy in scientific computing and data analysis. How does it enhance Python's capabilities for numerical operations?


Purpose of NumPy
Array Objects: NumPy provides a powerful N-dimensional array object (ndarray) that allows for efficient storage and manipulation of large datasets.
Numerical Computation: It offers a wide range of mathematical functions for performing operations on arrays, including basic arithmetic, linear algebra, statistical analysis, and more.
Integration with Other Libraries: NumPy serves as the foundation for many other scientific libraries, such as SciPy, pandas, and Matplotlib, making it a central part of the scientific Python ecosystem.
Advantages of NumPy
Performance: NumPy is implemented in C and optimized for performance, enabling fast operations on large datasets compared to Python's built-in lists.
Memory Efficiency: The ndarray uses contiguous blocks of memory, which reduces overhead and improves memory management compared to standard Python lists.
Broadcasting: NumPy supports broadcasting, allowing arithmetic operations to be performed on arrays of different shapes without explicitly resizing them. This simplifies code and enhances performance.
Vectorization: It allows for vectorized operations, meaning that operations can be applied to entire arrays without the need for explicit loops. This not only speeds up calculations but also leads to cleaner and more readable code.
Rich Functionality: NumPy includes a wide array of mathematical functions, linear algebra routines, random number generation, and tools for Fourier transforms, among others.
Interoperability: NumPy arrays can be easily integrated with other libraries, making it easier to perform complex data analysis and scientific computations.
Community and Support: As one of the most widely used libraries in the Python ecosystem, NumPy has extensive documentation, a large community, and many resources available for learning and troubleshooting.
Enhancements to Python's Numerical Capabilities
Array Manipulation: NumPy provides a rich set of functions to manipulate arrays (e.g., reshaping, slicing, indexing) that are much more efficient than using native Python lists.
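Advanced Data Types: It supports fixed-width numeric types (e.g., int8, float32, complex64) and structured arrays. Python has built-in int, float, and complex types, but it has no native equivalent of fixed-width numeric types or structured arrays.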
Linear Algebra: Functions for matrix operations (e.g., dot product, eigenvalues) are built-in, making it easier to perform complex mathematical operations that are essential in scientific computing.
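
As a small illustration of the linear algebra routines mentioned above, the following sketch computes a matrix-vector product and the eigenvalues of a diagonal matrix. The matrix and vector values are arbitrary and were chosen only for this example.

```python
import numpy as np

# Example values chosen only for illustration
matrix = np.array([[2.0, 0.0],
                   [0.0, 3.0]])
vector = np.array([1.0, 1.0])

product = np.dot(matrix, vector)          # matrix-vector product
eigenvalues = np.linalg.eigvals(matrix)   # for a diagonal matrix these are the diagonal entries

print(product)
print(np.sort(eigenvalues))
```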

2. Compare and contrast np.mean() and np.average() functions in NumPy. When would you use one over the other?


Key Differences
Weights:

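np.mean() does not allow for weights; it always computes a simple arithmetic average.
np.average() accepts an optional weights argument, so each element can contribute in proportion to its importance. Without weights it gives the same result as np.mean().
When to Use Each:

Use np.mean() for a plain average, and np.average() when the values carry different weights (e.g., grades with different credit hours).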
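
A brief sketch of the difference, using made-up scores and weights (both arrays are hypothetical and serve only as an example):

```python
import numpy as np

scores = np.array([80, 90, 100])
weights = np.array([1, 1, 2])   # hypothetical weights: the last score counts double

simple_mean = np.mean(scores)                        # (80 + 90 + 100) / 3
weighted_mean = np.average(scores, weights=weights)  # (80 + 90 + 200) / 4
unweighted = np.average(scores)                      # no weights: same as np.mean

print(simple_mean, weighted_mean, unweighted)
```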

3. Describe the methods for reversing a NumPy array along different axes. Provide examples for 1D and 2D arrays.


In [2]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

reversed_arr = np.flip(arr)
print(reversed_arr)


arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# With no axis argument, np.flip reverses every axis (rows and columns)
reversed_all = np.flip(arr_2d)

print(reversed_all)



[6 5 4 3 2 1]
[[9 8 7]
 [6 5 4]
 [3 2 1]]
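
To reverse only one axis, np.flip accepts an axis argument, and negative-step slicing gives the same result. The sketch below reuses the 3x3 example array; the variable names are chosen just for this illustration.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

flipped_rows = np.flip(arr_2d, axis=0)   # reverse the order of the rows
flipped_cols = np.flip(arr_2d, axis=1)   # reverse the order of the columns
sliced_rows = arr_2d[::-1]               # slicing equivalent of axis=0
sliced_cols = arr_2d[:, ::-1]            # slicing equivalent of axis=1

print(flipped_rows)
print(flipped_cols)
```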


4. How can you determine the data type of elements in a NumPy array? Discuss the importance of data types in memory management and performance.


In [3]:
import numpy as np


arr = np.array([1, 2, 3, 4])
print(arr.dtype)


arr_float = np.array([1.0, 2.0, 3.0])
print(arr_float.dtype)


int64
float64


Importance of Data Types
Memory Management:

Size: Different data types consume different amounts of memory. For instance, an int32 type takes 4 bytes, while int64 takes 8 bytes. Choosing the right data type can significantly affect the memory footprint of your application, especially when dealing with large datasets.
Optimization: By selecting appropriate data types, you can minimize memory usage. For example, if you know that your data will always fit within the range of int8, using this type instead of int64 can save considerable memory.
Performance:

Speed of Computations: Operations on smaller data types can be faster due to lower memory bandwidth usage and better cache utilization. For example, computations using float32 may be faster than using float64 in certain scenarios.
Vectorized Operations: NumPy is optimized for operations on arrays of the same data type. Using homogeneous data types allows NumPy to leverage efficient underlying libraries, resulting in faster computations.
Type-Safety:

Ensuring that the data type matches the intended use (e.g., integers for counting, floats for measurements) helps prevent errors and improves code readability.
Interoperability:

When interfacing with other libraries (e.g., SciPy, pandas, or C libraries), having the correct data type ensures compatibility and proper functioning of your code.
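
The memory effect of the data type can be checked directly with the itemsize and nbytes attributes. This is a minimal sketch with an arbitrary 1000-element array; it also shows that casting to a type that is too small wraps values silently, so the range should be checked first.

```python
import numpy as np

values = np.arange(1000)                   # default integer dtype is platform dependent
as_int64 = values.astype(np.int64)
as_int16 = values.astype(np.int16)         # safe here: 0..999 fits in int16

print(as_int64.itemsize, as_int64.nbytes)  # 8 bytes per element, 8000 bytes total
print(as_int16.itemsize, as_int16.nbytes)  # 2 bytes per element, 2000 bytes total

# Values outside the target range wrap around silently when cast with astype
wrapped = np.array([300]).astype(np.int8)
print(wrapped)
```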

5. Define ndarrays in NumPy and explain their key features. How do they differ from standard Python lists?


Key Features of ndarrays
Homogeneous Data:

All elements in an ndarray are of the same data type, which can be specified at the time of creation. This homogeneity allows for efficient memory storage and faster computation.
N-Dimensional:

ndarrays can have any number of dimensions (1D, 2D, 3D, etc.). The shape of the array is defined by its dimensions, allowing for complex data structures like matrices, tensors, and higher-dimensional arrays.
Efficient Memory Management:

NumPy arrays use contiguous blocks of memory, which reduces overhead and improves performance, especially for large datasets. This contrasts with standard Python lists, which can have elements stored in non-contiguous memory locations.
Broadcasting:

ndarrays support broadcasting, a powerful mechanism that allows operations on arrays of different shapes without needing explicit replication of data. This leads to more concise and readable code.
Vectorized Operations:

NumPy allows for vectorized operations, meaning you can apply mathematical operations to entire arrays without writing explicit loops. This significantly speeds up calculations and leads to cleaner code.
Rich Functionality:

NumPy provides a wide array of mathematical functions, linear algebra operations, and statistical tools that are optimized for ndarrays, enhancing their usability for scientific computing.
Interoperability:

ndarrays can easily interface with other libraries, such as SciPy and pandas, making them integral to the scientific Python ecosystem.
Differences from Standard Python Lists
Data Type Homogeneity:

ndarrays: All elements must be of the same type.
Python Lists: Can contain elements of different types (e.g., integers, floats, strings).
Performance:

ndarrays: More efficient for numerical computations due to their optimized performance and memory usage.
Python Lists: Slower for numerical operations, especially when handling large datasets.
Dimensionality:

ndarrays: Can easily represent multi-dimensional data (2D matrices, 3D tensors, etc.).
Python Lists: Can also represent multi-dimensional data, but it's less straightforward and often involves nesting lists, which can lead to inefficiencies and more complex code.
Functionality:

ndarrays: Equipped with a comprehensive set of mathematical functions and methods specifically designed for numerical operations.
Python Lists: Limited in terms of built-in mathematical functions; operations usually require iteration or list comprehensions.
Memory Layout:

ndarrays: Store data in a contiguous block of memory, which is better for performance.
Python Lists: Store references to objects, which can lead to higher memory overhead and fragmentation.
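
A few of these differences are easy to see in a short sketch. The values below are arbitrary examples.

```python
import numpy as np

py_list = [1, 2, 3]
np_arr = np.array([1, 2, 3])

list_times_two = py_list * 2      # repeats the list: [1, 2, 3, 1, 2, 3]
arr_times_two = np_arr * 2        # multiplies each element: [2 4 6]

mixed = np.array([1, 2.5, 3])     # mixed ints and floats are upcast to one dtype

print(list_times_two)
print(arr_times_two)
print(mixed.dtype)
```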

6. Analyze the performance benefits of NumPy arrays over Python lists for large-scale numerical operations.


1. Memory Efficiency
Contiguous Memory Allocation: NumPy arrays are stored in contiguous blocks of memory, which reduces overhead and fragmentation. This layout allows for better cache utilization, leading to faster access times compared to Python lists, which store references to objects scattered in memory.
Homogeneous Data Types: NumPy arrays require all elements to be of the same type, which enables more efficient memory usage. Python lists can contain mixed types, resulting in additional overhead for type management.
2. Performance in Numerical Computations
Vectorization: NumPy allows for vectorized operations, meaning that operations can be applied to entire arrays at once without explicit loops. This leverages underlying optimized C and Fortran libraries, resulting in significantly faster computations. For example, adding two arrays element-wise is much faster with NumPy than with a Python loop.

In [None]:
import numpy as np
import time

# Using NumPy
arr1 = np.random.rand(10**6)
arr2 = np.random.rand(10**6)

# perf_counter() is a monotonic, high-resolution clock suited to timing short operations
start_time = time.perf_counter()
result_np = arr1 + arr2
print("NumPy time:", time.perf_counter() - start_time)

# Using Python lists
list1 = arr1.tolist()
list2 = arr2.tolist()

start_time = time.perf_counter()
result_list = [x + y for x, y in zip(list1, list2)]
print("Python list time:", time.perf_counter() - start_time)


7. Compare vstack() and hstack() functions in NumPy. Provide examples demonstrating their usage and output.


vstack()
Purpose: Stacks arrays vertically (row-wise). This means it adds arrays as new rows.
Input: Can take a tuple or a list of arrays that have the same shape along all but the first axis.

In [None]:
import numpy as np

# Creating two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Stacking vertically
result_vstack = np.vstack((arr1, arr2))

print("Result of vstack:")
print(result_vstack)


hstack()
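Purpose: Stacks arrays horizontally (column-wise). For 2D arrays this places them side by side as additional columns; 1D arrays are concatenated end to end into a longer 1D array.
Input: Can also take a tuple or a list of arrays that have the same shape along all but the second axis (1D arrays may have any length).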

In [None]:
import numpy as np

# Creating two 1D arrays
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

# Stacking horizontally
result_hstack = np.hstack((arr1, arr2))

print("Result of hstack:")
print(result_hstack)
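
The contrast is clearer with 2D inputs. The following sketch uses two small example matrices (arbitrary values) and also repeats the 1D case to compare the resulting shapes.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

v = np.vstack((a, b))   # shape (4, 2): b's rows are appended below a's
h = np.hstack((a, b))   # shape (2, 4): b's columns are appended to the right of a's

# With 1D inputs, vstack builds a 2D array while hstack returns a longer 1D array
v1 = np.vstack((np.array([1, 2, 3]), np.array([4, 5, 6])))
h1 = np.hstack((np.array([1, 2, 3]), np.array([4, 5, 6])))

print(v.shape, h.shape, v1.shape, h1.shape)
```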


8. Explain the differences between fliplr() and flipud() methods in NumPy, including their effects on various array dimensions.



Differences
Axis of Flipping:

flipud(): Reverses the order of elements along axis 0, so rows are flipped up/down. Equivalent to m[::-1, ...].
fliplr(): Reverses the order of elements along axis 1, so columns are flipped left/right. Equivalent to m[:, ::-1].
Dimensional Effects:

Both functions return a view of the input and work on any array that has the axis being flipped:
For 1D Arrays: flipud() reverses the array, while fliplr() raises a ValueError because it requires at least two dimensions.
For 2D Arrays: flipud() reverses the row order and fliplr() reverses the column order.
For 3D Arrays: flipud() flips along the first axis (axis 0) and fliplr() flips along the second axis (axis 1); the last axis is left unchanged.
For N-D Arrays: Only axis 0 (flipud) or axis 1 (fliplr) is reversed; all other axes are unchanged.

In [None]:
import numpy as np

arr_3d = np.array([[[ 1,  2,  3],
                    [ 4,  5,  6]],

                   [[ 7,  8,  9],
                    [10, 11, 12]]])

flipped_ud_3d = np.flipud(arr_3d)
flipped_lr_3d = np.fliplr(arr_3d)

print("Original 3D array:")
print(arr_3d)

print("Flipped upside down (flipud) on 3D:")
print(flipped_ud_3d)

print("Flipped left to right (fliplr) on 3D:")
print(flipped_lr_3d)
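
For comparison, this sketch applies both functions to a 2D and a 1D array. The arrays are arbitrary examples, and the try/except is only there to show the error that fliplr raises for 1D input.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

ud = np.flipud(arr_2d)        # rows reversed
lr = np.fliplr(arr_2d)        # columns reversed

arr_1d = np.array([1, 2, 3])
ud_1d = np.flipud(arr_1d)     # a 1D array is simply reversed

try:
    np.fliplr(arr_1d)         # fliplr needs at least two dimensions
    fliplr_1d_failed = False
except ValueError:
    fliplr_1d_failed = True

print(ud)
print(lr)
print(ud_1d, fliplr_1d_failed)
```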


9. Discuss the functionality of the array_split() method in NumPy. How does it handle uneven splits?


Functionality of array_split()
Basic Usage: The array_split() function allows you to specify the number of splits you want to create from the original array.
Syntax: numpy.array_split(ary, indices_or_sections, axis=0)
ary: The input array to be split.
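indices_or_sections: This can be an integer N, in which case the array is split into N parts of as equal size as possible, or a 1-D sequence of sorted indices at which to split the array. Unlike numpy.split(), an integer that does not divide the axis length evenly is allowed: for an axis of length l, the first l % N sub-arrays have size l // N + 1 and the rest have size l // N.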
axis: The axis along which to split the array. The default is 0 (vertical split).

In [None]:
import numpy as np

# Create a 1D array
arr = np.array([1, 2, 3, 4, 5, 6])

# Split the array into 3 equal parts
result = np.array_split(arr, 3)

print("Split result (3 equal parts):")
for sub_array in result:
    print(sub_array)
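
The example above divides evenly. A sketch of the uneven case, using an arbitrary 10-element array, shows how the extra element is assigned and how np.split behaves differently:

```python
import numpy as np

arr = np.arange(10)                      # 10 elements do not divide evenly into 3 parts

parts = np.array_split(arr, 3)           # the extra element goes to the first part
sizes = [len(p) for p in parts]

try:
    np.split(arr, 3)                     # np.split requires an equal division
    split_failed = False
except ValueError:
    split_failed = True

by_index = np.array_split(arr, [2, 7])   # split points: arr[:2], arr[2:7], arr[7:]

print(sizes, split_failed)
print([p.tolist() for p in by_index])
```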


10. Explain the concepts of vectorization and broadcasting in NumPy. How do they contribute to efficient array operations?

Vectorization
Definition: Vectorization refers to the ability to perform operations on entire arrays (or large chunks of data) at once, rather than using explicit loops. This is made possible by NumPy's underlying implementation in C and Fortran, which allows for optimized computations on arrays.

Benefits:

Speed: Vectorized operations are much faster than iterating through elements with Python loops. This is because NumPy applies optimizations at a low level and can take advantage of CPU capabilities (like SIMD - Single Instruction, Multiple Data).
Conciseness: Vectorized code is generally more concise and readable than equivalent loop-based code. This improves code maintainability.
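
A minimal sketch of the same computation written with an explicit loop and in vectorized form (the input values are arbitrary):

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0, 4.0])

# Loop-based version: one Python-level operation per element
squared_loop = np.empty_like(values)
for i in range(len(values)):
    squared_loop[i] = values[i] ** 2 + 1

# Vectorized version: the same arithmetic expressed on the whole array
squared_vec = values ** 2 + 1

print(squared_loop)
print(squared_vec)
```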

In [None]:
import numpy as np

# Creating a 2D array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Creating a 1D array
arr_1d = np.array([10, 20, 30])

# Broadcasting: adding a 1D array to a 2D array
result = arr_2d + arr_1d

print("Result of broadcasting:")
print(result)
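
Broadcasting compares array shapes from the trailing dimension backwards; two dimensions are compatible when they are equal or when one of them is 1. The sketch below, with arbitrary example arrays, shows a column and a row being stretched to a common shape, and a shape pair that cannot be broadcast.

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,), treated as (1, 3)

grid = column + row                    # both are stretched to shape (3, 3)

try:
    np.ones((2, 3)) + np.ones(2)       # trailing dimensions 3 and 2 are incompatible
    mismatch_failed = False
except ValueError:
    mismatch_failed = True

print(grid)
print(mismatch_failed)
```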


1. Create a 3x3 NumPy array with random integers between 1 and 100. Then, interchange its rows and columns.


In [4]:
import numpy as np

arr = np.random.randint(1,101, size = (3,3))

print("original array", arr)
arr_t = arr.T

print("transposed array", arr_t)



original array [[35 87 58]
 [18 64 54]
 [28 48 12]]
transposed array [[35 18 28]
 [87 64 48]
 [58 54 12]]


2. Generate a 1D NumPy array with 10 elements. Reshape it into a 2x5 array, then into a 5x2 array.



In [8]:
import numpy as np

arr = np.arange(10)
print("original array",arr)

arr_2 = np.reshape(arr, (2,5))
print("(2,5)",arr_2)

arr_3 = np.reshape(arr,(5,2))
print("(5,2)",arr_3)




original array [0 1 2 3 4 5 6 7 8 9]
(2,5) [[0 1 2 3 4]
 [5 6 7 8 9]]
(5,2) [[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


3. Create a 4x4 NumPy array with random float values. Add a border of zeros around it, resulting in a 6x6 array.


In [12]:
import numpy as np

arr = np.random.rand(4,4)


print(arr)

bordered_array = np.pad(arr, pad_width=1, mode='constant', constant_values=0)

print(bordered_array)

[[0.72125399 0.55662169 0.34265928 0.34439092]
 [0.45174409 0.24378996 0.92147242 0.32800344]
 [0.44815046 0.01724198 0.88129956 0.75232135]
 [0.84664298 0.3985263  0.50951072 0.31776242]]
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.72125399 0.55662169 0.34265928 0.34439092 0.        ]
 [0.         0.45174409 0.24378996 0.92147242 0.32800344 0.        ]
 [0.         0.44815046 0.01724198 0.88129956 0.75232135 0.        ]
 [0.         0.84664298 0.3985263  0.50951072 0.31776242 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


4. Using NumPy, create an array of integers from 10 to 60 with a step of 5.



In [14]:
import numpy as np
# The stop value of np.arange is exclusive, so use 61 to include 60
arr = np.arange(10, 61, 5)
print(arr)

[10 15 20 25 30 35 40 45 50 55 60]


5. Create a NumPy array of strings ['python', 'numpy', 'pandas']. Apply different case transformations (uppercase, lowercase, title case, etc.) to each element.


In [15]:
import numpy as np
arr = np.array(['python', 'numpy', 'pandas'])

upper = np.char.upper(arr)
lower = np.char.lower(arr)
title = np.char.title(arr)
capitalize_arr = np.char.capitalize(arr)

print("upper",upper)
print("lower",lower)
print("title",title)
print("capitalize",capitalize_arr)

upper ['PYTHON' 'NUMPY' 'PANDAS']
lower ['python' 'numpy' 'pandas']
title ['Python' 'Numpy' 'Pandas']
capitalize ['Python' 'Numpy' 'Pandas']


6. Generate a NumPy array of words. Insert a space between each character of every word in the array.



In [None]:
import numpy as np

# Create a NumPy array of words
words = np.array(['hello', 'world', 'numpy', 'python'])

# Insert a space between each character of every word
spaced_words = np.char.join(' ', words)

# Print the original and the modified arrays
print("Original array:")
print(words)

print("\nArray with spaces between characters:")
for word in spaced_words:
    print(word)


7. Create two 20-element NumPy arrays and perform element-wise addition, subtraction, multiplication, and division.


In [16]:
import numpy as np

arr1 = np.random.randint(1,20,size = 20)
arr2 = np.random.randint(1,20,size = 20)

add = np.add(arr1,arr2)
sub = np.subtract(arr1,arr2)
mul = np.multiply(arr1,arr2)
div = np.divide(arr1,arr2)

print("add",add)
print("sub",sub)
print("mul",mul)
print("div",div)

add [18 12 25 18 10 20 27 15 24 19 11 11 33 15 12 20 18 28 13 13]
sub [  6  10   1   4  -2  14   5   9  -6   9   7   5  -5  -1 -10 -10  -4   2
  -9 -11]
mul [ 72  11 156  77  24  51 176  36 135  70  18  24 266  56  11  75  77 195
  22  12]
div [ 2.         11.          1.08333333  1.57142857  0.66666667  5.66666667
  1.45454545  4.          0.6         2.8         4.5         2.66666667
  0.73684211  0.875       0.09090909  0.33333333  0.63636364  1.15384615
  0.18181818  0.08333333]
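
The same element-wise operations apply unchanged to 2D arrays, and the arithmetic operators are shorthand for the functions used above. A minimal sketch with two small example matrices (arbitrary values):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

total = a + b        # same as np.add(a, b)
difference = a - b   # same as np.subtract(a, b)
product = a * b      # element-wise, not matrix multiplication
quotient = a / b     # true division returns floats

print(total)
print(difference)
print(product)
print(quotient)
```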


8. Use NumPy to create a 5x5 identity matrix, then extract its diagonal elements.


In [None]:
import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.eye(5)

# Extract the diagonal elements
diagonal_elements = np.diag(identity_matrix)

print("5x5 Identity Matrix:")
print(identity_matrix)
print("\nDiagonal Elements:")
print(diagonal_elements)

9. Generate a NumPy array of 100 random integers between 0 and 1000. Find and display all prime numbers in this array.


In [17]:


import numpy as np

def is_prime(n):
  if n <= 1:
    return False
  for i in range(2, int(n**0.5) + 1):
    if n % i == 0:
      return False
  return True


random_integers = np.random.randint(0, 1001, size=100)


prime_numbers = [num for num in random_integers if is_prime(num)]

print("Random Integers:", random_integers)
print("\nPrime Numbers:", prime_numbers)

Random Integers: [921 423 220 790 486 205 842 842 947 156 888 482 457 279 962 874 240 201
 512 379 775 653 350 674 999 603 178 709 524 542 392 623 504 273 346 586
 273 234 565 380 409 863 402 294 858 716 942 121 824 145 917 911 829 518
 494 873 984 211 169 283 600 124 584 807 171 630  13 527 641 219 299 155
 562 423 405 987 201 567 281 245 773 736 884 264 460 623 189 837 905 369
 507 270 389  50 890 212 375 192 778   4]

Prime Numbers: [947, 457, 379, 653, 709, 409, 863, 911, 829, 211, 283, 13, 641, 281, 773, 389]
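
The solution above tests each number by trial division. As an alternative sketch, a sieve of Eratosthenes can build a boolean mask once and then select primes with array indexing. The helper name primes_up_to is hypothetical, and the sample holds a few values copied from the output above.

```python
import numpy as np

def primes_up_to(limit):
    """Return a boolean mask where mask[n] is True when n is prime."""
    mask = np.ones(limit + 1, dtype=bool)
    mask[:2] = False                       # 0 and 1 are not prime
    for i in range(2, int(limit ** 0.5) + 1):
        if mask[i]:
            mask[i * i::i] = False         # cross out multiples of i
    return mask

is_prime_mask = primes_up_to(1000)
sample = np.array([921, 947, 13, 4, 169, 641])   # a few values from the output above
primes_in_sample = sample[is_prime_mask[sample]]

print(primes_in_sample)
```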


10. Create a NumPy array representing daily temperatures for a month. Calculate and display the weekly averages.

In [18]:
import numpy as np


daily_temperatures = np.random.randint(15, 35, size=30)


weekly_averages = []
for i in range(0, 30, 7):
    week_temps = daily_temperatures[i:min(i + 7, 30)]
    weekly_average = np.mean(week_temps)
    weekly_averages.append(weekly_average)


print("Daily Temperatures:", daily_temperatures)
print("\nWeekly Averages:")
for i, avg in enumerate(weekly_averages):
    print(f"Week {i+1}: {avg:.2f} degrees Celsius")

Daily Temperatures: [20 33 23 21 21 21 29 33 25 22 24 18 25 32 26 20 17 27 26 27 30 15 24 27
 26 31 34 26 34 34]

Weekly Averages:
Week 1: 24.00 degrees Celsius
Week 2: 25.57 degrees Celsius
Week 3: 24.71 degrees Celsius
Week 4: 26.14 degrees Celsius
Week 5: 34.00 degrees Celsius
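
For the four complete weeks, the loop can also be replaced by a reshape followed by a row-wise mean. This sketch copies the first 28 temperatures from the output above; the two remaining days would still need separate handling, as in the loop version.

```python
import numpy as np

# First 28 values copied from the daily temperatures printed above
first_four_weeks = np.array([20, 33, 23, 21, 21, 21, 29,
                             33, 25, 22, 24, 18, 25, 32,
                             26, 20, 17, 27, 26, 27, 30,
                             15, 24, 27, 26, 31, 34, 26])

# Each row is one week, so the mean along axis 1 is the weekly average
weekly = first_four_weeks.reshape(4, 7).mean(axis=1)

print(np.round(weekly, 2))
```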
