NumPy is a fundamental library for scientific computing in Python. Its primary purpose is to provide support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays.

Purpose of NumPy

Array Handling: NumPy introduces the ndarray object, a powerful n-dimensional array that is faster and more efficient than Python's built-in lists for numerical data.

Mathematical Functions: It offers a wide range of mathematical functions, including statistical operations, linear algebra routines, and random number generation.

Integration with Other Libraries: NumPy serves as the foundation for many other scientific libraries, such as SciPy, pandas, and Matplotlib, enabling seamless data manipulation and analysis.
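
As a minimal sketch of the points above, the following creates an ndarray, computes a statistic, and calls a linear algebra routine. The matrix values and variable names are made up for illustration.

```python
import numpy as np

# Illustrative data: a 2x2 coefficient matrix and a right-hand side
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

# ndarray attributes describe the shape and the element type
print(A.shape, A.dtype)

# Statistical operation over all elements
print(A.mean())

# Linear algebra routine: solve A @ x = b
x = np.linalg.solve(A, b)
print(x)
```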

Advantages of NumPy

Performance: NumPy's core array operations are implemented in C and are typically much faster than equivalent loops over Python lists, especially for large datasets. Operations on NumPy arrays are vectorized, meaning the loop over the elements runs in compiled code rather than in the Python interpreter, so no explicit Python loop is needed.

Memory Efficiency: NumPy arrays consume less memory than Python lists because they store elements of the same type in a contiguous block of memory. This compactness leads to improved performance for large datasets.

Ease of Use: NumPy's syntax and functions allow for concise and readable code. This simplicity makes it easier for researchers and data analysts to perform complex calculations without needing deep programming knowledge.

Broadcasting: NumPy supports broadcasting, a powerful feature that allows arithmetic operations between arrays of different shapes. This flexibility simplifies the coding of operations across arrays without needing to manually align dimensions.

Interoperability: NumPy arrays can be easily integrated with other libraries, such as SciPy for advanced mathematical functions and Matplotlib for data visualization. This interoperability extends Python's capabilities for numerical operations and data analysis.

Community and Ecosystem: As one of the most widely used libraries in the scientific Python ecosystem, NumPy has a large community of users and contributors. This support ensures continuous improvement, extensive documentation, and a wealth of resources for learning.



NumPy enhances Python's capabilities for numerical operations by providing high-performance array handling, a rich set of mathematical functions, and excellent integration with other scientific libraries. This makes it an essential tool for scientists, engineers, and data analysts working with large datasets and complex numerical computations.





np.mean() and np.average() are functions in NumPy used to compute the average of an array, but they have some key differences in terms of functionality and use cases. Here’s a comparison:

np.mean()

Functionality: Calculates the arithmetic mean (average) of the elements along a specified axis.

Usage: np.mean(a, axis=None, dtype=None, out=None, keepdims=False)

a: Input array.

axis: Axis along which the means are computed. If None, the mean is computed over the flattened array.

dtype: Data type to use for the calculation.
out: An alternative output array to place the result.

keepdims: If True, the axes which are reduced are left in the result as dimensions with size one.

np.average()

Functionality: Computes the weighted average of the elements. If weights are not specified, it behaves like np.mean().

Usage: np.average(a, axis=None, weights=None, returned=False)

a: Input array.

axis: Axis along which to compute the average.

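weights: An array of weights associated with the values in a. It must either have the same shape as a or, when axis is given, be 1-D with a length equal to the size of a along that axis. If provided, the average is computed as the sum of the products of weights and values, divided by the sum of the weights.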

returned: If True, returns a tuple of the average and the sum of weights.

Key Differences:

Weights:

np.mean() does not support weights; it always computes the unweighted mean.
np.average() supports weighted averages, allowing you to specify weights for the elements.

Use Cases:

Use np.mean() when you simply want the arithmetic mean of the data.
Use np.average() when you need to compute a weighted average, which is particularly useful when certain data points should contribute more significantly to the average.
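
A small sketch of the difference, using hypothetical scores and weights chosen only for illustration:

```python
import numpy as np

# Hypothetical exam scores and their weights (illustrative values)
scores = np.array([80.0, 90.0, 100.0])
weights = np.array([1.0, 1.0, 2.0])

plain_mean = np.mean(scores)                     # (80 + 90 + 100) / 3
unweighted = np.average(scores)                  # no weights: same as np.mean
weighted = np.average(scores, weights=weights)   # (80 + 90 + 200) / 4

# returned=True also gives back the sum of the weights
avg, weight_sum = np.average(scores, weights=weights, returned=True)

print(plain_mean, unweighted, weighted, weight_sum)
```

The last score counts twice as much as the others, so the weighted result is pulled toward it.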

Reversing a NumPy array can be done using slicing techniques, which are quite powerful in NumPy. Below are methods for reversing both 1D and 2D arrays along different axes.

Reversing a 1D Array


For a 1D array, you can reverse the array using slicing with a step of -1.

Example:

import numpy as np

# Create a 1D array
arr_1d = np.array([1, 2, 3, 4, 5])

# Reverse the array
reversed_arr_1d = arr_1d[::-1]

print("Original 1D Array:", arr_1d)

print("Reversed 1D Array:", reversed_arr_1d)

Reversing a 2D Array

For a 2D array, you can reverse along a specific axis by applying the -1 step to the slice for that axis: arr[::-1] reverses the rows (axis 0) and arr[:, ::-1] reverses the columns (axis 1).


Example:

# Create a 2D array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Reverse along rows (axis 0)
reversed_rows = arr_2d[::-1]

print("Original 2D Array:\n", arr_2d)

print("Reversed Rows (Axis 0):\n", reversed_rows)
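
The same slicing idea extends to the column axis. The sketch below reuses the 3x3 array from the example and also shows np.flip, which takes the axis as an explicit argument; the variable names are illustrative.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Reverse along columns (axis 1): keep all rows, step backwards through columns
reversed_cols = arr_2d[:, ::-1]

# Reverse along both axes at once
reversed_both = arr_2d[::-1, ::-1]

# np.flip expresses the same reversal with an explicit axis argument
flipped_cols = np.flip(arr_2d, axis=1)

print("Reversed Columns (Axis 1):\n", reversed_cols)
print("Reversed Both Axes:\n", reversed_both)
```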




In NumPy, we can determine the data type of elements in an array using the .dtype attribute. This attribute provides information about the data type of the elements contained in the array.

Importance of Data Types
Memory Management:

Efficiency: Different data types consume different amounts of memory. For example, an int32 uses 4 bytes per element, while an int64 uses 8 bytes. Choosing an appropriate data type can significantly reduce memory usage, especially for large datasets.

Contiguous Storage: NumPy stores arrays in contiguous memory blocks, making it more efficient to access and manipulate data. Consistent data types help maintain this contiguous structure.

Performance:

Speed of Operations: Operations on arrays of a smaller data type (e.g., float32 vs. float64) can be faster because less data has to move through memory and the CPU caches. The trade-off is reduced precision: float32 carries only about 7 significant decimal digits.

Vectorization: NumPy’s ability to perform operations on entire arrays at once (vectorization) is highly optimized based on data types. This leads to faster computation times compared to iterative processing in pure Python.

Compatibility with Functions:

Many NumPy functions and operations behave differently depending on the data type. For example, floor division (//) on integer arrays returns integers, while true division (/) returns floats, and fixed-width integer types can overflow silently. Understanding data types helps avoid unintended behavior in calculations.

Data Integrity:

Using the correct data type ensures that the data is represented accurately. For example, using an integer type for counting makes sense, while using a float may introduce rounding errors in certain cases.
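
The points above can be checked directly on small arrays. This is a minimal sketch with illustrative values; the dtypes are set explicitly where the default would depend on the platform.

```python
import numpy as np

# dtype is inferred from the data unless given explicitly
ints = np.array([1, 2, 3], dtype=np.int32)
floats = np.array([1.0, 2.0, 3.0])   # float64 by default

# itemsize is bytes per element, nbytes is the total for the element data
print(ints.dtype, ints.itemsize, ints.nbytes)
print(floats.dtype, floats.itemsize, floats.nbytes)

# astype returns a converted copy; float-to-int conversion truncates toward zero
truncated = np.array([1.9, -1.9]).astype(np.int64)
print(truncated)

# Floor division keeps the integer dtype, true division produces floats
print((ints // 2).dtype, (ints / 2).dtype)
```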

ndarray stands for "N-dimensional array" and is a core data structure in NumPy. It provides a powerful way to store and manipulate large datasets efficiently. Here are the key features of ndarrays and how they differ from standard Python lists:

Key Features of ndarrays

Homogeneous Data Types:

All elements in a NumPy array must be of the same data type (e.g., all integers, all floats). This uniformity allows for optimized memory usage and performance.

Multi-dimensional:

ndarrays can be one-dimensional, two-dimensional, or even higher-dimensional. This flexibility enables representation of complex data structures like matrices and tensors.

Fixed Size:

Once created, the size of a NumPy array cannot be changed (though you can create a new array). This fixed size helps in optimizing performance and memory allocation.

Vectorized Operations:

NumPy supports vectorized operations, allowing you to perform operations on entire arrays without the need for explicit loops. This leads to more concise and faster code.

Broadcasting:

NumPy can automatically expand the dimensions of arrays during arithmetic operations, allowing for operations on arrays of different shapes. This feature simplifies many calculations and reduces the need for manual adjustments.

Advanced Indexing and Slicing:

ndarrays offer sophisticated indexing and slicing capabilities, enabling easy manipulation and access to subarrays.

Memory Efficiency:

NumPy arrays are more memory-efficient than Python lists because they store elements in contiguous blocks of memory, leading to reduced overhead.

Built-in Mathematical Functions:

NumPy provides a comprehensive set of mathematical functions that can be applied directly to arrays, including linear algebra, statistical operations, and more.

Differences from Standard Python Lists

Data Types:

Python lists can hold elements of different data types (e.g., integers, floats, strings), whereas ndarrays are homogeneous.

Performance:

Operations on ndarrays are generally faster than on Python lists, especially for large datasets, due to optimization and implementation in C.

Memory Usage:

ndarrays use less memory compared to lists, which can lead to significant performance improvements for large arrays.

Functionality:

NumPy provides numerous functions specifically designed for numerical operations on arrays, while Python lists do not have built-in support for these mathematical operations.

Mutability:

While both structures are mutable, the way they are resized and modified differs. You can change elements in both, but resizing a NumPy array requires creating a new array, whereas Python lists can be dynamically resized.
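
A short sketch contrasting the two structures on the points listed above (type handling, arithmetic, and resizing). All values are illustrative.

```python
import numpy as np

# A list keeps each element's own type
mixed_list = [1, 2.5, 3]
print([type(v).__name__ for v in mixed_list])

# An ndarray converts everything to one common dtype (here float64)
arr = np.array(mixed_list)
print(arr.dtype)

# "*" repeats a list but multiplies an array element by element
print([1, 2, 3] * 2)
print(np.array([1, 2, 3]) * 2)

# Lists grow in place; np.append returns a new array and leaves the original as is
numbers = [1, 2, 3]
numbers.append(4)
base = np.array([1, 2, 3])
grown = np.append(base, 4)
print(numbers, base, grown)
```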

NumPy arrays offer significant performance benefits over Python lists, particularly for large-scale numerical operations. Here’s a detailed analysis of these benefits:


1. Memory Efficiency

Contiguous Memory Allocation: NumPy arrays are stored in contiguous blocks of memory, which reduces memory overhead and fragmentation. This contrasts with Python lists, which are essentially arrays of pointers to objects. The contiguous storage of NumPy arrays allows for more efficient memory access patterns.

Homogeneous Data Types: All elements in a NumPy array are of the same type, allowing for more efficient storage. Python lists can hold mixed data types, leading to increased memory overhead because each element needs a pointer to the object and type information.

2. Performance of Numerical Operations

Vectorization: NumPy arrays support vectorized operations, allowing operations to be applied to entire arrays without the need for explicit loops. This means that operations can be executed in compiled code, which is significantly faster than interpreted Python code.

3. Reduced Overhead

Less Overhead per Element: Each element in a NumPy array has less overhead than each element in a Python list. In a list, each item is a reference to an object, which introduces additional memory and time overhead. In contrast, NumPy arrays store raw data, leading to faster access times.

4. Better Cache Utilization
Cache-Friendly: Due to the contiguous storage and predictable access patterns, NumPy arrays are more cache-friendly than Python lists. When processing large datasets, this can lead to better performance because accessing contiguous memory is faster than jumping around to different memory locations.

5. Parallelization and Broadcasting

Broadcasting: NumPy’s broadcasting feature allows operations on arrays of different shapes without the need for explicitly reshaping them. This can lead to simpler and more efficient code when performing mathematical operations.

Parallelization: Many NumPy operations use the SIMD (Single Instruction, Multiple Data) capabilities of modern CPUs, and linear algebra routines can run multi-threaded when NumPy is linked against a multi-threaded BLAS library, further enhancing performance on large datasets.
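
A rough way to observe the memory and speed differences is sketched below. The array size and repetition count are arbitrary choices, the list size is only an approximation (small integers are shared objects in CPython), and the timings will vary from machine to machine.

```python
import sys
import timeit
import numpy as np

n = 100_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# Memory: the list stores n pointers plus separate int objects;
# the array stores n raw 8-byte integers
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)
print("list  :", list_bytes, "bytes (approximate)")
print("array :", np_array.nbytes, "bytes of element data")

# Speed: double every element, 20 repetitions each
loop_time = timeit.timeit(lambda: [v * 2 for v in py_list], number=20)
vec_time = timeit.timeit(lambda: np_array * 2, number=20)
print(f"list comprehension: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```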


In NumPy, vstack() and hstack() are functions used to stack arrays vertically and horizontally, respectively. Here's a detailed comparison of the two functions, along with examples to illustrate their usage.

vstack()

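Functionality: Stacks arrays in sequence vertically (row-wise). The input arrays are concatenated along the first axis (1-D arrays of length N are first reshaped to (1, N)), producing a new array with more rows. The arrays must have the same shape along all other axes.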

Usage: numpy.vstack(tup), where tup is a tuple of arrays to be stacked.

Example of vstack()


import numpy as np

# Create two 2D arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8, 9],
                   [10, 11, 12]])

# Stack the arrays vertically
result_vstack = np.vstack((array1, array2))

print("Result of vstack:")

print(result_vstack)

Output:


Result of vstack:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

hstack()

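Functionality: Stacks arrays in sequence horizontally (column-wise). The input arrays are concatenated along the second axis (for 1-D arrays, along the first axis), producing a new array with more columns. The arrays must have the same shape along all other axes, so 2-D inputs need the same number of rows.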

Usage: numpy.hstack(tup), where tup is a tuple of arrays to be stacked.

Example of hstack()

# Create two 2D arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[7, 8],
                   [9, 10]])

# Stack the arrays horizontally
result_hstack = np.hstack((array1, array2))

print("Result of hstack:")

print(result_hstack)

Output:


Result of hstack:
[[ 1  2  3  7  8]
 [ 4  5  6  9 10]]
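
The two examples use 2-D inputs. As a supplementary sketch (array values are illustrative), the following shows what both functions do with 1-D inputs and how they relate to np.concatenate for 2-D inputs:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# vstack treats each 1-D array as a row: result shape (2, 3)
v = np.vstack((a, b))

# hstack joins 1-D arrays end to end: result shape (6,)
h = np.hstack((a, b))

print(v.shape, h.shape)

# For 2-D inputs these match np.concatenate along axis 0 and axis 1
m = np.array([[1, 2], [3, 4]])
same_v = np.array_equal(np.vstack((m, m)), np.concatenate((m, m), axis=0))
same_h = np.array_equal(np.hstack((m, m)), np.concatenate((m, m), axis=1))
print(same_v, same_h)
```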

In NumPy, fliplr() and flipud() are functions used to flip arrays along specific axes. Here's a detailed explanation of the differences between these two methods, along with their effects on various array dimensions.

fliplr()

Functionality: Flips an array horizontally (left to right). This means that the columns of the array are reversed.

Usage: numpy.fliplr(m), where m is the input array.

Effect: The order of the columns is reversed, while the rows remain unchanged.

flipud()

Functionality: Flips an array vertically (up to down). This means that the rows of the array are reversed.

Usage: numpy.flipud(m), where m is the input array.

Effect: The order of the rows is reversed, while the columns remain unchanged.

Effects on Various Array Dimensions

1D Arrays:

flipud() reverses the elements, since axis 0 is the only axis. fliplr() raises a ValueError because it requires an array with at least two dimensions.

2D Arrays:

fliplr(): Reverses the order of columns.

flipud(): Reverses the order of rows.

3D Arrays:

For 3D arrays, flipud() still reverses axis 0 and fliplr() still reverses axis 1; the remaining axis is left unchanged. To flip along any other axis, use numpy.flip(m, axis=...).
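
The behavior on 1-D, 2-D, and 3-D inputs can be checked with a short sketch like the one below. The arrays are illustrative, and the flag name fliplr_1d_ok is just a helper for this example.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.fliplr(m))   # columns reversed
print(np.flipud(m))   # rows reversed

# 1-D input: flipud works, fliplr needs at least two dimensions
v = np.array([1, 2, 3])
print(np.flipud(v))
try:
    np.fliplr(v)
    fliplr_1d_ok = True
except ValueError:
    fliplr_1d_ok = False
print("fliplr on 1-D succeeded:", fliplr_1d_ok)

# 3-D input: flipud reverses axis 0, fliplr reverses axis 1
t = np.arange(8).reshape(2, 2, 2)
print(np.array_equal(np.flipud(t), t[::-1]))
print(np.array_equal(np.fliplr(t), t[:, ::-1]))
```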

The array_split() method in NumPy is used to split an array into multiple sub-arrays. It is particularly useful when you need to divide an array into smaller segments for processing or analysis. Here's a detailed look at its functionality, including how it handles uneven splits.

Functionality of array_split()

Syntax: numpy.array_split(ary, indices_or_sections, axis=0)


ary: The input array to be split.

indices_or_sections: This can be either an integer (the number of sections to split the array into, as close to equal in size as possible) or a sorted list/array of indices at which to split the array.

axis: The axis along which to split the array. The default is 0 (rows for 2D arrays).

Handling Uneven Splits

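When the number of elements in the array is not evenly divisible by the specified number of sections, array_split() does not raise an error (unlike numpy.split()). Instead, it distributes the extra elements over the first sub-arrays. Here's how it works:


If the array can be divided evenly, every sub-array has the same size.
If it cannot, an array of length l split into n sections yields l % n sub-arrays of size l // n + 1 followed by the remaining sub-arrays of size l // n. The larger sub-arrays come first, so the extra elements do not all end up in the last one.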
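
A minimal sketch of an uneven split, using a 10-element array chosen for illustration. It also shows splitting at explicit indices and, for comparison, the stricter numpy.split(); the flag name strict_split_ok is a helper for this example only.

```python
import numpy as np

arr = np.arange(10)

# 10 elements into 3 sections: sizes 4, 3, 3
parts = np.array_split(arr, 3)
for p in parts:
    print(p)

# Splitting at explicit indices instead of a section count
head, middle, tail = np.array_split(arr, [2, 7])
print(head, middle, tail)

# np.split insists on an exact division and raises ValueError otherwise
try:
    np.split(arr, 3)
    strict_split_ok = True
except ValueError:
    strict_split_ok = False
print("np.split(arr, 3) succeeded:", strict_split_ok)
```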

Vectorization and broadcasting are two powerful concepts in NumPy that significantly enhance the efficiency of array operations, making them much faster and more concise compared to traditional loop-based approaches. Here's an explanation of each concept and how they contribute to efficient computations.


Vectorization

Concept:

Vectorization refers to the ability to perform operations on entire arrays (or large chunks of data) at once, rather than element by element. This is achieved through the use of NumPy's underlying implementation in C, which allows for optimized execution of mathematical operations.

Advantages:

Speed: Vectorized operations are executed at the C level, which is much faster than Python's for-loops. This leads to significant performance gains, especially with large datasets.

Conciseness: Vectorized operations lead to cleaner and more readable code. Instead of writing multiple lines of code to iterate through an array, you can express complex operations in a single line.
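
As a small illustration, here is the same sum of squares written as an explicit loop and as a vectorized expression. The input values are arbitrary.

```python
import numpy as np

values = np.arange(1, 6, dtype=np.float64)   # 1.0, 2.0, ..., 5.0

# Loop-based: square each element and accumulate in Python
total_loop = 0.0
for v in values:
    total_loop += v * v

# Vectorized: one expression, the element loop runs in compiled code
total_vec = np.sum(values ** 2)

print(total_loop, total_vec)
```

Both versions give the same result; the vectorized form is shorter, and its advantage in speed grows with the size of the array.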

Broadcasting

Concept:

Broadcasting is a feature that allows NumPy to perform arithmetic operations on arrays of different shapes. When operating on arrays of different sizes, NumPy automatically expands the smaller array across the larger array’s dimensions so that they have compatible shapes.

Mechanism:

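Rule of Compatibility: Shapes are compared dimension by dimension, starting from the trailing (rightmost) dimension. Two dimensions are compatible when they are equal or when one of them is 1; a missing leading dimension is treated as 1. If every pair of dimensions is compatible, NumPy can "broadcast" the arrays to a common shape; otherwise it raises a ValueError.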

Expansion: During the operation, the smaller array is virtually expanded (without actual memory duplication) to match the shape of the larger array.
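
A minimal sketch of broadcasting with illustrative arrays. It uses np.broadcast_to to make the virtual expansion visible, and the flag name mismatch_ok is a helper for this example only.

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,), treated as (1, 3)

# Shapes (3, 1) and (1, 3) broadcast to (3, 3)
table = column + row
print(table)

# broadcast_to shows the "virtual" expansion: a read-only view, no data copied
expanded = np.broadcast_to(row, (3, 3))
print(expanded.shape, expanded.strides)

# Incompatible trailing dimensions (3 vs 2) raise ValueError
try:
    row + np.array([1, 2])
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print("(3,) + (2,) succeeded:", mismatch_ok)
```

The first stride of the expanded view is 0, which means every row reads the same underlying memory.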


In [1]:
# We can create a 3x3 NumPy array with random integers between 1 and 100 and then interchange its rows and columns (transpose the array) using the following code:


import numpy as np

# Create a 3x3 array with random integers between 1 and 100
array = np.random.randint(1, 101, size=(3, 3))

print("Original Array:")
print(array)

# Interchange rows and columns (transpose the array)
transposed_array = array.T

print("\nTransposed Array:")
print(transposed_array)

Original Array:
[[59 16 86]
 [85  5 72]
 [70 12 12]]

Transposed Array:
[[59 85 70]
 [16  5 12]
 [86 72 12]]


In [2]:
# we can create a 1D NumPy array with 10 elements and then reshape it into a 2x5 array and subsequently into a 5x2 array using the following code:


import numpy as np

# Generate a 1D NumPy array with 10 elements
array_1d = np.arange(10)  # This creates an array with elements from 0 to 9

print("Original 1D Array:")
print(array_1d)

# Reshape into a 2x5 array
array_2x5 = array_1d.reshape(2, 5)
print("\nReshaped into 2x5 Array:")
print(array_2x5)

# Reshape into a 5x2 array
array_5x2 = array_2x5.reshape(5, 2)
print("\nReshaped into 5x2 Array:")
print(array_5x2)

Original 1D Array:
[0 1 2 3 4 5 6 7 8 9]

Reshaped into 2x5 Array:
[[0 1 2 3 4]
 [5 6 7 8 9]]

Reshaped into 5x2 Array:
[[0 1]
 [2 3]
 [4 5]
 [6 7]
 [8 9]]


In [3]:
# we can create a 4x4 NumPy array with random float values and then add a border of zeros around it to create a 6x6 array using the following code:


import numpy as np

# Create a 4x4 array with random float values
array_4x4 = np.random.rand(4, 4)

print("Original 4x4 Array:")
print(array_4x4)

# Add a border of zeros around the 4x4 array
array_with_border = np.pad(array_4x4, pad_width=1, mode='constant', constant_values=0)

print("\n6x6 Array with Border of Zeros:")
print(array_with_border)

Original 4x4 Array:
[[0.54895473 0.50637229 0.6145008  0.87334714]
 [0.40561458 0.4193241  0.03023558 0.89268316]
 [0.41037827 0.24587303 0.9503601  0.41334971]
 [0.78064929 0.79019317 0.82658658 0.41868309]]

6x6 Array with Border of Zeros:
[[0.         0.         0.         0.         0.         0.        ]
 [0.         0.54895473 0.50637229 0.6145008  0.87334714 0.        ]
 [0.         0.40561458 0.4193241  0.03023558 0.89268316 0.        ]
 [0.         0.41037827 0.24587303 0.9503601  0.41334971 0.        ]
 [0.         0.78064929 0.79019317 0.82658658 0.41868309 0.        ]
 [0.         0.         0.         0.         0.         0.        ]]


In [4]:
# we can create an array of integers from 10 to 60 with a step of 5 using NumPy's np.arange() function. Here's how you can do it:


import numpy as np

# Create an array of integers from 10 to 60 with a step of 5
array = np.arange(10, 61, 5)

print("Array of integers from 10 to 60 with a step of 5:")
print(array)

Array of integers from 10 to 60 with a step of 5:
[10 15 20 25 30 35 40 45 50 55 60]


In [5]:
# We can create a NumPy array of strings and then apply various case transformations such as uppercase, lowercase, title case, etc. Here’s how you can do this:


import numpy as np

# Create a NumPy array of strings
string_array = np.array(['python', 'numpy', 'pandas'])

# Apply different case transformations
uppercase_array = np.char.upper(string_array)
lowercase_array = np.char.lower(string_array)
titlecase_array = np.char.title(string_array)
capitalize_array = np.char.capitalize(string_array)

# Display the results
print("Original Array:")
print(string_array)

print("\nUppercase:")
print(uppercase_array)

print("\nLowercase:")
print(lowercase_array)

print("\nTitle Case:")
print(titlecase_array)

print("\nCapitalized:")
print(capitalize_array)

Original Array:
['python' 'numpy' 'pandas']

Uppercase:
['PYTHON' 'NUMPY' 'PANDAS']

Lowercase:
['python' 'numpy' 'pandas']

Title Case:
['Python' 'Numpy' 'Pandas']

Capitalized:
['Python' 'Numpy' 'Pandas']


In [8]:
# we can generate a NumPy array of words and insert a space between each character of every word using the following code:


import numpy as np

# Create a NumPy array of words
words_array = np.array(['hello', 'numpy', 'array', 'pwskills'])

# Insert a space between each character of every word
spaced_words_array = np.char.join(' ', words_array)

# Display the results
print("Original Array of Words:")
print(words_array)

print("\nArray with Spaces Between Characters:")
for word in spaced_words_array:
    print(word)

Original Array of Words:
['hello' 'numpy' 'array' 'pwskills']

Array with Spaces Between Characters:
h e l l o
n u m p y
a r r a y
p w s k i l l s


In [9]:
# we can create two 2D NumPy arrays and perform element-wise operations such as addition, subtraction, multiplication, and division. Here's how to do it:


import numpy as np

# Create two 2D NumPy arrays
array1 = np.array([[1, 2, 3],
                   [4, 5, 6]])

array2 = np.array([[10, 20, 30],
                   [40, 50, 60]])

# Perform element-wise addition
addition_result = array1 + array2

# Perform element-wise subtraction
subtraction_result = array1 - array2

# Perform element-wise multiplication
multiplication_result = array1 * array2

# Perform element-wise division
division_result = array1 / array2

# Display the results
print("Array 1:")
print(array1)

print("\nArray 2:")
print(array2)

print("\nElement-wise Addition:")
print(addition_result)

print("\nElement-wise Subtraction:")
print(subtraction_result)

print("\nElement-wise Multiplication:")
print(multiplication_result)

print("\nElement-wise Division:")
print(division_result)

Array 1:
[[1 2 3]
 [4 5 6]]

Array 2:
[[10 20 30]
 [40 50 60]]

Element-wise Addition:
[[11 22 33]
 [44 55 66]]

Element-wise Subtraction:
[[ -9 -18 -27]
 [-36 -45 -54]]

Element-wise Multiplication:
[[ 10  40  90]
 [160 250 360]]

Element-wise Division:
[[0.1 0.1 0.1]
 [0.1 0.1 0.1]]


In [10]:
# we can create a 5x5 identity matrix using NumPy and then extract its diagonal elements with the following code:


import numpy as np

# Create a 5x5 identity matrix
identity_matrix = np.eye(5)

print("5x5 Identity Matrix:")
print(identity_matrix)

# Extract the diagonal elements
diagonal_elements = np.diag(identity_matrix)

print("\nDiagonal Elements:")
print(diagonal_elements)

5x5 Identity Matrix:
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]

Diagonal Elements:
[1. 1. 1. 1. 1.]


In [11]:
# we can generate a NumPy array of 100 random integers between 0 and 1000 and then find and display all prime numbers in that array using the following code:


import numpy as np

# Generate a NumPy array of 100 random integers between 0 and 1000
random_integers = np.random.randint(0, 1001, size=100)

# Function to check if a number is prime
def is_prime(num):
    if num <= 1:
        return False
    for i in range(2, int(num**0.5) + 1):
        if num % i == 0:
            return False
    return True

# Find all prime numbers in the array
prime_numbers = np.array([num for num in random_integers if is_prime(num)])

# Display the results
print("Random Integers Array:")
print(random_integers)

print("\nPrime Numbers in the Array:")
print(prime_numbers)

Random Integers Array:
[ 55 851 785 381 589 488 691 121 242 148 895 787 143 642 510 761 774 275
  31 432  83 845 815 870 903 945 643 968 126 992 436 466 837 825  37 487
 122 705 422 115 596  92 232 367 817 509 649 429 755  52 569 816 502 698
 212 876 762 445 702 731  20 453 798 200 538 715 833 650 473 314 390 412
 981  35 182 887  98 647 934  24   0 145 473 923 775 591 622   1 451 108
 411 327 870 865 573 222 539 489 487 504]

Prime Numbers in the Array:
[691 787 761  31  83 643  37 487 367 509 569 887 647 487]


In [15]:
# we can create a NumPy array representing daily temperatures for a month (30 days) and then calculate and display the weekly averages. Here's how to do it:

import numpy as np

# Create a NumPy array representing daily temperatures for 30 days
# For example, random temperatures between 15 and 30 degrees Celsius
daily_temperatures = np.random.randint(15, 31, size=30)

print("Daily Temperatures for the Month:")
print(daily_temperatures)

# A 30-day month holds 4 full weeks (28 days) plus 2 extra days,
# so only the first 28 days can be reshaped into a 4x7 array
reshaped_temperatures = daily_temperatures[:28].reshape(4, 7)

# Calculate weekly averages for the 4 full weeks
weekly_averages = np.mean(reshaped_temperatures, axis=1)

print("\nWeekly Averages:")
print(weekly_averages)

# Average of the 2 remaining days
print("\nAverage of Remaining 2 Days:")
print(np.mean(daily_temperatures[28:]))


Daily Temperatures for the Month:
[25 25 29 30 17 28 27 17 26 30 30 22 27 15 23 27 25 16 19 22 15 23 24 22
 16 25 18 27 21 23]

Weekly Averages:
[25.85714286 23.85714286 21.         22.14285714]

Average of Remaining 2 Days:
22.0