#### Python Data Mastery - A Hands-On Guide to Efficient Data Analysis for Engineers and Students
##### Chapter 2: NumPy Fundamentals

##### Understanding NumPy Arrays

In [None]:
# Consolidate all imports to one cell at the top
import numpy as np

NumPy arrays can be created in various ways, each suited to different scenarios you might encounter in data analysis. To create a basic NumPy array, you can convert a Python list using the np.array() function. For example:

In [None]:
my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list)
print(my_array)  # Output: [1 2 3 4 5]

This creates a one-dimensional array. NumPy's true power, however, lies in its ability to handle multi-dimensional arrays effortlessly. You can create a 2D array (matrix) by nesting lists:


In [None]:
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(matrix)

In [None]:
# np.zeros, np.ones, and np.arange are particularly useful when initializing arrays
# for further computations or when creating placeholder arrays in data analysis workflows.

zeros = np.zeros((3, 4))  # 3x4 array of zeros
ones = np.ones((2, 2))  # 2x2 array of ones
sequence = np.arange(0, 10, 2)  # array([0, 2, 4, 6, 8])
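
A few related constructors follow the same pattern. The sketch below is illustrative only: np.linspace and np.full are not mentioned in the text above and are brought in here as assumptions about what a reader might reach for next.

```python
import numpy as np

# np.linspace takes a number of points rather than a step size,
# and includes the endpoint by default
grid = np.linspace(0.0, 1.0, 5)
print(grid.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]

# np.full creates an array filled with a single constant value
filled = np.full((2, 3), 7)
print(filled.shape)  # (2, 3)

# np.zeros and np.ones produce float64 arrays unless dtype is given
print(np.zeros((3, 4)).dtype)  # float64
print(np.ones((2, 2), dtype=np.int32).dtype)  # int32
```

np.arange is usually the better fit for integer steps, while np.linspace avoids the endpoint surprises that floating-point steps can cause.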

Understanding the shape and dimensions of NumPy arrays is crucial for effective data manipulation. The shape attribute of a NumPy array returns a tuple representing the array's dimensions. For a 1D array, it returns a tuple with one element, while for a 2D array, it returns a tuple with two elements (rows, columns). The ndim attribute returns the number of dimensions of the array. For instance:

In [None]:
arr1d = np.array([1, 2, 3, 4])
print(arr1d.shape)  # Output: (4,)
print(arr1d.ndim)  # Output: 1
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2d.shape)  # Output: (2, 3)
print(arr2d.ndim)  # Output: 2
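
The same attributes extend to higher dimensions. As a small sketch, the array below (a hypothetical stack of 2 blocks, each with 3 rows and 4 columns) also shows how size and len() relate to shape:

```python
import numpy as np

# Hypothetical 3D array: 2 blocks, each with 3 rows and 4 columns
arr3d = np.zeros((2, 3, 4))

print(arr3d.shape)  # (2, 3, 4)
print(arr3d.ndim)   # 3
print(arr3d.size)   # 24, the product of the shape entries
print(len(arr3d))   # 2, the length of the first axis only
```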

NumPy arrays also support a wide range of data types, including integers, floating-point numbers, complex numbers, and even custom data types. The dtype attribute of an array reveals its data type. You can specify the data type when creating an array to ensure precision or to optimize memory usage:

In [None]:
int_array = np.array([1, 2, 3], dtype=np.int32)
float_array = np.array([1.0, 2.0, 3.0], dtype=np.float64)
print(int_array.dtype)  # Output: int32
print(float_array.dtype)  # Output: float64
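
To make the memory trade-off concrete, the following sketch compares the same 1000 values stored at two integer widths. The itemsize and nbytes attributes report bytes per element and total bytes; the variable names are illustrative.

```python
import numpy as np

values = np.arange(1000)

as_int64 = values.astype(np.int64)
as_int16 = values.astype(np.int16)  # safe here: every value fits in 16 bits

print(as_int64.itemsize, as_int64.nbytes)  # 8 8000
print(as_int16.itemsize, as_int16.nbytes)  # 2 2000
```

The narrower type uses a quarter of the memory, but it is only appropriate when every value fits in its range; values outside the int16 range would not be stored correctly.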

One of the most powerful features of NumPy arrays is that they can usually be reshaped without copying the underlying data. The reshape() method changes the shape of an array while keeping the total number of elements constant; it returns a view of the original data whenever the memory layout allows and a copy otherwise. This is particularly useful when you need to restructure your data for specific operations or visualizations:

In [None]:
arr = np.arange(12)
reshaped = arr.reshape((3, 4))
print(reshaped)
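
The sketch below checks the no-copy behavior for this particular array with np.shares_memory, and shows two related details: -1 lets NumPy infer one dimension, and an incompatible shape raises ValueError.

```python
import numpy as np

arr = np.arange(12)
reshaped = arr.reshape((3, 4))

# For a contiguous array like this one, reshape returns a view
print(np.shares_memory(arr, reshaped))  # True

# Writing through the view changes the original array too
reshaped[0, 0] = 99
print(arr[0])  # 99

# -1 asks NumPy to infer that dimension from the total size
inferred = arr.reshape((2, -1))
print(inferred.shape)  # (2, 6)

# A shape with a different element count raises ValueError
try:
    arr.reshape((5, 3))
except ValueError as exc:
    print("reshape failed:", exc)
```

Because the result can share memory with the original, call .copy() on it when an independent array is needed.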

##### Array Operations and Broadcasting

In [None]:
arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])
print(arr1 + arr2)  # Output: [ 6  8 10 12]
print(arr1 * 2)  # Output: [2 4 6 8]
print(arr1**2)  # Output: [ 1  4  9 16]

These vectorized operations are not only more concise but also significantly faster than their loop-based counterparts, especially for large arrays. This efficiency is crucial when working with large datasets in data analysis and scientific computing.
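
One way to see the difference is to time both approaches on the same data. The sketch below uses the standard-library timeit module; the helper names are hypothetical, and the measured times depend on the machine, so no specific figures are claimed.

```python
import timeit

import numpy as np

data = np.arange(100_000, dtype=np.float64)


def square_with_loop(values):
    """Square each element one at a time in a Python loop."""
    out = np.empty_like(values)
    for i in range(values.size):
        out[i] = values[i] ** 2
    return out


def square_vectorized(values):
    """Square all elements with a single array expression."""
    return values ** 2


# Both approaches should agree before their speed is compared
same_result = np.allclose(square_with_loop(data), square_vectorized(data))
print(same_result)  # True

loop_time = timeit.timeit(lambda: square_with_loop(data), number=3)
vec_time = timeit.timeit(lambda: square_vectorized(data), number=3)
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```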

In [None]:
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
vector = np.array([10, 20, 30])
result = matrix + vector
print(result)

In this example, NumPy automatically broadcasts the vector to match the shape of the matrix, effectively adding it to each row. This powerful feature allows for concise and efficient operations that would otherwise require explicit loops or array tiling.
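
As a sketch of what broadcasting replaces, the block below builds the tiled array explicitly with np.tile and confirms that the result matches. It also shows the column-wise variant, which needs the vector reshaped to (3, 1).

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
vector = np.array([10, 20, 30])

# Explicit tiling builds the full 3x3 array that broadcasting only implies
tiled = np.tile(vector, (3, 1))
print(np.array_equal(matrix + vector, matrix + tiled))  # True

# To add the vector down each column instead, give it shape (3, 1)
column = vector[:, np.newaxis]
print(matrix + column)
# [[11 12 13]
#  [24 25 26]
#  [37 38 39]]
```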


The rules of broadcasting are designed to be as intuitive as possible while still allowing for powerful operations. NumPy compares the shapes of the two arrays dimension by dimension, starting from the trailing dimension. Two dimensions are compatible when they are equal or when one of them is 1. If one array has fewer dimensions, it's treated as if it had leading dimensions of size 1. For example:


In [None]:
arr_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])  # Shape: (2, 2, 2)
arr_2d = np.array([[10, 20], [30, 40]])  # Shape: (2, 2)
result = arr_3d + arr_2d
print(result)

Here, arr_2d is broadcast across the first dimension of arr_3d, effectively adding it to each 2D slice of the 3D array. Understanding and leveraging broadcasting can lead to more efficient and readable code, especially when working with multi-dimensional data.
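
The shape rules can be checked without building any data. The sketch below assumes NumPy 1.20 or later, where np.broadcast_shapes is available, and also shows what happens when two shapes are incompatible.

```python
import numpy as np

# Shapes are aligned from the right; missing leading dimensions count as 1
print(np.broadcast_shapes((2, 2, 2), (2, 2)))  # (2, 2, 2)
print(np.broadcast_shapes((3, 1), (1, 4)))     # (3, 4)

# Trailing dimensions 3 and 2 are unequal and neither is 1,
# so the addition fails with ValueError
try:
    np.ones((2, 3)) + np.ones((2,))
except ValueError as exc:
    print("cannot broadcast:", exc)
```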

NumPy also provides a rich set of mathematical functions. Trigonometric, exponential, and logarithmic functions operate element-wise on arrays, while statistical functions such as sums and means reduce an array to summary values. For example:

In [None]:
angles = np.array([0, np.pi / 2, np.pi])
print(np.sin(angles))  # Approximately [0. 1. 0.]; sin(np.pi) comes out as about 1.22e-16, not exactly 0
print(np.exp(angles))  # Output: [ 1.          4.81047738 23.14069263]
print(np.log(np.array([1, np.e, np.e**2])))  # Output: [0. 1. 2.]
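
Because these functions work in floating point, results that are zero mathematically can come out as very small nonzero numbers. A minimal sketch of comparing such results with a tolerance, using np.allclose:

```python
import numpy as np

angles = np.array([0, np.pi / 2, np.pi])
sines = np.sin(angles)

# np.pi is only an approximation of pi, so sin(np.pi) is tiny but nonzero
print(sines[2] == 0.0)  # False

# Compare with a tolerance instead of exact equality
print(np.allclose(sines, [0.0, 1.0, 0.0]))  # True

# Rounding is an option when the goal is readable output
print(np.round(sines, 10))
```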

When working with multi-dimensional arrays, NumPy provides functions to perform operations along specific axes. The axis parameter in many NumPy functions allows you to specify the dimension along which the operation should be performed. For example, you can compute the sum or mean along rows or columns of a 2D array:

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(np.sum(arr, axis=0))  # axis=0 collapses the rows, giving one sum per column: [5 7 9]
print(np.mean(arr, axis=1))  # axis=1 collapses the columns, giving one mean per row: [2. 5.]
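
A common follow-up is to combine an axis reduction with broadcasting. The sketch below subtracts each row's mean from that row; the keepdims=True argument keeps the reduced axis with length 1 so the shapes line up.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# axis=1 collapses the column axis, leaving one value per row
row_means = arr.mean(axis=1)
print(row_means.shape)  # (2,)

# keepdims=True keeps the collapsed axis with length 1, so the result
# broadcasts back against the original (2, 3) array
row_means_2d = arr.mean(axis=1, keepdims=True)
print(row_means_2d.shape)  # (2, 1)

centered = arr - row_means_2d
print(centered)
# [[-1.  0.  1.]
#  [-1.  0.  1.]]
```

Without keepdims, the (2,) result would be aligned with the trailing dimension of size 3 and the subtraction would fail.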

##### Indexing and Slicing in NumPy