In [1]:
import numpy as np 

## Make your code faster with vectorization and broadcasting in NumPy. These techniques speed up numerical operations by avoiding slow Python loops and unnecessary memory use.

In [2]:
arr = np.array([1, 2, 3, 4, 5])
result = []
 
# Using a loop to square each element (slow)
for num in arr:
    result.append(num ** 2)
 
print(result)  # Values: 1, 4, 9, 16, 25 (NumPy 2.x prints each element as np.int64(...))

[np.int64(1), np.int64(4), np.int64(9), np.int64(16), np.int64(25)]


In [3]:
# Vectorization
# Vectorization performs an operation on an entire array at once instead of iterating over elements one by one. NumPy runs these operations in optimized, compiled C code, which is much faster than a Python loop.

# Vectorized operations are also more compact and readable, which makes the code easier to maintain.
arr2 = np.array([1, 2, 3, 4, 5])
result = arr2 ** 2  # Vectorized operation
print(result)  # Output: [ 1  4  9 16 25]

[ 1  4  9 16 25]


### Low-level implementation: NumPy’s vectorized operations are implemented in C (a compiled language), so the per-element work avoids the overhead of the Python interpreter.

### Batch processing: For many operations and data types, NumPy can use SIMD (Single Instruction, Multiple Data) instructions on supported CPUs, so a single instruction operates on several elements at once.
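
To make the speed claim concrete, the two approaches can be timed side by side. This is a minimal sketch: the array size, the repeat count, and the helper names `square_with_loop` and `square_vectorized` are illustrative assumptions, and the measured times depend on the machine and NumPy version.

```python
import timeit

import numpy as np

# Hypothetical benchmark input; the size is an arbitrary choice for illustration.
values = np.arange(100_000, dtype=np.int64)


def square_with_loop(a):
    """Square each element with an explicit Python loop."""
    out = []
    for x in a:
        out.append(x ** 2)
    return out


def square_vectorized(a):
    """Square the whole array in a single vectorized expression."""
    return a ** 2


# Run each version a few times and report the total elapsed time.
loop_time = timeit.timeit(lambda: square_with_loop(values), number=3)
vec_time = timeit.timeit(lambda: square_vectorized(values), number=3)

print(f"loop:       {loop_time:.4f} s")
print(f"vectorized: {vec_time:.4f} s")

# Both approaches should produce the same values.
same = np.array_equal(np.array(square_with_loop(values)), square_vectorized(values))
print("same result:", same)
```

On typical hardware the vectorized version is expected to be much faster, but the exact ratio is worth measuring rather than assuming.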


In [4]:
# Broadcasting: combining an array with a scalar without building a full-size copy of the scalar
# Broadcasting is most often seen when an operation mixes an array and a scalar value (e.g., adding a number to all elements of an array).
arr3 = np.array([1, 2, 3, 4, 5])
result = arr3 + 10  # Broadcasting: 10 is added to all elements
print(result)  # Output: [11 12 13 14 15]

[11 12 13 14 15]
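
One way to see why the scalar needs no extra memory is to look at a broadcast view directly. The sketch below uses `np.broadcast_to` as an illustration of the idea; it is an assumption that this mirrors what happens conceptually inside `arr3 + 10`, and the result of the addition itself is still a newly allocated array.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# np.broadcast_to returns a read-only view that looks like a full-size array.
stretched = np.broadcast_to(np.array(10), arr.shape)

print(stretched)                  # [10 10 10 10 10]
print(stretched.strides)          # (0,) : every position reads the same memory
print(stretched.flags.writeable)  # False

# Adding the view gives the same values as adding the scalar.
print(arr + stretched)            # [11 12 13 14 15]
```

A stride of 0 means the five logical elements all point at the single stored value, so nothing is duplicated.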


In [5]:
# Arrays with the same shape: plain element-wise addition (no stretching is needed here)
arr1 = np.array([1, 2, 3])
arr2 = np.array([10, 20, 30])

result = arr1 + arr2  # Element-wise addition
print(result)  # Output: [11 22 33]

[11 22 33]
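
The two arrays in this cell share the shape `(3,)`, so the addition is plain element-wise. For a case where the shapes really differ, here is a small sketch that combines a column of shape `(3, 1)` with a row of shape `(3,)`; the names `col`, `row`, and `grid` are illustrative.

```python
import numpy as np

col = np.array([[0], [10], [20]])  # shape (3, 1)
row = np.array([1, 2, 3])          # shape (3,)

# The column is stretched across columns and the row across rows,
# giving every pairwise sum in a (3, 3) result.
grid = col + row

print(grid.shape)  # (3, 3)
print(grid)
# [[ 1  2  3]
#  [11 12 13]
#  [21 22 23]]
```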


In [6]:
# Broadcasting a 1D array across each row of a 2D array
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3])

result = arr1 + arr2  # Broadcasting arr2 across arr1
print(result)
# Output:
# [[2 4 6]
#  [5 7 9]]

[[2 4 6]
 [5 7 9]]
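
The addition above works because the shapes `(2, 3)` and `(3,)` are compatible: NumPy compares shapes from the last dimension backwards, and two dimensions match when they are equal or one of them is 1. The sketch below checks that rule and shows a shape that fails; it assumes NumPy 1.20 or later for `np.broadcast_shapes`, and the variable names are illustrative.

```python
import numpy as np

# Ask NumPy which shape two operands would broadcast to.
print(np.broadcast_shapes((2, 3), (3,)))   # (2, 3)

matrix = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
two_values = np.array([10, 20])            # shape (2,)

try:
    matrix + two_values                    # trailing dimensions 3 and 2 do not match
    failed = False
except ValueError as exc:
    failed = True
    print("incompatible shapes:", exc)

# Adding an axis turns (2,) into a (2, 1) column, which is compatible with (2, 3).
per_row = matrix + two_values[:, np.newaxis]
print(per_row)
# [[11 12 13]
#  [24 25 26]]
```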


In [7]:
# Real-world scenario: normalizing the features of a dataset
# Simulating a dataset (5 samples, 3 features)
data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])

# Calculating mean and standard deviation for each feature (column)
mean = data.mean(axis=0)
std = data.std(axis=0)

# Normalizing the data (z-scores) using broadcasting: mean and std have shape (3,) and are applied across all 5 rows
normalized_data = (data - mean) / std

print(normalized_data)

[[-1.41421356 -1.41421356 -1.41421356]
 [-0.70710678 -0.70710678 -0.70710678]
 [ 0.          0.          0.        ]
 [ 0.70710678  0.70710678  0.70710678]
 [ 1.41421356  1.41421356  1.41421356]]
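
A quick sanity check on the result: after this normalization each column should have mean 0 and standard deviation 1. The sketch below repeats the calculation and verifies that, then shows a per-row variant using `keepdims=True`. The per-row part is an added illustration, not part of the original cell, and `data.std` here uses NumPy's default population standard deviation (`ddof=0`).

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]], dtype=float)

mean = data.mean(axis=0)  # shape (3,), one value per feature
std = data.std(axis=0)    # shape (3,), population std (ddof=0)
normalized = (data - mean) / std

# Each column should now have mean 0 and standard deviation 1.
print(np.allclose(normalized.mean(axis=0), 0.0))  # True
print(np.allclose(normalized.std(axis=0), 1.0))   # True

# Per-row centering: keepdims=True keeps the shape (5, 1),
# which broadcasts against the (5, 3) data.
row_mean = data.mean(axis=1, keepdims=True)
centered_rows = data - row_mean
print(centered_rows[0])  # [-10.   0.  10.]
```

Without `keepdims=True` the row means would have shape `(5,)`, which does not broadcast against `(5, 3)`.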


## Summary:
- Loops are slow because the Python interpreter adds overhead to every iteration.
- Vectorization applies an operation to an entire array at once and runs it in NumPy’s optimized C backend, which greatly improves performance.
- Broadcasting enables operations between arrays of different but compatible shapes by virtually stretching the smaller array to match the shape of the larger one, without copying its data.
- Real-world use: broadcasting handles data science tasks, such as normalizing datasets, without building expanded temporary copies of the smaller operands.