# Broadcasting in NumPy
This section shows how vectorization and broadcasting make NumPy code faster. Both techniques speed up numerical operations by replacing slow Python loops and by avoiding unnecessary copies of data.

### 1. Why Loops Are Slow
In Python, loops are typically slow because:

- Python’s interpreter: Every iteration of the loop is executed by the interpreter one bytecode instruction at a time, which is inherently slower than lower-level, compiled code.
- High overhead: Each iteration also pays for dynamic type checks, the creation of a new Python object for every intermediate value, and method calls such as append.
  
While Python loops are convenient, they don’t take advantage of the optimized memory and computation that libraries like NumPy provide.



In [4]:
import numpy as np 
arr = np.array([1,2,3,4,5,6])
result = []
# Using a loop to square each element (slow)
for i in arr:
    result.append(i**2)
print(result)

[1, 4, 9, 16, 25, 36]


### 2. Vectorization: Fixing the Loop Problem

Vectorization allows you to perform operations on entire arrays at once, instead of iterating over elements one by one. This is made possible by NumPy’s optimized C-based backend that executes operations in compiled code, which is much faster than Python loops.

Vectorized operations are also more readable and compact, making your code easier to maintain.

In [9]:
arr = np.array([1,2,3,4,5,6])
result = arr**2  # Vectorized operation
print(result)

[ 1  4  9 16 25 36]


### Why is it Faster?

- Low-level implementation: NumPy’s vectorized operations are implemented in C (a compiled language), so the per-element work runs in a tight compiled loop instead of in the Python interpreter.
- Batch processing: Array elements are stored in contiguous, typed memory, and for many operations NumPy can use SIMD (Single Instruction, Multiple Data) instructions, where the CPU supports them, to process several elements per instruction.
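
To see the difference on your own machine, a minimal timing sketch follows. The array size, the repeat count, and the helper names `square_with_loop` and `square_vectorized` are illustrative choices, not part of the original notebook, and the measured times will vary by hardware and NumPy version.

```python
import timeit

import numpy as np

# Hypothetical benchmark array; int64 avoids overflow when squaring on any platform.
values = np.arange(100_000, dtype=np.int64)


def square_with_loop(arr):
    """Square each element with an explicit Python loop."""
    out = []
    for x in arr:
        out.append(x ** 2)
    return out


def square_vectorized(arr):
    """Square the whole array in a single NumPy operation."""
    return arr ** 2


# Run each version 10 times and report the total elapsed time.
loop_time = timeit.timeit(lambda: square_with_loop(values), number=10)
vec_time = timeit.timeit(lambda: square_vectorized(values), number=10)

print(f"loop:       {loop_time:.4f} s")
print(f"vectorized: {vec_time:.4f} s")
```

On typical setups the vectorized version should be substantially faster, but treat the exact ratio as something to measure rather than assume.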

### 3. Broadcasting: Scaling Arrays Without Extra Memory

Broadcasting is a powerful feature of NumPy that allows you to perform operations on arrays of different shapes without first copying the smaller array to match the larger one. It conceptually “stretches” the smaller array across the larger one, so the only new memory allocated is for the result.

Example: Broadcasting with a Scalar

Broadcasting is often used when you want to perform an operation on an array and a scalar value (e.g., add a number to all elements of an array).

In [16]:
arr = np.array([1,2,3,4,5,6])
result = arr+10 # Broadcasting: 10 is added to all elements
print(result)

[11 12 13 14 15 16]
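
One way to make the "stretching without copying" idea visible is `np.broadcast_to`, which returns a read-only view of its input at the requested shape. The sketch below is only an illustration of the concept; `arr + 10` does not need this explicit step, and the name `stretched` is introduced here for the example.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# View the scalar 10 as if it had the same shape as arr.
stretched = np.broadcast_to(10, arr.shape)

print(stretched.shape)    # (6,)
print(stretched.strides)  # (0,) -> every position reads the same value in memory

# Adding the view gives the same result as arr + 10.
result = arr + stretched
print(result)             # [11 12 13 14 15 16]
```

A stride of 0 means NumPy steps zero bytes between elements, so the six "copies" of 10 all refer to a single stored value.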


### 4. Broadcasting with Arrays of Different Shapes

Broadcasting becomes more powerful when you apply operations to arrays of different shapes. NumPy treats the arrays as if their shapes matched for element-wise operations, without actually copying the data. The simplest case is two arrays whose shapes already match:

In [19]:
arr1 = np.array([1,2,3])
arr2 = np.array([10,20,30])
result = arr1+arr2
print(result)

[11 22 33]


Both arrays here have shape (3,), so NumPy adds them element by element and no stretching is needed. Broadcasting only comes into play when the shapes differ, as in the next example.

#### Broadcasting a 2D Array and a 1D Array

In [23]:
arr1 = np.array([[1,2,3],[4,5,6]])
arr2 = np.array([1,2,3])
result = arr1 + arr2
print(result)

[[2 4 6]
 [5 7 9]]
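
In the example above, the 1D array of shape (3,) is applied to every row of the (2, 3) array. As a complementary sketch, the code below adds one value per row instead, which requires reshaping the 1D array into a column of shape (2, 1). The array `row_offsets` and its values are hypothetical, chosen only for illustration.

```python
import numpy as np

arr1 = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
row_offsets = np.array([100, 200])       # shape (2,), one value per row

# Shapes (2, 3) and (2,) do not line up, so add a trailing axis of length 1.
column = row_offsets[:, np.newaxis]      # shape (2, 1)

# The column is stretched across the 3 columns of arr1.
result = arr1 + column
print(result)
# [[101 102 103]
#  [204 205 206]]
```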


### How Broadcasting Works

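- Dimensions must be compatible: NumPy compares the two shapes starting from the trailing (rightmost) dimension and working left. Two dimensions are compatible when they are equal or when one of them is 1; a missing leading dimension is treated as 1.
- Stretching arrays: If the shapes are compatible, NumPy conceptually repeats the smaller array along every dimension where its size is 1 (or missing) to match the larger one, without copying data.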
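
The rules above can be checked without building large arrays. The sketch below uses `np.broadcast_shapes`, which is available in NumPy 1.20 and later, and then triggers the error NumPy raises for incompatible shapes. The specific shapes and the flag `compatible` are arbitrary choices for the example.

```python
import numpy as np

# Compatible: trailing dimensions match (3 == 3); the missing leading dimension counts as 1.
print(np.broadcast_shapes((2, 3), (3,)))    # (2, 3)

# Compatible: in each position, one of the two dimensions is 1.
print(np.broadcast_shapes((5, 1), (1, 4)))  # (5, 4)

# Incompatible: the trailing dimensions are 3 and 2, and neither is 1.
try:
    np.ones((2, 3)) + np.ones(2)
    compatible = True
except ValueError as exc:
    compatible = False
    print("ValueError:", exc)
```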

### 5. Hands-on: Applying Broadcasting to Real-World Scenarios

Let’s apply broadcasting to a real-world scenario: scaling data in machine learning.

Example: Normalizing Data Using Broadcasting

Imagine you have a dataset where each row represents a sample and each column represents a feature. You can normalize (standardize) the data by subtracting the mean of each column and dividing by that column's standard deviation, so that every feature ends up with mean 0 and standard deviation 1.

In [27]:
# Simulating a dataset (5 samples, 3 features)
data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])

# Calculating mean and standard deviation for each feature (column)

mean = data.mean(axis=0)
std = data.std(axis=0)

# Normalizing the data using broadcasting

normalized_data = (data - mean) / std
 
print(normalized_data)


[[-1.41421356 -1.41421356 -1.41421356]
 [-0.70710678 -0.70710678 -0.70710678]
 [ 0.          0.          0.        ]
 [ 0.70710678  0.70710678  0.70710678]
 [ 1.41421356  1.41421356  1.41421356]]
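
As a quick sanity check, the sketch below repeats the normalization and confirms that each column ends up with mean 0 and standard deviation 1. It relies on the default behavior of `std`, which computes the population standard deviation (`ddof=0`). Note that a feature with zero variance would lead to division by zero, a case this small example does not handle.

```python
import numpy as np

data = np.array([[10, 20, 30],
                 [15, 25, 35],
                 [20, 30, 40],
                 [25, 35, 45],
                 [30, 40, 50]])

mean = data.mean(axis=0)  # shape (3,), one mean per column
std = data.std(axis=0)    # shape (3,), one standard deviation per column

# (5, 3) combined with (3,): mean and std are broadcast across all 5 rows.
normalized_data = (data - mean) / std

print(mean.shape, std.shape)                           # (3,) (3,)
print(np.allclose(normalized_data.mean(axis=0), 0.0))  # True
print(np.allclose(normalized_data.std(axis=0), 1.0))   # True
```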


### Summary:

- Loops are slow because Python's interpreter adds overhead, making iteration less efficient.
- Vectorization allows you to apply operations to entire arrays at once, greatly improving performance by utilizing NumPy’s optimized C backend.
- Broadcasting enables operations between arrays of different shapes by automatically stretching the smaller array to match the shape of the larger array, without creating additional copies.
- Real-world use: Broadcasting can be used in data science tasks, such as normalizing datasets, without sacrificing memory or performance.