# 1. Matrix Multiplication with NumPy
Matrix operations can be significantly faster with NumPy because the heavy lifting happens in its optimized C backend rather than in the Python interpreter.

Why NumPy?

- Uses vectorized operations instead of Python loops.
- Efficient memory management.
- Dense linear algebra is delegated to BLAS libraries, which are often multithreaded.

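To make the first bullet concrete, here is a minimal sketch that multiplies two small matrices with explicit Python loops and with NumPy, then checks that both give the same result. The helper `matmul_loops` and the matrix sizes are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def matmul_loops(a, b):
    """Multiply two 2-D arrays with explicit Python loops (for comparison only)."""
    n, k = a.shape
    k2, m = b.shape
    if k != k2:
        raise ValueError("inner dimensions must match")
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            total = 0.0
            for p in range(k):
                total += a[i, p] * b[p, j]
            out[i, j] = total
    return out

rng = np.random.default_rng(0)
A_small = rng.random((20, 30))
B_small = rng.random((30, 10))

C_loops = matmul_loops(A_small, B_small)
C_numpy = A_small @ B_small  # equivalent to np.dot for 2-D arrays

# Both approaches should agree up to floating-point rounding
print(np.allclose(C_loops, C_numpy))
```

The loop version executes one interpreted step per scalar multiply-add, which is the per-element overhead that the vectorized call avoids.
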
In [1]:
import numpy as np
import time 

# Generate large random matrices
size = 1000
A = np.random.rand(size, size)
B = np.random.rand(size, size)

# Perform matrix multiplication 
start_time = time.time()
C = np.dot(A, B) # Optimized with NumPy
end_time = time.time()

print(f'Matrix multiplication took {end_time - start_time:.2f} seconds.')

Matrix multiplication took 0.01 seconds.
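
A single `time.time()` measurement can be noisy for an operation this short. As a hedged alternative, the sketch below uses the standard library's `timeit.repeat` to run the multiplication several times and keep the best repeat. The smaller matrix size and the repeat counts are arbitrary choices made to keep the example quick.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
size = 200  # smaller than the 1000 used above so the repeats stay quick
A = rng.random((size, size))
B = rng.random((size, size))

# 3 repeats of 5 multiplications each; each entry is the total time for one repeat
calls_per_repeat = 5
timings = timeit.repeat(lambda: A @ B, number=calls_per_repeat, repeat=3)

# The fastest repeat is usually the least disturbed by other system activity
best_per_call = min(timings) / calls_per_repeat
print(f"Best of 3 repeats: {best_per_call * 1e3:.3f} ms per multiplication")
```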


# 2. Parallel Processing with concurrent.futures
Suppose you want to process a large dataset by applying a computationally expensive function to each chunk. Parallelism can help.

Why Parallelism?
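- Utilizes multiple CPU cores for faster processing.
- Suitable for CPU-bound tasks, provided each task is large enough to outweigh the cost of starting worker processes and copying data to them.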

In [2]:
import concurrent.futures
import numpy as np
import time
from my_functions import expensive_function

if __name__ == '__main__':
    # Generate a large dataset
    data = np.random.rand(10_000_000)

    # Split data into chunks
    num_chunks = 8
    chunk_size = len(data) // num_chunks
    chunks = [data[i * chunk_size:(1+i) * chunk_size] for i in range(num_chunks)]

    # Sequential processing 
    start_time = time.time()
    results = [expensive_function(chunk) for chunk in chunks]
    sequential_time = time.time()-start_time
    print(f'Sequential processing took {sequential_time:.2f} seconds.')

    # Parallel processing
    start_time = time.time()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results_parallel = list(executor.map(expensive_function, chunks))
    parallel_time = time.time() - start_time
    print(f'Parallel processing took {parallel_time:.2f} seconds.')
    

Sequential processing took 0.31 seconds.
Parallel processing took 0.37 seconds.
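
The cell above imports `expensive_function` from a module `my_functions` that is not shown. The following is a minimal sketch of what such a module might contain; the function body (a Python-level loop summing `sqrt(x) * sin(x)`) is purely an assumption for illustration and is not the original implementation.

```python
# my_functions.py (hypothetical contents; the original module is not shown)
import math

def expensive_function(chunk):
    """Sum sqrt(x) * sin(x) over the chunk with a plain Python loop."""
    total = 0.0
    for x in chunk:
        total += math.sqrt(x) * math.sin(x)
    return total

if __name__ == "__main__":
    # Quick sanity check on a tiny input
    print(expensive_function([0.0, 1.0, 4.0]))
```

Keeping the function at module level in an importable file matters here: `ProcessPoolExecutor` sends the function to worker processes by reference, so each worker has to be able to import it. The timings above also show that parallel execution is not automatically faster. When each chunk takes only a few tens of milliseconds, process start-up and the cost of pickling the chunks can exceed the time saved.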


# 3. Combining NumPy and Parallelism
Optimize both the computation and its execution: use NumPy for vectorized operations inside each task, and `concurrent.futures` to run the tasks on multiple cores.

In [3]:
import numpy as np 
import concurrent.futures
import time

from my_functions import process_chunk

if __name__ == '__main__':
    # Generate a large dataset
    data = np.random.rand(10_000_000)

    # Split data into chunks
    num_chunks = 8
    chunk_size = len(data) // num_chunks
    chunks = [data[i * chunk_size:(i+1) * chunk_size] for i in range(num_chunks)]

    # Parallel processing with NumPy
    start_time = time.time()
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = list(executor.map(process_chunk, chunks))
    parallel_time = time.time() - start_time

    print(f'Parallel processing with NumPy took {parallel_time:.2f} seconds.')

Parallel processing with NumPy took 0.22 seconds.
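
`process_chunk` is also imported from the unseen `my_functions` module. The sketch below assumes a hypothetical vectorized version and shows how per-chunk results can be combined into one total. It uses `np.array_split`, which keeps every element even when the data length is not a multiple of the chunk count, and the built-in `map` as a stand-in for `executor.map` so the example runs without worker processes.

```python
import numpy as np

def process_chunk(chunk):
    """Hypothetical stand-in: vectorized sum of sqrt(x) * sin(x) over one chunk."""
    return float(np.sum(np.sqrt(chunk) * np.sin(chunk)))

rng = np.random.default_rng(0)
data = rng.random(1_000_003)  # length deliberately not divisible by 8

# np.array_split distributes the remainder instead of dropping trailing elements
chunks = np.array_split(data, 8)

# Built-in map stands in for executor.map; both return results in input order
partial_sums = list(map(process_chunk, chunks))
total = sum(partial_sums)

print(sum(len(c) for c in chunks) == len(data))
print(np.isclose(total, process_chunk(data)))
```

Because the per-chunk results are simple sums, adding them reproduces the result for the whole array up to floating-point rounding. Reductions that are not additive would need a different combine step.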


# 4. Efficient Data Aggregation with NumPy
Compute the sum of squares for a large dataset efficiently.

Why use NumPy here?
- Eliminates the need for explicit Python loops.
- Minimizes overhead and maximizes memory locality.

In [4]:
import numpy as np
import time

# Generate a large dataset
data = np.random.rand(100_000_000)

# Efficient computation with NumPy
start_time = time.time()
result = np.sum(data ** 2)
end_time = time.time()

print(f'Sum of squares computed in {end_time - start_time:.2f} seconds.')

Sum of squares computed in 0.54 seconds.
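
One detail worth noting: `data ** 2` first builds a temporary array as large as `data` before `np.sum` reduces it. As a hedged alternative, the inner product of the array with itself gives the same quantity and typically avoids that temporary. The sketch below checks that the two agree on a smaller array; whether it is faster on a given machine is something to measure rather than assume.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.random(1_000_000)

sum_sq_temp = np.sum(data ** 2)  # allocates a temporary array of squares
sum_sq_dot = np.dot(data, data)  # inner product of the array with itself

# The two results should match up to floating-point rounding
print(np.isclose(sum_sq_temp, sum_sq_dot))
```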
