# Lab 15 Example
In this lab we will cover two advanced topics in Python: **Concurrency and Parallelism**, followed by **Packaging and Distribution**.

## Concurrency and Parallelism
In Python, **concurrency and parallelism** are two ways of working on multiple tasks to improve performance, responsiveness, or throughput. While the terms are often used interchangeably, they refer to different concepts:

* **Concurrency** is about dealing with many tasks at once—managing multiple operations in overlapping time frames (e.g., using threads or asynchronous I/O).
* **Parallelism** is about doing many tasks at the same time—actually running code simultaneously, typically on multiple CPU cores (e.g., using the `multiprocessing` module).

Python’s Global Interpreter Lock (GIL) imposes limitations on true parallelism in multi-threaded code, but tools like **`asyncio`**, **`threading`**, and **`multiprocessing`** offer different models for achieving concurrency and parallelism. Understanding when and how to use each is critical for building high-performance applications, especially in web servers, data pipelines, or scientific computing.
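As a rough illustration of the `threading` model mentioned above, the sketch below runs a few simulated I/O waits in a thread pool. The `fake_io` and `run_in_threads` helpers are hypothetical, and `time.sleep` is only a stand-in for a real blocking operation such as a network request. Because a sleeping thread releases the GIL, the waits overlap even though only one thread executes Python bytecode at a time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.2):
    """Hypothetical stand-in for a blocking I/O call (e.g. a network request)."""
    time.sleep(delay)  # the GIL is released while the thread sleeps
    return task_id


def run_in_threads(n_tasks=4):
    """Run n_tasks simulated I/O calls concurrently; return (results, elapsed)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_tasks) as executor:
        # executor.map preserves the order of the inputs in its results
        results = list(executor.map(fake_io, range(n_tasks)))
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = run_in_threads()
print(f"Finished tasks {results} in {elapsed:.2f} seconds")
```

Run one after another, the four waits would take about 0.8 seconds; with threads they should take roughly as long as a single wait. For CPU-bound work the same pattern would not be expected to help much, because of the GIL.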


### **Concurrency (AsyncIO)**

Some programs spend much of their running time waiting. For example, a web scraping script often waits for network responses, and during this time the CPU is largely idle. A straightforward idea is to use this idle time to do something else, such as sending another request, processing previously received data, or performing background tasks.

This is where **asynchronous programming** comes in. Python's `asyncio` library allows you to write code that can pause (await) during slow operations without blocking the entire program. This makes it possible to handle many tasks concurrently in a single thread — ideal for I/O-bound workloads such as:

* Fetching data from multiple APIs
* Reading and writing files or databases
* Handling thousands of web clients in a server

With `async def` functions and the `await` keyword, you can build efficient, non-blocking applications that are easier to read and maintain than traditional callback-based approaches.
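A minimal sketch of `async def` and `await` follows. The `fake_request` coroutine is a hypothetical stand-in that uses `asyncio.sleep` to simulate a slow I/O operation; no real network access is involved.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Hypothetical stand-in for a slow I/O operation such as an HTTP request."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return f"{name} done"


async def main():
    # Schedule all three coroutines at once; gather returns results in input order.
    return await asyncio.gather(
        fake_request("a", 0.2),
        fake_request("b", 0.2),
        fake_request("c", 0.2),
    )


start = time.perf_counter()
# In a script, asyncio.run starts the event loop. In a notebook cell, where a
# loop is already running, use `results = await main()` instead.
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f} seconds")
```

The three simulated requests wait at the same time, so the total should be close to 0.2 seconds rather than 0.6 seconds.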


Let's look at an example that queries the [Marketstack](https://marketstack.com/) API for end-of-day stock data.

In [None]:
import asyncio
import datetime
import json


url_format = "https://api.marketstack.com/v1/eod?access_key={api_key}&symbols={symbols}&date_from={date_from}&date_to={date_to}"
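One way the cell above might be continued is sketched below: build one URL per symbol from `url_format`, then fetch them concurrently. The helper names (`build_url`, `fetch_json`, `fetch_all`), the `"YOUR_API_KEY"` placeholder, the example symbols, the `YYYY-MM-DD` date format, and the use of `urllib.request` in a worker thread are all assumptions for illustration, not the lab's actual solution. The structure of the JSON response is not assumed either.

```python
import asyncio
import datetime
import json
import urllib.request

url_format = "https://api.marketstack.com/v1/eod?access_key={api_key}&symbols={symbols}&date_from={date_from}&date_to={date_to}"


def build_url(api_key, symbol, date_from, date_to):
    """Fill in url_format; dates are datetime.date objects rendered as YYYY-MM-DD (assumed format)."""
    return url_format.format(
        api_key=api_key,
        symbols=symbol,
        date_from=date_from.isoformat(),
        date_to=date_to.isoformat(),
    )


def _download(url):
    # Blocking helper: performs the HTTP request and parses the body as JSON.
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


async def fetch_json(url):
    # Run the blocking download in a worker thread so the event loop stays free.
    return await asyncio.to_thread(_download, url)


async def fetch_all(urls, fetch=fetch_json):
    # Start every request at once and wait for all of them to finish.
    return await asyncio.gather(*(fetch(url) for url in urls))


# Hypothetical inputs: placeholder key and two example ticker symbols.
date_to = datetime.date(2024, 1, 31)
date_from = date_to - datetime.timedelta(days=30)
urls = [build_url("YOUR_API_KEY", s, date_from, date_to) for s in ("AAPL", "MSFT")]
print(urls[0])
```

With a valid key, `await fetch_all(urls)` in a notebook cell (or `asyncio.run(fetch_all(urls))` in a script) would issue the requests concurrently. The `fetch` parameter makes it possible to substitute a fake coroutine when testing without network access.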


## Parallelism

`asyncio` can speed up tasks that involve a lot of waiting (idle time), such as downloading files or querying web APIs, by keeping the CPU busy with other work in the meantime. But what if your program is **CPU-intensive**, for example multiplying large numbers, sorting huge datasets, or processing images?

The execution time of a CPU-bound task is ultimately limited by your machine's hardware, but a single-process program often does not use all the available computational resources (i.e., all CPU cores).

Modern operating systems can execute multiple threads or processes in parallel across multiple CPU cores. To take advantage of this, you can break your workload into smaller chunks and run them **in parallel**. In Python, this is typically done using the **`multiprocessing`** module.

`multiprocessing` creates separate processes that can run truly in parallel, allowing your program to make use of multiple CPU cores. This can lead to significant performance improvements for compute-heavy operations, although starting processes and copying data between them adds overhead.
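The idea of breaking a workload into chunks can be sketched as follows. This version uses `concurrent.futures.ProcessPoolExecutor` from the standard library rather than `multiprocessing.Pool`. The helper names (`split_chunks`, `sum_squares`, `parallel_norm`) and the default of four workers are assumptions for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def split_chunks(values, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal pieces."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]


def sum_squares(chunk):
    """Partial result computed independently by each worker process."""
    return sum(x * x for x in chunk)


def parallel_norm(values, workers=4):
    """Euclidean norm of values, with partial sums computed in separate processes."""
    chunks = split_chunks(values, workers)
    with ProcessPoolExecutor(max_workers=workers) as executor:
        # Each chunk is sent to a worker; the partial sums are combined here.
        partial_sums = executor.map(sum_squares, chunks)
        return sum(partial_sums) ** 0.5
```

In a script, a call such as `parallel_norm(list(range(1, 1001)))` is best placed under `if __name__ == "__main__":`, since platforms that start worker processes by spawning re-import the main module. For small inputs the cost of sending chunks to the workers is likely to outweigh any gain.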


In [23]:
import random
random.seed(0)

def rand_array(n):
    return [random.randint(0, 10000000) for _ in range(n)]

arrays = [rand_array(100000) for _ in range(100)]

In [26]:
from multiprocessing import Pool, cpu_count
import time

def norm_vector(v):
    norm = sum(x**2 for x in v) ** 0.5
    return norm

# Use all available CPU cores
num_workers = cpu_count()

print(f"Running on {num_workers} cores...")

start = time.time()

with Pool(processes=num_workers) as pool:
    results = pool.map(norm_vector, arrays)

end = time.time()
print(f"[Multiprocessing] Computed norms of {len(arrays)} arrays in {end - start:.2f} seconds.")

Running on 32 cores...
[Multiprocessing] Computed norms of 100 arrays in 0.56 seconds.


In [27]:
# Single-threaded version
start = time.time()
results = []
for arr in arrays:
    results.append(norm_vector(arr))
end = time.time()
print(f"[Single-threaded] Computed norms of {len(arrays)} arrays in {end - start:.2f} seconds.")

[Single-threaded] Computed norms of 100 arrays in 1.12 seconds.


## Packaging and Distribution

Throughout this class, we've worked with several popular libraries such as `NumPy`, `Pandas`, `Matplotlib`, and `Seaborn`. You might now be wondering how to create your own Python library and share it with others. In this section, we'll walk you through the process of packaging your code into a reusable library and distributing it so others can install and use it just like any other Python package.
