# Parallel Classification via FastAPI
### This notebook reads a CSV file containing website content, sends parallel POST requests to a running FastAPI inference container, and appends the predicted labels to the original data. To validate the performance of the parallelized setup, both parallel and sequential inference times are measured and compared.

## Data Preparation
### This section prepares input data for inference by loading and concatenating multiple CSV files.
### Each CSV is expected to contain scraped website data, including the site name and its corresponding content.
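Before concatenating, it can help to confirm that each file really has the expected columns. The following is a minimal sketch using only the standard library; the column names `website` and `content` are taken from the loading cell, while the helper name `missing_columns` and the sample CSV strings are hypothetical and only serve as illustration.

```python
import csv
import io

# Column names the loading cell relies on (via dropna(subset=...))
REQUIRED_COLUMNS = {"website", "content"}

def missing_columns(csv_text):
    """Return the required columns absent from the CSV header, sorted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return sorted(REQUIRED_COLUMNS - header)

# Hypothetical sample inputs: one well-formed, one with a misnamed column
good = "website,content\nhttps://example.com,placeholder content\n"
bad = "website,text\nhttps://example.com,placeholder content\n"

print(missing_columns(good))
print(missing_columns(bad))
```

For real chunk files, the same check could be run on each file's text before calling `pd.read_csv`, so a malformed chunk is reported by name instead of failing inside `dropna`.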

In [278]:
import glob
import pandas as pd

# Load and concatenate all matching chunk files
all_files = sorted(glob.glob("scrapingResults/chunk_*_results.csv"))
df_list = [pd.read_csv(f).dropna(subset=["website", "content"]) for f in all_files]
df = pd.concat(df_list, ignore_index=True)

# Add a dummy row at index 0 because the first row does not get the right label.
# TEST failed: something is wrong with the first row.
# dummy_row = pd.DataFrame([{"website": "example.com", "content": "placeholder content"}])  # or content=None
# df = pd.concat([dummy_row, df], ignore_index=True)

print(f"Total records loaded: {len(df)}")
df.head()

Total records loaded: 57


Unnamed: 0,website,content
0,https://www.wikipedia.org,"Wikipedia The Free Encyclopedia English 6,974,..."
1,https://www.bbc.com,Skip to content Watch Live Home News Sport Bus...
2,https://www.cnn.com,CNN values your feedback 1. How relevant is th...
3,https://www.nytimes.com,Skip to content Skip to site index SKIP ADVERT...
4,https://www.britannica.com,Explore Search Britannica Click here to search...


### 🚀 Parallel Inference with `asyncio` and `httpx`

To improve classification speed, we leverage Python's `asyncio` and `httpx.AsyncClient` to send multiple inference requests concurrently to the FastAPI server.

Key components:
- `asyncio.Semaphore(3)`: Limits the number of concurrent requests to avoid overwhelming the server. It is recommended to set this to one or two less than the number of Gunicorn workers (depending on whether the app runs from the Docker image or directly on the local machine), due to a special case in the dataset.
- `fetch_parallel(...)`: Asynchronously sends one POST request for classification.
- `classify_parallel(df)`: Iterates over all rows in the input DataFrame and schedules async tasks for classification.
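The way the semaphore caps concurrency can be sketched without a running server. In this illustration, `asyncio.sleep` stands in for the awaited POST request, and the names `bounded_call`, `run_bounded`, and the `state` dictionary are hypothetical helpers, not part of the notebook.

```python
import asyncio

async def bounded_call(semaphore, state, item):
    # Wait for a free slot; at most `limit` coroutines run this body at once.
    async with semaphore:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for the awaited POST request
        state["active"] -= 1
        return f"label-{item}"

async def run_bounded(items, limit):
    semaphore = asyncio.Semaphore(limit)  # created inside the running loop
    state = {"active": 0, "peak": 0}      # tracks in-flight calls
    tasks = [bounded_call(semaphore, state, item) for item in items]
    results = await asyncio.gather(*tasks)  # results keep the input order
    return results, state["peak"]

results, peak = asyncio.run(run_bounded(range(10), limit=3))
print(peak, results[:3])
```

All ten coroutines are scheduled at once, but only `limit` of them are ever inside the `async with` block at the same time. Because `asyncio.gather` returns results in the order the awaitables were passed, the labels can be assigned directly to a DataFrame column.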


In [287]:
import httpx
import asyncio
import nest_asyncio
import time

nest_asyncio.apply()
API_URL = "http://localhost:8000/classify"
semaphore = asyncio.Semaphore(3)

async def fetch_parallel(session, website, content):
    async with semaphore:
        try:
            response = await session.post(API_URL, json={"website": website, "content": content}, timeout=15.0)
            return response.json().get("label", "error")
        except Exception:
            return "error"

async def classify_parallel(df):
    async with httpx.AsyncClient() as session:
        tasks = [fetch_parallel(session, row["website"], row["content"]) for _, row in df.iterrows()]
        return await asyncio.gather(*tasks)

# ⏱️ Time parallel run
start = time.perf_counter()  # monotonic clock, better suited to measuring intervals
df["predicted_label_parallel"] = asyncio.run(classify_parallel(df))
end = time.perf_counter()
print(f"✅ Parallel inference time: {end - start:.2f} sec")
df


✅ Parallel inference time: 120.25 sec


Unnamed: 0,website,content,predicted_label_parallel,predicted_label_sync
0,https://www.wikipedia.org,"Wikipedia The Free Encyclopedia English 6,974,...",health,health
1,https://www.bbc.com,Skip to content Watch Live Home News Sport Bus...,news,news
2,https://www.cnn.com,CNN values your feedback 1. How relevant is th...,news,news
3,https://www.nytimes.com,Skip to content Skip to site index SKIP ADVERT...,news,news
4,https://www.britannica.com,Explore Search Britannica Click here to search...,educational,educational
5,https://www.nationalgeographic.com,Latest Stories Subscribe for full access to re...,news,news
6,https://www.reuters.com,Please enable JS and disable any ad blocker,health,health
7,https://www.npr.org,Accessibility links Skip to main content Keybo...,news,news
8,https://www.bloomberg.com,Bloomberg Need help? Contact us We've detected...,news,news
9,https://www.bloomberg.com,Bloomberg Need help? Contact us We've detected...,news,news


### 🐢 Sequential Inference with `httpx.post`

To verify the effectiveness of parallel inference, this section measures the time taken to classify website content sequentially, one-by-one,
using synchronous HTTP requests with the `httpx` library.

In [288]:
def classify_sync(df):
    labels = []
    for _, row in df.iterrows():
        try:
            response = httpx.post(API_URL, json={"website": row["website"], "content": row["content"]}, timeout=15.0)
            labels.append(response.json().get("label", "error"))
        except Exception:
            labels.append("error")
    return labels

start = time.perf_counter()  # monotonic clock, better suited to measuring intervals
df["predicted_label_sync"] = classify_sync(df)
end = time.perf_counter()
print(f"🐢 Sequential inference time: {end - start:.2f} sec")
df

🐢 Sequential inference time: 137.98 sec


Unnamed: 0,website,content,predicted_label_parallel,predicted_label_sync
0,https://www.wikipedia.org,"Wikipedia The Free Encyclopedia English 6,974,...",health,health
1,https://www.bbc.com,Skip to content Watch Live Home News Sport Bus...,news,news
2,https://www.cnn.com,CNN values your feedback 1. How relevant is th...,news,news
3,https://www.nytimes.com,Skip to content Skip to site index SKIP ADVERT...,news,news
4,https://www.britannica.com,Explore Search Britannica Click here to search...,educational,educational
5,https://www.nationalgeographic.com,Latest Stories Subscribe for full access to re...,news,news
6,https://www.reuters.com,Please enable JS and disable any ad blocker,health,health
7,https://www.npr.org,Accessibility links Skip to main content Keybo...,news,news
8,https://www.bloomberg.com,Bloomberg Need help? Contact us We've detected...,news,news
9,https://www.bloomberg.com,Bloomberg Need help? Contact us We've detected...,news,news
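Since both runs call the same endpoint with the same inputs, the two label columns should agree. A small sanity check, sketched here with plain lists copied from the ten rows displayed above (the helper name `label_agreement` is hypothetical):

```python
def label_agreement(parallel_labels, sync_labels):
    """Return the fraction of matching labels and the indices that differ."""
    if len(parallel_labels) != len(sync_labels):
        raise ValueError("label lists must have the same length")
    mismatches = [
        i for i, (p, s) in enumerate(zip(parallel_labels, sync_labels)) if p != s
    ]
    total = len(parallel_labels)
    agreement = 1.0 if total == 0 else (total - len(mismatches)) / total
    return agreement, mismatches

# Labels from the ten rows shown in the output table
parallel = ["health", "news", "news", "news", "educational",
            "news", "health", "news", "news", "news"]
sync = ["health", "news", "news", "news", "educational",
        "news", "health", "news", "news", "news"]

agreement, mismatches = label_agreement(parallel, sync)
print(agreement, mismatches)
```

On the full DataFrame, the same function could be called with `df["predicted_label_parallel"].tolist()` and `df["predicted_label_sync"].tolist()`; any index in `mismatches` would point to a row worth inspecting, for example one that timed out in only one of the two runs.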


### As demonstrated, parallel inference provided a slight performance improvement: 120.25 sec versus 137.98 sec for the sequential run, roughly 13% less wall-clock time. However, when running the application inside a Docker container, we cannot fully leverage the benefits of parallelism due to resource constraints, since some CPU resources are reserved for container overhead. For optimal performance, the application can be run directly on my machine with more Gunicorn workers and concurrent connections, which I will showcase during the demo.
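The two timings reported above (120.25 sec parallel, 137.98 sec sequential) can be turned into a speedup factor and a percentage of time saved. This is a small arithmetic sketch; the function name `speedup_summary` is hypothetical.

```python
def speedup_summary(sequential_sec, parallel_sec):
    """Return (speedup factor, percent of wall-clock time saved)."""
    speedup = sequential_sec / parallel_sec
    reduction_pct = 100.0 * (sequential_sec - parallel_sec) / sequential_sec
    return speedup, reduction_pct

# Timings printed by the two cells above
speedup, reduction_pct = speedup_summary(137.98, 120.25)
print(f"speedup: {speedup:.2f}x, time saved: {reduction_pct:.1f}%")
```

A speedup well below the semaphore limit suggests the server side, not the client, is the bottleneck, which is consistent with the resource constraints described for the containerized setup.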