# Imports

In [1]:
import requests
import time
import numpy as np

Make sure the backend server is running before proceeding.

If it is not, start it from the directory that contains `main.py`:

```
uvicorn main:app --reload
```
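
As a quick pre-flight check, the snippet below is a minimal sketch that tests whether anything is accepting TCP connections on the address used in this notebook. The helper name `is_port_open` is hypothetical and not part of the project. It only confirms that the port is open, not that the FastAPI routes respond correctly.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 8000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_port_open()` before the benchmark cells should return `True` once `uvicorn` has started, assuming the default host and port shown above.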

# Joblib Performance Evaluation

In [2]:
url = "http://127.0.0.1:8000/predict_joblib"
payload = {"features": [5.1, 3.5, 1.4, 0.2]}
headers = {"Content-Type": "application/json"}

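joblib_times = []

# Time 1000 sequential requests; each sample is the full HTTP round trip.
for _ in range(1000):
    start_time = time.perf_counter()  # monotonic, high-resolution timer
    response = requests.post(url, json=payload, headers=headers)
    end_time = time.perf_counter()
    response.raise_for_status()  # fail fast instead of timing error responses
    joblib_times.append(end_time - start_time)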

joblib_times = np.array(joblib_times)
joblib_mean = joblib_times.mean()
joblib_95th_percentile = np.percentile(joblib_times, 95)

print("Joblib mean inference time:", joblib_mean)
print("Joblib 95th percentile inference time:", joblib_95th_percentile)


Joblib mean inference time: 0.0019896438121795655
Joblib 95th percentile inference time: 0.002776598930358886


# ONNX Performance Evaluation

In [3]:
url = "http://127.0.0.1:8000/predict_onnx"
payload = {"features": [5.1, 3.5, 1.4, 0.2]}
headers = {"Content-Type": "application/json"}

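onnx_times = []

# Time 1000 sequential requests; each sample is the full HTTP round trip.
for _ in range(1000):
    start_time = time.perf_counter()  # monotonic, high-resolution timer
    response = requests.post(url, json=payload, headers=headers)
    end_time = time.perf_counter()
    response.raise_for_status()  # fail fast instead of timing error responses
    onnx_times.append(end_time - start_time)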

onnx_times = np.array(onnx_times)
onnx_mean = onnx_times.mean()
onnx_95th_percentile = np.percentile(onnx_times, 95)

print("ONNX mean inference time:", onnx_mean)
print("ONNX 95th percentile inference time:", onnx_95th_percentile)

ONNX mean inference time: 0.002039801597595215
ONNX 95th percentile inference time: 0.0037623763084411606
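
The two benchmark cells repeat the same timing loop, so it could be factored into one helper. The following is a sketch under stated assumptions, not the notebook's actual method: the name `benchmark`, the untimed warm-up calls, and the use of the standard-library `statistics` module in place of NumPy are all illustrative choices. The helper takes any zero-argument callable, so it can be tried without a running server.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(call: Callable[[], object], n: int = 1000, warmup: int = 10) -> Dict[str, float]:
    """Time `call` n times; return mean and 95th-percentile duration in seconds."""
    for _ in range(warmup):
        call()  # untimed warm-up calls (an assumption, not in the original cells)

    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)

    # quantiles(..., n=100) returns 99 cut points; index 94 is the 95th percentile.
    # method="inclusive" interpolates linearly between the sorted samples.
    p95 = statistics.quantiles(samples, n=100, method="inclusive")[94]
    return {"mean": statistics.fmean(samples), "p95": p95}


# Example usage against the endpoints above (requires the running backend):
# import requests
# payload = {"features": [5.1, 3.5, 1.4, 0.2]}
# stats = benchmark(lambda: requests.post("http://127.0.0.1:8000/predict_onnx", json=payload))
```

Passing the request as a callable keeps the timing logic identical for both endpoints, which makes the two sets of numbers easier to compare.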


# Conclusion

The joblib and ONNX endpoints perform almost identically on this benchmark. Mean latency is effectively tied (1.99 ms for joblib vs 2.04 ms for ONNX), and joblib shows a tighter tail (P95 of 2.78 ms vs 3.76 ms). Note that each sample measures the full HTTP round trip to a local server rather than model inference alone, so request overhead likely accounts for much of the measured time. For this lightweight logistic regression model, both formats answer 95% of requests in under 4 ms, so performance should not drive the choice. Base the decision on deployment requirements instead: joblib for Python-native environments, ONNX for cross-platform compatibility.
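
To put the gap in relative terms, the sketch below recomputes the differences from the values recorded in the outputs above. The helper `relative_gap` is a hypothetical name introduced here for illustration. On these numbers the mean differs by roughly 2.5% and the P95 by roughly 35.5%, from a single run of 1000 requests per endpoint.

```python
# Values copied from the recorded outputs above (seconds).
joblib_mean, joblib_p95 = 0.0019896438121795655, 0.002776598930358886
onnx_mean, onnx_p95 = 0.002039801597595215, 0.0037623763084411606


def relative_gap(baseline: float, other: float) -> float:
    """Return how much larger `other` is than `baseline`, as a fraction of `baseline`."""
    return (other - baseline) / baseline


mean_gap = relative_gap(joblib_mean, onnx_mean)
p95_gap = relative_gap(joblib_p95, onnx_p95)

# Report both the relative gap and the absolute gap in milliseconds.
print(f"Mean: ONNX is {mean_gap:.1%} slower ({(onnx_mean - joblib_mean) * 1e3:.2f} ms)")
print(f"P95:  ONNX is {p95_gap:.1%} slower ({(onnx_p95 - joblib_p95) * 1e3:.2f} ms)")
```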