# LightGBM Fiber vs FastAPI inference benchmarking test

A benchmark comparing the inference latency of a LightGBM model endpoint written with Python's FastAPI against the same endpoint written with Go's Fiber.

Both endpoints use the same LightGBM model file (trained on synthetic data generated with scikit-learn). The FastAPI endpoint uses the LightGBM package maintained by Microsoft; FastAPI itself is built on Starlette and Pydantic, whose validation core is written in Rust. The Fiber endpoint uses the [Leaves](https://pkg.go.dev/github.com/dmitryikh/leaves@v0.0.0-20230708180554-25d19a787328) module, a pure Go implementation of LightGBM prediction (now unmaintained).

The test should show the performance difference between Python and Go and their frameworks (for example, Fiber is using a custom JSON parser which is significantly more efficient than the standard one in either Python or Go). The test is by no means comprehensive. It is intended as a rough estimate of the latency difference between the two stacks, and as a proof of concept for hosting Python-trained models in compiled languages without going through ONNX.

Testing was run on a 2021 M1 MacBook Pro.

In [1]:
# Add the parent directory to the Python path so that src/ modules can be imported
import sys
sys.path.append('../')

# import modules
import time
import httpx
from numpy import ndarray

from src.data.sklearn import generate_data

In [2]:
# generate inference dataset
X, _ = generate_data(100)

# define host & ports
HOST_URL = "http://127.0.0.1"
FIBER = 3000
FASTAPI = 8000

In [3]:
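# define request function
def sync_requests(port: int, inference: ndarray) -> None:
    """Send one synchronous POST request per row of `inference` to the endpoint on `port`."""
    # reuse a single client so every request in the batch shares one connection pool
    with httpx.Client() as client:
        for data in inference:
            response = client.post(f"{HOST_URL}:{port}/inference", json={"data": data.tolist()})
            # raise httpx.HTTPStatusError if the endpoint returned an error response
            response.raise_for_status()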
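For reference, each request carries a single row of features under a `data` key. The snippet below is a minimal sketch of that body built with the standard library only; `build_payload` is a hypothetical helper for illustration and is not part of the benchmark code, and the three feature values are made up.

```python
import json


def build_payload(features):
    """Return the JSON body for one inference request: {"data": [f1, f2, ...]}."""
    # hypothetical helper: mirrors json={"data": data.tolist()} in sync_requests
    return json.dumps({"data": [float(value) for value in features]})


# illustrative feature values, not rows from the generated dataset
body = build_payload([0.5, -1.2, 3.0])
print(body)  # {"data": [0.5, -1.2, 3.0]}
```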

## Testing

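The test draws 50 samples per endpoint. Each sample times a batch of 100 synchronous requests (of randomly generated data), and the FastAPI and Fiber batches alternate within each iteration. The reported mean and median are therefore the time taken per batch of 100 requests, not per single request.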
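Because each sample covers a whole batch, a per-request figure has to be derived by dividing by the batch size. A minimal sketch of that conversion follows; `per_request_ms` and `REQUESTS_PER_SAMPLE` are names introduced here for illustration, and the sample values are placeholders rather than measured results.

```python
import statistics

REQUESTS_PER_SAMPLE = 100  # assumption: matches the 100 rows sent in every sample


def per_request_ms(sample_seconds, requests_per_sample=REQUESTS_PER_SAMPLE):
    """Convert batch durations in seconds to average per-request latency in milliseconds."""
    return [seconds / requests_per_sample * 1000 for seconds in sample_seconds]


# placeholder batch durations in seconds, NOT measured results
example_samples = [0.25, 0.26, 0.27]
per_request = per_request_ms(example_samples)

print(f"Mean per-request latency: {statistics.mean(per_request):.2f} ms")
print(f"Median per-request latency: {statistics.median(per_request):.2f} ms")
```

Note that this is an average over the batch, so it hides the spread between individual requests within a sample.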

In [4]:
# collect test data
SAMPLES = 50
fastapi_samples = []
fiber_samples = []

for _ in range(SAMPLES):
    start = time.perf_counter()
    sync_requests(FASTAPI, X)
    fastapi_samples.append(time.perf_counter() - start)

    start = time.perf_counter()
    sync_requests(FIBER, X)
    fiber_samples.append(time.perf_counter() - start)
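The loop above counts the very first batch sent to each server, which may include one-off costs such as lazy initialisation. One possible variant, sketched below under that assumption, runs a few untimed warm-up batches before collecting samples. `time_batches` is a hypothetical helper, not part of the benchmark code, and the demo uses a stand-in callable instead of real HTTP requests.

```python
import time


def time_batches(send_batch, samples, warmup=1):
    """Time `samples` calls of `send_batch` after `warmup` untimed calls."""
    # untimed warm-up calls so one-off start-up costs are not counted
    for _ in range(warmup):
        send_batch()

    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        send_batch()
        durations.append(time.perf_counter() - start)
    return durations


# stand-in for a batch of requests: records each call instead of hitting a server
calls = []
durations = time_batches(lambda: calls.append(1), samples=3, warmup=2)
print(len(calls), len(durations))  # 5 3
```

With the notebook's own names this could be called as `time_batches(lambda: sync_requests(FASTAPI, X), SAMPLES)`. Unlike the original loop, this helper times one endpoint at a time, so it gives up the interleaving that spreads background load evenly across both servers.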

In [5]:
import statistics

# every sample is the wall-clock time for one batch of 100 requests
fastapi_mean = statistics.mean(fastapi_samples)
fastapi_median = statistics.median(fastapi_samples)
print(f"Mean FastAPI latency: {fastapi_mean:.6f} seconds")
print(f"Median FastAPI latency: {fastapi_median:.6f} seconds\n")

fiber_mean = statistics.mean(fiber_samples)
fiber_median = statistics.median(fiber_samples)
print(f"Mean Fiber latency: {fiber_mean:.6f} seconds")
print(f"Median Fiber latency: {fiber_median:.6f} seconds\n")

# symmetric relative difference: the gap divided by the average of the two means
# (this is not a speed-up ratio, so it can exceed 100%)
difference = fastapi_mean - fiber_mean
average = statistics.mean([fastapi_mean, fiber_mean])
print(f"Fiber vs FastAPI mean difference: {difference/average:.2%} faster")

Mean FastAPI latency: 0.261092 seconds
Median FastAPI latency: 0.255399 seconds

Mean Fiber latency: 0.086377 seconds
Median Fiber latency: 0.083931 seconds

Fiber vs FastAPI mean difference: 100.56% faster
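The 100.56% figure is the gap between the two means divided by their average, so it does not read directly as "how many times faster". The sketch below recomputes it from the means printed above and adds two more conventional readings of the same numbers: the speed-up ratio and the relative reduction in latency. The variable names are illustrative; the two input values are copied from the output above.

```python
# mean batch durations (seconds per 100 requests) copied from the output above
fastapi_mean = 0.261092
fiber_mean = 0.086377

# how many times longer the FastAPI batch takes than the Fiber batch
speedup = fastapi_mean / fiber_mean

# fraction of the FastAPI batch time saved by switching to Fiber
reduction = 1 - fiber_mean / fastapi_mean

# the notebook's metric: gap divided by the average of the two means
symmetric_difference = (fastapi_mean - fiber_mean) / ((fastapi_mean + fiber_mean) / 2)

print(f"Speed-up: {speedup:.2f}x")                           # Speed-up: 3.02x
print(f"Latency reduction: {reduction:.1%}")                 # Latency reduction: 66.9%
print(f"Symmetric difference: {symmetric_difference:.2%}")   # Symmetric difference: 100.56%
```

In per-request terms, these means correspond to roughly 2.61 ms for FastAPI and 0.86 ms for Fiber on this machine.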
