https://docs.modular.com/engine/python/get-started.html

The Python API for the Modular Inference Engine makes it easy to instantly upgrade your model’s inference performance. With just a few lines of code, you can run any TensorFlow or PyTorch model with reduced latency and compute cost.

This page shows you how to load a trained TensorFlow model and execute it with the Modular Inference Engine. (It’s just as easy with a PyTorch model.) If you’d like to see performance benchmarks with other models, see our performance dashboard.

We also offer a C API, and our C++ API is coming soon.

Import Python modules

Nothing surprising here.

In [None]:
import numpy as np
from modular import engine
from pathlib import Path

Load the model

First we need to create an InferenceSession and load the model:

In [None]:
session = engine.InferenceSession()
model_path = Path('resnet50_v1_savedmodel')
model = session.load(model_path)

This compiles the model into the Modular format for inference.
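Since loading starts from a path on disk, it can help to check the path before handing it to the session. The sketch below is a minimal illustration that relies only on the standard TensorFlow SavedModel layout (a directory containing saved_model.pb or saved_model.pbtxt). The helper looks_like_savedmodel is a hypothetical name and is not part of the Modular API, and the directory used here is a temporary stand-in rather than a real model.

```python
from pathlib import Path
import tempfile


def looks_like_savedmodel(path):
    """Return True if `path` is a directory holding a SavedModel graph file."""
    path = Path(path)
    if not path.is_dir():
        return False
    # A TensorFlow SavedModel stores its graph as saved_model.pb
    # (or saved_model.pbtxt for the text format).
    return (path / "saved_model.pb").is_file() or (path / "saved_model.pbtxt").is_file()


# Stand-in directory for illustration only; no real model is involved.
with tempfile.TemporaryDirectory() as tmp:
    fake_model_dir = Path(tmp) / "resnet50_v1_savedmodel"
    fake_model_dir.mkdir()
    print(looks_like_savedmodel(fake_model_dir))  # False: no graph file yet
    (fake_model_dir / "saved_model.pb").touch()
    print(looks_like_savedmodel(fake_model_dir))  # True
```

A check like this only confirms the directory layout. It says nothing about whether the model itself is valid or supported.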

Run an inference

Before running the model, let’s check the input tensor shape and data type:

In [None]:
for tensor in model.input_metadata:
    print(f'shape: {tensor.shape}, dtype: {tensor.dtype}')

The first dimension is None, meaning the batch size is dynamic.
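A dynamic dimension has to be given a concrete size before we can create input data. Here is a minimal sketch of that step, assuming the shape is reported as a sequence in which None marks a dynamic dimension. The helper random_input_for is hypothetical and not part of the Modular API, and the example shape is hard-coded instead of being read from a real model.

```python
import numpy as np


def random_input_for(shape, dtype=np.float32, batch_size=1):
    """Return random data for `shape`, using `batch_size` for every None dimension."""
    concrete_shape = [batch_size if dim is None else dim for dim in shape]
    return np.random.rand(*concrete_shape).astype(dtype)


# Hypothetical metadata values, written out by hand for illustration.
example_shape = (None, 224, 224, 3)
sample = random_input_for(example_shape, batch_size=1)
print(sample.shape, sample.dtype)  # (1, 224, 224, 3) float32
```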

Just to demonstrate our API, let’s run an inference with random data that matches the input shape:

In [None]:
# One random input: a batch of 1 with shape (224, 224, 3), as float32
input_tensor = np.random.rand(1, 224, 224, 3).astype(np.float32)

# Run the model on the input
model.execute(input_tensor)

That’s it! execute() returns the output as a NumPy ndarray.
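Because the result is an ordinary array, the usual NumPy tools apply to it. The sketch below shows one common follow-up for a classifier, picking the highest-scoring classes. It assumes the output is a 2-D array of per-class scores with shape (batch, classes). That layout, the 1000-class size, and the helper top_k_classes are all assumptions for illustration, and the array here is random stand-in data and not a real model output.

```python
import numpy as np


def top_k_classes(scores, k=5):
    """Return the indices of the k highest scores in each row, best first."""
    # argsort is ascending, so take the last k columns and reverse them.
    return np.argsort(scores, axis=-1)[:, -k:][:, ::-1]


# Stand-in for a model output: 1 row of 1000 random scores (assumed layout).
rng = np.random.default_rng(0)
fake_output = rng.random((1, 1000), dtype=np.float32)

top5 = top_k_classes(fake_output, k=5)
print(top5.shape)  # (1, 5)
```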

We can also run 5 inferences at once by batching them together:

In [None]:
input_tensor_batch = np.repeat(input_tensor, 5, axis=0)

model.execute(input_tensor_batch)

It’s the same result five times because the batch repeats a single input, but this is just to show how easy it is to use the Python API with a TensorFlow or PyTorch model. This example also does not illustrate the Inference Engine’s performance. For real benchmark examples, see our performance dashboard.
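In practice, a batch usually holds different inputs. The following sketch uses only NumPy to contrast the two cases: np.stack joins distinct arrays along a new batch axis, while np.repeat copies one input. The images are random stand-in data, and the variable names are illustrative.

```python
import numpy as np

# Five distinct stand-in images, each with shape (224, 224, 3).
images = [np.random.rand(224, 224, 3).astype(np.float32) for _ in range(5)]

# Stack them along a new leading axis to form one batch.
distinct_batch = np.stack(images, axis=0)
print(distinct_batch.shape)  # (5, 224, 224, 3)

# By contrast, np.repeat copies a single input five times.
single = images[0][np.newaxis, ...]  # shape (1, 224, 224, 3)
repeated_batch = np.repeat(single, 5, axis=0)
print(np.array_equal(repeated_batch[0], repeated_batch[4]))  # True
```

Either array has the batch size in its first dimension, which is the dimension the model reports as dynamic.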

For more details, check out the Python API reference.

The Inference Engine is not publicly available yet, but if you’d like to get early access, please sign up here.