![Degirum banner](https://raw.githubusercontent.com/DeGirum/PySDKExamples/main/images/degirum_banner.png)
## Performance Test for Single-Model Inference
This notebook measures the performance of all ORCA-based image detection AI models from the DeGirum public model zoo.

This notebook works with the following inference options:

1. Run inference on the DeGirum Cloud Platform;
2. Run inference on a DeGirum AI Server deployed on localhost or on a computer in your LAN or VPN;
3. Run inference on a DeGirum ORCA accelerator installed directly on your computer.

To try different options, you need to specify the appropriate `hw_location` option.
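
As a rough illustration of how the three options map to `hw_location` values, the sketch below classifies a value using the conventions listed in the options cell of this notebook (`"@cloud"`, `"@local"`, otherwise an AI server address). The helper `describe_hw_location` is hypothetical and not part of the DeGirum SDK, and `192.168.0.10` is a made-up placeholder address.

```python
def describe_hw_location(hw_location: str) -> str:
    """Hypothetical helper: name the inference option a given hw_location selects."""
    if hw_location == "@cloud":
        return "DeGirum Cloud Platform"
    if hw_location == "@local":
        return "local ORCA accelerator"
    # assumption: any other value is the address of an AI server
    return f"AI server at {hw_location}"


# "192.168.0.10" is a placeholder address, not a real server
for option in ("@cloud", "@local", "192.168.0.10"):
    print(f"{option!r:16} -> {describe_hw_location(option)}")
```

The sketch only labels the value; it does not validate that the address is reachable or that a server is running there.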

When running this notebook locally, you need to specify your cloud API access token in the [env.ini](../../env.ini) file, located in the same directory as this notebook.

When running this notebook in Google Colab, the cloud API access token should be stored in a user secret named `DEGIRUM_CLOUD_TOKEN`.

In [None]:
# make sure degirum-tools package is installed
!pip show degirum-tools || pip install degirum-tools

#### Specify test options here

In [None]:
# hw_location: where you want to run inference
#     "@cloud" to use DeGirum cloud
#     "@local" to run on local machine
#     IP address for AI server inference
# model_zoo_url: URL/path for model zoo
#     cloud_zoo_url: valid for @cloud, @local, and AI server inference options
#     '': AI server serving models from local folder
#     path to JSON file: single model zoo in case of @local inference
# iterations: iterations to run for each model
# device_type: runtime/device family of models to profile
# model_family: family of models to profile
hw_location = "@cloud"
model_zoo_url = "degirum/public"
iterations = 10  # how many iterations to run for each model
device_type = "N2X/ORCA1"  # models of which device family to use
model_family = "yolo"

#### The rest of the cells below should run without any modifications

In [None]:
import degirum as dg
import degirum_tools

# cloud API access token (read once and reused for all calls below)
token = degirum_tools.get_token()

# list of models to test
model_names = dg.list_models(
    inference_host_address=hw_location,
    zoo_url=model_zoo_url,
    token=token,
    device_type=device_type,
    model_family=model_family,
)

# in test mode, run only 2 iterations per model
num_iterations = 2 if degirum_tools.get_test_mode() else iterations

# run batch predict for each model and record time measurements
results = {}
prog = degirum_tools.Progress(len(model_names), speed_units="models/s")
for model_name in model_names:
    try:
        model = dg.load_model(
            model_name=model_name,
            inference_host_address=hw_location,
            zoo_url=model_zoo_url,
            token=token,
        )
        results[model_name] = degirum_tools.model_time_profile(model, num_iterations)
    except NotImplementedError:
        pass  # skip models for which time profiling is not supported
    prog.step()
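
To make the idea of a timed run concrete, here is a generic sketch of measuring observed FPS over a fixed number of iterations. It is not the implementation of `degirum_tools.model_time_profile`; the helper `time_profile` and the `fake_inference` workload are hypothetical stand-ins used only for illustration.

```python
import time


def time_profile(run_once, iterations):
    """Call run_once() `iterations` times; return (elapsed_seconds, observed_fps)."""
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return elapsed, iterations / elapsed


def fake_inference():
    # stand-in workload; a real test would run one model inference here
    sum(i * i for i in range(10_000))


elapsed, fps = time_profile(fake_inference, iterations=10)
print(f"elapsed: {elapsed:.4f} s, observed FPS: {fps:.1f}")
```

Observed FPS in this sketch is simply the iteration count divided by wall-clock time, so it includes every overhead incurred inside the loop.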

In [None]:
# print results
CW = (62, 19, 16, 16)  # column widths
header = f"{'Model name':{CW[0]}}| {'Postprocess Type':{CW[1]}} | {'Observed FPS':{CW[2]}} | {'Max Possible FPS':{CW[3]}} |"

print(f"Models    : {len(model_names)}")
print(f"Iterations: {iterations}\n")
print(f"{'-'*len(header)}")
print(header)
print(f"{'-'*len(header)}")

for model_name, result in results.items():
    print(
        f"{model_name:{CW[0]}}|"
        + f" {result.parameters.OutputPostprocessType:{CW[1]}} |"
        + f" {result.observed_fps:{CW[2]}.1f} |"
        + f" {result.max_possible_fps:{CW[3]}.1f} |"
    )
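
Once the table is printed, it can be handy to rank models and see how close each one gets to its maximum possible FPS. The sketch below does this on made-up numbers: `sample_results` uses `SimpleNamespace` objects that only mimic the `observed_fps` and `max_possible_fps` attributes read in the cell above, and `rank_by_observed_fps` is a hypothetical helper. It assumes `max_possible_fps` is positive.

```python
from types import SimpleNamespace

# made-up numbers standing in for the `results` dictionary built above
sample_results = {
    "model_a": SimpleNamespace(observed_fps=80.0, max_possible_fps=100.0),
    "model_b": SimpleNamespace(observed_fps=45.0, max_possible_fps=90.0),
    "model_c": SimpleNamespace(observed_fps=120.0, max_possible_fps=125.0),
}


def rank_by_observed_fps(results):
    """Return (name, observed_fps, observed/max ratio) tuples, fastest first."""
    rows = [
        (name, r.observed_fps, r.observed_fps / r.max_possible_fps)
        for name, r in results.items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_observed_fps(sample_results)
for name, fps, ratio in ranked:
    print(f"{name:10} {fps:8.1f} FPS  {ratio:6.1%} of max")
```

To apply the same ranking to real measurements, pass the notebook's `results` dictionary instead of `sample_results`.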