## Model Evaluation & Export

This notebook evaluates a pretrained model and demonstrates how to export it to TensorRT for NVIDIA GPUs, OpenVINO for Intel CPUs, and ONNX, a portable format supported by many frameworks and devices.

### 1. Load a pretrained model

Choose a model size (tiny, small, nano, medium, big, large, or xlarge) according to your hardware capabilities.

In [2]:
from ednet import EDNet

In [14]:
model = EDNet('pretrained/xlarge.pt')
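
The cells in this notebook load checkpoints from `pretrained/xlarge.pt`, `pretrained/nano.pt`, and `pretrained/tiny.pt`. A minimal sketch of a size-to-path helper follows, assuming the remaining sizes use the same `pretrained/<size>.pt` naming. `SIZES` and `checkpoint_path` are hypothetical names for illustration and are not part of the `ednet` package.

```python
from pathlib import PurePosixPath

# Size names listed above. Only xlarge, nano, and tiny appear in this notebook;
# the paths for the other sizes are an assumption based on the same pattern.
SIZES = ("tiny", "small", "nano", "medium", "big", "large", "xlarge")


def checkpoint_path(size: str, root: str = "pretrained") -> str:
    """Return the checkpoint path for a model size, e.g. 'pretrained/xlarge.pt'."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}; expected one of {SIZES}")
    return str(PurePosixPath(root) / f"{size}.pt")


print(checkpoint_path("xlarge"))
```

The returned string can then be passed to the constructor, for example `EDNet(checkpoint_path('nano'))`.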

### 2. Evaluate the raw performance

In [15]:
results = model.val(data='visdrone-det.yaml', split='val', project='results/xlarge')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CUDA:0 (NVIDIA A100 80GB PCIe MIG 7g.80gb, 81038MiB)
ednet-x summary: 638 layers, 48,734,752 parameters, 0 gradients, 270.4 GFLOPs

val: Scanning /home/song/AOIUNO/datasets/VisDrone/VisDrone2019-DET-val/labels.cache... 548 images, 0 backgrounds, 0 corrupt: 100%|██████████| 548/548 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 35/35 [00:30<00:00,  1.15it/s]


                   all        548      38759      0.583      0.475      0.502      0.314
            pedestrian        520       8844      0.661      0.517      0.581      0.293
                people        482       5125      0.607      0.427      0.468      0.204
               bicycle        364       1287      0.353      0.265       0.25      0.119
                   car        515      14064      0.786      0.832      0.865      0.638
                   van        421       1975      0.596      0.515      0.538      0.393
                 truck        266        750      0.591      0.407      0.456      0.313
              tricycle        337       1045      0.522      0.376      0.379      0.222
       awning-tricycle        220        532      0.324       0.19        0.2      0.128
                   bus        131        251      0.773      0.637      0.692      0.533
                 motor        485       4886      0.612      0.581      0.596      0.299
Speed: 0.9ms preproce
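
Each row of the table above lists a class name, the number of images and instances, and then precision, recall, mAP50, and mAP50-95. To work with these numbers programmatically, the rows can be parsed from the printed text. This is a minimal sketch that reads the log output only; `parse_row` is a hypothetical helper, and the three sample rows are copied from the table.

```python
def parse_row(line: str) -> dict:
    """Parse one table row: class, images, instances, P, R, mAP50, mAP50-95."""
    name, images, instances, p, r, map50, map50_95 = line.split()
    return {
        "class": name,
        "images": int(images),
        "instances": int(instances),
        "precision": float(p),
        "recall": float(r),
        "mAP50": float(map50),
        "mAP50-95": float(map50_95),
    }


# Sample rows copied from the validation output above.
rows = [
    "all        548      38759      0.583      0.475      0.502      0.314",
    "car        515      14064      0.786      0.832      0.865      0.638",
    "awning-tricycle        220        532      0.324       0.19        0.2      0.128",
]
records = [parse_row(row) for row in rows]

# Skip the aggregate "all" row when looking for the best individual class.
best = max(records[1:], key=lambda rec: rec["mAP50-95"])
print(best["class"], best["mAP50-95"])
```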

### 3. Export the model - TensorRT

The example GPU used here is an NVIDIA A100 80GB PCIe.

In [4]:
model.export(format='engine')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CUDA:0 (NVIDIA A100 80GB PCIe MIG 7g.80gb, 81038MiB)
ednet-x summary: 638 layers, 48,734,752 parameters, 0 gradients, 270.4 GFLOPs

PyTorch: starting from 'pretrained/xlarge.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 300, 6) (93.9 MB)

ONNX: starting export with onnx 1.16.2 opset 17...
verbose: False, log level: Level.ERROR

ONNX: export success ✅ 4.5s, saved as 'pretrained/xlarge.onnx' (180.6 MB)

TensorRT: starting export with TensorRT 10.2.0.post1...
[08/24/2024-16:37:53] [TRT] [I] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 1411, GPU 1669 (MiB)
[08/24/2024-16:37:54] [TRT] [I] [MemUsageChange] Init builder kernel library: CPU +1930, GPU +352, now: CPU 3341, GPU 2021 (MiB)
[08/24/2024-16:37:54] [TRT] [I] ----------------------------------------------------------------
[08/24/2024-16:37:54] [TRT] [I] Input filename:   pretrained/xlarge.onnx
[08/24/2024-16:37:54

'pretrained/xlarge.engine'

In [18]:
model_rt = EDNet('pretrained/xlarge.engine', task='detect')

In [19]:
results = model_rt.val(data='visdrone-det.yaml', split='val', project='results/xlarge')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CUDA:0 (NVIDIA A100 80GB PCIe MIG 7g.80gb, 81038MiB)
Loading pretrained/xlarge.engine for TensorRT inference...
[08/24/2024-16:56:06] [TRT] [I] Loaded engine size: 186 MiB
[08/24/2024-16:56:06] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +1, GPU +135, now: CPU 1, GPU 315 (MiB)

val: Scanning /home/song/AOIUNO/datasets/VisDrone/VisDrone2019-DET-val/labels.cache... 548 images, 0 backgrounds, 0 corrupt: 100%|██████████| 548/548 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 548/548 [00:13<00:00, 39.68it/s]


                   all        548      38759       0.59      0.471      0.502      0.314
            pedestrian        520       8844      0.673      0.508       0.58      0.292
                people        482       5125      0.611      0.426      0.468      0.203
               bicycle        364       1287      0.366      0.261      0.251       0.12
                   car        515      14064      0.788      0.831      0.863      0.637
                   van        421       1975      0.595      0.511      0.539      0.393
                 truck        266        750      0.612      0.404      0.457      0.314
              tricycle        337       1045      0.529      0.371      0.377      0.222
       awning-tricycle        220        532       0.32      0.186      0.197      0.126
                   bus        131        251       0.79      0.633      0.692      0.535
                 motor        485       4886      0.613      0.582      0.594      0.298
Speed: 0.5ms preproce
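
The TensorRT engine reproduces the accuracy of the original PyTorch checkpoint closely: the overall mAP50-95 is 0.314 in both runs. The sketch below makes the per-class comparison explicit. The values are copied by hand from the two validation tables in this notebook, and the dictionary names are illustrative.

```python
# mAP50-95 per class, copied from the two validation tables above.
pytorch = {
    "all": 0.314, "pedestrian": 0.293, "people": 0.204, "bicycle": 0.119,
    "car": 0.638, "van": 0.393, "truck": 0.313, "tricycle": 0.222,
    "awning-tricycle": 0.128, "bus": 0.533, "motor": 0.299,
}
tensorrt = {
    "all": 0.314, "pedestrian": 0.292, "people": 0.203, "bicycle": 0.120,
    "car": 0.637, "van": 0.393, "truck": 0.314, "tricycle": 0.222,
    "awning-tricycle": 0.126, "bus": 0.535, "motor": 0.298,
}

# Difference TensorRT minus PyTorch, rounded to the precision of the tables.
deltas = {name: round(tensorrt[name] - pytorch[name], 3) for name in pytorch}
worst = max(deltas, key=lambda name: abs(deltas[name]))

for name, delta in deltas.items():
    print(f"{name:>16s} {delta:+.3f}")
print("largest absolute change:", worst, deltas[worst])
```

No class moves by more than 0.002 mAP50-95 in either direction.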

### 4. Export the model for CPU Inference: Intel

The example CPU used here is an Intel Xeon Gold 6330.

In [27]:
model = EDNet('pretrained/nano.pt')
model.export(format='openvino')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CPU (Intel Xeon Gold 6330 2.00GHz)
ednet-n summary (fused): 394 layers, 2,871,712 parameters, 0 gradients, 15.2 GFLOPs

PyTorch: starting from 'pretrained/nano.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 300, 6) (6.0 MB)

OpenVINO: starting export with openvino 2024.3.0-16041-1e3b88e4e3f-releases/2024/3...
OpenVINO: export success ✅ 22.1s, saved as 'pretrained/nano_openvino_model/' (9.9 MB)

Export complete (24.2s)
Results saved to /home/song/UAV/pretrained
Predict:         yolo predict task=detect model=pretrained/nano_openvino_model imgsz=640  
Validate:        yolo val task=detect model=pretrained/nano_openvino_model imgsz=640 data=visdrone-det.yaml  
Visualize:       https://netron.app


'pretrained/nano_openvino_model'

In [29]:
model_intel = EDNet('pretrained/nano_openvino_model', task='detect')
results = model_intel.val(data='visdrone-det.yaml', split='val', device='cpu', project='results/nano')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CPU (Intel Xeon Gold 6330 2.00GHz)
Loading pretrained/nano_openvino_model for OpenVINO inference...
Using OpenVINO LATENCY mode for batch=1 inference...
Setting batch=1 input of shape (1, 3, 640, 640)

val: Scanning /home/song/AOIUNO/datasets/VisDrone/VisDrone2019-DET-val/labels.cache... 548 images, 0 backgrounds, 0 corrupt: 100%|██████████| 548/548 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 548/548 [01:02<00:00,  8.73it/s]


                   all        548      38759      0.449      0.344      0.341      0.199
            pedestrian        520       8844      0.499      0.379      0.403      0.184
                people        482       5125      0.446       0.29      0.295      0.118
               bicycle        364       1287      0.241      0.111     0.0908     0.0367
                   car        515      14064      0.654      0.778      0.786      0.544
                   van        421       1975      0.462      0.381      0.382      0.264
                 truck        266        750      0.431      0.258      0.251      0.156
              tricycle        337       1045       0.38      0.206      0.201      0.109
       awning-tricycle        220        532      0.246      0.148       0.11     0.0723
                   bus        131        251      0.648      0.462      0.489      0.327
                 motor        485       4886      0.484      0.424      0.402       0.18
Speed: 0.5ms preproce

### 5. Export the model for CPU Inference: ARM

The example CPU used here is an Apple M1 (ARMv8, Firestorm performance cores).

In [3]:
model = EDNet('pretrained/tiny.pt')
model.export(format='onnx')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CPU (Apple M1)
ednet-t summary (fused): 366 layers, 1,781,088 parameters, 0 gradients, 14.0 GFLOPs

PyTorch: starting from 'pretrained/tiny.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 300, 6) (3.9 MB)

ONNX: starting export with onnx 1.16.2 opset 17...
verbose: False, log level: Level.ERROR

ONNX: export success ✅ 1.3s, saved as 'pretrained/tiny.onnx' (6.0 MB)

Export complete (2.0s)
Results saved to /Users/zhifansong/Desktop/EdgeDroneNet/UAV/pretrained
Predict:         yolo predict task=detect model=pretrained/tiny.onnx imgsz=640  
Validate:        yolo val task=detect model=pretrained/tiny.onnx imgsz=640 data=visdrone-det.yaml  
Visualize:       https://netron.app


'pretrained/tiny.onnx'
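
The export log reports an input of shape (1, 3, 640, 640) and an output of shape (1, 300, 6), which suggests up to 300 candidate detections per image with six values each. The sketch below filters such an output by confidence. It assumes each row is laid out as `[x1, y1, x2, y2, score, class_id]`, which is common for detectors with this output shape but is not confirmed by this notebook. `filter_detections` and the 0.25 threshold are illustrative, and the input here is dummy data rather than real model output.

```python
from typing import List, Tuple

Detection = Tuple[float, float, float, float, float, int]


def filter_detections(output: List[List[List[float]]], conf: float = 0.25) -> List[Detection]:
    """Keep rows of the first image whose score is at least `conf`.

    Assumes each row is [x1, y1, x2, y2, score, class_id] (an assumption).
    """
    kept = []
    for x1, y1, x2, y2, score, class_id in output[0]:
        if score >= conf:
            kept.append((x1, y1, x2, y2, score, int(class_id)))
    # Highest-scoring detections first.
    return sorted(kept, key=lambda det: det[4], reverse=True)


# Dummy output with the exported shape (1, 300, 6): two scored rows, the rest zeros.
dummy = [[[0.0] * 6 for _ in range(300)]]
dummy[0][0] = [5.0, 5.0, 15.0, 25.0, 0.4, 0.0]
dummy[0][1] = [10.0, 20.0, 50.0, 80.0, 0.9, 3.0]

for det in filter_detections(dummy):
    print(det)
```

When the model is loaded through `EDNet(...)` as in the next cell, this step should not be needed; it only matters if the ONNX file is run with another runtime directly.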

In [4]:
model_arm = EDNet('pretrained/tiny.onnx', task='detect')
results = model_arm.val(data='visdrone-det.yaml', split='val', project='results/tiny')

EDNet 1.0 ✅ Python-3.9.19 ✅ torch-2.0.1 ✅CPU (Apple M1)
Loading pretrained/tiny.onnx for ONNX Runtime inference...
Setting batch=1 input of shape (1, 3, 640, 640)

val: Scanning /Users/zhifansong/Desktop/EdgeDroneNet/EdgeDroneNet/datasets/VisDrone/VisDrone2019-DET-val/labels.cache... 548 images, 0 backgrounds, 0 corrupt: 100%|██████████| 548/548 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 548/548 [00:39<00:00, 13.91it/s]


                   all        548      38759      0.423      0.341      0.332      0.195
            pedestrian        520       8844       0.48      0.369      0.391      0.179
                people        482       5125      0.437      0.295      0.293      0.116
               bicycle        364       1287      0.229      0.127     0.0982     0.0426
                   car        515      14064      0.635      0.782      0.783      0.541
                   van        421       1975      0.459      0.363      0.373      0.259
                 truck        266        750       0.37       0.26      0.233      0.149
              tricycle        337       1045      0.383      0.202      0.201       0.11
       awning-tricycle        220        532       0.23      0.128      0.108     0.0681
                   bus        131        251      0.539       0.45       0.43        0.3
                 motor        485       4886      0.469       0.43      0.405      0.182
Speed: 0.6ms preproce
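
The validation progress bars give a rough throughput for the three exported models, each processed one image per iteration: 39.68 it/s (xlarge, TensorRT, A100), 8.73 it/s (nano, OpenVINO, Xeon Gold 6330), and 13.91 it/s (tiny, ONNX Runtime, Apple M1). The sketch below converts these rates to milliseconds per image. They are wall-clock rates for the whole validation loop, and they compare different model sizes on different hardware, so treat them as indicative only.

```python
# Iterations per second read from the validation progress bars above
# (one image per iteration in all three runs).
throughput = {
    "xlarge / TensorRT / A100": 39.68,
    "nano / OpenVINO / Xeon Gold 6330": 8.73,
    "tiny / ONNX Runtime / Apple M1": 13.91,
}


def ms_per_image(its_per_second: float) -> float:
    """Convert iterations per second (batch size 1) to milliseconds per image."""
    return 1000.0 / its_per_second


for name, rate in throughput.items():
    print(f"{name}: {ms_per_image(rate):.1f} ms/image")
```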