### ADLS Proj: TensorRT with MASE for Multiple Precision Inference

This notebook demonstrates the integration of TensorRT passes into MASE as part of the MASERT framework.

Currently, our experiments are conducted on RTX 4060 and RTX 3070 GPUs, as our request for A100 access is still pending.

### Objective
Our goal is to plot trade-off curves showing how model accuracy and inference runtime vary with:
- **GPU Type** (e.g., RTX 4060, RTX 3070, and A100 when available)
- **Dataset** (e.g., CIFAR-10)
- **Model Type** (e.g., ResNet18, ResNet50, VGG, AlexNet)
- **Precision** (FP32, FP16, INT8)

At this stage, we have successfully implemented inference using multiple models, such as **ResNet18 and ResNet50**, on the **CIFAR-10 dataset**. Further experiments will explore the precision-runtime trade-off across different GPU architectures.
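As a rough sketch of how the measurements behind such trade-off curves could be organised, the snippet below groups results into one curve per (GPU, dataset, model) and orders each curve by latency. The `Measurement` record and the `tradeoff_curves` helper are hypothetical names introduced here for illustration; they are not part of MASE or TensorRT, and the demo values are placeholders, not results from our experiments.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One benchmark result (hypothetical record layout)."""
    gpu: str
    dataset: str
    model: str
    precision: str     # "FP32", "FP16", or "INT8"
    accuracy: float    # top-1 accuracy in [0, 1]
    latency_ms: float  # mean inference latency per batch


def tradeoff_curves(measurements):
    """Group results into one curve per (gpu, dataset, model), sorted by latency."""
    curves = defaultdict(list)
    for m in measurements:
        curves[(m.gpu, m.dataset, m.model)].append(m)
    return {
        key: sorted(points, key=lambda m: m.latency_ms)
        for key, points in curves.items()
    }


# Placeholder numbers for illustration only; these are NOT measured results.
demo = [
    Measurement("RTX 4060", "CIFAR-10", "ResNet18", "FP32", 0.90, 3.0),
    Measurement("RTX 4060", "CIFAR-10", "ResNet18", "INT8", 0.88, 1.0),
    Measurement("RTX 4060", "CIFAR-10", "ResNet18", "FP16", 0.90, 2.0),
]

curves = tradeoff_curves(demo)
for key, points in curves.items():
    print(key, [(p.precision, p.latency_ms, p.accuracy) for p in points])
```

Each sorted list can then be passed to any plotting library, with latency on one axis and accuracy on the other.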


### Training the Model for Quantization Experiments

In this section, we train an unquantized model of the target architecture. The trained model later serves as the baseline for the precision experiments (FP32, FP16, and INT8), which evaluate the trade-off between model accuracy and runtime across different GPU architectures.
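To give a feel for what the lower precisions cost numerically, here is a minimal, standard-library-only sketch that round-trips a few values through half precision and through a simple symmetric per-tensor INT8 scheme. It is an illustration under stated assumptions, not the MASERT or TensorRT implementation: the helper names are hypothetical, the input values are arbitrary, and TensorRT's own INT8 calibration may choose scales differently.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values):
    """Symmetric per-tensor INT8 quantization; returns (integer codes, scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric range [-127, 127].
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


weights = [0.1, -0.5, 0.25, 1.0]  # arbitrary illustrative values, not model weights
fp16 = [to_fp16(w) for w in weights]
codes, scale = quantize_int8(weights)
int8 = dequantize_int8(codes, scale)

for w, h, q in zip(weights, fp16, int8):
    print(f"ref={w:+.6f}  fp16 err={abs(w - h):.2e}  int8 err={abs(w - q):.2e}")
```

The INT8 round-trip error is bounded by half the scale, which is why the choice of scale (and hence the calibration data) matters for accuracy.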

#### Running the Training Script

To train the model, execute the following command:

In [3]:
```python
!python3 ./ch train --config /workspace/ADLS_Proj/docs/tutorials/tensorrt/resnet18_INT8_quantization_by_type.toml
```

```text
INFO: Seed set to 0
I0311 00:15:51.512130 140276942296128 seed.py:57] Seed set to 0
+-------------------------+--------------------------+--------------+-----------------+--------------------------+
| Name                    |         Default          | Config. File | Manual Override |        Effective         |
+-------------------------+--------------------------+--------------+-----------------+--------------------------+
| task                    |      classification      |     cls      |                 |           cls            |
| load_name               |           None           |              |                 |           None           |
| load_type               |            mz            |              |                 |            mz            |
| batch_size              |           128            |      64      |                 |            64            |
| to_debug                |          False           |              |
```

### Explore Sparsity

In [29]:
```python
FP16_SPARSITY_BY_TYPE_TOML = "/workspace/ADLS_Proj/docs/tutorials/proj/resnet18_FP16_spar.toml"
RES_CHECKPOINT_PATH = "/workspace/ADLS_Proj/mase_output/resnet18_cls_cifar10_2025-03-08/software/training_ckpts/best.ckpt"
!python ch transform --config {FP16_SPARSITY_BY_TYPE_TOML} --load {RES_CHECKPOINT_PATH} --load-type pl
```

```text
INFO: Seed set to 0
I0319 01:41:30.468708 140167305331776 seed.py:57] Seed set to 0
+-------------------------+--------------------------+----------------------+--------------------------+--------------------------+
| Name                    |         Default          |     Config. File     |     Manual Override      |        Effective         |
+-------------------------+--------------------------+----------------------+--------------------------+--------------------------+
| task                    |      classification      |         cls          |                          |           cls            |
| load_name               |           None           |                      | /workspace/ADLS_Proj/mas | /workspace/ADLS_Proj/mas |
|                         |                          |                      | e_output/resnet18_cls_ci | e_output/resnet18_cls_ci |
|                         |                          |                      | far10_2025-03-08/sof
```