# Integrating Machine Learning into Simulations

## About

Automatic differentiation connects simulations and machine learning by embedding simulations, augmented with gradient information, into machine learning frameworks. The opposite direction also exists: embedding (trained) machine learning models into simulations. In this notebook we go through the different approaches for embedding machine learning models into our simulations, broken down by programming language (C, C++, Fortran). For this task in particular there is a wide variety of tools available, each with its own constraints and design philosophy.

## Outline

* [1. Integration into C](#export-into-c)
  * [1.1 Export with TensorFlow Lite](#tf-lite)
  * [1.2 Export with TVM's C Runtime](#_tvm)
  * [1.3 Export with ONNX](#_onnx)
  * [1.4 Export with IREE](#_iree)
* [2. Integration into C++](#integrate-cpp)
  * [2.1 Export with TorchScript](#torchscript)
  * [2.2 Export with TVM](#tvm-cpp)
  * [2.3 Export with ONNX](#onnx-cpp)
  * [2.4 Export with IREE](#iree-cpp)
  * [2.5 Export with TensorFlow Lite](#tflite-cpp)
  * [2.6 Export with AOT-compiled XLA](#xla-cpp)
* [3. Integration into Fortran](#export-into-fortran)
  * [3.1 Export with IREE](#iree-fort)
    * [3.1.1 Emit C Code](#iree-fort-c)
    * [3.1.2 Compile a Static Library](#iree-fort-static)

  > The options for export into Fortran are largely the same as for C, so we only present one example here.

## 1. Integration into C <a name="export-into-c"></a>

For the integration of a trained machine learning model into a C based simulation, we use PyTorch as our running example. We first sketch the JAX/TensorFlow path, in which a trained JAX model is converted to a TensorFlow model and then exported to C with TensorFlow Lite, before turning to TVM, ONNX, and IREE.

![](https://i.imgur.com/YZ0xFz0.png)

For the sake of this tutorial, we will henceforth consider the following pretrained ResNet-50 PyTorch model to include in our simulation:

```python
from torchvision.models import resnet50, ResNet50_Weights

# Initializing the model with the pre-trained weights, and setting it into evaluation mode
pretrained_weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=pretrained_weights)
model.eval()
```

With this we can now consider how to include said model in our simulations.

> An aspect we are (consciously) glossing over is the data structure exchange between different frameworks. For this we point the reader to the [DLPack](https://dmlc.github.io/dlpack/latest/) documentation and Python's [Array API specification](https://data-apis.org/array-api/2022.12/).

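As a small illustration of the exchange mechanism, the sketch below passes an array through the DLPack protocol without copying it. It assumes NumPy 1.23 or newer, which provides `np.from_dlpack`. NumPy plays both producer and consumer here purely to keep the example light on dependencies; in practice the producer would be a framework tensor. The variable names are illustrative.

```python
import numpy as np

# Producer: any object implementing __dlpack__ and __dlpack_device__,
# here a NumPy array standing in for a framework tensor.
field = np.arange(6, dtype=np.float32).reshape(2, 3)

# Consumer: import the buffer through the DLPack protocol (no copy is made).
view = np.from_dlpack(field)

# Both names refer to the same memory, so a write through one is visible in the other.
field[0, 0] = 42.0
print(np.shares_memory(field, view), view[0, 0])
```

The same pattern applies across frameworks, which is what makes DLPack useful when handing simulation data to a model without an intermediate copy.
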
### 1.1 Export with [TensorFlow Lite](https://www.tensorflow.org/lite) <a name="tf-lite"></a>

To export a trained machine learning model from JAX or TensorFlow into a simulation framework, you have multiple options. Coming from JAX, most of these begin with converting the JAX model into a TensorFlow model; the exception is IREE, where one can export straight from JAX using the [IREE Runtime](#_iree).

1. Write JAX model & train it
2. Export to TensorFlow with `jax2tf` [link](https://github.com/google/jax/blob/main/jax/experimental/jax2tf/jax2tf.py)
3. Utilize TensorFlow's infrastructure to include your model

While the model can be included through the cross-framework infrastructure afforded by TVM, ONNX, and IREE, the JAX/TensorFlow ecosystem also has its own [model export to C](https://www.tensorflow.org/lite/guide/inference#load_and_run_a_model_in_objective-c) infrastructure in the form of [TensorFlow Lite](https://www.tensorflow.org/lite).

In the interest of focussing on PyTorch as our main machine learning framework for this tutorial, we are omitting a TensorFlow Lite example. If you are interested in exporting TensorFlow models into your C based simulation we would encourage you to take a look at the following selected TensorFlow Lite examples, which serve as good illustrations for the workflow and required project structure:

* [Image Classifier in C](https://github.com/tensorflow/tflite-support/blob/master/tensorflow_lite_support/ios/task/vision/sources/TFLImageClassifier.h)

### 1.2 Export with [TVM](https://tvm.apache.org) <a name="_tvm"></a>


TVM ingests the graph representation emitted by PyTorch's JIT (or by other frameworks) and transforms it into its own internal graph representation called `Relay`. To run the compiled model through the TVM runtime library on our target device, we first need to

* [Build the TVM runtime library](https://tvm.apache.org/docs/how_to/deploy/index.html#build-the-tvm-runtime-library)

after which we can integrate the runtime API into our build system, and with it the machine learning model exported with TVM into our simulation:

* [Integrate TVM into Your Project](https://tvm.apache.org/docs/how_to/deploy/integrate.html)

For the integration at the C level we use the [C_Runtime_API](https://github.com/apache/tvm/blob/main/src/runtime/c_runtime_api.cc#L262).

### 1.3 Export with [ONNX](https://onnxruntime.ai) <a name="_onnx"></a>

ONNX (Open Neural Network Exchange) is one of the main standards for exchanging models between frameworks and one of the main promoters of interoperability between them. To work with ONNX we first need to export the model from PyTorch into an ONNX model

```python
import torch

# Example input; its shape fixes the exported input shape (the batch axis is marked dynamic below)
x = torch.randn(1, 3, 224, 224)

# Export the model
torch.onnx.export(model,                     # model being run
                  x,                         # model input (or a tuple for multiple inputs)
                  "resnet50.onnx",           # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX opset version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  dynamic_axes={'input': {0: 'batch_size'},    # variable length axes
                                'output': {0: 'batch_size'}})
```

after which we use the [ONNX Runtime](https://onnxruntime.ai) to integrate the machine learning model into our simulation. Including an ONNX model follows a pattern similar to the other frameworks. In current ONNX Runtime releases the C functions are reached through the `OrtApi` struct obtained from `OrtGetApiBase()->GetApi(ORT_API_VERSION)`; older releases exposed them as free functions with an `Ort` prefix (e.g. `OrtCreateEnv`):

1. Include `onnxruntime_c_api.h`
2. Create the environment: `CreateEnv`
3. Create a session: `CreateSession`
4. Create a tensor: `CreateMemoryInfo` (or `CreateCpuMemoryInfo`) & `CreateTensorWithDataAsOrtValue`
5. Run the model: `Run`

A good example to study is [FNS Candy](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/fns_candy_style_transfer) from the ONNX Runtime example directory, together with the [ONNX Runtime C API Documentation](https://onnxruntime.ai/docs/get-started/with-c.html).

### 1.4 Export with [IREE](https://iree-org.github.io/iree/) <a name="_iree"></a>

To fully leverage IREE's C API we have to begin by building its runtime from source to then utilize its components inside of our simulation.

* [Building from Source - Getting Started](https://openxla.github.io/iree/building-from-source/getting-started/)

IREE uses its own minimal virtual machine (VM) at runtime, with more complex logic being dispatched to the chosen hardware backend through a hardware abstraction layer (HAL) interface. As simulation users seeking to embed machine learning models into our simulations, we will mostly interact with either the virtual machine or the hardware abstraction layer.

> If we want to embed native C code into our simulation and avoid the IREE runtime dependency, we can lower further down to C.

The involved steps to take a pretrained PyTorch model, and integrate it into our simulation code are the following:

1. Compilation with Torch-MLIR
2. Compilation to IREE's vmfb format
3. Register the components with IREE
4. Create an instance of the Virtual Machine
5. Load the pretrained model
6. Create the VM context

1. Compile the pre-trained model with [Torch-MLIR](https://github.com/llvm/torch-mlir) into the linear algebra dialect of [MLIR](https://mlir.llvm.org) as a first step.

```python
# X_test: example input batch; the pretrained weights are already part of `model`
inference_args = (X_test,)
graph = functorch.make_fx(model)(*inference_args)
strip_overloads(graph)
linalg_on_tensors_mlir = torch_mlir.compile(
    graph,
    inference_args,
    output_type="linalg-on-tensors")
```

2. Compile the generated linear algebra dialect into IREE's own `vmfb` format, which is then usable by the C API.

> We can accumulate a library of precompiled machine learning models which we can then include in our simulations that way. 

```python
iree_torch.compile_to_vmfb(
        linalg_on_tensors_mlir, args.iree_backend)
```

```c
// Including the headers
#include "iree/base/api.h"
#include "iree/hal/api.h"
#include "iree/vm/api.h"

// Include VM bytecode and HAL modules + HAL drivers in use
#include "iree/modules/hal/module.h"
#include "iree/hal/drivers/local_task/registration/driver_module.h"
#include "iree/vm/bytecode_module.h"
```

3. Register the individual components

```c
iree_hal_local_task_driver_module_register(
    iree_hal_driver_registry_default());
```

4. Create a VM instance

```c
iree_vm_instance_t* instance = NULL;
iree_vm_instance_create(iree_allocator_system(), &instance);

// Modules with custom types must be statically registered before use.
iree_hal_module_register_all_types(instance);

// Using the CPU driver
iree_hal_driver_t* driver = NULL;
iree_hal_driver_registry_try_create(...);

// Using the default device here
iree_hal_device_t* device = NULL;
iree_hal_driver_create_default_device(...);

// Create a HAL module for the newly created VM
iree_vm_module_t* hal_module = NULL;
iree_hal_module_create(...);

// Releasing reference to the driver
iree_hal_driver_release(driver);
```

5. Load the pretrained PyTorch model as a `.vmfb` bytecode module, which we generated above

```c
iree_vm_module_t* bytecode_module = NULL;
iree_vm_bytecode_module_create(
    instance,
    (iree_const_byte_span_t){module_data, module_size},
    /*flatbuffer_allocator=*/iree_allocator_null(),
    /*allocator=*/iree_allocator_system(), &bytecode_module);
```

6. Create the VM context

```c
iree_vm_context_t* context = NULL;
iree_vm_module_t* modules[2] = {hal_module, bytecode_module};
iree_vm_context_create_with_modules(
    instance, IREE_VM_CONTEXT_FLAG_NONE,
    IREE_ARRAYSIZE(modules), modules,
    iree_allocator_system(), &context);

// Release references to the modules
iree_vm_module_release(hal_module);
iree_vm_module_release(bytecode_module);
```

Look up the function to call

```c
iree_vm_function_t main_function;
iree_vm_context_resolve_function(
    context, iree_string_view_literal("module.main_function"), &main_function);
```

After which we can finally invoke our pretrained machine learning model from within our simulation model

```c
// (Application-specific I/O buffer setup, making data available to the device)
iree_vm_invoke(context, main_function, IREE_VM_INVOCATION_FLAG_NONE,
               /*policy=*/NULL, inputs, outputs,
               iree_allocator_system());
```

> We can decorate every call here with `IREE_CHECK_OK` to catch (silent) errors while we are debugging the workflow.

For an end-to-end example in C with CMake, please take a look at the following MNIST example for IREE:

* [IREE Samples: Vision Inference Example](https://github.com/iree-org/iree-samples/tree/main/cpp/vision_inference)

For a minimal example of what an integration of an IREE layer in a simulation code would look like with inputs, and outputs from the VM we recommend the simple embedding example:

* [IREE Samples: Simple Embedding](https://github.com/openxla/iree/tree/main/samples/simple_embedding)

With the native lowering to C poorly documented at the current time, we recommend taking a look at the [tests](https://github.com/openxla/iree/tree/89b2201f717e16204f550e72cdd54a8901e77fdc/iree/compiler/Dialect/VM/Target/C/test) of the C target, which show that IREE has to be invoked through the command line interface to emit a C module. The syntax for this is

```bash
iree-translate -iree-vm-ir-to-c-module ResNet50.vmfb
```

where we built the `iree-translate` binary when we compiled IREE from source.

## 2. Integration into C++ <a name="integrate-cpp"></a>

For the integration of a trained machine learning model into a C++ based simulation, we begin with the example of PyTorch and show how we can export a model using TorchScript, TVM, and ONNX before looking at the tools available in the TensorFlow/JAX ecosystem.

![](https://i.imgur.com/e7mGvir.png)

For the sake of this tutorial, we will henceforth consider the following pretrained ResNet-50 PyTorch model to include in our simulation:

```python
from torchvision.models import resnet50, ResNet50_Weights

# Initializing the model with the pre-trained weights, and setting it into evaluation mode
pretrained_weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=pretrained_weights)
model.eval()
```

With this we can now consider how to include said model in our simulations.

> An aspect we are (consciously) glossing over is the data structure exchange between different frameworks. For this we point the reader to the [DLPack](https://dmlc.github.io/dlpack/latest/) documentation and Python's [Array API specification](https://data-apis.org/array-api/2022.12/).

### 2.1 Export with [TorchScript](https://pytorch.org/docs/stable/jit.html)  <a name="torchscript"></a>

To apply [TorchScript](https://pytorch.org/docs/stable/jit.html), PyTorch's own utility for exporting models to C++, we begin by tracing our machine learning model

```python
# example_inputs: a tensor (or tuple of tensors) shaped like the model input, e.g. torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example_inputs)
```

`traced_model` is now a `torch.jit.ScriptModule`, which holds the recorded operations as an intermediate representation (a graph). This approach has several advantages:

* TorchScript can be invoked by its own interpreter
* We can save the model, and reload it from storage later on
* We can interface to different backends/devices

> Attention: TorchScript tracing only records the code path **taken** for the example input, not the entire code!

This restriction of tracing can be circumvented by using the _script compiler_, which would be done with

```python
scripted_control = torch.jit.script(ControlFlowCode())

resnet = ResNet50(scripted_control)
scripted_resnet = torch.jit.script(resnet)
```
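
The difference between tracing and scripting can be illustrated without PyTorch. The following toy tracer in plain Python is a sketch of the idea only, not TorchScript's implementation; `Traced`, `trace`, `replay`, and `model` are hypothetical names. It records the arithmetic applied to an example input, and because the comparison is evaluated eagerly on that example, the branch that was not taken never reaches the recording.

```python
class Traced:
    """Proxy value that records every arithmetic operation applied to it."""

    def __init__(self, value, tape):
        self.value = value
        self.tape = tape

    def __add__(self, other):
        self.tape.append(("add", other))
        return Traced(self.value + other, self.tape)

    def __mul__(self, other):
        self.tape.append(("mul", other))
        return Traced(self.value * other, self.tape)

    def __gt__(self, other):
        # Evaluated eagerly on the example value: the branch decision is not recorded.
        return self.value > other


def model(x):
    # Data-dependent control flow, analogous to an `if` inside `forward`.
    if x > 0:
        return x * 2
    return x + 100


def trace(fn, example):
    """Run `fn` once on a proxy and return the list of recorded operations."""
    tape = []
    fn(Traced(example, tape))
    return tape


def replay(tape, x):
    """Apply the recorded operations to a new input, ignoring the original code."""
    for op, arg in tape:
        x = x + arg if op == "add" else x * arg
    return x


tape = trace(model, 3.0)       # traced with a positive example input
eager = model(-1.0)            # original code takes the `x + 100` branch
replayed = replay(tape, -1.0)  # recording only knows the `x * 2` branch
print(tape, eager, replayed)
```

Replaying the recording for a negative input silently follows the positive branch, which is the failure mode the script compiler avoids by compiling the control flow itself.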

The files we generate with this all have the `.pt` format and can be stored or loaded with

```python
scripted_resnet.save('scripted_resnet.pt')
loaded = torch.jit.load('scripted_resnet.pt')
```

which we can then [load into C++](https://pytorch.org/tutorials/advanced/cpp_export.html).

To load the model we then add the following syntax to our C++ code

```cpp
#include <torch/script.h>

#include <iostream>
#include <memory>

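int main(int argc, const char* argv[]) {
  // Expect the path to the exported ScriptModule as the only argument.
  if (argc != 2) {
    std::cerr << "usage: resnet-50 <path-to-exported-script-module>\n";
    return -1;
  }

  torch::jit::script::Module module;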
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "Model loaded\n";
}
```

where `<torch/script.h>` pulls in everything from the LibTorch library that is needed to run the example. We deserialize the module using `torch::jit::load()`, which takes the file path as input and returns a `torch::jit::script::Module` object. The required CMake configuration is then

```cmake
find_package(Torch REQUIRED)

add_executable(resnet-50 resnet-50.cpp)
target_link_libraries(resnet-50 "${TORCH_LIBRARIES}")
set_property(TARGET resnet-50 PROPERTY CXX_STANDARD 14)
```

The only thing missing to run this with CMake is [LibTorch](https://pytorch.org/cppdocs/installing.html), which you can [download](https://pytorch.org/get-started/locally/) from the PyTorch release page. We then build with

```bash
mkdir build && cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release
```

With this we can run our minimal example.

For further exploration of the TorchScript pathway there are multiple resources to consult:

* [Examples Folder](https://github.com/pytorch/pytorch/tree/master/test/custom_operator)
* [TorchScript Reference Documentation](https://pytorch.org/docs/master/jit.html)
* [PyTorch C++ API Documentation](https://pytorch.org/cppdocs/)
* [PyTorch Docs](https://pytorch.org/docs/)

### 2.2 Export with [TVM](https://tvm.apache.org)  <a name="tvm-cpp"></a>

TVM ingests the graph representation emitted by PyTorch's JIT and transforms it into its own internal graph representation called `Relay`. We take over the first few lines of the TorchScript example

```python
scripted_control = torch.jit.script(ControlFlowCode())

resnet = ResNet50(scripted_control)
scripted_resnet = torch.jit.script(resnet)
```

which we can then convert into a Relay graph

```python
from tvm import relay

# `img`: example input array with the model's input shape
input_name = "input0"
shape_list = [(input_name, img.shape)]
mod, params = relay.frontend.from_pytorch(scripted_resnet, shape_list)
```

With this representation we are inside the TVM compiler and can subsequently compile the model for any backend TVM supports, such as a CPU machine in this example. For this, one first compiles with LLVM before utilizing a `graph_executor` to deploy the graph

```python
import tvm

target = tvm.target.Target("llvm", host="llvm")
dev = tvm.cpu(0)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
```

The compiled library is then deployed through the graph executor

```python
from tvm.contrib import graph_executor

dtype = "float32"
m = graph_executor.GraphModule(lib["default"](dev))
m.set_input(input_name, tvm.nd.array(img.astype(dtype)))
m.run()
tvm_output = m.get_output(0)
```

The export into C++ works similarly, and can best be inspected in

* [Deploy a TVM Module using the C++ API](https://tvm.apache.org/docs/how_to/deploy/cpp_deploy.html)
* [The Deployment Guide for C++](https://github.com/apache/tvm/tree/main/apps/howto_deploy)

### 2.3 Export with [ONNX](https://onnxruntime.ai/docs/reference/ort-format-models.html)  <a name="onnx-cpp"></a>

ONNX (Open Neural Network Exchange) is one of the main standards for exchanging models between frameworks and one of the main promoters of interoperability between them. To work with ONNX we first need to export the model from PyTorch into an ONNX model

```python
import torch

# Example input; its shape fixes the exported input shape (the batch axis is marked dynamic below)
x = torch.randn(1, 3, 224, 224)

# Export the model
torch.onnx.export(model,                     # model being run
                  x,                         # model input (or a tuple for multiple inputs)
                  "resnet50.onnx",           # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX opset version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names=['input'],     # the model's input names
                  output_names=['output'],   # the model's output names
                  dynamic_axes={'input': {0: 'batch_size'},    # variable length axes
                                'output': {0: 'batch_size'}})
```

after which we use the [ONNX Runtime](https://onnxruntime.ai) to integrate the machine learning model into our simulation. Including an ONNX model follows a pattern similar to the other frameworks. The C++ header wraps the C API in classes in the `Ort` namespace:

1. Include `onnxruntime_cxx_api.h`
2. Create the environment: `Ort::Env`
3. Create a session: `Ort::Session`
4. Create a tensor: `Ort::MemoryInfo::CreateCpu` & `Ort::Value::CreateTensor`
5. Run the model: `Ort::Session::Run`

A good example to study is [FNS Candy](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/c_cxx/fns_candy_style_transfer) from the ONNX Runtime example directory.

### 2.4 Export with [IREE](https://iree-org.github.io/iree/) <a name="iree-cpp"></a>

Google's IREE is another framework that is able to ingest models coming from PyTorch, JAX, and TensorFlow, and then expose them to a general-purpose programming language. IREE's C API is the main export path for the purpose of this tutorial, but there exists a semi-official template for C++ maintained by the _Fraunhofer Institute for AI and Autonomous Systems_, which shows how IREE can be integrated into a C++ based project. We go into more detail on IREE and its syntax for model export in the [IREE subsection of the C section](#_iree), and would urge a look there if you are interested in utilizing IREE in your simulation framework, or a look at the IREE section of the slides if you are looking to scope its capabilities.

* [IREE Template for C++](https://github.com/iml130/iree-template-cpp)

### 2.5 Export with [TensorFlow Lite (TFLite)](https://www.tensorflow.org/lite/microcontrollers/build_convert)  <a name="tflite-cpp"></a>

To export a trained machine learning model from JAX or TensorFlow into a simulation framework, you have multiple options. Coming from JAX, most of these begin with converting the JAX model into a TensorFlow model

1. Write JAX model & train it
2. Export to TensorFlow with `jax2tf` [link](https://github.com/google/jax/blob/main/jax/experimental/jax2tf/jax2tf.py)
3. Utilize TensorFlow's infrastructure to include your model

While the model can be included through the cross-framework infrastructure afforded by TVM, ONNX, and IREE presented above, the JAX/TensorFlow ecosystem also has its own [model export to C++](https://www.tensorflow.org/lite/guide/inference#load_and_run_a_model_in_c) infrastructure in the form of [TensorFlow Lite](https://www.tensorflow.org/lite).

In the interest of focussing on PyTorch as our main machine learning framework for this tutorial, we are omitting a TensorFlow Lite example. If you are interested in exporting TensorFlow models into your C++ based simulation we would encourage you to take a look at the following selected TensorFlow Lite examples, which serve as good illustrations for the workflow and required project structure:

* [Minimal Example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/minimal/minimal.cc)
* [Image Labeling](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/examples/label_image/label_image.cc)

### 2.6 Export with Ahead-of-Time (AOT) Compilation with [XLA](https://www.tensorflow.org/xla)  <a name="xla-cpp"></a>

TensorFlow 2, through its XLA compilation backend, lets the user ahead-of-time (AOT) compile the XLA graph generated from the machine learning model into an object file and header that can be called from C++. For this we broadly follow these steps:

1. Train the TensorFlow model
2. Save the model with `tf.saved_model.save`
3. Perform AOT compilation with XLA
4. Create a shared library
5. Create the C++ file
6. Build with your favorite build system

Presuming that we have our ResNet-50 model trained in TensorFlow, we store it in the [saved model](https://www.tensorflow.org/api_docs/python/tf/saved_model/save) format

```python
class ResNet50(tf.Module):
    ...

model = ResNet50()
tf.saved_model.save(
    model, './resnet', signature=None, options=None
)
```

This gives us a model that takes an input tensor and runs in a frozen (inference) state.

> Minimal example of the saved model API available [here](https://www.tensorflow.org/guide/saved_model)

From there we then use the `saved_model_cli` from the command line to perform AOT compilation with XLA

```bash
saved_model_cli aot_compile_cpu \
    --dir ./resnet \
    --tag_set serve \
    --signature_def_key serving_default \
    --output_prefix compiled_resnet \
    --cpp_class Bar
```

We then need to build the XLA AOT runtime, for which we run

```bash
cd /path/to/TensorFlow/xla_aot_runtime_src
cmake .
make
```

With this we have the respective runtime, and have to invoke the subgraph from our simulation code just as done in this [subgraph invocation example](https://www.tensorflow.org/xla/tfcompile#step_3_write_code_to_invoke_the_subgraph), which gives us a `resnet50_invocation.cc`. We then compile it with

```bash
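# Object files and static archives are passed as inputs; -I and -L only add search directories
clang++ resnet50_invocation.cc resnet50.o libtf_xla_runtime.a -o resnet50_invocation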
```

This compile step can be integrated into the build system of our simulation, such as a Makefile or CMake.

## 3. Integration into Fortran <a name="export-into-fortran"></a>


For the integration of a trained machine learning model into a Fortran based simulation, we note that the main export paths are the ones presented in the export-section of C:

![](https://i.imgur.com/qLb7qEX.png)

> Note: the IREE path is missing from this figure.

For the sake of this tutorial, we will henceforth consider the following pretrained ResNet-50 PyTorch model to include in our simulation:

```python
from torchvision.models import resnet50, ResNet50_Weights

# Initializing the model with the pre-trained weights, and setting it into evaluation mode
pretrained_weights = ResNet50_Weights.DEFAULT
model = resnet50(weights=pretrained_weights)
model.eval()
```

With Fortran mostly relying on the provided C-APIs, we will sketch one end-to-end example of what such a model export would look like for the above pretrained PyTorch model to Fortran.

> An aspect we are (consciously) glossing over is the data structure exchange between different frameworks. For this we point the reader to the [DLPack](https://dmlc.github.io/dlpack/latest/) documentation and Python's [Array API specification](https://data-apis.org/array-api/2022.12/).

### 3.1 Exporting a Trained PyTorch Model to Fortran with IREE <a name="iree-fort"></a>

To integrate a trained PyTorch model into Fortran we broadly follow the approaches taken for C, and choose IREE as the example to cover its two main approaches here. For this we need a source install of IREE on the system on which we perform the model conversion below

* [Building from Source - Getting Started](https://openxla.github.io/iree/building-from-source/getting-started/)

After which we have two main pathways available to us:

1. Let IREE emit C code and treat it as normal C code.
2. Compile the model into a static library with IREE, and then link the static library into our code.

Both are viable paths that build on the Fortran C interface (`iso_c_binding`), which most large simulation codes already use today, e.g. for system calls, and which can hence easily be integrated into existing build systems.

#### 3.1.1 Emit C Code <a name="iree-fort-c"></a>

Just as shown in the section on C, we begin at the Python level to get from PyTorch's trained representation of our model to a `vmfb` module, which IREE can then convert to native C code. We begin by converting the PyTorch code with [Torch-MLIR](https://github.com/llvm/torch-mlir) into an [MLIR](https://mlir.llvm.org) [Linalg](https://mlir.llvm.org/docs/Dialects/Linalg/) module

```python
# X_test: example input batch; the pretrained weights are already part of `model`
inference_args = (X_test,)
graph = functorch.make_fx(model)(*inference_args)
strip_overloads(graph)
linalg_on_tensors_mlir = torch_mlir.compile(
    graph,
    inference_args,
    output_type="linalg-on-tensors")
```

After which we can use `iree-torch` to build this into a `vmfb` module for IREE.

```python
iree_torch.compile_to_vmfb(linalg_on_tensors_mlir, args.iree_backend)
```

With this we have our `vmfb` module, which we can then convert into native C code with the command line interface.

> There also exist [iree-jax](https://openxla.github.io/iree/getting-started/jax/), and [iree-tensorflow](https://openxla.github.io/iree/getting-started/tensorflow/) to follow this path from the JAX, and TensorFlow ecosystem.

On the command line we then have to use the `iree-translate` binary we previously built to convert from a `vmfb` representation to native C code

```bash
iree-translate -iree-vm-ir-to-c-module ResNet50.vmfb
```

which we can then readily integrate into our build system just like any other C code.

#### 3.1.2 Compile a Static Library <a name="iree-fort-static"></a>

To take the route of compiling a static library, we first obtain the MLIR representation, then produce the static library, and finally use the provided function in a hypothetical simulation. For this we again utilize Torch-MLIR

```python
# X_test: example input batch; the pretrained weights are already part of `model`
inference_args = (X_test,)
graph = functorch.make_fx(model)(*inference_args)
strip_overloads(graph)
linalg_on_tensors_mlir = torch_mlir.compile(
    graph,
    inference_args,
    output_type="linalg-on-tensors")
```

Assuming the resulting MLIR module of our trained model has been written to `resnet_50.mlir`, compiling it with IREE yields three output files

* `resnet_50.h`
* `resnet_50.o`
* `resnet_50.vmfb`

which we produce with the IREE compiler we previously built

```bash
iree-compile resnet_50.mlir -o resnet_50.vmfb \
    --iree-hal-target-backends=llvm-cpu \
    --iree-llvm-link-embedded=false \
    --iree-llvm-link-static \
    --iree-llvm-static-library-output-path=resnet_50.o \
    --iree-vm-target-index-bits=32
```
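
Since the Torch-MLIR export already runs in Python, the compile step can be driven from the same script. The following is a minimal sketch, assuming `iree-compile` is on the `PATH`; the helper name `iree_compile_command` is hypothetical, and the flags are exactly the ones from the command above. Passing the arguments as a list avoids shell line continuations altogether.

```python
import shlex


def iree_compile_command(mlir_path, stem):
    """Assemble the argument list for the static-library compile step."""
    return [
        "iree-compile", mlir_path,
        "-o", f"{stem}.vmfb",
        "--iree-hal-target-backends=llvm-cpu",
        "--iree-llvm-link-embedded=false",
        "--iree-llvm-link-static",
        f"--iree-llvm-static-library-output-path={stem}.o",
        "--iree-vm-target-index-bits=32",
    ]


cmd = iree_compile_command("resnet_50.mlir", "resnet_50")

# Print the equivalent single-line shell command for inspection.
print(shlex.join(cmd))
```

To run the compilation, the list can be passed to `subprocess.run(cmd, check=True)`; the sketch only prints the command so that it stays runnable without an IREE build.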

From IREE's bytecode module we can then generate the embed data with the `iree_c_embed_data` CMake function

```cmake
iree_c_embed_data(
    NAME
      resnet_50_c
    IDENTIFIER
      iree_static_library_resnet_50
    GENERATED_SRCS
      resnet_50.vmfb
    C_FILE_OUTPUT
      resnet_50_c.c
    H_FILE_OUTPUT
      resnet_50_c.h
    FLATTEN
    PUBLIC
)
```

For an end-to-end example of this with integration into CMake we point the reader to IREE's static library sample:

* [IREE Static Library Sample](https://github.com/openxla/iree/tree/main/samples/static_library)

The static library can then be integrated into our simulation, and the simulation's build system as usual.