
Create Inferencing Library Using ONNX #18

Merged
Ok3ks merged 2 commits into main from haksoat/initial_inference on Jan 28, 2026

Conversation


@HAKSOAT HAKSOAT commented Jan 27, 2026

This pull request adds inferencing using an ONNX model.

It depends on the presence of the exported ONNX model weights.

NB: The rest of the instructions in this description should be run in the inference dir. You can do this by running cd inference. Also, the ONNX Runtime and cargo builds target the arm64-v8a ABI and API level 27.

The model weights can be obtained by running:

just onnx-export <output_dir>

This creates two files in the output dir: one with the extension .onnx and another with .onnx_data. You only need to point to the .onnx file, as it references the .onnx_data file internally.
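As a quick check, listing the output dir should show both files (the names below match the commands later in this description):

ls <output_dir>
# model.onnx  model.onnx_data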

The model weights can then be used by specifying the path via the env var ACHO_MODEL_PATH.
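For reference, here is a minimal sketch of how a binary might consume this variable. It assumes the ort crate (v2) as the ONNX Runtime binding and is illustrative only, not the code in this PR:

use std::env;

use ort::session::Session;

fn main() -> ort::Result<()> {
    // ACHO_MODEL_PATH points at the .onnx file; ONNX Runtime resolves the
    // external .onnx_data file relative to it.
    let model_path = env::var("ACHO_MODEL_PATH")
        .expect("set ACHO_MODEL_PATH to the exported .onnx file");

    let session = Session::builder()?.commit_from_file(model_path)?;
    println!("loaded model with {} input(s)", session.inputs.len());
    Ok(())
}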

Host Device Run

Inference can then be run via cargo run on the host device:

ACHO_MODEL_PATH=<output_dir>/model.onnx cargo run

Building a Binary for Android

Building for Android requires the ONNX runtime built for the target device.

This can be done via the just onnx-build-runtime recipe. However, the build for arm64-v8a and API level 27 already exists on Google Drive, so it is recommended to download it directly via just onnx-download-runtime arm64-v8a 27 onnx_runtimes.

When you have the runtime, the build can then be done with:

cargo install cargo-ndk

ANDROID_NDK_HOME=<path_to_android_ndk> \
ORT_PREFER_DYNAMIC_LINK=1 \
ORT_LIB_LOCATION=<path_to_onnx_libonnxruntime.so> \
cargo ndk -t arm64-v8a -P 27 build --release

This creates a release binary at: target/aarch64-linux-android/release/inference
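You can confirm the binary was cross-compiled for the right target with file; the output should mention ARM aarch64 (exact wording varies by toolchain):

file target/aarch64-linux-android/release/inference
# ... ELF 64-bit LSB ... ARM aarch64 ...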

NB: You can download the Android NDK zip file from https://developer.android.com/ndk/downloads. Version r27d was used in the builds for this PR.

Running the Binary via adb

After connecting to your device via adb, you need to do the following:

  1. Push the release binary, the model weights, and the ONNX Runtime library to the device
  2. Execute the binary

This can be done via:

adb push target/aarch64-linux-android/release/inference /data/local/tmp \
&& adb push <output_dir>/model.onnx /data/local/tmp/weights/model.onnx \
&& adb push <output_dir>/model.onnx_data /data/local/tmp/weights/model.onnx_data \
&& adb push <path_to_onnx_libonnxruntime.so> /data/local/tmp/ \
&& adb shell chmod +x /data/local/tmp/inference \
&& adb shell "cd /data/local/tmp && HF_HOME=/data/local/tmp LD_LIBRARY_PATH=/data/local/tmp ./inference"

@HAKSOAT HAKSOAT requested review from Ok3ks and opeolluwa January 27, 2026 18:14
@Ok3ks Ok3ks merged commit 59dd9b8 into main Jan 28, 2026
1 check passed
@Ok3ks Ok3ks deleted the haksoat/initial_inference branch January 28, 2026 07:00