**What kind of change does this PR introduce?**

Feature
**What is the current behavior?**
Since PR #436, it has been possible to use `onnx` inference by calling `globalThis[Symbol.for('onnxruntime')]`.
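For reference, a minimal sketch of that low-level access (only the symbol lookup is shown; the shape of the exposed object is an implementation detail not specified here):

```ts
// The onnx backend is only reachable through a global symbol registry lookup.
// What the returned object exposes (init, run, etc.) is intentionally left
// out, since this low-level surface is what the new API wraps.
const onnxBackend = globalThis[Symbol.for('onnxruntime')];
```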
**What is the new behavior?**
Coming from Issue #479, the Inference API is a user-friendly interface that allows developers to easily run their own models using the power of the low-level `onnx` Rust backend. It's based on two core components: `RawSession` and `RawTensor`.

- `RawSession`: A low-level `Supabase.ai.Session` that can execute any `.onnx` model. It's recommended for use cases that need more control over the pre/post-processing steps, like the text-to-audio example, as well as for executing `linear regression`, `tabular classification`, and self-made models.
- `RawTensor`: A low-level data representation of the model input/output. The Inference API's tensors are fully compatible with Transformers.js tensors, which means developers can keep using the high-level abstractions that `transformers.js` provides, like `.sum()`, `.normalize()`, and `.min()`.

Examples:
Simple utilization:

Loading a `RawSession`:
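A minimal sketch of what loading could look like; the `Supabase.ai.RawSession` namespace, the `fromUrl` constructor name, and the model URL are illustrative assumptions:

```ts
// Hypothetical loader: constructor name and URL are assumptions made for
// illustration; any reachable .onnx model should work the same way.
const session = await Supabase.ai.RawSession.fromUrl(
  'https://example.com/models/my-model.onnx',
);
```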
Executing a `RawSession` with `RawTensor`:
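A sketch under the same assumptions; the tensor constructor signature (dtype, data, shape) and the name-keyed `run()` call follow common onnx-runtime conventions rather than a confirmed API:

```ts
// Hypothetical 1x3 float input; 'input' must match the model's input name.
const input = new Supabase.ai.RawTensor(
  'float32',
  new Float32Array([1, 2, 3]),
  [1, 3],
);

// Run inference; outputs are assumed to be keyed by the model's output names.
const { output } = await session.run({ input });
console.log(output.data);
```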
Generating embeddings from scratch:
This example demonstrates how the Inference API can be used in complex scenarios while taking advantage of Transformers.js high-level functions.
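A sketch of how this could look, reusing the hypothetical `fromUrl` loader from above and assuming the installed transformers.js version exports `mean_pooling` (recent versions do); the model references are illustrative:

```ts
// Import specifier may differ depending on setup (npm:, esm.sh, import map).
import { AutoTokenizer, mean_pooling } from 'npm:@huggingface/transformers';

// Hypothetical model references, used here only for illustration.
const tokenizer = await AutoTokenizer.from_pretrained('Supabase/gte-small');
const session = await Supabase.ai.RawSession.fromUrl(
  'https://example.com/models/gte-small.onnx',
);

Deno.serve(async (req: Request) => {
  const { text } = await req.json();

  // Tokenize with a Transformers.js high-level helper.
  const { input_ids, attention_mask, token_type_ids } = await tokenizer(text);

  // RawTensors are compatible with Transformers.js tensors, so model
  // inputs/outputs can flow through transformers.js utilities directly.
  const { last_hidden_state } = await session.run({
    input_ids,
    attention_mask,
    token_type_ids,
  });

  // Mean-pool the token embeddings, then L2-normalize the pooled vector.
  const embedding = mean_pooling(last_hidden_state, attention_mask)
    .normalize(2, -1);

  return Response.json({ embedding: embedding.tolist() });
});
```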
Self-made models

This example illustrates how users can train their own model and execute it directly from `edge-runtime`.
The model was trained to expect the following object payload:
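The original payload isn't reproduced here; purely as a hypothetical stand-in, a tabular classification model might expect something like:

```ts
// Hypothetical payload shape, shown only to make the flow concrete;
// the real model's feature names are defined by how it was trained.
const payload = {
  sepal_length: 5.1,
  sepal_width: 3.5,
  petal_length: 1.4,
  petal_width: 0.2,
};
```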
Then the model inference can be done inside a common Edge Function:
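A sketch of that function, reusing the hypothetical names from above (loader, tensor constructor, input/output names, and feature names are all assumptions):

```ts
// Hypothetical: load the self-trained model once, outside the handler.
const session = await Supabase.ai.RawSession.fromUrl(
  'https://example.com/models/iris-classifier.onnx',
);

Deno.serve(async (req: Request) => {
  const payload = await req.json();

  // Flatten the object payload into the flat float vector the model expects.
  const features = new Float32Array([
    payload.sepal_length,
    payload.sepal_width,
    payload.petal_length,
    payload.petal_width,
  ]);

  // Hypothetical input name; it must match the name used when exporting
  // the model to .onnx.
  const input = new Supabase.ai.RawTensor(
    'float32',
    features,
    [1, features.length],
  );
  const { output } = await session.run({ input });

  return Response.json({ prediction: Array.from(output.data) });
});
```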
TODO: `tryEncodeAudio()`, check out the text-to-audio example