feat(engine): execute ONNX model graph via tract to compute real embeddings #12

@dk-uppi-aks

Description

The pure-Rust ONNX embedding logic in embeddings.rs successfully loads and optimizes models at runtime, but the inference phase is stubbed to return a static zero-vector:

pub fn compute_embedding(&self, _tokens: &[i64]) -> Result<Vec<f32>, String> {
    if self.model.is_none() {
        return Err("Model not loaded".to_string());
    }

    // In native execution:
    // let input = Tensor::from_shape(&[1, _tokens.len()], _tokens)...
    // plan.run(tensors_in)...

    Ok(vec![0.0; 384]) // Returns typical 384-dimensional vector stub
}

Impact

Latent alignment checks (calculate_latent_alignment) always receive all-zero 384-dimensional vectors at runtime, so the similarity scores they compute carry no information and the similarity threshold policy checks are effectively neutralized.
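
To illustrate why a zero vector defeats a threshold check, here is a minimal sketch. It assumes the alignment score is a plain cosine similarity; the body of calculate_latent_alignment is not shown in this issue, so cosine_similarity and the 0.8 threshold below are hypothetical stand-ins, not the project's code.

```rust
/// Hypothetical stand-in for the alignment score: plain cosine similarity.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    // No zero-norm guard on purpose: this shows what the stub output does.
    dot / (norm_a * norm_b)
}

fn main() {
    let stub = vec![0.0_f32; 384]; // what compute_embedding returns today
    let reference = vec![0.5_f32; 384]; // any non-zero reference embedding
    let score = cosine_similarity(&stub, &reference);

    // 0.0 / 0.0 is NaN, and every ordered comparison against NaN is false,
    // so neither branch of a threshold policy is taken.
    let threshold = 0.8_f32; // hypothetical policy threshold
    println!("is_nan = {}", score.is_nan());
    println!("score >= threshold: {}", score >= threshold);
    println!("score <  threshold: {}", score < threshold);
}
```

If the real implementation guards against a zero norm and returns 0.0 instead, the outcome is equally uninformative: every input gets the same score regardless of its content.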

Proposed Solution

  1. Construct a tract_onnx::prelude::Tensor from the supplied tokens slice (renamed from _tokens now that it is used) with shape [1, tokens.len()].
  2. Pass the tensor to the optimized runnable plan held in self.model via run(tensors_in). Because self.model is an Option, it has to be unwrapped first; the existing None check already covers the error path.
  3. Extract the f32 values from the output tensor and return them as a Vec<f32>.
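
One detail step 3 may need to handle: if the model's output is per-token, with shape [1, seq_len, 384], rather than an already pooled [1, 384] vector, the values have to be reduced to a single embedding before they are returned. The issue does not say which output the model produces, so the following is only a sketch under that assumption. It uses mean pooling as one common reduction, and mean_pool is a hypothetical helper that takes the output already flattened into a row-major f32 slice.

```rust
/// Hypothetical helper: mean-pools a row-major [1, seq_len, hidden] buffer
/// into a single `hidden`-sized embedding.
fn mean_pool(flat: &[f32], seq_len: usize, hidden: usize) -> Result<Vec<f32>, String> {
    if seq_len == 0 || hidden == 0 {
        return Err("seq_len and hidden must be non-zero".to_string());
    }
    if flat.len() != seq_len * hidden {
        return Err(format!(
            "expected {} values, got {}",
            seq_len * hidden,
            flat.len()
        ));
    }

    // Sum each hidden dimension across all token rows.
    let mut pooled = vec![0.0_f32; hidden];
    for token in flat.chunks_exact(hidden) {
        for (acc, value) in pooled.iter_mut().zip(token) {
            *acc += value;
        }
    }

    // Divide by the number of tokens to get the mean.
    let n = seq_len as f32;
    for acc in pooled.iter_mut() {
        *acc /= n;
    }
    Ok(pooled)
}

fn main() {
    // Two tokens, three hidden dimensions (tiny stand-in for [1, seq_len, 384]).
    let flat = [1.0_f32, 2.0, 3.0, 3.0, 4.0, 5.0];
    let embedding = mean_pool(&flat, 2, 3).expect("shape matches");
    println!("{:?}", embedding);
}
```

If the model already emits a pooled [1, 384] tensor, this step reduces to copying the output values into a Vec<f32>, and it is still worth validating the length before returning.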
