Description
The pure-Rust ONNX embedding logic in embeddings.rs successfully loads and optimizes models at runtime, but the inference phase is stubbed to return a static zero-vector:
    pub fn compute_embedding(&self, _tokens: &[i64]) -> Result<Vec<f32>, String> {
        if self.model.is_none() {
            return Err("Model not loaded".to_string());
        }
        // In native execution:
        // let input = Tensor::from_shape(&[1, _tokens.len()], _tokens)...
        // plan.run(tensors_in)...
        Ok(vec![0.0; 384]) // Returns typical 384-dimensional vector stub
    }
Impact
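Latent alignment checks (calculate_latent_alignment) always receive the same 384-dimensional all-zero vector at runtime, regardless of input. Every similarity score is therefore computed from a constant, which neutralizes the similarity threshold policy checks.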
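To illustrate why a zero vector defeats a threshold check, here is a minimal sketch that assumes the alignment score is a cosine similarity. The internals of calculate_latent_alignment are not shown in this issue, so the cosine_similarity helper below is hypothetical and only stands in for whatever metric it actually uses.

```rust
/// Hypothetical helper: cosine similarity between two equal-length vectors.
/// Returns None when the lengths differ or either vector has zero norm,
/// because the ratio is undefined in those cases.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        // This is the branch the stubbed embedding always hits.
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Under that assumption, the stubbed output can never produce a meaningful score: the dot product is 0 and the norm is 0, so a caller either gets no score at all or, without the zero-norm guard, a NaN that fails every threshold comparison.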
Proposed Solution
- Construct a tract_onnx::prelude::Tensor input array from the supplied _tokens slice with a shape of [1, tokens.len()].
- Pass the tensor array to the optimized runnable plan via self.model.run(tensors_in).
- Extract the output layer's floating-point values and return them as a Vec<f32> payload.
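The control flow of the three steps above could look roughly like the sketch below. The tract-specific calls are deliberately left out so that it compiles with the standard library alone: InferencePlan is a hypothetical trait standing in for the optimized runnable plan, and Embedder and its dim field are illustrative names rather than the actual types in embeddings.rs. The output-length check is also an assumption, based on the 384-dimensional stub.

```rust
/// Hypothetical stand-in for the optimized runnable plan held in `self.model`.
/// In embeddings.rs this role would be played by the tract plan itself.
pub trait InferencePlan {
    /// Runs the model on a row-major input of the given shape and
    /// returns the flattened f32 values of the output layer.
    fn run(&self, shape: [usize; 2], input: &[i64]) -> Result<Vec<f32>, String>;
}

/// Illustrative wrapper; field names are assumptions, not the real struct.
pub struct Embedder<P: InferencePlan> {
    pub model: Option<P>,
    pub dim: usize, // expected embedding width (384 in the current stub)
}

impl<P: InferencePlan> Embedder<P> {
    pub fn compute_embedding(&self, tokens: &[i64]) -> Result<Vec<f32>, String> {
        // Keep the existing "Model not loaded" behaviour.
        let plan = self
            .model
            .as_ref()
            .ok_or_else(|| "Model not loaded".to_string())?;

        // A [1, 0] input has nothing to embed, so reject it early.
        if tokens.is_empty() {
            return Err("No tokens supplied".to_string());
        }

        // Steps 1 and 2: shape the tokens as [1, tokens.len()] and run the plan.
        let output = plan.run([1, tokens.len()], tokens)?;

        // Step 3: return the values, refusing an unexpected output width.
        if output.len() != self.dim {
            return Err(format!(
                "Expected {} output values, got {}",
                self.dim,
                output.len()
            ));
        }
        Ok(output)
    }
}
```

Keeping the plan behind a trait also lets the function be unit-tested with a fake plan that returns a fixed vector, without loading an ONNX model.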