[Wave-14a] JEPA-T ternary ingest pipeline (L-S50) #807

@gHashTag

Description

Summary

Implement a Rust crate crates/jepa_t_ingest/ that streams plaintext corpora into ternary-quantized triplet sequences for JEPA-T training on Trinity silicon.

Technical Details

  • Quantizer: quantize_phi_prior(fp_q15: i16) -> i8 matching Wave-9b RTL byte-for-byte
    • Threshold: φ⁻² (Q1.15) = 12533 (0x30F4)
    • Rule (w = fp_q15): if |w| >= 12533 → sign(w); else 0
  • Ingest API: ingest_text(input: &str, cfg: &IngestConfig) -> Vec<Triplet>
  • CLI: jepa_t_ingest --input corpus.txt --output triplets.bin
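
A minimal sketch of the quantizer rule above, for discussion only. It assumes the decimal threshold 12533 is the authoritative value. Note that 0x30F4 is 12532 in decimal, so the constant should be checked against the Wave-9b RTL before anything relies on byte-for-byte parity. The constant name PHI_INV_SQ_Q15 is hypothetical; only the quantize_phi_prior signature comes from this issue.

```rust
/// Threshold for phi^-2 in Q1.15, using the decimal value given in the issue.
/// Assumption: 12533 is authoritative (0x30F4 would be 12532); verify against
/// the Wave-9b RTL. The constant name is hypothetical.
pub const PHI_INV_SQ_Q15: u16 = 12533;

/// Quantize a Q1.15 fixed-point weight to a ternary value in {-1, 0, +1}.
///
/// Returns sign(w) when |w| >= threshold, and 0 otherwise.
pub fn quantize_phi_prior(fp_q15: i16) -> i8 {
    // unsigned_abs() is used instead of abs() because abs() overflows on
    // i16::MIN (-32768 has no positive i16 counterpart).
    if fp_q15.unsigned_abs() >= PHI_INV_SQ_Q15 {
        // signum() yields -1, 0, or 1, so the cast to i8 is lossless.
        fp_q15.signum() as i8
    } else {
        0
    }
}
```

With this sketch, an input equal to the threshold maps to +1 and an input one below it maps to 0. The i16::MIN case is worth a dedicated test, since a plain abs() would overflow there and could diverge from the hardware behaviour.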

Constraints

  • R1 CROWN: Rust ONLY, no Python in the crate
  • Apache-2.0 license

Deliverables

  • crates/jepa_t_ingest/ (full crate with Cargo.toml, src/, tests/, README.md)
