0.3.0 - 2026-06-11

@github-actions github-actions released this 11 Jun 20:09

Release Notes

Added

  • 5 new tree-sitter grammars (Bash, Java, C, Ruby, C++) for the structural
    extractor. The engine now supports 9 languages for symbol-level
    indexing (function definitions, type declarations, imports/exports):
    the historical set (Rust, TypeScript, Python, Go) plus Bash, Java, C,
    Ruby, and C++. Each new grammar ships with its own parser, import
    extractor, fuzz target, and a checked-in fixture under tests/fixtures/.
    Kotlin is deferred to v0.4.0 per the plan's barred-entry rule (R-1.1):
    tree-sitter-kotlin has no 0.23.x line on crates.io, so adding it would
    break the workspace's tree-sitter version pin policy.
  • Corpus freezes for the v0.3.0 measurement (.omc/plans/.../corpus_freeze.rs):
    • tests/fixtures/bench-corpus-frozen-A/: 22-file copy of
      examples/bench-corpus/ at the v0.2.0 commit, content-hash pinned.
    • tests/fixtures/bench-corpus-frozen-B/queries.json: 10-query golden-test
      subset for the F3 default-flip measurement.
    • The guard test crates/apohara-codesearch/tests/corpus_freeze.rs
      fails on any drift; refreezing requires a chore(bench): refreeze corpus X commit.
  • OpenSSF Scorecard audit (.omc/plans/apohara-codesearch-scorecard-audit.md):
    the measured aggregate is 7.0/10 (the 5.8 figure in CLAUDE.md was
    stale). 9 of 18 checks score 10/10 and 4 score between 0 and 4. The
    QW-2 fix pins cargo-audit to the Cargo.lock version (ee8b06a). QW-1
    (Maintained) is a structural repo-age penalty that resolves itself
    after 90 days. The QW-3 re-score showed no immediate delta (Scorecard
    needs 24-48h to re-index); +2 is expected once indexed.
  • F3 BENCHMARK baseline (BENCHMARK.md v0.3.0 section): the v0.2.0
    hybrid-search baseline on the frozen corpus A, reported as
    recall@5 / recall@10 / MRR:
    • BM25: 0.542 / 0.625 / 0.326
    • vector: 0.083 / 0.083 / 0.063
    • hybrid: 0.458 / 0.542 / 0.285
    • hybrid scores below the best single mode on 9 of 24 queries (38%).
    The bench surface cannot measure the proposed-flip variants directly
    (see "Changed" below for why the flips are deferred).
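
For readers unfamiliar with the metrics quoted in the F3 baseline, the following is a minimal sketch of how recall@k, reciprocal rank, and the "hybrid < best single mode" count can be computed. It is not the project's bench-search code: the function names, the use of string document ids, and the per-query score slices are all assumptions made for illustration.

```rust
/// Recall@k for one query: the fraction of relevant documents that appear
/// in the first `k` ranked results. (Hypothetical helper, not the bench API.)
fn recall_at_k(ranked: &[&str], relevant: &[&str], k: usize) -> f64 {
    if relevant.is_empty() {
        return 0.0;
    }
    let top = &ranked[..k.min(ranked.len())];
    let hits = relevant.iter().filter(|r| top.contains(*r)).count();
    hits as f64 / relevant.len() as f64
}

/// Reciprocal rank for one query: 1 / (1-based rank of the first relevant
/// result), or 0.0 when no relevant result is returned.
fn reciprocal_rank(ranked: &[&str], relevant: &[&str]) -> f64 {
    ranked
        .iter()
        .position(|doc| relevant.contains(doc))
        .map_or(0.0, |idx| 1.0 / (idx as f64 + 1.0))
}

/// Mean of a per-query metric; the mean of reciprocal ranks is MRR.
fn mean(values: &[f64]) -> f64 {
    if values.is_empty() {
        0.0
    } else {
        values.iter().sum::<f64>() / values.len() as f64
    }
}

/// Number of queries where the hybrid score is strictly below the better of
/// the two single-mode scores. The three slices are assumed to be aligned
/// per query.
fn hybrid_regressions(bm25: &[f64], vector: &[f64], hybrid: &[f64]) -> usize {
    bm25.iter()
        .zip(vector)
        .zip(hybrid)
        .filter(|((b, v), h)| **h < f64::max(**b, **v))
        .count()
}
```

Under this definition, the 9/24 figure above is the value `hybrid_regressions` would return over 24 per-query scores, and 9/24 = 0.375 is what the notes round to 38%.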

Changed

  • No default flips this cycle. Per the v0.3.0 plan
    (.omc/plans/apohara-codesearch-3frentes.md §6), the proposed
    adaptive=true / diversify=true default flips require a
    data-driven positive-lift measurement that the bench-search harness
    cannot produce (both opt-ins live in the server-side search_code
    wrapper, not in the indexer-level rrf_fuse). F3-FLIP-CHECK
    therefore has no data against which to apply the split criteria, and
    Pablo chose to defer both flips to v0.4.0, once the plumbing to
    measure them server-side is in place. As a result, the v0.3.0 release
    focuses on structural extraction rather than ranking. The rollback
    path for the flips is documented in §10 of the plan and remains
    valid for the v0.4.0 measurement.
  • legacy.rb fixture renamed to legacy.foo in
    examples/demo-repo/. Reason: the v0.3.0 grammar expansion
    means .rb is now a parsed language; the ac4 integration test
    needed an extension no grammar recognizes. The file's content is
    unchanged.
  • test_detect_language_c updated to reflect that C++ extensions
    (.cpp/.hpp/.cc/.cxx/.hxx/.hh) now map to Language::Cpp
    instead of returning None. The C vs C++ split follows the
    tree-sitter convention of one grammar per language.
  • Module symbol kind added to SymbolKind enum (Ruby module
    declaration support).
  • Workspace tree-sitter dep set expanded: tree-sitter-bash,
    tree-sitter-java, tree-sitter-c, tree-sitter-ruby,
    tree-sitter-cpp (all at 0.23.x to match the existing pin).
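
The extension changes above (C++ extensions now resolving to Language::Cpp, and legacy.foo chosen because no grammar recognizes it) can be illustrated with a small sketch. Only the Language::Cpp variant and the six C++ extensions come from these notes; the function name detect_language, the other variant names, and every non-C++ extension mapping (including .h resolving to C) are assumptions, not the crate's actual table.

```rust
use std::path::Path;

/// Hypothetical mirror of the nine supported languages.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    Rust,
    TypeScript,
    Python,
    Go,
    Bash,
    Java,
    C,
    Ruby,
    Cpp,
}

/// Map a file extension to a language, or None when no grammar applies.
/// Assumption: detection is purely extension-based.
fn detect_language(path: &Path) -> Option<Language> {
    let ext = path.extension()?.to_str()?;
    match ext {
        // Assumed mappings for the historical set.
        "rs" => Some(Language::Rust),
        "ts" => Some(Language::TypeScript),
        "py" => Some(Language::Python),
        "go" => Some(Language::Go),
        // Assumed mappings for the new grammars.
        "sh" => Some(Language::Bash),
        "java" => Some(Language::Java),
        "rb" => Some(Language::Ruby),
        "c" | "h" => Some(Language::C),
        // C++ extensions as listed in the release notes.
        "cpp" | "hpp" | "cc" | "cxx" | "hxx" | "hh" => Some(Language::Cpp),
        // Anything else, such as legacy.foo, is left unparsed.
        _ => None,
    }
}
```

With a table like this, a file named legacy.rb would be parsed as Ruby, which is why the integration fixture needed an extension that falls through to the final arm.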

Notes

  • Binary size on linux-x64: +7.99 MB (+62.58%) vs v0.2.0. Each new
    tree-sitter grammar adds roughly 0.4-3.5 MB to the statically linked
    binary; the generated C parser tables are the dominant cost. Java was
    unexpectedly the smallest at +0.43 MB, with Ruby at +2.05 MB and C++
    at +3.45 MB. Pablo approved "all 6 grammars default" at the
    size-budget gate (the cumulative projection was revised from +60%
    to +62.58% as the actual measurements came in). The
    v0.3.0 plan's C++/SACRED resolution still applies: the
    windows-msvc artifact has a +20% budget; if the windows-msvc
    build exceeds it, C++ is removed from that target's default features
    (default = []) and becomes opt-in via cargo build --features cpp.
    This must be verified at the F3-RELEASE / CI step.
  • OpenSSF Scorecard: 7.0/10 baseline measured. The 3 quick wins
    approved by Pablo (pin cargo-audit) are committed. No further
    Scorecard work in this release; the audit doc remains the source
    of truth for follow-ups.
  • Kotlin deferred to v0.4.0 — see "Added" notes.
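
The size arithmetic in the first note can be reproduced with a short sketch. The helper names are hypothetical, and no windows-msvc measurement is given in these notes, so any budget check for that target would need real sizes from the CI step; the functions below only show the calculation.

```rust
/// Relative growth of `new_mb` over `baseline_mb`, in percent.
fn growth_percent(baseline_mb: f64, new_mb: f64) -> f64 {
    (new_mb - baseline_mb) / baseline_mb * 100.0
}

/// Recover the baseline size from an absolute delta and its percentage.
/// For the linux-x64 figures, 7.99 MB at +62.58% implies a v0.2.0 binary
/// of roughly 12.77 MB.
fn baseline_from_delta(delta_mb: f64, growth_pct: f64) -> f64 {
    delta_mb / (growth_pct / 100.0)
}

/// True when the growth stays within `budget_pct` (for example 20.0 for
/// the windows-msvc artifact). Hypothetical gate, not the CI script.
fn within_budget(baseline_mb: f64, new_mb: f64, budget_pct: f64) -> bool {
    growth_percent(baseline_mb, new_mb) <= budget_pct
}
```

A CI gate along these lines would call within_budget with the measured v0.2.0 and v0.3.0 windows-msvc artifact sizes and a budget of 20.0, and switch C++ to opt-in when it returns false.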

Download apohara-codesearch 0.3.0

File                                                 Platform             Checksum
apohara-codesearch-aarch64-apple-darwin.tar.xz       Apple Silicon macOS  checksum
apohara-codesearch-x86_64-apple-darwin.tar.xz        Intel macOS          checksum
apohara-codesearch-x86_64-pc-windows-msvc.zip        x64 Windows          checksum
apohara-codesearch-aarch64-unknown-linux-gnu.tar.xz  ARM64 Linux          checksum
apohara-codesearch-x86_64-unknown-linux-gnu.tar.xz   x64 Linux            checksum

Verifying GitHub Artifact Attestations

The artifacts in this release have attestations generated with GitHub Artifact Attestations. These can be verified by using the GitHub CLI:

gh attestation verify <file-path of downloaded artifact> --repo SuarezPM/apohara-codesearch

You can also download the attestation from GitHub and verify against that directly:

gh attestation verify <file-path of downloaded artifact> --bundle <file-path of downloaded attestation>