Yc octo#19

Draft
GordonYuanyc wants to merge 7 commits into main from yc-octo

@GordonYuanyc
Collaborator

Delta protocol for Octo updates

Defined concrete delta types and promotion thresholds in src/sketches/octo_delta.rs:

  • CmDelta { row, col, value }
  • CountDelta { row, col, value }
  • HllDelta { pos, value }
  • CM_PROMASK, COUNT_PROMASK, HLL_PROMASK

This gives a common transport contract between worker-side updates and parent-side aggregation.
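As a rough sketch of what this contract could look like, the block below declares the three delta records with the field names listed above. The field widths (`u16`, `u8`, `i8`) and the `OctoDelta` wrapper enum are assumptions for illustration, not taken from `octo_delta.rs`; the `*_PROMASK` constants are left out because their values are not stated here.

```rust
// Assumed field widths; the PR lists the field names but not their types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CmDelta {
    pub row: u16,
    pub col: u16,
    pub value: u8,
}

// Count Sketch counters are signed, so a signed value is assumed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CountDelta {
    pub row: u16,
    pub col: u16,
    pub value: i8,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HllDelta {
    pub pos: u16,
    pub value: u8,
}

/// Hypothetical wrapper so a single channel can carry any delta kind.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OctoDelta {
    Cm(CmDelta),
    Count(CountDelta),
    Hll(HllDelta),
}

impl OctoDelta {
    /// Short label for logging or metrics.
    pub fn kind(&self) -> &'static str {
        match self {
            OctoDelta::Cm(_) => "cm",
            OctoDelta::Count(_) => "count",
            OctoDelta::Hll(_) => "hll",
        }
    }
}
```

Keeping the records small and `Copy` makes them cheap to push through a channel, which matters because every promoted counter crosses a thread boundary.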

Worker-side delta emission APIs (insert_emit_delta)

Implemented sketch-level delta emission functions:

  • CMS: CountMin::insert_emit_delta
  • CS: Count::insert_emit_delta
  • HLL: HyperLogLog::insert_emit_delta

These functions perform normal insert logic while emitting delta records suitable for inter-thread transport.
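A minimal sketch of the worker-side idea, assuming a small per-cell counter that is promoted to the parent once it reaches a threshold. `WorkerCountMin`, `PROMOTE_AT`, the hashing scheme, and the callback-based signature are all hypothetical; the real `CountMin::insert_emit_delta` may differ in every one of these details.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical promotion threshold (the real CM_PROMASK value is not shown here).
const PROMOTE_AT: u8 = 4;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CmDelta {
    pub row: usize,
    pub col: usize,
    pub value: u64,
}

/// Toy worker-local Count-Min with small counters.
pub struct WorkerCountMin {
    rows: usize,
    cols: usize,
    cells: Vec<u8>,
}

impl WorkerCountMin {
    pub fn new(rows: usize, cols: usize) -> Self {
        Self { rows, cols, cells: vec![0; rows * cols] }
    }

    // Per-row column index: hash the (row, key) pair.
    fn col_for(&self, row: usize, key: u64) -> usize {
        let mut h = DefaultHasher::new();
        (row, key).hash(&mut h);
        (h.finish() % self.cols as u64) as usize
    }

    /// Increment the key's cell in every row; when a cell reaches the
    /// threshold, hand its accumulated count to `emit` and reset it.
    pub fn insert_emit_delta(&mut self, key: u64, emit: &mut impl FnMut(CmDelta)) {
        for row in 0..self.rows {
            let col = self.col_for(row, key);
            let idx = row * self.cols + col;
            self.cells[idx] += 1;
            if self.cells[idx] >= PROMOTE_AT {
                emit(CmDelta { row, col, value: u64::from(self.cells[idx]) });
                self.cells[idx] = 0;
            }
        }
    }
}
```

With these assumptions, inserting the same key four times produces no deltas for the first three inserts and one delta per row on the fourth.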

Parent-side delta application APIs (apply_delta)

Implemented parent merge primitives:

  • CMS: CountMin::apply_delta
  • CS: Count::apply_delta
  • HLL: HyperLogLog::apply_delta

These are the aggregation-side operations consumed by the Octo runtime.
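For illustration, the parent side could be as simple as the sketch below. It assumes counter sketches merge a delta by addition and HLL merges by taking the register maximum, which are the standard merge rules for those sketches; `ParentCountMin` and `ParentHll` are hypothetical stand-ins, not the types in this PR.

```rust
pub struct CmDelta {
    pub row: usize,
    pub col: usize,
    pub value: u64,
}

pub struct HllDelta {
    pub pos: usize,
    pub value: u8,
}

/// Toy parent Count-Min: a flat row-major counter matrix.
pub struct ParentCountMin {
    cols: usize,
    cells: Vec<u64>,
}

impl ParentCountMin {
    pub fn new(rows: usize, cols: usize) -> Self {
        Self { cols, cells: vec![0; rows * cols] }
    }

    /// Counter deltas are additive.
    pub fn apply_delta(&mut self, d: &CmDelta) {
        self.cells[d.row * self.cols + d.col] += d.value;
    }

    pub fn cell(&self, row: usize, col: usize) -> u64 {
        self.cells[row * self.cols + col]
    }
}

/// Toy parent HyperLogLog: one register per position.
pub struct ParentHll {
    registers: Vec<u8>,
}

impl ParentHll {
    pub fn new(num_registers: usize) -> Self {
        Self { registers: vec![0; num_registers] }
    }

    /// HLL registers only ever grow, so a delta is merged with max.
    pub fn apply_delta(&mut self, d: &HllDelta) {
        let r = &mut self.registers[d.pos];
        *r = (*r).max(d.value);
    }

    pub fn register(&self, pos: usize) -> u8 {
        self.registers[pos]
    }
}
```

Both operations are commutative, so the order in which deltas from different workers arrive should not change the final parent state under these assumptions.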

Octo runtime engine (OctoCore / OctoRuntime)

In src/sketch_framework/octo.rs:

  • Multi-worker input channels
  • Round-robin dispatch (next_worker)
  • Worker threads that process inputs and emit deltas
  • Aggregator thread that consumes deltas and applies them to parent sketch
  • Lifecycle controls: close, finish, graceful joins
  • Read-only live state access: OctoReadHandle::with_parent(...)
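The overall shape of the pipeline listed above can be sketched with `std::sync::mpsc` and `std::thread`. This is a toy model, not `OctoCore`: `run_toy_octo` and `ToyDelta` are hypothetical, the "sketch" is a single running total, and every input is forwarded as a delta rather than only on promotion.

```rust
use std::sync::mpsc;
use std::thread;

/// Toy delta: "add `value` to the parent total".
struct ToyDelta {
    value: u64,
}

pub fn run_toy_octo(inputs: Vec<u64>, num_workers: usize) -> u64 {
    assert!(num_workers > 0, "need at least one worker");

    // One shared delta channel: workers send, the aggregator receives.
    let (delta_tx, delta_rx) = mpsc::channel::<ToyDelta>();

    // One input channel per worker.
    let mut input_txs = Vec::with_capacity(num_workers);
    let mut workers = Vec::with_capacity(num_workers);
    for _ in 0..num_workers {
        let (tx, rx) = mpsc::channel::<u64>();
        let delta_tx = delta_tx.clone();
        input_txs.push(tx);
        workers.push(thread::spawn(move || {
            // The loop ends when this worker's input sender is dropped.
            for item in rx {
                delta_tx.send(ToyDelta { value: item }).expect("aggregator alive");
            }
        }));
    }
    // Drop the original sender so the aggregator stops once all workers exit.
    drop(delta_tx);

    let aggregator = thread::spawn(move || {
        let mut parent = 0u64;
        for d in delta_rx {
            parent += d.value; // stands in for parent.apply_delta(&d)
        }
        parent
    });

    // Round-robin dispatch.
    let mut next_worker = 0;
    for item in inputs {
        input_txs[next_worker].send(item).expect("worker alive");
        next_worker = (next_worker + 1) % num_workers;
    }

    // Close inputs, then join workers before the aggregator.
    drop(input_txs);
    for w in workers {
        w.join().expect("worker panicked");
    }
    aggregator.join().expect("aggregator panicked")
}
```

Shutdown here relies on channel disconnection: dropping the input senders ends the worker loops, and dropping the last delta sender ends the aggregator loop, so no explicit stop message is needed.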

Usable APIs for both execution styles

  • Batch: run_octo(...)
  • Streaming: OctoRuntime::new, insert, insert_batch, finish

Test coverage

  • Delta emission behavior at threshold/improvement
  • apply_delta correctness
  • run_octo behavior for CMS/CS/HLL
  • Single-worker edge case
  • Streaming-vs-batch consistency
  • Runtime lifecycle behavior (close idempotence, insert-after-close panic)
  • Live read-handle observation semantics

Related sketch-level tests also cover insert_emit_delta/apply_delta behavior in the CMS/CS/HLL files.
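To make the lifecycle items concrete, here is one way close idempotence and insert-after-close panics can fall out of holding the sender in an `Option`. `ToyRuntime` is a hypothetical single-worker stand-in; the real `OctoRuntime` signatures are not reproduced here.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Toy streaming runtime: one worker that sums everything it receives.
pub struct ToyRuntime {
    tx: Option<Sender<u64>>,
    worker: Option<JoinHandle<u64>>,
}

impl ToyRuntime {
    pub fn new() -> Self {
        let (tx, rx) = mpsc::channel::<u64>();
        // The worker exits once the sender is dropped.
        let worker = thread::spawn(move || rx.iter().sum::<u64>());
        Self { tx: Some(tx), worker: Some(worker) }
    }

    /// Panics if called after `close`.
    pub fn insert(&self, item: u64) {
        self.tx
            .as_ref()
            .expect("insert after close")
            .send(item)
            .expect("worker alive");
    }

    /// Idempotent: taking an already-empty Option is a no-op.
    pub fn close(&mut self) {
        self.tx.take();
    }

    /// Close the input and wait for the worker's result.
    pub fn finish(mut self) -> u64 {
        self.close();
        self.worker
            .take()
            .expect("worker handle present")
            .join()
            .expect("worker panicked")
    }
}
```

Because `finish` consumes the runtime, calling it twice is a compile error rather than a runtime failure in this sketch.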

Current Limitations / Known Issues

  • Thread communication overhead in OctoCore remains a bottleneck.
  • In current benchmarking, Octo-style execution is not yet faster than full-merge baseline.
  • queue_capacity remains a compatibility field; transport currently uses unbounded MPSC.
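If queue_capacity is wired in later, the standard library's bounded channel is one option. The sketch below only shows how `std::sync::mpsc::sync_channel` behaves at capacity; `bounded_delta_channel` and `try_fill` are hypothetical helpers, and whether this PR's transport would adopt this approach is an open question.

```rust
use std::sync::mpsc::{self, Receiver, SyncSender, TrySendError};

/// Bounded alternative to `mpsc::channel()`: at most `queue_capacity`
/// messages are buffered before senders block (or `try_send` fails).
pub fn bounded_delta_channel(queue_capacity: usize) -> (SyncSender<u64>, Receiver<u64>) {
    mpsc::sync_channel(queue_capacity)
}

/// Push up to `attempts` items without ever receiving, and report how
/// many the channel accepted before it was full.
pub fn try_fill(queue_capacity: usize, attempts: usize) -> usize {
    // Keep the receiver alive so the channel stays connected.
    let (tx, _rx) = bounded_delta_channel(queue_capacity);
    let mut accepted = 0;
    for i in 0..attempts {
        match tx.try_send(i as u64) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}
```

A bounded queue adds backpressure on workers when the aggregator falls behind, at the cost of possible blocking on the send path.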
Why This Is Still Valuable

Even without a speedup yet, this PR establishes the full functional Octo path:

  • a clear delta contract,
  • sketch-native emit/apply primitives,
  • generic runtime interfaces.

@GordonYuanyc GordonYuanyc force-pushed the yc-octo branch 3 times, most recently from df861bc to 4dbc63a Compare March 10, 2026 03:50