Skip to content

Conversation

@cristianoc
Copy link
Contributor

No description provided.

Add DCE formalization and clarify reduce semantics

DCE formalization (main change):
- Add lean-formalisation/DCE.lean: complete Lean formalization of
  reactive Dead Code Elimination with graph fragments, reachability
  computation, and refcount maintenance (1346 lines)
- Add dce_reactive_view.tex: paper describing reactive DCE using
  reducers and mappers for aggregating distributed program graphs
- Update lean-formalisation/lakefile.toml to include DCE in
  defaultTargets and add DCE as a lean_lib entry
- Remove lean-formalisation/Reduce/Basic.lean (refactored to
  top-level Reduce.lean)

reduce.tex clarifications:
- Add "Asymmetric inverse law" remark after well-formedness definition,
  explaining why only $(a \oplus v) \ominus v = a$ is required (not the
  reverse), and how this ensures incremental updates work correctly
- Add "Why remove then add?" remark after delta application definition,
  explaining that the order mirrors incremental update semantics
- Add "Order matters when Δ⁺ and Δ⁻ overlap" example demonstrating
  that remove-then-add is essential when working with sets (add-then-remove
  fails when deltas overlap, incorrectly removing newly added elements)

Also regenerate reduce.pdf and add dce_reactive_view.pdf
Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- REACTIVE_CALCULUS_NOTES.md: Add Skip reducer laws (commutativity, invertibility),
  document JSON key ordering, structural operators, per-key aggregation,
  lazy/compute nodes, and DCE case study (Section 9)

- bindings/SkipruntimeCore.res: Document JSON.t structure, Skip JSON ordering,
  and boolean serialization issue for slice/slices/take methods

- examples/JsonOrderingService.ts + JsonOrderingHarness.res: Comprehensive test
  for Skip JSON key ordering behavior (null < bools < numbers < strings < arrays < objects)
  and verification that boolean keys don't collide with numeric 0/1

- examples/BoolKVService.ts + BoolKVHarness.res: Minimal test for boolean
  key/value handling

- examples/LiveHarnessService.js: Refactor static fields for JS compatibility

- tsconfig.json: Simplify include pattern to examples/*.ts

- AGENTS.md: Add workspace rules for AI assistants
Comprehensive analysis of ~48 examples from the catalogue:
- Classify each by primitives needed (structural, standard/enriched reducer, partial, compute node)
- Identify that >50% use only standard reducers (count, sum)
- Discover simpler solutions for several 'complex' examples:
  - Top-K: structural via key ordering (no reducer)
  - Joins: map with context lookups
  - Resettable accumulator: epoch-based keys
  - Sliding windows: external eviction (Skip idiom)

Extended deep dives for 8 complex examples:
- Top-K, HLL, sliding window, undo/redo, resettable accumulator
- Acyclic joins, counting/DRed views, fixpoint algorithms

Propose calculus primitives in 5 tiers:
1. Structural (map, slice, take, merge)
2. Standard reducers (count, sum)
3. Combinators (product, mapInput, mapOutput)
4. Enriched patterns (average, freqMap, histogram)
5. Compute nodes (lazyCompute, fixpoint)

Reference DCE case study as concrete instantiation of the layered approach.
- Remove DBToaster subsection from Example 6.1
- Add as alternative approach under Example 6.3
- Adapt content to focus on join materialization (S ⋈ T)
- Add trade-off table comparing idiomatic vs DBToaster-style patterns
…tecture

- Frames DCE Layer 2 as instance of incremental fixpoint computation
- Proposes Level 1: low-level API with stepFromDelta and derivationCount
- Proposes Level 2: future DSL with automatic derivation (auto-diff analogy)
- Documents connection to semi-naive evaluation and counting-based deletion
- Lists examples beyond DCE: type inference, points-to, call graphs, Datalog
Formalizes the low-level incremental fixpoint API:

Core abstractions:
- MonotoneOp: monotone operator on sets
- DecomposedOp: F(S) = base ∪ step(S) decomposition
- isLeastFixpoint: fixpoint contained in all prefixpoints

Semi-naive evaluation:
- semiNaiveStep, semiNaiveIter, semiNaiveN
- semiNaive_subset_lfp (partial proof)

Expansion/Contraction:
- expands/contracts: operator comparison
- lfp_mono_expand: F ⊆ F' implies lfp(F) ⊆ lfp(F')
- lfp_mono_contract: F' ⊆ F implies lfp(F') ⊆ lfp(F)

Level 1 API:
- IncrFixpointConfig: stepFromDelta + derivationCount
- IncrState, ValidState, IncrResult
- expansion_correct, contraction_correct theorems

DCE instance:
- dceOp, dceStepFromDelta, dceDerivationCount
- dceConfig with correctness proofs
IncrementalFixpoint.lean:
- Add cascade algorithm: cascadeStep, cascadeN, cascadeFix
- Prove cascade soundness (cascade_sound_stable)
- Prove stability persistence and fixpoint characterization
- Add overall correctness theorems:
  - expansion_correctness (assuming step additivity)
  - contraction_correctness (S' ⊆ K* direction proven)
- Add lfp'_subset_cascadeN: new fixpoint survives cascade
- Define stepAdditive, monoDerivCount properties

incremental_fixpoint_notes.tex:
- Expand Formal Definitions and Correctness with all theorems
- Add Overall Correctness section with expansion/contraction theorems
- Update Lean Formalization remark with detailed proof status
- Document assumptions (step additivity, count monotonicity)

2 remaining sorry's:
- semiNaive_stable_step_subset (requires step additivity proof)
- contraction_correctness K* ⊆ S' direction (future work)
IncrementalFixpoint.lean:
- Add iterative construction: iterF, iterFLimit
- Add rank comparison: firstAppears, rankLt, inLimit
- Add well-founded cascade: wfShouldDie, wfCascadeStep, wfCascadeN, wfCascadeFix
- Prove wf_contraction_correctness (main theorem)
- Prove wfCascade_removes_non_lfp' (non-lfp' elements removed)
- Prove no_rankLt_to_non_limit (key lemma)
- Prove base_subset_wfCascadeN, wfCascadeN_mono

incremental_fixpoint_notes.tex:
- Add Well-Founded Derivations section
- Document iterative construction and rank
- Document well-founded cascade algorithm
- Add well-founded contraction correctness theorem
- Update Lean formalization remarks

Key insight: elements not in lfp' have no finite rank under F',
so nothing has strictly lower rank, hence no wf-derivers.
IncrementalFixpoint.lean:
- Complete iterFLimit_has_wf_deriver (no sorry)
- Add exists_first_appearance helper
- Add first_appearance_has_deriver helper
- Complete lfp'_subset_wfCascadeN (no sorry)
- wf_contraction_correctness now fully proven

The well-founded approach is complete:
- Elements not in lfp' have no finite rank
- No element has strictly lower rank than them
- So they have no wf-derivers and are removed
- Meanwhile, lfp' elements always have wf-derivers

Remaining 2 sorry's:
- semiNaive_stable_step_subset (expansion with additivity)
- contraction_correctness (old simple counting, superseded)
IncrementalFixpoint.lean:
- Add stable_step_delta_subset: step(delta_n) ⊆ current_n when stable
- Add current_eq_union_delta: current_{n+1} = current_n ∪ delta_{n+1}
- Add step_delta_subset_next: step(delta_i) ⊆ current_{i+1}

The expansion proof (semiNaive_stable_step_subset) still has one sorry
due to the complexity of proving current_n = ⋃ delta_i.

Status: 2 sorry's remain
- semiNaive_stable_step_subset (expansion with additivity)
- contraction_correctness (old simple counting, superseded by wf_contraction_correctness)
- Remove unused lemmas: iterF_mono', base_inLimit, wfCascadeN_mono,
  base_subset_wfCascadeN, semiNaive_stable_step_delta, stable_step_delta_subset,
  current_eq_union_delta, step_delta_subset_next, lfp_contract_eq, inLimit, inLimit_iff
- Remove unused definitions: wfDerives, expansionDelta, newlyDead
- Remove unused structures: IncrState, ValidState, IncrResult, ValidConfig
- Remove old 'Counting-Based Cascade' section from TeX (superseded by well-founded)
- Integrate well-founded derivations into main flow in TeX

Lean file: 763 -> 625 lines, still 1 sorry (semiNaive_stable_step_subset)
Main theorems preserved: wf_contraction_correctness, expansion_correctness,
incremental_update_correct
- Prove semiNaive_stable_step_subset: step(current) ⊆ current when stable
  Key insight: induction on m ≤ n using additivity to decompose step(current_m)
- Add helper lemmas: semiNaiveCurrent_mono', step_delta_subset_next,
  stable_step_delta_subset, current_union_delta
- Update header: 'All proofs complete (no sorry)'
- Update .tex: reflect fully proven status

The incremental fixpoint API is now fully verified in Lean:
- Semi-naive expansion correctness
- Well-founded cascade contraction correctness
- Main theorem: incremental_update_correct
Lean file:
- Add ImplConfig with stepFwd, stepInv for efficient implementation
- Add ImplState with current set and rank map
- Add detailed pseudo-code for expand (BFS) and contract (worklist cascade)
- Add dceImplConfig showing DCE as instance
- Explain why complexity matches dedicated implementations

TeX file:
- Replace abstract 'gaps' discussion with concrete pseudo-code
- Add extended API (stepInv, rank storage)
- Add expansion algorithm (BFS) with O(new + edges) complexity
- Add contraction algorithm (worklist) with O(dying + edges) complexity
- Add detailed remark on gaps between proofs and algorithms:
  * Refinement (obvious), Termination (clear), stepInv (assumed)

Key insight: storing ranks (one int per element) makes well-founded
check O(1), achieving optimal complexity matching dedicated DCE.
Reorganize section into:
1. API Specification: types, configuration, state, required properties
2. Algorithms: expansion (BFS) and contraction (worklist) pseudo-code
3. Correctness and Analysis: proven theorems, complexity table, gap analysis
4. Formal Definitions: concise reference for Lean proofs

- Add theorem environment to preamble
- Remove redundant explanations
- Add clear tables for types, complexity, and gaps
- Reduce from 473 to 394 lines (7 pages)
- Add remark explaining Simple vs Full API:
  * Simple (expansion only): base + stepFwd
  * Full (expansion + contraction): adds stepInv + rank
- Mark stepInv as optional in configuration table
- Note that system can build stepInv from stepFwd if not provided
- Clarify state: rank only needed for contraction
- Update DCE example to show stepInv as optional

Many applications only need expansion (monotonically growing).
The simple API is easier to implement.
- Fixpoint.res: Low-level algorithm requiring user to provide stepFwd
  and manage delta consistency (for experts/special cases)

- SkipruntimeFixpoint.res: Managed API that owns the step relation,
  eliminating user obligations (recommended for most use cases)

- DCEExample.res: Dead code elimination example using managed API

- FixpointTest.res: Comprehensive tests for the fixpoint algorithms

- Updated incremental_fixpoint_notes.tex to clarify:
  - User obligations (low-level API only)
  - Managed API eliminates all obligations
  - Automatic properties (monotonicity, stepInv correctness, etc.)
- Layer 1 (reactive aggregation): Kept as-is, still valuable
- Layer 2 (incremental DCE): Now just references incremental_fixpoint_notes.tex
- Reduced from ~300 lines to ~100 lines
- DCE-specific algorithms replaced by generic fixpoint instantiation

Note: lean-formalisation/DCE/Layer2/ is now redundant with IncrementalFixpoint.lean
- Semi-naive evaluation (Bancilhon & Ramakrishnan)
- Differential Dataflow (McSherry et al.)
- Well-founded semantics (Van Gelder et al.)
- DRed algorithm (Gupta et al.)
- Tarski's fixpoint theorem
Major changes:
- Replaced Belt.Map/Set with native JS Map/Set for O(1) operations
- Fixed algorithmic bug where contraction with stale ranks could
  incorrectly remove elements reachable via alternative paths
- Added re-derivation phase after contraction to ensure completeness

Implementation:
- Fixpoint.res: Complete rewrite using native JS collections
- SkipruntimeFixpoint.res: Converted to native JS collections
- Removed all Belt dependencies from .res files

Formalization:
- Updated Lean proof with cascadeAndRederive theorem
- Added cascadeN_stabilizes axiom for finiteness assumption
- All proofs complete (no sorry statements)

Testing:
- 35 comprehensive tests in FixpointTest.res
- Added Alternative Path demo to DCEExample.res
- All tests pass

Documentation:
- Updated incremental_fixpoint_notes.tex with corrected algorithm
- Updated interface files with comprehensive documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants