
Fix #408: [Model] ExpectedRetrievalCost #743

Merged
isPANN merged 5 commits into main from issue-408
Mar 22, 2026

Conversation

@GiggleLiu
Contributor

Summary

Add the implementation plan for the ExpectedRetrievalCost model and execute it in follow-up commits.

Fixes #408

@codecov

codecov bot commented Mar 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.63%. Comparing base (84e189b) to head (fddfd46).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #743    +/-   ##
========================================
  Coverage   97.62%   97.63%            
========================================
  Files         443      445     +2     
  Lines       54620    54775   +155     
========================================
+ Hits        53323    53478   +155     
  Misses       1297     1297            


@GiggleLiu
Contributor Author

Implementation Summary

Changes

  • Added the new ExpectedRetrievalCost misc model in src/models/misc/expected_retrieval_cost.rs, including schema metadata, validation, expected-cost evaluation helpers, satisfaction semantics, and an example-db-backed canonical YES instance.
  • Added focused unit coverage in src/unit_tests/models/misc/expected_retrieval_cost.rs for accessors, mass/cost computation, YES/NO instances, brute-force validation, and serde round-tripping.
  • Wired the model into the crate exports and misc registry surfaces in src/models/misc/mod.rs, src/models/mod.rs, and src/lib.rs.
  • Added CLI creation support in problemreductions-cli/src/cli.rs and problemreductions-cli/src/commands/create.rs with --probabilities, --num-sectors, and --latency-bound handling plus CLI tests.
  • Documented the problem in docs/paper/reductions.typ and added the Cody/Coffman reference in docs/paper/references.bib.
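The expected-cost evaluation described above can be sketched as a pair of standalone helpers. This is a hypothetical reconstruction from the review notes, not the crate's actual code: the function names mirror the `sector_masses()` / `expected_cost()` helpers mentioned below, and the forward circular distance `(j - i) mod m` is an assumed latency model, since the PR does not spell out the exact formula.

```rust
/// Aggregate per-record access probabilities into per-sector masses.
/// `config[r]` is the 0-based sector id assigned to record `r`.
fn sector_masses(probabilities: &[f64], config: &[usize], num_sectors: usize) -> Vec<f64> {
    let mut masses = vec![0.0; num_sectors];
    for (&p, &sector) in probabilities.iter().zip(config) {
        masses[sector] += p;
    }
    masses
}

/// Expected retrieval cost as an ordered-pair sum over sector masses.
/// The latency model here is an assumption: moving from sector i to
/// sector j costs the forward circular distance (j - i) mod m.
fn expected_cost(masses: &[f64], num_sectors: usize) -> f64 {
    let m = num_sectors;
    let mut cost = 0.0;
    for i in 0..m {
        for j in 0..m {
            let latency = ((j + m - i) % m) as f64; // forward circular distance
            cost += masses[i] * masses[j] * latency;
        }
    }
    cost
}

fn main() {
    // Three records with distinct sectors; masses equal the probabilities.
    let probs = [0.5, 0.3, 0.2];
    let config = [0, 1, 2];
    let masses = sector_masses(&probs, &config, 3);
    println!("cost = {}", expected_cost(&masses, 3));
}
```

A decision instance would then satisfy iff this cost is at most the latency bound, matching the satisfaction semantics noted in the review.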

Deviations from Plan

  • The paper section uses a compact table-based worked example instead of a custom visual layout: the initial Typst-heavy approach introduced parsing fragility during make paper.

Open Questions

  • None.

@GiggleLiu
Contributor Author

Agentic Review Report

Structural Check

Structural Review: model ExpectedRetrievalCost

Structural Completeness

1. Model file exists: PASS — src/models/misc/expected_retrieval_cost.rs is present
2. inventory::submit! present: PASS — schema and size-field registrations are present
3. Serialize / Deserialize derive: PASS
4. Problem trait impl: PASS
5. SatisfactionProblem impl: PASS
6. #[cfg(test)] + #[path = ...] test link: PASS
7. Dedicated model test file exists: PASS — src/unit_tests/models/misc/expected_retrieval_cost.rs
8. Test file has >= 3 test functions: PASS — 7 tests
9. Registered in misc/mod.rs: PASS
10. Re-exported in models/mod.rs: PASS
11. declare_variants! entry exists: PASS — default sat ExpectedRetrievalCost => "num_sectors ^ num_records"
12. CLI alias resolution entry: PASS — manual problem_name.rs edits are not required because alias/canonical resolution is registry-backed
13. CLI create support: PASS
14. Canonical model example registered: PASS — src/models/misc/mod.rs extends expected_retrieval_cost::canonical_model_example_specs(), which is already consumed by src/example_db/model_builders.rs
15. Paper display-name entry: PASS
16. Paper problem-def block: PASS
17. Deterministic packet checks: PASS — packet completeness passed; the packet whitelist warning was reviewed manually and every touched file is directly related to the new model surface

Build Status

  • make test: PASS
  • make clippy: PASS
  • make paper: PASS

Semantic Review

  • evaluate() correctness: OK — invalid configs are rejected via sector_masses(), and the ordered-pair cost matches the circular latency definition on independent brute-force checks for both the YES and NO examples.
  • dims() correctness: OK — vec![num_sectors; num_records] matches one m-ary variable per record.
  • Size getter consistency: OK — num_records() and num_sectors() exist and match the declared complexity string.
  • Example-db / paper consistency: OK — pred create --example ExpectedRetrievalCost, pred solve, and pred evaluate all agree with the paper example; the paper configuration has cost 1.0025, and brute force finds 54 satisfying assignments with the same minimum cost.
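The dims() convention checked above can be illustrated with a minimal sketch. The free function below is hypothetical (the real dims() is a method on the model type); it shows that one m-ary variable per record yields a configuration space of num_sectors ^ num_records points, consistent with the 6-record, 3-sector paper example evaluated with config 0,1,2,1,0,2.

```rust
// Hypothetical free-function sketch of the dims() shape; the real
// implementation is a method on the ExpectedRetrievalCost model type.
fn dims(num_records: usize, num_sectors: usize) -> Vec<usize> {
    // One variable per record; each variable ranges over the m sector ids.
    vec![num_sectors; num_records]
}

fn main() {
    // The paper example has 6 records and 3 sectors (config 0,1,2,1,0,2).
    let d = dims(6, 3);
    let space: usize = d.iter().product();
    println!("{} variables, {} configurations", d.len(), space); // 6 variables, 729 configurations
}
```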

Issue Compliance (if linked issue found)

1. Problem name matches issue: OK
2. Mathematical definition matches: OK — circular latency and sector-mass aggregation follow the linked issue’s formulation
3. Problem type (opt/sat) matches: OK — implemented as a decision problem with Metric = bool
4. Type parameters match: OK — none
5. Configuration space matches: OK — 0-based sector ids are equivalent to the issue’s 1..m notation
6. Feasibility check matches: OK — valid configs satisfy iff the expected cost is at most the bound
7. Objective function matches: OK — the ordered-pair sum matches the linked examples and independent brute-force validation
8. Complexity matches: OK — registered baseline num_sectors ^ num_records matches the documented brute-force search
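The num_sectors ^ num_records baseline checked above amounts to enumerating every assignment of records to sectors. The sketch below is a generic mixed-radix enumeration, not the crate's actual solver; the function name and signature are illustrative only.

```rust
// Illustrative mixed-radix enumeration behind the registered complexity
// "num_sectors ^ num_records" (not the crate's brute-force solver).
// Each of the num_records digits ranges over 0..num_sectors.
fn for_each_config(num_records: usize, num_sectors: usize, mut visit: impl FnMut(&[usize])) {
    let mut config = vec![0usize; num_records];
    loop {
        visit(&config);
        // Increment the mixed-radix counter.
        let mut pos = 0;
        while pos < num_records {
            config[pos] += 1;
            if config[pos] < num_sectors {
                break;
            }
            config[pos] = 0;
            pos += 1;
        }
        if pos == num_records {
            return; // wrapped around: every configuration has been visited
        }
    }
}

fn main() {
    let (n, m) = (6, 3);
    let mut count = 0usize;
    for_each_config(n, m, |_| count += 1);
    println!("visited {count} configurations"); // m^n = 3^6 = 729
}
```

A brute-force satisfiability check would evaluate the expected cost at each visited configuration and stop at the first one within the latency bound.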

Summary

  • 17/17 structural checks passed
  • 8/8 issue compliance checks passed
  • No structural or semantic issues identified

Quality Check

Quality Review

Design Principles

  • DRY: OK — model logic is cleanly split into sector_masses(), expected_cost(), and is_valid_solution(), while CLI validation follows the existing per-model pred create pattern.
  • KISS: OK — the implementation is direct and mirrors the math without unnecessary abstraction.
  • HC/LC: OK — problem logic, CLI wiring, tests, and paper documentation remain in their usual module boundaries.

HCI (if CLI/MCP changed)

  • Error messages: OK — missing required flags produce actionable errors with a concrete usage string; verified with pred create ExpectedRetrievalCost ... --num-sectors 3.
  • Discoverability: OK — the model appears in pred list, pred show ExpectedRetrievalCost exposes fields/size fields/complexity, and pred create --example ExpectedRetrievalCost works.
  • Consistency: OK — flag names and help text follow existing pred create conventions.
  • Least surprise: OK — pred solve returns a satisfying assignment and pred evaluate confirms the paper configuration; both agree with independent brute-force math.
  • Feedback: OK — pred create reports the output path and solver/evaluation commands return structured JSON.

Test Quality

  • Naive test detection: OK
    • The new tests assert exact masses and costs, invalid-config rejection, serialization round-tripping, solver success, and the published YES/NO examples.

Issues

Critical (Must Fix)

None.

Important (Should Fix)

None.

Minor (Nice to Have)

None.

Summary

  • No design, HCI, or test-quality issues identified in the changed files.

Agentic Feature Tests

Feature Test Report: problem-reductions

Date: 2026-03-22
Project type: CLI + library
Features tested: ExpectedRetrievalCost model discovery, example creation, solving, evaluation
Profile: review-pipeline ephemeral
Use Case: A downstream CLI user wants to discover the new model from the catalog, inspect its schema, create the canonical example, solve it with brute force, and validate the paper configuration.
Expected Outcome: The new model is discoverable and the example flow works end to end from the CLI alone.
Verdict: pass
Critical Issues: 0

Summary

Feature: ExpectedRetrievalCost CLI flow — Discoverable: yes; Setup: yes; Works: yes; Expected Outcome Met: yes; Doc Quality: good

Per-Feature Details

ExpectedRetrievalCost

  • Role: downstream CLI user validating a newly added model from catalog and docs.
  • Use Case: list the model, inspect it, create the canonical example, solve it, and evaluate the paper configuration.
  • What they tried: pred list, pred show ExpectedRetrievalCost, pred create --example ExpectedRetrievalCost, pred solve ... --solver brute-force, and pred evaluate ... --config 0,1,2,1,0,2.
  • Discoverability: good — the model appears in the catalog and pred show exposes the schema and complexity.
  • Setup: good — the workspace CLI built and ran without extra manual setup.
  • Functionality: good — example creation wrote JSON, solving returned a satisfying configuration [0,0,1,1,2,2], and evaluating the paper config returned true.
  • Expected vs Actual Outcome: matched — independent brute-force verification reproduced the paper cost 1.0025, the 54 satisfying assignments for the YES instance, and the NO-instance optimum 1.07 > 0.5.
  • Blocked steps: pred reduce was not applicable because this PR adds a model, not a reduction rule; pred show correctly reports 0 incoming/outgoing reductions for the current graph state.
  • Friction points: none.
  • Doc suggestions: none.

Expected vs Actual Outcome

The expected end-to-end model flow was achieved from the CLI alone. The user-facing outputs matched both the implementation and the paper example.

Issues Found

None.

Suggestions

None.


Generated by review-pipeline

isPANN added 2 commits March 22, 2026 23:49
# Conflicts:
#	problemreductions-cli/src/commands/create.rs
#	src/models/mod.rs
isPANN mentioned this pull request Mar 22, 2026
isPANN merged commit e8fbdbe into main Mar 22, 2026
5 checks passed
GiggleLiu deleted the issue-408 branch April 12, 2026 00:53

Development

Successfully merging this pull request may close these issues.

[Model] ExpectedRetrievalCost
