Feat/week2 completion by vimscientist69 · Pull Request #2 · vimscientist69/PropSignal

vimscientist69 · 2026-04-27T10:34:27Z

No description provided.

- Introduced segment-based thresholds for scoring evaluation, defining metrics for `top_band`, `middle_band`, and `bottom_band` to enhance diagnostic capabilities.
- Updated scoring configuration to reflect new stability metrics, including Jaccard and rank correlation thresholds for each segment.
- Enhanced tests to validate the integration of segment thresholds and ensure correct evaluation reporting.
- Improved documentation to clarify the purpose and structure of the new segment-based stability diagnostics.
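To make the segment checks concrete, here is a minimal sketch of how per-band Jaccard overlap and Spearman rank correlation could be computed and compared against thresholds. It is not the PR's code: the equal-thirds band split, the threshold keys `min_jaccard` and `min_rank_correlation`, and the function names are all assumptions made for illustration.

```python
from typing import Dict, Hashable, List, Sequence


def jaccard(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Jaccard overlap: size of intersection over size of union."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Two empty segments are treated as identical.
    return len(set_a & set_b) / len(union) if union else 1.0


def spearman(baseline: Sequence[Hashable], perturbed: Sequence[Hashable]) -> float:
    """Spearman rank correlation over items present in both rankings (no ties)."""
    in_perturbed = set(perturbed)
    common = [item for item in baseline if item in in_perturbed]
    n = len(common)
    if n < 2:
        return 1.0  # assumption: too few shared items to disagree on order
    in_common = set(common)
    base_rank = {item: i for i, item in enumerate(common)}
    pert_order = (item for item in perturbed if item in in_common)
    pert_rank = {item: i for i, item in enumerate(pert_order)}
    d2 = sum((base_rank[x] - pert_rank[x]) ** 2 for x in common)
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def bands(ranking: Sequence[Hashable]) -> Dict[str, List[Hashable]]:
    """Hypothetical split: top third, middle third, bottom third."""
    n = len(ranking)
    lo, hi = n // 3, n - n // 3
    return {
        "top_band": list(ranking[:lo]),
        "middle_band": list(ranking[lo:hi]),
        "bottom_band": list(ranking[hi:]),
    }


def evaluate_segments(baseline, perturbed, thresholds):
    """Return per-band metrics plus a pass/fail flag against the thresholds."""
    base_bands, pert_bands = bands(baseline), bands(perturbed)
    report = {}
    for name, limits in thresholds.items():
        j = jaccard(base_bands[name], pert_bands[name])
        rho = spearman(base_bands[name], pert_bands[name])
        report[name] = {
            "jaccard": j,
            "rank_correlation": rho,
            "passed": j >= limits["min_jaccard"]
            and rho >= limits["min_rank_correlation"],
        }
    return report
```

In this sketch a band can keep exactly the same members (Jaccard of 1.0) and still fail on rank correlation when their order changes, which is presumably why the PR tracks both metrics per segment.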

…hresholds

- Added percentage-based metrics for median absolute rank shift and p90 rank shift to enhance evaluation sensitivity.
- Updated scoring configuration and evaluation logic to incorporate new percentage thresholds for segment stability checks.
- Adjusted tests to validate the integration of percentage-based metrics and ensure correct evaluation reporting.
- Enhanced documentation to clarify the purpose and structure of the new percentage-based metrics in scoring evaluation.
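As a rough illustration of the rank shift metrics, the sketch below computes the median and 90th-percentile absolute rank shift between two rankings, both as raw positions and as a percentage. The PR does not say what the percentage is relative to; this sketch assumes the length of the baseline ranking, uses a nearest-rank percentile, and uses hypothetical key names throughout.

```python
from statistics import median
from typing import Dict, Hashable, Sequence


def rank_shift_metrics(
    baseline: Sequence[Hashable], perturbed: Sequence[Hashable]
) -> Dict[str, float]:
    """Absolute rank shifts for items present in both rankings."""
    pert_pos = {item: i for i, item in enumerate(perturbed)}
    shifts = sorted(
        abs(i - pert_pos[item])
        for i, item in enumerate(baseline)
        if item in pert_pos
    )
    n = len(shifts)
    if n == 0:
        return {
            "intersection_count": 0,
            "median_abs_rank_shift": 0.0,
            "p90_abs_rank_shift": 0.0,
            "median_abs_rank_shift_pct": 0.0,
            "p90_abs_rank_shift_pct": 0.0,
        }
    med = float(median(shifts))
    # Nearest-rank p90: index ceil(0.9 * n) - 1, computed with integers.
    p90 = float(shifts[(9 * n + 9) // 10 - 1])
    # Assumption: percentages are relative to the baseline ranking length.
    scale = 100.0 / len(baseline)
    return {
        "intersection_count": n,
        "median_abs_rank_shift": med,
        "p90_abs_rank_shift": p90,
        "median_abs_rank_shift_pct": med * scale,
        "p90_abs_rank_shift_pct": p90 * scale,
    }
```

Expressing shifts as a percentage makes a threshold comparable across datasets of different sizes: a shift of 9 positions is large in a 10-item ranking but negligible in a 10,000-item one.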

…lacement measures

- Updated implementation steps to include segment overlap and rank correlation metrics for `top_band`, `middle_band`, and `bottom_band`.
- Introduced global rank displacement metrics to enhance evaluation sensitivity and context.
- Adjusted stability thresholds and auditability requirements in the scoring configuration.
- Enhanced documentation to reflect the new metrics and their implications for scoring evaluation.

…resholds

- Introduced tests for rank displacement metrics, validating expected values for intersection count and rank shifts.
- Added tests for top-band perturbation thresholds, ensuring correct evaluation outcomes for both pass and fail scenarios.
- Updated existing tests to remove outdated comments and improve clarity on evaluation logic.
- Enhanced documentation to reflect the new tests and their significance in scoring evaluation.

… contracts

- Revised PROJECT_NOTE.md to reflect the finalized goals and deliverables for the advanced scoring system (`advanced_v2`), including evaluation gates and structured reasoning payloads.
- Updated week2-implementation-playbook.md to clarify the scope and execution order, emphasizing the single source of truth for implementation details.
- Enhanced week2-interface-contract.md to define output expectations and scope boundaries, ensuring clarity on in-scope and out-of-scope elements for Week 2.

…trics

- Updated PROJECT_NOTE.md to include new insights on scoring evaluation metrics and their implications.
- Revised week2-implementation-playbook.md to improve clarity on execution steps and responsibilities.
- Enhanced week2-interface-contract.md to better define output expectations and scope for the upcoming evaluation phase.

- Updated PROJECT_NOTE.md to include required performance baseline handoff updates and metrics context.
- Added a new CLI command for benchmarking performance baselines, allowing users to assess API latency and SLOs.
- Enhanced week2-phase4-performance-baseline-implementation.md with detailed follow-up actions and required updates for the upcoming API implementations.
- Ensured documentation reflects the inclusion of dataset-size context and throughput metrics for meaningful performance comparisons.

…rresponding tests

- Updated PropfluxListing model configuration to accept unknown fields, enhancing compatibility with evolving data schemas.
- Added tests to validate ingestion of listings with extra fields, ensuring that records remain valid despite additional attributes.
- Implemented partial validation tests to confirm that unknown fields do not invalidate the payload, supporting forward compatibility.
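The PR does not say which validation library backs `PropfluxListing`. If it is a Pydantic model, the usual switch is `extra="allow"` in the model config. The standard-library sketch below only illustrates the behaviour described above: known fields are validated as usual and unknown fields are kept instead of causing a failure. The `Listing` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Mapping


@dataclass
class Listing:
    # Hypothetical known fields; the real model's schema is not shown in the PR.
    listing_id: str
    price: float
    # Unknown payload keys are preserved here rather than rejected.
    extras: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_payload(cls, payload: Mapping[str, Any]) -> "Listing":
        known = {f.name for f in fields(cls)} - {"extras"}
        core = {k: v for k, v in payload.items() if k in known}
        extras = {k: v for k, v in payload.items() if k not in known}
        # A missing required field still raises TypeError, so tolerance
        # applies only to additional attributes.
        return cls(**core, extras=extras)
```

For example, `Listing.from_payload({"listing_id": "L-1", "price": 1250000.0, "virtual_tour_url": "x"})` succeeds and stores `virtual_tour_url` under `extras`, so a source that adds attributes later does not invalidate existing ingestion.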

…ce baseline

- Updated the performance baseline service to resolve and store absolute paths for validation and evaluation report files, ensuring consistency in file references.
- Enhanced metrics path resolution for baseline summary and metrics files to prevent potential issues with relative paths.
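A minimal sketch of the path handling described above, assuming report locations are collected in a dictionary before being stored. The function name, dictionary keys, and file names are hypothetical; only `pathlib` behaviour is relied on.

```python
from pathlib import Path
from typing import Dict, Optional, Union

PathLike = Union[str, Path]


def resolve_report_paths(
    paths: Dict[str, PathLike], base_dir: Optional[PathLike] = None
) -> Dict[str, str]:
    """Return the same keys with absolute, normalized path strings."""
    base = Path(base_dir) if base_dir is not None else Path.cwd()
    resolved = {}
    for key, raw in paths.items():
        path = Path(raw).expanduser()
        if not path.is_absolute():
            # Anchor relative paths explicitly instead of relying on
            # whatever the working directory is when the file is read.
            path = base / path
        # resolve() is non-strict by default, so the file need not exist yet.
        resolved[key] = str(path.resolve())
    return resolved
```

Storing the resolved form means a baseline summary written from one working directory can still be opened by a later command run from another.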

- Introduced a new status "analyzed" to the IngestionJob model, expanding the range of job states for better tracking and management of ingestion processes.

- Added notes regarding a failure in scoring evaluation with minimal config changes, prompting a need for debugging.
- Included specific file paths related to the evaluation process for better tracking and context.

- Modified the `_ranking_identity_map` function to accept a list of `ScoreResult` objects instead of a job ID, improving the accuracy of identity mapping.
- Updated calls to `_ranking_identity_map` in `run_scoring_evaluation` to reflect the new parameter structure.
- Added a new test to ensure that identity mapping correctly utilizes scored listing IDs, enhancing the robustness of scoring evaluations.
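The signature change can be pictured roughly as follows. This is a sketch under assumptions, not the project's implementation: the `ScoreResult` fields shown here (`listing_id`, `source_key`, `score`) are hypothetical, and the real function may derive identities differently. The point it illustrates is that the map is built from exactly the rows that were scored, rather than from every listing attached to a job.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class ScoreResult:
    # Hypothetical fields for illustration only.
    listing_id: int
    source_key: str  # assumed stable identifier for the listing
    score: float


def ranking_identity_map(results: Sequence[ScoreResult]) -> Dict[int, str]:
    """Map each scored listing ID to its identity key.

    Because the input is the scored results themselves, listings that
    belong to the job but were never scored cannot leak into the map.
    """
    return {result.listing_id: result.source_key for result in results}
```

A test in the spirit of the one the PR adds would pass results for a subset of listings and assert that the map's keys match that subset exactly.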

- Revised PROJECT_NOTE.md to reflect the successful completion of Week 2, including Phase 5 validation outcomes and final scoring profile values.
- Updated current-project-status.md to indicate the transition to Week 3, highlighting the readiness of Week 2 outputs and outlining the next objectives for API/CLI/dashboard implementation.
- Added details on the final validation decision and decision artifact for better tracking of project progress.

…it_visualization.py

- Rearranged import statements for better organization, moving datetime import above others.
- Enhanced readability by formatting complex expressions and return statements across multiple lines.
- Updated the construction of HTML strings to use list comprehension for clarity and maintainability.
- Made minor adjustments to variable assignments for improved consistency and readability.

…tion and performance baseline services

- Updated type annotations for `slo_assessment` in `performance_baseline.py` to specify dictionary structure.
- Enhanced type annotations for parameters in `_compute_jaccard` and `_spearman_rank_correlation` functions in `scoring_evaluation.py` to use `Sequence` for better flexibility.
- Simplified the assignment of `identities` in `_ranking_identity_map` for improved readability.
- Consolidated the construction of `fallback_order` in `generate_top5_audit_visualization.py` for cleaner code.
- Removed unnecessary blank lines in various test files to maintain consistency and cleanliness in the codebase.

vimscientist69 added 17 commits April 22, 2026 14:28

Merge branch 'main' into feat/week2-completion

2774443

feat: add new status to IngestionJob model for enhanced tracking

ac0ade8

- Introduced a new status "analyzed" to the IngestionJob model, expanding the range of job states for better tracking and management of ingestion processes.

docs: update PROJECT_NOTE.md with debugging notes and file references

3023550

- Added notes regarding a failure in scoring evaluation with minimal config changes, prompting a need for debugging.
- Included specific file paths related to the evaluation process for better tracking and context.

vimscientist69 merged commit 6fdb2fd into main Apr 27, 2026
2 checks passed


vimscientist69 merged 17 commits into main from feat/week2-completion
