docs: record h2 order-control scout results #312

Merged
DeliciousBuding merged 1 commit into main from docs/h2-order-control-scout-results-20260525
May 24, 2026

Conversation

@DeliciousBuding
Owner

Summary

  • record the H2 output-cloud 256 / 256 shared-position order-control scout results
  • add the four small curated JSON artifacts for shared-position and same-size class-ordered subset reviews
  • update Research docs to state that the class-ordered seed offset is not a sufficient explanation for the signal, while keeping H2 output-cloud candidate-only and out of the admitted Platform/Runtime bundle

Decision

  • no Platform/Runtime schema, runner, UI type, or admitted bundle row
  • no same-cache feature sweep
  • no full 512 / 512 shared-position rerun selected by default

Verification

  • python -X utf8 -m pytest tests/test_run_h2_response_strength_validation.py tests/test_review_h2_output_cloud_geometry_script.py -q
  • python -X utf8 scripts/check_markdown_links.py
  • python -X utf8 scripts/check_public_surface.py
  • python -X utf8 scripts/export_admitted_evidence_bundle.py --check
  • python -X utf8 scripts/run_pr_checks.py

Copilot AI review requested due to automatic review settings May 24, 2026 22:08
@DeliciousBuding DeliciousBuding merged commit 67e6ab6 into main May 24, 2026
2 of 3 checks passed
@DeliciousBuding DeliciousBuding deleted the docs/h2-order-control-scout-results-20260525 branch May 24, 2026 22:09

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the documentation and artifact indices to reflect the results of the H2 output-cloud geometry order-control scout. The changes confirm that the research-side signal remains strong after addressing the class-ordered seed-offset caveat. Feedback from the reviewer focuses on technical consistency within the new JSON artifact files, specifically recommending the use of forward slashes in file paths for cross-platform compatibility and correcting misleading or stale metadata fields.

"track": "black-box",
"method": "H2 output-cloud geometry scorer",
"mode": "cpu-cache-review",
"response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-class-ordered-subset-20260525\\response-cache.npz",

medium

The path uses Windows-style backslashes. For cross-platform compatibility and consistency with the rest of the repository, please use forward slashes.
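
The normalization requested here can be sketched in a few lines of Python. This is only an illustration, not the repository's tooling: the helper names `to_posix_path` and `normalize_artifact_paths` and the `example-run` directory are hypothetical, and it assumes the artifact stores relative paths as plain strings under keys such as `response_cache`.

```python
import json
from pathlib import PureWindowsPath


def to_posix_path(value: str) -> str:
    """Convert a Windows-style relative path to forward slashes."""
    # PureWindowsPath accepts both "\\" and "/" as separators,
    # so already-normalized paths pass through unchanged.
    return PureWindowsPath(value).as_posix()


def normalize_artifact_paths(artifact: dict, keys=("response_cache",)) -> dict:
    """Return a shallow copy of the artifact with the named path fields normalized."""
    fixed = dict(artifact)
    for key in keys:
        if isinstance(fixed.get(key), str):
            fixed[key] = to_posix_path(fixed[key])
    return fixed


# Hypothetical artifact fragment; "example-run" is a placeholder directory name.
raw = r'{"response_cache": "workspaces\\black-box\\runs\\example-run\\response-cache.npz"}'
artifact = normalize_artifact_paths(json.loads(raw))
print(artifact["response_cache"])
```

Normalizing at write time, before `json.dump`, would keep curated artifacts identical regardless of the operating system that produced them.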

Suggested change
- "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-class-ordered-subset-20260525\\response-cache.npz",
+ "response_cache": "workspaces/black-box/runs/h2-response-strength-256-class-ordered-subset-20260525/response-cache.npz",

"reopen_allowed": false,
"requires_reseeded_or_interleaved_cache_before_promotion": true
},
"verdict": "weak_non_complementary_output_cloud_geometry",

medium

The verdict weak_non_complementary_output_cloud_geometry is misleading given the high AUC (0.967) and strong strict-tail metrics. This likely occurs because the comparison block is null, but the label should reflect the actual signal strength.
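
A mismatch of this kind could be caught mechanically. The following is a minimal sketch under stated assumptions: the function name `verdict_conflicts_with_auc` is hypothetical, the AUC is assumed to live under `artifact["metrics"]["auc"]` (the real artifact layout is not shown in this review), and the 0.9 threshold is an arbitrary illustrative value rather than a project policy.

```python
def verdict_conflicts_with_auc(artifact: dict, auc_floor: float = 0.9) -> bool:
    """Flag a verdict starting with 'weak' that sits alongside a high AUC.

    Assumptions (not taken from the repository): the AUC is stored at
    artifact["metrics"]["auc"], and auc_floor is an illustrative cutoff.
    """
    verdict = artifact.get("verdict") or ""
    metrics = artifact.get("metrics") or {}
    auc = metrics.get("auc")
    if auc is None:
        # Nothing to compare against, so do not raise a flag.
        return False
    return verdict.startswith("weak") and auc >= auc_floor


# Hypothetical artifact fragment using the verdict and AUC quoted in the comment.
example = {
    "verdict": "weak_non_complementary_output_cloud_geometry",
    "metrics": {"auc": 0.967},
}
print(verdict_conflicts_with_auc(example))
```

A check like this only reports the inconsistency; choosing the replacement label remains a human decision.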

Suggested change
- "verdict": "weak_non_complementary_output_cloud_geometry",
+ "verdict": "candidate_output_cloud_geometry",

"track": "black-box",
"method": "H2 output-cloud geometry scorer",
"mode": "cpu-cache-review",
"response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-class-ordered-subset-20260525\\response-cache.npz",

medium

The path uses Windows-style backslashes. Please use forward slashes for consistency.

Suggested change
- "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-class-ordered-subset-20260525\\response-cache.npz",
+ "response_cache": "workspaces/black-box/runs/h2-response-strength-256-class-ordered-subset-20260525/response-cache.npz",

"track": "black-box",
"method": "H2 output-cloud geometry scorer",
"mode": "cpu-cache-review",
"response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-20260525-r1\\response-cache.npz",

medium

The path uses Windows-style backslashes. Please use forward slashes for consistency.

Suggested change
- "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-20260525-r1\\response-cache.npz",
+ "response_cache": "workspaces/black-box/runs/h2-response-strength-256-shared-position-20260525-r1/response-cache.npz",

"notes": [
"This is a CPU-only scorer review on an existing H2 response cache.",
"It intentionally excludes seed-to-output distance features so it cannot collapse back into H2 simple distance.",
"A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.",

medium

This note is stale for this specific artifact. Since this file represents the shared-position order-control scout, it has already ruled out the class-ordered sampling effects mentioned in the note.

Suggested change
- "A positive result is candidate-only until reseeded or interleaved response-cache controls rule out class-ordered sampling effects.",
+ "A positive result is candidate-only until formal promotion mechanisms or independent consumption contracts are established.",

"track": "black-box",
"method": "H2 output-cloud geometry scorer",
"mode": "cpu-cache-review",
"response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-20260525-r1\\response-cache.npz",

medium

The path uses Windows-style backslashes. Please use forward slashes for consistency.

Suggested change
- "response_cache": "workspaces\\black-box\\runs\\h2-response-strength-256-shared-position-20260525-r1\\response-cache.npz",
+ "response_cache": "workspaces/black-box/runs/h2-response-strength-256-shared-position-20260525-r1/response-cache.npz",

@DeliciousBuding DeliciousBuding review requested due to automatic review settings May 24, 2026 22:30