Implement Lane D finalize workflow with canonical records #32

Merged
Am1n3e merged 1 commit into feature/add-leaderboard from wip/implementation-plan on Mar 7, 2026

Am1n3e commented on Mar 7, 2026

Summary

  • add a merged-PR finalize workflow for leaderboard-submissions that canonicalizes intake payloads, uploads canonical snapshots to HF, writes submissions/<submission_id>.json, and removes inbox folders
  • introduce a canonical submission record type plus finalize task wiring in leaderboard invoke tasks, and document Lane D implementation details with workflow/code interactions
  • add unit/type coverage for finalize event parsing, intake resolution, canonical record validation, and end-to-end finalize write/delete behavior
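The finalize flow described in the summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code merged in this PR: `finalize_submission`, the `payload.json` file name, the per-submission inbox folder layout, and the `upload` callable (standing in for the HF upload step) are all hypothetical, and "canonicalization" is reduced here to sorted-key JSON.

```python
import json
import shutil
from pathlib import Path
from typing import Callable


def finalize_submission(
    repo_root: Path,
    submission_id: str,
    upload: Callable[[str, str], None],
) -> Path:
    """Canonicalize one inbox payload, upload it, write the record, drop the inbox folder.

    Assumed layout (hypothetical): submissions/inbox/<submission_id>/payload.json
    """
    inbox_dir = repo_root / "submissions" / "inbox" / submission_id
    payload = json.loads((inbox_dir / "payload.json").read_text())

    # Assumption: a canonical record is the payload plus its id, serialized
    # with sorted keys so the output is stable across runs.
    record = {**payload, "submission_id": submission_id}
    canonical = json.dumps(record, sort_keys=True, indent=2) + "\n"

    # Stand-in for the snapshot upload; the real workflow targets HF.
    upload(submission_id, canonical)

    # Write submissions/<submission_id>.json, then remove the inbox folder.
    record_path = repo_root / "submissions" / f"{submission_id}.json"
    record_path.write_text(canonical)
    shutil.rmtree(inbox_dir)
    return record_path
```

Uploading before writing the record and deleting the inbox means a failed upload leaves the intake folder in place for a retry, which is one plausible ordering for such a workflow.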

Validation

  • uv run ruff check dev/leaderboard/finalize.py dev/leaderboard/tasks.py src/webarena_verified/types/leaderboard/submission_record.py tests/leaderboard/test_finalize.py tests/types/test_leaderboard_types.py
  • uv run pytest tests/leaderboard/test_finalize.py tests/types/test_leaderboard_types.py
  • uv run inv --list dev.leaderboard

)


class FinalizeContext(BaseModel):

Move the models to a separate file (model.py).

_MISSING_SENTINEL_FILE = ".missing"
_AGENT_RESPONSE_FILE = "agent_response.json"
_NETWORK_HAR_FILE = "network.har"
_SITE_KEYS = ("shopping", "reddit", "gitlab", "wikipedia", "map", "shopping_admin")

Use the str enum WebArenaSite.

if TYPE_CHECKING:
    from pathlib import Path

_INTAKE_PREFIX = "submissions/inbox/"

We need to use pydantic-settings for this.


repo_quoted = parse.quote(repo, safe="")
while True:
    url = f"https://api.github.com/repos/{repo_quoted}/pulls/{pr_number}/files?per_page=100&page={page}"

Add the PyGithub library as a dependency and use it here.
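A minimal sketch of what this could look like, assuming PyGithub is installed and a token is available. `list_pr_files` and `intake_paths` are hypothetical helper names, not functions from this PR; the PyGithub import is kept inside the function so the pure filtering helper can be used without the dependency.

```python
from collections.abc import Iterable

INTAKE_PREFIX = "submissions/inbox/"  # same value as _INTAKE_PREFIX in the diff


def intake_paths(filenames: Iterable[str], prefix: str = INTAKE_PREFIX) -> list[str]:
    """Keep only changed files under the intake prefix, in sorted order."""
    return sorted(name for name in filenames if name.startswith(prefix))


def list_pr_files(repo: str, pr_number: int, token: str) -> list[str]:
    """Return the file names changed in a pull request.

    `repo` is the full "owner/name" string.
    """
    # Third-party dependency (PyGithub), imported lazily.
    from github import Auth, Github

    client = Github(auth=Auth.Token(token))
    pull = client.get_repo(repo).get_pull(pr_number)
    # get_files() returns a paginated list; iterating it fetches further
    # pages on demand, so no manual page counter or URL building is needed.
    return [changed.filename for changed in pull.get_files()]
```

Usage would be along the lines of `intake_paths(list_pr_files("owner/name", 32, token))`, which replaces the hand-written `while True` pagination loop.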

Am1n3e force-pushed the wip/implementation-plan branch from 4b8f6ac to 39f141b on March 7, 2026 13:09
Am1n3e merged commit f1a6970 into feature/add-leaderboard on Mar 7, 2026
Am1n3e deleted the wip/implementation-plan branch on March 7, 2026 13:10