nishu-builder commented Dec 3, 2025

  1. Centralized URI-to-PolicySpec resolution. Now metta://xxx works with non-mpt policies

A) The MettaSchemeResolver now returns either:

  • The S3 path containing the policy spec (preferred), or
  • The raw .mpt checkpoint path for legacy policies

B) policy_spec_from_uri() now handles all URI types uniformly:

  • .mpt files → creates an MptPolicy spec
  • S3 paths → downloads and extracts the policy spec archive
  • Local dirs → loads policy_spec.json directly

C) resolve_uri() is renamed to get_path_to_policy_spec_or_mpt() to clarify its purpose.

Removed the other contextmanager-based load-policy-from-S3 logic and the scattered custom .mpt handling; all of it should now live in policy_spec_from_uri(). Policies are now downloaded to a local cache folder, with optional cleanup in atexit handlers.
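
For reviewers who want a mental model of the dispatch described in (B), here is a rough sketch. It is not the code in this PR: the `PolicySpec` fields, the `policy_spec.json` layout, and the helper names `classify_policy_location`, `make_cache_dir`, `policy_spec_from_location`, and the injected `fetch_archive` callable are all assumptions made for illustration. The one detail worth noting is that the `.mpt` check has to run before the S3 check, because legacy checkpoints can themselves live under `s3://`.

```python
import atexit
import json
import shutil
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable, Dict, Optional


@dataclass
class PolicySpec:
    """Hypothetical stand-in; the real PolicySpec fields may differ."""

    class_path: str
    data_path: Optional[str] = None
    init_kwargs: Dict[str, Any] = field(default_factory=dict)


def classify_policy_location(path: str) -> str:
    """Decide which of the three branches a resolved path falls into."""
    # Check .mpt first: a legacy checkpoint may itself be an s3:// path.
    if path.endswith(".mpt"):
        return "mpt"
    if path.startswith("s3://"):
        return "s3_archive"
    return "local_dir"


def make_cache_dir(cleanup: bool = True) -> Path:
    """Create a local cache folder, optionally removed at interpreter exit."""
    cache_dir = Path(tempfile.mkdtemp(prefix="policy_cache_"))
    if cleanup:
        atexit.register(shutil.rmtree, cache_dir, ignore_errors=True)
    return cache_dir


def policy_spec_from_location(
    path: str,
    fetch_archive: Optional[Callable[[str, Path], Path]] = None,
) -> PolicySpec:
    """Build a spec from a resolved path (assumed behavior, not the real function).

    fetch_archive is a hypothetical hook that downloads and extracts an S3
    archive into the cache dir and returns the extracted directory.
    """
    kind = classify_policy_location(path)
    if kind == "mpt":
        # "MptPolicy" here is a placeholder for the real class path.
        return PolicySpec(class_path="MptPolicy", data_path=path)
    if kind == "s3_archive":
        if fetch_archive is None:
            raise ValueError("an S3 path needs a fetch_archive callable")
        local_dir = fetch_archive(path, make_cache_dir())
    else:
        local_dir = Path(path)
    # Assumes the JSON keys match the PolicySpec fields above.
    raw = json.loads((local_dir / "policy_spec.json").read_text())
    return PolicySpec(**raw)
```

Passing the downloader in as a callable keeps the sketch runnable without any S3 client; the real implementation presumably calls its own download helper directly.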

  2. Fix passing of behavior-cloning policy URI

Before, we passed around an EnvSupervisorConfig containing a URI string that was being misinterpreted as a path. Now, TrainTool resolves the URI into a PolicySpec upfront and passes the resolved spec directly to VectorizedTrainingEnvironment.

  3. Cogames CLI policy parsing cleanup
    The CLI policy parser now shares more code for deciding whether an input is a URI and for parsing it, and accepts additional args as key=value pairs.

So now you can do e.g.:

uv run cogames evaluate -m machina_1.machinatrainerbig -p class=random -p metta://policy/dinky,proportion=2 -p s3://softmax-public/policies/daveey.1x4.cvc.dr.maps.1.bct.0/daveey.1x4.cvc.dr.maps.1.bct.0:v1990.mpt,proportion=4
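
To make the `-p` syntax above concrete, here is a minimal sketch of how such an argument could be split into a URI plus key=value options. This is an illustration only, not the parser in this PR: `ParsedPolicyArg`, `looks_like_uri`, and `parse_policy_arg` are hypothetical names, and the `"://"` test is an assumed heuristic for recognizing a URI.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ParsedPolicyArg:
    """Hypothetical container for one -p argument."""

    uri: Optional[str]
    options: Dict[str, str] = field(default_factory=dict)


def looks_like_uri(token: str) -> bool:
    # Assumed heuristic: anything with a scheme separator is treated as a URI.
    return "://" in token


def parse_policy_arg(arg: str) -> ParsedPolicyArg:
    """Split 'URI,key=value,...' or 'key=value,...' into its parts."""
    uri: Optional[str] = None
    options: Dict[str, str] = {}
    for index, token in enumerate(arg.split(",")):
        token = token.strip()
        # Only the first token may be a bare URI.
        if index == 0 and looks_like_uri(token):
            uri = token
            continue
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        options[key] = value
    return ParsedPolicyArg(uri=uri, options=options)


if __name__ == "__main__":
    for raw in ("class=random", "metta://policy/dinky,proportion=2"):
        print(parse_policy_arg(raw))
```

Values stay as strings here; converting `proportion` to a number is left to the caller. A URI that itself contains a comma would not survive this split, which the sketch does not try to handle.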

  TrainTool.invoke():
    1. supervisor_policy_uri on TrainingEnvironmentConfig (in recipes)
    2. policy_spec_from_uri() resolves URI → PolicySpec in TrainTool.invoke
    3. VectorizedTrainingEnvironment(cfg, supervisor_policy_spec=...)
    4. make_vecenv(..., supervisor_policy_spec=...)
    5. MettaGridPufferEnv(..., supervisor_policy_spec=...)
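
The first three steps of that flow can be sketched roughly as follows. The classes below are simplified stand-ins that borrow the names from this PR (`TrainingEnvironmentConfig`, `supervisor_policy_uri`, `VectorizedTrainingEnvironment`, `supervisor_policy_spec`); their real signatures are not shown here and may differ. `build_training_env` and the injected `resolve` callable are hypothetical, standing in for what TrainTool.invoke does with policy_spec_from_uri().

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PolicySpec:
    """Simplified stand-in for the real PolicySpec."""

    class_path: str
    data_path: Optional[str] = None


@dataclass
class TrainingEnvironmentConfig:
    """Stand-in config: carries only the URI string, as set in recipes."""

    supervisor_policy_uri: Optional[str] = None


class VectorizedTrainingEnvironment:
    """Stand-in env: receives an already-resolved spec, never a raw URI."""

    def __init__(
        self,
        cfg: TrainingEnvironmentConfig,
        supervisor_policy_spec: Optional[PolicySpec] = None,
    ) -> None:
        self.cfg = cfg
        self.supervisor_policy_spec = supervisor_policy_spec


def build_training_env(
    cfg: TrainingEnvironmentConfig,
    resolve: Callable[[str], PolicySpec],
) -> VectorizedTrainingEnvironment:
    """Resolve the URI once, upfront, and hand the spec down."""
    spec = resolve(cfg.supervisor_policy_uri) if cfg.supervisor_policy_uri else None
    return VectorizedTrainingEnvironment(cfg, supervisor_policy_spec=spec)
```

The point of the shape is that nothing below the tool ever sees the URI string, so it cannot be mistaken for a filesystem path further down the stack.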
datadog-official bot commented Dec 4, 2025

✅ Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🔗 Commit SHA: 9137b53

relh enabled auto-merge December 4, 2025 02:22
relh added this pull request to the merge queue Dec 4, 2025
nishu-builder removed this pull request from the merge queue due to a manual request Dec 4, 2025
nishu-builder added this pull request to the merge queue Dec 5, 2025
Merged via the queue into main with commit 604fbc6 Dec 5, 2025
12 checks passed
nishu-builder deleted the nishad/fix-bc_policy_uri-resolving branch December 5, 2025 00:13
nishu-builder added a commit that referenced this pull request Dec 6, 2025
…oning (#4161)

1. Centralized URI-to-PolicySpec resolution. Now `metta://xxx` works
with non-mpt policies

A)   The MettaSchemeResolver now returns either:
  - The S3 path containing the policy spec (preferred), or
  - The raw .mpt checkpoint path for legacy policies

B) policy_spec_from_uri() now handles all URI types uniformly:
    - .mpt files → creates MptPolicy spec
    - S3 paths → downloads and extracts policy spec archive
    - Local dirs → loads policy_spec.json directly
- resolve_uri() renamed to get_path_to_policy_spec_or_mpt() to clarify
its purpose

Removed other contextmanager-based load-policy-from-s3 logic, and
removed multiple points with custom .mpt handling; now it should all be
in policy_spec_from_uri. We now just download to a local cache folder
with optional cleanup in atexit handlers

2. Fix passing of behavior-cloning policy URI

Before, we passed around an EnvSupervisorConfig containing a URI string
that was being misinterpreted as a path. Now, TrainTool resolves the URI
into a PolicySpec upfront and passes the resolved spec directly to
VectorizedTrainingEnvironment.

3. Cogames CLI policy parsing cleanup
The CLI policy parser now uses more shared code for deciding if an input
is a URI and parsing it, and allows passing in additional args via
key=value pairs.

So now you can do e.g.:

```
uv run cogames evaluate -m machina_1.machinatrainerbig -p class=random -p metta://policy/dinky,proportion=2 -p s3://softmax-public/policies/daveey.1x4.cvc.dr.maps.1.bct.0/daveey.1x4.cvc.dr.maps.1.bct.0:v1990.mpt,proportion=4
```

---------

Co-authored-by: Richard Higgins <richard@relh.net>
Co-authored-by: Richard Higgins <richard@softmax.com>
zfogg pushed a commit that referenced this pull request Dec 20, 2025
…oning (#4161)
3 participants