Hard evals #3835

georgejwdeane · 2025-11-18T01:16:10Z

11 harder versions of the diagnostic evals

…ard-evals

graphite-app · 2025-11-20T06:37:18Z

packages/cogames/src/cogames/cogs_vs_clips/evals/diagnostic_evals.py

+        # Set starting energy to 30 and no regen
+        agent = cfg.game.agent
+        agent.initial_inventory = dict(agent.initial_inventory)
+        agent.initial_inventory["energy"] = 60


Comment-code mismatch on energy configuration. The comment on line 594 states "Set starting energy to 30" but line 597 actually sets initial_inventory["energy"] = 60. This inconsistency could lead to incorrect difficulty tuning.

# Either update the comment:
# Set starting energy to 60 and no regen

# Or update the code:
agent.initial_inventory["energy"] = 30

Suggested change

-        # Set starting energy to 30 and no regen
-        agent = cfg.game.agent
-        agent.initial_inventory = dict(agent.initial_inventory)
-        agent.initial_inventory["energy"] = 60
+        # Set starting energy to 60 and no regen
+        agent = cfg.game.agent
+        agent.initial_inventory = dict(agent.initial_inventory)
+        agent.initial_inventory["energy"] = 60

Spotted by Graphite Agent
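
One way to keep the comment and the value from drifting apart again is to give the starting energy a name and apply it in one place. The sketch below is only an illustration: the dataclasses are hypothetical stand-ins for the real config objects (only the attribute names `game`, `agent`, and `initial_inventory` come from the diff), and `STARTING_ENERGY`, `set_starting_energy`, and the default of 100 are assumed for the example. It also shows why the diff copies `initial_inventory` before writing to it: the copy leaves any dict shared with another config untouched.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for the real config classes; only the attribute
# names `game`, `agent`, and `initial_inventory` are taken from the diff.
@dataclass
class AgentConfig:
    initial_inventory: dict[str, int] = field(default_factory=dict)


@dataclass
class GameConfig:
    agent: AgentConfig = field(default_factory=AgentConfig)


@dataclass
class MissionConfig:
    game: GameConfig = field(default_factory=GameConfig)


# Assumed constant name: a named value documents itself, so no separate
# comment has to repeat the number.
STARTING_ENERGY = 60


def set_starting_energy(cfg: MissionConfig, energy: int = STARTING_ENERGY) -> None:
    """Set the agent's starting energy without mutating a shared dict."""
    agent = cfg.game.agent
    # Copy first so a dict shared with other configs is left unchanged.
    agent.initial_inventory = dict(agent.initial_inventory)
    agent.initial_inventory["energy"] = energy


# Illustrative default value, not taken from the repository.
shared_defaults = {"energy": 100}
cfg = MissionConfig(GameConfig(AgentConfig(initial_inventory=shared_defaults)))
set_starting_energy(cfg)
```

After the call, `cfg.game.agent.initial_inventory["energy"]` is 60 while `shared_defaults` still holds its original value.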


George Deane added 2 commits November 17, 2025 17:14

harder versions of the diagnostic evals!

88b831d

updated evals

8ed51f8

github-actions bot assigned georgejwdeane Nov 18, 2025

daphnedemekas self-requested a review November 18, 2025 20:01

daphnedemekas approved these changes Nov 18, 2025


daphnedemekas and others added 13 commits November 18, 2025 12:05

Merge branch 'main' into hard-evals

b309d9d

resource reducer

a1d06d2

Merge branch 'main' of https://github.com/Metta-AI/metta into hard-evals

b674d74

fix

ce55f26

Merge branch 'hard-evals' of https://github.com/Metta-AI/metta into h…

7514b23

…ard-evals

Delete experiments/recipes/scratchpad/georgedeane.py

9dcb99b

update recipe

b172ef1

Merge branch 'hard-evals' of https://github.com/Metta-AI/metta into h…

9da1c61

…ard-evals

amend

c9ccced

Delete packages/cogames/src/cogames/cogs_vs_clips/temp_mission.py

261dbf4

Delete packages/cogames/src/cogames/map_utils/__init__.py

7193e02

Rename dense_training_env.py to dense_training_curriculum.py

d3e9257

Update dense_training_curriculum.py

9acd4c3

daphnedemekas approved these changes Nov 20, 2025


daphnedemekas and others added 3 commits November 19, 2025 17:59

Merge branch 'main' into hard-evals

9e39e1c

Merge branch 'main' into hard-evals

1e5f924

cleanup maps

634c16c

graphite-app bot reviewed Nov 20, 2025


Merge branch 'main' into hard-evals

d138bd6

daphnedemekas enabled auto-merge November 20, 2025 21:33

daphnedemekas added this pull request to the merge queue Nov 20, 2025

Merged via the queue into main with commit 71a1601 Nov 20, 2025
11 of 12 checks passed

daphnedemekas deleted the hard-evals branch November 20, 2025 21:53

3 participants
