feat: Add support for integrating with Fragile library
* chore: stash WIP fragile search mathy agent
- add `mathy_fragile.py` in the root for now. It requires a local mathy installation so that it can use the hacked-up version of the library that is compatible with plangym: https://mathy.ai/contributing/#use-your-local-version-of-mathy
- requires plangym/fragile packages, installed by whatever means you use
* chore(env): add invalid move error scaling
- calculate the distance from the selected invalid action to the nearest valid action and provide a negative reward based on that.
- remove some extra state from the MathyGymEnv to ensure that fragile can cleanly reset all of the state during batch steps
* chore: finish cleaning up gym env for clean reset
- I changed the env to reset to the same initial state each time reset is called. I think this is more in line with what an Atari env might do.
* chore: use node types as observations
- sequence of node type IDs
- adjust scale of rewards to remove negatives
* test(state): fix to_string/from_string
* refactor(env): use positive reward signals for fragile search
- normalize all rewards so that dead states return 0.0 and positive ones return > 0.0 (as in the paper)
- treat invalid action selections as null states and continue the simulation
* chore: cleanup and fix test script
* chore: cleanup
* chore(fragile): add masked action selection
- this is pretty slow by comparison. See if there's a way to avoid looping over the action masks. Maybe build the probabilities into the observation so the calculation is unnecessary?
* chore: faster numpy action selection
- calculate mask probabilities and return as observation
- use numpy trickery to avoid looping over the observations to choose random actions from probabilities in batches
* chore: fix fragile oob/terminal conditions
* chore: fix tests
* chore: drop fragile script
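
For reference, the batched masked action selection described in the "masked action selection" and "faster numpy action selection" items can be sketched roughly as below. This is a minimal illustration and not the code from this commit: the function names `mask_probabilities` and `sample_actions` are hypothetical, the masks are assumed to be 0/1 arrays of shape `(batch, n_actions)`, and the uniform fallback for rows with no valid action is an assumption. It uses inverse-CDF sampling so that no Python loop over the batch is needed.

```python
import numpy as np


def mask_probabilities(masks):
    """Turn 0/1 action masks of shape (batch, n_actions) into row-wise probabilities.

    Hypothetical helper: rows with no valid action fall back to a uniform
    distribution (an assumption, not taken from the commit).
    """
    masks = np.asarray(masks, dtype=np.float64)
    totals = masks.sum(axis=1, keepdims=True)
    uniform = np.full_like(masks, 1.0 / masks.shape[1])
    # np.maximum avoids dividing by zero on all-zero rows
    return np.where(totals > 0, masks / np.maximum(totals, 1.0), uniform)


def sample_actions(probs, rng):
    """Sample one action per row from probs without looping over rows."""
    cdf = np.cumsum(probs, axis=1)
    # Scale each uniform draw by the row total so rounding in the
    # cumulative sum cannot push the draw past the last valid action.
    u = rng.random((probs.shape[0], 1)) * cdf[:, -1:]
    # The chosen action is the first index whose cumulative value exceeds u,
    # which equals the number of entries that are <= u.
    actions = (cdf <= u).sum(axis=1)
    return np.minimum(actions, probs.shape[1] - 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    masks = np.array([[0, 1, 0, 1], [1, 0, 0, 0], [0, 0, 1, 1]])
    probs = mask_probabilities(masks)
    print(sample_actions(probs, rng))  # one valid action index per row
```

Because zero-probability actions leave the cumulative sum flat, a draw can never land on them, so only actions allowed by the mask are selected.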