Solve Minefield #50

Closed
cswinter opened this issue Nov 29, 2021 · 5 comments

@cswinter
Collaborator

cswinter commented Nov 29, 2021

Get > 0.99 episodic return on the Minefield task, which can be run with poetry run python enn_ppo/enn_ppo/train.py --gym-id=Minefield --env-kwargs='{"max_mines": 10}'.

@cswinter
Collaborator Author

The Minefield task requires taking actions based on the features of other entities, which our current architecture is incapable of.
At a minimum, solving this environment requires #9, and possibly also better positional encodings.

Baseline run (67acae6): https://wandb.ai/cswinter/enn-ppo/runs/3tf8uk3l

@cswinter
Collaborator Author

cswinter commented Dec 5, 2021

#108 makes the Minefield task more configurable and adds a translate argument that expresses all x/y positions relative to the vehicle. This makes the task much easier, but it is still somewhat tricky because rewards are sparse. It ought to be solvable now with the right hyperparameters. Adding rotational invariance, by rotating all object positions so that the direction of the actor is always 0, should make it easier still.
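To illustrate the two transformations discussed here, below is a minimal sketch and not the code from #108. The function name `to_vehicle_frame`, its argument order, and the convention that the heading is an angle in radians measured from the positive x axis are all assumptions made for the example.

```python
import math


def to_vehicle_frame(vx, vy, heading, x, y, rotate=False):
    """Express the point (x, y) relative to a vehicle at (vx, vy).

    Hypothetical helper for illustration only.
    With rotate=False, only the translation is applied.
    With rotate=True, the frame is also rotated by -heading, so that the
    vehicle always faces direction 0 (the positive x axis) in the new frame.
    """
    # Translation: shift the origin to the vehicle position.
    dx, dy = x - vx, y - vy
    if not rotate:
        return dx, dy
    # Rotation by -heading using the standard 2D rotation matrix.
    c, s = math.cos(-heading), math.sin(-heading)
    return c * dx - s * dy, s * dx + c * dy
```

With translation only, a mine at (1, 3) seen from a vehicle at (1, 1) becomes (0, 2). If the vehicle is also facing the positive y direction (heading π/2) and rotation is enabled, the same mine lands at approximately (2, 0), that is, straight ahead regardless of where the vehicle is or which way it points.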

@cswinter
Collaborator Author

cswinter commented Dec 5, 2021

Solution for max_mines=0 (W&B): poetry run python enn_ppo/enn_ppo/train.py --gym-id=Minefield --num-env=64 --num-steps=32 --track --total-timesteps=1000000 --n-layer=1 --env-kwargs='{"max_mines": 0, "translate": true}' --learning-rate=0.003

@cswinter cswinter added the good first issue Good for newcomers label Dec 5, 2021
@cswinter
Collaborator Author

cswinter commented Dec 5, 2021

We can almost solve max_mines=1, but it still struggles to squeeze out the last bits of performance (W&B): poetry run python enn_ppo/enn_ppo/train.py --gym-id=Minefield --num-env=64 --num-steps=32 --track --total-timesteps=2500000 --n-layer=2 --env-kwargs='{"max_mines": 1, "translate": true}' --learning-rate=0.003

There might still be a bug in the environment that makes it unsolvable in some cases, or the hyperparameters are bad, or something else is missing.

@cswinter cswinter self-assigned this Dec 28, 2021
@cswinter
Collaborator Author

Solved by poetry run python enn_ppo/enn_ppo/train.py --processes=16 --total-timesteps=25000000 --num-steps=512 --num-envs=512 --num-minibatches=64 --n-layer=2 --d-model=32 --learning-rate=0.01 --gamma=0.999 --ent-coef=0.0001 --gym-id=Minefield --track --env-kwargs="{\"max_mines\": 10, \"translate\": true}"
Commit: 766a379
W&B: 211230-205335-stack-more-layers
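
For a sense of scale, the batch sizes implied by this command can be worked out as below. This is a sketch under the assumption that the flags follow the common PPO convention (one rollout batch is num-envs × num-steps samples, split evenly into num-minibatches). The thread itself does not confirm how train.py interprets them, and the variable names are illustrative.

```python
# Values copied from the command above.
num_envs = 512
num_steps = 512
num_minibatches = 64
total_timesteps = 25_000_000

# Assumed convention: one rollout collects num_steps samples from each env.
batch_size = num_envs * num_steps

# Assumed convention: the rollout batch is split evenly into minibatches.
minibatch_size = batch_size // num_minibatches

# Number of complete rollouts that fit into the timestep budget.
num_updates = total_timesteps // batch_size

print(batch_size, minibatch_size, num_updates)
```

Under those assumptions each rollout holds 262144 samples, each minibatch 4096, and the run covers 95 complete rollouts.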
