Added dual policy mode to toggle default locomotion policy #73
from holosoma_inference.config.config_types.task import TaskConfig

_MODELS_DIR = Path(__file__).parent.parent.parent / "models"
Could you add a comment clarifying where this should point? It looks like it resolves to src/holosoma_inference/holosoma_inference/models.
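To make the request concrete, here is a minimal sketch of how three `.parent` hops resolve. The location of the config module is an assumption (the file path is not shown in this excerpt), chosen only so that the result matches the directory guessed above.

```python
from pathlib import PurePosixPath

# Hypothetical location of the module that defines _MODELS_DIR;
# the real file path is not visible in this PR excerpt.
config_file = PurePosixPath(
    "src/holosoma_inference/holosoma_inference/config/config_values/task.py"
)

# Each .parent strips one trailing path component:
#   .parent                -> .../holosoma_inference/config/config_values
#   .parent.parent         -> .../holosoma_inference/config
#   .parent.parent.parent  -> .../holosoma_inference (the package root)
models_dir = config_file.parent.parent.parent / "models"

print(models_dir)
```

A one-line comment next to `_MODELS_DIR` stating the resolved directory would save readers from counting `.parent` hops by hand.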
def main(annotated_config=None):
    """Main entry point. Extensions can pass their own AnnotatedInferenceConfig."""
    import argparse
Why not import at the top of the file?
tomasz-lewicki left a comment
LGTM aside from the config. I think it should be pretty easy to skip all of this argparse logic and let tyro handle it.
restore_terminal_settings()


def _split_secondary_args(argv: list[str]) -> tuple[list[str], list[str]]:
Doesn't tyro handle this out of the box once you've defined TaskConfig.secondary? Maybe I'm missing something, but I think you can do:
run_policy.py --task.secondary.task.model-path weights.onnx
We can also just have a secondary optional InferenceConfig:
run_policy.py --task.model-path weights_dance.onnx --fallback_task.model-path weights_loco.onnx
id(self.secondary): self.secondary.handle_keyboard_button,
}
def patched_joy(cur_key):
Hopefully we refactor joystick/input handling soon so doing simple things doesn't have to be so convoluted.
asetapen left a comment
Very nice improvements, and dual mode will be extremely helpful for testing.
I'm not too familiar with tyro, so I can't comment on the config issue.
This PR adds, as a default inference config, the option to switch to a robust locomotion policy by pressing X on the joystick (or x on the keyboard), so the robot can recover conveniently while testing experimental policies. The implementation is more general than that: it introduces a dual-mode setup that allows switching between any two policies at runtime.
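The switching idea described above can be sketched roughly as follows. This is not the PR's implementation: `Policy`, `DualPolicyRunner`, and `handle_key` are hypothetical names, and treating the key as a two-way toggle (rather than a one-way switch to the fallback) is an assumption.

```python
class Policy:
    """Hypothetical stand-in for an inference policy."""

    def __init__(self, name: str) -> None:
        self.name = name

    def step(self, observation: float) -> str:
        # A real policy would run a model here; this only reports who acted.
        return f"{self.name} policy acting on {observation}"


class DualPolicyRunner:
    """Holds two policies and swaps the active one when the switch key arrives."""

    def __init__(self, primary: Policy, secondary: Policy, switch_key: str = "x") -> None:
        self.primary = primary
        self.secondary = secondary
        self.switch_key = switch_key
        self.active = primary  # start with the experimental (primary) policy

    def handle_key(self, key: str) -> None:
        # Accept both "x" (keyboard) and "X" (joystick button label).
        if key.lower() == self.switch_key:
            if self.active is self.primary:
                self.active = self.secondary
            else:
                self.active = self.primary

    def step(self, observation: float) -> str:
        # Only the currently active policy produces an action.
        return self.active.step(observation)


runner = DualPolicyRunner(Policy("experimental"), Policy("locomotion"))
runner.handle_key("x")   # fall back to the robust locomotion policy
print(runner.step(0.0))
```

Keeping both policies loaded and routing input through a single runner means the fallback is available immediately, without restarting the process.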