
Regarding inferencing the learnt policy #69

Closed
SiddSS opened this issue Nov 27, 2022 · 4 comments

@SiddSS

SiddSS commented Nov 27, 2022

Hi,
We have created a custom environment and wrapped it in a Gym class. After training with MAPPO, we obtained the .pkl files. Could you elaborate on how to run inference with the learned policy?
We already have a visualization of the environment using pygame and just want to load the learned policies and watch them play.

Thanks in advance.

@Theohhhu
Collaborator

Theohhhu commented Nov 29, 2022

It is doable. However, MARLlib has chosen not to incorporate loading and rendering functions, as we find it hard to unify all ten environments to render in a similar pattern.

We would like to provide some instructions on how to implement this.
You can find an example of rendering here.
To load a checkpoint, add a restore path in tune.run(restore=YourCheckPointPath).
The complete configuration can be found in Trainer.
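
As a minimal sketch (the algorithm name, config, and checkpoint path below are placeholders, not MARLlib specifics):

```python
from ray import tune

# Minimal placeholder config; in practice, reuse the exact config from training.
my_config = {"env": "CartPole-v1", "framework": "torch"}

# Placeholder path to a checkpoint produced by a previous run.
checkpoint_path = "~/ray_results/my_experiment/checkpoint_000100/checkpoint-100"

tune.run(
    "PPO",                    # placeholder algorithm name
    config=my_config,
    restore=checkpoint_path,  # restore trainer state before continuing
    stop={"training_iteration": 1},
)
```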

There is a thorough solution provided by Sven: multiagent-load-only-one-policy-from-checkpoint.
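
The gist of that thread, as a rough sketch against the RLlib 1.x API of that era (trainer class, config, and policy ID are placeholders):

```python
from ray.rllib.agents.ppo import PPOTrainer  # placeholder trainer class

my_config = {"env": "CartPole-v1", "framework": "torch"}  # placeholder config
checkpoint_path = "~/ray_results/my_experiment/checkpoint_000100/checkpoint-100"

# Rebuild the trainer with the training-time config, then load the weights.
trainer = PPOTrainer(config=my_config)
trainer.restore(checkpoint_path)

# Pull out only the policy you need (a single-policy setup
# uses the ID "default_policy").
policy = trainer.get_policy("default_policy")
```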

Any further questions are welcome. We are happy to help you out.

@SiddSS
Author

SiddSS commented Dec 11, 2022

Hi, thanks for the previous answer. However, we have been unable to use the learned policy to compute actions for our agents. Our objective is to compute the agents' actions based on the learnt policy, but when we call agent.compute_single_action(obs), where obs = env.reset(),
we get an error that seq_lens is of None type. We cannot find where to obtain the sequence lengths to resolve this error. We added some print statements during training and could see the sequence lengths being printed there, but compute_single_action does not seem to work with them.

It would be really helpful if you could provide some insight into this. Also, kindly let us know if we should be using a function other than agent.compute_single_action for this purpose.

@Theohhhu
Collaborator

Theohhhu commented Jan 17, 2023

Hi SiddSS,

Sorry for the late reply. Check out the mamujoco example and the mpe example for loading a checkpoint and rendering the environment in MARLlib.
You are welcome to ask any further questions.
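
On the seq_lens error specifically: that usually means the model is recurrent, in which case compute_single_action needs the RNN state passed in explicitly. A minimal sketch (assuming a single-agent-style loop and the default policy ID; for a multi-agent env you would keep one state per agent):

```python
# "agent" is your restored trainer; "env" is your wrapped gym environment.
obs = env.reset()

# Recurrent models need an initial hidden state.
state = agent.get_policy("default_policy").get_initial_state()

done = False
while not done:
    # When a state is passed, compute_single_action returns
    # (action, new_state, extra_fetches) instead of just the action.
    action, state, _ = agent.compute_single_action(
        obs, state=state, policy_id="default_policy"
    )
    obs, reward, done, info = env.step(action)
    env.render()
```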

Siyi

@Theohhhu
Collaborator

Theohhhu commented Mar 3, 2023

New APIs are now available that show how to render a pretrained model.
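
A rough sketch of the new usage (environment name, map, and checkpoint paths are placeholders, and argument names may differ slightly across MARLlib versions):

```python
from marllib import marl

# Build the env, algorithm, and model exactly as for training.
env = marl.make_env(environment_name="mpe", map_name="simple_spread")
mappo = marl.algos.mappo(hyperparam_source="mpe")
model = marl.build_model(env, mappo, {"core_arch": "mlp", "encode_layer": "128-128"})

# Render using a pretrained checkpoint (placeholder paths).
mappo.render(
    env,
    model,
    restore_path={
        "params_path": "exp_results/mappo_mlp_simple_spread/params.json",
        "model_path": "exp_results/mappo_mlp_simple_spread/checkpoint_000010/checkpoint-10",
    },
    local_mode=True,
    share_policy="all",
)
```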
