
Testing New Model Performance #62

Closed · DroneMesh opened this issue May 29, 2020 · 6 comments

@DroneMesh commented May 29, 2020

Hi Wil,

I have set up your latest gymfc with the nf1 example and everything is working. However, we were previously able to run pi.act(False, ob=ob)[0] to test the model and do our own custom graphing. Something has changed in your new baselines repo; can you clarify how to run a trained model manually for testing?

Thanks

@wil3 (Owner) commented May 30, 2020

Hey @DroneMesh,
Could you provide some more information? Are you saying that training with PPO1 from here isn't working anymore, or just evaluation? Have you been able to generate checkpoints with this script? What exactly isn't working? Please provide all commands executed and their output.

@DroneMesh (Author)

Hi Wil,

Using the PPO1 example everything works and checkpoints are generated correctly. However, I am unable to test the model and graph its performance.

def train(env, num_timesteps, seed, flight_log_dir=None, ckpt_dir=None,
          render=False, ckpt_freq=0, restore_dir=None, optim_stepsize=3e-4,
          schedule="linear", gamma=0.99, optim_epochs=10, optim_batchsize=64,
          horizon=2048):
    ........
    pi = pposgd_simple.learn(env, policy_fn,
            max_timesteps=num_timesteps,
            timesteps_per_actorbatch=horizon,
            clip_param=0.2, entcoeff=0.0,
            optim_epochs=optim_epochs, optim_stepsize=optim_stepsize,
            optim_batchsize=optim_batchsize,
            gamma=gamma,  # was hardcoded to 0.99, ignoring the gamma argument
            lam=0.95, schedule=schedule,
            flight_log=flight_log,
            ckpt_dir=ckpt_dir,
            restore_dir=restore_dir,
            save_timestep_period=ckpt_freq)
    env.close()

    return pi

# train() with num_timesteps=1 is used here only to rebuild/restore the policy
pi = train(num_timesteps=1, seed=args.seed, env_id=args.env)

ob = env.reset()  # ob must be initialized before the first pi.act call
actuals = []
desireds = []
while True:
    desired = env.true_error
    actual = env.measured_error
    actuals.append(actual)
    desireds.append(desired)
    print("sp=", desired, " rate=", actual)
    action = pi.act(stochastic=False, ob=ob)[0]  # THIS IS NO LONGER WORKING
    ob, _, done, _ = env.step(action)
    if done:
        break
plot_step_response(np.array(desireds), np.array(actuals))

The problem is that pi.act no longer works on this version of baselines. Do you know what has changed? If you have a premade script for testing your models' performance, that would help as well; I can make different variants of it and set up a PR against the current repo.

@DroneMesh (Author)

Hi Wil,

You can close this; I ended up using the latest Stable-Baselines and modified the NF1 example slightly. Once I finish cleaning up the code I will open a pull request for a new NF1 example compatible with the latest Stable-Baselines, along with a script to test its performance that graphs motor output and target setpoints with the error.
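For anyone landing here later, a minimal sketch of what that Stable-Baselines evaluation loop might look like (assuming a Stable-Baselines 2.x PPO2 model saved to a hypothetical nf1_model.zip, and an already-constructed env exposing the true_error/measured_error attributes used above; model.predict takes the place of pi.act):

import numpy as np
from stable_baselines import PPO2

# Load a previously trained model (the path is a placeholder)
model = PPO2.load("nf1_model")

ob = env.reset()
actuals, desireds = [], []
while True:
    desireds.append(env.true_error)
    actuals.append(env.measured_error)
    # deterministic=True mirrors stochastic=False from the old pi.act call
    action, _states = model.predict(ob, deterministic=True)
    ob, _, done, _ = env.step(action)
    if done:
        break
plot_step_response(np.array(desireds), np.array(actuals))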

@wil3 (Owner) commented Jun 1, 2020

This project has only provided code and examples for OpenAI Baselines, so if you are using Stable Baselines that is probably why it didn't work. Be sure to thoroughly read https://github.com/wil3/gymfc/blob/master/CONTRIBUTING.md before opening a PR. PRs are tied to issues, so it's up to you when to close an issue you open.

@DroneMesh (Author) commented Jun 2, 2020 via email

@wil3 (Owner) commented Jun 23, 2020

Hi @DroneMesh, since this issue was opened about code that is supported in the repo, I'm going to close it; the PR you briefly mentioned is a separate matter. If you are still interested in contributing and submitting a PR, please open a feature request issue outlining the intended changes for that specific PR.

To add a policy for another trainer you'll just need a new policy here and an example of how that trainer is instantiated, like the OpenAI Baselines example. In fact, baselinespolicy.py could really be generalized to a TensorFlow checkpoint policy that is inherited from, with only the tensor names for the input and output differing per trainer. I have a Tensorforce policy I plan to add soon too. A rough sketch of that generalization is below.
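As an illustration only (the class and tensor names here are hypothetical placeholders, not the actual gymfc/baselinespolicy.py API), a generic TF 1.x checkpoint policy might look like:

import tensorflow as tf

class TFCheckpointPolicy:
    # Evaluates any TensorFlow checkpoint given its input/output tensor names.
    def __init__(self, ckpt_path, input_name, output_name):
        self.sess = tf.Session()
        # Rebuild the graph from the checkpoint's meta file, then restore the weights
        saver = tf.train.import_meta_graph(ckpt_path + ".meta")
        saver.restore(self.sess, ckpt_path)
        graph = tf.get_default_graph()
        self.input_t = graph.get_tensor_by_name(input_name)
        self.output_t = graph.get_tensor_by_name(output_name)

    def action(self, ob):
        # Feed a single observation as a batch of one and return the action
        return self.sess.run(self.output_t, feed_dict={self.input_t: [ob]})[0]

# A trainer-specific subclass would only pin down its tensor names, e.g.:
class BaselinesCheckpointPolicy(TFCheckpointPolicy):
    def __init__(self, ckpt_path):
        # These tensor names are illustrative, not the real ones
        super().__init__(ckpt_path, "pi/ob:0", "pi/action:0")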

I've also added evaluation scripts and plotters in the examples directory that may help you.

wil3 closed this as completed Jun 23, 2020