Model Convergence Problem #29

hnsyzjianghan · 2019-07-25T08:20:10Z

Hi william,
I am trying to run this project (gymfc v0.1.0) on Ubuntu 18.04 with Gazebo 9. I use the PPO1 algorithm from OpenAI Baselines to train the controller, just like the example you gave me.
https://gist.github.com/wil3/4115a31c527afd4a7f8ecfab88fa4a24
However, after 15 million steps the EpRewMean does not seem to converge, although it is much higher than its initial value. When I load the trained graph, the performance of the controller does not meet the criteria in your paper.

I use the Gazebo 9 that ships with ROS and the code you provided (run_gymfc.py). Please tell me where the problem is, thank you.

wil3 · 2019-07-25T13:22:46Z

Hi @hnsyzjianghan, RL can be really sensitive and finicky. I haven't tried the older version of gymfc with gz9 and ROS, but if the PID example is working fine then that shouldn't cause any problems. The graph you are showing me isn't awful; it's learning the step, just not quite there. Some things to check/try:

Make sure the world step size is 0.001 or smaller
Make sure you train at least 3 independent trials with different seeds and take the best
When validating, average over at least 5 different steps to get a good idea of the progress. Usually I'll watch as checkpoints are created and validate them on the fly.
Try some hyperparameter tuning. In our follow-up Neuroflight paper we got better results with a step size of 1e-4; you can check that paper for all the hyperparameters we used. Also try hidden nodes of size 32.
If your validation shows that rewards are still increasing, then increase the number of simulation steps to 50 million.
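A minimal sketch of the seed and validation advice above, in Python. The `train` function is a hypothetical stand-in for a real PPO1 training run (it only fakes a policy whose quality depends on the seed), and `validate` and `best_of_trials` are illustrative names, not part of gymfc or baselines.

```python
import random
import statistics


def train(seed):
    """Hypothetical stand-in for one training run; returns a toy 'policy'.

    A real run would seed the RL library and the environment with `seed`.
    """
    rng = random.Random(seed)
    gain = rng.uniform(0.5, 1.0)  # pretend each seed learns a different gain
    return lambda target: gain * target


def validate(policy, step_targets):
    """Mean absolute tracking error of `policy` over several step targets."""
    return statistics.mean(abs(t - policy(t)) for t in step_targets)


def best_of_trials(seeds, step_targets):
    """Train one trial per seed and return (lowest mean error, its seed)."""
    results = [(validate(train(seed), step_targets), seed) for seed in seeds]
    return min(results)


if __name__ == "__main__":
    # At least 3 independent seeds, validated on at least 5 different steps.
    error, seed = best_of_trials(seeds=[0, 1, 2],
                                 step_targets=[-2.0, -1.0, 0.5, 1.0, 2.0])
    print("best seed:", seed, "mean abs error:", error)
```

The point of the structure is that the comparison between trials uses an error averaged over several step inputs, so a single lucky step does not decide which checkpoint is kept.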

hnsyzjianghan · 2019-07-26T08:19:03Z

Dear william,
Thank you very much for your answer; I'll try your advice. I am a novice with RL and gymfc, and I still have other doubts:

You suggest running trials with different seeds, but a seed does not seem to be used in your demo run_gymfc.py. I believe your demo is modeled on /baselines/ppo1/run_humanoid.py. However, run_humanoid.py builds the environment with make_mujoco_env(), which takes a seed parameter, while your demo builds the environment with gym.make(), which takes no seed. What is the difference between these two functions in your project? What is the effect of the seed in your environment?
In your Neuroflight paper, I see that the angular velocity error and the delta angular velocity error are used in the reward function. Do you use the angular error (the integral of the angular velocity)? In gymfc version 0.1.0, it seems that the FDMPacket class in gazebo_env.py returns data to the environment, and unpacked[7:11] looks like a quaternion giving the attitude of the aircraft. I tried printing these values, but found that they were fixed at [1, 0, 0, 0].
In your newly updated version, gymfc 0.2.0, I see a note about possibly removing the Gym portions. Does this version support the gym environment? If so, how can I use the new version to train with the OpenAI Baselines RL algorithms? Can you give me some examples to refer to?
Thank you again for your answer; I look forward to your reply.
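
On the quaternion reading mentioned above: if the four values are ordered (w, x, y, z), which is an assumption here and not something the thread confirms for FDMPacket, then [1, 0, 0, 0] is the identity quaternion, meaning zero roll, pitch and yaw. A small Python sketch of the standard conversion (the function name `quat_to_euler` is illustrative):

```python
import math


def quat_to_euler(w, x, y, z):
    """Convert a unit quaternion (w, x, y, z) to (roll, pitch, yaw) in radians.

    Assumes the aerospace Z-Y-X (yaw, pitch, roll) rotation sequence.
    """
    roll = math.atan2(2.0 * (w * x + y * z), 1.0 - 2.0 * (x * x + y * y))
    # Clamp to guard against rounding pushing the value outside [-1, 1].
    sin_pitch = max(-1.0, min(1.0, 2.0 * (w * y - z * x)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))
    return roll, pitch, yaw


# The identity quaternion corresponds to no rotation at all.
print(quat_to_euler(1.0, 0.0, 0.0, 0.0))
```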

wil3 · 2019-07-26T14:08:22Z

So that script is just a demo script for interfacing with gymfc using OpenAI Baselines; it isn't the script used for the manuscript. For your own work, you will need to implement your own script to interface with the gym.

The gym source for the neuroflight paper was never published, just the neuroflight firmware code.

gymfc2 is a huge implementation change; it's now a general flight control tuning framework. The gymfc2 core itself is not an OpenAI gym any more; however, this is extremely easy to do, as you just need to inherit the FlightControlEnv class and provide your own state and reward functions. I will be porting the OpenAI gym interfaces over to the new architecture when I have time. An example of creating a subclass can be found here. Further details will be in my thesis, which will be completed in the next couple of weeks, and when I have time I'll add more to the readme and additional examples. In the meantime, these are new motor models that I've ported over for gymfc2: https://github.com/wil3/gymfc-aircraft-plugins.
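
The "inherit and provide your own state and reward functions" pattern could look roughly like the Python sketch below. Everything in it is an assumption for illustration: `FlightControlEnvStub` is a toy stand-in and not the real gymfc `FlightControlEnv` (which talks to Gazebo), and the method names `simulate`, `state`, `reward` and `step` are hypothetical, not the gymfc2 API.

```python
class FlightControlEnvStub:
    """Hypothetical stand-in for gymfc's FlightControlEnv (NOT the real API)."""

    def simulate(self, action):
        # Toy dynamics: the angular velocity reaches 90% of the command.
        self.angular_velocity = [0.9 * a for a in action]
        return self.angular_velocity


class RateTrackingEnv(FlightControlEnvStub):
    """Subclass that supplies its own state and reward functions."""

    def __init__(self, target):
        self.target = list(target)               # desired roll/pitch/yaw rates
        self.angular_velocity = [0.0, 0.0, 0.0]  # last measured rates

    def state(self):
        # State is the per-axis angular velocity error.
        return [t - w for t, w in zip(self.target, self.angular_velocity)]

    def reward(self):
        # Negative sum of absolute errors: zero is the best possible reward.
        return -sum(abs(e) for e in self.state())

    def step(self, action):
        self.simulate(action)
        return self.state(), self.reward()


env = RateTrackingEnv(target=[1.0, 0.0, 0.0])
print(env.step([1.0, 0.0, 0.0]))
```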

hnsyzjianghan · 2019-08-07T09:09:49Z

Sorry, I went on a business trip the other day. Thank you for your reply. I will use your model to finish my project.

hnsyzjianghan · 2019-08-19T06:38:18Z

Hi william,
I have another problem. When I run the simulator, Gazebo sometimes reports the following (especially when I use mpiexec python (mpi4py MPI) in the terminal to accelerate the simulation):
Broken Quadcopter connection, count 1~5/5.
and when count=5, it reports
Broken Quadcopter connection, resetting motor control.
It looks like the connection between Gazebo and Python is broken. I found that this report comes from /gymfc/envs/assets/gazebo/plugins/QuadcopterWorldPlugin.cpp.
Does this problem affect the simulation results? And how can I avoid it?

wil3 · 2019-08-22T13:33:15Z

Hi @hnsyzjianghan,
What you are seeing is the error reported by parts of the Arducopter plugin that were used in the first version of GymFC. It is caused by a network communication issue where packets sent from Python to Gazebo get dropped. If you can't communicate with the quadcopter then yes, it will affect the results, because the quadcopter isn't receiving the commands. The version of GymFC you are using is outdated and no longer supported; I suggest migrating to GymFCv2 on the master branch, which is far more stable with significant enhancements. However, if you still want to use the Iris quadcopter model, we need help migrating it over to GymFCv2.
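
To illustrate the behaviour behind the "count n/5" and "resetting motor control" messages, here is a rough Python sketch of a missed-packet counter. It is not the plugin's actual code (that lives in QuadcopterWorldPlugin.cpp); the class name, the limit of 5 taken from the log text, and resetting the counter after a reset are all assumptions.

```python
class ConnectionWatchdog:
    """Counts consecutive missed control packets (hypothetical illustration)."""

    def __init__(self, max_misses=5):
        self.max_misses = max_misses
        self.misses = 0

    def update(self, packet_received):
        """Process one simulation step; return (miss count, should_reset)."""
        if packet_received:
            self.misses = 0          # any received packet clears the counter
            return 0, False
        self.misses += 1
        count = self.misses
        if count >= self.max_misses:
            self.misses = 0          # assumption: start counting afresh
            return count, True       # caller would zero the motor commands
        return count, False


watchdog = ConnectionWatchdog()
for received in [False, False, True, False]:
    print(watchdog.update(received))
```

Every step that ends in a miss is a step where the simulated quadcopter acts on stale or zeroed commands, which is why dropped packets distort training results.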

hnsyzjianghan · 2019-08-22T14:34:00Z

Okay, I'll change it in future work, thank you.

wil3 closed this as completed Aug 22, 2019
