Simulation slow af #78

Closed
tlbtlbtlb opened this Issue Sep 26, 2017 · 6 comments

tlbtlbtlb commented Sep 26, 2017

Simulation seems to be 3 orders of magnitude slower than the MuJoCo-based simulations in OpenAI Gym. While the Humanoid environment in Gym can run at something like 10x real time, this runs at around 1/160th real time, i.e. about 1600x slower, even though Humanoid is 3D and this is planar. The time to execute a step varies a lot, probably depending on the speed of movement and contacts, ranging between 0.039 and 3.500 seconds on a fast machine.

(and yes, visualization is turned off).

Wrapping step thusly:

        import time  # at module top in the real script

        # total_step_time and total_steps are initialized to zero before the
        # rollout loop; self.env is the environment under test.
        t0 = time.time()
        obs, reward, done, info = self.env.step(action)
        t1 = time.time()
        total_step_time += t1 - t0
        total_steps += 1
        print('step time %0.3f avg %0.3f' % (t1 - t0, total_step_time / total_steps))

On an Intel(R) Core(TM) i7-5930K CPU @ 3.50GHz, I see:

step time 0.166 avg 0.166
step time 0.039 avg 0.103
step time 0.044 avg 0.083
step time 0.064 avg 0.078
step time 0.095 avg 0.082
step time 0.051 avg 0.076
step time 0.103 avg 0.080
step time 0.067 avg 0.079
step time 0.063 avg 0.077
step time 0.145 avg 0.084
step time 0.106 avg 0.086
step time 0.141 avg 0.090
step time 0.207 avg 0.099
step time 0.156 avg 0.103
step time 0.205 avg 0.110
step time 0.180 avg 0.114
step time 0.199 avg 0.119
step time 0.276 avg 0.128
step time 0.285 avg 0.136
step time 0.283 avg 0.144
step time 0.297 avg 0.151
step time 0.378 avg 0.161
step time 0.649 avg 0.183
step time 0.997 avg 0.216
step time 0.694 avg 0.236
step time 0.901 avg 0.261
step time 3.762 avg 0.391
step time 1.804 avg 0.441
step time 0.484 avg 0.443
step time 0.347 avg 0.440
step time 0.296 avg 0.435
step time 0.197 avg 0.427
step time 0.177 avg 0.420
step time 0.168 avg 0.412
step time 0.258 avg 0.408
step time 0.286 avg 0.405
step time 0.381 avg 0.404
step time 0.370 avg 0.403
step time 0.445 avg 0.404
step time 0.488 avg 0.406
step time 0.537 avg 0.410
step time 0.761 avg 0.418
step time 0.972 avg 0.431
step time 1.059 avg 0.445
step time 1.224 avg 0.462
step time 1.327 avg 0.481
step time 1.027 avg 0.493
step time 1.292 avg 0.509
step time 1.543 avg 0.530
step time 1.588 avg 0.552
step time 1.634 avg 0.573
step time 1.675 avg 0.594
step time 1.654 avg 0.614
step time 1.583 avg 0.632
step time 1.883 avg 0.655
step time 2.110 avg 0.681
step time 1.930 avg 0.703
step time 2.389 avg 0.732
step time 2.614 avg 0.764
step time 2.402 avg 0.791
step time 2.656 avg 0.821
step time 2.908 avg 0.855
step time 2.430 avg 0.880
step time 2.631 avg 0.907
step time 2.947 avg 0.939
step time 3.078 avg 0.971
step time 2.874 avg 1.000
step time 3.010 avg 1.029
step time 3.041 avg 1.058
step time 3.018 avg 1.086
step time 3.044 avg 1.114
step time 3.233 avg 1.143
step time 3.140 avg 1.171
step time 3.107 avg 1.197
step time 3.028 avg 1.221
step time 3.084 avg 1.246
step time 3.277 avg 1.272
step time 3.030 avg 1.295
step time 3.082 avg 1.317
step time 3.240 avg 1.341
step time 3.258 avg 1.365
step time 3.268 avg 1.388
step time 3.224 avg 1.410
step time 3.200 avg 1.432
step time 3.517 avg 1.456
step time 3.808 avg 1.484
step time 2.974 avg 1.501
step time 3.064 avg 1.519
step time 2.993 avg 1.535
step time 2.725 avg 1.548
step time 2.923 avg 1.563
step time 3.039 avg 1.579
step time 3.107 avg 1.596
step time 3.245 avg 1.613
step time 2.597 avg 1.624
step time 2.060 avg 1.628
step time 2.023 avg 1.632
step time 1.748 avg 1.634
step time 1.047 avg 1.628
step time 0.744 avg 1.619

For comparison, the Gym Humanoid-v1 environment shows:

step time 0.001 avg 0.000621
step time 0.001 avg 0.000614
step time 0.001 avg 0.000610
step time 0.001 avg 0.000629
step time 0.001 avg 0.000624
step time 0.001 avg 0.000626
step time 0.001 avg 0.000624
step time 0.001 avg 0.000626
step time 0.001 avg 0.000631
step time 0.001 avg 0.000625
step time 0.001 avg 0.000641
step time 0.001 avg 0.000653
step time 0.001 avg 0.000669
step time 0.001 avg 0.000683
step time 0.001 avg 0.000689
step time 0.001 avg 0.000700
step time 0.001 avg 0.000711
step time 0.001 avg 0.000716
step time 0.001 avg 0.000719
step time 0.001 avg 0.000724
step time 0.001 avg 0.000727
step time 0.001 avg 0.000729
step time 0.001 avg 0.000732
step time 0.001 avg 0.000734
step time 0.001 avg 0.000735
step time 0.001 avg 0.000739
step time 0.001 avg 0.000740
step time 0.001 avg 0.000745
step time 0.001 avg 0.000746
step time 0.001 avg 0.000747
step time 0.001 avg 0.000748
step time 0.001 avg 0.000749
step time 0.001 avg 0.000751
step time 0.001 avg 0.000755
step time 0.001 avg 0.000758
step time 0.001 avg 0.000764
step time 0.001 avg 0.000770
step time 0.001 avg 0.000775
step time 0.001 avg 0.000776
step time 0.001 avg 0.000779
step time 0.001 avg 0.000780
step time 0.001 avg 0.000781
step time 0.001 avg 0.000783
step time 0.001 avg 0.000786
step time 0.001 avg 0.000787
step time 0.001 avg 0.000789
step time 0.001 avg 0.000790
step time 0.001 avg 0.000791
step time 0.001 avg 0.000792
step time 0.001 avg 0.000793
step time 0.001 avg 0.000797
step time 0.001 avg 0.000805
step time 0.001 avg 0.000812
step time 0.001 avg 0.000817
step time 0.001 avg 0.000819
step time 0.001 avg 0.000821
step time 0.001 avg 0.000824
step time 0.001 avg 0.000826
step time 0.001 avg 0.000831
step time 0.001 avg 0.000834
step time 0.001 avg 0.000838
step time 0.001 avg 0.000846
step time 0.001 avg 0.000851
step time 0.001 avg 0.000856
step time 0.001 avg 0.000858
step time 0.001 avg 0.000863
step time 0.001 avg 0.000867
step time 0.001 avg 0.000873
step time 0.001 avg 0.000875
step time 0.001 avg 0.000877
step time 0.001 avg 0.000882
step time 0.001 avg 0.000885
step time 0.001 avg 0.000888
step time 0.001 avg 0.000892
step time 0.001 avg 0.000895
step time 0.001 avg 0.000898
step time 0.001 avg 0.000904
step time 0.002 avg 0.000912
step time 0.002 avg 0.000922
step time 0.002 avg 0.000932
step time 0.001 avg 0.000939
step time 0.001 avg 0.000943
step time 0.001 avg 0.000945
step time 0.001 avg 0.000947
step time 0.001 avg 0.000948
step time 0.001 avg 0.000950
step time 0.001 avg 0.000951
step time 0.001 avg 0.000953
step time 0.001 avg 0.000954
step time 0.001 avg 0.000955
step time 0.001 avg 0.000958
step time 0.001 avg 0.000962
step time 0.001 avg 0.000966
step time 0.001 avg 0.000971
step time 0.001 avg 0.000975
step time 0.002 avg 0.000982
step time 0.002 avg 0.000989
step time 0.001 avg 0.000994
step time 0.002 avg 0.001000
step time 0.002 avg 0.001005
kidzik commented Sep 26, 2017

That's true, our musculoskeletal environment is much slower than the Humanoid model from OpenAI Gym; OpenSim (https://github.com/opensim-org/opensim-core) is optimized for different use cases. We discussed a few ways of tuning the performance in #33, and ctmakro suggested https://github.com/ctmakro/stanford-osrl#the-simulation-is-too-slow
Yet it's still orders of magnitude slower, and there is nothing we can do before the end of the challenge. After all, we need algorithms that solve problems in environments where function evaluation is expensive...

tlbtlbtlb commented Sep 26, 2017

The complexity is probably similar to the HalfCheetah environment. The best RL algorithms take around 3M timesteps to solve it, or about 40 minutes of simulator time on a single core. Extrapolating from the numbers above, it would take around 40 days to solve gait9dof18musc on a single core. So I'd recommend that people in the competition use parallel algorithms like A3C if they hope to get anywhere with pure RL.

Also, it's probably wise to debug your algorithms on a fast environment (like HalfCheetah) so you can iterate quickly, before tackling the expensive environment.
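
For concreteness, here is a minimal sketch of one way to step several envs in parallel with Python's multiprocessing, one simulator per subprocess. The names worker and make_env, the random actions, and the toy Gym env standing in for the slow simulator are all placeholders, not anyone's actual training code:

    import multiprocessing as mp

    import numpy as np

    def worker(remote, env_fn):
        # Each subprocess owns one env and serves step requests over a pipe.
        env = env_fn()
        env.reset()
        while True:
            cmd, data = remote.recv()
            if cmd == 'step':
                # Old Gym API: step returns a 4-tuple.
                obs, reward, done, info = env.step(data)
                if done:
                    obs = env.reset()
                remote.send((obs, reward, done))
            elif cmd == 'close':
                remote.close()
                break

    def make_env():
        # Placeholder: replace with the slow env, e.g. RunEnv(visualize=False).
        import gym
        return gym.make('Pendulum-v0')

    if __name__ == '__main__':
        n_envs = 4
        remotes, work_remotes = zip(*[mp.Pipe() for _ in range(n_envs)])
        procs = [mp.Process(target=worker, args=(wr, make_env), daemon=True)
                 for wr in work_remotes]
        for p in procs:
            p.start()
        for _ in range(100):
            # Broadcast one (random) action per env, then gather the
            # transitions; all n_envs simulator steps run concurrently.
            for r in remotes:
                r.send(('step', np.random.uniform(-2, 2, size=(1,))))
            results = [r.recv() for r in remotes]
        for r in remotes:
            r.send(('close', None))

Each pipe round-trip carries one action in and one transition out, so the n slow simulator steps overlap instead of serializing.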

hagrid67 commented Sep 27, 2017

It sounds like a worthy aim to want to solve the slow-environment problem (perhaps in general). But at the moment there doesn't seem to be a piece of infrastructure for running multiple envs in parallel. Is there some way the lab/project could help to foster and/or guide a parallel infrastructure? (e.g. ctmakro has made his parallel DDPG open source on GitHub.)

What are the views on MPI? It seems to be in extensive use at OpenAI (e.g. see their "baselines"), but it also seems a bit stale; for example, it doesn't seem to be possible to compile it on Ubuntu because it depends on defunct libraries. This appears to tie us into Anaconda (who have managed to compile it somehow). And the API looks all a bit 1990s...

Pyro4, as used by ctmakro, seems more amenable, popular and well-maintained. (But I'll be happy to hear arguments in favour of MPI...)

As it stands, the competition seems to encourage people to build their own parallel infrastructure. It might be more productive if this element could be collectivised, so that the disparate innovation efforts could concentrate on the learning algorithms. Perhaps in a future competition?
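
For flavour, a minimal sketch of the Pyro4 approach (the class name RemoteEnv and the toy Gym env are illustrative assumptions, not taken from ctmakro's code): a server process exposes reset/step, and a client drives it as if the env were local.

    import Pyro4

    @Pyro4.expose
    class RemoteEnv:
        def __init__(self):
            # Placeholder: replace with the slow env, e.g. RunEnv(visualize=False).
            import gym
            self.env = gym.make('Pendulum-v0')

        def reset(self):
            # Convert numpy arrays to lists so the default serializer accepts them.
            return self.env.reset().tolist()

        def step(self, action):
            obs, reward, done, info = self.env.step(action)
            return obs.tolist(), float(reward), bool(done)

    if __name__ == '__main__':
        daemon = Pyro4.Daemon()           # server process
        uri = daemon.register(RemoteEnv)  # clients connect to this URI
        print('env uri:', uri)
        daemon.requestLoop()

A client (possibly on another machine) then does env = Pyro4.Proxy(uri) and calls env.reset() / env.step(action) over the network.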

ctmakro commented Oct 13, 2017

Off-topic: I've read a lot about @tlbtlbtlb (through Paul Graham's essays), but wasn't expecting to find him here, worrying about the same problems we chose to ignore a long time ago.

tlbtlbtlb commented Oct 13, 2017

@ctmakro :-). I'm working on human-in-the-loop RL, which requires simulating entire episodes in interactive time, ideally < 100 ms, in parallel across many cores. I found that even the overhead of Gym and mujoco-py was substantial, so the current design uses the C interface to MuJoCo. I should have something to release in November.

kidzik commented Mar 24, 2018

I added an option to change the accuracy of the integrator through the Python interface (kidzik/opensim-core@550c2a1), and it's already being tested: self.manager.setIntegratorAccuracy(1e-1). This can potentially yield a 2x-3x speedup. Getting an improvement an order of magnitude larger would require some hacks on the OpenSim side, so it's beyond the scope of this repo :( Closing for now.
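
A hedged usage sketch of that knob (RunEnv is the 2017 challenge env; the env.manager attribute path is an assumption about the Python wrapper, not a documented API):

    from osim.env import RunEnv  # the 2017 Learning to Run env

    env = RunEnv(visualize=False)
    env.reset()
    # Looser tolerance means fewer integrator iterations per step:
    # faster, but less accurate, simulation. The attribute path
    # env.manager is assumed; check your osim-rl version.
    env.manager.setIntegratorAccuracy(1e-2)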

kidzik closed this Mar 24, 2018

adwardlee referenced this issue May 16, 2018: slow simulation #8 (Closed)

iandanforth referenced this issue Jun 27, 2018: Benchmarking #4 (Closed)
