The simulation scene is terminated before the set condition #657

Open
guanjiayi opened this issue Mar 9, 2021 · 11 comments
Labels
help wanted Extra attention is needed

Comments

@guanjiayi

High Level Description
[I want the simulation to terminate only after the termination conditions I set are met.]

Desired SMARTS version
[e.g. 0.4.12]

Operating System
[Ubuntu 18.04]

Problems
[1. I set the number of training episodes to episode_num = 1000, but the simulation terminates somewhere between episode 8 and 17, and the total step count at termination is always the same (7763).]
[2. If I modify and rebuild the loop scenario, the simulation terminates at a different total step count. Without any changes to the scenario, it always terminates at the same total step count.]

@guanjiayi guanjiayi added the help wanted Extra attention is needed label Mar 9, 2021
@Gamenot
Collaborator

Gamenot commented Mar 9, 2021

@guanjiayi Hello, thank you for the question. I need more information so that I can answer you.

Are you using rllib.py, or benchmark, or ultra to try and train? Or did you create your own training?

Are you seeing any errors when the simulation exits?

If you are using rllib.py, the scenario could end because the training crashed, in which case there will be an error log in "~/ray_results/rllib_example_multi/**/error.txt". If so, please provide that error text.
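To collect those logs in one go, here is a minimal sketch that walks the results directory mentioned above and returns every non-empty error.txt. The helper name `find_error_logs` is made up for illustration; only the directory layout comes from the path quoted above.

```python
from pathlib import Path


def find_error_logs(results_dir):
    """Return (path, text) pairs for every non-empty error.txt below results_dir.

    `find_error_logs` is a hypothetical helper, not part of SMARTS or Ray.
    """
    root = Path(results_dir).expanduser()
    logs = []
    # "**/error.txt" matches error.txt at any depth under root;
    # a missing root directory simply yields no matches.
    for path in sorted(root.glob("**/error.txt")):
        text = path.read_text(errors="replace")
        if text.strip():
            logs.append((path, text))
    return logs
```

Calling `find_error_logs("~/ray_results/rllib_example_multi")` and printing each returned text should give output that can be pasted straight into a reply here.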

@guanjiayi
Author

@Gamenot Hello, thank you for your reply.

I created my own training script, but it is based on single_agent.py. The error also appears when I run example/single_agent.py with only one change, setting episodes=1000.

I use the loop scenario. The error reported is "ERROR:SMARTS:Simulation crashed with exception. Attempting to cleanly shutdown. ERROR:SMARTS:connection closed by SUMO".

@guanjiayi
Author

The full error output:
ERROR:SMARTS:Simulation crashed with exception. Attempting to cleanly shutdown.
ERROR:SMARTS:connection closed by SUMO
Traceback (most recent call last):
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 170, in step
return self._step(agent_actions)
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 219, in _step
provider_state = self._step_providers(all_agent_actions, dt)
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 695, in _step_providers
provider, actions, dt, self._elapsed_sim_time
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 734, in _step_provider
provider_state = provider.step(provider_actions, dt, elapsed_sim_time)
File "/home/jiayiguan/SMARTS/smarts/core/sumo_traffic_simulation.py", line 310, in step
self._traci_conn.simulationStep(self._cumulative_sim_seconds)
File "/usr/share/sumo/tools/traci/connection.py", line 302, in simulationStep
result = self._sendCmd(tc.CMD_SIMSTEP, None, None, "D", step)
File "/usr/share/sumo/tools/traci/connection.py", line 180, in _sendCmd
return self._sendExact()
File "/usr/share/sumo/tools/traci/connection.py", line 90, in _sendExact
raise FatalTraCIError("connection closed by SUMO")
traci.exceptions.FatalTraCIError: connection closed by SUMO
╰────────────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────╯
Traceback (most recent call last):
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 170, in step
return self._step(agent_actions)
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 219, in _step
provider_state = self._step_providers(all_agent_actions, dt)
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 695, in _step_providers
provider, actions, dt, self._elapsed_sim_time
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 734, in _step_provider
provider_state = provider.step(provider_actions, dt, elapsed_sim_time)
File "/home/jiayiguan/SMARTS/smarts/core/sumo_traffic_simulation.py", line 310, in step
self._traci_conn.simulationStep(self._cumulative_sim_seconds)
File "/usr/share/sumo/tools/traci/connection.py", line 302, in simulationStep
result = self._sendCmd(tc.CMD_SIMSTEP, None, None, "D", step)
File "/usr/share/sumo/tools/traci/connection.py", line 180, in _sendCmd
return self._sendExact()
File "/usr/share/sumo/tools/traci/connection.py", line 90, in _sendExact
raise FatalTraCIError("connection closed by SUMO")
traci.exceptions.FatalTraCIError: connection closed by SUMO

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "rlagentmodel/sacagent_dis/sacagent_dis20210310.py", line 669, in <module>
epochs=args.epochs,
File "rlagentmodel/sacagent_dis/sacagent_dis20210310.py", line 598, in main
observations, rewards, dones, infos = env.step({AGENT_ID:agent_action})
File "/home/jiayiguan/SMARTS/smarts/env/hiway_env.py", line 160, in step
observations, rewards, agent_dones, extras = self._smarts.step(agent_actions)
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 181, in step
self.destroy()
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 460, in destroy
self._traffic_sim.destroy()
File "/home/jiayiguan/SMARTS/smarts/core/sumo_traffic_simulation.py", line 110, in destroy
self._close_traci_and_pipes()
File "/home/jiayiguan/SMARTS/smarts/core/sumo_traffic_simulation.py", line 270, in _close_traci_and_pipes
self._traci_conn.close()
File "/usr/share/sumo/tools/traci/connection.py", line 369, in close
if self._socket is not None:
AttributeError: 'Connection' object has no attribute '_socket'
Assertion failed: !is_empty() at line 2340 of panda/src/pgraph/nodePath.cxx
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/jiayiguan/anaconda3/envs/smarts/lib/python3.7/site-packages/direct/showbase/ShowBase.py", line 82, in exitfunc
builtins.base.destroy()
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 451, in destroy
self.teardown()
File "/home/jiayiguan/SMARTS/smarts/core/smarts.py", line 434, in teardown
self._root_np.clearLight()
AssertionError: !is_empty() at line 2340 of panda/src/pgraph/nodePath.cxx
/home/jiayiguan/anaconda3/envs/smarts/lib/python3.7/tempfile.py:798: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpuv7_ey91wandb'>
_warnings.warn(warn_message, ResourceWarning)
/home/jiayiguan/anaconda3/envs/smarts/lib/python3.7/tempfile.py:798: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmp4vccvsv3wandb-media'>
_warnings.warn(warn_message, ResourceWarning)
/home/jiayiguan/anaconda3/envs/smarts/lib/python3.7/tempfile.py:798: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpdnk9cmvwwandb-media'>
_warnings.warn(warn_message, ResourceWarning)

@Gamenot
Collaborator

Gamenot commented Mar 11, 2021

@guanjiayi We are really sorry about that. This exception is a hard-to-reproduce bug related to SUMO, and we currently have a branch in progress (#619) to pass that crash along to the SUMO team.

For now, I would suggest removing the bubble in the loop scenario or using a different map, since the crash occurs with some frequency on the loop map.
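If changing the scenario is not an option, a training-side stopgap is to rebuild the environment whenever stepping it raises, so a long run is not lost to one crash. The following is only a sketch under stated assumptions: `make_env` and `run_episode` are hypothetical callables supplied by the training script, not SMARTS APIs, and whether a fresh environment starts cleanly in the same process after this particular crash is untested.

```python
def train_with_restarts(make_env, run_episode, num_episodes, max_restarts=5):
    """Run num_episodes episodes, rebuilding the environment after a crash.

    make_env:    hypothetical zero-argument factory returning an object
                 with a close() method (e.g. a gym-style environment).
    run_episode: hypothetical callable (env, episode_index) that runs one
                 full episode and raises if the simulation crashes.
    Returns the number of restarts that were needed.
    """
    env = make_env()
    episode = 0
    restarts = 0
    while episode < num_episodes:
        try:
            run_episode(env, episode)
            episode += 1
        except Exception as exc:
            # Broad on purpose: in the traceback above, the exception that
            # reaches the training script is an AttributeError raised during
            # cleanup, not the original FatalTraCIError.
            restarts += 1
            if restarts > max_restarts:
                raise
            print(f"episode {episode} crashed ({exc!r}); rebuilding environment")
            try:
                env.close()
            except Exception:
                pass  # the crashed environment may already be half torn down
            env = make_env()
    env.close()
    return restarts
```

This does not fix the underlying SUMO crash. The interrupted episode is rerun from scratch, and `max_restarts` keeps a persistent failure from looping forever.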

@Adaickalavan
Member

Hi @guanjiayi,

It appears that you are encountering the traci.exceptions.FatalTraCIError: connection closed by SUMO error.

Given this error, could you try running all your commands and experiments inside a docker container?

$ docker run --rm -it --network=host huaweinoah/smarts:v0.4.13

Do not map the source code using -v $PWD:/src when running the docker container.

@guanjiayi
Author

@Adaickalavan Thank you for your reply. I investigated the problem in a different way.
A different way:

  1. I installed SMARTS without installing any other packages. The problem did not appear when testing single_agent.py.
  2. Then I installed the OpenAI reinforcement learning package and the wandb package, and the problem appeared.
  3. Now I have changed the install order: SMARTS is installed after the OpenAI reinforcement learning package, and wandb is not installed. (I am still running this test.)

Use docker

  1. sumo-gui cannot be opened in the docker container.
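One way to narrow down the install-order experiment above is to save the output of `pip freeze` in the working and the failing environment and compare the two. The sketch below does that comparison; the function names are made up for illustration, and it only looks at plain `name==version` lines.

```python
def parse_freeze(text):
    """Map lower-cased package name -> version from `pip freeze` output."""
    versions = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not "name==version"
        # (for example editable installs or URL requirements).
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        versions[name.strip().lower()] = version.strip()
    return versions


def diff_environments(working_text, broken_text):
    """Return {package: (version_in_working, version_in_broken)} for differences.

    A version of None means the package is absent from that environment.
    """
    working = parse_freeze(working_text)
    broken = parse_freeze(broken_text)
    changes = {}
    for name in sorted(set(working) | set(broken)):
        before, after = working.get(name), broken.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Any package that shows up with a different version after installing the extra packages is a candidate for the dependency that changes the behavior, though this alone does not prove it is the cause.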

@guanjiayi
Author

The OpenAI reinforcement learning package I mentioned is Spinning Up: https://spinningup.openai.com/en/latest/user/installation.html

@Gamenot
Collaborator

Gamenot commented Mar 17, 2021

@guanjiayi OK, I will take a look today.

@guanjiayi
Author

@Gamenot Thank you. The "connection closed by SUMO" error did not appear when running the example in the docker container after installing the spinningup package.

@guanjiayi
Author

@Gamenot Thank you for your help!

@Gamenot
Collaborator

Gamenot commented Mar 23, 2021

The progress on this is that we passed the bug along to SUMO via a reproducible crash we can generate in #619.

3 participants