[bug] [Ubuntu 20.04.2] [ROS1] System becomes completely unresponsive - Memory Problem with ROS extension in a Docker container #393

Closed
3 tasks
goupil35000 opened this issue Feb 21, 2021 · 24 comments
Labels
bug Something isn't working

Comments

@goupil35000

goupil35000 commented Feb 21, 2021

  • Linux: Ubuntu 20.04.2
  • Windows: latest versions of Windows Insider
  • ROS 1: Noetic

<Version of the plugin: 0.6.7 or 0.6.6>

<Version: 1.53.2>

what is the bug

The system becomes unresponsive when the ROS extension is activated in Docker (the container is configured with ROS activated, catkin_ws ... and ROS works very well without Visual Studio Code in this container).

Repro steps

  1. Launch a Docker container.
  2. Launch VS Code and attach it to the container.
  3. Enable the ROS extension in the container. Memory usage grows quickly until the system becomes unresponsive.
  4. Once the system is unresponsive, nothing can be done in Ubuntu; a hard reboot is required.

expected behavior

Everything works well until the ROS extension is activated. The same thing happens if the extension was already activated and you then attach Visual Studio Code to the container.
The problem doesn't occur if you do this:

  1. Launch a Docker container.
  2. Launch roscore in Docker container
  3. Launch VS Code and attach it to the container.
  4. ...

So the problem is related to roscore and the ROS extension in a Docker container. It doesn't occur outside of a container.

additional context

The problem was reproduced on multiple computers running Ubuntu 20.04.1 and 20.04.2, and on the latest versions of Windows Insider using WSL2, with Docker versions 19.03 and 20.10.3.

@goupil35000 goupil35000 added the bug Something isn't working label Feb 21, 2021
@caioaamaral

I'm running into the same problem.
Linux: Ubuntu 20.04
ROS 1: Kinetic

@goupil35000
Author

Hi,
The problem is still here for me. So now I launch code from inside the Docker container, and it works that way.
But this is not a solution to the problem, only a workaround to have something working.
Hope this helps.

@goupil35000 goupil35000 changed the title from "[bug] [Ubuntu 20.04.2] [ROS1] Memory Problem with ROS extension in a Docker container" to "[bug] [Ubuntu 20.04.2] [ROS1] System becomes completely unresponsive - Memory Problem with ROS extension in a Docker container" Apr 13, 2021
@goupil35000
Author

Before posting here, I asked a question in a more general forum:
https://github.com/microsoft/vscode/issues/111020#issuecomment-817617824

It seems that the problem also occurs with Melodic.

@nicolaje

nicolaje commented Apr 16, 2021

Hello, I have the exact same problem here: enabling the vscode-ros extension with VSCode attached to a Docker container crashes the computer by consuming all memory (see the attached pictures).

VSCode version :
1.55.2
3c4e3df9e89829dce27b7b5c24508306b151f30d
x64

Docker version 20.10.6, build 370c289

The docker container is based on a simple Ubuntu image.

VSCode-ros extension version : v0.6.7

ms-vscode-remote.vscode-remote-extensionpack version : v0.20.0

ROS Version installed in the container : Melodic

Host linux version : Ubuntu 18.04

In this picture you can see the RAM usage before the extension is enabled:

[image: disabled]

In this picture you can see the RAM usage after the extension has been enabled:

[image: enabled]

It ultimately crashes everything.

@ajshort @kejxu @ooeygui @JamesGiller can you tell me how I can help you track down this issue?

@caioaamaral

Maybe it's a problem with the includePaths. When I define C_Cpp.default.compileCommands inside my .code-workspace, it works fine.

@nicolaje

What do you set the "C_Cpp.default.compileCommands" setting to?

@ooeygui
Member

ooeygui commented Apr 18, 2021

@nicolaje Thank you for the analysis. I'll look into this problem this week.

@nicolaje

Great, @ooeygui let me know if I can help you with this.

@ooeygui
Member

ooeygui commented Apr 19, 2021

@nicolaje Could you provide more details on how I can reproduce this? Is there a docker image I can pull which exhibits the memory issue?

I've tried to reproduce this using the official ROS Docker images for Melodic with a simple catkin workspace on an Ubuntu 18.04 native host, with VSCode 1.55.2, ROS 0.6.7, and the VSCode remote extensions. I started the container in interactive mode, attached VSCode to it, then installed the extensions. Once everything was set up, I used a terminal in the Docker container to start /ros_entrypoint.sh, then roscore.

I really appreciate the time everyone is putting into this problem.
Lou

@nicolaje

Hi Lou,

I will try to give you a reproducible example. Have you run your test with a workspace that has already been built?

If I activate the ROS extension in VSCode attached to a Docker container where there are no build and devel folders, it doesn't crash. If I then build the workspace, the extension doesn't crash.

But if I relaunch the container, attach VSCode to it, and activate the extension with the existing build and devel folders, only then does the behavior reported in #393 (comment) appear.

@goupil35000
Author

Hi Lou,
I confirm the problem is still present and that it only occurs when all of these conditions hold:

  • you use VS Code outside of Docker, directly on the host;
  • the ROS extension is active in the Docker container (the ROS name appears in the bottom bar);
  • roscore is not running in the Docker container. If roscore is running before launching code, the problem doesn't occur.

My students and I ran into the problem multiple times, so I have asked them to use VS Code directly inside the Docker container, but that is not a good solution.

Hope this helps to find a solution to the problem I first mentioned in https://github.com/microsoft/vscode/issues/111020#issuecomment-817617824

Goupil

@nicolaje

nicolaje commented Apr 19, 2021

@goupil35000 has a point: it doesn't happen when roscore is running in Docker!

In this picture, roscore is running in the container (top left terminal), and the workspace is opened with VSCode attached to the container (as you can see from the ">< Container ros:seasam (/ros)" indicator at the bottom left of the VSCode window). There is no memory issue.

[image: with_roscore]

I then kill roscore in the container and leave VSCode open in the container, and no problem arises.

Then I close VSCode and reopen it in the container, and you can see from this picture that the memory usage explodes:

[image: without_roscore]

So a short-term workaround is to have roscore running in the container.

Edit : I also did the test of running the container with network=host, running roscore on the host instead of inside the container, then opening the workspace in VSCode attached to the container, and it did not crash, as was the case with roscore running inside the container.

@goupil35000
Author

Hello Nicolaje,

Could you tell me what the 6 forwarded ports shown at the bottom (center) of the screen are? Clicking on the indicator shows this information.
I created a new container with just ros noetic and it seems that I no longer have the problem. But I have no forwarded port at the bottom. When I'm looking at my previous videos, I have 1 forwarded port.
I don't know if this is related to our problem.

One last thing: your website on your profile is no longer active.

@nicolaje

There are a lot of auto-forwarded ports:

[image: ports]

Thank you for notifying me about my webpage, I like to think indeed that I will put it back online one day :)

@ooeygui
Member

ooeygui commented Apr 19, 2021

If I follow this thread correctly, does this only repro if you build your workspace on the host and project it into the container?

@goupil35000
Author

goupil35000 commented Apr 19, 2021

Hello Lou,
Thanks for your reply. I tried to find a simple way to reproduce the problem. It's not exactly what I had with my students, but I think it's related.
Here are the different steps.

Hope this helps you to find the problem.
Configuration: x64 (intel or amd) + nvidia gpu.
Ubuntu 20.04.2, VS Code 1.55.2 + extensions (all at their latest versions): Python, Jupyter, ROS, Docker

Goupil

=========================== BEGINNING ================================================
docker run -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth --net=host --gpus all ros:noetic-ros-base

------------------ In container ------------

echo "source /opt/ros/noetic/setup.bash" >> ~/.bashrc
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws/
catkin_make
echo "source $HOME/catkin_ws/devel/setup.bash" >> ~/.bashrc

------ Open Visual Studio Code -------------
Connect to your individual container using right click and "Attach Visual Studio Code".
Then click on the menu "File" and "Add Folder to Workspace". Select /root/catkin_ws and click OK.

Nothing changes at the bottom => ROS does not appear in the bottom bar.

Close Visual Studio Code. Reopen Visual Studio Code.
ROS appears at the bottom.

Memory consumption increases. You need to close Visual Studio Code before it reaches 100%, and you need to kill some processes:

  • containerd-shim-runc-v2
  • all processes containing "docker exec -i -u root -w /root/.vscode-server/extensions"

====================END========================================

@ooeygui
Member

ooeygui commented Apr 20, 2021

@nicolaje @goupil35000
Thank you for your time on this, I really appreciate it!
I'm really struggling to reproduce this locally. I suspect there may be another environmental issue which is conspiring to cause this. Would you be willing to set up a video conference with me and help me get a repro? If this is interesting to you, can you connect with me on linkedin so we can set up a screen share?

https://www.linkedin.com/in/louamadio/

@goupil35000
Author

Hi Lou,
Sorry, I'm not on LinkedIn. I'm going to sign up, so I need some time to get familiar with the interface.
If you don't have this problem, maybe we can connect with a screen share at the end of this week (friday afternoon ? -- I'm in France).
I include a video here:
https://user-images.githubusercontent.com/8226158/115373132-dcd03e00-a1cb-11eb-88f0-567f740e8caa.mp4

I continued to investigate the problem. It is not related to the GPU, so I removed the GPU option from the 'docker run' line.
1/ I remove --net=host. The command is this one.

docker run -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp/.docker.xauth:/tmp/.docker.xauth -e XAUTHORITY=/tmp/.docker.xauth ros:noetic-ros-base

I then simply do this in container:
mkdir -p ~/catkin_ws/src
cd ~/catkin_ws/
catkin_make

And in VS Code, 'Add Folder to Workspace' (/root/catkin_ws). In practice, once you have done this in VS Code, it is done automatically afterwards.

It works great (see the video until 1:23). You can open VS Code and it discovers ROS (see the bottom bar). You can close it and reopen it with no problem.

2/ I add --net=host (beginning at 1:23 in the video). I can connect VS Code to the container, but when ROS appears at the bottom you can see on the right that memory consumption increases very fast. I close the remote connection at 2:16 and you can see that memory consumption goes back to its initial value.
When looking at the processes on the host, you can see many processes related to net (beginning at 2:30 in the video).
/root/.vscode-server/bin/3c4e3df9e89829dce27b7b5c24508306b151f30d/node -e const net = require('net'); process.stdin.pause(); const client = net.

These processes persist even after closing VS Code completely.

Hope this will be enough to isolate the problem (I tested it on my home computer and on my laptop and I have the same on the 2 computers -- they have the same tools and software on the 2).

-------------- another problem detected ----------------
This new problem doesn't appear in the video. When you close VS Code, some processes remain on the host. Some seem to disappear after a while, but others stay until I stop the container.

/root/.vscode-server/bin/3c4e3df9e89829dce27b7b5c24508306b151f30d/node /tmp/vscode-remote-containers-server-078f4de17fd45b7e236e3de9123704823211e2d7.js

/root/.vscode-server/bin/3c4e3df9e89829dce27b7b5c24508306b151f30d/node /root/.vscode-server/bin/3c4e3df9e89829dce27b7b5c24508306b151f30d/out/vs/server/main.js --force-disable-user-env --use-host

sh /root/.vscode-server/bin/3c4e3df9e89829dce27b7b5c24508306b151f30d/server.sh --force-disable-user-env --use-host-proxy --port 0 --extensions-download-dir /root/.vscode-server/extensionsCache -

docker exec -i -u root -w /root/.vscode-server/extensions 09e2088020abd258398bcddd96c7617df5d4f8473ca8f31dbcefdd073f11f8e7 /bin/sh -c
    # Watch installed extensions
    trap "exit 0" 16
    old=`ls -A --full-time`
    counter=0
    while [ $counter -lt 60 ]
    do
        sleep 1
        new=`ls -A --full-time`
        if [ "$new" != "$old" ]
        then
            exit 1
        fi
        counter=`expr $counter + 1`
    done
    exit 2

docker exec -i -u root -w /root/.vscode-server/data/Machine 09e2088020abd258398bcddd96c7617df5d4f8473ca8f31dbcefdd073f11f8e7 /bin/sh -c
    # Watch machine settings
    trap "exit 0" 16
    old=`ls -A --full-time settings.json 2>/dev/null || true`
    counter=0
    while [ $counter -lt 60 ]
    do
        sleep 1
        new=`ls -A --full-time settings.json 2>/dev/null || true`
        if [ "$new" != "$old" ]
        then
            exit 1
        fi
        counter=`expr $counter + 1`
    done

Goupil

@ooeygui
Member

ooeygui commented Apr 20, 2021

Thank you for the video link. I found the root cause of the bug.

This line creates an interval to update the icon (200ms seems a bit fast to me, I'll fix that):

this.timer = setInterval(() => this.update(), 200);

This creates a promise, which appears to never return:

return this.getPid().then(() => true, () => false);

But this only occurs when you port forward port 11311.

I'm going to add two fixes:

  • Reduce the polling rate (lengthen the 200ms interval)
  • Don't recheck when there is a pending check.
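
As a rough illustration of the two fixes listed above, here is a minimal TypeScript sketch of a poller that uses a longer interval and skips a tick while the previous check is still pending. The names `Probe` and `StatusPoller` and the 2000 ms default are hypothetical and chosen for illustration only; this is not the actual vscode-ros code.

```typescript
// Hypothetical sketch, not the vscode-ros implementation.
// A probe is any async check, e.g. "is the ROS master reachable?".
type Probe = () => Promise<boolean>;

class StatusPoller {
    private pending = false;
    private timer: ReturnType<typeof setInterval> | undefined;
    public lastStatus: boolean | undefined;

    // intervalMs defaults to 2000 ms here (an assumed value, slower than 200 ms).
    constructor(private readonly probe: Probe, private readonly intervalMs: number = 2000) {}

    // Starts one check unless a previous one is still in flight.
    // Returns true if a new probe was started, false if the tick was skipped.
    tick(): boolean {
        if (this.pending) {
            return false;
        }
        this.pending = true;
        this.probe()
            .then((ok) => { this.lastStatus = ok; }, () => { this.lastStatus = false; })
            .then(() => { this.pending = false; });
        return true;
    }

    start(): void {
        if (this.timer === undefined) {
            this.timer = setInterval(() => this.tick(), this.intervalMs);
        }
    }

    stop(): void {
        if (this.timer !== undefined) {
            clearInterval(this.timer);
            this.timer = undefined;
        }
    }
}
```

With this guard, a probe that never settles results in at most one outstanding request, instead of a new one every tick. The guard alone never clears `pending` if the probe hangs forever, so in practice it would likely be combined with some form of timeout.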

@ooeygui
Member

ooeygui commented Apr 21, 2021

Plot thickens.

Even just doing 1 xmlrpc request on a forwarded port causes this issue to manifest. I'll see if I can repro with a simple extension.

@ooeygui
Member

ooeygui commented Apr 21, 2021

Confirmed that this can be reproduced with a simple extension that just does an xmlrpc call on a forwarded port with no listener.
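
For callers that want to defend against a request that never settles, such as the call on a forwarded port with no listener described above, one option is to bound the promise with a timeout. The sketch below is only an illustration under assumptions: `withTimeout` is a hypothetical helper, not part of vscode-ros or of any xmlrpc library, and it is not the upstream correction mentioned later in this thread. It also does not cancel the underlying request; it only stops the caller from waiting on it.

```typescript
// Hypothetical helper: resolves with `fallback` if `work` has not settled
// within `ms` milliseconds, or if `work` rejects.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
    return new Promise<T>((resolve) => {
        const timer = setTimeout(() => resolve(fallback), ms);
        work.then(
            (value) => { clearTimeout(timer); resolve(value); },
            () => { clearTimeout(timer); resolve(fallback); }
        );
    });
}
```

A status check could then be written along the lines of `withTimeout(check(), 1000, false)`, where `check` stands for whatever promise-returning probe is in use, so that a hung probe is reported as "not running" after one second.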

@nicolaje
Copy link

Good job !
Let me know if you want the fix to be tested

@ooeygui
Member

ooeygui commented Jun 8, 2021

This has been corrected upstream. Thank you for working through this with us!!!

@ooeygui ooeygui closed this as completed Jun 8, 2021
@nicolaje

nicolaje commented Jun 8, 2021

Thank you for the follow up and your time !

Best regards
