This repository has been archived by the owner on Feb 4, 2021. It is now read-only.

test_launch_ros.test_node failures #266

Closed
mjcarroll opened this issue Feb 19, 2020 · 16 comments

@mjcarroll
Member

New failures in nightly_linux_release since at least Friday, Feb 14th, 2020.

https://ci.ros2.org/view/nightly/job/nightly_linux_release/1444/

test_launch_ros.test.test_launch_ros.actions.test_node.TestNode.test_create_node_with_invalid_parameters
test_launch_ros.test.test_launch_ros.actions.test_node.TestNode.test_launch_node
test_launch_ros.test.test_launch_ros.actions.test_node.TestNode.test_launch_node_with_parameter_dict
test_launch_ros.test.test_launch_ros.actions.test_node.TestNode.test_launch_node_with_parameter_files
test_launch_ros.test.test_launch_ros.actions.test_node.TestNode.test_launch_node_with_remappings
test_launch_ros.test.test_launch_ros.actions.test_node.TestNode.test_launch_required_node
@mjcarroll
Member Author

@jacobperron and @ivanpauno could y'all take a look at this? You were the last to touch ros2/launch and ros2/launch_ros, I believe.

@jacobperron
Member

I almost thought these failures were unique to ros2/launch_ros#122 (comment) 😅


Looks like the tests are having a problem finding demo_nodes_py:

package 'demo_nodes_py' found at '/home/jenkins-agent/workspace/ci_linux/ws/install/demo_nodes_py', but libexec directory '/home/jenkins-agent/workspace/ci_linux/ws/install/demo_nodes_py/lib/demo_nodes_py' does not exist

I just did a fresh build from source and can't reproduce locally...

@jacobperron
Member

I can keep investigating, but extra eyes are welcome.

@jacobperron jacobperron self-assigned this Feb 19, 2020
@mjcarroll
Member Author

I wasn't able to reproduce locally either, but it has appeared repeatedly over the last several builds. I'm not sure what differs between CI and our local setups.

@jacobperron
Member

My money is on a pip dependency update... maybe setuptools 🤔
Updating now and will see if I can reproduce.

@ivanpauno
Member

I have also seen this in CI, but I couldn't reproduce it locally either.

My money is on a pip dependency update... maybe setuptools 🤔

That's a good bet.

Updating now and will see if I can reproduce.

Have you been able to reproduce it?

@jacobperron
Member

Have you been able to reproduce it?

Sadly, no, I haven't been able to repro yet. (Sorry, I forgot to post my result.)

@pbaughman

I can reproduce it very easily using the nightly docker image. See the instructions in the closed issue ros2/launch_ros#125

@pbaughman

pbaughman commented Feb 20, 2020

One additional clue:
If you're in the nightly docker image and look in the /opt directory, you'll find the demo_nodes_py libs live in /opt/ros/foxy/lib/python3.6/site-packages/demo_nodes_py

root@86a155f8e9b2:/ws# find /opt -iname 'demo_nodes_py'
/opt/ros/foxy/lib/python3.6/site-packages/demo_nodes_py # <--------- Here!
/opt/ros/foxy/share/demo_nodes_py
/opt/ros/foxy/share/colcon-core/packages/demo_nodes_py
/opt/ros/foxy/share/ament_index/resource_index/packages/demo_nodes_py

And the folder looks like this:

root@86a155f8e9b2:/ws# ll /opt/ros/foxy/lib/python3.6/site-packages/demo_nodes_py
total 24
drwxr-xr-x 5 1001 1001 4096 Feb 20 12:57 ./
drwxr-xr-x 1 1001 1001 4096 Feb 20 13:46 ../
-rw-r--r-- 1 1001 1001    0 Feb 20 12:32 __init__.py
drwxr-xr-x 2 1001 1001 4096 Feb 20 12:57 __pycache__/
drwxr-xr-x 3 1001 1001 4096 Feb 20 12:57 services/
drwxr-xr-x 3 1001 1001 4096 Feb 20 12:57 topics/

When you build from source, you'll see there's an extra lib/demo_nodes_py directory:

root@86a155f8e9b2:/ws# find . -iname 'demo_nodes_py'
./log/build_2020-02-20_16-04-25/demo_nodes_py
./build/demo_nodes_py
./build/demo_nodes_py/build/lib/demo_nodes_py
./install/demo_nodes_py
./install/demo_nodes_py/lib/demo_nodes_py  # <--------- Here!
./install/demo_nodes_py/lib/python3.6/site-packages/demo_nodes_py
./install/demo_nodes_py/share/demo_nodes_py
./install/demo_nodes_py/share/colcon-core/packages/demo_nodes_py
./install/demo_nodes_py/share/ament_index/resource_index/packages/demo_nodes_py
./src/demos/demo_nodes_py
./src/demos/demo_nodes_py/demo_nodes_py
./src/demos/demo_nodes_py/resource/demo_nodes_py

It appears to contain the executable scripts:

root@86a155f8e9b2:/ws# ll ./install/demo_nodes_py/lib/demo_nodes_py
total 40
drwxr-xr-x 2 root root 4096 Feb 20 16:04 ./
drwxr-xr-x 4 root root 4096 Feb 20 16:04 ../
-rwxr-xr-x 1 root root  424 Feb 20 16:04 add_two_ints_client*
-rwxr-xr-x 1 root root  436 Feb 20 16:04 add_two_ints_client_async*
-rwxr-xr-x 1 root root  424 Feb 20 16:04 add_two_ints_server*
-rwxr-xr-x 1 root root  402 Feb 20 16:04 listener*
-rwxr-xr-x 1 root root  410 Feb 20 16:04 listener_qos*
-rwxr-xr-x 1 root root  424 Feb 20 16:04 listener_serialized*
-rwxr-xr-x 1 root root  398 Feb 20 16:04 talker*
-rwxr-xr-x 1 root root  406 Feb 20 16:04 talker_qos*

And the other directory appears to match what's in /opt/ in the nightly docker image:

root@86a155f8e9b2:/ws# ll ./install/demo_nodes_py/lib/python3.6/site-packages/demo_nodes_py
total 20
drwxr-xr-x 5 root root 4096 Feb 20 16:04 ./
drwxr-xr-x 4 root root 4096 Feb 20 16:04 ../
-rw-r--r-- 1 root root    0 Feb 20 16:01 __init__.py
drwxr-xr-x 2 root root 4096 Feb 20 16:04 __pycache__/
drwxr-xr-x 3 root root 4096 Feb 20 16:04 services/
drwxr-xr-x 3 root root 4096 Feb 20 16:04 topics/

Edit: If I search for the Python add_two_ints_client program, I find it in /opt/ros/foxy/bin/add_two_ints_client, but it looks like the launch_ros test is looking for it in /opt/ros/foxy/lib/demo_nodes_py.
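The comparison above can be sketched as a small check that reports which of the two locations actually holds an executable. This is only an illustration: `locate_executable` is a hypothetical helper, not part of launch_ros or ament, and it assumes the two candidate layouts seen in the listings (`<prefix>/lib/<package>/<name>` and `<prefix>/bin/<name>`).

```python
from pathlib import Path


def locate_executable(prefix, package, name):
    """Report where an executable landed under an install prefix.

    Hypothetical helper for illustration only. Returns 'libexec' for
    <prefix>/lib/<package>/<name>, 'bin' for <prefix>/bin/<name>,
    or None if the file is in neither place.
    """
    prefix = Path(prefix)
    candidates = {
        'libexec': prefix / 'lib' / package / name,
        'bin': prefix / 'bin' / name,
    }
    # The libexec location is checked first because that is where the
    # failing tests expect to find the scripts.
    for label, path in candidates.items():
        if path.is_file():
            return label
    return None


# Paths taken from the listings above; the result depends on the machine.
print(locate_executable('/opt/ros/foxy', 'demo_nodes_py', 'add_two_ints_client'))
```

In the nightly image described above, a check like this would be expected to report the `bin` location, while a from-source workspace would report `libexec`.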

nuclearsandwich added a commit to ros2/ci that referenced this issue Feb 24, 2020
Testing to see how this affects ros2/build_farmer#266
@nuclearsandwich
Member

The first occurrence of this coincides with the first nightlies to run with ros2/ci#385. This made me a bit nervous, so I pushed a branch which uses virtualenv and got the same results.

The executables for demo_nodes_py are being installed into bin rather than the "libexec" path during the CI run, but I haven't figured out why. Even when I use the osrf/ros2:nightly docker image with an overlay workspace, building demo_nodes_py installs the executables in the libexec path as configured.

@nuclearsandwich
Member

I've been able to reproduce both the correct and incorrect behavior in different workspaces on a copy of a CI docker image, but I have yet to determine what the difference between them is. I'll keep looking at it in the morning.

@dirk-thomas
Member

dirk-thomas commented Mar 3, 2020

Can you check if build/demo_nodes_py/setup.cfg exists and is a symlink to the same file in the source directory?

@nuclearsandwich
Member

Can you check if build/demo_nodes_py/setup.cfg exists and is a symlink to the same file in the source directory?

There's no symlinking or copying of setup.py or setup.cfg into the build directory in either the passing (installation to libexec) or failing (installing to bin) cases on my system.

In all the cases I've seen, behavior within the virtualenv is incorrect while behavior outside it is correct. This has been true even when using pip to install the exact same package versions, as reported by pip freeze --all, in both the virtualenv and at the system level.

I've started using pdb on the setup function and I'm seeing something interesting I'm about to pry into further. When installing via the system, after parsing the setup.cfg I get this result:

dist.command_options
{'develop': {'script_dir': ('setup.cfg', '$base/lib/demo_nodes_py')}, 'install': {'install_scripts': ('setup.cfg', '$base/lib/demo_nodes_py')}}

When within a virtualenv I get instead:

dist.command_options
{'develop': {'script_dir': ('setup.cfg', '$base/lib/demo_nodes_py')}, 'install': {}}

So the setup.cfg is being parsed, but for whatever reason the install command's options aren't being picked up when parsing inside the virtualenv.

@nuclearsandwich
Member

nuclearsandwich commented Mar 4, 2020

This is the code that's biting us. But this function was adapted/vendored from a similar one in distutils that was being used directly in earlier versions of setuptools, so I'm not yet sure what has changed such that we're only triggering this codepath now.

        # Ignore install directory options if we have a venv
        if not six.PY2 and sys.prefix != sys.base_prefix:
            ignore_options = [
                'install-base', 'install-platbase', 'install-lib',
                'install-platlib', 'install-purelib', 'install-headers',
                'install-scripts', 'install-data', 'prefix', 'exec-prefix',
                'home', 'user', 'root']

In classic fashion, I've now advanced from "why isn't this working?" to "how did this ever work?" so, progress!
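To make the effect of that filter concrete, here is a minimal sketch that mimics it with the standard library's configparser. It is not the setuptools implementation: `parse_command_options` is a hypothetical stand-in, the `in_venv` flag replaces the real `sys.prefix != sys.base_prefix` test, and the hyphenated option names in the sample setup.cfg are an assumption chosen to match the ignore list quoted above.

```python
import configparser

# Assumed setup.cfg contents; the hyphenated spellings are an assumption.
SETUP_CFG = """
[develop]
script-dir=$base/lib/demo_nodes_py
[install]
install-scripts=$base/lib/demo_nodes_py
"""

# Option names copied from the snippet quoted above.
IGNORED_IN_VENV = frozenset([
    'install-base', 'install-platbase', 'install-lib',
    'install-platlib', 'install-purelib', 'install-headers',
    'install-scripts', 'install-data', 'prefix', 'exec-prefix',
    'home', 'user', 'root'])


def parse_command_options(cfg_text, in_venv, filename='setup.cfg'):
    """Hypothetical stand-in for the config parsing step.

    Builds a dict shaped like dist.command_options, dropping the ignored
    install directory options when in_venv is True.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    ignore = IGNORED_IN_VENV if in_venv else frozenset()
    command_options = {}
    for section in parser.sections():
        opts = command_options.setdefault(section, {})
        for opt in parser.options(section):
            if opt in ignore:
                continue  # silently skipped, as in the failing case
            # Hyphens are normalized to underscores after the filter runs.
            opts[opt.replace('-', '_')] = (filename, parser.get(section, opt))
    return command_options


print(parse_command_options(SETUP_CFG, in_venv=False))
print(parse_command_options(SETUP_CFG, in_venv=True))
```

Under these assumptions, the second call leaves the `install` section empty while `develop` keeps `script_dir`, which is the same shape as the two `dist.command_options` dumps reported earlier in the thread.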

@nuclearsandwich
Member

sys.base_prefix has changed between virtualenv 16.7.9 and virtualenv 20.0 / venv.
In a virtualenv created with 16.7.9, sys.base_prefix == sys.prefix == the virtualenv base. When using venv or virtualenv 20, sys.base_prefix == /usr, which is triggering the code above to ignore installation options.
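A quick way to see which case a given interpreter falls into is to print both prefixes and evaluate the same comparison the quoted condition uses. This is a minimal sketch; `venv_detected_by_base_prefix` is a hypothetical name, and it covers only the `sys.prefix` versus `sys.base_prefix` test discussed here.

```python
import sys


def venv_detected_by_base_prefix():
    """Return True when sys.prefix and sys.base_prefix differ.

    Hypothetical helper mirroring the comparison in the quoted condition.
    """
    return sys.prefix != sys.base_prefix


print('sys.prefix      =', sys.prefix)
print('sys.base_prefix =', sys.base_prefix)
print('install directory options ignored:', venv_detected_by_base_prefix())
```

Running this with the interpreter from each environment should show whether the install directory options from setup.cfg would be dropped there.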

@nuclearsandwich
Member

While the solution we've arrived at is not my favorite, it does get these tests passing and our packages back to functional, so it has gone in. I've opened ros2/ci#400 to continue the discussion of how to get off of virtualenv 16.x and am closing this issue.

Thanks to everyone who contributed to the investigation, and especially @pbaughman, who noticed the binaries in the incorrect path. That was a huge lead!
