
Add support for rmw_connextdds, and disable rmw_connext_cpp (#547) #549

Merged
merged 2 commits on Mar 23, 2021

Conversation

asorbini (Contributor)

This PR includes changes from branch asorbini/rmw_connextdds which switch the default RMW implementation for RTI Connext DDS from rmw_connext_cpp to rmw_connextdds.

The changes include:

  • Add a new variable to ci_launcher to enable rmw_connextdds in a plan (CI_USE_CONNEXTDDS, disabled by default).
  • Add rmw_connextdds to the list of "ignored RMWs" if CI_USE_CONNEXTDDS is false (sketched after this list).
  • Automatically disable rmw_connextdds for armhf and aarch64 builds.
  • Remove the variables CI_USE_CONNEXT_STATIC and CI_USE_CONNEXT_DYNAMIC.
  • Always add rmw_connext_cpp and rmw_connext_dynamic_cpp to the list of "ignored RMWs".
  • Remove some references to (deprecated) rmw_opensplice_cpp.
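
A minimal sketch of the ignore logic described above, assuming it lives in the launcher's shell step. CI_USE_CONNEXTDDS and the package names come from this PR's description; the IGNORE_RMW variable and the architecture check via uname -m are illustrative assumptions, not the actual template code:

```sh
# Sketch only: illustrates the ignore rules described in this PR.
# rmw_connext_cpp and rmw_connext_dynamic_cpp are now always ignored.
IGNORE_RMW="rmw_connext_cpp rmw_connext_dynamic_cpp"

# rmw_connextdds is ignored unless the CI_USE_CONNEXTDDS checkbox is set.
if [ "${CI_USE_CONNEXTDDS:-false}" != "true" ]; then
  IGNORE_RMW="${IGNORE_RMW} rmw_connextdds"
fi

# Also skip rmw_connextdds on armhf/aarch64 builds (detecting the
# architecture via uname -m is an assumption for illustration).
case "$(uname -m)" in
  armv7l|aarch64)
    IGNORE_RMW="${IGNORE_RMW} rmw_connextdds"
    ;;
esac

echo "Ignored RMW implementations: ${IGNORE_RMW}"
```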

Merging of this PR should be coordinated with two other PRs, which will enable rmw_connextdds in the rmw_implementation package and in the default ros2.repos.

See rticommunity/rmw_connextdds#9 for a general discussion about the inclusion of rmw_connextdds in ROS2 Rolling (and phasing out of rmw_connext_cpp).

@clalancette (Contributor) left a comment

So the major problem with this as-is is that the same jobs are used for Rolling, Dashing, and Foxy. In the case of Dashing and Foxy, we need to keep the default Connext rmw as rmw_connext_cpp, while for Rolling we want to change it to rmw_connextdds.

The additional complication is that we use the same job for all of them, so there can be only one default (we distinguish them at runtime by a drop-down which has the name of the release in it). That piece of information leads to the probable solution: we have only a single checkbox for "Use Connext", and at runtime we select the correct RMW based on the name of the release (rmw_connext_cpp for 'dashing' or 'foxy', rmw_connextdds for anything else).
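
A hedged sketch of that runtime selection. The CI_USE_CONNEXT checkbox variable and a ROS_DISTRO variable carrying the release picked from the drop-down are assumptions for illustration, not the job's actual interface:

```sh
# Sketch only: pick the Connext RMW at runtime based on the release name.
if [ "${CI_USE_CONNEXT:-false}" = "true" ]; then
  case "${ROS_DISTRO}" in
    dashing|foxy)
      CONNEXT_RMW="rmw_connext_cpp"   # older releases keep the classic RMW
      ;;
    *)
      CONNEXT_RMW="rmw_connextdds"    # Rolling and anything newer
      ;;
  esac
  echo "Using Connext RMW: ${CONNEXT_RMW}"
fi
```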

@asorbini Let me know if this makes sense, or if you need more information to change this PR accordingly.

@ivanpauno (Member)

Another detail: we sometimes run a job selecting only one RMW implementation, but the problem is that rosidl_typesupport_fastrtps is ignored if the CI_USE_FASTRTPS_STATIC box isn't checked (see here), and that will cause the job to fail if only the rmw_connextdds box is checked.
The rosidl_typesupport_fastrtps_* packages should only be ignored when neither the Fast-RTPS nor the Connext boxes are selected.

rosidl_typesupport_fastrtps_* also requires fastrtps, though it could actually depend only on fastcdr (there seems to be an unneeded find_package making it required, but fastrtps is never used in the code).
So until that is fixed, these two ignores will also be a problem.

PS: This only matters if we want the jobs to keep passing with only the Connext box checked; if we're fine with remembering that the Fast-RTPS box also has to be checked, it doesn't matter.
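
A sketch of the condition described above. The CI_USE_* variables and package names follow the comments in this thread; where this check would live and the IGNORE_PACKAGES variable are assumptions for illustration:

```sh
# Sketch only: ignore the rosidl_typesupport_fastrtps_* packages only when
# neither the Fast-RTPS box nor the Connext box is selected.
if [ "${CI_USE_FASTRTPS_STATIC:-false}" != "true" ] && \
   [ "${CI_USE_CONNEXTDDS:-false}" != "true" ]; then
  IGNORE_PACKAGES="${IGNORE_PACKAGES} rosidl_typesupport_fastrtps_c rosidl_typesupport_fastrtps_cpp"
fi
```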

@asorbini (Contributor Author)

asorbini commented Mar 17, 2021

Started some CI jobs to validate the latest changes:

EDIT: Unfortunately I'm unable to test the changes to the ci_launcher interface. I tried switching the CI branch, but I don't think that affects the actual Jenkins jobs, so I'm unable to set the variable CI_USE_CONNEXTDDS and set up CI_ARGS correctly for the Python script. I should have realized this earlier, but running tests on the whole tree without selecting CI_USE_CONNEXT_STATIC made me think I was using my branch. At this point I'm not sure how to validate all the changes, short of configuring another Jenkins instance with them.

EDIT2: The Linux plan passed. Windows failed for unrelated reasons, and I will run it again tomorrow since I don't think I have enough time to complete a full build before the nightly jobs kick in. macOS is still running.

@clalancette (Contributor)

@asorbini Where are we with this? We're starting to get awfully close to the Galactic freeze deadline.

@asorbini (Contributor Author)

Hi @clalancette, as I said in my last comment, I don't have a way to validate these changes on ci.ros2.org, and I was asking for your input. Do you have any suggestions on how I should proceed?

I can only think of trying to set up a local Jenkins instance, but I'm not sure that's going to be sufficient testing, since I'd only be able to test on Linux.

@asorbini (Contributor Author)

I'm trying to set up a local Jenkins instance on my machine, but it's a whole new process and I'm having some issues, so it might take me some time. I also tried to determine which version of Jenkins you are running on ci.ros2.org, and which plugins you have installed, but my user doesn't seem to have sufficient credentials for that (or maybe I just couldn't find the info).

I did find a problem in my changes to create_jenkins_job.py which I committed to asorbini:asorbini/rmw_connextdds. Before opening another PR to merge that change here, I'll see if I can validate things further and squash any remaining issues.

Looking forward to any input you may have on how to speed this whole process up and move forward with integration of rmw_connextdds.

@asorbini (Contributor Author)

asorbini commented Mar 18, 2021

I've gotten as far as having the Jenkins plans created on my local instance (this allowed me to fix a couple more errors in my changes), and I'm almost able to run them as well.

Unfortunately my plans fail to execute because they can't clone the osrf/qtaccount repository as a submodule; my GitHub user doesn't have access to it.

If you can grant me access to this repository, I might be able to continue and actually complete the work to reproduce the CI infrastructure locally.

In the meantime I will create another PR to merge the most recent changes from my branch asorbini/rmw_connextdds.

EDIT: Created #551

@clalancette (Contributor) left a comment

I've got one clarification for now. I'll do another review pass over this later, but I want to start some CI jobs first.

@clalancette (Contributor)

clalancette commented Mar 19, 2021

I'm going to run several different CI setups (besides the usual) to see what happens.

Rolling Linux CI using this PR, but not adding rmw_connextdds to the ros2.repos file. This may be an unexpected situation for this PR, so it is OK if it fails:

  • Linux Build Status

Rolling Linux CI with this PR and all 6 of the outstanding PRs in ros2/rmw_connextdds#9 (comment):

  • Linux Build Status

Dashing Linux CI using this PR:

  • Linux Build Status

Foxy Linux CI using this PR:

  • Linux Build Status

Rolling Linux CI using this PR, not adding rmw_connextdds to the ros2.repos file, and unselecting connext from the job:

  • Linux Build Status

Foxy Linux CI using this PR, and unselecting connext from the job:

  • Linux Build Status

clalancette added a commit that referenced this pull request Mar 19, 2021
@clalancette (Contributor) left a comment

This looks OK to me. I'd appreciate a review from either @cottsay or @nuclearsandwich just to double-check.

@nuclearsandwich (Member) left a comment

I have one comment on an outstanding conversation, but otherwise I think this looks as reasonable as something written in templated shell and batch can look. @asorbini I really appreciate you digging in and iterating on this one.

I'm going to do some pre-deployment checks, and one of the suggestions for RHEL should be selected before the big ✔️, but we should be able to wrap this up today.

@clalancette (Contributor)

clalancette commented Mar 23, 2021

All right, one final set of CI before I merge and deploy this:

Rolling/master:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Foxy:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • macOS Build Status
  • Windows Build Status

Signed-off-by: Andrea Sorbini <asorbini@rti.com>
Signed-off-by: Chris Lalancette <clalancette@openrobotics.org>
@clalancette (Contributor)

Rolling aarch64 had one warning, but I'm pretty sure that will go away once I deploy this change.

Foxy Linux (amd64 and aarch64) had some test failures in sros2, but I doubt it is because of this PR.

Foxy macOS had a few test failures, but again, I doubt it is from this PR.

Foxy Windows seems stuck (at least, it has been in the same place for the last hour). I'm going to go ahead without that and merge this anyway.
