-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
update RMW selection with a checkbox for each impl #212
Conversation
@@ -111,7 +110,7 @@ def main(argv=None): | |||
'linux-aarch64': { | |||
'label_expression': 'linux_aarch64', | |||
'shell_type': 'Shell', | |||
'use_connext_default': 'false', | |||
'ignore_rmw_default': data['ignore_rmw_default'] | {'rmw_connext_cpp', 'rmw_connext_dynamic_cpp'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any RMW impl using Connext is being ignored on aarch64 atm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we do this in a few places a set called ignored_on_arm
or something would be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could certainly be done. I just wanted to keep the changes minimal since the same was true before this patch. So I would rather keep that for a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow your reasoning. We've added a hard-coded list in multiple places in this PR whereas a flag was being set before. Given that we're modifying the rmw selection logic in this PR I don't see why we would save that change for a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo more variables implies more complexity. In order to understand a block of config values you will need to browse further up in the code to find what these variables are initialized with. Without the variables it is self contained (as before).
Please feel free to add the logic to this PR if you think it is important.
'ignore_rmw_default': { | ||
'rmw_connext_dynamic_cpp', | ||
'rmw_fastrtps_dynamic_cpp', | ||
'rmw_opensplice_cpp'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dynamic implementations of Connext and FastRTPS as well as OpenSplice are ignored by default
ros2_batch_job/__main__.py
Outdated
blacklisted_package_names += [ | ||
'connext_cmake_module', | ||
'rmw_connext_shared_cpp', | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If both of the two RMW packages using Connext are ignored the supporting packages can be ignored too.
Similar for FastRTPS and OpenSplice below.
if args.disable_connext_static: | ||
os.environ["CONNEXT_STATIC_DISABLE"] = '1' | ||
if args.disable_connext_dynamic: | ||
os.environ["CONNEXT_DYNAMIC_DISABLE"] = '1' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This custom logic is not necessary anymore with each RMW impl. being ignored by its name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's one edge case were if all the middlewares are enabled the command line will be malformed, but otherwise it lgtm.
@@ -234,23 +231,24 @@ if "!CI_BRANCH_TO_TEST!" NEQ "" ( | |||
if "!CI_COLCON_BRANCH!" NEQ "" ( | |||
set "CI_ARGS=!CI_ARGS! --colcon-branch !CI_COLCON_BRANCH!" | |||
) | |||
if "!CI_USE_CONNEXT!" == "true" ( | |||
set "CI_ARGS=!CI_ARGS! --connext" | |||
set "CI_ARGS=!CI_ARGS! --ignore-rmw" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you run this with all middlewares this option will have no argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is intended and not a problem since the option accepts zero to N parameters.
@@ -242,23 +239,24 @@ if "!CI_COLCON_BRANCH!" NEQ "" ( | |||
if "!CI_USE_WHITESPACE_IN_PATHS!" == "true" ( | |||
set "CI_ARGS=!CI_ARGS! --white-space-in sourcespace buildspace installspace workspace" | |||
) | |||
if "!CI_USE_CONNEXT!" == "true" ( | |||
set "CI_ARGS=!CI_ARGS! --connext" | |||
set "CI_ARGS=!CI_ARGS! --ignore-rmw" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here if all middlewares are true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is intended and not a problem since the option accepts zero to N parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partial review; just reviewed the python logic so far. I think the set usage is tidier overall, though we can probably squeeze more benefit out of it with sets representing all RMW impls and those ignored on ARM, for example.
'use_connext_default': 'false', | ||
'use_fastrtps_default': 'true', | ||
'use_opensplice_default': 'true', | ||
'ignore_rmw_default': { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the purpose of this job has expanded now, its job name should be updated
Btw, if we're going to go with the set logic, it'd be more readable to have a set of all RMW impls and then have the ignored ones as = (All - Wanted).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would imply that all RMW implementations are known. Without requiring that information it is possible to feed a .repos
with additional RMW implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean within the scope of the create_jenkins_job.py
script, for the purpose of populating the ignore_rmw_default
value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the purpose of this job has expanded now, its job name should be updated
Can you clarify what you mean with this. The job is still building OpenSplice and tests against one other RMW impl - also still FastRTPS. So I am not sure what has changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean within the scope of the
create_jenkins_job.py
script, for the purpose of populating theignore_rmw_default
value.
Maybe I can't see how you imagine it to look like. Can you either describe it more concretely or propose a patch to clarify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify what you mean with this. The job is still building OpenSplice and tests against one other RMW impl - also still FastRTPS. So I am not sure what has changed.
Yep: currently the job has ospl in the name because that's its purpose: it's the only nightly with opensplice. Now it has an additional purpose: it's the only nightly with fastrtps dynamic, so the name should reflect that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see why we might be on different pages. The code as it stands is ignoring fastrtps_dynamic. My understanding (and the summary that is in the PR description) is that fastrtps dynamic should be enabled in a nightly with opensplice for each platform, including tests. In my head that is this job that we are discussing (perhaps that's where we disagree). Otherwise, fastrtps dynamic is only enabled in packaging jobs and the xenial job, right?
This gets at my motivation for wanting to better capture our intent (otherwise the reader has to do the arithmetic themselves). It would look something like this:
known_rmws = {'rmw_fastrtps_cpp', 'rmw_fastrtps_dynamic_cpp', 'rmw_connext_cpp', 'rmw_connext_dynamic_cpp', 'rmw_opensplice_cpp'}
enabled_rmws = {'rmw_opensplice_cpp', 'rmw_fastrtps_dynamic_cpp'}
ignore_rmw_default = known_rmws - enabled_rmws
'use_fastrtps_default': 'true', | ||
'use_opensplice_default': 'true', | ||
'use_connext_default': 'false' if os_name is 'linux-aarch64' else 'true', | ||
'ignore_rmw_default': {'rmw_connext_cpp', 'rmw_connext_dynamic_cpp'} if os_name is 'linux-aarch64' else set(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
think this line can be simplified to an empty set since we're in a if windows
branch already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just kept the logic as it was before - I thought the intention was to keep the value the same as in the other cases.
@@ -111,7 +110,7 @@ def main(argv=None): | |||
'linux-aarch64': { | |||
'label_expression': 'linux_aarch64', | |||
'shell_type': 'Shell', | |||
'use_connext_default': 'false', | |||
'ignore_rmw_default': data['ignore_rmw_default'] | {'rmw_connext_cpp', 'rmw_connext_dynamic_cpp'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we do this in a few places a set called ignored_on_arm
or something would be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
found a blocking bug
'use_fastrtps_default': 'true', | ||
'use_opensplice_default': 'true', | ||
'use_connext_default': 'false' if os_name is 'linux-aarch64' else 'true', | ||
'ignore_rmw_default': {'rmw_connext_cpp', 'rmw_connext_dynamic_cpp'} if os_name is 'linux-aarch64' else set(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the empty set here and elsewhere is going to cause connext_dynamic to be enabled on our jobs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dirk-thomas (following on from your point in #212 (comment) that the builds will pass because of an AMENT_IGNORE): the checkbox will still be selected on all of those jobs, so the default for the checkbox should be set to false to not be misleading (or the description of the checkbox and the build summaries should be updated to mention that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these two parts should be coupled. The decision to place the marker file in the source repo is one thing. Updating the logic in this repo to "duplicate" that is not necessary.
The inverse is already the case if we drop another marker file in any of the RMW repos that won't affect this repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the difference is that we would not put a marker in an rmw implementation that we want enabled by default. we want connext dynamic disabled by default on our CI (whether there's an AMENT_IGNORE marker there for users building from source or not), and this should be indicated in the defaults for the jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we want connext dynamic disabled by default on our CI
From my point of view we want Connext dynamic be disabled when users build the code from source since it currently doesn't work.
The Jenkins should build it just fine when the marker file is removed (e.g. by a branch in that repo). I don't think it is necessary to indicate the presence of the marker file in the default value and description of the Jenkins jobs.
Anyway if you think this is important please feel free to add the changes directly to this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't appreciate that changes that are requested become the responsibility of the reviewer to implement. If you can get consensus from others that having the checkbox enabled by default for an unsupported implementation is reasonable, then I'd be happy to reconsider my position, but otherwise, as I see it, the code can't be approved in its current state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this came up in the waffle meeting as an "RFC"...
I agree with @dhood, it's confusing to have "connext dynamic not ignored" in CI but it still not build because of the file in the source that disables it. I understand why both exist, but until we move to reenable connext dynamic I'd say it is less confusing to have it disable by default in CI too.
The other option would be to comment out or just remove the check box for connext dynamic until we reenable it, because if I understand correctly, if we default to off it will still not work if the user manually flips the switch to "not ignored", which is pretty misleading too. If the state of the check mark doesn't matter, then we should probably just remove it for now.
I don't appreciate that changes that are requested become the responsibility of the reviewer to implement.
I tend to agree, depending on the scale of the change requested. From my perspective, this one is small but there is a disagreement about whether or not to do it, and in that case asking the reviewer to do because you don't agree it should be done (not because it's technically difficult) is not the right response (imo). But we seem to disagree about that implicitly quite often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the state of the check mark doesn't matter, then we should probably just remove it for now.
Sounds sounds fine to me to avoid confusion about it not working either way. I commented out any Connext dynamic related stuff: 2bcdca6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine with me, thanks.
That is not the case. |
Without this change being deployed it isn't possible to unselect FastRTPS in the Jenkins jobs because the new package |
Summarising required changes IMO:
|
As mentioned in the above thread the job has
I have commented out the UI and logic specific parts in 2bcdca6 as suggested by William so that FastRTPS dynamic isn't shown in the Jenkins UI anymore. |
To clarify, the main issue is that AFAIK if we merge this as is I don't see where we get test coverage of fastrtps dynamic on windows, for example. (It might be in a separate nightly but I don't see one) |
Correct, and we can enable that as soon as we decide what we want to test in which frequency. The sooner we come to a decision on that the faster we can restructure our jobs. |
Alright, it wasn't clear to me that this was intended as followup work. The description of the PR made me expect that fastrtps dynamic should be enabled in more nightlies than just the packaging and xenial jobs, so since the code didn't reflect that I came to the conclusion it's a bug that should be fixed. If instead we agree that fastrtps dynamic isn't tested on nightlies now, then it's less of a bug and more of a TODO, correct? |
Based on an offline discussion we will rename the After this is merged but before it is being deployed I will rename the existing job to maintain the history. We won't setup redirects since we don't expect a significant number of links to this rather new job. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, I added bd1cc20 which I think is the last of the references to connext dynamic in the jenkins UI. will be a bit difficult to test compilation of connext dynamic on CI now but we can revisit that when needed
also, thanks @dirk-thomas for taking on the work for this and ros2/rmw_fastrtps#218 🙇♀️ |
👍 I was looking at it but was worried "breaking" the xml - didn't realize its an Thank you for the review! |
I renamed the four ci_linux:
test_ci_linux:
ci_packaging_linux:
packaging_linux:
nightly_linux_debug:
ci_linux_coverage:
test_linux_coverage:
nightly_linux_coverage:
nightly_linux_extra_rmw_release:
nightly_linux_release:
nightly_linux_repeated:
ci_turtlebot-demo_linux:
nightly_turtlebot-demo_linux_release:
ci_linux-aarch64:
test_ci_linux-aarch64:
ci_packaging_linux-aarch64:
packaging_linux-aarch64:
nightly_linux-aarch64_debug:
nightly_linux-aarch64_extra_rmw_release:
nightly_linux-aarch64_release:
nightly_linux-aarch64_repeated:
ci_turtlebot-demo_linux-aarch64:
nightly_turtlebot-demo_linux-aarch64_release:
ci_osx:
test_ci_osx:
ci_packaging_osx:
packaging_osx:
nightly_osx_debug:
nightly_osx_extra_rmw_release:
nightly_osx_release:
nightly_osx_repeated:
ci_windows:
test_ci_windows:
ci_packaging_windows:
packaging_windows:
packaging_windows_debug:
nightly_win_deb:
nightly_win_extra_rmw_release:
nightly_win_rel:
nightly_win_rep:
nightly_xenial_linux_release:
packaging_xenial_linux:
nightly_xenial_linux-aarch64_release:
packaging_xenial_linux-aarch64:
ci_launcher:
|
The logic to check if Connext should be installed didn't work with the new command line arguments. So neither of the nightly builds used Connext and instead resulted in CMake warning for not finding Connext. I fixed the logic in 2b0c408. |
And another fix to disable Connext on aarch64 when triggering builds with the launcher job: c127e97 |
And another fix to the comparison operator used for checking for |
Follow up of ros2/rmw_fastrtps#218.
Restructure the selection of RMW implementations by passing a list of to-be-ignored packages. The main change is that the new RMW impl
rmw_fastrtps_dynamic_cpp
is ignored by default in CI builds and enabled by default in nightly builds (same as OpenSplice).