Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow programmatic configuration of unicast relays. #498

Merged
merged 9 commits into from
May 23, 2024

Conversation

mbeards
Copy link

@mbeards mbeards commented May 6, 2024

This change allows users to configure relays from code without having to setenv(GZ_RELAY).

馃帀 New feature

Summary

This change allows users to configure relays from code without having to setenv(GZ_RELAY).

Supersedes #497

Test it

This can be tested by starting a Node across a boundary where multicast isn't enabled, and setting the relay IP via NodeOptions.

Checklist

  • Signed all commits for DCO
  • Added tests
  • Added example and/or tutorial
  • Updated documentation (as needed)
  • Updated migration guide (as needed)
  • Consider updating Python bindings (if the library has them)
  • codecheck passed (See contributing)
  • All tests passed (See test coverage)
  • While waiting for a review on your PR, please help review another open pull request to support the maintainers

Note to maintainers: Remember to use Squash-Merge and edit the commit message to match the pull request summary while retaining Signed-off-by messages.

@mbeards mbeards requested a review from caguero as a code owner May 6, 2024 19:28
@github-actions github-actions bot added the 馃幍 harmonic Gazebo Harmonic label May 6, 2024
@mbeards
Copy link
Author

mbeards commented May 8, 2024

Friendly ping @caguero

Copy link
Contributor

@caguero caguero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By making Discovery::AddRelayAddress() public, I think we now have risk for potential race conditions, as this->relayAddrs can be modified from a user thread (via Node) and the discovery thread can read the same variable via SendUnicast().

@mbeards
Copy link
Author

mbeards commented May 8, 2024

By making Discovery::AddRelayAddress() public, I think we now have risk for potential race conditions, as this->relayAddrs can be modified from a user thread (via Node) and the discovery thread can read the same variable via SendUnicast().

Should I add a new mutex to guard relayAddrs or just use the existing Discovery::mutex?

@caguero
Copy link
Contributor

caguero commented May 8, 2024

By making Discovery::AddRelayAddress() public, I think we now have risk for potential race conditions, as this->relayAddrs can be modified from a user thread (via Node) and the discovery thread can read the same variable via SendUnicast().

Should I add a new mutex to guard relayAddrs or just use the existing Discovery::mutex?

I think you can reuse Discovery::mutex.

@mbeards
Copy link
Author

mbeards commented May 8, 2024

By making Discovery::AddRelayAddress() public, I think we now have risk for potential race conditions, as this->relayAddrs can be modified from a user thread (via Node) and the discovery thread can read the same variable via SendUnicast().

Should I add a new mutex to guard relayAddrs or just use the existing Discovery::mutex?

I think you can reuse Discovery::mutex.

Is there a performance concern to worry about re: serializing the user thread + discovery thread in the presence of unicast relays? Every unicast write would need to acquire that mutex.

@mbeards
Copy link
Author

mbeards commented May 13, 2024

Thoughts on this test failure https://build.osrfoundation.org/job/gz_transport-pr-win/99/ ? I'm having a hard time figuring out how this PR would trigger behavior change on those tests.

@caguero
Copy link
Contributor

caguero commented May 15, 2024

By making Discovery::AddRelayAddress() public, I think we now have risk for potential race conditions, as this->relayAddrs can be modified from a user thread (via Node) and the discovery thread can read the same variable via SendUnicast().

Should I add a new mutex to guard relayAddrs or just use the existing Discovery::mutex?

I think you can reuse Discovery::mutex.

Is there a performance concern to worry about re: serializing the user thread + discovery thread in the presence of unicast relays? Every unicast write would need to acquire that mutex.

By making Discovery::AddRelayAddress() public, I think we now have risk for potential race conditions, as this->relayAddrs can be modified from a user thread (via Node) and the discovery thread can read the same variable via SendUnicast().

Should I add a new mutex to guard relayAddrs or just use the existing Discovery::mutex?

I think you can reuse Discovery::mutex.

Is there a performance concern to worry about re: serializing the user thread + discovery thread in the presence of unicast relays? Every unicast write would need to acquire that mutex.

I'd expect a bit of impact if unicast is used because the mutex will be acquired when calling DispacthDiscoveryMsg(). It shouldn't make a big difference.

@caguero
Copy link
Contributor

caguero commented May 15, 2024

Thoughts on this test failure https://build.osrfoundation.org/job/gz_transport-pr-win/99/ ? I'm having a hard time figuring out how this PR would trigger behavior change on those tests.

ReplayTest looks unrelated.

@mbeards
Copy link
Author

mbeards commented May 20, 2024

Friendly ping @caguero .

Copy link
Contributor

@caguero caguero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

include/gz/transport/NodeShared.hh Outdated Show resolved Hide resolved
src/Node.cc Show resolved Hide resolved
@mbeards
Copy link
Author

mbeards commented May 22, 2024

Minor comments.

Done.

caguero
caguero previously approved these changes May 22, 2024
@caguero caguero dismissed their stale review May 22, 2024 17:00

ABI check

Copy link
Contributor

@caguero caguero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the ABI checker doesn't catch it but I think that changing the access right (from private to public in our case) in Discovery::AddRelayAddress() breaks ABI.

I think the solution is to create a new public version of the function that calls the existing one.

src/Node_TEST.cc Outdated Show resolved Hide resolved
@mbeards
Copy link
Author

mbeards commented May 22, 2024

Not sure why the ABI checker doesn't catch it but I think that changing the access right (from private to public in our case) in Discovery::AddRelayAddress() breaks ABI.

I think the solution is to create a new public version of the function that calls the existing one.

Let's just target main?

@mjcarroll
Copy link
Contributor

mjcarroll commented May 22, 2024

Not sure why the ABI checker doesn't catch it but I think that changing the access right (from private to public in our case) in Discovery::AddRelayAddress() breaks ABI.

I think it's actually fine on Linux, it's only an issue with MSVC where the symbol mangling includes the access rights

(Edit: but targeting main is easy, too)

@mbeards mbeards force-pushed the programmatic_relay_addr_2 branch from 0c2f5ef to 05d2529 Compare May 22, 2024 17:16
@mbeards mbeards changed the base branch from gz-transport13 to main May 22, 2024 17:39
This change allows users to configure relays from code without having to
`setenv(GZ_RELAY)`.

Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
Signed-off-by: Michael Beardsworth <beardsworth@intrinsic.ai>
@mbeards mbeards force-pushed the programmatic_relay_addr_2 branch from 05d2529 to 811e865 Compare May 22, 2024 17:46
@caguero
Copy link
Contributor

caguero commented May 23, 2024

@osrf-jenkins run tests

@caguero
Copy link
Contributor

caguero commented May 23, 2024

Not sure why this gz_transport-ci-pr_any-jammy-amd64 job is stuck.

@azeey
Copy link
Contributor

azeey commented May 23, 2024

Our CI was recently changed so that main branches only running tests on Noble. But the branch protection for gz-transport wasn't updated. Should be good now.

@azeey
Copy link
Contributor

azeey commented May 23, 2024

We do need to update the GitHub workflow to use Noble . @caguero would you mind doing that when you get a chance?

@azeey azeey merged commit 508d28b into gazebosim:main May 23, 2024
7 checks passed
@caguero
Copy link
Contributor

caguero commented May 27, 2024

We do need to update the GitHub workflow to use Noble . @caguero would you mind doing that when you get a chance?

Thanks @scpeters #504

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
馃幍 harmonic Gazebo Harmonic
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

4 participants