Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

epgm: drop epgm event distribution and munge dependency #1746

Merged
merged 13 commits into from Oct 24, 2018

Conversation

Projects
None yet
3 participants
@garlick
Copy link
Member

garlick commented Oct 20, 2018

As discussed in #1743, the munge dependency in flux-core causes some user headaches and is really only used to encrypt event messages sent over epgm, which is rarely used. It's also a poor choice for encrypting messages since the RTT to the munge daemon adds latency for each message. This PR removes the munge dependency and sends epgm events in the clear, with a warning at wire-up that this mode has no security.

I opened #1745 to remind us to implement epgm privacy using libsodium. I think when we focus on the system instance, we may want to make this production worthy and can revisit at that time. (I could be convinced that we should remove epgm event distribution altogether until we revisit it, but that seemed like it wasn't really necessary. Thoughts?).

As a side effect, this removes code from the flux_sec class, which needs some work, so thinning it down is also progress.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Oct 20, 2018

Codecov Report

Merging #1746 into master will increase coverage by 0.27%.
The diff coverage is 81.81%.

@@            Coverage Diff            @@
##           master   #1746      +/-   ##
=========================================
+ Coverage   79.33%   79.6%   +0.27%     
=========================================
  Files         185     185              
  Lines       34721   34451     -270     
=========================================
- Hits        27547   27426     -121     
+ Misses       7174    7025     -149
Impacted Files Coverage Δ
src/broker/boot_config.c 89.87% <ø> (+2.66%) ⬆️
src/common/libflux/message.c 81.02% <ø> (-0.17%) ⬇️
src/broker/boot_pmi.c 63.43% <ø> (+8.24%) ⬆️
src/common/libflux/security.c 76.47% <0%> (+4.85%) ⬆️
src/broker/broker.c 77.86% <100%> (+1.86%) ⬆️
src/broker/overlay.c 84.38% <88.88%> (+11.54%) ⬆️
src/common/libpmi/clique.c 96.82% <0%> (-3.18%) ⬇️
src/broker/modservice.c 78.84% <0%> (-0.97%) ⬇️
src/cmd/flux-module.c 85.28% <0%> (-0.31%) ⬇️
... and 4 more
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 22, 2018

I could be convinced that we should remove epgm event distribution altogether until we revisit it, but that seemed like it wasn't really necessary. Thoughts?

Dropping this now and re-adding it later in the context of the system instance might be a good plan. In a do-over, we could limit epgm's use to avoid the "event relay" complexity when running multiple ranks per node, and potential port collisions in hierarchical instances. The reimplementation would be held to our current higher standard of test coverage.

@garlick garlick changed the title epgm: drop munge encryption of messages epgm: drop epgm event distribution and munge dependency Oct 22, 2018

@garlick garlick force-pushed the garlick:less_munge branch from 32b749a to 269fd19 Oct 23, 2018

garlick added some commits Oct 20, 2018

broker: drop munge encryption on event overlay
Problem: flux-core's munge dependency is inconvenient
for spack users building single-user flux, because
the socket path in munge client library and server
can easily be mismatched.

Munge is only needed to provide privacy for EPGM
event distribution, which is not generally used since
event distribution defaults to the TBON.  Munge was probably
not the best choice anyway, given the RTT to the munge
daemon for every message - encrypting with libsodium would
be better.

Disable encryption of PUB/SUB messages for the time being
to resolve the inconvenience.  Revisit when EPGM is deployed
for the system instance.
broker/boot_config: drop mcast.endpoint
Drop broker bootstrap code that allowed mcast.endpoint
to be set from a config file.
broker/boot_pmi: drop mcast/relay code
Drop broker bootstrap code for setting mcast.endpoint when
bootstrapping from PMI.

Also remove the code for arranging for an "event relay"
when epgm is used and there are multiple brokers per node.
broker: drop code for managing event socket
Drop the code from the broker that routed events
over a specialized event socket, and the "muting"
of TBON event distribution once that other event
distribution method became active.

TBON is the only event distribution method now.
broker/overlay: drop event socket support
Drop the following functions that are no longer used:
  overlay_mute_child ()
  overlay_set_event ()
  overlay_get_event ()
  overlay_set_event_cb ()
  overlay_sendmsg_event ()
  overlay_recvmsg_event ()
  overlay_set_relay ()
  overlay_get_relay ()
  overlay_sendmsg_relay ()

@garlick garlick force-pushed the garlick:less_munge branch from 269fd19 to 8c27d90 Oct 23, 2018

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Oct 23, 2018

This is likely pretty safe to go, as it is mostly demolition.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Oct 24, 2018

LGTM!

@grondo grondo merged commit e4baafb into flux-framework:master Oct 24, 2018

3 checks passed

codecov/patch 81.81% of diff hit (target 79.33%)
Details
codecov/project 79.6% (+0.27%) compared to 27d2efc
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@garlick garlick deleted the garlick:less_munge branch Nov 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.