Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add FLUX_MSGFLAG_PRIVATE and allow guests to content load/store #1032

Merged
merged 11 commits into from Apr 13, 2017

Conversation

4 participants
@garlick
Copy link
Member

garlick commented Apr 7, 2017

This PR allows guest users (specifically users with FLUX_ROLE_USER) to perform content load and store operations. The motivation is to allow signed jobspec and/or resource tokens to be "passed by reference" between multi-user components, although the specifics of that design is still very much in flux (so to speak).

Since knowing the KVS root blobref would allow a guest user to walk the KVS namespace, and the KVS is currently intended only for the instance owner, a new message privacy flag is added so that messages can be restricted so they are only propagated to connections authenticated as the instance owner or the sending user. The privacy flag is added to the kvs.setroot event which contains root blobref updates.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 8, 2017

Coverage Status

Coverage increased (+0.01%) to 78.225% when pulling 0a30764 on garlick:open_content into 6fbe380 on flux-framework:master.

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Apr 8, 2017

Codecov Report

Merging #1032 into master will decrease coverage by <.01%.
The diff coverage is 96%.

@@            Coverage Diff             @@
##           master    #1032      +/-   ##
==========================================
- Coverage   77.92%   77.92%   -0.01%     
==========================================
  Files         150      150              
  Lines       25648    25666      +18     
==========================================
+ Hits        19985    19999      +14     
- Misses       5663     5667       +4
Impacted Files Coverage Δ
src/broker/broker.c 72.34% <ø> (ø) ⬆️
src/broker/log.c 68.32% <ø> (ø) ⬆️
src/broker/sequence.c 93.75% <ø> (ø) ⬆️
src/cmd/flux-ping.c 87.5% <ø> (-0.1%) ⬇️
src/broker/hello.c 79.24% <ø> (ø) ⬆️
src/modules/aggregator/aggregator.c 77.17% <ø> (ø) ⬆️
src/modules/content-sqlite/content-sqlite.c 54.54% <ø> (ø) ⬆️
src/modules/libjsc/jstatctl.c 78.56% <ø> (ø) ⬆️
src/modules/barrier/barrier.c 75.17% <ø> (ø) ⬆️
src/broker/attr.c 85.5% <ø> (ø) ⬆️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4ceed4...77e28da. Read the comment docs.

return -1;
if (!(cache->store_w = add_handler (h, "content.store", FLUX_ROLE_USER,
content_store_request, cache)))
return -1;
if (flux_get_rank (h, &cache->rank) < 0)

This comment has been minimized.

@grondo

grondo Apr 8, 2017

Contributor

Question: I was looking at this commit as an example of converting a service to other users than owner, but this is a bit confusing since content.load and content.store are still in the handlers vector. Is that on purpose because the subsequent handlers will override the previously registered versions, or just an oversight?

Can flux_msg_handler_allow_rolemask be called after flux_msg_handler_start? If so, a call to return the watcher from a handler vector might be really handy, eg..

flux_msg_handler_t *w = flux_msg_handler_vec_find (handlers, "content.load");
if (w)
    flux_msg_handler_allow_rolemask (w, FLUX_ROLE_USER);

If not, a version of addvec that directly takes allowed roles would be nice.

This comment has been minimized.

@grondo

grondo Apr 8, 2017

Contributor

Ah, I see now from broker/attr.c that allow_rolemask can be called after addvec. I suggest the flux_msg_handler_vec_find or similar approach here, which will make it easy for other services to use the same approach. (safer than raw index into the vector...)

This comment has been minimized.

@garlick

garlick Apr 8, 2017

Author Member

Ah that was actually an oversight - I meant to remove them from the handlers vector.

Agreed the interface needs work!

Can we just add a field to the end of the vector struct and assume it will be zeroed if not explicitly initialized? I wasn't sure if it's wise to depend on it being zeroed.

This comment has been minimized.

@grondo

grondo Apr 8, 2017

Contributor

I thought leaving fields off the initializer was only possible with designated initializers, however it does appear even with traditional struct initializer you can leave off trailing members. However, I'm also not sure if these are guaranteed to be zeroed -- you're probably right that depending on implicit zeroing is unwise.

Since this is security related, it seems wise to require an explicit call to enable access to roles other than FLUX_ROLE_OWNER, instead of relying on implicit behavior, even if initialization of members is guaranteed to be zeroed.

The convenience function to find a msghandler in a msg_handler_vec is trivial, so I guess for now service authors can roll their own, though it would be nice to discourage both explicit indicies and having msghandlers split between explicit creation and a flux_msg_handler_vec.

This comment has been minimized.

@garlick

garlick Apr 9, 2017

Author Member

Right, one could reasonably declare the vector on the stack where there is no implicit initialization. Hmm, there is -Werror=missing-field-initializers (valid for both clang and gcc).

This comment has been minimized.

@garlick

garlick Apr 9, 2017

Author Member

In my working branch I just went through the exercise of adding this flag and a rolemask field to the vector, and I think I like the result. It adds little verbosity to the code, while making the security policy explicit for each method implemented by a service. I'll clean it up and rebase this so you can see what you think.

External projects might still manage to improperly declare these vectors, but at least we can ensure core and sched are done right, and then people following the patterns implemented there without enabling the compiler check will at least be starting with good code...

@garlick garlick force-pushed the garlick:open_content branch from 0a30764 to 51e2484 Apr 9, 2017

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 9, 2017

Coverage Status

Coverage decreased (-0.004%) to 78.211% when pulling 51e2484 on garlick:open_content into 6fbe380 on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Apr 11, 2017

Just getting back to looking at this and, I agree, the fully specified vectors are definitely a nice incremental improvement. My only worry is if some other flags, mask, or member is added to the msg handlers in the future we'll have to break the API again. However, perhaps by then some other method for managing msghandlers will make itself apparent.

Presumably this PR needs a corresponding update in flux-sched, to be merged after this?

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 12, 2017

Sched should be updated to add -Werror=missing-field-initializers to CFLAGS and then fully specify its vectors, however it shouldn't need to be done in lock step with this. (I did just scan the sched vectors and none of them initialize more than the first three elements).

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Apr 12, 2017

Sched should be updated to add -Werror=missing-field-initializers to CFLAGS and then fully specify its vectors, however it shouldn't need to be done in lock step with this.

I was more worried about the allowed rolemask becoming uninitialized for users that haven't updated vectors, but I guess that isn't a worry in practice?

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 12, 2017

Should be OK as all of them in sched are in the bss, which is always zeroed.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Apr 12, 2017

ok, seems ready to merge once rebased

@garlick garlick force-pushed the garlick:open_content branch from 51e2484 to 2119019 Apr 12, 2017

@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 12, 2017

Thanks, I've rebased and I'm putting together a PR for sched.

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 12, 2017

Coverage Status

Coverage decreased (-0.004%) to 78.215% when pulling 2119019 on garlick:open_content into f915009 on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Apr 12, 2017

Oops sorry, merged the sharness update first so this one needs a rebase.

garlick added some commits Apr 6, 2017

libflux/message: add FLUX_MSGFLAG_PRIVATE
Add FLUX_MSGFLAG_PRIVATE message flag, and accessors:

flux_msg_set_private()
flux_msg_is_private()

This bit can be set on any message.  If set, connectors
must not forward private messages to connections other
than those authenticated with the instance owner role,
or the same userid as the sender of the message.

The motivating use case is to protect the kvs.setroot
event which exposes the KVS root blobref.  Direct use of
the KVS is restricted to the instance owner, but we would
like to make the content store available to guest users.
modules/connector-local: honor message privacy
If FLUX_MSGFLAG_PRIVATE is set in an event or response
message, it can only be forwarded to a connected client
if the client has authenticated with FLUX_OWNER_ROLE
or has the same userid as the sender of the message.
libflux/dispatch: add rolemask to addvec struct
Problem: it's awkward to change the security policy of
message handlers added with flux_msg_handler_addvec().

Add a new field 'rolemask' to struct flux_msg_handler_spec
which should be initialized to zero for the default security
policy of only handling messages with FLUX_ROLE_OWNER.

Ensure that all handler vectors initialize all fields,
and in preparation for -Werror=missing-field-initializers,
make some unrelated initializers explicit as well.
build: enable -Werror=missing-field-initializers
Problem: sloppy structure initialization on stack-allocated
storage can lead to uninitialized fields.

Tell the compiler to flag uninitialized fields as errors.
modules/kvs: set privacy bit on kvs.setroot
As a precursor to opening content-cache.load to guest
users, make the kvs.setroot (and kvs.error) events private
to the instance owner, as these are intended to be
"internal" to the KVS service.

A guest user should not be able to subscribe to kvs.setroot
and obtain the root blobref, which could would allow the
guest to walk the entire KVS namespace.
test/security: test private events
Test event distribution, as limited by connector-local,
and as limited by the default dispatch policy.

Test that content load/store, but not other content operations,
are available to a guest user.

Test that guest users can't see kvs.setroot events.
test/event-trace.lua: drop unnecessary test event
Problem: event subscription requests were made synchronous
by PR #380, but event-trace.lua still presumes they are
asynchronous and sends a test event through before beginning
the test.

Drop unnecessary handshake from event-trace.lua.
modules/connector-local: add DEBUG_OWNERDROP_ONESHOT
Add debug hook so that the next connection authenticated
as FLUX_ROLE_OWNER drops its authenticated credentials
to rolemask=FLUX_ROLE_USER, userid=FLUX_USERID_UNKNOWN.
test/event-trace-bypass.lua: new variant of script
Add a variant of event-trace.lua which calls f:recv_event()
rather than register an event handler with the reactor.
Since the message dispatcher is not involved, the default
restrictive security policy is bypassed.
modules/connector-local: receive all messages
Problem: connector-local does not receive messages from
guest users due to the default dispatch policy on its
shmem:// connector.  Therefore its connected users can
only receive messages sent by the instance owner.

Change the dispatch policy to allow FLUX_ROLE_ALL so that
all messages are delivered to the connector.  The connector
can then implement its own policy on what to route to its
connected users.
broker/content-cache: open load/store to USER role
Allow users with FLUX_ROLE_USER to load/store blobs from/to
the content cache.

While the KVS service is only available to the instance owner,
the content cache, which serves as backing store for KVS objects,
can be safely made available to guest users as long as the
blobrefs that refer to KVS objects are not exposed to guest
users.

@garlick garlick force-pushed the garlick:open_content branch from 2119019 to 77e28da Apr 13, 2017

@coveralls

This comment has been minimized.

Copy link

coveralls commented Apr 13, 2017

Coverage Status

Coverage increased (+0.01%) to 78.169% when pulling 77e28da on garlick:open_content into e4ceed4 on flux-framework:master.

@grondo

This comment has been minimized.

Copy link
Contributor

grondo commented Apr 13, 2017

My only comment is that a --no-reactor or --bypass option could have been added to the existing event-tracer script instead of a new script, however it hardly seems worth more work at this point, so merging.

@grondo grondo merged commit e99071f into flux-framework:master Apr 13, 2017

4 checks passed

codecov/patch 96% of diff hit (target 77.92%)
Details
codecov/project Absolute coverage decreased by -<.01% but relative coverage increased by +18.07% compared to e4ceed4
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.01%) to 78.169%
Details
@garlick

This comment has been minimized.

Copy link
Member Author

garlick commented Apr 13, 2017

Oops, yeah that would have been better. Thanks for the merge.

@garlick garlick deleted the garlick:open_content branch May 24, 2017

@grondo grondo referenced this pull request Aug 23, 2017

Closed

0.8.0 Release #1160

@garlick garlick self-assigned this Feb 13, 2018

@garlick garlick added this to To do in multi-user parallel jobs via automation Feb 13, 2018

@garlick garlick moved this from To do to Done in multi-user parallel jobs Feb 13, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.