Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvs: add optional namespace key prefix #1341

Closed
garlick opened this issue Feb 15, 2018 · 19 comments
Closed

kvs: add optional namespace key prefix #1341

garlick opened this issue Feb 15, 2018 · 19 comments
Assignees

Comments

@garlick
Copy link
Member

garlick commented Feb 15, 2018

In #1197, a suggested simpler alternative to the "mount points" discussed in RFC 16 is an optional namespace key prefix, specified like primary/a.b.c that could be the target of a symlink.

Example, guest namespace created for job is named 00b5c36c-1296-11e8-b0c4-43d860a24705. The primary namespace portion of the job schema could have a symlink like:

guestns -> 00b5c36c-1296-11e8-b0c4-43d860a24705/.

Then the user can lookup jobs.42.guestns.whatever like it was all one namespace.

The problem we discussed was that commit transactions might reference keys in multiple namespace, and that would break atomicity as well as complicate the commit handling code. We decided that case needed to be detected and failed, but then ran into a problem such a commit could be merged with a perfectly valid commit, and then the good commit would fail also.

A proposal to solve this latter problem is now in #1337. If we can get that fixed, then the proposed namespace prefix could presumably be implemented.

@chu11
Copy link
Member

chu11 commented Mar 9, 2018

thinking about this today about a plan of action to get this working

  1. support namespace prefix on lookup requests (i.e. lookup, watch, etc). This is pretty simple, a generic parse on the key passed in by the user within the request callback.

  2. support namespace prefix lookup on symlinks. This is going to be trickier. In the internal lookup API, I need to inform the caller that I need a new namespace, and thus a new root reference. Should be doable similar to the current "i need a missing reference" code.

  3. support namespace prefix on transactions (i.e. commits/fences). This is going to be even trickier. Same problems as Fixes for flux-start #2 above AND mechanism that will allow the internal kvstxn API to inform the caller that multiple namespaces have new root references. I imagine the "get new root reference" function will be turned into an iterating function w/ a callback.

@garlick
Copy link
Member Author

garlick commented Mar 9, 2018

mechanism that will allow the internal kvstxn API to inform the caller that multiple namespaces have new root references

Multiple? I think we decided that a txn that references more than one namespace is an error, since the transaction could not be atomic.

@chu11
Copy link
Member

chu11 commented Mar 9, 2018

@garlick oh dang, you're right. And it's written as such in the first comments above. I had just forgotten :-)

@chu11
Copy link
Member

chu11 commented Mar 17, 2018

So something I was thinking about while working on this issue.

Currently, a lookup RPC returns the rootdir from the lookup to the caller. This is so the caller can re-do the lookup at a later time and ensure they get the same information back.

With the potential to leap namespaces via symlink, the returned rootdir returned to the user is sort of useless. The next lookup can have a different result b/c the root reference of the other namespace may have changed. The solution would be to have the RPC return some list of rootdirs listing all the root references used in the lookup (and the user has the ability to pass in a list of rootdirs to use on all namespaces).

Unsure of the need, but it is definitely more of a long term thing.

Then it got me thinking. Longer out, maybe we would want to support "hard" and "soft" symlinks? The "hard" version could reference a specific root-ref. Hypothetically something like:

guestns -> othernamespace@sha1-0123456789/.

@garlick
Copy link
Member Author

garlick commented Mar 17, 2018

Looking at the libkvs code, I can't see that the rootdir in the lookup response is ever actually used.
For fun, I dropped it from the response message and all the tests still pass.

(Sorry, when we spoke yesterday I thought it was used for the "snapshot walk" feature in flux_kvsdir_t, but that seems to use the treeobj argument from flux_kvs_lookupat() instead).

I guess the issue you describe is still a bit problematic for "lookupat" in general, e.g. if you're expecting to walk a snapshot and you cross a symbolic link that takes you out of it, your expectations may be violated. However, I'm not sure that is fundamentally broken, since how you deal with a symlink in that situation is up to the user.

So maybe for now just drop the rootdir from the lookup response?

As for "hard symlinks", is there any point in including the namespace and the content reference? The content reference is namespace-independent, so it seems like that's just a "hard link", which we already support, implicitly and explicitly. Implicitly when two keys happen to point to content with the same hash, and explicitly with flux_kvs_txn_put_treeobj().

@chu11
Copy link
Member

chu11 commented Mar 17, 2018

Ok, lets drop rootdir's return in the RPC.

For the record, I can't think of a scenario where the hardlink is actually necessary as there are other ways of doing what needs to be done. It's just an idea I thought of in case a user did the following:

assume

a.symlink -> NS2/value
flux kvs --namespace=NS2 put value=1
flux kvs --namespace=NS1 get --at <a root reference> a.symlink
flux kvs --namespace=NS2 put value=2
flux kvs --namespace=NS1 get --at <same root reference> a.symlink

The first get returns 1 and the second one gets 2, b/c of the namespace crossing.

@garlick
Copy link
Member Author

garlick commented Mar 17, 2018

Yeah, I think that's probably the right outcome when accessing a symlink in a snapshot that crosses into another namespace. The immutability of the snapshot only holds for the symlink itself, not its target.

@grondo
Copy link
Contributor

grondo commented Mar 19, 2018

Just discussing idly this morning with @garlick about the possibility of future inconvenience by co-opting the / character for kvs namespace prefix. I fully admit I haven't been following this work closely, so feel free to ignore my statements. However, what if instead of using an arbitrary string namespace name + separator character for the namespace prefix, you require that the namespace prefix begin with a well-known string + namespace name + separator? (e.g. "ns:NAMESPACE/")

This would reduce the likelihood of users wanting to use a key with the namespace character embedded (since this would be allowed without the ns: prefix), and also allow the namespace matching code to always be able to look at the first, say, 3 characters of a key and know if it had a namespace prefix instead of scanning the whole string for /. (Of course, I admit this idea might be very naive, and we haven't proven to have a real problem with the current approach yet)

Just wanted to throw this idea out there.

@garlick
Copy link
Member Author

garlick commented Mar 19, 2018

Good idea.. Any thoughts on this @chu11?

@chu11
Copy link
Member

chu11 commented Mar 19, 2018

In general, I think it's a good idea. I was bring this up later on, b/c the separator isn't that important. We can change the parsing out of the key/namespace later on.

I originally was going to suggest ::, b/c its the "namespace" separator in C++ :-)

@chu11
Copy link
Member

chu11 commented Mar 24, 2018

So today I finished up namespace prefix support on lookups for the initial key passed in by the user. I then began to add support on the commit/fence side.

As I was working on this, I realized, there's a lot of code to support this. A lot of strchr(), a lot of branches, and on the commit/fence strcmp() (b/c we can't cross namespaces, but you can stay in your current namespace). Add in some mallocs while I'm parsing the keys too.

At some point, I pondered, is this a good idea? We're introducing a lot of branches and parsing and stuff for a common path, which is possibly not going to be used that much.

Do we care that much about namespace prefixes on the initial key/path input by the user? Perhaps namespace prefix only should be allowed for symlinks?

@grondo
Copy link
Contributor

grondo commented Mar 24, 2018

A lot of strchr()

Avoiding strchr() (in addition to avoiding adding another character people can't use in keys) was the reason I was advocating a constant prefix string for denoting a namespace. With the strchr() approach you have to search the entire string for every key, whereas if you prefix every namespace with something like "ns::" you can break out as soon as the first character, worst case 4 characters, and even if your namespace separator is / (e.g. "ns::NS1/") , you can still use /` in normal keys.
(sorry couldn't resist giving my $0.02)

Do we care that much about namespace prefixes on the initial key/path input by the user? Perhaps namespace prefix only should be allowed for symlinks?

I'm probably not going to be much help with the larger question here, sorry. Keep in mind the main use case for namespaces is so that the "job shell" and front end tools can "chroot" or otherwise have an easy handle to the job namespace. It would seem to be cleaner if this doesn't have to be done by bouncing through a symlink.

@chu11
Copy link
Member

chu11 commented Mar 24, 2018

Going with the ns:: approach would certainly be better, but I think there's so much parsing going on, I'm more questioning if this shouldn't be supported in the general case.

@garlick
Copy link
Member Author

garlick commented Mar 24, 2018

The use case I had in mind for the symlink support was for the instance owner to be able to walk the various job namespaces to obtain status or whatever without needing to track the mapping of jobs to namespaces in some other way. (An obvious way would be to store the namespace name in a per-job key in the primary namespace)

It wouldn't be a proper symlink if the target is special, i.e. you can't do a readllink on the target and manually look that up.

I guess this was a "nice to have" feature, so if it turns out to make a mess of things, we can always fall back to something like the above... Maybe talk more on monday?

@chu11
Copy link
Member

chu11 commented Mar 24, 2018

It wouldn't be a proper symlink if the target is special, i.e. you can't do a readllink on the target and
manually look that up.

Wasn't suggesting we don't support namespace prefix on symlink, just on the base keys that user inputs. So for example:

a.symlink -> NS/value

We'd still support:

flux kvs get a.symlink

But not support

flux kvs get NS/value

Users would have to do

flux kvs --namespace=NS get value

But we can talk more on Monday.

@chu11
Copy link
Member

chu11 commented Mar 26, 2018

Spoke to @garlick, went back and forth on thoughts. Jim had a good idea.

On commit/fence transactions, the user can send in a list of operations (i.e. key=value pairs). We will only check the first operation's key to determine the namespace this set of operations belongs to, instead of checking every key to determine if they all belong to the same namespace. When we apply the operations to the kvs, if the user goes outside of the namespace, we issue an error.

This simplifies things a lot.

In addition, chains of namespaces (i.e. a/b/c/d) are allowed on lookups, but they will not be allowed on commits/fences, as it violates the "going out of a namespace" conditions of commits/fences.

@garlick
Copy link
Member Author

garlick commented Apr 3, 2018

Just fixed a typo in the description of this issue which sent @chu11 down the wrong path in #1423. So sorry about that!

(I used slash instead of period as the KVS path separator, and that was ambiguous with the proposed namespace separator).

@chu11
Copy link
Member

chu11 commented Apr 3, 2018

After talking to @garlick, what we've decided is that "namespace chaining" i.e. something like key = "A/B/C/val" is not going to be allowed. The user can specify at most one namespace prefix. If more than one is specified, it's an error.

@chu11
Copy link
Member

chu11 commented Apr 7, 2018

with #1432 merged in, I think this issue is complete. There are some nit-picky issues remaining, but they are listed in other issue tickets.

@chu11 chu11 closed this as completed Apr 7, 2018
multi-user parallel jobs automation moved this from In progress to Done Apr 7, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

No branches or pull requests

3 participants