kvs: convert missing_refs_list to hash or deal with duplicates #1751

Closed · chu11 opened this issue Oct 23, 2018 · 2 comments

chu11 commented Oct 23, 2018

While investigating #1747, it occurred to me that the missing_refs_list that is returned in kvstxn could have duplicate references. Perhaps it'd be better to internally represent it as a hash, so that duplicate missing references aren't looked up twice.
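To illustrate the idea, here is a minimal hypothetical sketch (not flux-core's actual code), assuming czmq's zhashx_t with its default string-key handling; the helper name and the blobref strings are made up for illustration:

#include <assert.h>
#include <czmq.h>

/* Hypothetical helper: record a missing reference, collapsing duplicates.
 * zhashx_insert() returns -1 if the key is already present, so each unique
 * blobref is stored (and would later be looked up) exactly once. */
static void missing_ref_add (zhashx_t *missing_refs, const char *ref)
{
    (void)zhashx_insert (missing_refs, ref, "placeholder");
}

int main (void)
{
    zhashx_t *missing_refs = zhashx_new ();
    missing_ref_add (missing_refs, "sha1-aaaa");
    missing_ref_add (missing_refs, "sha1-aaaa");   /* duplicate, collapsed */
    missing_ref_add (missing_refs, "sha1-bbbb");
    assert (zhashx_size (missing_refs) == 2);      /* only unique refs remain */
    zhashx_destroy (&missing_refs);
    return 0;
}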

chu11 changed the title from "kvs: convert missing_refs_list to hash" to "kvs: convert missing_refs_list to hash or deal with duplicates" on Oct 24, 2018

chu11 commented Oct 24, 2018

I actually began to make this change, but wasn't sure if it was a net win. Basically, if a user were to do a kvs transaction with many writes to the same directory, such as

txn = flux_kvs_txn_create ();
flux_kvs_txn_put (txn, 0, "dir.a", "a");
flux_kvs_txn_put (txn, 0, "dir.b", "b");
...
flux_kvs_txn_put (txn, 0, "dir.y", "y");
flux_kvs_txn_put (txn, 0, "dir.z", "z");
flux_kvs_commit (h, 0, txn);

and the reference for directory "dir" happens not to be loaded yet, this would put 26 copies of the same missing reference on the list and generate 26 lookups. So ...

A) Is this common? I don't think so, because in the above scenario I imagine most of the time it's more like:

txn = flux_kvs_txn_create ();
flux_kvs_txn_mkdir (txn, 0, "dir");
flux_kvs_txn_put (txn, 0, "dir.a", "a");
flux_kvs_txn_put (txn, 0, "dir.b", "b");
...
flux_kvs_txn_put (txn, 0, "dir.y", "y");
flux_kvs_txn_put (txn, 0, "dir.z", "z");
flux_kvs_commit (h, 0, txn);

so the missing reference for "dir" isn't an issue.

B) If it is something to consider, is detecting duplicates in the references list a net win, regardless of how it's done?


chu11 commented Oct 25, 2018

I was talking to @garlick about this, and he reminded me that while the above is true, higher-level code in the cache / wait data structures probably handles this. @garlick's memory is far better than mine and he's right, although the code is a tad non-optimal.

Basically, the core missing-references loop is:

for all missing references {
    entry = cache_lookup (ref);
    if (!entry) {
        /* not cached yet: create an entry and send one load rpc */
        entry = cache_create_entry (ref);
        content_store_load (ref);
    }
    if (!cache_entry_has_valid_data (entry)) {
        /* data not back from the content store yet: wait on this entry */
        cache_add_me_to_waitlist (entry, wait);
        stall;
    }
}

So the worst case, sending multiple RPCs to load the same data from the content store, is avoided.

However, one small non-optimal detail remains: every duplicate missing reference is still added to the waitlist on the same cache entry. So in my worst-case example above, the waitlist queue on the cache entry for "dir" would be 26 entries long.
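As a purely illustrative toy model (not flux-core code) of how those counts fall out in the loop above:

#include <stdio.h>

/* Toy model: 26 puts all stall on the same missing "dir" reference, so
 * only one load RPC is sent, but the single cache entry still
 * accumulates 26 waiters. */
int main (void)
{
    int rpcs_sent = 0, waiters = 0;
    int entry_exists = 0;              /* cache entry for "dir" */

    for (int i = 0; i < 26; i++) {     /* one pass per missing reference */
        if (!entry_exists) {
            entry_exists = 1;          /* stands in for cache_create_entry () */
            rpcs_sent++;               /* stands in for content_store_load () */
        }
        waiters++;                     /* stands in for cache_add_me_to_waitlist () */
    }
    printf ("rpcs=%d waiters=%d\n", rpcs_sent, waiters);  /* rpcs=1 waiters=26 */
    return 0;
}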

At a minimum, some clarifying comments should be added to the code to explain this, so I'll leave this open until I add them.
