Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mmapcache #2457

Merged
merged 110 commits into from Apr 7, 2019

Conversation

Projects
None yet
2 participants
@mpranj
Copy link
Member

commented Mar 4, 2019

Initial implementation of #2270. Some improvements are left for a second PR.

Basics

Do not describe the purpose here but:

  • Short descriptions should be in the release notes (added as entry in
    doc/news/_preparation_next_release.md which contains _(my name)_)
    Please always add something to the the release notes.
  • Longer descriptions should be in documentation or in design decisions.
  • Describe details of how you changed the code in commit messages
    (first line should have module: short statement syntax)
  • References to issues, e.g. close #X, should be in the commit messages.

Checklist

Check relevant points but please do not remove entries.
For docu fixes, spell checking, and similar none of these points below
need to be checked.

  • I added unit tests
  • I ran all tests locally and everything went fine
  • affected documentation is fixed
  • I added code comments, logging, and assertions (see doc/CODING.md)
  • meta data is updated (e.g. README.md of plugins)

@mpranj mpranj force-pushed the mpranj:mmapcache branch 3 times, most recently from 81874a2 to 886c929 Mar 6, 2019

@@ -86,6 +86,9 @@ typedef enum
/** Everything went fine and the function **did not** update the given keyset/configuration */
#define ELEKTRA_PLUGIN_STATUS_NO_UPDATE 0

/** Everything went fine and we have a cache hit */
#define ELEKTRA_PLUGIN_STATUS_CACHE_HIT 2

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 7, 2019

Contributor

Is it necessary to distinguish between NO_UPDATE and CACHE_HIT?

This comment has been minimized.

Copy link
@mpranj

mpranj Mar 7, 2019

Author Member

Yes. If all backends return:

  • NO_UPDATE: nothing at all to do.
  • CACHE_HIT: need to fetch from cache (invoke cache, update all pointers)
  • SUCCESS: need to fetch from storage

NO_UPDATE is faster than fetching the cache. I need to implement delaying the pointer update though.

@markus2330
Copy link
Contributor

left a comment

Amazing work, I cannot wait to merge this!

// Check if a update is needed at all
switch (elektraGetCheckUpdateNeeded (split, parentKey))
{
case -2: // We have a cache hit
ELEKTRA_LOG_DEBUG ("CACHE HIT");

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 7, 2019

Contributor

put this to a function?


## Usage

You can use `scripts/copy-cache`

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 7, 2019

Contributor

Please update.

This comment has been minimized.

Copy link
@mpranj

mpranj Mar 7, 2019

Author Member

Yep. Sorry, but it was not yet ready for review. Some problems which I did not encounter came up on the build servers.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 7, 2019

Contributor

I only looked over it now so that we hopefully can immediately merge it once the build server says okay. The obvious TODOs, like the code to be removed, I did not mark, though.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Mar 14, 2019

This pull request fixes 1 alert when merging c5d8a99 into 0f9f894 - view on LGTM.com

fixed alerts:

  • 1 for Missing header guard

Comment posted by LGTM.com

@mpranj mpranj force-pushed the mpranj:mmapcache branch 5 times, most recently from f473baf to 31507b0 Mar 15, 2019

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Mar 23, 2019

Any update here? It would be great to have this!

kdb ls system  7,74s user 0,59s system 40% cpu 20,734 total
kdb ls system > /dev/null 2> /dev/null  7,77s user 0,16s system 98% cpu 8,075 total
@mpranj

This comment has been minimized.

Copy link
Member Author

commented Mar 23, 2019

I'm still working on it. Some problems appeared on the build server. I have not been able to reproduce the problems locally, which makes it tedious to debug.

One test fails because it apparently reorders some ini ordering. I don't know why this happens yet.

--- a/user.check.dump.txt
+++ b/user.export.dump.txt
@@ -5,7 +5,7 @@ user/sw/tests/jna/1
 keyMeta 22 3
 internal/ini/key/last #0
 keyMeta 19 3
-internal/ini/order #2
+internal/ini/order #3
 keyMeta 20 5
 internal/ini/parent user
 keyMeta 21 1
@@ -16,7 +16,7 @@ user/tests
 keyMeta 22 3
 internal/ini/key/last #0
 keyMeta 19 3
-internal/ini/order #3
+internal/ini/order #2
 keyMeta 20 5
 internal/ini/parent user
 keyMeta 21 1

@mpranj mpranj force-pushed the mpranj:mmapcache branch from 641d073 to bc0f413 Mar 23, 2019

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Mar 23, 2019

I have not been able to reproduce the problems locally, which makes it tedious to debug.

Even not with docker?

One test fails because it apparently reorders some ini ordering.

We were thinking to throw this INI tests as we anyway want YAML and not INI as default (see #2499 for a previous situation where this test caused problems). But in your case it might be that it actually unveiled a bug?

If this is the only problem, please simply exclude your plugin in the INI tests, create an issue that we do not forget about it, and let us merge this.

@mpranj

This comment has been minimized.

Copy link
Member Author

commented Mar 23, 2019

Even not with docker?

I was able to reproduce it today with docker, but the root cause is still unclear to me.

If this is the only problem, [...]

There are other test failures and I think this has unveiled a real bug. It's weird that the same tests run fine on multiple other instances.

@mpranj

This comment has been minimized.

Copy link
Member Author

commented Mar 24, 2019

Ok, so part of the problem is that older macOS (using HFS+) is missing nanosecond timestamp information. We basically have this problem in our resolver already, which could be shown with a test case. The cache simply triggers the bug more often, since this timestamp information is persisted.

I think the best would be to disable the cache for HFS+ filesystems (where tv_nsec is always 0).

@mpranj mpranj force-pushed the mpranj:mmapcache branch from 5c24cc6 to 1d3be90 Mar 24, 2019

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Mar 24, 2019

I think the best would be to disable the cache for HFS+ filesystems (where tv_nsec is always 0).

Yes, I agree. Once we fixed #2533, we can activate the cache again.

@mpranj mpranj force-pushed the mpranj:mmapcache branch from 9c4d788 to 3be5ba4 Mar 25, 2019

@markus2330
Copy link
Contributor

left a comment

Congratulations! Seems like all test cases succeed now!

Please quickly do the remaining tasks so that we finally can merge this 👍

@@ -155,6 +155,10 @@ The following section lists news about the [modules](https://www.libelektra.org/

- We fixed an incorrect format specifier in a call to the `syslog` function. _(René Schwaiger)_

### Cache

- [cache](https://www.libelektra.org/plugins/cache) is a new global caching plugin. It uses [mmapstorage](https://www.libelektra.org/plugins/mmapstorage) as its storage backend and lazily stores keysets from previous ´kdbGet()´ calls. _(Mihael Pranjić)_

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 26, 2019

Contributor

Please highlight it and also put benchmarks here (does not need to be within this PR, though)

@mpranj

This comment has been minimized.

Copy link
Member Author

commented Mar 26, 2019

@markus2330 I'm afraid the reported problem with ini is not yet fixed. The tests succeeded because of an error where the cache was effectively disabled (always a cache miss).

With the cache disabled, the key sw/tests/jna/1 does not exist (is deleted). With the cache enabled, the key remains in the config. I'd like to fix this before we merge.

@mpranj mpranj force-pushed the mpranj:mmapcache branch from 3227a0d to b4b26dc Mar 26, 2019

@kodebach kodebach referenced this pull request Mar 30, 2019

Merged

Opts plugin #2471

7 of 9 tasks complete

@mpranj mpranj force-pushed the mpranj:mmapcache branch from 5867524 to 040ceb9 Apr 3, 2019

@mpranj

This comment has been minimized.

Copy link
Member Author

commented Apr 6, 2019

@markus2330 I implemented a check where we fetch from the backends and from the cache to see what is going on with this last problem. Doing some tests by hand it seems it only fails with ini and not with dump.

After a bit more investigation I noticed that ini returns different internal/ini/order metakeys for exactly the same key (e.g. user/test/foo) depending on the parentKey from the invocation. I did not expect this behaviour and it is definitely incompatible with the current design of the cache.

@tom-wa can you confirm my understanding og the internal/ini/order key?

If I am not missing something here, we should simply make the cache incompatible with ini. Either not compile it at all, or simply put in the README, or some other solution. I think a dynamic check whether ini is in a backend is maybe too much overhead.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Apr 6, 2019

I implemented a check where we fetch from the backends and from the cache to see what is going on with this last problem.

This sounds like a very handy debugging feature. Maybe you can leave this code or at least leave the commit in the history so that we can run this check again if we once again run into such a problem.

If I am not missing something here, we should simply make the cache incompatible with ini.

Yes, as already said I am fine with this solution! Simply document that INI as default and mmap cache do not play well together. (Maybe with a bit of description what is the worst thing that could happen.) And obviously disable your plugin in the INI-as-default test case.

It would be great to have this merged asap.

@mpranj

This comment has been minimized.

Copy link
Member Author

commented Apr 6, 2019

This sounds like a very handy debugging feature.

The way I implemented it is not really suitable because it clutters kdbGet a lot but I'll think of a nice way to add it. If I don't find a nice way I'll just leave the dirty way in the history.

Simply document that INI as default and mmap cache do not play well together.

I think ini makes problems even if it is just mounted anywhere. It keeps track of global ordering which is different when kdbGet is invoked with different parentKeys. I think the problem/effect occurs when ini is present in multiple backends.

Unfortunately I really thought it was a bug in the cache at first, which wasted a lot of time. From my current understanding it's really just an incompatibility issue.

It would be great to have this merged asap.

I should be able to finish it up tomorrow, I can't work on it today.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Apr 6, 2019

The way I implemented it is not really suitable because it clutters kdbGet a lot but I'll think of a nice way to add it. If I don't find a nice way I'll just leave the dirty way in the history.

leaving in the history is enough, do not spent extra time here.

mpranj added some commits Mar 24, 2019

mmapstorage: change logging approach for global cache
Warnings are suppresed to debug leve for the global cache mode.
Missing files and other warnings just mean that we have a cache miss.
WIP cache: debug mode, check keyset from backends vs keyset from cache
This commit leaves some debugging code in the history, but it is not
meant for production. Only use this to debug problems with the cache.

@mpranj mpranj force-pushed the mpranj:mmapcache branch from 885c8bd to a4b03b4 Apr 7, 2019

@mpranj

This comment has been minimized.

Copy link
Member Author

commented Apr 7, 2019

@markus2330 this is ready. 🎉

The cache debug code was left in c31979d and then reverted. As discussed, some improvements are left for a second PR. The cache is disabled dynamically whenever we encounter the ini plugin, otherwise it just works.

A best-scenario (cache hit) benchmark is available in ./bin/benchmark_kdb and shows roughly a 10x improvement over dump without cache. There might be room for improvement. (Concrete numbers and highlight in the news coming in a second PR)

@markus2330 markus2330 merged commit cf4888d into ElektraInitiative:master Apr 7, 2019

6 checks passed

LGTM analysis: C/C++ No new or fixed alerts
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No code changes detected
Details
continuous-integration/jenkins/pr-merge This commit looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage decreased (-0.007%) to 77.377%
Details
@markus2330

This comment has been minimized.

Copy link
Contributor

commented Apr 7, 2019

Thank you! I am looking forward using it tomorrow (I'll run tomorrow a script which issues many kdb commands.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.