Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(PUP-5743) Fix issue with expiring cache as described in #353 #354

Closed
wants to merge 1 commit into from

Conversation

@garethr
Copy link
Contributor

garethr commented Jan 22, 2016

No description provided.

@garethr

This comment has been minimized.

Copy link
Contributor Author

garethr commented Jan 22, 2016

This could ideally do with a test, But it's late Friday and this appears to resolve the problem from manual testing on the Docker module.

Lots of context in #353 and https://tickets.puppetlabs.com/browse/PUP-5743

@adrienthebo

This comment has been minimized.

Copy link
Contributor

adrienthebo commented Jan 22, 2016

If we've verified that this solves the issue and are confident in it then I can write tests for this later and we can ship it now.

@garethr

This comment has been minimized.

Copy link
Contributor Author

garethr commented Jan 22, 2016

@adrienthebo @DavidS is just checking with the Apache module as well to get some confidence.

@@ -23,7 +23,7 @@ def get(*args, &blk)
private

def expire!
expired = @lra.slice!(0, @lra.size - MAX_ENTRIES)
expired = @lra.slice!(0, @lra.size - (MAX_ENTRIES-1))

This comment has been minimized.

Copy link
@hlindberg

hlindberg Jan 22, 2016

Collaborator

The -1 only helps with the case where the same args are used twice in a row at the border of MAX_ENTRIES, it does not help with the case when @ira[0] should be retained because the about to be added args are the same as @ira[0].
If args are give to expire! it could remove all those entries from @irl. In the case when the same args appear repeatedly all entries in the @ira will be the same, and there is nothing that should be evicted !

This comment has been minimized.

Copy link
@hlindberg

hlindberg Jan 22, 2016

Collaborator

With the fix I propose above, the -1 should not be there

@irl

This comment has been minimized.

Copy link

irl commented Jan 22, 2016

Hi, I would be happy to have those entries removed from myself. LGTM.

@garethr garethr force-pushed the garethr:fix-caching-issue branch from 5f3abd9 to f01224d Jan 22, 2016
if !@cache.has_key? args
@cache[args] = (blk || @default_proc).call(*args)
@lra << args
expire!

This comment has been minimized.

Copy link
@hlindberg

hlindberg Jan 22, 2016

Collaborator

if expire! is kept here, this cache can be a LRU cache by doing this on the line above:

@irl = (@irl << 1).reverse.uniq.reverse

As an example given [1,2,3,4] and appending 2, the result is [1,3,4,2]. A long as cache is a small number of entries that solution is good enough. Then, when expires removes entries, it will remove the MAX_ENTRIES least recently used entries

This comment has been minimized.

Copy link
@cardil

cardil Jan 22, 2016

That is some serious caching stuff. To deep for me to follow on Friday evening 😉

This comment has been minimized.

Copy link
@hlindberg

hlindberg Jan 22, 2016

Collaborator

[1,2,3,4], becomes [1,2,3,4,1] when appending 1. if you do a unique, the result is [1,2,3,4]. if instead reversing the array first it becomes [1,4,3,2,1] and a unique then gives [1,4,3,2], which reversed (again) is [2,3,4,1]. The 1 was "touched" and moved last in the array. Now if cache max was 4, and we add an element 5, we first get [2,3,4,1,5], and the expire will evict the first 3 and we end up with [1, 5]. The '1' survived because it was used later in time than 2, 3, and 4.

That is the idea at least.

There are other ways of doing the same thing - in pseudo code:
if cache-hit
find the corresponding entry in @ira and move it last in @ira (i.e. this indicates "used last")

if cache-miss
   evict from cache if there are too many entries, but only evict one (seems bad to evict all of them)
  add the args to @ira, and @cache
end

However that is not the problem here (unless a very odd combination of args/cache is interacting). The above is however a better cache implementation if the idea is to just cache catalogs given its args (input); hard to tell if there are other requirements.

Will continue looking at the other problem.

@DavidS

This comment has been minimized.

Copy link
Collaborator

DavidS commented Jan 22, 2016

This change fails on puppet 3.8.5; I've went back and checked: it fails there in the same way as without this change.

running git bisect on puppet 3.8.5 points at puppetlabs/puppet@8136aff as the cause for this issue, which plausibly could have impact on the catalogue_id or the args used as key in the cache.

Investigation continuing ...

@hlindberg

This comment has been minimized.

Copy link
Collaborator

hlindberg commented Jan 22, 2016

Looking at the code the bisect work found.

@hlindberg

This comment has been minimized.

Copy link
Collaborator

hlindberg commented Jan 22, 2016

The change in 3.8.5 is also in 4.3.2. Before the commit in question, the actual environment and the string that is used as the name of the environment for a node could diverge. That is a good thing I think.

@hlindberg

This comment has been minimized.

Copy link
Collaborator

hlindberg commented Jan 22, 2016

I have a theory that the changed cache logic in this PR will cause a recompile when one is not needed, it will place the new entry in the cache where the old logic would not. That would for sure cause identities to be different. However, the same failure is not seen when puppet < the commit in question, so there must be something else that causes rspec_puppet expectancy to not be true after the change. Need to dig deeper into rspec_puppet.

@hlindberg

This comment has been minimized.

Copy link
Collaborator

hlindberg commented Jan 22, 2016

A posted #355 with what I hope is a better cache algorithm. Will be interesting to see if that changes the other problem. (Waiting for travis run there).

DavidS added a commit to DavidS/rspec-puppet that referenced this pull request Jan 25, 2016
Since applying a fix for PUP-5522 (puppet 3.8.5 and 4.3.2) puppet symbolizes environment names
internally. The catalog cache needs to assume that the facts and other args do not change between
runs, so we have to mirror this here. Puppet versions before the change require a string as environment
name, or they fail with "Unsupported data type: 'Symbol' on node xyz"
See rodjek#354 and PUP-5743 for discussion of this
DavidS added a commit to DavidS/rspec-puppet that referenced this pull request Jan 25, 2016
Since applying a fix for PUP-5522 (puppet 3.8.5 and 4.3.2) puppet symbolizes environment names
internally. The catalog cache needs to assume that the facts and other args do not change between
runs, so we have to mirror this here. Puppet versions before the change require a string as environment
name, or they fail with "Unsupported data type: 'Symbol' on node xyz"
See rodjek#354 and PUP-5743 for discussion of this
@DavidS

This comment has been minimized.

Copy link
Collaborator

DavidS commented Jan 25, 2016

Closing this as it didn't address the stated issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.