Apply hashing to remote storage cache key #1030


bitprophet
Member

Prevents hitting the memcache 250-char key limit when rendering very long metric paths in cluster setups.

This looks like it regressed in f18bb3c. Since that appears to be a squashed merge, I can't tell why it was changed during the megacarbon work; presumably it was to update the data used to construct the key. Either way, this change doesn't modify the nature of the key; it only ensures the key is hashed like all other memcached-using code (e.g. in render/views.py).
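
For readers unfamiliar with the technique, here is a minimal sketch of hashing a cache key so its length no longer depends on the metric path. It is not the code from this PR: the helper name `hashed_cache_key`, the `find` prefix, the host string, and the choice of MD5 are all assumptions made for illustration.

```python
import hashlib

# Key length limit named in this PR's description.
MEMCACHED_MAX_KEY_LENGTH = 250


def hashed_cache_key(prefix, *parts):
    """Hypothetical helper: build a fixed-length cache key from arbitrary parts."""
    # Join the variable-length inputs, then replace them with a hex digest.
    raw = ":".join(str(part) for part in parts)
    digest = hashlib.md5(raw.encode("utf-8")).hexdigest()
    return "%s:%s" % (prefix, digest)


# A deliberately long metric path, well over the 250-character limit.
long_query = ".".join(["very_long_metric_segment"] * 20)

# The host string and prefix are made up for this example.
key = hashed_cache_key("find", "remote-host:8080", long_query)
```

The digest is always 32 hex characters, so the resulting key stays short no matter how long the query is, while identical inputs still map to the same key.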

Prevents hitting the memcache 250-char key limit when rendering
in cluster setups & using very long metric paths.
@bitprophet
Member Author

FTR this is the traceback one will hit in the wild when this bug is present:

Traceback (most recent call last):
  File "/mnt/services/graphite/lib/python2.7/site-packages/django/core/handlers/base.py", line 111, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/mnt/services/graphite/lib/graphite/render/views.py", line 115, in renderView
    seriesList = evaluateTarget(requestContext, target)
  File "/mnt/services/graphite/lib/graphite/render/evaluator.py", line 10, in evaluateTarget
    result = evaluateTokens(requestContext, tokens)
  File "/mnt/services/graphite/lib/graphite/render/evaluator.py", line 21, in evaluateTokens
    return evaluateTokens(requestContext, tokens.expression)
  File "/mnt/services/graphite/lib/graphite/render/evaluator.py", line 24, in evaluateTokens
    return fetchData(requestContext, tokens.pathExpression)
  File "/mnt/services/graphite/lib/graphite/render/datalib.py", line 142, in fetchData
    raise Exception("Failed after %i retry! See: %s" % (settings.MAX_FETCH_RETRIES, e))
Exception: Failed after 2 retry! See: Key length is > 250

And the "real" traceback when one temporarily nukes the try/except there:

Traceback (most recent call last):
  File "/mnt/services/graphite/lib/python2.7/site-packages/django/core/handlers/base.py", line 111, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/mnt/services/graphite/lib/graphite/render/views.py", line 115, in renderView
    seriesList = evaluateTarget(requestContext, target)
  File "/mnt/services/graphite/lib/graphite/render/evaluator.py", line 10, in evaluateTarget
    result = evaluateTokens(requestContext, tokens)
  File "/mnt/services/graphite/lib/graphite/render/evaluator.py", line 21, in evaluateTokens
    return evaluateTokens(requestContext, tokens.expression)
  File "/mnt/services/graphite/lib/graphite/render/evaluator.py", line 24, in evaluateTokens
    return fetchData(requestContext, tokens.pathExpression)
  File "/mnt/services/graphite/lib/graphite/render/datalib.py", line 136, in fetchData
    seriesList = _fetchData(pathExpr,startTime, endTime, requestContext, seriesList)
  File "/mnt/services/graphite/lib/graphite/render/datalib.py", line 99, in _fetchData
    fetches = [(node, node.fetch(startTime, endTime)) for node in matching_nodes if node.is_leaf]
  File "/mnt/services/graphite/lib/graphite/storage.py", line 24, in find
    remote_requests = [ r.find(query) for r in self.remote_stores if r.available ]
  File "/mnt/services/graphite/lib/graphite/remote_storage.py", line 25, in find
    request.send()
  File "/mnt/services/graphite/lib/graphite/remote_storage.py", line 58, in send
    self.cachedResult = cache.get(self.cacheKey)
  File "/mnt/services/graphite/lib/python2.7/site-packages/django/core/cache/backends/memcached.py", line 57, in get
    val = self._cache.get(key)
  File "/mnt/services/graphite/lib/python2.7/site-packages/memcache.py", line 793, in get
    return self._get('get', key)
  File "/mnt/services/graphite/lib/python2.7/site-packages/memcache.py", line 761, in _get
    self.check_key(key)
  File "/mnt/services/graphite/lib/python2.7/site-packages/memcache.py", line 954, in check_key
    % self.server_max_key_length)
MemcachedKeyLengthError: Key length is > 250

@beevee
Contributor

beevee commented Dec 1, 2014

I'm also experiencing a different bug that should be fixed by hashing cache keys.

Grafana dashboards sometimes send render requests with an invalid metric name (select%20metric gets appended at the end). This causes graphite-web to crash with a 500 response code due to a "control character" (the space) in the cache key:

Exception encountered in <GET http://graphite/metrics/find/?query=some.query.select%20metric>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/django/core/handlers/base.py", line 115, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/opt/graphite-front/webapp/graphite/metrics/views.py", line 128, in find_view
    matches = list( STORE.find(query, fromTime, untilTime, local=local_only) )
  File "/opt/graphite-front/webapp/graphite/storage.py", line 41, in find
    remote_requests = [ r.find(query) for r in self.remote_stores if r.available ]
  File "/opt/graphite-front/webapp/graphite/remote_storage.py", line 24, in find
    request.send()
  File "/opt/graphite-front/webapp/graphite/remote_storage.py", line 57, in send
    self.cachedResult = cache.get(self.cacheKey)
  File "/usr/lib/python2.7/site-packages/django/core/cache/backends/memcached.py", line 64, in get
    val = self._cache.get(key)
  File "/usr/lib/python2.7/site-packages/memcache.py", line 898, in get
    return self._get('get', key)
  File "/usr/lib/python2.7/site-packages/memcache.py", line 847, in _get
    self.check_key(key)
  File "/usr/lib/python2.7/site-packages/memcache.py", line 1062, in check_key
    "Control characters not allowed")
MemcachedKeyCharacterError: Control characters not allowed
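
To illustrate why hashing would also cover this case, here is a small sketch. The validity check is only an approximation of what a memcached client enforces (it assumes keys must be at most 250 characters of printable, non-whitespace ASCII); it is not the library's own code, and the `find:` prefix and function name are made up for the example.

```python
import hashlib


def is_safe_memcached_key(key, max_length=250):
    # Approximate client-side validation (an assumption, not library code):
    # reject over-long keys and any whitespace, control, or non-ASCII character.
    return len(key) <= max_length and all(32 < ord(ch) < 127 for ch in key)


# Decoded form of the query from the failing request (select%20metric).
query = "some.query.select metric"

# Using the query directly puts the space into the key.
raw_key = "find:" + query

# Hashing first leaves only hex digits after the prefix.
hashed_key = "find:" + hashlib.md5(query.encode("utf-8")).hexdigest()
```

Under these assumptions `is_safe_memcached_key(raw_key)` is false and `is_safe_memcached_key(hashed_key)` is true, since a hex digest can never contain a space.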

@beevee
Contributor

beevee commented Dec 1, 2014

@bitprophet could you resolve merge conflicts, please?

…e-hashing

Conflicts:
	webapp/graphite/remote_storage.py
@bitprophet
Member Author

Hrm, I apparently pushed this without actually updating my source branch: git log was showing me the previous commit was from early 2013. (I develop patches against my workplace's internal, stable-thus-old checkout, but usually remember to rebase against public master before PR'ing.) Hilarious?

Just merged to latest master; the diff still looks identical to me (though yes, git did think there was some sort of "both updated" situation), so we'll see what Travis says.

@drax68

drax68 commented Jan 28, 2015

+1 Hit this issue too. Any chance that this will be merged soon?

drax68 pushed a commit to drax68/graphite-web that referenced this pull request Jan 28, 2015
@bitprophet
Member Author

Given I barely contribute, I'd like to merge this but would ❤️ a simple 👍 from any other @graphite-project/committers first to make sure I'm not committing a faux pas. An emoji will do! A day or two of silence may be taken as consent :)

@JeanFred
Member

@bitprophet The rationale sounds sensible to me, and the code looks fine − but I’m not familiar enough with the codebase in general for even an emoji on this :-)

@JeanFred
Member

Wait, looks like #1480 is the same as this? (Good news is that @deniszh +1 it ;-)

@obfuscurity
Member

I prefer the bit savings in #1480 but the logic seems ok either way.

@deniszh
Member

deniszh commented Apr 20, 2016

Yep, either this or #1480; both look fine. Pick one, ditch the other, merge. 👍

@obfuscurity
Member

Merged #1480.
