Refactor redis cache #430
Conversation
sentinel = Sentinel([(redis_host, 26379)], password=redis_pass, socket_timeout=5)
if read_only:
    logger.debug("Looking up read only redis slave using sentinel protocol")
    r = sentinel.slave_for('mymaster')
Since everything else is configurable, why not the sentinel master name?
Good point. I'll create another PR as it isn't directly related to this refactoring. #436
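For reference, a minimal sketch of what a configurable master name could look like. The setting name `REDIS_SENTINEL_MASTER_NAME` is illustrative only, not the project's actual config key; the follow-up PR may do this differently.

```python
from redis.sentinel import Sentinel

# Hypothetical: read the master service name from config instead of hard-coding it.
master_name = getattr(config, 'REDIS_SENTINEL_MASTER_NAME', 'mymaster')

sentinel = Sentinel([(redis_host, 26379)], password=redis_pass, socket_timeout=5)
if read_only:
    logger.debug("Looking up read only redis slave using sentinel protocol")
    r = sentinel.slave_for(master_name)
else:
    r = sentinel.master_for(master_name)
```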
Get a connection to the master redis service.

"""
logger.debug("Connecting to redis")
This debug message could also show the redis server address and port.
Sure, done.
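A one-line sketch of the suggested change, assuming `redis_host` is the configured sentinel address and 26379 the sentinel port used above:

```python
# Sketch only: include the address and port in the debug message.
logger.debug("Connecting to redis sentinel at {}:{}".format(redis_host, 26379))
```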
def set_deserialized_filter(dp_id, python_filters):
    if len(python_filters) <= config.MAX_CACHE_SIZE:
        logger.debug("Pickling filters and storing in redis")
        key = 'clk-pkl-{}'.format(dp_id)
You rely on the implicit assumption that dp_id will be unique throughout the server's life. This might be true now, but might change in the future. Better to adopt a more defensive strategy here.
Fair comment, this could instead be stored using the clk's upload token. With our default config this entire encodings.py cache isn't actually used, so I'd prefer a good offense as the best defense and remove the whole thing.
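If the cache were kept rather than removed, a more defensive variant might key on the upload token instead of dp_id. This is a hypothetical sketch: `upload_token` and `connect_to_redis` are illustrative names, and `config` and `logger` are assumed to be the module-level objects used elsewhere in the cache module.

```python
import pickle

def set_deserialized_filter(upload_token, python_filters):
    # Hypothetical: use the clk upload token (assumed unique per upload)
    # as the cache key rather than dp_id.
    if len(python_filters) <= config.MAX_CACHE_SIZE:
        logger.debug("Pickling filters and storing in redis")
        key = 'clk-pkl-{}'.format(upload_token)
        redis_client = connect_to_redis()  # assumed helper in the cache module
        redis_client.set(key, pickle.dumps(python_filters))
```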
Force-pushed from 8de09a5 to f4ee78a:
Fix typo in logging config for werkzeug and jaeger_tracing
Remove propagate setting where the default is used.

Force-pushed from f4ee78a to 3c46532.
This PR refactors entityservice.cache to make it easier to understand and extend. As with many refactorings it looks more scary than it is. I've tried to clean up the import code to be more explicit about where functions from the cache module come from. One of the few (minor) changes is that at the start of a run the current number of comparisons is set to zero, instead of waiting for the first chunk of similarities to be completed:
progress_cache.save_current_progress(comparisons=0, run_id=run_id)
In preparation for some future use of the cache, the PR also extracts a new function assert_valid_run, without changing its behaviour.
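The extracted helper itself isn't shown in this conversation. A rough, hypothetical sketch of what such a validation function might look like; the actual arguments, key layout, and exception type in entityservice.cache may differ.

```python
def assert_valid_run(run_id, conn):
    # Hypothetical: verify the run is known to the cache before using it.
    # The key name and exception type here are illustrative only.
    if not conn.exists('run-{}'.format(run_id)):
        raise ValueError("Unknown run id: {}".format(run_id))
```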