None result obtained from cache #9

Merged: 1 commit merged into peterbe:master Dec 3, 2018

Conversation

@poleha
Contributor

poleha commented Oct 12, 2018

If the cached function returns None, that's a result too, and it should be distinguished from a cache miss.
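The ambiguity the PR fixes can be sketched without Django at all. Below is a minimal, hypothetical memoizer (a plain dict stands in for Django's cache backend) showing that a lookup which uses None as its miss signal cannot tell a stored None apart from a missing key:

```python
# Sketch of the bug: None as the "miss" signal swallows legitimate None results.
cache = {}

def get_or_compute(key, compute):
    value = cache.get(key)  # returns None for a miss...
    if value is None:       # ...but also for a cached None!
        value = compute()
        cache[key] = value
    return value

calls = []

def expensive():
    calls.append(1)
    return None  # a perfectly legitimate result

get_or_compute("k", expensive)
get_or_compute("k", expensive)
# expensive() ran twice: the cached None was mistaken for a cache miss.
assert len(calls) == 2
```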

@codecov

codecov bot commented Oct 12, 2018

Codecov Report

Merging #9 into master will not change coverage.
The diff coverage is 100%.


@@          Coverage Diff          @@
##           master     #9   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files           1      1           
  Lines          39     41    +2     
=====================================
+ Hits           39     41    +2
Impacted Files Coverage Δ
src/cache_memoize/__init__.py 100% <100%> (ø) ⬆️


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 062c710...4cc0abc. Read the comment docs.

@peterbe

I'm sorry for not having noticed your PR sooner. Bloody GitHub notifications!

I look forward to this change. Can you take a look at some of the feedback? Also, you're going to need to merge in master (or rebase) so you pick up the new cache_key = ... functionality that has landed in master.

THANK YOU!

from django.utils.encoding import force_text, force_bytes
NONE_PLACEHOLDER = '___NONE_PLACEHOLDER___'

@peterbe

peterbe Nov 3, 2018

Owner

Instead of an odd string, can't this be NONE_PLACEHOLDER = object()?

@peterbe

peterbe Nov 3, 2018

Owner

Also, if you use object() instead, let's call it simply MARKER instead of NONE_PLACEHOLDER since that lingo is common for patterns like this.

@poleha

poleha Nov 5, 2018

Contributor

I am sorry, I don't see how it can work with object().
Please consider this code:


from django.core.cache import cache

MARKER = object()

cache.set('marker', MARKER)

print(id(MARKER))
# 140275214923040

marker = cache.get('marker')

print(id(marker))
# 140275214923056

print(marker == MARKER)
# False

print(id(marker) == id(MARKER))
# False

print(marker is MARKER)
# False

The point of NONE_PLACEHOLDER is to cache a value that means None but isn't None. I can't see how we can achieve this with object().

@peterbe

peterbe Nov 5, 2018

Owner

The trick is to never do cache.set('mykey', MARKER).

The core of the idea is this:

marker = object()
value = cache.get('key', marker)
if value is marker:
    # the key was not in the cache
    value = compute_value()
    cache.set('key', value)
return value

If cache was a pure Python dict you wouldn't need this because you could do:

if 'key' in cache_dict:
    value = cache_dict['key']
else:
    value = compute_value()
    cache_dict['key'] = value
return value

So that's, potentially, two read queries from the data structure. One to ask "Do you even have this key?" and one to ask "Alright, what's the value of this key?". If the network latency matters, it's oftentimes more performant to send only 1 read query and deal with consequences if it comes back empty.

But, even with a pure Python dict, it's sometimes pleasant to use the .get() method because you only need 1 branch:

marker = object()
value = cache_dict.get('key', marker)
if value is marker:
    # the key was not in the cache dict
    value = compute_value()
    cache_dict['key'] = value
return value

In the last example, suppose you didn't use the "marker technique"...

value = cache_dict.get('key')
if value is None:
    # the key was PROBABLY not in the cache dict
    value = compute_value()
    cache_dict['key'] = value
return value

...then, what if compute_value() genuinely returns None? What would happen is that the caching would not protect compute_value(), and protecting it is exactly what you want to achieve.
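Putting the pieces of the explanation above together, here is a self-contained sketch (plain dict instead of Django's cache; `get_or_compute` is a hypothetical helper, not library code) showing that the sentinel makes a stored None indistinguishable from any other cached result, while misses stay unambiguous:

```python
# The "marker technique": a unique object() is the miss signal,
# and it is never stored in the cache, so it can never be confused
# with a real cached value -- including a cached None.
cache = {}
MARKER = object()

def get_or_compute(key, compute):
    value = cache.get(key, MARKER)
    if value is MARKER:        # unambiguous cache miss
        value = compute()
        cache[key] = value     # None is stored and honored
    return value

calls = []

def expensive():
    calls.append(1)
    return None

assert get_or_compute("k", expensive) is None
assert get_or_compute("k", expensive) is None
assert len(calls) == 1  # second call was served from the cache
```

The key point is that `MARKER` only ever exists as the *default* of the read; unlike poleha's earlier example, nothing round-trips it through the cache backend, so identity (`is`) comparison is safe.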

@poleha

poleha Nov 5, 2018

Contributor

Makes sense, that'll definitely make code look better. Will update PR tomorrow.

if miss_callable:
    miss_callable(*args, **kwargs)
elif hit_callable:
    hit_callable(*args, **kwargs)
if result == NONE_PLACEHOLDER:
    result = None
return result

@peterbe

peterbe Nov 3, 2018

Owner

I think all of the above can be simplified now. Like this:

def inner(*args, **kwargs):
    cache_key = _make_cache_key(*args, **kwargs)
    if kwargs.pop('_refresh', False):
        result = NONE_PLACEHOLDER
    else:
        result = cache.get(cache_key, NONE_PLACEHOLDER)
    if result is NONE_PLACEHOLDER:
        result = func(*args, **kwargs)
        ...
    elif hit_callable:
        ...

@poleha

poleha Nov 5, 2018

Contributor

The whole idea of my PR is to store something instead of None and distinguish it from the None we get when the key is absent. I'm sorry, but I don't see how we can achieve this with this simplified variant.

@peterbe

peterbe Nov 5, 2018

Owner

See comment above.

result = runmeonce(20)
assert len(calls_made) == 1
assert result is None
result = runmeonce(20)

@peterbe

peterbe Nov 3, 2018

Owner

Why this line?

@poleha

poleha Nov 5, 2018

Contributor

To ensure that on a repeated call we get the same result from the cache as on the first. We can actually remove the extra runmeonce(20) line; it's just to reassure us that even on three calls we're still OK.

@peterbe

peterbe Nov 5, 2018

Owner

If it works for 2 it should always work for 3 too. And 4. And 5.

@poleha

poleha Nov 5, 2018

Contributor

Agreed. Will update PR tomorrow.

@poleha poleha force-pushed the poleha:none_placeholder branch from da4fc0f to 4cc0abc Nov 5, 2018

@peterbe peterbe merged commit 3bd4ba8 into peterbe:master Dec 3, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
@peterbe

Owner

peterbe commented Dec 3, 2018

I am so sorry for not noticing that you had pushed more code to this PR. And it was 28 days ago!
I blame GitHub :)
I wish there was a way to get notifications about commits to a PR just like you get notifications about comments.

@poleha

Contributor

poleha commented Dec 3, 2018

No problem. Thank you for suggesting how to improve my initial approach.

peterbe added a commit to mozilla-services/tecken that referenced this pull request Dec 4, 2018

Update django-cache-memoize to 0.1.5 (#1393)
This PR updates [django-cache-memoize](https://pypi.org/project/django-cache-memoize) from **0.1.3** to **0.1.5**.



<details>
  <summary>Changelog</summary>
  
  
   ### 0.1.5

   - Fix when using ``_refresh=False`` and the ``.invalidate()``.
     `pull19 <https://github.com/peterbe/django-cache-memoize/pull/19>`_
   
  
  
   ### 0.1.4

   - Ability to have the memoized function return ``None`` as an actual result.
     `pull9 <https://github.com/peterbe/django-cache-memoize/pull/9>`_
   
  
</details>


 

<details>
  <summary>Links</summary>
  
  - PyPI: https://pypi.org/project/django-cache-memoize
  - Changelog: https://pyup.io/changelogs/django-cache-memoize/
  - Repo: https://github.com/peterbe/django-cache-memoize
</details>