Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow unpickling failures to be treated as cache misses #96

Open
slingamn opened this issue Aug 17, 2012 · 8 comments
Open

allow unpickling failures to be treated as cache misses #96

slingamn opened this issue Aug 17, 2012 · 8 comments

Comments

@slingamn
Copy link
Contributor

Consider the following scenario:

  1. One version of an application stores a Python object in memcached via pylibmc, which automatically pickles it.
  2. A new version of the application which no longer has the object's class is deployed.
  3. The new application tries to retrieve the object, resulting in an exception (for example, an AttributeError).

One natural response to this would be to treat the cache retrieval as a miss. However, implementing this against the current pylibmc behavior is somewhat awkward and difficult. Every call to _pylibmc.Client.get must be wrapped in a try-except that distinguishes _pylibmc.MemcachedError (problems with remote memcached) from other exception types (unpickling failures).

Would it make sense to submit a patch to pylibmc adding a configuration switch such that if it is enabled, _pylibmc.Client.get responds to an unpickling error by suppressing it and returning None, as in the case of a miss?

@bukzor
Copy link

bukzor commented Aug 17, 2012

This makes sense to me.
+1

@lericson
Copy link
Owner

I'm wondering if a subclass is maybe a better idea?

@bukzor
Copy link

bukzor commented Aug 19, 2012

How would you go about subclassing in order to unambiguously turn unpickling issues into cache misses?

@slingamn
Copy link
Contributor Author

The problem I ran into is that since the unpickling code is inside the C module, it's sort of inaccessible to customization.

If the issue is that another configuration variable would be too crufty, perhaps pylibmc could define a special _pylibmc.SerializationException, then catch exceptions thrown during unpickling and wrap them with SerializationException? I think the core issue here is unambiguously communicating that there was a pickling problem, rather than having to rely on a blanket except Exception:.

@bukzor
Copy link

bukzor commented Sep 27, 2012

bump

@lericson
Copy link
Owner

lericson commented Oct 2, 2012

I'm not going to wrap all exceptions in a parent exception because that sucks.

@bukzor
Copy link

bukzor commented Oct 2, 2012

Could you address my question please?

How would you go about subclassing in order to unambiguously turn unpickling issues into cache misses?

(Disclaimer: This is not a rant, I'm just trying to be very clear, and avoid confusion.)

I believe we could wrap get with a subclass method which catches all exceptions and simulates a cache miss, but I'd be worried about catching unrelated memcache errors in that case. We'd have to catch all exceptions because unpickling errors don't generate a particular exception type. In the general case it's just calling a function with the wrong data, so you'd often get TypeError, KeyError, or AttributeError (which we're currently getting in our test environment).

Even if we disregarded that issue, get_multi presents a separate issue: an unpickling failure for one key causes no value at all (a NULL pointer) to be propagated back. Even if we wrapped it with a bare try: except:, we wouldn't be able to generate the "correct" return value in any reasonable way. For example if we had a get_multi([1,2,3]) and the value for 2 resulted in an unpickling error, we'd still want to return the key/value pairs for 1 and 3.

So I wonder how we'd do this with subclassing.

I think we solve the above issues in a fairly neat way if we simply refactor _PylibMC_parse_memcached_value and/or _PylibMC_Unpickle to be override-able class methods. That way we know exactly which errors we're suppressing, and can get the desired behavior from get_multi in a natural way. Incidentally, it would also give us a good spot to send messages to our python-based logging system when we have this type of cache miss, so that we can know if they get out of hand.

Would you accept this kind of patch? Can you give tips on the best way to do it?

@bukzor
Copy link

bukzor commented Oct 2, 2012

Incidentally, exposing overridable methods solves #75 in a much neater way: people that prefer json can override their client to use json for serialization, assuming that you similarly refactor _PylibMC_SerializeValue to be a Client method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants