
Reduce likelihood of unexpected behavior by avoiding overriding of context keys #11

Open
tomage opened this issue May 14, 2021 · 1 comment
Labels
enhancement New feature or request

Comments

@tomage
Contributor

tomage commented May 14, 2021

The problem

I recently discovered a tricky little bug in our system.

We've set things up so that every request, Celery task, or manage.py invocation starts off a pghistory.context, which that particular request/task/manage.py session then uses to add context to pghistory-tracked objects.

Unfortunately, what I found in a few places was that we were running particular code multiple times in a single request, in a way where we'd be overriding a key that was previously set.

Examples

A trivial (and kinda nonsensical) example:

def increase_payout_api_endpoint(request):
    for job in Job.objects.all():
        update_payout(job)

def update_payout(job):
    pghistory.context(old_payout=job.payout)
    job.payout += 10
    job.save()

This is nonsensical in the sense that pghistory is most likely already going to track the old value of the payout in a JobEvent table, so it's really unnecessary to use pghistory.context() to store the old_payout. But that's not the point.

The point is more that update_payout() might be using pghistory.context() in some more useful way, and might have been written at a time when it was really only called once per request/task. But then someone might have started to call it in a for-loop (for example).

This is unfortunate, since the last time pghistory.context(old_payout=...) is called, that's the value that ends up on the context for the whole request, so the same value will be attached to all jobs.
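The overwrite behavior can be shown in isolation with a plain dict standing in for pghistory's thread-local metadata (purely a sketch, not pghistory code):

```python
# Stand-in for the request-wide context metadata: new keys are merged
# into one shared dict, so repeated calls with the same key silently
# overwrite earlier values.
request_context = {}

def set_context(**metadata):
    request_context.update(metadata)

for payout in [100, 110, 120]:
    set_context(old_payout=payout)  # one call per job in the loop

print(request_context)  # only the final value survives
```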

Another example where this can happen:

def some_api_endpoint(request):
    
    play_musical_note(request.POST.get('musical_note'))
    send_a_small_note(request.POST.get('text_note'))

def play_musical_note(note):
    pghistory.context(note=f"Playing note {note}")
    play(note)  # This might change Model instances tracked by pghistory

def send_a_small_note(note):
    pghistory.context(note=f"Sending small note: {note}")
    send_note(note)  # This might change Model instances tracked by pghistory

Possible solution

To track down the problematic cases, I monkey-patched pghistory.context with something like the following, and then ran our unit-tests:

orig_context = pghistory.context

class strict_context(orig_context):
    def __init__(self, **metadata):
        # Fail loudly if any of the new keys is already set in the
        # active context.
        _tracker = pghistory.tracking._tracker
        if hasattr(_tracker, 'value'):
            keys_already_set = set(metadata.keys()) & set(
                _tracker.value.metadata.keys()
            )
            if keys_already_set:
                raise RuntimeError(
                    f'The following keys have already been set in the '
                    f'pghistory.context: {keys_already_set}'
                )
        super().__init__(**metadata)

pghistory.context = strict_context

While pretty crude, it did break a bunch of unit tests in my codebase. Most of them were fairly trivial to fix, though some were harder (eager execution of Celery tasks meant that many of those broke; I fixed that by hacking in something that renews the context for those tasks, not quite an elegant fix, but enough to get the tests to pass). I also needed to adjust the HistoryMiddleware class a bit so it doesn't re-insert the user again (somewhat trivial).

If we were to introduce something that disallows re-insertion of the same key into the context, we could also introduce it gently, either by having a settings.py config that controls its default behavior, and/or by adding a special flag on the context. I.e. the __init__() signature could read something like:

def __init__(self, allow_overrides=False, **metadata):
    ...

or

def __init__(self, allow_overrides=None, **metadata):
    if allow_overrides is None:
        allow_overrides = getattr(settings, 'PGHISTORY_ALLOW_OVERRIDES_DEFAULT', False)
    ...
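Putting the pieces together, the settings-backed variant could look roughly like this self-contained sketch (the `settings` dict simulates Django settings, and `PGHISTORY_ALLOW_OVERRIDES_DEFAULT`, `Context`, and its class-level `metadata` dict are all stand-ins from this proposal, not existing pghistory API):

```python
# Simulated Django settings; in real code this would be
# getattr(settings, 'PGHISTORY_ALLOW_OVERRIDES_DEFAULT', False).
settings = {"PGHISTORY_ALLOW_OVERRIDES_DEFAULT": False}

class Context:
    metadata = {}  # stand-in for pghistory's thread-local metadata

    def __init__(self, allow_overrides=None, **metadata):
        if allow_overrides is None:
            allow_overrides = settings.get(
                "PGHISTORY_ALLOW_OVERRIDES_DEFAULT", False
            )
        already_set = set(metadata) & set(Context.metadata)
        if already_set and not allow_overrides:
            raise RuntimeError(
                f"Keys already set in the pghistory context: {already_set}"
            )
        Context.metadata.update(metadata)

Context(user="alice")
try:
    Context(user="bob")  # same key again: raises by default
except RuntimeError as exc:
    print(exc)
Context(user="bob", allow_overrides=True)  # explicit opt-in overrides
```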

Any thoughts @wesleykendall ?

@wesleykendall
Collaborator

I like the ability to have a PGHISTORY_STRICT_CONTEXT setting that doesn't allow duplicate keys. I plan on adding this as a configurable setting in a subsequent version, or at least warning when it happens. Thanks for the suggestion! Will think more about its impacts, though, and won't make it the default.

@wesleykendall wesleykendall added the enhancement New feature or request label Sep 6, 2022