Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: a more robust implementation for the AUTOCOMMIT option & make it the default on Django ≥ 1.9 #187

Open
aaugustin opened this issue Sep 5, 2016 · 7 comments

Comments

@aaugustin
Copy link
Contributor

Django's on_commit hook seems perfectly appropriate for delaying enqueue() until the database transaction commits.

Django 1.9+ maintains its own per-database-connection (database connections are thread-local) queue of functions to run when the current database transaction commits. You don't have to roll your own.

This is better than hooking to request-related signals because it deals properly with partial rollbacks in case of nested atomic blocks, etc. It works regardless of whether the project uses ATOMIC_REQUESTS and also outside of the request-response cycle.

I think it would be good to make it the default. This would be technically backwards-incompatible but it's likely a better default in the long run, given that many web developers are uncomfortable with database transactions.

The only difficulty is the interaction with multi-database setups. You might have to add a kwarg to tell which database this task depends on. Setting that kwarg to None would enqueue the task immediately. The kwarg would default to 'default'.

Please let me know if you'd like me to elaborate on why I think this is the best way to hook with Django's transaction management.

(Although I'm not the author of on_commit — that's @carljm — I'm responsible for most of the current transaction management features in Django and I'm interested in helping third party projects make the best use of them.)

@aaugustin
Copy link
Contributor Author

aaugustin commented Sep 5, 2016

Here's a custom Queue class that implements the approach I described above.

from django.db import transaction
from rq.queue import Queue


DEFAULT = object()


class OnCommitQueue(Queue):
    """
    RQ Queue that enqueues jobs only after the database transaction commits.

    As a consequence, the enqueue function doesn't return the job.

    See also https://github.com/ui/django-rq/issues/187.

    """
    def __init__(self, *args, **kwargs):
        # Drop autocommit parameter -- this class takes a different approach
        # to the same issue.
        kwargs.pop('autocommit', None)
        return super().__init__(*args, **kwargs)

    def enqueue(self, *args, **kwargs):
        on_commit = kwargs.pop('on_commit', DEFAULT)

        # Explicitly passing on_commit=None forces enqueuing the task
        # immediately. This escape hatch should be rarely needed.
        if on_commit is None:
            super().enqueue(*args, **kwargs)

        # Not passing on_commit enqueues the task after the current
        # transaction on the default database commits.
        if on_commit is DEFAULT:
            on_commit = None

        # super() without arguments must be resolved within this function.
        enqueue = super().enqueue

        transaction.on_commit(
            lambda: enqueue(*args, **kwargs),
            using=on_commit,
        )

EDIT -- fixed code sample after testing.

@aaugustin
Copy link
Contributor Author

aaugustin commented Sep 5, 2016

Note that this approach makes it impossible to return the job synchronously, since it isn't created until the database transaction commits and won't be created if the transaction fails.

depaolim added a commit to depaolim/django-rq that referenced this issue Sep 12, 2016
@depaolim
Copy link
Contributor

I built a test case to show what @aaugustin described. I hope the comments in the code make it explicative so that the test cases can be used as a base for a correct on_commit hook implementation.
Let me recall the @aaugustin worlds:

This is better than hooking to request-related signals because it deals properly with partial rollbacks in case of nested atomic blocks, etc. It works regardless of whether the project uses ATOMIC_REQUESTS and also outside of the request-response cycle.

@selwin
Copy link
Collaborator

selwin commented Sep 12, 2016

Sorry for the late response. I think this is a good idea, though I'm unlikely to have any time to work on this anytime soon. I've been super busy since my daughter was born a few weeks ago.

If someone wants to work on this, I'll be more than happy to review and merge the resulting PR.

P.S: thanks for overhauling Django's transaction management features a few releases ago @aaugustin 😃

@aaugustin
Copy link
Contributor Author

Congratulations :-)

The custom Queue class works just fine for now, no hurry!

@depaolim
Copy link
Contributor

Congratulations :-)

@selwin
Copy link
Collaborator

selwin commented Sep 12, 2016

Thanks 😁

Sent from my phone

On Sep 12, 2016, at 9:14 PM, Marco De Paoli notifications@github.com wrote:

Congratulations :-)


You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants