
Django-q calls task twice or more #183

Closed
Eagllus opened this issue Jul 14, 2016 · 10 comments

Comments

@Eagllus
Collaborator

Eagllus commented Jul 14, 2016

My background task is called twice (or more), but I'm quite sure that should not be happening.
My settings for Django Q:

Q_CLUSTER = {
    'name': 'cc',
    'recycle': 10,
    'retry': -1,
    'workers': 2,
    'save_limit': 0,
    'orm': 'default'
}

My test task function:

def task_test_function(email, user):
    print('test')

Calling it from the command line:

> python manage.py shell
>>> from django_q.tasks import async
>>> async('task_test_function', 'email', 'user')
'9a0ba6b8bcd94dc1bc129e3d6857b5ee'

Starting qcluster (after that, I called async):

> python manage.py qcluster
13:48:08 [Q] INFO Q Cluster-33552 starting.
...
13:48:08 [Q] INFO Q Cluster-33552 running.
13:48:34 [Q] INFO Process-1:2 processing [mobile-utah-august-indigo]
test
13:48:34 [Q] INFO Process-1:1 processing [mobile-utah-august-indigo]
test
13:48:34 [Q] INFO Processed [mobile-utah-august-indigo]
13:48:34 [Q] INFO Processed [mobile-utah-august-indigo]
...

And the function is called twice...
For most functions I wouldn't really care if they ran twice (or more), but I have a task that calls send_mail, and people who are invited receive two or more mails...

Is this a bug in Django Q or in my logic?

@qb1989

qb1989 commented Jul 21, 2016

You know that:

async('task_test_function', 'email', 'user')

is already starting the task, right?

If you first want to set up a task and run it manually later, you must use the Async class and its run() method.

e.g.:

test_task = Async('task_test_function', 'email', 'user')
test_task.run()

I ask because you wrote:

Starting qcluster (after that, I called async)

which looks like you started the task twice.

@Eagllus
Collaborator Author

Eagllus commented Jul 22, 2016

I can see that the task is queued in the django_q_ormq table, but when the qcluster starts the task I see output showing that it's called multiple times.

On our server django-q is running non-stop (managed by circus), so I would assume that if I make an async call the task will be queued. The qcluster checks the queue every few seconds and handles the task. I use something almost identical to this example: http://django-q.readthedocs.io/en/latest/examples.html
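
A minimal sketch of the enqueue-then-process pattern described above (the module path and task name are illustrative, not taken from the linked example):

# myapp/tasks.py -- the worker process imports this
def notify_user(email, user):
    # the real work (e.g. send_mail) runs here, inside a qcluster worker
    print('notifying', email)

# anywhere in the web process (view, management command, or shell):
from django_q.tasks import async

# enqueue once; the already-running qcluster picks it up on its next poll
async('myapp.tasks.notify_user', 'user@example.com', 'user')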

@cyliang

cyliang commented Aug 14, 2016

I've been facing a similar problem: some of my long tasks are processed again and again, even after being successfully processed.

I eventually resolved this by configuring a longer retry time.

According to http://django-q.readthedocs.io/en/latest/configure.html#retry, retry is

The number of seconds a broker will wait for a cluster to finish a task, before it’s presented again.

So I think that's why my tasks were pushed into the queue again after 60 seconds (the default retry time) of unfinished execution and processed repeatedly.

I'm not sure if this helps you, but I think there may be others like me confused by a similar problem.

Sorry for my poor English.

@Koed00
Owner

Koed00 commented Aug 14, 2016

The ORM broker is a bit experimental and unfortunately not as stable as the other brokers. It might be a problem with your specific database type: none of you mention what your database backend is, and it may be that it is very slow and tasks are not locked fast enough. Let me know what backend you're using; maybe this needs to be tested for specifically.

I'll look into making the dequeue function lock better, but in the meantime you can experiment by adjusting the poll setting in your configuration. The default 200ms might just be too fast, and you may need to set it to 1 second or even slower to get a reliable lock.
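
A hedged sketch of that poll adjustment, reusing the configuration from the report (the 1-second value is only an example, not a recommendation from the docs):

Q_CLUSTER = {
    'name': 'cc',
    'workers': 2,
    'orm': 'default',
    'poll': 1,  # seconds between dequeue attempts; the default is 0.2
}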

@cyliang

cyliang commented Aug 14, 2016

I've tried the AWS SQS broker and the ORM broker, with MySQL and PostgreSQL as database backends. The problem occurs with all of these brokers.

I found out it is the retry setting that makes my task get taken by the cluster again and again. Once the task is finished and unlocked, it is pushed to a worker again. As a result, I resolved my problem by setting retry to a larger value. So I'm wondering if the original poster can also resolve the problem by re-configuring his retry setting.

@Koed00
Owner

Koed00 commented Aug 14, 2016

Yes, if your tasks will be long-running, you should set your retry setting to a number higher than the time you expect the longest-running task to take.
Alternatively, you can set this on individual long-running tasks in the kwargs.

I personally run most of my servers with the SQS broker, because it's cheap and reliable.
You have to take the time to set up your timeouts and retries in a way that works with your particular setup.

Maybe I should set the global default for retry to a higher number, to accommodate the problems you are experiencing and prevent them from happening to new users.
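
If the per-task override mentioned above refers to the timeout keyword accepted by async(), a hedged sketch could look like this (the task path and value are illustrative):

from django_q.tasks import async

# give a single long-running task a generous timeout of its own;
# retry itself remains a cluster-level setting in Q_CLUSTER
async('myapp.tasks.build_report', 42, timeout=900)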

@510908220

I use the ORM broker, and I also encountered the same problem when a task's execution time is longer than Conf.RETRY.

How about adding an option such as no_ack? When no_ack=True is set, dequeue would just remove the task from the broker.

@Eagllus
Collaborator Author

Eagllus commented Aug 29, 2016

I had my retry set to -1 and have now changed it to a higher value (1000000, for testing), and
this seems to have fixed my problem. 😄

I will run this in production and check whether it solves the problem there as well.

@jordanmkoncz

I was having the same problem as you @Eagllus. I had a task which would run for about 180 seconds, so I had configured a timeout of 600 seconds to accommodate this task, but I still had my retry configured to 120 seconds (like it's configured in the ORM example at http://django-q.readthedocs.io/en/latest/configure.html#orm). As a result, my task would get picked up by a worker, then 120 seconds later the same task would be picked up by a second worker, and then both workers would successfully process the task (in my case this means generating a report twice and sending it by email twice, which is obviously not what I want).

Configuring my retry to be longer than my timeout (I made it 60 seconds longer) resolved the issue for me too. I don't feel like it's obvious from the docs right now that django-q will behave the way it does if you have your retry configured to be shorter than your timeout and have a task which will take longer to process than your retry time. If someone hadn't already read through the comments on this issue, I feel like it would be easy to misinterpret the docs for the retry setting.

I'm struggling to imagine a scenario where you'd actually want to have retry configured to be shorter than your timeout and get the duplicate task processing behaviour that occurs as a result. I assume there is a scenario where this would make sense, but it seems like for the majority of use cases you'd want to make sure your retry is longer than your timeout so that you can avoid this duplicate task processing behaviour. Maybe the docs for the retry setting could be updated to explain all of this, i.e. explain how django-q will behave when retry is shorter or longer than timeout, and recommend configuring it to be longer for most use cases.
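
A hedged version of the configuration described in this comment, with retry set 60 seconds longer than timeout (the exact values are illustrative):

Q_CLUSTER = {
    'name': 'cc',
    'workers': 2,
    'timeout': 600,  # longest run time you expect from any task
    'retry': 660,    # the broker re-presents a task only after this many seconds
    'orm': 'default',
}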

@Eagllus
Collaborator Author

Eagllus commented Dec 12, 2017

@jordanmkoncz thank you for the update! I had set up my configuration a while ago and it was never a problem, until a connection timeout was triggered (triggering the 'bug' in my logic) and I ended up with more than 100k copies of the same task.

I updated my configuration and this indeed solved the problem!

Eagllus closed this as completed Dec 12, 2017
janneronkko added a commit to janneronkko/django-q that referenced this issue Feb 6, 2019
This issue has been reported many times in Django-Q's issue tracker:
Koed00#183
Koed00#180
Koed00#307

All these issues have been closed, and the responses have noted that retry should
be set larger than the timeout or the duration of any task.
SchrodingersGat added a commit to SchrodingersGat/InvenTree that referenced this issue Oct 5, 2023
According to this thread, should reduce multiple workers taking the same task:

Koed00/django-q#183 (comment)
SchrodingersGat added a commit to inventree/InvenTree that referenced this issue Oct 6, 2023
* Improved handling of race condition when saving setting value

* Improvements for managing pending migrations

- Inform user if there are outstanding migrations
- reload plugin registry (if necessary)

* Increase django-q polling time

According to this thread, should reduce multiple workers taking the same task:

Koed00/django-q#183 (comment)

* Revert default behavior

* Better logging

* Remove comment

* Update unit test

* Revert maintenance mode behaviour

* raise ValidationError in settings