
rpc backend doesn't seem to report message status correctly. #4084

Open · mke21 opened this issue Jun 13, 2017 · 25 comments
@mke21 commented Jun 13, 2017

This is with version 4.0.2.
If I use RabbitMQ as the broker, the rpc result backend, and a custom queue, the message's status never seems to change and stays 'PENDING', even though the worker's log reports the task as successfully executed; only when I call get() does the status change to 'SUCCESS'. When I change the backend to amqp, the system works as expected, reporting SUCCESS before get() is called. The redis backend doesn't show this problem either, so it seems to be rpc-specific.

Note that when no custom queue is set, i.e. when using the default queue, the rpc backend also works as expected, just like the other backends!

This is in my tasks.py:

# tasks.py
from celery import Celery

app = Celery(
    'tasks',
    broker='amqp://',
    backend='rpc://',  # here I use the rpc backend
)

@app.task
def test_task(s):
    return s

I start this with `celery -A tasks worker --loglevel=info -Q myqueue`.

On the other side I do:

>>> import tasks
>>> result = tasks.test_task.apply_async(queue='myqueue', args=['blaat',])  # use the custom queue

>>> result.ready()  # the worker has already executed the task, so this should be True
False
>>> result.status  # stays PENDING with the rpc backend, even though ready() should already be True
'PENDING'
>>> print(result.get())
blaat
>>> result.ready()
True
>>> result.status
'SUCCESS'

If I start the worker without the -Q option (`celery -A tasks worker --loglevel=info`) I get:

>>> import tasks
>>> result = tasks.test_task.apply_async(args=['blaat',])  # no custom queue

>>> result.ready()  # the worker has already executed the task, so this should be True
True
>>> result.status  # the previous call worked as expected
'SUCCESS'
>>> print(result.get())  # SUCCESS was already reported before the get()
blaat
>>> result.ready()
True
>>> result.status
'SUCCESS'
@jpescalona commented Jul 26, 2017

I've been able to reproduce this issue with:

  • Celery 4.0.2
  • result_backend = 'rpc://'

For me it starts failing when I execute a group task before the individual tasks. In any case, GroupResult.ready() does not return True even when all subtasks are ready; only when I call GroupResult.get() is the GroupResult marked as ready.

In [1]: from myapp.tasks import invalidate_cache
In [2]: r = invalidate_cache.apply_async(queue='management')

In [3]: r.state
Out[3]: u'STARTED'

In [4]: r.state
Out[4]: u'STARTED'

In [5]: r.state
Out[5]: u'STARTED'

In [6]: r.ready()
Out[6]: True

In [7]: r.ready()
Out[7]: True

In [8]: r.state
Out[8]: u'SUCCESS'

In [9]: from myapp.tasks import invalidate_cache

In [10]: from celery import group

In [11]: g = group([invalidate_cache.s().set(queue='management')]).apply_async()

In [12]: g.ready()
Out[12]: False

In [13]: g.ready()
Out[13]: False

In [14]: g.ready()
Out[14]: False

In [15]: g.ready()
Out[15]: False

In [16]: g.ready()
Out[16]: False

In [17]: g.get()
Out[17]: [None]

In [18]: g.ready()
Out[18]: True

In [19]: r = invalidate_cache.apply_async(queue='management')

In [20]: r.state
Out[20]: u'PENDING'

In [21]: r.state
Out[21]: u'PENDING'

In [22]: r.state
Out[22]: u'PENDING'

In [23]: r.get()

In [24]: r.state
Out[24]: u'SUCCESS'

No matter which queue you define or how you start the celery worker, it starts failing once a group task has been executed.

@jpescalona commented:

After digging for a while, I've discovered that this problem reproduces when using rpc as the result backend instead of amqp. Maybe the rpc result backend is broken?

@thedrow (Member) commented Jul 27, 2017

This sounds like a bug. Thanks for the report.
Did any of you try doing the same with master?

@jpescalona commented:

I've tried against 4.0.2, 4.1.0 and master, and for all of them the rpc backend does not update the task status correctly.

@donkopotamus (Contributor) commented Sep 6, 2017

I have also observed something similar to this issue: despite a worker being configured to send a STARTED message (via task_track_started), and having confirmed (by inserting a breakpoint) that the message is indeed sent, the rpc backend does not appear to process it, with AsyncResult.state moving only from PENDING to SUCCESS.

Currently I have traced the issue to this piece of code in celery.backends.async:BaseResultConsumer:

    def on_state_change(self, meta, message):
        if self.on_message:
            self.on_message(meta)
        if meta['status'] in states.READY_STATES:    ### <===== HERE
            task_id = meta['task_id']
            try:
                result = self._get_pending_result(task_id)
            except KeyError:
                # send to buffer in case we received this result
                # before it was added to _pending_results.
                self._pending_messages.put(task_id, meta)
            else:
                result._maybe_set_cache(meta)
                buckets = self.buckets
                try:
                    # remove bucket for this result, since it's fulfilled
                    bucket = buckets.pop(result)
                except KeyError:
                    pass
                else:
                    # send to waiter via bucket
                    bucket.append(result)
        sleep(0)

I have confirmed that the STARTED message is processed by this on_state_change handler, but since STARTED is not in states.READY_STATES it is never acted on in any way. Other non-ready states will likewise never be processed.
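On the client side this also means intermediate states can still be observed through the on_message callback that this handler invokes first, even though AsyncResult.state itself never reflects them with the rpc backend. A minimal sketch, assuming the tasks.py app from the original report and task_track_started enabled on the worker:

from tasks import test_task

def track(meta):
    # Receives every state-change meta dict, including STARTED, because
    # on_state_change hands meta to on_message before the READY_STATES
    # check filters out non-ready states.
    print(meta['status'], meta.get('result'))

result = test_task.apply_async(queue='myqueue', args=['blaat'])
value = result.get(on_message=track)  # prints STARTED, then SUCCESS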

@auvipy (Member) commented Dec 19, 2017

Could you also check the latest master and report back?

@auvipy auvipy added this to the v4.2 milestone Dec 19, 2017
@auvipy auvipy modified the milestones: v4.2, v5.0.0 Jan 13, 2018
@auvipy auvipy modified the milestones: v5.0.0, v4.3 May 27, 2018
@xirdneh (Member) commented Jun 5, 2018

Somebody reported this as an issue on IRC.
I can confirm that the state and ready() values are not correct until you call result.get() when using rpc as the result backend.
This is with the latest master.

@xirdneh (Member) commented Jun 5, 2018

Another update from my testing: apparently it is now working for single tasks, but the same problem arises when using groups.

>>> res = group(add.s(3,4), add.s(5,7)).delay()
>>> for r in res.results:
...     print(r.state)
PENDING
PENDING

This might have to be a new issue...

@auvipy auvipy modified the milestones: v4.3, v5.0.0 Jan 8, 2019
@codeSamuraii commented:

I'm having a similar issue.

I have a group of tasks that executes correctly when using the amqp backend. When switching to rpc, the group task never finishes and blocks on the group_result.get() call.
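Until the underlying problem is fixed, one defensive measure is to bound the wait so a lost rpc reply doesn't block the caller indefinitely. A sketch, with add standing in for your own task signature:

from celery import group
from celery.exceptions import TimeoutError

res = group(add.s(2, 2), add.s(4, 4)).apply_async()
try:
    values = res.get(timeout=30)  # raise instead of blocking forever
except TimeoutError:
    # With the rpc backend the group may still read as PENDING here even
    # though the workers finished; treat it as a lost result rather than
    # a slow task, and retry, re-query, or log instead.
    values = None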

@auvipy auvipy modified the milestones: v5.0.0, 4.6 Jul 6, 2019
@pyDev20 commented Jul 18, 2019

I encountered an issue that might be related, in version 4.3.0.

I have a simple celery application with two tasks, a_func() and b_func(). After starting the celery worker, I call a_func.apply_async(), and a_func, while running on the worker, calls b_func.apply_async().

When using 'amqp://' as the backend everything works well. However, when using 'rpc://' as the backend, I have problems. I am trying to get the state and the return value of the tasks. For the a_func() task there is no problem, but for b_func() I get state = 'PENDING' forever, and get() hangs forever.

I am using celery 4.3.0, rabbitmq 3.5.7 as the broker, python 2.7, and ubuntu 16.04 LTS.

Worker command:

celery -A celery_test worker --loglevel=info

Celery application:

app = Celery('my_app', backend='rpc://', broker='pyamqp://guest@localhost/celery', include=['tasks'])

a_func and b_func tasks:

@task
def a_func():
    print "A"
    b_func.apply_async()
    return "A"

@task
def b_func():
    print "B"
    return "B"
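One way to sidestep the nested dispatch above is to let the client compose the workflow with a chain, so the result being waited on was created by the same process that waits on it. A sketch under that assumption, reusing a_func and b_func:

from celery import chain
from tasks import a_func, b_func  # assumes the tasks module from the app's include=

# The client builds the whole workflow up front instead of having a_func
# spawn b_func internally; .si() makes each signature immutable so b_func
# does not receive a_func's return value as an argument.
workflow = chain(a_func.si(), b_func.si())
result = workflow.apply_async()
print(result.get())  # waits on b_func, the final task in the chain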

@amard33p commented:

+1. This is probably due to #4830, which has a fix in the works.
Changing the backend to redis (since amqp will be deprecated) solves the issue.

@auvipy auvipy modified the milestones: 4.6, 4.4.x Dec 16, 2019
@auvipy auvipy modified the milestones: 4.4.x, 5.3 Feb 18, 2021
@shaloba commented Jun 6, 2021

Just encountered this issue and noticed that it is still open. Any news?
Thanks.

@thedrow (Member) commented Jun 6, 2021

It may be a limitation of the RPC backend since it doesn't store state.
We might be able to work around this by storing the task state elsewhere.

I think this issue might only be fixed after our NextGen architecture refactoring.
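In the meantime, the workaround that keeps coming up in this thread is to configure a backend that actually stores state, so .state and .ready() can be answered from any process at any time. A minimal sketch, assuming a local Redis instance:

from celery import Celery

app = Celery(
    'tasks',
    broker='amqp://guest@localhost//',
    # Unlike rpc://, this backend persists each task's state, so polling
    # works from any process and intermediate states are not lost.
    backend='redis://localhost:6379/0',
)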

@Pxeba commented Aug 26, 2021

On my rabbitmq server the task shows as completed; however, I always get PENDING, and if I call get() my service freezes. The interesting thing is that it works locally, but once my service sends tasks to a rabbitmq on another server these problems start. I'm thinking about replacing the backend.

@thangdc94 commented:

I faced the same issue today on celery 5.1.2.

Everything works fine on my Windows PC, but when I deployed the code on a Linux server, GroupResult.get() got stuck forever.

I found that using celery 5.0.0 solves the issue on my Linux server.

Hope this information helps you debug the issue.

@auvipy (Member) commented Oct 13, 2021

Then it might be a regression; can you try 5.2.0rc1 and report back?

@thangdc94 commented:

Then it might be a regression; can you try 5.2.0rc1 and report back?

I tried celery 5.2.0rc1 and still faced the same problem.
I installed many older versions and found that the issue begins at 5.0.3.
I also tested version 4.4.7 and had no issue.

@auvipy (Member) commented Oct 13, 2021

thanks for clarifying

@Venkatesh283 commented Jun 29, 2022

Hi,
I'm facing the same issue with celery versions 4.4.5, 4.4.7, 5.1.2, 5.2.0rc1 and 5.2.7.
The worker is configured with concurrency = 10 / prefork pool.
Broker: amqp
Backend: rpc
Python 3.8 and Ubuntu 20.04.1 LTS.

When a single task is added at a time, it works fine: AsyncResult.state is updated to SUCCESS.
But when multiple tasks are added in parallel (5 or more at a time), the workers complete the tasks successfully, yet AsyncResult.state stays PENDING even after a task has completed, and AsyncResult.get() gets stuck forever.

Any fixes or updates?

Thanks in advance.

@Mortaza-Seydi commented:

Hi @Venkatesh283, I have the same problem: .get() gets stuck forever and .state is PENDING, although the worker logs show that the task succeeded with a return value.
celery = 5.2.7
python = 3.8.10
CELERY_BROKER = 'amqp://guest@localhost//'
CELERY_BACKEND = 'rpc://'

@extratrees commented:

Same problem here.
Using celery==5.2.7, python==3.8.14
CELERY_BROKER = 'amqp://guest@localhost//'
CELERY_BACKEND = 'rpc://'

I will try to switch to Redis, but this adds a layer of complexity.

@loongmxbt commented Jan 9, 2023

Same issue. Querying the same task ID (which has already completed) returns different results:

5|energysi | PENDING
5|energysi | PENDING
5|energysi | SUCCESS
5|energysi | {'status': 'success', 'message': 'Computation service: result data uploaded successfully, name: test, year: None, caseroom: 54, caseinfo: 133', 'result_id': 70}
5|energysi | PENDING
5|energysi | PENDING
5|energysi | SUCCESS
5|energysi | {'status': 'success', 'message': 'Computation service: result data uploaded successfully, name: test, year: None, caseroom: 54, caseinfo: 133', 'result_id': 70}
5|energysi | PENDING
5|energysi | PENDING
5|energysi | PENDING
5|energysi | SUCCESS
5|energysi | {'status': 'success', 'message': 'Computation service: result data uploaded successfully, name: test, year: None, caseroom: 54, caseinfo: 133', 'result_id': 70}
5|energysi | PENDING
5|energysi | PENDING

This seems to be related to how Flask is run: the development server works fine.

if __name__ == "__main__":
    app.run(port=PORT)

However, when the server is started with gunicorn and multiple workers, the problem occurs:

gunicorn --bind 127.0.0.1:8091 -w 4 "flask_pyapi:app"

With -w 1 there is no problem:

gunicorn --bind 127.0.0.1:8091 -w 1 "flask_pyapi:app"
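That pattern fits the documented design of the rpc backend: results are sent as messages to a reply queue tied to the client that sent the task, so only that same process can consume them. With -w 4 the request that polls the state often lands on a different gunicorn worker than the one that submitted the task. A small sketch of the per-process identity (assuming app.oid, which the rpc backend uses to name the reply queue, as in Celery 4/5):

import os
from celery import Celery

app = Celery('tasks', broker='amqp://', backend='rpc://')

# Each gunicorn worker is a separate forked process with its own app
# instance; printing this in every worker shows a different oid per pid,
# i.e. a different reply queue, so a state change delivered to one worker
# is invisible to the others.
print(os.getpid(), app.oid)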

@Sushanti99 commented:

Hey. I've been meaning to switch to the rpc backend for better support of monitoring tools like Flower. Is it still not safe to upgrade to 5.x and use the rpc backend? (Currently using the amqp backend, but that's deprecated in 5.x.)

@Nusnus Nusnus removed this from the 5.3 milestone Feb 19, 2023
@bskqd commented May 19, 2023

Same problem.
celery==5.2.7 and 5.3.0rc1, python==3.10
CELERY_BROKER = 'amqp://guest@localhost//'
CELERY_BACKEND = 'rpc://'

Task statuses are always PENDING and .get() hangs forever.

@auvipy auvipy added this to the 5.3.x milestone May 21, 2023
@buildxyz-git commented Jun 20, 2023

Same issue here, using:

celery==5.3.1 and python==3.10
CELERY_BROKER = 'amqp://guest@localhost//'
CELERY_BACKEND = 'rpc://'

This issue seems to have been lurking for years. I went down the RabbitMQ hole because it seemed to be the recommended celery backend at the time, but now it seems the solution is to switch over to another backend like Redis?

Colelyman added a commit to edilytics/CRISPResso2 that referenced this issue Sep 21, 2023
The RPC backend (using RabbitMQ) seems to have a bug that hasn't been fixed in 4 years (celery/celery#4084) where the states of the tasks aren't properly updated. This makes redirecting to the results page difficult, if not impossible. Using Redis, we do not encounter these issues.