dockerized worker with redis gives broken pipe at every startup #5272
I changed the start worker script: I'm not using autoscale anymore, I'm using the concurrency parameter directly.

#!/usr/bin/env bash
CELERY_DOMAIN=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
echo "${CELERY_DOMAIN}" > /home/user/celery_domain.txt
printf "Celery domain is %s\n" "$CELERY_DOMAIN"
celery worker -A centroservizi -l INFO --concurrency=4 -Q sensor -n "sensor@${CELERY_DOMAIN}"
Found a related issue in kombu: https://github.com/celery/kombu/issues/681
Could you please attach the Redis log from when the connection is severed? It could be that we're sending a large amount of data. See redis/redis-py#986, redis/redis#4888 and https://stackoverflow.com/questions/43204496/broken-pipe-error-redis.
How is this related to #4363?
Sorry, my bad: I confused the issue subject. I will remove the comment; does removing it also remove the reference on GitHub?
@thedrow No, I haven't tried it with RabbitMQ yet. I use ElastiCache on AWS, so I can't get the Redis logs. I will try to reproduce it locally, maybe with the same data in order to have the same load.
I'm quite sure it's not a size issue: all the parameters I pass are either booleans, strings, numbers, or really small dictionaries (one key, sometimes two).
Yes. |
It seems that with the new version of celery, 4.3.0rc2, this does not happen anymore. I will confirm that in the following days, when I try autoscale in production.
I confirm that I don't see that error message anymore. You can close it, @thedrow, thanks for fixing it.
Sure, no problem. |
The health check fails after a few tries, in about 5 minutes, and when I ping all the workers I don't find this one; I do find the other ones, by the way.
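For reference, a broadcast ping in Celery returns a list of per-node reply dicts, so "not finding" a worker means its node name is absent from the replies. A minimal sketch of that check, assuming the documented reply shape (`node_alive` is a hypothetical helper, and the node names here are made up):

```python
# Sketch: check whether a specific worker answered a broadcast ping.
# app.control.ping() in Celery returns replies shaped like
# [{'sensor@abc123': {'ok': 'pong'}}, ...] -- one dict per responding node.

def node_alive(ping_replies, node_name):
    """Return True if node_name appears among the ping replies."""
    return any(node_name in reply for reply in ping_replies)

# Example with a canned reply list (no broker needed):
replies = [{'sensor@abc123': {'ok': 'pong'}}, {'worker2@host': {'ok': 'pong'}}]
print(node_alive(replies, 'sensor@abc123'))   # → True
print(node_alive(replies, 'sensor@missing'))  # → False
```

A health check script built this way fails exactly when the worker's own node name stops appearing in the ping replies.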
I have three dockerized workers on Amazon ECS. Two of them have light work to do and don't depend on a VPN, so they run on Fargate; the one that gives me problems runs on EC2, using host networking mode, on a server with a VPN. The other two workers don't have the same startup problem.
This worker has a beat schedule that makes it update some sensor statuses every minute. It uses group calls to do the work in parallel.
Report with part of the Django settings
Part of requirements.txt concerning celery and redis
Start script
Health check script
Start log
If I can give you any other info on the issue, just ask.