Skip to content

prevent dupes in event loop on Consumer restart (causing leak?)#7187

Merged
auvipy merged 1 commit intocelery:masterfrom
pawl:event_loop_cleanup
Dec 25, 2021
Merged

prevent dupes in event loop on Consumer restart (causing leak?)#7187
auvipy merged 1 commit intocelery:masterfrom
pawl:event_loop_cleanup

Conversation

@pawl
Copy link
Copy Markdown
Contributor

@pawl pawl commented Dec 25, 2021

Description

Currently, each time the prefork worker encounters a connection error and restarts the Consumer, it will result in an additional AsynPool._create_write_handlers.<locals>.on_poll_start getting added to the event loop (in Hub.on_tick). This may be causing a minor memory leak and may slow down the event loop after a lot of connection errors.

For more details, I think this is the flow causing the issue:

Expand
1. WorkController runs .start(), which starts the Worker's bootsteps
2. The Worker's Pool Bootstep starts and initializes AsynPool
3. The Worker's Consumer Bootstep runs .start() on the Consumer's Blueprint, which starts the Consumer's bootsteps.
4. Consumer's Evloop bootstep starts, which runs Evloop.loop().
5. Evloop.loop() runs asynloop()
6. asynloop() calls Worker.register_with_event_loop, which calls register_with_event_loop on all Bootsteps in the Worker's Blueprint.
7. Worker's Pool bootstep's register_with_event_loop adds AsynPool._create_write_handlers.<locals>.on_poll_start` to Hub.on_tick.
8. asynloop() calls Consumer.register_with_event_loop, which calls register_with_event_loop on all Bootsteps in the Consumer's Blueprint.
9. asynloop() starts the event loop to start constantly running all functions in Hub.on_tick.
10. ConnectionError is caught while running blueprint.start() in Consumer.start()
12. The Consumer's blueprint is restarted, which will .stop() then start() all the Bootsteps in the Consumer's Blueprint.
13. The Evloop bootstep stops then starts.
14. Evloop.loop() runs again, which eventually adds AsynPool._create_write_handlers.<locals>.on_poll_start to Hub.on_tick again.</details>

This PR prevents the unnecessary duplicate on_poll_starts from getting added to the event loop on each Consumer restart by allowing hub.on_tick.add(self.on_poll_start) to run only once per instance of AsynPool.

@auvipy auvipy requested review from auvipy and maybe-sybr December 25, 2021 06:05
@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 25, 2021

Codecov Report

Merging #7187 (bc00526) into master (83869da) will decrease coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #7187      +/-   ##
==========================================
- Coverage   89.36%   89.35%   -0.01%     
==========================================
  Files         138      138              
  Lines       16759    16759              
  Branches     2448     2448              
==========================================
- Hits        14976    14975       -1     
- Misses       1553     1554       +1     
  Partials      230      230              
Flag Coverage Δ
unittests 89.34% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
celery/backends/asynchronous.py 68.83% <0.00%> (-0.44%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 843396e...bc00526. Read the comment docs.

@auvipy auvipy added this to the 5.2.x milestone Dec 25, 2021
Copy link
Copy Markdown
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can the coverage improved little bit more? also were were you abe to pass integration tests locally with this changes?

@lgtm-com
Copy link
Copy Markdown

lgtm-com Bot commented Dec 25, 2021

This pull request introduces 3 alerts and fixes 1 when merging bc00526 into 843396e - view on LGTM.com

new alerts:

  • 2 for Wrong name for an argument in a call
  • 1 for Clear-text logging of sensitive information

fixed alerts:

  • 1 for Unused import

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants