[kafka-consumer] Use wait group to ensure goroutine is finished before returning from Close #4582

Merged
merged 5 commits into from Jul 14, 2023

Conversation

kennyaz
Contributor

@kennyaz kennyaz commented Jul 14, 2023

Which problem is this PR solving?

Resolves #4576

Short description of the changes

When Start is called and then Close is called under the fxtest module, the application can shut down before the goroutine's log line executes, which causes the panic "Log in routine after .. has completed".

Fix:
Add a wait group that will wait for the goroutine to finish before closing the channel.
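
As a rough illustration of the fix described above, here is a minimal sketch, not the actual Jaeger code. The `consumer` type, its `stop` channel, and the in-memory `log` helper are hypothetical stand-ins; only the idea of a `doneWg` wait group that `Close` waits on comes from this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// consumer is a hypothetical stand-in for the ingester's Kafka consumer.
type consumer struct {
	doneWg sync.WaitGroup
	stop   chan struct{}

	mu   sync.Mutex
	logs []string // in-memory replacement for a real logger (assumption)
}

func newConsumer() *consumer {
	return &consumer{stop: make(chan struct{})}
}

// log records a message; it is safe to call from several goroutines.
func (c *consumer) log(msg string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.logs = append(c.logs, msg)
}

// Logs returns a copy of the messages recorded so far.
func (c *consumer) Logs() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]string(nil), c.logs...)
}

// Start launches the background goroutine and registers it with doneWg
// before the goroutine begins running.
func (c *consumer) Start() {
	c.doneWg.Add(1)
	go func() {
		defer c.doneWg.Done()
		c.log("Starting main loop")
		<-c.stop // stand-in for the real message loop
		c.log("Main loop stopped")
	}()
}

// Close signals the goroutine to stop and returns only after it has
// finished, so no log call can happen after Close returns.
func (c *consumer) Close() {
	close(c.stop)
	c.doneWg.Wait()
}

func main() {
	c := newConsumer()
	c.Start()
	c.Close()
	fmt.Println(c.Logs())
}
```

Because `Close` blocks on `doneWg.Wait()`, both log calls are guaranteed to have completed by the time a test (or fxtest lifecycle) moves on.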

Signed-off-by: kennyaz <115052215+kennyaz@users.noreply.github.com>
@kennyaz kennyaz requested a review from a team as a code owner July 14, 2023 14:26
@kennyaz kennyaz requested a review from vprithvi July 14, 2023 14:26
codecov bot commented Jul 14, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.05 ⚠️

Comparison is base (c582d89) 97.08% compared to head (ccb123f) 97.04%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4582      +/-   ##
==========================================
- Coverage   97.08%   97.04%   -0.05%     
==========================================
  Files         300      301       +1     
  Lines       17813    17839      +26     
==========================================
+ Hits        17293    17311      +18     
- Misses        417      423       +6     
- Partials      103      105       +2     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 97.04% <100.00%> (-0.05%) ⬇️ |


| Impacted Files | Coverage Δ |
| --- | --- |
| cmd/ingester/app/consumer/consumer.go | 96.96% <100.00%> (+0.06%) ⬆️ |

... and 5 files with indirect coverage changes


Member

@yurishkuro yurishkuro left a comment

I think it's indeed a good idea to keep track of all goroutines that begin in Start, and reusing the existing doneWg for this makes sense. However, you're still not tracking the deadlock detector goroutine started in L77, which also performs logging, so I don't think this change will fully address your issue.
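
One way to picture tracking every goroutine started in Start is a small helper that registers each one with the same wait group. This is only a sketch under assumptions: the `worker` type, the `goTracked` helper, and the two placeholder goroutines (standing in for the main loop and the deadlock detector) are hypothetical and not taken from the Jaeger code.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// worker is a hypothetical component that owns several goroutines.
type worker struct {
	doneWg  sync.WaitGroup
	stop    chan struct{}
	stopped int32 // number of goroutines that have exited
}

// goTracked runs fn in a new goroutine that is registered with doneWg,
// so Close can wait for it no matter which part of Start launched it.
func (w *worker) goTracked(fn func()) {
	w.doneWg.Add(1)
	go func() {
		defer w.doneWg.Done()
		fn()
		atomic.AddInt32(&w.stopped, 1)
	}()
}

// Start launches two tracked goroutines.
func (w *worker) Start() {
	w.stop = make(chan struct{})
	w.goTracked(func() { <-w.stop }) // stand-in for the main loop
	w.goTracked(func() { <-w.stop }) // stand-in for the deadlock detector
}

// Close stops all goroutines, waits for them, and reports how many exited.
func (w *worker) Close() int32 {
	close(w.stop)
	w.doneWg.Wait()
	return atomic.LoadInt32(&w.stopped)
}

func main() {
	w := &worker{}
	w.Start()
	fmt.Println("goroutines finished before Close returned:", w.Close())
}
```

Routing every `go` statement through one helper makes it harder to forget a goroutine, which is the gap pointed out in the comment above.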

Comment on lines 80 to 81
c.logger.Info("Starting main loop")
c.doneWg.Done()
Member

Suggested change
- c.logger.Info("Starting main loop")
- c.doneWg.Done()
+ defer c.doneWg.Done()
+ c.logger.Info("Starting main loop")
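
To illustrate why the ordering in this suggestion matters, here is a small sketch that uses no goroutines at all. The functions `earlyDone` and `deferredDone` and the recorded event strings are hypothetical; the local `done` closure merely stands in for `c.doneWg.Done()`.

```go
package main

import "fmt"

// earlyDone mimics the original ordering: the completion signal fires
// right after the log line, before the rest of the function has run.
func earlyDone() []string {
	var events []string
	done := func() { events = append(events, "Done") }

	events = append(events, "log: Starting main loop")
	done()
	events = append(events, "main loop runs")
	return events
}

// deferredDone mimics the suggested ordering: the completion signal is
// deferred, so it fires only when the function returns. The result is a
// named return value so the deferred append is visible to the caller.
func deferredDone() (events []string) {
	done := func() { events = append(events, "Done") }
	defer done()

	events = append(events, "log: Starting main loop")
	events = append(events, "main loop runs")
	return events
}

func main() {
	fmt.Println(earlyDone())
	fmt.Println(deferredDone())
}
```

In the first function "Done" is recorded before the loop work; in the second it is recorded last. With a real wait group, the first ordering lets a waiter proceed while the goroutine is still running and logging.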

Contributor Author

@kennyaz kennyaz Jul 14, 2023

Doing that causes the test to time out because, for some reason, this channel is never closed. I investigated further, and it seems this method is not closing the partition channel properly. The test that times out is https://github.com/jaegertracing/jaeger/blob/main/cmd/ingester/app/consumer/consumer_test.go#L122. The bug seems to be on the "github.com/Shopify/sarama/mocks" side.

Contributor Author

@kennyaz kennyaz Jul 14, 2023

I take it back. It is happening on the Jaeger side. I will push a corrected change shortly.

Signed-off-by: kennyaz <115052215+kennyaz@users.noreply.github.com>
Signed-off-by: kennyaz <115052215+kennyaz@users.noreply.github.com>
Signed-off-by: kennyaz <115052215+kennyaz@users.noreply.github.com>
Signed-off-by: kennyaz <115052215+kennyaz@users.noreply.github.com>
@kennyaz kennyaz requested a review from yurishkuro July 14, 2023 18:28
@yurishkuro yurishkuro changed the title from "AddWaitGroup" to "[kafka-consumer] Use wait group to ensure goroutine is finished before returning from Close" Jul 14, 2023
@yurishkuro yurishkuro merged commit 673b69f into jaegertracing:main Jul 14, 2023
31 checks passed