Fix shutdown time #188

mariomac · 2022-04-22T12:52:54Z

Some nodes, such as the GRPC+Protobuf server, are not very friendly to shutdown signals from Kubernetes, and may take up to 1 minute before Kubernetes eventually kills the pods. This makes undeployments and redeployments to be very slow.

This PR captures the SIGTERM signals and stops the main goroutine when they are received. The pipeline runs in background so it is automatically stopped when the main goroutine ends.

codecov-commenter · 2022-04-22T13:00:12Z

Codecov Report

Merging #188 (717de58) into main (6eea033) will decrease coverage by 0.10%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main     #188      +/-   ##
==========================================
- Coverage   58.48%   58.38%   -0.11%     
==========================================
  Files          58       58              
  Lines        3365     3371       +6     
==========================================
  Hits         1968     1968              
- Misses       1268     1274       +6     
  Partials      129      129

Flag	Coverage Δ
unittests	`58.38% <0.00%> (-0.11%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
cmd/flowlogs-pipeline/main.go	`0.00% <0.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6eea033...717de58. Read the comment docs.

eranra · 2022-04-24T07:23:40Z

@mariomac thanks for adding this PR ... there is a centralized process to exit all threads in FLP ... the code is here: https://github.com/netobserv/flowlogs-pipeline/blob/main/pkg/pipeline/utils/exit.go#L46 can you use the RegisterExitChannel function here https://github.com/netobserv/flowlogs-pipeline/blob/main/pkg/pipeline/utils/exit.go#L34 so that when we get SIGTERM signal the pipeline will exit (as well as all other threads in the FLP process). This should be very similar to what you added just using that centralized process exit code

KalmanMeth · 2022-04-24T12:39:28Z

cmd/flowlogs-pipeline/main.go

+	}()
+
+	// Subscribe to signals for terminating the program.
+	signal.Notify(stopper, os.Interrupt, syscall.SIGTERM)


This has to be integrated to work with exiting all the other threads that exit when SIGINT or SIGTERM are received. See SetupElegantExit() in flowlogs-pipeline/pkg/pipeline/utils/exit.go

mariomac · 2022-04-25T09:38:06Z

Thanks for the info, @eranra & @KalmanMeth ! I didn't realized this elegant shutdown mechanism.

I adapted the GRPC ingester to work with it and also changed the mechanism itself to use the standard practices in Golang:

Instead of a buffered bool channel, I created an unbuffered struct{} channel (the empty type, as the actual value is not imporant).

Also, the common practice when you want to use a channel as a barrier is not to send a message and read it, but close it. When you try to read a closed channel, the read operation will continue. This actually allows to share a single channel that is closed once and will allow the rest of instances to block until it is closed.

In this PR, I am still using one single channel for ingester, but in a later PR we can share the same channel for all the instances that want the elegant shutdown.

eranra

Hi @mariomac ... thanks for improving the mechanism ... very nice and clean ;-)

mariomac requested review from KalmanMeth and eranra April 22, 2022 12:52

KalmanMeth reviewed Apr 24, 2022

View reviewed changes

Fix shutdown time for GRPC ingester

6ac9c48

eranra approved these changes Apr 25, 2022

View reviewed changes

mariomac merged commit 43ca3a2 into netobserv:main Apr 25, 2022

mariomac deleted the shutdown branch April 25, 2022 10:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix shutdown time #188

Fix shutdown time #188

Uh oh!

mariomac commented Apr 22, 2022

Uh oh!

codecov-commenter commented Apr 22, 2022

Uh oh!

eranra commented Apr 24, 2022

Uh oh!

KalmanMeth Apr 24, 2022

Uh oh!

mariomac commented Apr 25, 2022 •

edited

Loading

Uh oh!

eranra left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Fix shutdown time #188

Fix shutdown time #188

Uh oh!

Conversation

mariomac commented Apr 22, 2022

Uh oh!

codecov-commenter commented Apr 22, 2022

Codecov Report

Uh oh!

eranra commented Apr 24, 2022

Uh oh!

KalmanMeth Apr 24, 2022

Choose a reason for hiding this comment

Uh oh!

mariomac commented Apr 25, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

eranra left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

mariomac commented Apr 25, 2022 •

edited

Loading