
[Worker] Check if worker allocator is terminated in static allocation mode #3105

Merged
merged 2 commits into from
Jan 2, 2024

Conversation

TomerShor
Copy link
Contributor

Follow-up to #3092, where we block worker allocation if the workers are terminated:

  • Add a termination check to the static worker allocator mode. It was missing from [Processor] Block worker allocation if worker allocator is terminated #3092 because the static allocator allocates all workers on startup, instead of per event.
  • Rename termination-state-related methods to "drain state".
  • In Kafka's drainOnRebalance, close the readyForRebalanceChan after sending a value on it. This helps in cases where maxWaitHandlerDuringRebalance times out before the drain handler finishes, and the channel would otherwise be closed before the drain goroutine sends a value on it.

@@ -360,6 +359,7 @@ func (k *kafka) drainOnRebalance(session sarama.ConsumerGroupSession,

 	wg.Wait()
 	readyForRebalanceChan <- true
+	close(readyForRebalanceChan)
Contributor

In that case, if we run out of waiting time (in the select statement), we will never close the channel, because writing to the channel is a blocking operation. So I would leave it as is.

Contributor Author

The goroutine will still run even after the timeout has passed and the select no longer waits on this channel.

The issue is that if the timeout has passed, the function exits and closes the readyForRebalanceChan. The goroutine continues to run and tries to write to the closed channel.
My thought was to not close the channel before the goroutine is done.

Contributor

@TomerShor Yes, but if the goroutine is still running when the timeout has passed, this change will result in the channel never being closed. We won't read from readyForRebalanceChan, leaving a zombie goroutine blocked indefinitely.

Currently, it's possible to attempt writing to a closed channel, causing a panic that makes the goroutine exit. To address this properly and prevent the panic, we should notify the goroutine from the main function body that the timeout has passed and there's no need to write anything to the channel. But as far as I'm concerned, panicking in this goroutine is not a big issue.

Contributor Author

@rokatyy We already have a recover for that case in this goroutine 🤦
I will revert this change.

Contributor

@rokatyy rokatyy left a comment

Overall looks good, nice catch with SignalTermination!
One question about closing the channel, and then we're done.

pkg/processor/worker/allocator.go (outdated; resolved)
@@ -116,13 +119,18 @@ func (s *singleton) SignalDraining() error {
 }

 func (s *singleton) SignalTermination() error {
-	return s.worker.Drain()
+	s.isTerminated = true
Contributor

👍

@TomerShor TomerShor requested a review from rokatyy January 2, 2024 10:20
@TomerShor TomerShor merged commit 53ea9ed into nuclio:development Jan 2, 2024
11 checks passed
@TomerShor TomerShor deleted the kafka-drain-termination branch January 2, 2024 11:48
2 participants