-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: memory leak inside session windower #1445
Conversation
yhl25
commented
Jan 10, 2024
•
edited
edited
- fixes memory leak inside session windower
- metrics to track number of active and closed windows
- enhance tickgen to produce out of order messages
Signed-off-by: Yashash H L <yashashhl25@gmail.com>
Signed-off-by: Yashash H L <yashashhl25@gmail.com>
pkg/sources/kafka/reader.go
Outdated
@@ -142,20 +143,33 @@ func (r *kafkaSource) Partitions(context.Context) []int32 { | |||
} | |||
|
|||
func (r *kafkaSource) Read(_ context.Context, count int64) ([]*isb.ReadMessage, error) { | |||
latestEtMap := make(map[int32]int64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for debug purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had added it for debugging, will remove it
|
||
// print the latest event time for each partition | ||
for partition, latestEt := range latestEtMap { | ||
df.opts.logger.Infow("Latest event time for partition - ", zap.Int32("partition", partition), zap.Int64("latestEt", int64(time.Since(time.Unix(0, latestEt)).Minutes()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debugw?
for _, m := range transformerResults { | ||
writeMessages = append(writeMessages, m.writeMessages...) | ||
for _, message := range m.writeMessages { | ||
// we convert each writeMessage to isb.ReadMessage by providing its parent ReadMessage's ReadOffset. | ||
// since we use message event time instead of the watermark to determine and publish source watermarks, | ||
// time.UnixMilli(-1) is assigned to the message watermark. transformedReadMessages are immediately | ||
// used below for publishing source watermarks. | ||
if latestEt, ok := latestEtMap[m.readMessage.ReadOffset.PartitionIdx()]; !ok || message.EventTime.UnixNano() < latestEt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all of these debugging purpose logic, we should think about using an approach that does not affect the performance. e.g. turn it off by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Value *uint64 `json:"value,omitempty" protobuf:"bytes,5,opt,name=value"` | ||
// Jitter is the jitter for the message generation, used to simulate out of order messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add more descriptions? an example would be great. I need to look into code to fully understand how jitter specification works.
pkg/sources/kafka/reader.go
Outdated
//// print the latest event time for each partition | ||
//for partition, latestEt := range latestEtMap { | ||
// r.logger.Infow("Latest event time for partition - ", zap.Int32("partition", partition), zap.Int64("latestEt", int64(time.Since(time.Unix(0, latestEt)).Minutes()))) | ||
//} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we uncomment or remove?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will remove
@@ -183,6 +192,19 @@ func (w *Windower) CloseWindows(time time.Time) []*window.TimedWindowRequest { | |||
windowOperations = append(windowOperations, operation) | |||
w.closedWindows.InsertBack(win) | |||
} | |||
|
|||
metrics.ActiveWindowsCount.With(map[string]string{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of curiosity, why do we choose to emit metrics when we call CloseWindows, as opposed to other methods that are also updating the active&closed windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CloseWindows() is invoked once per batch, whereas other functions are invoked for every message. I think it's good to track the windows count once per batch.
@@ -99,8 +105,11 @@ type Windower struct { | |||
closedWindows *window.SortedWindowListByEndTime | |||
} | |||
|
|||
func NewWindower(length time.Duration, slide time.Duration) window.TimedWindower { | |||
func NewWindower(length time.Duration, slide time.Duration, vertexInstance *dfv1.VertexInstance) window.TimedWindower { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you thought about adding a new method to the Windower interface EmitMetrics(vertexInstance *dfv1.VertexInstance)
? That way the windower can emit metrics not only after closing windows but also after other operations, which is up to the caller to decide. It also simplifies the Windower struct itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think adding Prometheus metrics related functionality to the interface is ideal, the caller shouldn't be tasked with handling these metrics, since they are not directly related to window functionality.
Signed-off-by: Yashash H L <yashashhl25@gmail.com>