On each metadata change, the topology of partition followers grows in the Gateway's broker topology #8724

Closed
romansmirnov opened this issue Feb 3, 2022 · 1 comment · Fixed by #10255
Assignees
Labels
kind/bug Categorizes an issue or PR as a bug scope/gateway Marks an issue or PR to appear in the gateway section of the changelog severity/low Marks a bug as having little to no noticeable impact for the user version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0

Comments

@romansmirnov
Member

Describe the bug

In a long-running scenario, where the Gateway has been up for a long time, the heap memory consumed by BrokerTopologyManager grows over time:

image

In particular, the property partitionFollowers grows the most:

image

Whenever metadata changes, the respective broker notifies the Gateway, which then updates its topology information accordingly. As part of that update, it touches the list of followers for the affected partition every time. In effect, it always adds the broker to the list of followers for the partition, even if the broker is already in it:

https://github.com/camunda-cloud/zeebe/blob/df800d41fcc307817b692d17d9c87247f2825516/gateway/src/main/java/io/camunda/zeebe/gateway/impl/broker/cluster/BrokerClusterStateImpl.java#L90-L97

It only removes the broker from the list of followers for a specific partition if the broker transitions to a role other than FOLLOWER. This means that if a broker remains a FOLLOWER for a specific partition the whole time while other metadata (like the term or the health state) changes frequently, the broker is added multiple times to the list of followers for that partition:

image
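
The growth described above can be illustrated with a minimal sketch. It is not the Zeebe code: the class `FollowerListGrowth` and its methods are hypothetical stand-ins, and the only assumption carried over from the report is that each metadata update appends the broker id to a per-partition list without checking for an existing entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for the Gateway's cluster state (not the actual Zeebe class).
public class FollowerListGrowth {
  // partition id -> broker ids recorded as followers
  private final Map<Integer, List<Integer>> partitionFollowers = new HashMap<>();

  // Assumed behaviour from the report: every metadata update appends the broker id,
  // whether or not it is already in the list.
  public void addPartitionFollower(int partitionId, int brokerId) {
    partitionFollowers.computeIfAbsent(partitionId, id -> new ArrayList<>()).add(brokerId);
  }

  public int followerEntries(int partitionId) {
    return partitionFollowers.getOrDefault(partitionId, List.of()).size();
  }

  public static void main(String[] args) {
    FollowerListGrowth state = new FollowerListGrowth();
    // Broker 1 stays FOLLOWER of partition 1 while other metadata changes 1000 times.
    for (int update = 0; update < 1_000; update++) {
      state.addPartitionFollower(1, 1);
    }
    System.out.println(state.followerEntries(1)); // prints 1000
  }
}
```

A single broker that never changes its role ends up with one list entry per metadata update, which matches the pattern of steadily growing retained size.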

The same applies to the property partitionInactiveNodes:

https://github.com/camunda-cloud/zeebe/blob/df800d41fcc307817b692d17d9c87247f2825516/gateway/src/main/java/io/camunda/zeebe/gateway/impl/broker/cluster/BrokerClusterStateImpl.java#L99-L106

To Reproduce

  • Run a cluster with a Gateway for a long period during which the metadata changes frequently

Expected behavior

  • Each follower is listed only once per partition, so that the retained size of the BrokerTopologyManager object doesn't grow over time.

Environment:

  • Zeebe Version: 1.4.0-SNAPSHOT
@romansmirnov romansmirnov added kind/bug Categorizes an issue or PR as a bug severity/low Marks a bug as having little to no noticeable impact for the user scope/gateway Marks an issue or PR to appear in the gateway section of the changelog labels Feb 3, 2022
@npepinpe npepinpe added this to Planned in Zeebe Feb 8, 2022
@npepinpe
Member

npepinpe commented Feb 8, 2022

Thanks for the clear report. This seems like low-hanging fruit, and we should do it (assuming it's a quick fix).

@KerstinHebel KerstinHebel removed this from Planned in Zeebe Mar 23, 2022
@megglos megglos self-assigned this Sep 2, 2022
zeebe-bors-camunda bot added a commit that referenced this issue Sep 5, 2022
10255: fix(gateway): store followers and inactive nodes in sets r=megglos a=megglos

Previous use of lists resulted in unbounded growth due to duplication of the same broker ids.

## Description

Replaces the usage of Lists to keep track of followers and inactiveNodes in the BrokerClusterState with Sets to prevent duplication on arbitrary metadata updates.

## Related issues

closes #8724



Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
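
The idea behind the fix, storing followers and inactive nodes in sets instead of lists, might look roughly like the sketch below. This is an illustration only, not the code merged in #10255: the class `FollowerSetState` and its method names are hypothetical, and role transitions (removing a broker when it stops being a follower) are left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the cluster state (not the actual Zeebe class).
public class FollowerSetState {
  // partition id -> distinct broker ids; a Set ignores repeated additions of the same id
  private final Map<Integer, Set<Integer>> partitionFollowers = new HashMap<>();
  private final Map<Integer, Set<Integer>> partitionInactiveNodes = new HashMap<>();

  public void addPartitionFollower(int partitionId, int brokerId) {
    partitionFollowers.computeIfAbsent(partitionId, id -> new HashSet<>()).add(brokerId);
  }

  public void addPartitionInactive(int partitionId, int brokerId) {
    partitionInactiveNodes.computeIfAbsent(partitionId, id -> new HashSet<>()).add(brokerId);
  }

  // Returns an immutable copy so callers cannot modify the internal state.
  public Set<Integer> getFollowersForPartition(int partitionId) {
    return Set.copyOf(partitionFollowers.getOrDefault(partitionId, Set.of()));
  }

  public Set<Integer> getInactiveNodesForPartition(int partitionId) {
    return Set.copyOf(partitionInactiveNodes.getOrDefault(partitionId, Set.of()));
  }

  public static void main(String[] args) {
    FollowerSetState state = new FollowerSetState();
    // The same follower is reported on 1000 metadata updates; a second follower once.
    for (int update = 0; update < 1_000; update++) {
      state.addPartitionFollower(1, 1);
    }
    state.addPartitionFollower(1, 2);
    System.out.println(state.getFollowersForPartition(1).size()); // prints 2
  }
}
```

With sets, the number of entries per partition is bounded by the number of brokers, no matter how many metadata updates arrive.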
zeebe-bors-camunda bot added a commit that referenced this issue Sep 6, 2022
10276: [Backport stable/1.3] fix(gateway): store followers and inactive nodes in sets r=megglos a=backport-action

# Description
Backport of #10255 to `stable/1.3`.

relates to #8724

Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue Sep 6, 2022
10277: [Backport stable/8.0] fix(gateway): store followers and inactive nodes in sets r=megglos a=backport-action

# Description
Backport of #10255 to `stable/8.0`.

relates to #8724

Co-authored-by: Meggle (Sebastian Bathke) <sebastian.bathke@camunda.com>
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022