This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

pointSlicePool issues #962

Closed
woodsaj opened this issue Jul 16, 2018 · 10 comments · Fixed by #1924

Comments

@woodsaj
Member

woodsaj commented Jul 16, 2018

To reduce allocations we use a pointSlicePool when reading datapoints out of a chunk.
Chunks are read in GetTargetsLocal -> getSeries -> getSeriesFixed.
GetTargetsLocal is called when a render request is received and also when a GetData request is received. The GetData call is made when a node does not own a required series and so must call out to a peer for it.
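
For readers unfamiliar with the pattern, here is a minimal sketch of a slice pool built on `sync.Pool`. The `Point` type, the capacity of 2000, and the helpers `readChunk` and `releasePoints` are assumptions made for illustration; they are not metrictank's actual definitions.

```go
package main

import (
	"fmt"
	"sync"
)

// Point is a stand-in for a single datapoint (hypothetical type for this sketch).
type Point struct {
	Val float64
	Ts  uint32
}

// pointSlicePool hands out reusable []Point buffers so that a chunk read
// does not have to allocate a fresh slice every time.
var pointSlicePool = sync.Pool{
	New: func() interface{} { return make([]Point, 0, 2000) },
}

// readChunk simulates decoding n points from a chunk into a pooled slice.
// The caller owns the result and is expected to call releasePoints when done.
func readChunk(n int) []Point {
	points := pointSlicePool.Get().([]Point)[:0]
	for i := 0; i < n; i++ {
		points = append(points, Point{Val: float64(i), Ts: uint32(60 * i)})
	}
	return points
}

// releasePoints truncates the slice and puts it back in the pool for reuse.
func releasePoints(points []Point) {
	pointSlicePool.Put(points[:0])
}

func main() {
	points := readChunk(3)
	fmt.Println(len(points), points[2].Ts)
	releasePoints(points)
}
```

The pool only saves allocations if every `Get` is eventually matched by a `Put`, which is the concern behind both issues below.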

issue 1

The point slices used in a render request are returned to the pool after the renderMetrics function has completed.
However, if getTargetsLocal was called from a getData request, the points are never returned to the pool.

issue 2

In a metrictank cluster, the majority of points used in a render request are going to come from remote nodes. However, we are not using the pointSlicePool when unmarshalling the []byte slice returned from the peer.
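
One way to address issue 2 is to decode the peer's payload into a slice taken from the pool instead of a freshly allocated one. The sketch below assumes a made-up wire format (a 4-byte count followed by 12 bytes per point) encoded with `encoding/binary`; the real peer responses use a different serialization, and `marshalPoints` / `unmarshalPoints` are hypothetical helpers.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math"
	"sync"
)

// Point is a stand-in for a single datapoint (hypothetical type for this sketch).
type Point struct {
	Val float64
	Ts  uint32
}

var pointSlicePool = sync.Pool{
	New: func() interface{} { return make([]Point, 0, 2000) },
}

// pointSize is the encoded size of one point: 8 bytes value + 4 bytes timestamp.
const pointSize = 12

// marshalPoints encodes points in the illustrative wire format.
func marshalPoints(points []Point) []byte {
	buf := make([]byte, 4+pointSize*len(points))
	binary.LittleEndian.PutUint32(buf, uint32(len(points)))
	off := 4
	for _, p := range points {
		binary.LittleEndian.PutUint64(buf[off:], math.Float64bits(p.Val))
		binary.LittleEndian.PutUint32(buf[off+8:], p.Ts)
		off += pointSize
	}
	return buf
}

// unmarshalPoints decodes into a slice taken from the pool rather than
// allocating a new one. The caller must Put the slice back when done.
func unmarshalPoints(buf []byte) ([]Point, error) {
	if len(buf) < 4 {
		return nil, errors.New("short buffer")
	}
	n := int(binary.LittleEndian.Uint32(buf))
	if len(buf)-4 < n*pointSize {
		return nil, errors.New("truncated payload")
	}
	points := pointSlicePool.Get().([]Point)[:0]
	off := 4
	for i := 0; i < n; i++ {
		points = append(points, Point{
			Val: math.Float64frombits(binary.LittleEndian.Uint64(buf[off:])),
			Ts:  binary.LittleEndian.Uint32(buf[off+8:]),
		})
		off += pointSize
	}
	return points, nil
}

func main() {
	wire := marshalPoints([]Point{{Val: 1.5, Ts: 60}, {Val: 2.5, Ts: 120}})
	points, err := unmarshalPoints(wire)
	if err != nil {
		panic(err)
	}
	fmt.Println(points)
	pointSlicePool.Put(points[:0]) // hand the buffer back once the caller is done
}
```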

@woodsaj
Member Author

woodsaj commented Jul 16, 2018

Fixing issue 1 is easy: just put the datapoint slices back in the pool at the end of https://github.com/grafana/metrictank/blob/master/api/cluster.go#L194-L204
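
A rough sketch of what that fix could look like, assuming a handler that fetches series locally, encodes them for the peer, and then releases the slices. `Series`, `fetchLocal`, and `handleGetData` are hypothetical stand-ins, not the code at the linked lines.

```go
package main

import (
	"fmt"
	"sync"
)

// Point and Series are stand-ins for the real types (hypothetical for this sketch).
type Point struct {
	Val float64
	Ts  uint32
}

type Series struct {
	Target     string
	Datapoints []Point
}

var pointSlicePool = sync.Pool{
	New: func() interface{} { return make([]Point, 0, 2000) },
}

// fetchLocal plays the role of the local lookup: it fills one pooled slice
// per target with three dummy points.
func fetchLocal(targets []string) []Series {
	out := make([]Series, 0, len(targets))
	for _, t := range targets {
		points := pointSlicePool.Get().([]Point)[:0]
		for i := 0; i < 3; i++ {
			points = append(points, Point{Val: float64(i), Ts: uint32(60 * i)})
		}
		out = append(out, Series{Target: t, Datapoints: points})
	}
	return out
}

// handleGetData encodes the series for a peer and then returns every
// datapoint slice to the pool. The encoded bytes must not alias the
// pooled slices, because those may be reused as soon as they are Put.
func handleGetData(targets []string, encode func([]Series) []byte) []byte {
	series := fetchLocal(targets)
	defer func() {
		for _, s := range series {
			pointSlicePool.Put(s.Datapoints[:0])
		}
	}()
	// encode runs to completion before the deferred Put calls execute.
	return encode(series)
}

func main() {
	body := handleGetData([]string{"a", "b"}, func(series []Series) []byte {
		return []byte(fmt.Sprintf("%d series", len(series)))
	})
	fmt.Println(string(body))
}
```

Using `defer` here also covers early returns on error paths, at the cost of holding the slices until the handler exits.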

woodsaj pushed a commit that referenced this issue Jul 16, 2018
@shanson7
Collaborator

I believe sync.Pools are also cleared out on every GC, so that could result in more allocations

@Dieterbe
Contributor

Dieterbe commented Aug 6, 2018

related: #958

@Dieterbe
Contributor

Dieterbe commented Aug 6, 2018

> I believe sync.Pools are also cleared out on every GC

correct

@Dieterbe
Contributor

Some more analysis is needed, but fixing this issue has the potential to significantly reduce memory usage.

@woodsaj what brought you to open this issue? Were you observing a particular problem?

@woodsaj
Member Author

woodsaj commented Aug 27, 2018

I was looking over the code after a production incident where an MT node suddenly allocated 20+GB of memory. I was just trying to understand the code paths for handling queries and noticed the issues with the pointSlicePool.

@stale

stale bot commented Apr 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 4, 2020
@Dieterbe Dieterbe removed the stale label Apr 6, 2020
@stale

stale bot commented Jul 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jul 5, 2020
@stale stale bot closed this as completed Jul 12, 2020
@Dieterbe Dieterbe removed the stale label Jul 15, 2020
@Dieterbe Dieterbe reopened this Jul 15, 2020
@stale

stale bot commented Oct 14, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@Dieterbe
Contributor

Issue 1 will be fixed by #1924
Issue 2 has been fixed by #1921
