This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

cassandra-idx: load partitions in parallel #1270

Merged: 6 commits from loadIdxInParallel into grafana:master, Apr 10, 2019

Conversation

@shanson7 (Collaborator) commented Apr 4, 2019

Loading the cassandra index takes up to 10 minutes for our nodes. With the changes for partitioned index it used less memory, but wasn't any faster. This change brings it down to 3 minutes which helps reduce our recovery window when we lose a node (also speeds up our rollouts)
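
The diff excerpts quoted later in this thread show the pattern: a buffered "gate" channel caps how many partitions are loaded at once, and a WaitGroup waits for them all to finish. Below is a minimal, self-contained sketch of that pattern; loadPartition is a hypothetical stand-in for the real c.LoadPartitions / c.MemoryIndex.LoadPartition calls, and the gate capacity is the concurrency knob discussed below.

package main

import (
    "fmt"
    "sync"
)

// loadPartition is a placeholder for fetching one partition's
// definitions from Cassandra and loading them into the memory index.
// It returns the number of definitions loaded.
func loadPartition(p int32) int {
    return 1
}

// loadAll loads all partitions, running at most `concurrency` loads at once.
func loadAll(partitions []int32, concurrency int) int {
    var (
        wg   sync.WaitGroup
        mu   sync.Mutex
        num  int
        gate = make(chan struct{}, concurrency)
    )
    for _, p := range partitions {
        gate <- struct{}{} // blocks once `concurrency` loads are in flight
        wg.Add(1)
        go func(p int32) {
            defer func() {
                wg.Done()
                <-gate // free a slot for the next partition
            }()
            n := loadPartition(p)
            mu.Lock() // see the later review note about the shared counter
            num += n
            mu.Unlock()
        }(p)
    }
    wg.Wait()
    return num
}

func main() {
    fmt.Println(loadAll([]int32{0, 1, 2, 3}, 2))
}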

@woodsaj (Member) commented Apr 4, 2019

I think it would be beneficial to make the concurrency configurable. We have some instances that consume a lot of memory if they try to load all partitions at once. Making the concurrency configurable allows us to balance speed against memory utilisation.

@shanson7 (Collaborator, Author) commented Apr 4, 2019

That makes sense. What would a reasonable default value be? I feel like 1 is the safest, but also slowest.

@shanson7 (Collaborator, Author) commented Apr 4, 2019

Ok, if we can agree on the name of the param and the default, I'll do another commit and update the docs/sample ini files.

@shanson7 force-pushed the loadIdxInParallel branch 2 times, most recently from 4e311f5 to 43c49ef on April 4, 2019 22:04
@replay (Contributor) commented Apr 5, 2019

That's a great idea.
scripts/dev/tools-to-doc.sh still needs to be updated to make the tests pass.

@shanson7 (Collaborator, Author) commented Apr 5, 2019

Yep, I wanted to make sure the naming/defaults were agreed upon first.

wg.Done()
<-gate
}()
var defs []schema.MetricDefinition
Review comment on the diff (Member):

We should try to reuse these slices where possible. If concurrency is set to 1, then we can re-use the same slice for every partition and reduce allocations.

Perhaps a sync.Pool could be used.
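
A sketch of the suggested sync.Pool approach, with a stand-in MetricDefinition type (the real one lives in the schema package); the pool holds *[]MetricDefinition rather than the slice itself to avoid an extra allocation on each Put:

package main

import (
    "fmt"
    "sync"
)

// MetricDefinition is a stand-in for schema.MetricDefinition.
type MetricDefinition struct{ Name string }

var defsPool = sync.Pool{
    New: func() interface{} { return new([]MetricDefinition) },
}

func loadOnePartition() int {
    defsPtr := defsPool.Get().(*[]MetricDefinition)
    defs := (*defsPtr)[:0] // keep the backing array, drop old contents

    // ... append this partition's definitions to defs ...
    defs = append(defs, MetricDefinition{Name: "example"})

    n := len(defs)
    *defsPtr = defs
    defsPool.Put(defsPtr) // hand the slice back for the next partition to reuse
    return n
}

func main() {
    fmt.Println(loadOnePartition())
}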

@woodsaj (Member) commented Apr 5, 2019

init-load-concurrency sounds like the right config name to me.
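
For reference, a sketch of how the setting might look in the sample ini files; the [cassandra-idx] section name and the default of 1 are assumptions based on this thread, not confirmed by it:

[cassandra-idx]
# number of partitions to load concurrently at startup (assumed default: 1)
init-load-concurrency = 1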

@shanson7 (Collaborator, Author) commented Apr 5, 2019

I should also mention that this change really shines with the PartitionedIndex because lock contention is lower. We aren't using that yet though.

@Dieterbe (Contributor) commented Apr 8, 2019

I have disabled codelingo; it's too unreliable for now. You may need to re-push to get rid of the codelingo step.

@shanson7 shanson7 closed this Apr 9, 2019
@shanson7 shanson7 reopened this Apr 9, 2019
@shanson7 (Collaborator, Author) commented Apr 9, 2019

Got a test failure in an unrelated test. Possibly due to the latest rebase?

@Dieterbe (Contributor) commented Apr 9, 2019

CI is all green.

@shanson7 (Collaborator, Author) commented Apr 9, 2019

Yes, I clicked "Re-run". Seems like a flaky test.

<-gate
}()
defs = c.LoadPartitions([]int32{p}, defs, pre)
num += c.MemoryIndex.LoadPartition(p, defs)
Review comment on the diff (Contributor):
concurrent write to num? maybe use an atomic here?
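
A minimal sketch of the atomic alternative for the shared num counter, assuming num is only read after wg.Wait() returns:

package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

func main() {
    var num int64
    var wg sync.WaitGroup
    for p := int32(0); p < 4; p++ {
        wg.Add(1)
        go func(p int32) {
            defer wg.Done()
            loaded := int64(1)            // stand-in for MemoryIndex.LoadPartition's return value
            atomic.AddInt64(&num, loaded) // safe concurrent increment, no mutex needed
        }(p)
    }
    wg.Wait()
    fmt.Println(num) // read only after all goroutines are done
}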

@Dieterbe (Contributor) commented:
@shanson7 i pushed minor tweaks. if my commits look good to you, then I think this is good to go!

@shanson7 (Collaborator, Author) commented:
LGTM

@Dieterbe (Contributor) left a review:

looks good. thanks @shanson7 !

@Dieterbe Dieterbe changed the title Load partitions in parallel cassandra-idx: load partitions in parallel Apr 10, 2019
@Dieterbe Dieterbe merged commit ebcbb7b into grafana:master Apr 10, 2019
@shanson7 shanson7 deleted the loadIdxInParallel branch February 24, 2020 11:32