-
Notifications
You must be signed in to change notification settings - Fork 104
cassandra-idx: load partitions in parallel #1270
Conversation
I think it would be beneficial to make the concurrency configurable. We have some instances that consume a lot of memory if they try and load all partitions at once. So having the concurrency configurable allows us to balance speed vs memory utilisation. |
That makes sense. What would a reasonable default value be? I feel like 1 is the safest, but also slowest. |
Ok, if we can agree on the anem of the param and the default, I'll do another commit and update the docs/sample ini files. |
4e311f5
to
43c49ef
Compare
That's a great idea. |
Yep, wanted to make sure the naming / defaults were agreed upon |
idx/cassandra/cassandra.go
Outdated
wg.Done() | ||
<-gate | ||
}() | ||
var defs []schema.MetricDefinition |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try and reuse these slices where possible. If concurrency is set to 1, then we can re-use the same slice for every partition and reduce allocations.
Perhaps a sync.Pool could be used.
|
I should also mention that this change really shines with the PartitionedIndex because lock contention is lower. We aren't using that yet though. |
I have disabled codelingo. too unreliable for now. may need to re-push to get it to get rid of the codelingo step. |
e104b43
to
6720297
Compare
Got a test failure in unrelated test. Possibly due to the latest rebase? |
CI is all green. |
Yes, I clicked "Re-run". Seems like a flaky test. |
idx/cassandra/cassandra.go
Outdated
<-gate | ||
}() | ||
defs = c.LoadPartitions([]int32{p}, defs, pre) | ||
num += c.MemoryIndex.LoadPartition(p, defs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
concurrent write to num? maybe use an atomic here?
@shanson7 i pushed minor tweaks. if my commits looks good to you, then I think this is good to go! |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good. thanks @shanson7 !
Loading the cassandra index takes up to 10 minutes for our nodes. With the changes for partitioned index it used less memory, but wasn't any faster. This change brings it down to 3 minutes which helps reduce our recovery window when we lose a node (also speeds up our rollouts)