Docs: interactive capacity planning tool #1988
Comments
I think that a spreadsheet seems like the easiest solution. OTOH, if we created a CLI tool to do the capacity planning, then all Mimir contributors could contribute to it and possibly improve it based on their own experience. A spreadsheet would likely have to be restricted in some way, because otherwise there is no review process for changes to it.
Let me know if you're looking for contributions on this!
That's right. And it's also more difficult to test. Writing unit tests in golang is way easier.
In this case we wouldn't have to create a new tool. We already have
We do! Let's just reach a consensus on how it should work (e.g. spreadsheet vs CLI tool). Let me ping the rest of the Mimir maintainers / squad to get a quick feedback loop.
I would vote for a CLI tool, because it would allow reviews, change history, etc. in a more familiar format for Mimir contributors. Having the logic implemented as Go code also makes it easier to eventually extend into more sophisticated use cases (far in the future) like an auto-scaling operator, generating a Helm values file automatically, etc. We can start to get feedback on the formulas used and build on that knowledge later. For now something simple + straightforward like a new command in
cc @osg-grafana
What would a CLI tool look like? I have a hard time imagining a command-line interface that would beat the spreadsheet (or a simple webpage with some JavaScript to do the calculation) in terms of ease of use.
Something like:
From an ease of use perspective, I agree a web UI would be easier to use. On the other hand, collaborating on a web UI or a spreadsheet may be more complicated (e.g. no code reviews and no external contributors on a spreadsheet, not much JS experience, not even enough tooling like unit tests, ...). Given we publish
I don't think a single HTML page with some JavaScript would be too difficult to review and collaborate on, but you're right that we don't have tooling for it prepared. (Maybe writing it in Go and compiling it to WebAssembly would work just fine? 😄 I have 0 experience with that though.) Your example isn't too bad just yet, but it gets more complex with more parameters very quickly.
If it gets too complex, we could consider providing the tool with a configuration file that defines all the relevant parameters. Then we could deliver the tool together with an example configuration file, so a user could just copy the example and adjust the parameters defined there. I think this will be easier to use than looking up lots of CLI args from
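The configuration-file idea above could look roughly like the following minimal sketch in Go. Everything in it is an assumption made for illustration: the `Config` struct, its field names, the JSON format, and the `ParseConfig` helper are hypothetical and are not part of mimirtool or any existing Mimir tooling. It only shows how an example file could be shipped with the tool, copied by the user, and validated strictly so that a mistyped parameter name is reported instead of silently ignored.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Config lists the workload parameters a user would edit.
// All field names are hypothetical, chosen only for this sketch.
type Config struct {
	ActiveSeries  int64 `json:"active_series"`
	SamplesPerSec int64 `json:"samples_per_sec"`
	QueriesPerSec int64 `json:"queries_per_sec"`
	RetentionDays int   `json:"retention_days"`
}

// exampleConfig stands in for the example file shipped with the tool.
// The values are placeholders, not recommendations.
const exampleConfig = `{
  "active_series": 1000000,
  "samples_per_sec": 50000,
  "queries_per_sec": 10,
  "retention_days": 30
}`

// ParseConfig decodes a configuration document and rejects unknown keys,
// so a typo in a parameter name surfaces as an error.
func ParseConfig(raw string) (Config, error) {
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()

	var cfg Config
	if err := dec.Decode(&cfg); err != nil {
		return Config{}, fmt.Errorf("parsing capacity config: %w", err)
	}
	return cfg, nil
}

func main() {
	cfg, err := ParseConfig(exampleConfig)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

JSON is used here only because it keeps the sketch within the Go standard library; a real tool might well prefer YAML.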
/half-joke: We can distribute a jsonnet file with example values and all the math, and let people edit and render that :)
Nice idea, but I kind of suspect that most users will stick to Helm and don't know how to use jsonnet.
One of the requirements is that we need to use a language for which it's not complicated to write unit tests. I think jsonnet doesn't fit it.
I don't see a big benefit in unit-testability in this specific case, given that the feature is basically a set of formulas that show some numbers to the user. As a user of this feature, I want to:
I see these needs covered better by tools like Google Sheets or jsonnet rather than a tool with hardcoded formulas in it. If we wanted to go the jsonnet route, we could embed a jsonnet interpreter library into And we have plenty of tests for our jsonnet config in the Mimir repo already.
My idea is to build two tools:
What would be the output of mimirtool? I reckon that core/memory/disk per Mimir module should be enough. I've been working on a similar tool which addresses this question the other way around.
Here is an example of the output:
{
"performance": {
"write path": {
"distributor samples/sec": 120000,
"ingester active series": 1920000
},
"read path": {
"query-frontend queries/sec": 1200,
"query-scheduler queries/sec": 2400,
"querier queries/sec": 48,
"store-gateway queries/sec": 192,
"active series": 36923077
},
"compaction": {
"compactable active series": 60000000
}
},
"specs": {
"write path": {
"distributor": {
"count": 3,
"flavor": "b2-15"
},
"ingester": {
"count": 3,
"flavor": "b2-60"
},
"compactor": {
"count": 3,
"flavor": "b2-60"
}
},
"read path": {
"query-frontend": {
"count": 3,
"flavor": "b2-15"
},
"query-scheduler": {
"count": 3,
"flavor": "b2-15"
},
"querier": {
"count": 3,
"flavor": "b2-15"
},
"store-gateway": {
"count": 3,
"flavor": "b2-60"
}
}
}
}
(The flavors are based on OVHcloud public cloud.)
I would also add the number of replicas per Mimir component. The output format should be configurable, ideally supporting:
Right. At Grafana Labs we call it "target capacity" and that should be another input factor too.
Scoping estimation high because this doc ticket is largely unactionable at its current stage in development.
Removing from Docs Squad backlog because @cristiangsp and @osg-grafana agree that it is in Engineering's hands.
We're hearing feedback from the OSS community (e.g. this Slack thread) that the capacity planning doc apparently shows more resources than are probably required. I think one reason is that proper capacity planning depends on multiple factors, while the doc is an oversimplification.
We could provide an interactive capacity planning tool where given some input (e.g. active series, samples/sec, queries/sec, retention, ...) we compute a more accurate capacity plan.
One option could be to build a Google Spreadsheet and embed it in the doc.
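To make the proposal more concrete, here is a minimal sketch in Go of what "given some input, compute a capacity plan" might look like. It is not the formula used by Mimir or by the capacity planning doc: the `Workload`, `Ratios`, `Plan` types and the `Estimate` function are hypothetical, and the per-replica ratios in `main` are placeholder numbers chosen only to make the arithmetic easy to follow. Retention, replication, and target capacity are deliberately left out.

```go
package main

import (
	"fmt"
	"math"
)

// Workload holds some of the inputs named in the proposal.
type Workload struct {
	ActiveSeries  float64
	SamplesPerSec float64
	QueriesPerSec float64
}

// Ratios is the load a single replica is assumed to handle.
// These are HYPOTHETICAL knobs, not Mimir recommendations.
type Ratios struct {
	SeriesPerIngester     float64
	SamplesPerDistributor float64
	QueriesPerQuerier     float64
}

// Plan is the estimated replica count per component.
type Plan struct {
	Ingesters    int
	Distributors int
	Queriers     int
}

// replicasFor rounds load/perReplica up and never returns less than one
// replica. It assumes perReplica is positive.
func replicasFor(load, perReplica float64) int {
	n := int(math.Ceil(load / perReplica))
	if n < 1 {
		return 1
	}
	return n
}

// Estimate applies the same ceiling division to each component.
func Estimate(w Workload, r Ratios) Plan {
	return Plan{
		Ingesters:    replicasFor(w.ActiveSeries, r.SeriesPerIngester),
		Distributors: replicasFor(w.SamplesPerSec, r.SamplesPerDistributor),
		Queriers:     replicasFor(w.QueriesPerSec, r.QueriesPerQuerier),
	}
}

func main() {
	// Placeholder workload and ratios, for illustration only.
	w := Workload{ActiveSeries: 2500000, SamplesPerSec: 100000, QueriesPerSec: 10}
	r := Ratios{SeriesPerIngester: 1000000, SamplesPerDistributor: 50000, QueriesPerQuerier: 20}

	fmt.Printf("%+v\n", Estimate(w, r)) // prints {Ingesters:3 Distributors:2 Queriers:1}
}
```

Keeping the ratios as explicit inputs rather than constants means the formulas can be reviewed and adjusted separately from the numbers, whichever form the tool ends up taking.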