Helm charts enhancement #147

Closed
guidoiaquinti opened this issue Oct 6, 2021 · 5 comments
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request), helm (Helm chart work)

Comments

@guidoiaquinti
Contributor

👋 Hi! I’m going to list here a few random ideas, divided by topic, on how we could improve our Helm charts:

📈 Scaling

  1. we should support vertical and horizontal scaling of all our dependencies: Kafka, ClickHouse and PostgreSQL

    • vertical service scaling: this is usually the first mitigation in case of resource contention. It usually involves adding more CPU/memory/storage to a pod.

    • horizontal service scale: this is usually an operation that can take some time (depending on the dataset) and usually requires dataset partitioning/sharding and a cluster rebalance operation.

  2. related to ☝️ we should make sure we mount service data dir on top of resizable storage
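
To make the "resizable storage" point concrete, here is a minimal sketch of a check we could run against a cluster's manifests. It assumes the StorageClass and PersistentVolumeClaim objects have already been loaded as plain dicts (for example from `kubectl get sc,pvc -o json`); the function names and the sample object names are hypothetical. The only Kubernetes detail it relies on is the StorageClass field `allowVolumeExpansion`, which is treated as false when absent.

```python
from typing import Any, Mapping, Sequence


def can_expand_volume(storage_class: Mapping[str, Any]) -> bool:
    """Return True if this StorageClass manifest allows PVC expansion."""
    # `allowVolumeExpansion` is a StorageClass field; a missing field means no expansion.
    return bool(storage_class.get("allowVolumeExpansion", False))


def non_resizable_claims(
    claims: Sequence[Mapping[str, Any]],
    storage_classes: Mapping[str, Mapping[str, Any]],
) -> list[str]:
    """Return the names of PVCs that cannot be confirmed as resizable.

    `claims` is a list of PVC manifests and `storage_classes` maps a
    StorageClass name to its manifest. A PVC whose class is unknown here
    (including one relying on the cluster default) is reported as well,
    because this sketch cannot verify it.
    """
    blocked = []
    for pvc in claims:
        sc_name = pvc.get("spec", {}).get("storageClassName")
        sc = storage_classes.get(sc_name)
        if sc is None or not can_expand_volume(sc):
            blocked.append(pvc["metadata"]["name"])
    return blocked
```

Anything this returns would need its data moved to an expandable StorageClass before a vertical storage resize is possible without recreating the volume.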

🚨 Monitoring & Alerting

As part of the Helm charts, we should ship a basic monitoring/alerting stack. I know we already have some debugging information built into PostHog and we could probably extend that, but I don’t think it will cover most of the cases we might need (e.g. how can we troubleshoot a problem when a PostHog installation is down?)

📑 Documentation

We should document all the maintenance operations & alerts in a runbook.


Please share your ideas and I'll add them to this post. Thank you!

guidoiaquinti added the documentation and enhancement labels Oct 6, 2021
@macobo
Contributor

macobo commented Oct 6, 2021

Scaling & Documentation

Related issue: #129

As part of the helm charts, we should ship a basic monitoring/alerting stack.

This is kind of done (but undocumented) w/ the prometheus setup.

@guidoiaquinti
Contributor Author

guidoiaquinti commented Oct 6, 2021

This is kind of done (but undocumented) w/ the prometheus setup.

Do we ship basic alerts and related runbooks as well? These could have caught a few of the issues I've seen in the last few days.

@macobo
Contributor

macobo commented Oct 6, 2021

Here's what we ship by default around alerting: https://github.com/PostHog/charts-clickhouse/blob/main/charts/posthog/values.yaml#L592-L671

Runbooks I think would live alongside our documentation in the handbook w/ troubleshooting sections.

@tiina303
Contributor

tiina303 commented Oct 6, 2021

Priorities, in my view atm:

high:

  • vertical scaling for Kafka, ... (ticket Resizing Kafka on all platforms #146). I would say this is high priority: we keep seeing problems/questions, we have no docs, and the solution of nuking data isn't great.

mid:

  • alerting when PostHog is not working & making troubleshooting easier for users (there's a bunch of random questions in the user Slack about k8s basics & things like "plugin server got into a bad state, try a restart, ok that worked, great")
  • docs around maintenance operations/alerts

low:

  • horizontal scaling: we have HPA for most things and people use it. I haven't had many user questions about it; I just see them update the charts with improvements to it, so it seems to be working pretty well.

fuziontech added the helm label Jan 19, 2022
@guidoiaquinti
Contributor Author

In the last year we have implemented the majority of the improvements above. I'm going to close this issue as we are now tracking the remaining tasks individually.
