Gateway errors in production #443

Closed
esheehan-gsl opened this issue Nov 15, 2023 · 1 comment · Fixed by #446
Comments

@esheehan-gsl
Contributor

Describe the bug

We keep getting 502 Bad Gateway and 504 Gateway Timeout errors in production. The timeout typically happens on the time series endpoint when the application first loads. The request seems to trigger an out-of-memory error, which causes Kubernetes to kill the container. While Kubernetes is managing the containers, bad gateway errors start to appear across the entire application and all of the data endpoints.

@esheehan-gsl esheehan-gsl added the bug Something isn't working label Nov 15, 2023
@esheehan-gsl esheehan-gsl added this to the Cycle 2023.5 milestone Nov 15, 2023
@esheehan-gsl esheehan-gsl self-assigned this Nov 15, 2023
@esheehan-gsl
Contributor Author

The working theory is that the history endpoint runs out of RAM because we place no limit on how far back we pull data. We end up loading all of the data in the store, and the store only grows, so this endpoint will require more and more RAM over time.

@esheehan-gsl esheehan-gsl linked a pull request Nov 16, 2023 that will close this issue
esheehan-gsl added a commit that referenced this issue Nov 16, 2023
Limit the amount of data read in for the historical data to just two
weeks prior to the initialization time. This should reduce memory usage
in production and allow the application to continue working, solving
#443 (I hope).

In future, we may make this range configurable by users, instead of
hard-coding a two week limit.
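
For illustration only, here is a minimal sketch of the idea described in the commit message: bounding the historical read to the two weeks before the initialization time. It is not the code from the linked pull request. The names `Record`, `HISTORY_WINDOW`, `history_window`, and `limit_history` are hypothetical, and the sketch assumes each record carries a `timestamp` field.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator, NamedTuple, Tuple

# Hypothetical default: the hard-coded two week limit from the commit message.
HISTORY_WINDOW = timedelta(weeks=2)


class Record(NamedTuple):
    """Hypothetical shape of one historical data point."""
    timestamp: datetime
    value: float


def history_window(
    initialization_time: datetime, window: timedelta = HISTORY_WINDOW
) -> Tuple[datetime, datetime]:
    """Return the (start, end) bounds of the history to read."""
    return initialization_time - window, initialization_time


def limit_history(
    records: Iterable[Record],
    initialization_time: datetime,
    window: timedelta = HISTORY_WINDOW,
) -> Iterator[Record]:
    """Lazily yield only the records inside the window (bounds inclusive)."""
    start, end = history_window(initialization_time, window)
    for record in records:
        if start <= record.timestamp <= end:
            yield record
```

Filtering an iterable like this only helps if the records are streamed rather than loaded up front. In a real data store the same `start` and `end` bounds would ideally be passed to the read itself, so that older data never reaches memory. Taking `window` as a parameter leaves room for the user-configurable range mentioned above.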
Projects
Status: Done