Implement persistence through application restart #32
Comments
This has been on my list as well. I've also entertained the idea of having another in-memory map that stores the last N outages for each service and dynamically generates a timeline from them, instead of trying to persist the entire history, i.e.
Though this wouldn't survive application restarts. But yeah, thanks for opening this issue.
Just a side note: Having a separate persistence layer (not in-memory) would be a step towards running Gatus in a high-availability setup, so that monitoring keeps working even when a host is down. In that case, the results of the requests performed by all instances would be shared, and the dashboard could show them all. Synchronizing the execution of requests and the sending of alerts would still be an open topic though 🙃
As discussed in #66 (comment), the first step towards this will be introducing file-based persistence. Later, other storage backends (e.g. databases) can be introduced if required.
I am taking a look at this with @cjheppell this afternoon
Posting part of the comment I posted in #69 here for traceability:
To continue on that comment, the implementation I went for does not persist the data immediately. It dumps the data to a file every 7 minutes. I had initially tried to implement it through bolt only, but I was very unsatisfied with the performance of a file-only approach, so I decided to use a hybrid: in memory + occasional persistence. I've renamed this issue to focus purely on persisting data to survive restarts, which in turn makes services with longer intervals more viable than before. The ability to go back in history will be added in the future, but I don't have a specific date yet. P.S. For those of you who want the ability to view older history over a long period right now as opposed to just persistence, it's not ideal, but you can probably leverage the
Released in v2.1.0
At the moment, it seems that the status results are stored in an in-memory map: https://github.com/TwinProduction/gatus/blob/3773f952a80058eb88f48fe9ae9ac51bf1c1efe7/watchdog/watchdog.go#L16
And also limited to only 20 results:
https://github.com/TwinProduction/gatus/blob/3773f952a80058eb88f48fe9ae9ac51bf1c1efe7/watchdog/watchdog.go#L59-L62
It'd be great if these were stored in a persistent data store instead (e.g. a database, or files on disk). Whilst Gatus currently only returns the last 20 results, it'd be nice to keep the history in order to review outages that occurred in the past. Storing the results in a database or other persistence layer would be the first step towards enabling this.
Of course, the option to retain results in memory should also be kept as it makes Gatus very easy to get up and running.