feat: System Health Report #26046

Merged
merged 20 commits into from Apr 22, 2024

ankush
Member

@ankush ankush commented Apr 18, 2024

Screenshot speaks louder than words.

[Image: Screen Shot 2024-04-19 at 17 13 16-fullpage]

TODO:

  • perf / caching (removed most slow functions; should load in <30 seconds in most cases; can split into separate small requests in future if required)
  • error handling
  • tabs -> section (less clicking around)
  • highlight "bad" indicators
  • constantly failing scheduled jobs (can reveal things like broken backup, repost etc)
  • highlight errors with >10 occurrences
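
The last two TODO items can be sketched roughly as below. This is a minimal illustration, not the code from this PR: the function names, the `min_runs` parameter, and the `"Failed"` status string are all assumptions made for the example; only the ">10 occurrences" threshold comes from the list above.

```python
from collections import Counter

# Threshold taken from the ">10 occurrences" TODO item above.
ERROR_HIGHLIGHT_THRESHOLD = 10


def frequent_errors(error_titles, threshold=ERROR_HIGHLIGHT_THRESHOLD):
    """Return (title, count) pairs seen strictly more than `threshold` times.

    `error_titles` is a hypothetical flat list of error titles; the result
    is ordered from most to least frequent.
    """
    counts = Counter(error_titles)
    return [(title, n) for title, n in counts.most_common() if n > threshold]


def constantly_failing_jobs(job_runs, min_runs=3):
    """Return sorted names of jobs whose last `min_runs` runs all failed.

    `job_runs` is a hypothetical mapping of job name -> list of status
    strings, oldest first. Jobs with fewer than `min_runs` recorded runs
    are skipped, since there is not enough history to call them
    "constantly" failing.
    """
    failing = []
    for name, statuses in job_runs.items():
        recent = statuses[-min_runs:]
        if len(recent) == min_runs and all(s == "Failed" for s in recent):
            failing.append(name)
    return sorted(failing)
```

Checking only the most recent runs, rather than the overall failure rate, is one way to surface jobs that are broken right now (for example a backup that stopped working) without flagging jobs that failed once long ago.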

Closes #25486

@ankush ankush added the Skip CI Doesn't run Ci for this PR. label Apr 19, 2024
@ankush ankush removed the Skip CI Doesn't run Ci for this PR. label Apr 19, 2024
@ankush ankush marked this pull request as ready for review April 19, 2024 11:44
@ankush ankush requested a review from a team as a code owner April 19, 2024 11:44
@ankush ankush requested review from akhilnarang and removed request for a team April 19, 2024 11:44
@rmehta
Member

rmehta commented Apr 19, 2024

Nice, maybe add some emphasis or colour for measures that seem out of range?

@ankush
Member Author

ankush commented Apr 19, 2024

@rmehta red borders are for that 🙈

@rmehta
Member

rmehta commented Apr 19, 2024

@ankush ah - that is mixed up with mandatory (value missing). Maybe try something else, like text in bold and a pink / yellow background?

@ankush ankush requested a review from a team April 19, 2024 12:23
- fix styles
- hardcode perm check
- few more indicators
- cache directory size for 5 min (rapid refreshes should be fast enough)
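
The "cache directory size for 5 min" note describes a time-based cache. A minimal sketch of that idea in plain Python follows; it is not the PR's implementation (Frappe has its own caching utilities), and the function names, the module-level dict, and the injectable `now` argument are assumptions made for illustration.

```python
import os
import time

CACHE_TTL_SECONDS = 5 * 60  # "5 min" from the commit note above

# Hypothetical in-process cache: path -> (timestamp, size_in_bytes)
_size_cache = {}


def directory_size(path):
    """Walk `path` and sum the sizes of regular files, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # File vanished or is unreadable; ignore it.
                pass
    return total


def cached_directory_size(path, now=None, ttl=CACHE_TTL_SECONDS):
    """Return the directory size, recomputing only after `ttl` seconds.

    `now` can be passed explicitly to make the expiry logic testable;
    by default a monotonic clock is used.
    """
    now = time.monotonic() if now is None else now
    hit = _size_cache.get(path)
    if hit is not None and now - hit[0] < ttl:
        return hit[1]
    size = directory_size(path)
    _size_cache[path] = (now, size)
    return size
```

With a cache like this, rapid refreshes of the report reuse the stored value and only the first request after expiry pays for the directory walk.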
@git-avc
Contributor

git-avc commented Apr 22, 2024

Maybe something like this?
image

@ankush ankush disabled auto-merge April 22, 2024 07:07
@ankush ankush enabled auto-merge April 22, 2024 07:09
@ankush ankush merged commit 4b705fc into frappe:develop Apr 22, 2024
22 of 24 checks passed
@ankush ankush deleted the system_helth branch April 22, 2024 07:15
@ankush ankush added the backport version-15-hotfix Backport the PR to v15 label Apr 22, 2024
ankush added a commit that referenced this pull request Apr 22, 2024
* feat: System Health Report

(cherry picked from commit e06901a)

* feat: background worker monitoring

(cherry picked from commit d410af7)

* feat: better bench doctor in UI

(cherry picked from commit d7a0ed8)

* feat: socketio health check

(cherry picked from commit 023297b)

# Conflicts:
#	realtime/handlers/frappe_handlers.js

* feat: email health checks

(cherry picked from commit 2df9e2e)

* feat: Errors in System Health

(cherry picked from commit 7bfa31f)

* feat: database health stats

(cherry picked from commit b9ed8c5)

* feat: cache health

(cherry picked from commit 92dc5f3)

* feat: backup health

(cherry picked from commit 614857e)

* feat: system health - users

(cherry picked from commit 5b70060)

* refactor: Single page instead of tabs

(cherry picked from commit 99d2dea)

* feat: background jobs test

(cherry picked from commit 7411c4f)

* fix: exception handling for health report

(cherry picked from commit cbf4351)

* chore: rename child doctypes

(cherry picked from commit a94534a)

* feat: highlight bad indicators

(cherry picked from commit 4f406d7)

* fix(UX): help links and relative URLs

also closes #23020

(cherry picked from commit d40b2a2)

* feat: extend highlight to child tables

(cherry picked from commit b0ce404)

* refactor: use table for errors

(cherry picked from commit 9154e42)

* feat: failng scheduled jobs

(cherry picked from commit c712780)

* refactor: misc

- fix styles
- hardcode perm check
- few more indicators
- cache directory size for 5 min (rapid refreshes should be fast enough)

(cherry picked from commit c9a8cd6)

* chore: conflicts

---------

Co-authored-by: Ankush Menat <ankush@frappe.io>
@ankush
Member Author

ankush commented Apr 22, 2024

I changed it to a light red background (almost pink) + bold fonts.

Not updating the screenshot; producing fake data for it takes some time 🥴

frappe-pr-bot pushed a commit that referenced this pull request Apr 23, 2024
# [15.24.0](v15.23.0...v15.24.0) (2024-04-23)

### Bug Fixes

* 🐛 don't create __init__.py files when gathering pages ([#26045](#26045)) ([#26091](#26091)) ([285a30f](285a30f)), closes [#25167](#25167)
* allow setting dynamic filters for number cards even without developer mode ([8811e82](8811e82))
* Avoid permission check on unsaved doc ([#26027](#26027)) ([#26031](#26031)) ([334d353](334d353))
* dashboard link number color for timeless night ([#26058](#26058)) ([d6a060d](d6a060d))
* datepicker time row color for timeless night ([#26077](#26077)) ([ef2f3e2](ef2f3e2))
* filter select perm in get_doctypes_with_read (backport [#26037](#26037)) ([#26040](#26040)) ([2d7d38e](2d7d38e))
* filters on prepared report export ([627a0ed](627a0ed))
* **grid_row:** check child table dependent properties whenever a row is selected ([6ec64a8](6ec64a8))
* **grid:** ensure that `doc` itself is not null ([b4c9d40](b4c9d40)), closes [#25800](#25800)
* increase report limit ([#26102](#26102)) ([#26104](#26104)) ([8706dd8](8706dd8))
* limit select user to desk users by default ([#25843](#25843)) ([#25996](#25996)) ([374c75c](374c75c))
* only notify for modified greater than DB ([#26070](#26070)) ([#26071](#26071)) ([224d8aa](224d8aa))
* register faulthandler on true stderr only (backport [#26028](#26028)) ([#26034](#26034)) ([bb0f1be](bb0f1be))
* **report_view:** allow exporting all rows even if count is disabled ([0f65a23](0f65a23))
* **resolver:** handle werkzeug redirect exception ([3f9b5f3](3f9b5f3))
* runtime error during pot build ([#25991](#25991)) ([#25992](#25992)) ([58a133b](58a133b))
* strip redirect URIs for trailing whitespaces ([#26006](#26006)) ([#26008](#26008)) ([d543dd3](d543dd3))
* unknown charset windows-874 problem on incoming mail ([14e1a31](14e1a31))

### Features

* add copy to clipboard on read only code fields ([62f09f7](62f09f7))
* enable dynamic filters for standard number cards ([ba2f70a](ba2f70a))
* gettext translations (v15 port) ([#25982](#25982)) ([0189bb2](0189bb2))
* System Health Report (backport [#26046](#26046)) ([#26085](#26085)) ([7b8a923](7b8a923))
* What's New ([#25986](#25986)) ([fc0ab40](fc0ab40))
@ankush ankush added the backport version-14-hotfix backport to version 14 label Apr 30, 2024
ankush added a commit that referenced this pull request May 1, 2024
* feat: System Health Report

(cherry picked from commit e06901a)

* feat: background worker monitoring

(cherry picked from commit d410af7)

* feat: better bench doctor in UI

(cherry picked from commit d7a0ed8)

* feat: socketio health check

(cherry picked from commit 023297b)

# Conflicts:
#	realtime/handlers.js

* feat: email health checks

(cherry picked from commit 2df9e2e)

* feat: Errors in System Health

(cherry picked from commit 7bfa31f)

* feat: database health stats

(cherry picked from commit b9ed8c5)

* feat: cache health

(cherry picked from commit 92dc5f3)

* feat: backup health

(cherry picked from commit 614857e)

* feat: system health - users

(cherry picked from commit 5b70060)

* refactor: Single page instead of tabs

(cherry picked from commit 99d2dea)

* feat: background jobs test

(cherry picked from commit 7411c4f)

* fix: exception handling for health report

(cherry picked from commit cbf4351)

* chore: rename child doctypes

(cherry picked from commit a94534a)

* feat: highlight bad indicators

(cherry picked from commit 4f406d7)

* fix(UX): help links and relative URLs

also closes #23020

(cherry picked from commit d40b2a2)

* feat: extend highlight to child tables

(cherry picked from commit b0ce404)

* refactor: use table for errors

(cherry picked from commit 9154e42)

* feat: failng scheduled jobs

(cherry picked from commit c712780)

* refactor: misc

- fix styles
- hardcode perm check
- few more indicators
- cache directory size for 5 min (rapid refreshes should be fast enough)

(cherry picked from commit c9a8cd6)

* chore: v14 compat

* chore: strip types

* chore: v14 compat

* chore: v14 compat

* fix: redis cache isn't required

---------

Co-authored-by: Ankush Menat <ankush@frappe.io>
frappe-pr-bot pushed a commit that referenced this pull request May 7, 2024
# [14.74.0](v14.73.0...v14.74.0) (2024-05-07)

### Bug Fixes

* Apply configured perms on address list ([#26334](#26334)) ([#26335](#26335)) ([4307ab4](4307ab4))
* args is a stringified JSON ([98ece0e](98ece0e))
* changes for scheduler reliability (backport [#26292](#26292)) ([#26293](#26293)) ([7691afe](7691afe))
* **Data Import:** don't rely on permission for Data Import Log (backport [#26228](#26228)) ([#26250](#26250)) ([fd0a844](fd0a844))
* **Data Import:** scheduler not needed in dev mode (backport [#24667](#24667)) ([#26264](#26264)) ([9712f14](9712f14))
* disabled user login from login via link feature ([#26134](#26134)) ([#26140](#26140)) ([96b7542](96b7542))
* don't add creation index if one exists ([#26295](#26295)) ([#26297](#26297)) ([c74dcbd](c74dcbd))
* **Geo:** change Canadian dates to ISO 8601 format ([351cd04](351cd04))
* init db conn for unbuffered cursor if not set ([#26220](#26220)) ([#26256](#26256)) ([04afefb](04afefb))
* lstrip for query writes detection ([#26180](#26180)) ([#26252](#26252)) ([6ebfe54](6ebfe54))
* multistep webform page navigation ([d5a25f2](d5a25f2))
* **Navbar Settings:** reload page after save ([#26274](#26274)) ([#26275](#26275)) ([73f265b](73f265b))
* **oauth2:** refresh token is optional ([#26266](#26266)) ([#26271](#26271)) ([d6603c6](d6603c6)), closes [www.rfc-editor.org/rfc/rfc6749#section-5](https://www.rfc-editor.org/rfc/rfc6749#section-5)
* only redirect to same domain (backport [#26304](#26304)) ([#26305](#26305)) ([c2f2d6c](c2f2d6c))
* perm query for dashboard (backport [#26239](#26239)) ([#26242](#26242)) ([4ab6a46](4ab6a46))
* reportview average of ints should be float (backport [#26284](#26284)) ([#26287](#26287)) ([c0f3912](c0f3912))
* Treeview DB lookup should perform the same preparation operations as method update_nsm in file nestedset.py ([#26199](#26199)) ([#26259](#26259)) ([01e08f8](01e08f8))

### Features

* `Desk User` role (backport [#22224](#22224)) ([#26237](#26237)) ([171e1d0](171e1d0))
* System Health Report (backport [#26046](#26046)) ([#26255](#26255)) ([f2d2d0c](f2d2d0c))

### Performance Improvements

* Reduce 1 redis call while dumping monitor logs (backport [#26337](#26337)) ([#26338](#26338)) ([75b2a86](75b2a86))
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 15, 2024
Labels
backport version-14-hotfix backport to version 14 backport version-15-hotfix Backport the PR to v15
Development

Successfully merging this pull request may close these issues.

feat: automated system health checks
3 participants