Dashboard/monitoring to surface problems #639

Open
shlomi-nx opened this issue Feb 24, 2021 · 0 comments

Comments

@shlomi-nx
Contributor

Dashboard/monitoring to surface problems when running procedures and resolving intents

We started using Jaeger, which may be sufficient. Let's see whether it gives us everything we want. We want running statistics on:

  1. Success/failure rates of procedure invocations, perhaps broken down by failure type:
    1. e.g. detect a significant increase in failures of type "no free machines" on a particular procedure
  2. Time-to-completion of procedures, to detect when a procedure that normally takes N to complete suddenly takes significantly more or less than N
  3. Time-to-completion of SFS - same as procedures
  4. Stats on outcome of different SFS:
    1. success - successfully resolved the diff
    2. errors - exited with an exception
    3. unresolved - no exceptions but diff still outstanding
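
To make the requested statistics concrete, here is a rough sketch of how they could be computed from a list of finished runs. It is independent of Jaeger and is not an existing API: the `Run` record, its field names, and the three-standard-deviation threshold are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Dict, Optional, Sequence


@dataclass
class Run:
    """One finished procedure or SFS run (hypothetical record shape)."""
    name: str
    outcome: str                         # "success", "error", or "unresolved"
    duration_s: float
    failure_type: Optional[str] = None   # e.g. "no free machines"


def failure_rates(runs: Sequence[Run], name: str) -> Dict[str, float]:
    """Fraction of all runs of `name` that ended with each failure type."""
    selected = [r for r in runs if r.name == name]
    if not selected:
        return {}
    counts = Counter(r.failure_type for r in selected if r.failure_type)
    return {ftype: n / len(selected) for ftype, n in counts.items()}


def outcome_counts(runs: Sequence[Run], name: str) -> Counter:
    """Number of runs of `name` per outcome (success / error / unresolved)."""
    return Counter(r.outcome for r in runs if r.name == name)


def is_duration_outlier(history: Sequence[float], latest: float,
                        threshold: float = 3.0) -> bool:
    """True if `latest` is more than `threshold` standard deviations
    away from the mean of `history` (assumed threshold, tune as needed)."""
    if len(history) < 2:
        return False  # not enough data to estimate a spread
    sigma = stdev(history)
    if sigma == 0:
        return latest != history[0]
    return abs(latest - mean(history)) / sigma > threshold
```

A dashboard could evaluate `failure_rates` and `outcome_counts` over a sliding time window and raise an alert when a rate jumps, or when `is_duration_outlier` flags a new run against recent history for the same procedure or SFS.
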
@shlomi-nx shlomi-nx created this issue from a note in Customer-1's asks (Nice to have 1.0) Feb 24, 2021
@shlomi-nx shlomi-nx added this to the Customer-1 M3 milestone Feb 24, 2021