Tracking: Report user errors #7831
Comments
I hope I understand it right, the
Do we report the concrete error message or only the error count? If it's the former, is there a mechanism for persisting (or buffering) the error messages, or for throttling a burst of errors (for example, a wrong column descriptor makes the parser fail on every line)?
We will record the concrete error message if it has not been truncated. We will truncate only if the message is blacklisted (i.e. we do not know whether it can have unbounded cardinality) and if the number of unique blacklisted messages recorded exceeds some threshold (e.g. 50). Our hope is to recover the full, informative error message in the happy path. EDIT: truncation is no longer planned.
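For illustration only, the truncation rule described above could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's code: `ErrorRecorder`, `label`, and `TRUNCATED` are hypothetical names, the caller is assumed to already know whether a message is blacklisted, and the EDIT above notes that truncation was dropped from the plan.

```rust
use std::collections::HashSet;

/// Placeholder recorded instead of a truncated message (hypothetical value).
pub const TRUNCATED: &str = "<truncated>";

/// Hypothetical recorder for the policy sketched in the comment above.
pub struct ErrorRecorder {
    /// Maximum number of distinct blacklisted messages kept verbatim (e.g. 50).
    threshold: usize,
    /// Distinct blacklisted messages recorded verbatim so far.
    seen_blacklisted: HashSet<String>,
}

impl ErrorRecorder {
    pub fn new(threshold: usize) -> Self {
        Self {
            threshold,
            seen_blacklisted: HashSet::new(),
        }
    }

    /// Returns the text to record for one error.
    ///
    /// Messages that are not blacklisted are always recorded in full.
    /// A blacklisted message is recorded in full while the number of
    /// distinct blacklisted messages stays within the threshold; any
    /// further new blacklisted message is replaced by `TRUNCATED`.
    pub fn label(&mut self, message: &str, blacklisted: bool) -> String {
        if !blacklisted || self.seen_blacklisted.contains(message) {
            return message.to_string();
        }
        if self.seen_blacklisted.len() < self.threshold {
            self.seen_blacklisted.insert(message.to_string());
            return message.to_string();
        }
        TRUNCATED.to_string()
    }
}
```

With a threshold of 2, the first two distinct blacklisted messages are kept verbatim, a third distinct one is recorded as `<truncated>`, and messages that are not blacklisted are never affected. This bounds the number of distinct recorded messages while keeping the full message in the common case.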
Implements: #7824
- `ExprError` as `compute_error` in stream jobs (feat(stream): Report `compute_error_count` to prometheus #7832)
- `source_error_count` reporting to prometheus (#7877)
- `ErrorSuppressor` for user compute errors (#8132)
- `metrics, source_info` into `source_ctx` and add `ErrorSuppressor` for user source errors (#8156)
- `WARN` user about batch source errors (#8135)
- [ ] Truncate errors if blacklisted (Stream Error Truncation Mechanism #7871)

Orthogonal:
- `ExprError` (however, this might be quite hard, as the Debug fmt may not be clear for the nested BoxedExpression used in our executors)
- `Stream Errors` panel (as suggested here: streaming: report actor error #37)