Stream Error Truncation Mechanism #7871

Closed
jon-chuang opened this issue Feb 13, 2023 · 1 comment
Labels
good first issue Good for newcomers help wanted Issues that need help from contributors type/feature

Comments

Contributor

jon-chuang commented Feb 13, 2023

A possible truncation strategy:

  1. Choose a target number of truncated messages, TARGET_NUM.

Objective: choose at most TARGET_NUM prefixes such that:

  1. every element seen so far has one of the chosen prefixes as a prefix, and the minimum length of the chosen prefixes is maximized.

Example 1: The messages share no common prefix. Prefixes found: every element in full until capacity is reached, then "" for the rest.
Example 2:

TARGET_NUM = 2. 
data = ["parse error: at position 3", "parse error: at position 3", "parse error: at position 4", "parse error: at position 4", "network error: Port 5464 is unavailable"]. 
Prefixes found: ["parse error: at position ", "network error: Port 5464 is unavailable"]. 
Resultant errors: ["parse error: at position <truncated>", "parse error: at position <truncated>", "parse error: at position <truncated>", "parse error: at position <truncated>", "network error: Port 5464 is unavailable"]
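
To make Example 2 concrete, here is a minimal Python sketch of how a fixed set of prefixes could be applied to incoming messages. The function name `truncate` and the `TRUNCATED` marker constant are illustrative assumptions, not names from any existing codebase.

```python
TRUNCATED = "<truncated>"  # assumed marker, taken from the example output


def truncate(message, prefixes):
    """Rewrite `message` using the longest prefix in `prefixes` that matches it."""
    matches = [p for p in prefixes if message.startswith(p)]
    if not matches:
        # No prefix covers the message: truncate it entirely.
        return ""
    best = max(matches, key=len)
    # A prefix equal to the whole message means nothing was cut off.
    return message if best == message else best + TRUNCATED


data = [
    "parse error: at position 3",
    "parse error: at position 3",
    "parse error: at position 4",
    "parse error: at position 4",
    "network error: Port 5464 is unavailable",
]
prefixes = ["parse error: at position ", "network error: Port 5464 is unavailable"]

for message in data:
    print(truncate(message, prefixes))
```

Each parse error is reported with the shared prefix followed by the marker, while the network error is kept whole because its prefix is the entire message.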

Data structure: a prefix tree (trie) with a capacity of e.g. 15 unique prefixes.

When a new error string arrives and the capacity is not filled, we add it to the prefix tree. If the capacity is filled and the value is new:

  1. Check whether the new value shares a non-empty prefix with an entry in the prefix tree. If there is a complete match, do nothing.
  2. Otherwise, find the longest partial match. Remove all children of the partial match and make the partial match a leaf. Truncate the message to the partial match.
  3. If there is no partial match, free up capacity: find the longest path that has at least 2 descendants and truncate those descendants to that path. Then add the new element as a leaf. If no such path exists, do not add the new element to the tree, and truncate the message to "".

When we cannot accept new non-empty entries to the prefix tree (every prefix is unique), we will simply truncate the incoming message entirely.
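
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: a flat set of prefixes stands in for the prefix tree, matching is done per character, and the names `PrefixTruncator`, `observe`, and `TRUNCATED` are hypothetical.

```python
import os

TRUNCATED = "<truncated>"  # assumed marker, taken from the example output


class PrefixTruncator:
    """Keeps at most `capacity` prefixes; a flat set stands in for the trie."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.prefixes = set()

    def _render(self, prefix, message):
        # A stored entry equal to the message is reported as-is.
        return message if prefix == message else prefix + TRUNCATED

    def _collapse(self, prefix):
        # Replace every stored entry below `prefix` with `prefix` itself.
        self.prefixes = {p for p in self.prefixes if not p.startswith(prefix)}
        self.prefixes.add(prefix)

    def observe(self, message):
        # Complete match: a stored prefix already covers the message.
        covering = [p for p in self.prefixes if message.startswith(p)]
        if covering:
            return self._render(max(covering, key=len), message)

        # Capacity not filled: store the message in full.
        if len(self.prefixes) < self.capacity:
            self.prefixes.add(message)
            return message

        # Step 2: longest partial match against any stored entry.
        partial = max(
            (os.path.commonprefix([message, p]) for p in self.prefixes),
            key=len,
            default="",
        )
        if partial:
            self._collapse(partial)
            return self._render(partial, message)

        # Step 3: merge the stored entries that share the longest prefix.
        # After sorting, the longest shared prefix is between two neighbours.
        ordered = sorted(self.prefixes)
        shared = max(
            (os.path.commonprefix([a, b]) for a, b in zip(ordered, ordered[1:])),
            key=len,
            default="",
        )
        if shared:
            self._collapse(shared)
            self.prefixes.add(message)
            return message

        # Nothing can be merged: truncate the message entirely.
        return ""
```

Run on the data from Example 2 with `capacity=2`, this sketch ends with the two prefixes listed there. Because it works online, messages that arrive before a merge are still reported in full; only later messages under the merged prefix come back with the marker.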

@github-actions github-actions bot added this to the release-0.1.17 milestone Feb 13, 2023
@jon-chuang jon-chuang added the help wanted Issues that need help from contributors label Feb 13, 2023
Contributor Author

jon-chuang commented Feb 14, 2023

We have decided against this approach in favor of a simple error suppression that limits the number of unique error messages to e.g. 10 for the last hour.
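
The issue does not describe how that suppression is implemented. One possible reading, sketched in Python: remember when each distinct message was first seen, forget entries older than the window, and suppress any new message once the limit is reached. The class name `ErrorSuppressor`, its parameters, and the first-seen expiry rule are all assumptions made for illustration.

```python
import time


class ErrorSuppressor:
    """Reports at most `max_unique` distinct messages per sliding window."""

    def __init__(self, max_unique=10, window_secs=3600.0, clock=time.monotonic):
        self.max_unique = max_unique
        self.window_secs = window_secs
        self.clock = clock  # injectable so the window can be tested
        self.first_seen = {}  # message -> time it entered the window

    def should_report(self, message):
        now = self.clock()
        # Forget messages that entered the window too long ago.
        self.first_seen = {
            m: t for m, t in self.first_seen.items() if now - t < self.window_secs
        }
        if message in self.first_seen:
            return True  # already one of the tracked messages
        if len(self.first_seen) < self.max_unique:
            self.first_seen[message] = now
            return True
        return False  # limit reached: suppress this new message
```

A caller would check `should_report(msg)` before emitting each error and drop the message when it returns `False`.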

@jon-chuang jon-chuang closed this as not planned Feb 14, 2023