feat(new transform): Add ansi stripper transform #1188
lukesteensen merged 1 commit into vectordotdev:master
Given a log field with ansi escape sequences such as:
\x1b[32mhello\x1b[m world
The escape sequences are stripped from the string:
hello world
This transform only works on string-type fields; any other field type
will result in a warning and an unchanged value.
Signed-off-by: Jean Mertz <jean@mertz.fm>
owning_ref = "0.4.0"
listenfd = "0.3.3"
inventory = "0.1"
strip-ansi-escapes = "0.1.0"
Not sure if we want to add this dependency. It adds a minimal amount of extra dependencies, and it gets the job done.
Alternatively, we could do the escape sequence parsing ourselves, but it's a bit more complex than writing a simple regular expression, so we'd probably want to write a tiny parser if we go that route.
Another option would be to simplify the implementation and only account for the most common "CSI" escape sequence patterns.
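To make the "CSI only" option concrete, here is a minimal sketch using just the standard library. The function name `strip_csi` is hypothetical and not part of this PR. It assumes a CSI sequence is ESC, `[`, any run of bytes in 0x20..=0x3f, and one final byte in 0x40..=0x7e, which is looser than the standard's ordering of parameter and intermediate bytes. Every other escape sequence family is left in place.

```rust
/// Removes CSI sequences (ESC '[' ... final byte) from `input`.
/// Hypothetical helper: other escape sequence families (OSC, two-byte
/// escapes, 8-bit C1 controls) are deliberately left untouched.
fn strip_csi(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(input.len());
    let mut i = 0;
    while i < input.len() {
        if input[i] == 0x1b && input.get(i + 1) == Some(&b'[') {
            // Skip parameter and intermediate bytes (accepted in any order here).
            let mut j = i + 2;
            while j < input.len() && (0x20..=0x3f).contains(&input[j]) {
                j += 1;
            }
            // A final byte in 0x40..=0x7e terminates the sequence.
            if j < input.len() && (0x40..=0x7e).contains(&input[j]) {
                i = j + 1;
                continue;
            }
            // Incomplete sequence: fall through and keep the bytes unchanged.
        }
        out.push(input[i]);
        i += 1;
    }
    out
}
```

With the example from the PR description, `strip_csi(b"\x1b[32mhello\x1b[m world")` yields the bytes of `hello world`. A truncated sequence such as `\x1b[32` passes through unchanged.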
This definitely seems worthwhile to me, especially as a pretty light dependency.
        field = self.field.as_ref(),
    ),
    Some(ValueKind::Bytes(ref mut bytes)) => {
        *bytes = match strip_ansi_escapes::strip(bytes.clone()) {
A downside to this is that the clone happens for all strings, even those that don't have any escape sequences.
One (initial) solution could be to match the bytes for the \x1b[ pattern and only then strip escape sequences, but that would only cover a subset of possible escape sequences.
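One way to sketch that pre-check is to scan for the bare ESC byte (0x1b) rather than the two-byte `\x1b[` prefix, and hand back a borrowed slice when there is nothing to strip. The helper name `strip_if_escaped` and the closure parameter are assumptions made for illustration, not code from this PR. The check still misses the 8-bit C1 forms (for example 0x9b as a single-byte CSI), so it remains a heuristic.

```rust
use std::borrow::Cow;

/// The ESC byte that starts 7-bit escape sequences.
const ESC: u8 = 0x1b;

/// Runs `strip` only when `input` contains an ESC byte.
/// Hypothetical helper: input without ESC is returned borrowed,
/// so no copy or allocation happens on that path.
fn strip_if_escaped<'a, F>(input: &'a [u8], strip: F) -> Cow<'a, [u8]>
where
    F: FnOnce(&[u8]) -> Vec<u8>,
{
    if input.contains(&ESC) {
        // Slow path: delegate to the actual stripping routine.
        Cow::Owned(strip(input))
    } else {
        // Fast path: nothing to strip, borrow the original bytes.
        Cow::Borrowed(input)
    }
}
```

The `strip` argument can be any function that maps bytes to stripped bytes. Whether the extra scan pays off depends on how often fields actually contain escape sequences, which would need profiling.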
I think this is a good place to start. Down the line, if there's a need for more performance, we can make a pass at profiling and optimizing.
#[test]
fn ansi_stripper_transform() {
    assert_foo_bar![
This tests all potential escape sequences as described in http://ascii-table.com/ansi-escape-sequences.php.
Very nice! This looks great.
How do we use it? I can't find it listed here: https://vector.dev/docs/reference/configuration/transforms/
This is not listed on the website because the transform was deleted. I would use
@pront Perfect, thanks
This does not yet add any documentation or change any benchmarks or integration tests.
Closes #908