Merged
59 changes: 26 additions & 33 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
[![codecov](https://codecov.io/gh/skjolber/json-log-filter/graph/badge.svg?token=8mCiHxVFbz)](https://codecov.io/gh/skjolber/json-log-filter)

# json-log-filter
High-performance filtering of to-be-logged JSON. Reads, filters and writes JSON in a single step - drastically increasing throughput. Typical use-cases:
High-performance filtering of JSON. Reads, filters and writes JSON in a single step - drastically increasing throughput. Typical use-cases:

* Filter sensitive values from logs (e.g. in request/response logging)
* technical details like passwords and so on
Expand All @@ -20,28 +20,21 @@ High-performance filtering of to-be-logged JSON. Reads, filters and writes JSON

Features:

* Mask single values or whole subtrees
* Remove single values or whole subtrees
* Truncate String values
* Truncate document size (max total output size)
* Remove whitespace (for pretty-printed documents)
* Anonymize single values or whole subtrees
* Remove whole subtrees
* Limit text value size
* Limit document size (skip end of document when max size is reached)
* Remove whitespace

The library contains multiple filter implementations to accommodate combinations of the above features with as little overhead as possible.

The equivalent filters are also implemented using [Jackson]:

* filter + verify document structure in the same operation
* allows dual filter setup:
* trusted (locally produced) JSON: fast filters without strict syntax validation
* untrusted (remotely produced) JSON: slower filter with strict syntax validation
The library contains multiple filter implementations to accommodate combinations of the above features with as little overhead as possible. No external dependencies are necessary.

Bugs, feature suggestions and help requests can be filed with the [issue-tracker].

## License
[Apache 2.0]

## Obtain
The project is built with [Maven] and is available on the central Maven repository.
The project is built with [Maven] and is available on the central Maven repository.

<details>
<summary>Maven coordinates</summary>
Expand Down Expand Up @@ -106,14 +99,13 @@ api("com.github.skjolber.json-log-filter:jackson:${jsonLogFilterVersion}")
</details>

# Usage
Use a `DefaultJsonLogFilterBuilder` or `JacksonJsonLogFilterBuilder` to configure a filter instance (all filters are thread safe):
Use a `DefaultJsonLogFilterBuilder` to configure a filter instance (all filters are thread safe):

```java
JsonFilter filter = DefaultJsonLogFilterBuilder.createInstance()
.withMaxStringLength(127) // cuts long texts
.withAnonymize("$.customer.email") // inserts ***** for values
.withAnonymize("$.customer.email") // inserts "*" for values
.withPrune("$.customer.account") // removes whole subtree
.withMaxPathMatches(16) // halt anon/prune after a number of hits
.withMaxSize(128*1024)
.build();

Expand All @@ -131,7 +123,7 @@ Configure max string length for output like
}
```

### Mask (anonymize)
### Anonymize (mask)
Configure anonymize for output like

```json
Expand Down Expand Up @@ -176,15 +168,15 @@ to output like
### Path syntax
A simple syntax is supported, where each path segment corresponds to a `field name`. Expressions are case-sensitive. Supported syntax:

/my/field/name
$.my.field.name

with support for wildcards;

/my/field/*
$.my.field.*

or a simple any-level field name search

//myFieldName
$..myFieldName

The filters within this library support using multiple expressions at once. Note that path expressions see through arrays, i.e. array levels do not count as path segments.
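
The three expression forms above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class `PathExpressionSketch` and the helpers `segments` and `matches` are hypothetical names introduced here, and the sketch assumes that `*` matches exactly one field name and that array levels are simply left out of the field path being matched. The prefix constants mirror the `$..` and `//` any-level prefixes used in this change.

```java
import java.util.Arrays;

// Illustrative sketch only; names and matching rules are assumptions, not the library API.
final class PathExpressionSketch {

    static final String ANY_PREFIX_SLASHES = "//";
    static final String ANY_PREFIX_DOTS = "$..";

    /** True for any-level expressions such as $..myFieldName or //myFieldName. */
    static boolean hasAnyPrefix(String expression) {
        return expression.startsWith(ANY_PREFIX_SLASHES) || expression.startsWith(ANY_PREFIX_DOTS);
    }

    /** Strips the any-level prefix, using the prefix length instead of a fixed offset. */
    static String removeAnyPrefix(String expression) {
        if (expression.startsWith(ANY_PREFIX_SLASHES)) {
            return expression.substring(ANY_PREFIX_SLASHES.length());
        }
        if (expression.startsWith(ANY_PREFIX_DOTS)) {
            return expression.substring(ANY_PREFIX_DOTS.length());
        }
        throw new IllegalArgumentException("Not an any-level expression: " + expression);
    }

    /** Splits a rooted expression such as $.my.field.name into its field-name segments. */
    static String[] segments(String expression) {
        if (!expression.startsWith("$.") || hasAnyPrefix(expression)) {
            throw new IllegalArgumentException("Not a rooted expression: " + expression);
        }
        return expression.substring(2).split("\\.");
    }

    /** Case-sensitive match of a field path; "*" is assumed to match any single field name. */
    static boolean matches(String[] segments, String[] fieldPath) {
        if (segments.length != fieldPath.length) {
            return false;
        }
        for (int i = 0; i < segments.length; i++) {
            if (!segments[i].equals("*") && !segments[i].equals(fieldPath[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(removeAnyPrefix("$..myFieldName"));           // myFieldName
        System.out.println(Arrays.toString(segments("$.my.field.*")));   // [my, field, *]
        System.out.println(matches(segments("$.my.field.*"),
                new String[] {"my", "field", "name"}));                  // true
    }
}
```

Deriving the offset from the prefix length keeps the stripping correct for both the two-character `//` prefix and the three-character `$..` prefix.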

Expand All @@ -194,7 +186,7 @@ Configure max path matches; so that filtering stops after a number of matches. T
For example, the to-be-filtered JSON document may have a header + body structure, with the target value in the header.

### Max size
Configure max size to limit the size of the resulting document. This reduces the size of the document by (silently) deleting the JSON content after the limit is reached.
Configure max size to limit the size of the resulting document.

### Metrics
Pass in a `JsonFilterMetrics` argument to the `process` method like so:
Expand All @@ -209,19 +201,20 @@ The resulting metrics could be logged as metadata alongside the JSON payload or
* Measuring the impact of the filtering, i.e. reduction in data size
* Make sure filters are actually operating as intended

## Performance
The `core` processors within this project are faster than the `Jackson`-based processors. This is expected as parser/serializer features have been traded for performance:
### Opt-in Jackson module
The filters have also been implemented using [Jackson], in an opt-in module.

* `core` is something like 3x-9x as fast as `Jackson` processors, where
* skipping large parts of JSON documents (prune) decreases the difference, and
* small documents increase the difference, as `Jackson` is more expensive to initialize.
* working directly on bytes is faster than working on characters for the `core` processors.
* filter + verify document structure in the same operation
* allows dual filter setup:
* trusted (locally produced) JSON: fast filters without strict syntax validation
* untrusted (remotely produced) JSON: slower filter with strict syntax validation

For a typical, light-weight web service, the overall system performance improvement for using the `core` filters over the `Jackson`-based filters will most likely be a few percent.
Configure filters from `JacksonJsonLogFilterBuilder`.

Memory use will be at 2-8 times the raw JSON byte size; depending on the invoked `JsonFilter` method (some accept string, other raw bytes or chars).
## Performance
This project trades parser/serializer features for performance, and runs multiple times faster than a "traditional" parser/writer approach (like when using Jackson).

See the benchmark results ([JDK 17](https://jmh.morethan.io/?source=https://raw.githubusercontent.com/skjolber/json-log-filter/master/benchmark/jmh/results/jmh-results-4.1.2.jdk17.json&topBar=off)) and the [JMH] module for running detailed benchmarks.
See the benchmark results ([JDK 25](https://jmh.morethan.io/?source=https://raw.githubusercontent.com/skjolber/json-log-filter/master/benchmark/jmh/results/jmh-results-5.0.0.jdk25.json&topBar=off)) and the [JMH] module for running detailed benchmarks.

There is also a [path](impl/path) artifact which helps facilitate per-path filters for request/response-logging applications, which should further improve performance.

Expand All @@ -234,7 +227,7 @@ Using SIMD for parsing JSON:

Alternative JSON filters:

* [json-masker](https://github.com/Breus/json-masker) (included in benchmark).
* [json-masker](https://github.com/Breus/json-masker) (included in some of the benchmarks).

[Apache 2.0]: https://www.apache.org/licenses/LICENSE-2.0.html
[issue-tracker]: https://github.com/skjolber/json-log-filter/issues
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ public AbstractMultiPathJsonFilter(int maxStringLength, int maxSize, int maxPath
for(int i = 0; i < prunes.length; i++) {
String prune = prunes[i];
if(hasAnyPrefix(prune)) {
String name = prune.substring(2);
String name = removeAnyPrefix(prune);
if(name.equals("*")) {
throw new IllegalArgumentException("Unexpected any match for *");
}
Expand All @@ -64,7 +64,7 @@ public AbstractMultiPathJsonFilter(int maxStringLength, int maxSize, int maxPath
for(int i = 0; i < anonymizes.length; i++) {
String anonymize = anonymizes[i];
if(hasAnyPrefix(anonymize)) {
String name = anonymize.substring(2);
String name = removeAnyPrefix(anonymize);
if(name.equals("*")) {
throw new IllegalArgumentException("Unexpected any match for *");
}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,7 @@ public int getType() {

protected static final String[] EMPTY = new String[]{};
protected static final String ANY_PREFIX_SLASHES = "//";
protected static final String ANY_PREFIX_DOTS = "..";
protected static final String ANY_PREFIX_DOTS = "$..";

public static final String STAR = "*";
protected static final char[] STAR_CHARS = STAR.toCharArray();
Expand Down Expand Up @@ -66,6 +66,16 @@ public static boolean hasAnyPrefix(String string) {
return string.startsWith(AbstractPathJsonFilter.ANY_PREFIX_SLASHES) || string.startsWith(AbstractPathJsonFilter.ANY_PREFIX_DOTS);
}

public static String removeAnyPrefix(String string) {
if(string.startsWith(AbstractPathJsonFilter.ANY_PREFIX_SLASHES)) {
return string.substring(2);
}
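if(string.startsWith(AbstractPathJsonFilter.ANY_PREFIX_DOTS)) {
// the dots prefix is now "$..", so strip its full length rather than a fixed 2 characters
return string.substring(AbstractPathJsonFilter.ANY_PREFIX_DOTS.length());
}
throw new IllegalArgumentException("Expected any-level prefix in " + string);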
}

/** strictly not needed, but necessary for testing */
protected final String[] anonymizes;
protected final String[] prunes;
Expand Down

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -111,8 +111,8 @@ public void testRegexpExpressions() {
public void testAnyPrefix() {
assertTrue(AbstractPathJsonFilter.hasAnyPrefix(new String[] {"//a"}));
assertFalse(AbstractPathJsonFilter.hasAnyPrefix(new String[] {"/a"}));
assertTrue(AbstractPathJsonFilter.hasAnyPrefix(new String[] {"..a"}));
assertFalse(AbstractPathJsonFilter.hasAnyPrefix(new String[] {".a"}));
assertTrue(AbstractPathJsonFilter.hasAnyPrefix(new String[] {"$..a"}));
assertFalse(AbstractPathJsonFilter.hasAnyPrefix(new String[] {"$.a"}));
}

@Test
Expand Down

This file was deleted.
