New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Notify users of shaper "misses" #2776
Comments
This is an auto-generated commit with a zq dependency update. The zq PR brimdata/zed#2370, authored by @philrz, has been merged. Add tunnel fields to Suricata shaper brimdata/zed#2369 describes a problem where missing NDJSON typing config for rarely-seen Suricata fields generated a user-facing warning. This PR adds the additional config to the Suricata shaper to cover these fields. Our switchover from legacy `types.json` to the new shaping configs in the time since the user reported the issue actually changes the symptom, so before I show the fix having its impact, I'll take a step back and show how the symptom was looking as of Brim commit 8569148 which pointed at `zqd` commit 8ab4607. In that case, because of the change in brimdata/zed#2309, the user no longer was shown a warning, but the additional fields were silently imported with inferred types. So for instance, notice how the `tunnel` fields are shown in black font that's used for strings, rather than the blue that's used for the `ip` type fields. ![](https://user-images.githubusercontent.com/5934157/111556887-d2dc9a80-8748-11eb-9a86-329b652f1c1a.png) More importantly, if the user doesn't notice this subtle difference and tries to treat them like the IP addresses they otherwise _appear_ to be, they get unexpected behavior, such as this failing CIDR match. ![](https://user-images.githubusercontent.com/5934157/111557022-1afbbd00-8749-11eb-9e7e-8258739777fe.png) Of course, if they catch the problem and do the cast at query time, they can fix it. But this is what we want the shaper to do, hence the changes in this PR. ![](https://user-images.githubusercontent.com/5934157/111557058-2b139c80-8749-11eb-9ca2-746635067188.png) If I start from the same Brim commit 8569148 and advance my `zq` pointer to 6613aba to use the shaper in this branch, then re-import, now we get the correct typing out of the gate. ![](https://user-images.githubusercontent.com/5934157/111557351-bc830e80-8749-11eb-897c-b43862ed9d54.png) ![](https://user-images.githubusercontent.com/5934157/111557392-cf95de80-8749-11eb-9d99-144e8479d31a.png) A couple additional details: 1. The sample data I'm using from brimdata/zed#2369 doesn't happen to generate the `tunnel.src_port` and `tunnel.dest_port` fields in the Suricata EVE JSON. However, since the original user's issue showed these being generated, I've included them here in the shaping config. 2. Since we've been through so many example pcaps before seeing Suricata generates this field, some may wonder if it makes sense to make the current single schema even wider to always make room for them as opposed to letting multiple shapes get created. However, we've had users recently be confused by the disappearing columns that result from unexpected multiple shapes (#967). Therefore, until we have new UX in Brim that makes it easier for users to browse their data in such situations and/or also apply+modify their own shaper modifications, I'm keen to maintain the single schema. 3. While this now covers us for all Suricata fields we're aware of for `alert` events, the fact the user was not being informed these `tunnel` fields were being imported with inferred types seems hazardous. I've opened separate issue brimsec/brim#1519 regarding this, as it seems like something we'll need to address in the wider context of shaper UX. Fixes brimdata/zed#2369
@mccanne had a slightly different take when reviewing this one. In addition to "crop" or "infer" he'd also imagined yet another approach where shaper "misses" are still stored in the lake but are marked as "unclassified" (e.g. tagged in a way that reflects that it was not a perfect match for a shaper rule and hence might need to be dealt with). This seems to imply workflows where users could periodically browse the unclassified data in their lake and re-shape/re-store it, delete it, etc. That said, I still imagine users may want to know ASAP if this has happened so they're aware that such data is present to be browsed, so the idea of notifications still seems to apply. |
First-class errors have been a part of the language for some time and provide an approach similar to the one attributed to @mccanne in the last comment above. Linked PR #4686 added some user-facing docs that describe how to make use of Zed errors to get the kind of notice of shaper "misses" as described in this issue. The docs show multiple approaches based on how severe the user thinks a shaper "miss" is for their use case, i.e., whether an entire record becomes an error value or just individual problematic fields. The subject of how user could be "notified" when such errors accumulate is fairly orthogonal, but mechanisms already exist such as running frequent |
#2369 and its fix in #2370 describe an example where the shaper currently imports additional fields that are not covered by the shaping config, assigning them inferred types. In that case it resulted in an IP address being stored as a string rather than an
ip
type, which has the follow-on effect of type-specific operations not working (e.g. CIDR matches in this case).This behavior was last visited in #2309. Before that change, such additional fields would have been "cropped". After that change, now they're let in with inferred types. If choosing between these two options, indeed the current "infer" approach seems slightly better, but from a user perspective, they both seem sub-optimal: Whether the unexpected fields are lost entirely or stored incorrectly, I imagine the user will want to know what happened so they can adjust their queries, ask us for help, etc. Before #2298 was merged, the legacy
types.json
approach at least gave the user a heads-up in the form of a warning ("Unexpected additional field(s) 'tunnel.src_ip' 'tunnel.src_port' 'tunnel.dest_ip' 'tunnel.dest_port' 'tunnel.proto' 'tunnel.depth'. Please email this error message to support@brimsecurity.com."
) which is how we got a heads-up about #2369 in the first place. But now we've lost that, hence this issue.It seems the first order of business will be to just have the warnings generated by
zqd
and presented to the user in Brim. I expect the ultimate solution will be when we have a fairly rich shaper UX that will allow the user to react to the warnings themselves, which means them being able to see/modify/apply their own shaper configs. But as long as they're stuck with static shapers that we provide, some kind of notification like we had before feels like it would be an improvement.The text was updated successfully, but these errors were encountered: