Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't require leading | for Zed that's applied after inclusions #2584

Open
philrz opened this issue Apr 19, 2021 · 1 comment
Open

Don't require leading | for Zed that's applied after inclusions #2584

philrz opened this issue Apr 19, 2021 · 1 comment

Comments

@philrz
Copy link
Contributor

philrz commented Apr 19, 2021

Repro is with zed commit 969f37d.

Imagine a user has some Zeek NDJSON data and they want to bucket addresses into subnets. They first try to process the data without the assistance of any shaper.

$ zq -z 'put classnet=network_of(.id.resp_h) | count() by classnet | sort classnet' weird.ndjson.gz 
put: a referenced field is missing
...

At this point they realize "oh shoot... this NDJSON has no rich typing!" To turn id into a nested record that includes the ip-typed fields like resp_h, they modify their command line to to apply the reference shaper for Zeek, but now receive a parse error:

$ zq -z -I ~/work/zed/zeek/shaper.zed 'put classnet=network_of(.id.resp_h) | count() by classnet | sort classnet' weird.ndjson.gz 
zq: parse error: error parsing Z at line 114, col 1:type port=uint16;

The problem is that the shaper ends with put .=unflatten(.) | put . = shape(schemas[_path]), so for our additional put to succeed, the user needs to prepend a | in their additional Zed like so:

$ zq -z -I ~/work/zed/zeek/shaper.zed '| put classnet=network_of(.id.resp_h) | count() by classnet | sort classnet' weird.ndjson.gz 
{classnet:"not an IPv4" (error),count:3 (uint64)} (=0)
{classnet:8.0.0.0/8,count:1 (uint64)} (=1)
{classnet:10.0.0.0/8,count:23889} (1)
...

It seems it would be helpful if the platform could figure out on its own when the | is necessary to stitch together the additional Zed top of the included Zed so the user doesn't have to make this adjustment.

@philrz philrz mentioned this issue Apr 20, 2021
@philrz
Copy link
Contributor Author

philrz commented Apr 21, 2021

@mccanne noted this should be easier after his hazmat PR merges.

@philrz philrz added this to the Data MVP0 milestone Apr 21, 2021
@philrz philrz modified the milestones: Data MVP0, Data MVP1 May 10, 2021
@philrz philrz removed this from the ETL Lake milestone Oct 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant