Flags for varying the defaults of the shape() functions #2585

Open
philrz opened this issue Apr 19, 2021 · 0 comments
philrz commented Apr 19, 2021

#1783 (comment) introduced the building blocks of shape(), specifically the four functions cast(), crop(), fill(), and order(). While these can be used "a la carte", @henridf correctly pointed out in that issue that users will often want all four performed at once to make a particular record conform to a given type config, hence the convenience of shape().
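
To make the composition concrete, here is a toy Python model of the four building blocks. It is only a sketch: records are plain dicts, a "type" is a dict mapping field names to Python types, and the semantics (for example, where undeclared fields end up after order()) are simplified assumptions rather than the actual Zed implementation. The name `target` and the sample fields are hypothetical.

```python
# Toy model only. A "type" is a dict of field name -> Python type (insertion
# ordered), and a record is a plain dict. Not the actual Zed implementation.

def cast(record, typ):
    """Coerce the values of declared fields to the declared type."""
    out = {}
    for name, value in record.items():
        if name in typ and value is not None:
            out[name] = typ[name](value)
        else:
            out[name] = value
    return out

def crop(record, typ):
    """Drop fields that the type does not declare."""
    return {name: value for name, value in record.items() if name in typ}

def fill(record, typ):
    """Add declared fields that are missing, using a null value."""
    out = dict(record)
    for name in typ:
        out.setdefault(name, None)
    return out

def order(record, typ):
    """Declared fields first, in the type's order; extras keep their order."""
    declared = {name: record[name] for name in typ if name in record}
    extras = {name: value for name, value in record.items() if name not in typ}
    return {**declared, **extras}

# Hypothetical target type and input record, for illustration only.
target = {"event_type": str, "src_port": int, "dest_port": int}
raw = {"dest_port": "443", "extra": "x", "event_type": "alert"}

# Chaining all four "a la carte" calls, which is what shape() bundles up.
all_four = order(fill(crop(cast(raw, target), target), target), target)
print(all_four)
```

Each step in the chain builds a new intermediate record, which is the kind of overhead a single combined call can avoid.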

After some initial use of shape(), #2308 flipped its behavior so that crop() is no longer performed by default. However, for my own use cases I find myself wanting the crop() behavior back. One example is advising a user to apply our reference Suricata shaper atop their own custom Suricata rather than an artifact we bundle with Brimcap: without crop(), a bunch of additional fields that we trim out via our bundled Suricata YAML leak through and cause excess typedefs. Of course the "a la carte" crop() still exists, and indeed my short-term workaround is to use it. However, having to chain yet another function call makes the config slightly messier and is also a mild drag on performance.

When I discussed this topic with @mccanne, he pointed out that @henridf's implementation would lend itself to an approach where shape() could, say, take some optional boolean flags that vary its behavior relative to the defaults. One could then still call just shape() but include a parameter that says something like crop=true, and hence get the cleanliness and better performance of a single function call.
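
As a rough illustration of the flag idea, the Python sketch below models a shape() that casts, fills, and orders in one call and crops only when asked. The keyword name `crop`, its default of `False` (mirroring the post-#2308 default), and the sample type and record are all assumptions made for this example; nothing here reflects the real function signature or Zed syntax.

```python
# Toy model only. A "type" is a dict of field name -> Python type (insertion
# ordered), and a record is a plain dict. Not the actual Zed implementation.

def shape(record, typ, *, crop=False):
    """Cast, fill, and order in one call; crop only when requested."""
    out = {}
    # Walking the type's fields gives the output order directly.
    for name, to_type in typ.items():
        value = record.get(name)  # a missing field is filled with None
        out[name] = None if value is None else to_type(value)  # cast
    if not crop:
        # Default behavior: undeclared fields pass through after the
        # declared ones.
        for name, value in record.items():
            if name not in typ:
                out[name] = value
    return out

# Hypothetical target type and input record, for illustration only.
target = {"event_type": str, "src_port": int, "dest_port": int}
raw = {"dest_port": "443", "flow_id": 7, "event_type": "alert"}

default_shaped = shape(raw, target)           # flow_id leaks through
cropped_shaped = shape(raw, target, crop=True)  # flow_id is trimmed
print(default_shaped)
print(cropped_shaped)
```

The other three behaviors could presumably be toggled by similar flags; only `crop` is shown because it is the one whose default changed.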

@philrz philrz added this to the Data MVP1 milestone Apr 21, 2021
@philrz philrz removed this from the ETL Lake milestone Oct 25, 2022