To date, applying shapers has always been a client-side operation. A couple of examples:

- `zq` may be used (often with `-I`) to shape data before importing it into a lake via `zed load`, dragging it into the Brim app, etc.
- Brimcap performs all of its shaping client-side, turning data into rich ZNG before it's posted to a Zed lake.
However, a useful app workflow might be to define a persistent shaper config such that unshaped data could be incrementally added to a pool and shaped server-side without the user having to include or mention the shaper code every time. For example, in a multi-person org, a Zed-savvy user may be responsible for perfecting the "golden" shaping configs and defining policy that ensures they're applied on incoming data for certain pools. Then other users could just import their NDJSON/CSV/etc. directly to those pools without having to know anything about shapers.
We expect this issue to start with design tasks such as determining how the shaping configs are attached/persisted in the lake, then thinking about how they would be invoked by the Brim app and `zed`.
Note: This may overlap with the "intake" concept we've discussed in the past and tracked via brimdata/zui#1481.
Today I thought of this issue again in the context of a community user's inquiry in a recent Slack thread. They were trying to use the Python client to replicate command lines they'd traditionally done at the shell. Their specific question:
> is there a way to specify the type of some fields like I can with the `zq | zed load -`
> for eg: `zq -i json '_ts:=time(created)' infile.json | zed load -`
I couldn't think of a way for them to replicate that whole pipeline within Python short of invoking the `zq` binary. That is, the Zed Python client can load data from file-like objects into the lake, or read data back out of the lake using queries, but the first half of that pipeline is entirely "non-lake". That got me thinking again about how handy it would be if that kind of shaping were persisted server-side so it could be applied on ingest: a dumber client like the Python one (or an even dumber one like `curl`) could then post the unshaped data and have it shaped before being stored. FWIW, when I described this to the user, their response was "oh that would be awesome!", but per @mccanne's most recent comment above, I'm sure there are other design considerations that might favor an approach other than the one I originally described in this issue.
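As a rough illustration of the client-side workaround being discussed, here's a minimal Python sketch that mimics the `_ts:=time(created)` shaping step in plain Python before the data ever reaches the lake. The `client.load(...)` mentioned in the trailing comment is hypothetical shorthand for the Zed Python client's load call, not its exact signature:

```python
import json
from datetime import datetime, timezone

def shape(record):
    # Mimic the Zed expression `_ts:=time(created)` client-side:
    # parse the `created` string into a `_ts` timestamp field.
    out = dict(record)
    if "created" in out:
        ts = datetime.fromisoformat(out["created"]).replace(tzinfo=timezone.utc)
        out["_ts"] = ts.isoformat()
    return out

records = [{"created": "2021-06-01T12:00:00", "msg": "hi"}]
ndjson = "\n".join(json.dumps(shape(r)) for r in records)
# The shaped NDJSON could then be posted to a pool, e.g. with
# something like client.load(pool, ndjson) -- a hypothetical call;
# consult the Zed Python client docs for the actual API.
```

This only works because the shaping logic is re-implemented by hand in the client; the point of this issue is that a server-side shaper config would make such duplication unnecessary.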