Generic Histogram #1541

jameskerr · 2021-03-29T20:42:33Z

Create a new histogram component to be used when the search results are clearly not "Zeek" or analytics. Keep the old one to render when the resulting schemas contain a "ts" and a "_path" field.

The underlying query will be <current filter> | every <lake key> count() by typeof(.)

Apply a color mapping to the data shapes automatically.
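As a rough illustration of what that query and color mapping would produce, here is a minimal Python sketch that counts records per time bucket and per shape, then assigns each shape a stable color. It is not the app's implementation: `shape_of` is a hypothetical stand-in for `typeof(.)`, and `PALETTE`, `color_for`, and `histogram` are names invented for this example. Timestamps are assumed to be timezone-aware `datetime` values.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
import zlib

# Hypothetical palette; any fixed list of colors would do.
PALETTE = ["#4e79a7", "#f28e2b", "#e15759", "#76b7b2", "#59a14f", "#edc948"]
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def shape_of(record: dict) -> str:
    """Stand-in for typeof(.): describe field names and value types."""
    fields = (f"{k}:{type(v).__name__}" for k, v in sorted(record.items()))
    return "{" + ",".join(fields) + "}"


def color_for(shape: str) -> str:
    """Map a shape to a palette entry; crc32 is stable across runs."""
    return PALETTE[zlib.crc32(shape.encode()) % len(PALETTE)]


def histogram(records, key: str, interval: timedelta) -> Counter:
    """Count records by (bucket start, shape), bucketing on the key field."""
    counts = Counter()
    for record in records:
        # Floor the timestamp to the start of its epoch-aligned bucket.
        bucket = EPOCH + ((record[key] - EPOCH) // interval) * interval
        counts[(bucket, shape_of(record))] += 1
    return counts
```

Each `(bucket, shape)` count would become one segment of a stacked bar, filled with `color_for(shape)`. With a small palette, two shapes can land on the same color, so a real mapping would need a way to handle collisions.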


mccanne · 2021-11-03T17:18:34Z

shapes barchart

philrz · 2023-05-25T19:37:23Z

We had a longer team discussion about this one. There was consensus that generalizing the existing histogram so it works with any time-typed field would be a logical minimum next step. However, we also discussed in more detail whether it would make sense to handle pool keys of any Zed type, and what this would mean for the "bucketing" that generates the values populating the histogram bars. As an example of how other solutions approach the problem, the width_bucket() function allows a specified number of buckets to be created across any min/max range, though in that case only numeric types are supported. We debated for a bit whether this would make sense for string types, for instance. @mccanne pointed out that a mathematical approach could be applied, in which a per-type function determines where a value falls in the range [0,1], e.g., a string aaaa is "almost 0" and a string zzzz is "almost 1". Whether users would actually benefit from seeing this specific bucketing in histogram form, rather than some other visualization for their non-time data, is not immediately clear to us. However, the discussion is being recorded in this comment in case we want to pursue the topic here or spawn a different issue when this one closes.
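To make the bucketing idea concrete, here is a minimal Python sketch of the two pieces discussed above. `width_bucket` follows the semantics of the SQL function of the same name as found in PostgreSQL (which may or may not be the implementation referenced above). `string_fraction` is a hypothetical helper that reads a string as a base-26 fraction over lowercase a-z only. That alphabet is an assumption made for illustration, not anything the team settled on.

```python
def width_bucket(value: float, low: float, high: float, count: int) -> int:
    """Return the 1-based bucket for value across [low, high) split into
    count equal-width buckets; 0 if below low, count + 1 if at or above high."""
    if count <= 0 or low >= high:
        raise ValueError("need count > 0 and low < high")
    if value < low:
        return 0
    if value >= high:
        return count + 1
    # min() guards against floating-point rounding just below high.
    return min(count, int((value - low) / (high - low) * count) + 1)


def string_fraction(s: str, alphabet: str = "abcdefghijklmnopqrstuvwxyz") -> float:
    """Map a string to [0, 1) by treating each character as a digit of a
    fraction in base len(alphabet). Raises ValueError on other characters."""
    base = len(alphabet)
    fraction, scale = 0.0, 1.0
    for ch in s:
        scale /= base
        fraction += alphabet.index(ch) * scale
    return fraction
```

For example, `width_bucket(string_fraction("mmmm"), 0.0, 1.0, 10)` places `mmmm` in bucket 5 of 10. Under this alphabet `aaaa` maps to exactly 0 and `zzzz` to just under 1. One limitation is that `a` and `aaaa` map to the same value, so the mapping preserves sort order only loosely.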

philrz · 2023-06-20T23:26:13Z

During a recent group discussion of what we've got thus far in #2785, @mccanne summarized his long-term vision of how he hopes the app can flexibly handle this histogram. (I'm paraphrasing a bit here, but I've got the original wording archived on video if anyone needs to reference it. 😄)

At a high level, I think the goal should be that the UI just figures out what to do without the user having to type in field names. It could do queries to figure out the cardinality of different fields (e.g., since low cardinality fields are probably "interesting fields" for segmentation) and use a heuristic to pick a default. It could similarly look for a time-typed field to use for the X-axis & bucketing. So ~90% of the time it would just do the right thing. Then for the 10% of the time the heuristic makes the "wrong" picks, there'd be simple pull-downs/checkboxes to pick alternatives, e.g., if there's several "interesting fields" or several time-typed values. (It should be noted that we don't currently have an efficient way to do a "count distinct" of every field, but the dictionaries in vcache may be able to provide something adequate here.)
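One way such a heuristic might look is sketched below in Python. It works on a sample of records already in memory, which is an assumption of this example; as noted above, an efficient "count distinct" across every field does not currently exist. The function name `pick_defaults`, the `max_cardinality` threshold, and the returned dictionary layout are all hypothetical.

```python
from datetime import datetime


def pick_defaults(records, max_cardinality: int = 20) -> dict:
    """Pick a default time field and grouping field from sampled records."""
    time_fields = []   # time-typed field names, in first-seen order
    distinct = {}      # field name -> set of distinct values seen
    for record in records:
        for name, value in record.items():
            if isinstance(value, datetime):
                if name not in time_fields:
                    time_fields.append(name)
            else:
                # repr() keeps unhashable values (lists, dicts) countable.
                distinct.setdefault(name, set()).add(repr(value))

    # Low-cardinality fields are the likely "interesting" ones; skip
    # constants (one value) and near-unique fields (too many values).
    group_candidates = sorted(
        (n for n, vals in distinct.items() if 1 < len(vals) <= max_cardinality),
        key=lambda n: (len(distinct[n]), n),
    )
    return {
        "time_field": time_fields[0] if time_fields else None,
        "time_candidates": time_fields,
        "group_field": group_candidates[0] if group_candidates else None,
        "group_candidates": group_candidates,
    }
```

The `*_candidates` lists are what would populate the pull-downs when the default picks turn out to be wrong.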

jameskerr added this to the Data MVP0 milestone Mar 29, 2021

jameskerr self-assigned this Mar 31, 2021

philrz mentioned this issue May 6, 2021

Wiki article and example YAML/scripts for custom Zeek/Suricata/NetFlow brimdata/brimcap#72

Merged

philrz modified the milestones: Data MVP0, Data MVP1 May 10, 2021

philrz unassigned jameskerr Jun 1, 2021

jameskerr modified the milestones: ETL Lake, v0.27.0 Oct 7, 2021

philrz modified the milestone: v0.27.0 Oct 8, 2021

mason-fish removed this from the v0.27.0 milestone Oct 26, 2021

philrz mentioned this issue Aug 16, 2022

Bring back Zeek histogram #2489

Closed

philrz mentioned this issue Sep 2, 2022

Challenges visualizing pool that contains keyless records #2516

Open

philrz mentioned this issue Dec 29, 2022

Zed Table #2626

Merged

45 tasks

philrz assigned jameskerr May 25, 2023

This was linked to pull requests Jul 6, 2023

Generic Histogram for time-based data #2785

Merged

Create settings with defaults when a pool is created #2794

Merged
