Feature: Making data gaps visible in line charts by inserting zeros #3512

fabianmenges · 2017-09-21T19:51:59Z

Description

For many kinds of metrics you want to see immediately that no values were recorded in a given time period (e.g. no user impressions registered for the first minute of every hour). With the current line chart visualization this is very difficult to spot: you would have to notice a single "missing" marker, 1 out of 60 every hour.

This merge request adds an "insert zeros" control that uses the currently selected Time Grain to insert 0s into the time series wherever it contains no data points for a period of the length specified by the Time Grain.

The main difference from resampling is that it does this entirely on the client side and adds only very few data points to the time series.

Example 1:

We are visualizing the birth_names example datasource. As you can see looking at the markers on the first screenshot we have one data point per year.

As you can see, we selected the time grain "month". Checking the box "Insert Zeros" will now scan the time series: if two data points are more than one month apart, it inserts a 0 data point into the time series accordingly.

Example 2:

Same dataset, birth_names. I deleted all rows with a timestamp between 1967 and 1968. Without markers you would not be able to notice that they are missing; with markers it's still pretty difficult:

When you insert 0s it is easy to see:

Features:

Everything happens on the client side
Uses the time series grain
Works with Druid and SQL
Works with Grouped and Filtered Data
If the time grain is set to "Time Column" or "All" it applies a heuristic to guess the period length.
It will only insert a minimal number of 0s to make the missing data visible.
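
The behaviour described in the examples and the feature list could be sketched roughly as follows. This is a minimal Python illustration, not the PR's JavaScript code: the function name `insert_zeros`, the `(timestamp_ms, value)` tuple layout, and the choice to place one zero right after the gap starts and one right before it ends are all assumptions made for the sketch. It also treats the period as a fixed number of milliseconds, which is only an approximation for grains such as "month".

```python
def insert_zeros(points, period_ms):
    """Return a copy of `points` with zeros added wherever a gap exceeds one period.

    `points` is a list of (timestamp_ms, value) tuples sorted by timestamp.
    At most two zeros are added per gap, so the line drops to 0 right after
    the last point before the gap and rises again right before the next one.
    """
    result = []
    for i, (ts, value) in enumerate(points):
        if i > 0:
            prev_ts = points[i - 1][0]
            if ts - prev_ts > period_ms:
                first_zero = prev_ts + period_ms  # one period after the last point
                last_zero = ts - period_ms        # one period before the next point
                result.append((first_zero, 0))
                if last_zero > first_zero:
                    result.append((last_zero, 0))
        result.append((ts, value))
    return result


MINUTE_MS = 60_000
series = [(0, 4), (60_000, 6), (300_000, 5), (360_000, 7)]
with_zeros = insert_zeros(series, MINUTE_MS)
```

In this example the three missing minutes between the second and third points are marked with only two zeros, at 120000 ms and 240000 ms, rather than one zero per missing minute.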

I added unit tests for the granularity/time-grain/period parsing. I'm happy to add a unit test for the zero-insertion logic, but I couldn't find a good place to add the spec file.

This is related to #487, but without putting 0s everywhere.
Happy to get some feedback.

coveralls · 2017-09-21T20:51:37Z

Coverage remained the same at 69.535% when pulling 1f477f6 on tc-dc:fmegnes/zeros into 9af34ba on apache:master.

fabianmenges · 2017-09-21T21:33:33Z

If you think it makes more sense to enhance resampling on the server side (e.g. support the various period types, derive it from the time grain) I'm happy to improve that experience instead.

mistercrunch · 2017-09-22T16:38:55Z

I like the fact that "insert zeros" as a trigger is much more intuitive than "Resampling". I'm thinking server side on this one. Just because pandas is so simple for this.

See how time grain is fairly loosely defined at the moment? https://github.com/apache/incubator-superset/blob/master/superset/db_engine_specs.py#L192

One easy approach would be to add a corresponding pandas freq to each time grain. Knowing this, it would be easy to resample whenever needed.
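
A rough sketch of that idea, assuming a hypothetical `GRAIN_TO_FREQ` mapping (the grain names and the helper `fill_missing_periods` are illustrative, not Superset's actual configuration). It uses pandas' `Series.asfreq` with `fill_value=0`, which reindexes the series onto a regular grid and fills the periods that had no row with 0:

```python
import pandas as pd

# Hypothetical mapping from time-grain names to pandas offset aliases.
GRAIN_TO_FREQ = {
    "minute": "min",
    "day": "D",
    "month": "MS",  # month start
}


def fill_missing_periods(series, grain):
    """Put `series` on a regular grid for `grain`, filling empty periods with 0.

    Assumes the index is a DatetimeIndex whose timestamps already sit on the
    grain's boundaries (e.g. midnight for "day"); asfreq drops values that do
    not fall on the generated grid.
    """
    return series.asfreq(GRAIN_TO_FREQ[grain], fill_value=0)


daily = pd.Series(
    [5, 7, 3],
    index=pd.to_datetime(["2017-09-01", "2017-09-02", "2017-09-05"]),
)
filled = fill_missing_periods(daily, "day")
```

Unlike the client-side approach, this adds a zero for every missing period (here 2017-09-03 and 2017-09-04), so the number of added points grows with the length of the gap.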

"Resampling" came before the "Time-Grain" concept and is now confusing/obsolete. I feel like we could trigger the pandas resampling based on time grain configuration with a simple checkbox.

In that context, there may be a few specific use cases where people want the extra flexibility of resampling with specific fill patterns, but I wouldn't be against deprecating the feature.

mistercrunch · 2017-09-22T16:42:20Z

An extra point for pandas vs having our own JS function for this is that pandas (from memory) deals nicely with timezones, such that when we want to start thinking about comprehensive time zone management pandas may do some of that magic too.

fabianmenges · 2017-09-22T18:50:26Z

K, I'll take care of this and bring it down to the server side, should be able to get something going by end of next week.

We probably need a function to translate the time grains to pandas frequencies (like the one I wrote in JS to convert them to ms), because we also need to cover Druid, and the granularities for Druid are handled completely independently. Even better would probably be to unify the concepts of Druid granularity, pandas frequencies (offset aliases), and SQL time grain, because they are all used almost interchangeably within Superset.
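
One way such a unified concept might look is a single table that records each grain once together with its equivalent representations. The sketch below is purely illustrative: the `Grain` class, the grain names, and the lookup helper are hypothetical and not part of Superset; the ISO 8601 durations stand in for Druid period granularities, and the millisecond values are fixed-length approximations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Grain:
    name: str         # label shown to the user (hypothetical)
    iso_period: str   # ISO 8601 duration, as used for Druid period granularities
    pandas_freq: str  # pandas offset alias
    approx_ms: int    # fixed-length approximation in milliseconds


# Illustrative entries only; a real table would cover every supported grain.
GRAINS = {
    g.name: g
    for g in (
        Grain("minute", "PT1M", "min", 60_000),
        Grain("day", "P1D", "D", 86_400_000),
        Grain("week", "P1W", "W", 604_800_000),
    )
}


def find_by_iso_period(iso_period: str) -> Optional[Grain]:
    """Look up a grain by its ISO 8601 duration; return None if unknown."""
    for grain in GRAINS.values():
        if grain.iso_period == iso_period:
            return grain
    return None


day = GRAINS["day"]
from_druid = find_by_iso_period("P1W")
```

With a table like this, the SQL, Druid, and pandas code paths could each read the representation they need from the same entry instead of maintaining separate translations.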

Apart from that I don't think that my function would have problems with time zones since it just looks at raw milliseconds since epoch...

mosche · 2018-03-01T17:11:41Z

@fabianmenges This would be super helpful with respect to the time grain problems mentioned in #487.
Did you manage to implement this feature on the server side in the meantime?

Adding zeros

1f477f6

fabianmenges closed this Sep 29, 2017

EBoisseauSierra mentioned this pull request Jun 8, 2021

Time-series Line / Bar chart: Add option to plot 0 if no data. #15036

Closed

