Application hangs when plotting lots of NULL values on 2nd Y axis with Line type 'Impulse' #1977

Open
2 of 10 tasks
wrenoud opened this issue Aug 26, 2019 · 19 comments
Labels
bug Confirmed bugs or reports that are very likely to be bugs. plot

Comments

@wrenoud

wrenoud commented Aug 26, 2019

Details for the issue

What did you do?

Tried to plot 2 columns on the Y axis with line type "Impulse". One of the columns has mostly NULL values. About 300 rows in total.

What did you expect to see?

A plot

What did you see instead?

The application hangs for a long time. It ultimately plots, but NULL values appear to be infinite vertical lines and any redraw (zoom/pan) hangs again.

Useful extra information

The info below often helps, please fill it out if you're able to. :)

What operating system are you using?

  • Windows: ( version: ___ )
  • Linux: ( distro: ___ )
  • Mac OS: ( version: ___ )
  • Other: ___

What is your DB4S version?

  • 3.11.2
  • 3.11.1
  • 3.10.1
  • Other: ___

Did you also

@wrenoud
Author

wrenoud commented Aug 26, 2019

A workaround for me is to replace NULL values with 0 in a query using ifnull("my_column", 0)

@sky5walk

0 is a valid data point in many curves.
The correct way is to ignore or skip any defined null value.
I use '-999' as a defined null value and never plot it.

@wrenoud
Author

wrenoud commented Aug 26, 2019

@sky5walk null is an appropriate null value. I was noting that replacing the null values with a number prevented the application hang. A "correct" replacement is going to be data set specific. And correct me if I'm wrong, but I know of no way to prevent a value from being plotted when you try to plot 2 columns on the y axis.

@wrenoud changed the title from "Application hangs when plotting lots of NULL values" to "Application hangs when plotting lots of NULL values on 2nd Y axis" Aug 26, 2019
@wrenoud
Author

wrenoud commented Aug 26, 2019

@sky5walk I think you misunderstood my bug report. I've updated the title and I will elaborate. I have essentially 3 columns:

index  value1  value2
1      3       NULL
2      7       NULL
3      7       1

I wish to see plots for both value1 and value2 vs index on the same plot. Omitting the rows where value2 is NULL gives a mostly useless plot for me, as I won't see most of value1. I'm even OK with the current choice of drawing an infinite line for NULL, but that seems to perform very poorly if there are more than a couple of NULL values.

@sky5walk

null is not really an appropriate value for any entry.
Your app, as you have found, must "deal" with null in some algorithmic way.
For me, I force null = -999.
Then my apps can ignore a real value instead of determining if 0 is applicable to the current field.
I do the same for text nulls. They are set to "-999". Empty strings are not correct to assign as a null.
The embedded plotting library may or may not have a null feature? If it does, just set it to your desired real value. Then the plot lib will 'skip' that point in the curve, creating a non-continuous curve.
If the lib does not have null gaps as a feature, then you must edit the code or request it.
Or, as you explained, UPDATE your nulls with custom defined real values you can live with.
But, that becomes a pain when every column has a different bound.

@MKleusberg
Member

I just gave this a quick try and couldn't reproduce it. Is there maybe something else I have to do to make it hang, like sort order, data types, etc.?

I do see, however, that NULL values are rendered in a weird way; it just doesn't seem to affect the performance for me. To make it render correctly we would have to decide what to do with NULL values: skip them, treat them as 0, make it configurable, or something else altogether?

@mgrojo
Member

mgrojo commented Aug 26, 2019

We are in fact using NULL in a way expected by our plot library:

https://www.qcustomplot.com/documentation/classQCPGraph.html

Gaps in the graph line can be created by adding data points with NaN as value (qQNaN() or std::numeric_limits<double>::quiet_NaN()) in between the two data points that shall be separated.

I think inserting gaps is also the best way to handle those NULL values.
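
To make the NaN-as-gap convention concrete, here is a minimal sketch in standard C++ only. It does not use QCustomPlot or DB4S code. The `Cell` alias, `toPlotValue`, and `lineSegments` are hypothetical names invented for illustration; the sketch only assumes that a NULL cell is mapped to a quiet NaN and that a line plot starts a new segment after each NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <optional>
#include <utility>
#include <vector>

// A database cell: std::nullopt stands in for SQL NULL (an assumption of this sketch).
using Cell = std::optional<double>;
using Point = std::pair<double, double>;

// Map NULL to a quiet NaN, the gap marker described in the quoted documentation.
double toPlotValue(const Cell& cell) {
    return cell ? *cell : std::numeric_limits<double>::quiet_NaN();
}

// Split the data into the unbroken segments a line plot would draw,
// closing the current segment whenever a NaN value is met.
std::vector<std::vector<Point>> lineSegments(const std::vector<double>& keys,
                                             const std::vector<Cell>& cells) {
    assert(keys.size() == cells.size());
    std::vector<std::vector<Point>> segments;
    std::vector<Point> current;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        const double value = toPlotValue(cells[i]);
        if (std::isnan(value)) {
            if (!current.empty()) {
                segments.push_back(current);
                current.clear();
            }
        } else {
            current.emplace_back(keys[i], value);
        }
    }
    if (!current.empty()) {
        segments.push_back(current);
    }
    return segments;
}
```

With the values 3, NULL, 7, 1, NULL this yields two segments: one holding the first point and one holding the third and fourth points. A line plot can leave the NULL rows blank this way, which is why a "gap" is well defined for the Line type.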

> It ultimately plots, but NULL values appear to be infinite vertical lines

That doesn't match with the expectation. It should produce a gap in the curve. Can you provide a screenshot?

Regarding the performance problem, does it help to decrease the "Prefetch block size" in "Preferences > Database", @wrenoud? What is the value of "lots" in "lots of NULL values" 😄 ? It shouldn't matter if replacing them with zeros solves the problem, anyway. Maybe treating those gaps is costly for the plot library. Just guessing.

@mgrojo
Member

mgrojo commented Aug 26, 2019

> @sky5walk I think you misunderstood my bug report. I've updated the title and I will elaborate. I have essentially 3 columns:
>
> index  value1  value2
> 1      3       NULL
> 2      7       NULL
> 3      7       1
>
> I wish to see plots for both value1 and value2 vs index on the same plot. Omitting the rows where value2 is NULL gives a mostly useless plot for me, as I won't see most of value1. I'm even OK with the current choice of drawing an infinite line for NULL, but that seems to perform very poorly if there are more than a couple of NULL values.

With those values, I get this plot (some zoom out applied):

[image: screenshot of the resulting plot]

Isn't the same for you?

@MKleusberg
Member

@mgrojo I get the same plot. However, for another test case with ~2000 rows and about one third of them NULL I get this (there is definitely no value < 10000, so the extra thin line is going nowhere):
[image: screenshot of the plot with an extra thin line]
It depends on the sort order of the data (as far as I can tell it doesn't happen when most of the NULL values are in one place; but it does happen if numbers and NULL values are mixed a lot) and on the zoom level. It's also impossible to follow the thin extra lines because they always disappear for me whenever I drag the plot around.

mgrojo added a commit that referenced this issue Aug 26, 2019
Gaps are requested from QCustomPlot when one of the coordinates of a point
is NaN. We were using that feature for the y axis, but not for the x axis,
where null values were being converted to 0, which makes less sense than
making a gap, at least for consistency.

See issue #1977
@mgrojo added the bug and plot labels Aug 26, 2019
@mgrojo
Member

mgrojo commented Aug 26, 2019

Maybe the problems are with NULL values in X? We are not checking for NULL in that axis and they seem to be translated to 0, probably by toDouble(). I've committed a change that also applies that logic to the x axis. I think both axes should be treated equally, independently of whether we should make the logic configurable, which I'm not sure would be useful.
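
The difference between the two conversions can be sketched as follows, again in plain C++ rather than the actual DB4S code. `lenientToDouble` is a hypothetical stand-in for a conversion that returns 0 for NULL, and `nullAwareToDouble`, `buildPoints`, and `isDrawable` are likewise invented names. The sketch assumes the same NULL rule is applied to both axes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <optional>
#include <vector>

// std::nullopt stands in for SQL NULL (an assumption of this sketch).
using Cell = std::optional<double>;

struct Point {
    double x;
    double y;
};

// Hypothetical stand-in for a lenient conversion: NULL silently becomes 0.
double lenientToDouble(const Cell& cell) {
    return cell.value_or(0.0);
}

// NULL-aware conversion: NULL becomes NaN so the plot can leave a gap.
double nullAwareToDouble(const Cell& cell) {
    return cell.value_or(std::numeric_limits<double>::quiet_NaN());
}

// Build plot points, applying the same NULL rule to the x and y columns.
std::vector<Point> buildPoints(const std::vector<Cell>& xs,
                               const std::vector<Cell>& ys) {
    assert(xs.size() == ys.size());
    std::vector<Point> points;
    points.reserve(xs.size());
    for (std::size_t i = 0; i < xs.size(); ++i) {
        points.push_back({nullAwareToDouble(xs[i]), nullAwareToDouble(ys[i])});
    }
    return points;
}

// A point can be drawn only if neither coordinate is NaN.
bool isDrawable(const Point& p) {
    return !std::isnan(p.x) && !std::isnan(p.y);
}
```

With the lenient conversion, a row whose x value is NULL would land at x = 0 and look like real data. With the NULL-aware conversion it is marked as a gap on either axis.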

@MKleusberg and @wrenoud, could you confirm if it gives the expected result with your data, using tomorrow's nightly build?

@wrenoud
Author

wrenoud commented Aug 27, 2019

Here is an example table that reliably causes the hang for me: https://dbhub.io/wrenoud/asfd.db

I select datetime for the X axis. Selecting either count1 or count2 for the Y axis is OK, but when I select the other column as the second Y axis, the app hangs.

I also downloaded and installed the nightly and I still experience this.

@justinclift
Member

> https://dbhub.io/wrenoud/asfd.db

That's giving a "doesn't exist" error. It's probably still set to private. If you go into the settings for the database you can set it to public. 😄

@wrenoud
Author

wrenoud commented Aug 27, 2019

@mgrojo I saw your comment after posting mine. I downloaded the 3.11.99 (Aug 27 2019) build; is that the "tomorrow" you're referring to?

@wrenoud
Author

wrenoud commented Aug 27, 2019

@justinclift, sorry, I thought I had made it public. Should be fixed now.

@justinclift
Member

Thanks @wrenoud. Yep, it can be grabbed by people now. 😄

@wrenoud
Author

wrenoud commented Aug 27, 2019

I just realized the thing I wasn't reporting in the case. My line type was set to Impulse. Switching to Line works just fine.

@wrenoud changed the title from "Application hangs when plotting lots of NULL values on 2nd Y axis" to "Application hangs when plotting lots of NULL values on 2nd Y axis with Line type 'Impulse'" Aug 27, 2019
@wrenoud
Author

wrenoud commented Aug 27, 2019

This is what I meant by the "infinite vertical line"
[image: screenshot showing the infinite vertical line]

Even with just the 2 NULL values it's super slow to zoom/pan the Impulse plot.

@sky5walk

> Maybe the problems are with NULL values in X? We are not checking for NULL in that axis and they seem to be translated to 0, probably by toDouble(). I've committed a change that also applies that logic to the x axis. I think both axes should be treated equally, independently of whether we should make the logic configurable, which I'm not sure would be useful.
>
> @MKleusberg and @wrenoud, could you confirm if it gives the expected result with your data, using tomorrow's nightly build?

Thanks, this is critical for scatter plots of any X or Y column.

@mgrojo
Member

mgrojo commented Aug 27, 2019

> I just realized the thing I wasn't reporting in the case. My line type was set to Impulse. Switching to Line works just fine.

OK, I now see the plot drawing problem when the Impulse line type is set. I don't know if we can do something about it. It seems that in this case the NULL/NaN feature is not implemented in the same way as for the other line types. It's true that a "gap" does not make sense for impulses, so maybe the plot library is not expecting NaN values there.
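
One option for this combination, shown purely as an illustration and not as the project's decision or the plot library's behavior, would be to drop NaN points before building the impulse data. The sketch below uses standard C++ only; `Impulse` and `buildImpulses` are hypothetical names, and the baseline of 0 is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One impulse: a vertical segment at `key` from `baseline` to `value`.
struct Impulse {
    double key;
    double baseline;
    double value;
};

// Build impulses, skipping every point whose key or value is NaN
// (that is, every point that came from a NULL cell).
std::vector<Impulse> buildImpulses(const std::vector<double>& keys,
                                   const std::vector<double>& values,
                                   double baseline = 0.0) {
    assert(keys.size() == values.size());
    std::vector<Impulse> impulses;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (std::isnan(keys[i]) || std::isnan(values[i])) {
            continue;  // nothing sensible to draw for a NULL point
        }
        impulses.push_back({keys[i], baseline, values[i]});
    }
    return impulses;
}
```

For the value2 column of the sample table (NULL, NULL, 1), this produces a single impulse at index 3. Whether skipping is the right behavior, as opposed to treating NULL as 0 or making it configurable, is the open question in this thread.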

> Even with just the 2 NULL values it's super slow to zoom/pan the Impulse plot.

I don't notice the slowness myself, but it is another hint that the library is not expecting gaps in this case.

We'll have to think about what we want to do for this combination.

5 participants