Rescale discrete levels for how='eq_hist' #1055

Merged
merged 7 commits into from
Apr 5, 2022

Conversation

ianthomas23
Member

This is a candidate fix for issue #357.

Under normal circumstances if how='eq_hist' then the span is calculated as (np.nanmin, np.nanmax) of the masked data, hence the data is colored using the full cmap range. If there are a small number of discrete values in the data then this can lead to low values, which are often in the majority, being rendered at the low end of cmap with low alpha.
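As a rough illustration of the default behavior described above, the sketch below computes a NaN-aware (min, max) span and maps each value to a position between 0 (bottom of the cmap) and 1 (top). It is a simplification in plain Python, not datashader's code: the histogram-equalization step itself is omitted, and the helper names `nan_span` and `colormap_position` are hypothetical.

```python
import math

def nan_span(values):
    """Return (min, max) of the values, ignoring NaNs (hypothetical helper)."""
    finite = [v for v in values if not math.isnan(v)]
    return min(finite), max(finite)

def colormap_position(value, span):
    """Map a value linearly into [0, 1] across the span (hypothetical helper)."""
    lo, hi = span
    return (value - lo) / (hi - lo)

# A few discrete values, with the lowest value in the majority.
data = [1.0, 1.0, 1.0, math.nan, 2.0, 3.0]
span = nan_span(data)
positions = [colormap_position(v, span) for v in (1.0, 2.0, 3.0)]
print(span, positions)
```

With only three discrete levels, the most common value sits at position 0, the very bottom of the cmap, which is the rendering problem this PR targets.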

This PR adds a new rescale_small_values boolean kwarg to shade() with a default of False. If it is True, how='eq_hist' and there are a small number of discrete values in the masked data then the lower span limit is reduced so that the data is rendered more towards the top end of the cmap range.

The exact form of the equation relating the lower_span limit to the max value of the masked data max_data is to be determined. Currently trialling (line 259):

lower_span = 1.0 - 0.6*np.log10(max_data)
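To get a feel for the trial equation, here is a small sketch that evaluates it for a few inputs. It uses `math.log10` instead of `np.log10` so it runs without NumPy, and the function name `trial_lower_span` is hypothetical; only the formula itself comes from the comment above.

```python
import math

def trial_lower_span(max_data):
    """Evaluate the trial equation: 1.0 - 0.6*log10(max_data)."""
    return 1.0 - 0.6 * math.log10(max_data)

# The lower span limit falls as max_data grows.
for max_data in (2, 10, 100):
    print(max_data, round(trial_lower_span(max_data), 3))
```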

@ianthomas23
Member Author

Here is a video of it using HoloViews for interactive zooming.

Screencast.2022-03-28.13.50.53.mov

@ianthomas23
Member Author

This now uses a different equation that gives better results. If the number of discrete values is less than 100, it increases the span range by a factor that depends on the number of discrete levels: the factor is 1 for 100 discrete values, rising linearly to 1.5 for 2 discrete values. The factor is applied by lowering the lower span limit. Equivalently, with 100 or more discrete values the data fills the colormap as usual, but as the number of discrete values falls the data fills a smaller proportion of the top end of the colormap; at 2 discrete levels this is the top 2/3 of the colormap.

The equation is linear in the number of discrete values. I have also tried logarithmic variation here but the results are not noticeably different. I have left the maths of the equation in the code so that there are no magic numbers.
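The linear rule described above can be sketched as follows. This is an illustration built only from the numbers in this comment (a factor of 1.5 at 2 levels, falling linearly to 1 at 100 levels), not a copy of the code in the PR; the function names are hypothetical, and any clamping of the lower limit that the real code may apply is left out.

```python
def span_factor(num_levels):
    """Span-widening factor: 1.5 at 2 levels, falling linearly to 1.0 at 100."""
    if num_levels >= 100:
        return 1.0
    # Straight line through (2, 1.5) and (100, 1.0).
    slope = (1.0 - 1.5) / (100 - 2)
    return 1.5 + slope * (num_levels - 2)

def rescaled_span(span, num_levels):
    """Widen the span by lowering its lower limit; the upper limit is unchanged."""
    lower, upper = span
    factor = span_factor(num_levels)
    return (upper - factor * (upper - lower), upper)

def colormap_fraction(num_levels):
    """Fraction of the colormap (at its top end) that the data occupies."""
    return 1.0 / span_factor(num_levels)

print(rescaled_span((0.0, 1.0), 2))   # lower limit moves below the data minimum
print(colormap_fraction(2))           # about 2/3, as stated above
```

Under these assumptions, a span of (0, 1) with 2 discrete levels becomes (-0.5, 1), so the data covers 1/1.5 = 2/3 of the colormap, matching the figure quoted in the comment.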

Here is a video of it in action:

rescale.mp4

@ianthomas23
Member Author

Most of the test failures are Ragged array related, which I think are fixed by #1050, but I have not explicitly checked this.

=========================== short test summary info ============================
FAILED datashader/tests/test_datatypes.py::TestRaggedGetitem::test_getitem_invalid
FAILED datashader/tests/test_datatypes.py::TestRaggedInterface::test_tolist
FAILED datashader/tests/test_datatypes.py::TestRaggedMethods::test_where_series[True]
FAILED datashader/tests/test_datatypes.py::TestRaggedMethods::test_where_series[False]
= 4 failed, 765 passed, 48 skipped, 2 xfailed, 80 warnings in 308.55s (0:05:08) =

@hoxbro
Member

hoxbro commented Mar 31, 2022

The four failing tests should be fixed by #1050.

Member

@jbednar jbednar left a comment

Looks great! I'll rebase it now so that it can be tested, then separately rename rescale_small_values to rescale_discrete_levels since the values themselves may not actually be small; e.g. two discrete values of 10 and 100000 would still be covered by this rescaling. I'll also rename max_data to num_levels, again because the value is about the number of levels not the value of the data. I may also add control over those numeric values; not sure yet. In any case, definitely looks like a big improvement in eq_hist behavior!

@jbednar jbednar changed the title from "Rescale small values for how='eq_hist'" to "Rescale discrete levels for how='eq_hist'" on Apr 1, 2022
@jbednar
Member

jbednar commented Apr 1, 2022

Looks like there are still failing tests after rebasing. Also, can you please temporarily make rescale_discrete_levels default to True and make sure the tests pass in that case as well? We'll want to turn it on by default soon enough, and need to be able to use it everywhere in the meantime to test it.


3 participants