-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Raster Singleband Pseudocolor with Quantile mode inconsistencies #35465
Comments
|
This issue is not new, I see that it was already present in QGIS 2.18.28. |
Fixes qgis#35465 but it is still and arbitrary value, better approaches would require to calculate other dispersion indexes and they seem impractical (inefficient) in this case. See for example: https://en.wikipedia.org/wiki/Freedman%E2%80%93Diaconis_rule
|
Hi @elpaso Thanks for the intermediate solution, however, I've tested it in master, and it still does not solve the sample I've put here. Do you agree it is better to keep this issue open until it can be completely solved? Best regards |
|
I tested your same dataset and the problem was solved. |
|
But let me check it again ... |
|
@PedroVenancio you are right, there was another change required to get exact results, which is to not restrict the sample size to 25000 like we are doing now, I removed that before the commit because I was (wrongly) convinced that the other fix was sufficient to fix this issue. This is what I get if the sample size is augmented ten times (set to 250000*10 ) |
|
Ok I tested it with my biggest rasters and it obviously freezes QGIS unless a maximum is set. A better approach for exact results would involve a more complex solution with a separate thread. |
Fixes qgis#35465 (part two)
Hi @elpaso Yes, this is the right answer for my sample! But if that increase freezes QGIS with huge rasters, it can't be applied. Quantile for vectors code is newer, and it seems it follows another aproach: Can't be somehow used in rasters? Or the problem is upstream, when collecting the raster statistics? Thank you very much Alessandro! |
|
@PedroVenancio we (me and @gioman) decided to go with the current solution (which is to increase both bin count and sample size). It does not impact performances because there is still a limit, even if that is way bigger than it was before (the original code was written 10 years ago, when RAM and CPUs were much more limited than today). |
Fixes qgis#35465 but it is still and arbitrary value, better approaches would require to calculate other dispersion indexes and they seem impractical (inefficient) in this case. See for example: https://en.wikipedia.org/wiki/Freedman%E2%80%93Diaconis_rule
Fixes qgis#35465 (part two)
Fixes qgis#35465 but it is still and arbitrary value, better approaches would require to calculate other dispersion indexes and they seem impractical (inefficient) in this case. See for example: https://en.wikipedia.org/wiki/Freedman%E2%80%93Diaconis_rule
Fixes qgis#35465 (part two)
|
Perfect @elpaso ! I've tested here and it works! Do you think this changes can be backported to 3.12 and 3.10? |
@PedroVenancio backports done minutes ago. |
|
Can you please test quantile in this raster with 3.14? It does not show any class to me, both in Linux or Win. With QGIS 3.10.7 it works as expected. |
|
Another issue I'm seeing now, with a Float64 raster: With this raster, quantiles are not correctly calculated (in this case, in QGIS 3.10.7, as it does not work in 3.14). Rounding that raster and converting to a Int32, it already calculate quantiles correctly (image on the right): Using Raster Pixels to Points algorithm to export Float64 pixel values to points, and calculating quantiles in the resulting vector layer, the output is the correct (as seen on the left image). Running r.quantile in the Float64 raster, the output is correct: Also correct exporting to CSV and calculating in R: So, it seems that quantile classification is still failing with some kind of data. |
@PedroVenancio @elpaso confirmed, new ticket filed here: #37448 |
@PedroVenancio and #37449 |
Commits 98261bc and ebdb546 for qgis#35465 have changed the logic to compute the default number of bins for the histogram. While appropriate in the context of qgis#35465, that tends to augment the number of bins, which is inappropriate for the raster histogram chart, where beyond 1000, the chart becomes unreadable. However I think the logic for plotting the raster histogram should probably be revised, but I'm not clear how: ask the user for a number of bins and/or take into account the width in pixels of the chart to determine the number of bins (should that depend if the user has zoomed in... ?)
[GUI] Raster histogram: restore behavior before #35465 changes
|
It keeps failing with some rasters. See this one: Here is the project with the sample raster: Quantiles calculated from Vector Symbology are always correct, just like r.quantile output: |
|
Maybe related to #39058 |
|
@PedroVenancio what QGIS version are you testing? |
@elpaso I'm testing with
and
and I'm getting the same results. And I also believe that #39058 is somehow related. |
|
This always the same issue: quantile for rasters is calculated on an histogram, which uses a sample of values and not the whole raster and a limited number of bins. Unless we re-implement it completely using an exact approach similar to the one used by vector layers and GRASS (and like I've done with unique/paletted) you will always get approximated results, that might lead to a wrong classification in some cases. Differently from paletted/quantile, where the calculations were already implemented in a separate thread, for singleband/pseudocolr that is not the case, quantile requires a non-trivial implementation of a new class to gather the values and build the histogram. I'll see what I can do but I cannot make any promise. |
Great @elpaso , thank you very much for your effort! |
|
@PedroVenancio can we close this? |
@elpaso @PedroVenancio I re-read this ticket and other related. To me there is still an issue (the one described at the end of this ticket), so I don't see why close it. I agree that raster symbology should be overhauled, but until then this do not seems fully solved. Just my opinion @elpaso feel free to close. |
|
@PedroVenancio @gioman the way I see it is that in order to get exact results from a raster classification we cannot use samples but read the whole raster. Due to the size that a typical raster can get we cannot do this in the GUI thread without freezing QGIS, this means to take a different approach and probably split the classification methods to add a list of "exact" (and potentially very slow) methods. I actually have got an idea and I've started an implementation back in october that exposes a raster layer as a vector layer and allows to use all vector classification methods already implemented for vectors in QGIS for free. But...
so, the bottom line is that this is quite a big refactoring that would probably classify as a feature request. I would be more than happy to work on this topic because I've already spent quite a lot of time researching and experimenting on this but I cannot do it without funds and it does not seem fair to me to use all the bugfixing funds for this development. There is another potentially promising approach to speed up these calculations which is to use the GPU with OpenCL or OpenGL compute shaders but that is also quite a refactoring. IMHO the route to go is a QEP followed by a grant, a crowdfunding, a generous donor or a combination of these. |
@elpaso I will be more than happy to chip in. And probably there are others interested. Ping me privately so we can get an idea of the amount that is necessary. |



I'm seeing some inconsistencies in classes breaks, when using Quantile mode on a raster with singleband pseudocolor symbology.
To reproduce:
Download this sample raster and load it on QGIS:
https://cld.pt/dl/download/7161caab-5ebe-4e94-8c4a-574a80821a97/perigosidade.tif
Properties -> Symbology -> Render type: Singleband pseudocolor -> Interpolation: Discrete -> Mode: Quantile -> Classes: 5. You will get:

Run GRASS r.quantile from Processing on this raster, with Number of Quantiles: 5. Leave all other options as default. You will get:
These are the correct classes breaks, and so the symbology should be:

I have done some tests to be sure about the correct classification.
[A] I've run Raster Pixels to Points to get the center point of all pixels. If you want to check, here is the output geopackage:
https://cld.pt/dl/download/3764537c-2504-4326-be77-74e26c0d2a49/perigosidade.gpkg
Load it in QGIS and go to Properties -> Symbology -> Graduated -> Mode: Equal Count (Quantile) -> Classes: 5 -> Classify. You will get:

[B] I've run GRASS r.stats to get all pixel values. Here is the CSV with all values:
https://cld.pt/dl/download/b730c614-40b3-40b8-887f-2b985e753d96/perigosidade_r_stats.csv
I've tested this csv in R:
[C] Then I extracted the absolute frequencies of raster values:
https://cld.pt/dl/download/af265a0a-024f-40c4-a04c-77eaa9f09173/perigosidade_r_stats_abs_freq.csv
and did the manual analysis of it, getting the cumulative relative frequencies and the result is the same:
https://cld.pt/dl/download/9b014dbd-f991-42e9-880e-baec555a6d18/perigosidade_r_stats_abs_freq_cum.ods
So, it seems that Singleband Pseudocolor Quantile is not returning the correct breaks.
The text was updated successfully, but these errors were encountered: