Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SEDONA-368] Add RS_SummaryStats #978

Merged
merged 12 commits into from
Aug 18, 2023
Merged

Conversation

furqaankhan
Copy link
Contributor

Did you read the Contributor Guide?

Is this PR related to a JIRA ticket?

What changes were proposed in this PR?

  • Add RS_SummaryStats

How was this patch tested?

  • Passed all added tests

Did this PR include necessary documentation updates?

  • Yes, I have updated the documentation update.

Comment on lines 137 to 143
private static double getStddev(double[] pixels, double mean) {
double stddev = 0;
for(double pixel: pixels){
stddev += Math.pow(pixel - mean, 2);
}
return Math.sqrt(stddev/(pixels.length - 1));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to exclude pixels with nodata value when computing std-dev?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should include/exclude no data values in std based on the flag.

}
}
}
mean = sum / count;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be aware of the cases when all pixels in the band are nodata value. PostGIS returns all statistics except count as NULL. I think both NULL and NaN are sane choices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, will add a check before the compute code and return NULL.

for (double pixel: pixels) {
sum += pixel;
if (pixel < min) {
min = pixel;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both of these for loops can be merged

Comment on lines 137 to 143
private static double getStddev(double[] pixels, double mean) {
double stddev = 0;
for(double pixel: pixels){
stddev += Math.pow(pixel - mean, 2);
}
return Math.sqrt(stddev/(pixels.length - 1));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should include/exclude no data values in std based on the flag.

for(double pixel: pixels){
stddev += Math.pow(pixel - mean, 2);
}
return Math.sqrt(stddev/(pixels.length - 1));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standard deviation for a sample requires dividing by (n - 1). Since we're not sampling yet and are considering all pixels, we should be dividing by n.

@@ -74,6 +74,116 @@ public void testBandNoDataValueIllegalBand() throws FactoryException, IOExceptio
assertEquals("Provided band index 2 is not present in the raster", exception.getMessage());
}

@Test
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the tests, since we're dealing with means and std which are small, can we see if delta of 1e-6 is sufficient instead of 0.1d?

@jiayuasu jiayuasu added this to the sedona-1.5.0 milestone Aug 18, 2023
@jiayuasu jiayuasu merged commit 92fb141 into apache:master Aug 18, 2023
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants