Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird proportion results for group_polys #48

Closed
kaijagahm opened this issue Mar 13, 2023 · 4 comments · Fixed by #49
Closed

Weird proportion results for group_polys #48

kaijagahm opened this issue Mar 13, 2023 · 4 comments · Fixed by #49
Assignees

Comments

@kaijagahm
Copy link

Hi,
My colleague and I are having a problem with the KDE implementation for the group_polys function. The proportions being returned aren't logical--they're much greater than 1.

My colleague noticed that for any self overlaps, the proportion returned is very close to, or exactly, 10000, leading her to think that there should be some kind of division by 10000 somewhere in the function. Maybe the units aren't matching up between hectares and m^2?

Here is a reproducible example using some of her data:

library(spatsoc) # load spatsoc

# here is a tiny tiny bit of data for two individuals
test.dat <- structure(list(id = c("6468", "6468", "6468", "6468", "6468", 
                      "6468", "6468", "6468", "6468", "6468", "6468", "6468", "6468", 
                      "6468", "6468", "6468", "6468", "6468", "6468", "6468", "6468", 
                      "6468", "9576_9577", "9576_9577", "9576_9577", "9576_9577", "9576_9577", 
                      "9576_9577", "9576_9577", "9576_9577", "9576_9577", "9576_9577", 
                      "9576_9577", "9576_9577", "9576_9577", "9576_9577", "9576_9577", 
                      "9576_9577", "9576_9577", "9576_9577", "9576_9577"), x = c(573853.9685, 
                                                                                 573870.0016, 573869.3368, 573867.4952, 573866.979, 573864.5267, 
                                                                                 573856.9952, 573856.0194, 573858.5259, 573862.3601, 573871.0058, 
                                                                                 573870.1924, 573865.5344, 573862.4916, 573864.0015, 573858.669, 
                                                                                 573861.5062, 573859.0108, 573857.3564, 573856.7867, 573865.0035, 
                                                                                 573863.0313, 573813.5108, 573805.4084, 573802.0335, 573801.7041, 
                                                                                 573804.9867, 573827.0013, 573827.0013, 573828.0181, 573828.0181, 
                                                                                 573833.045, 573833.045, 573833.045, 573837.0256, 573835.0206, 
                                                                                 573838.9847, 573842.5092, 573843.9913, 573843.9913, 573834.9678
                      ), y = c(4261981.792, 4261974.239, 4261968.295, 4261973.958, 
                               4261978.976, 4261995.718, 4262008.009, 4262009.026, 4261994.5, 
                               4261981.975, 4261981.004, 4261982.606, 4261986.045, 4261984.005, 
                               4261982.992, 4261966.662, 4261953.983, 4261953.983, 4261953.983, 
                               4261957.96, 4261966.496, 4261977.049, 4261939.988, 4261914.996, 
                               4261915.001, 4261919.633, 4261924.718, 4261936.978, 4261936.978, 
                               4261954.502, 4261955.046, 4261955.046, 4261955.046, 4261965.979, 
                               4261965.979, 4261920.684, 4261936.504, 4261958.034, 4261971.485, 
                               4261979.951, 4261979.951), timegroup = c(16L, 16L, 16L, 16L, 
                                                                        16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
                                                                        16L, 16L, 16L, 16L, 16L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 
                                                                        45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L)), row.names = c(NA, 
                                                                                                                                               -41L), spec = structure(list(cols = list(time_agg...1 = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                   "collector")), time_agg...2 = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                             "collector")), id = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                                                                                                                             "collector")), DateTime = structure(list(format = ""), class = c("collector_datetime", 
                                                                                                                                                                                                                                                                                                                                                                                                              "collector")), time = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                "collector")), x = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               "collector")), y = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              "collector")), count = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 "collector"))), default = structure(list(), class = c("collector_guess", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       "collector")), delim = ","), class = "col_spec"), class = c("data.table", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             "data.frame"))

# run group_polys
polygroup.kde <- group_polys(DT = test.dat, area = TRUE, hrType = 'kernel',
                             hrParams = list(h = "href", kern = "bivnorm", 
                                             # hlim = c(0.1, 1.5), 
                                             grid = 150, extent = 4
                             ),
                             projection = 'EPSG:26910', id = 'id',
                             coords = c('x', 'y'))

# view the results. We can see that the results in the proportion column are nonsensical.
polygroup.kde

# Just as a comparison, running group_polys with hrType = "mcp" works fine for the same data
polygroup.mcp <- group_polys(DT = test.dat, area = TRUE, hrType = "mcp", hrParams = list(percent = 95), projection = "EPSG:26910", id = 'id', coords = c("x", "y"))
polygroup.mcp # (in this particular example, the individuals don't overlap, which is fine, but at least the self-overlap values are logical now: ~1).

Can you help figure out what is going on? The documentation isn't very clear about how to implement the KDE overlap. If we're doing something wrong or need to provide e.g. additional coordinate/unit information, let us know (and also that would be good to update in the documentation).

Thanks in advance for any help you can provide!

@robitalec
Copy link
Member

Hello @kaijagahm,

Thanks for pointing this out - it's fixed in the related PR so be sure to install the dev version after I merge it.

Internally, rgeos::gArea and the kernel's areas were in different units, which is why the proportion was off.
The areas returned now are forced to be in m2 like gArea, and this fixes the proportions.
I had noticed a strange result from kernel group_polys recently but I'm in the middle of my comprehensive exam so didn't get to it immediately.

I've added a test to the package as well to make sure this is monitored moving forward.

Of interest, I'll be doing a refactor of the package to remove the dependency on deprecated spatial packages (eg. rgeos) sometime this spring/summer. Good to know that there are users out there using the group_polys functions, hopefully our dependencies make the switch over too so it is a simple fix to get rid of the deprecated packages.

@robitalec
Copy link
Member

robitalec commented Mar 13, 2023

Here's your MWE with the fixed units

library(spatsoc) # load spatsoc

# here is a tiny tiny bit of data for two individuals
test.dat <- structure(list(id = c("6468", "6468", "6468", "6468", "6468", 
                                  "6468", "6468", "6468", "6468", "6468", "6468", "6468", "6468", 
                                  "6468", "6468", "6468", "6468", "6468", "6468", "6468", "6468", 
                                  "6468", "9576_9577", "9576_9577", "9576_9577", "9576_9577", "9576_9577", 
                                  "9576_9577", "9576_9577", "9576_9577", "9576_9577", "9576_9577", 
                                  "9576_9577", "9576_9577", "9576_9577", "9576_9577", "9576_9577", 
                                  "9576_9577", "9576_9577", "9576_9577", "9576_9577"), x = c(573853.9685, 
                                                                                             573870.0016, 573869.3368, 573867.4952, 573866.979, 573864.5267, 
                                                                                             573856.9952, 573856.0194, 573858.5259, 573862.3601, 573871.0058, 
                                                                                             573870.1924, 573865.5344, 573862.4916, 573864.0015, 573858.669, 
                                                                                             573861.5062, 573859.0108, 573857.3564, 573856.7867, 573865.0035, 
                                                                                             573863.0313, 573813.5108, 573805.4084, 573802.0335, 573801.7041, 
                                                                                             573804.9867, 573827.0013, 573827.0013, 573828.0181, 573828.0181, 
                                                                                             573833.045, 573833.045, 573833.045, 573837.0256, 573835.0206, 
                                                                                             573838.9847, 573842.5092, 573843.9913, 573843.9913, 573834.9678
                                  ), y = c(4261981.792, 4261974.239, 4261968.295, 4261973.958, 
                                           4261978.976, 4261995.718, 4262008.009, 4262009.026, 4261994.5, 
                                           4261981.975, 4261981.004, 4261982.606, 4261986.045, 4261984.005, 
                                           4261982.992, 4261966.662, 4261953.983, 4261953.983, 4261953.983, 
                                           4261957.96, 4261966.496, 4261977.049, 4261939.988, 4261914.996, 
                                           4261915.001, 4261919.633, 4261924.718, 4261936.978, 4261936.978, 
                                           4261954.502, 4261955.046, 4261955.046, 4261955.046, 4261965.979, 
                                           4261965.979, 4261920.684, 4261936.504, 4261958.034, 4261971.485, 
                                           4261979.951, 4261979.951), timegroup = c(16L, 16L, 16L, 16L, 
                                                                                    16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 16L, 
                                                                                    16L, 16L, 16L, 16L, 16L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 
                                                                                    45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L, 45L)), row.names = c(NA, 
                                                                                                                                                           -41L), spec = structure(list(cols = list(time_agg...1 = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                               "collector")), time_agg...2 = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                         "collector")), id = structure(list(), class = c("collector_character", 
                                                                                                                                                                                                                                                                                                                                                         "collector")), DateTime = structure(list(format = ""), class = c("collector_datetime", 
                                                                                                                                                                                                                                                                                                                                                                                                                          "collector")), time = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                            "collector")), x = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           "collector")), y = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          "collector")), count = structure(list(), class = c("collector_double", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             "collector"))), default = structure(list(), class = c("collector_guess", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   "collector")), delim = ","), class = "col_spec"), class = c("data.table", 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               "data.frame"))

# run group_polys
polygroup.kde <- group_polys(DT = test.dat, area = TRUE, hrType = 'kernel',
                             hrParams = list(h = "href", kern = "bivnorm", 
                                             # hlim = c(0.1, 1.5), 
                                             grid = 150, extent = 4
                             ),
                             projection = 'EPSG:26910', id = 'id',
                             coords = c('x', 'y'))
#> Registered S3 methods overwritten by 'adehabitatMA':
#>   method                       from
#>   print.SpatialPixelsDataFrame sp  
#>   print.SpatialPixels          sp
#> Warning: GEOS support is provided by the sf and terra packages among others

# view the results. We can see that the results in the proportion column are nonsensical.
polygroup.kde
#>          ID1       ID2      area proportion
#> 1:      6468      6468 2429.6249  0.9999998
#> 2:      6468 9576_9577  923.2739  0.3800067
#> 3: 9576_9577      6468  923.2739  0.1580150
#> 4: 9576_9577 9576_9577 5842.9506  1.0000000

# Just as a comparison, running group_polys with hrType = "mcp" works fine for the same data
polygroup.mcp <- group_polys(DT = test.dat, area = TRUE, hrType = "mcp", hrParams = list(percent = 95), projection = "EPSG:26910", id = 'id', coords = c("x", "y"))
#> Warning: GEOS support is provided by the sf and terra packages among others
polygroup.mcp # (in this particular example, the individuals don't overlap, which is fine, but at least the self-overlap values are logical now: ~1).
#>          ID1       ID2      area proportion
#> 1:      6468      6468  488.7712  0.9999999
#> 2: 9576_9577 9576_9577 1409.3551  1.0000001

Created on 2023-03-13 with reprex v2.0.2

Note: there is a small difference between the area calculated by intersection and by the polygons @area. Hopefully this is resolved after the future refactor

@kaijagahm
Copy link
Author

Thanks for the quick response, @robitalec!

Glad to hear you'll be refactoring the package--I'll be excited to see where it goes! I'll pass this along to my colleague.

@robitalec
Copy link
Member

You're welcome, all the best!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants