Why doesn't cellnumbers() preserve the order in the query object? #29

Closed
CaptureOSM opened this issue Mar 8, 2021 · 4 comments

CaptureOSM commented Mar 8, 2021

Hi,

I find cellnumbers() very, very useful!

Let me just point out a mistake I made in the beginning, which might occur only with point data, when you want to combine a column from the original spatial query with the output of cellnumbers(). Here is a reproducible example:

library(sf)
library(raster)   # raster(), extract()
library(dplyr)    # %>%, mutate(), left_join()
library(tibble)   # rowid_to_column()
library(tabularaster)

# create some labeled data
sq <- seq(0, 1, by = 0.1)
p <- st_as_sf(expand.grid(sq, sq), coords = c("Var1", "Var2"))
p$label <- rep(c("A", "B"), length.out = nrow(p))

# load raster data
r <- raster(volcano)

# extract volcano data and append labels (e.g. for descriptive stats or regression)
cn <- cellnumbers(r, p) %>% mutate(elev = extract(r, cell_),
                                   label = p$label) # and here is the mistake: it is very
                                                    # tempting to simply `cbind`

Simply cbind-ing the output of cellnumbers() back onto the original spatial object (or transferring any extra columns from p to cn, as I am showing here) is wrong because the order of p is not preserved in cn. The correct way would be to join on the row id:

 cn <- cellnumbers(r, p) %>% mutate(elev = extract(r, cell_))
 p <- rowid_to_column(p) # additional step needed to create a join column 
 p <- left_join(p, cn, by = c('rowid' = 'object_'))
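
To see why a positional bind goes wrong once rows come back in a different order, here is a minimal sketch in base R only. The data frames `pts` and `res` are hypothetical stand-ins for `p` and the output of cellnumbers(); the shuffled order is contrived for illustration and is not meant to reproduce what tabularaster actually returned.

```r
# Hypothetical stand-ins: `pts` plays the role of the point data,
# `res` plays the role of a result table whose rows are NOT in input order.
pts <- data.frame(rowid = 1:4, label = c("A", "B", "A", "B"))
res <- data.frame(object_ = c(3L, 1L, 4L, 2L), cell_ = c(30L, 10L, 40L, 20L))

# Positional bind: row i of `res` is paired with row i of `pts`,
# regardless of which point the result row belongs to.
by_position <- cbind(res, label = pts$label)

# Keyed join: each result row is matched to its source row via the id column.
by_key <- merge(pts, res, by.x = "rowid", by.y = "object_")

# Count labels that ended up on the wrong row in each approach.
true_label <- pts$label[match(by_position$object_, pts$rowid)]
n_wrong_position <- sum(by_position$label != true_label)
n_wrong_key <- sum(by_key$label != pts$label[match(by_key$rowid, pts$rowid)])
```

In this toy case the positional bind mislabels some rows while the keyed join mislabels none, which is the same reasoning behind joining on `object_` rather than relying on row order.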

So I just wanted to point that out. Or is there a better way to perform this type of extraction followed by combination with the original point data?

I see several ways to improve this:

  • Order the output of cellnumbers() by object_.
  • Maybe consider renaming object_ to rowid, as semantically speaking they are the same.
  • Or provide an option in the cellnumbers() call to select a column from the spatial object to become the object_ column.

mdsumner commented Mar 8, 2021

eek, that's certainly not supposed to be like that ... I see other problems too, thanks for the report


mdsumner commented Mar 8, 2021

could you check your workflow with latest commit?

02d998e

I need to submit soon to align to the new spatstat, so this fix can get on CRAN soon (thank you!)

CaptureOSM commented

It works. Now object_ is ordered, and the output of the join and of cbind is the same.

cn <-
  cellnumbers(r, p) %>%
  mutate(elev = extract(r, cell_),
         label = p$label)

p <-
  rowid_to_column(p) %>%
  left_join(cn, by = c('rowid' = 'object_'))

sum(!(p$label.x == p$label.y))
# 0


mdsumner commented Mar 9, 2021

Thanks! Clearly I need some tests on this

mdsumner closed this as completed Nov 1, 2023