unexpected result for coarser resolutions #2

Open
hansvancalster opened this issue Jul 7, 2021 · 3 comments
@hansvancalster
hansvancalster commented Jul 7, 2021

This reprex uses the grts.sqlite downloaded from https://zenodo.org/record/2784012#.YOWn9OgzZPZ.

library(grtsdb)
library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1

plot_sample <- function(sample) {
  sample %>%
    st_as_sf(coords = c("x1c", "x2c"), crs = 31370) %>%
    ggplot() +
    geom_sf()
}

samplesize  <- 100
bbox <- rbind(x = c(22000, 258880), y = c(153050, 244030))

test_10m <- extract_sample(
  grtsdb = connect_db(here::here("data/ignored_data/c-mon/grts.sqlite")),
  samplesize = samplesize,
  bbox = bbox,
  cellsize = 10)

plot_sample(test_10m) # looks OK

add_level(bbox = bbox, 
          cellsize = 640, 
          grtsdb = connect_db(here::here("data/ignored_data/c-mon/grts.sqlite"))
          )
#> Required number of levels: 9
# adds levels 14 (20 m), 13, ..., 9 (640 m) to the database

test_20m <- extract_sample(
  grtsdb = connect_db(here::here("data/ignored_data/c-mon/grts.sqlite")),
  samplesize = samplesize,
  bbox = bbox,
  cellsize = 20
)

plot_sample(test_20m) # not correct

head(test_20m)
#>      x1c    x2c  ranking
#> 1 103130 242450 17825792
#> 2 103130 242430 17825793
#> 3 103150 242430 17825794
#> 4 103150 242450 17825795
#> 5 103110 242410 17825796
#> 6 103110 242390 17825797

test_40m <- extract_sample(
  grtsdb = connect_db(here::here("data/ignored_data/c-mon/grts.sqlite")),
  samplesize = samplesize,
  bbox = bbox,
  cellsize = 40
)

head(test_40m) # empty, also for all coarser resolutions
#> [1] x1c     x2c     ranking
#> <0 rows> (or 0-length row.names)

Created on 2021-07-07 by the reprex package (v2.0.0)
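
For reference, the level numbers in the `add_level()` comment above can be reproduced with a little arithmetic. This is only a sketch: `n_levels()` is a hypothetical helper, not a grtsdb function, and the formula (number of halvings needed to get from the longest side of the bounding box down to one cell) is an assumption that happens to match the reported "Required number of levels: 9" for 640 m.

```r
# Same bounding box as in the reprex.
bbox <- rbind(x = c(22000, 258880), y = c(153050, 244030))

# Hypothetical helper (not part of grtsdb): assumed number of quadtree
# levels needed to cover the longest side of `bbox` with cells of `cellsize`.
n_levels <- function(bbox, cellsize) {
  extent <- max(bbox[, 2] - bbox[, 1]) # longest side of the bounding box
  ceiling(log2(extent / cellsize))
}

cellsizes <- c(10, 20, 40, 640)
data.frame(cellsize = cellsizes, level = n_levels(bbox, cellsizes))
```

Under that assumption 10 m corresponds to level 15, 20 m to level 14, 40 m to level 13 and 640 m to level 9, which agrees with the levels listed in the comment.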

Session info
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                       
#>  version  R version 4.1.0 (2021-05-18)
#>  os       Windows 10 x64              
#>  system   x86_64, mingw32             
#>  ui       RTerm                       
#>  language (EN)                        
#>  collate  Dutch_Belgium.1252          
#>  ctype    Dutch_Belgium.1252          
#>  tz       Europe/Paris                
#>  date     2021-07-07                  
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date       lib source                      
#>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.1.0)              
#>  bit           4.0.4   2020-08-04 [1] CRAN (R 4.1.0)              
#>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.1.0)              
#>  blob          1.2.1   2020-01-20 [1] CRAN (R 4.1.0)              
#>  cachem        1.0.5   2021-05-15 [1] CRAN (R 4.1.0)              
#>  class         7.3-19  2021-05-03 [2] CRAN (R 4.1.0)              
#>  classInt      0.4-3   2020-04-07 [1] CRAN (R 4.1.0)              
#>  cli           2.5.0   2021-04-26 [1] CRAN (R 4.0.5)              
#>  colorspace    2.0-2   2021-06-24 [1] CRAN (R 4.0.5)              
#>  crayon        1.4.1   2021-02-08 [1] CRAN (R 4.1.0)              
#>  curl          4.3.1   2021-04-30 [1] CRAN (R 4.1.0)              
#>  DBI           1.1.1   2021-01-15 [1] CRAN (R 4.1.0)              
#>  digest        0.6.27  2020-10-24 [1] CRAN (R 4.1.0)              
#>  dplyr       * 1.0.7   2021-06-18 [1] CRAN (R 4.0.5)              
#>  e1071         1.7-7   2021-05-23 [1] CRAN (R 4.1.0)              
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)              
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)              
#>  fansi         0.5.0   2021-05-25 [1] CRAN (R 4.0.5)              
#>  farver        2.1.0   2021-02-28 [1] CRAN (R 4.1.0)              
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)              
#>  fs            1.5.0   2020-07-31 [1] CRAN (R 4.1.0)              
#>  generics      0.1.0   2020-10-31 [1] CRAN (R 4.1.0)              
#>  ggplot2     * 3.3.5   2021-06-25 [1] CRAN (R 4.0.5)              
#>  glue          1.4.2   2020-08-27 [1] CRAN (R 4.1.0)              
#>  grtsdb      * 0.1     2021-07-07 [1] Github (inbo/grtsdb@c6c4851)
#>  gtable        0.3.0   2019-03-25 [1] CRAN (R 4.1.0)              
#>  here          1.0.1   2020-12-13 [1] CRAN (R 4.1.0)              
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.1.0)              
#>  htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0)              
#>  httr          1.4.2   2020-07-20 [1] CRAN (R 4.1.0)              
#>  KernSmooth    2.23-20 2021-05-03 [2] CRAN (R 4.1.0)              
#>  knitr         1.33    2021-04-24 [1] CRAN (R 4.1.0)              
#>  lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.1.0)              
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.1.0)              
#>  memoise       2.0.0   2021-01-26 [1] CRAN (R 4.1.0)              
#>  mime          0.11    2021-06-23 [1] CRAN (R 4.0.5)              
#>  munsell       0.5.0   2018-06-12 [1] CRAN (R 4.1.0)              
#>  pillar        1.6.1   2021-05-16 [1] CRAN (R 4.1.0)              
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)              
#>  proxy         0.4-26  2021-06-07 [1] CRAN (R 4.0.5)              
#>  ps            1.6.0   2021-02-28 [1] CRAN (R 4.1.0)              
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.0)              
#>  R6            2.5.0   2020-10-28 [1] CRAN (R 4.1.0)              
#>  Rcpp          1.0.6   2021-01-15 [1] CRAN (R 4.1.0)              
#>  reprex        2.0.0   2021-04-02 [1] CRAN (R 4.1.0)              
#>  rlang         0.4.11  2021-04-30 [1] CRAN (R 4.1.0)              
#>  rmarkdown     2.9     2021-06-15 [1] CRAN (R 4.0.5)              
#>  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.1.0)              
#>  RSQLite       2.2.7   2021-04-22 [1] CRAN (R 4.1.0)              
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)              
#>  scales        1.1.1   2020-05-11 [1] CRAN (R 4.1.0)              
#>  sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)              
#>  sf          * 1.0-0   2021-06-09 [1] CRAN (R 4.0.5)              
#>  stringi       1.6.2   2021-05-17 [1] CRAN (R 4.0.5)              
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.0)              
#>  tibble        3.1.2   2021-05-16 [1] CRAN (R 4.1.0)              
#>  tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.1.0)              
#>  units         0.7-2   2021-06-08 [1] CRAN (R 4.0.5)              
#>  utf8          1.2.1   2021-03-12 [1] CRAN (R 4.1.0)              
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)              
#>  withr         2.4.2   2021-04-18 [1] CRAN (R 4.1.0)              
#>  xfun          0.24    2021-06-15 [1] CRAN (R 4.0.5)              
#>  xml2          1.3.2   2020-04-23 [1] CRAN (R 4.1.0)              
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.1.0)              
#> 
#> [1] C:/R/library
#> [2] C:/R/R-4.1.0/library

Any idea what is going on here? Is this a bug, or am I doing something wrong?
I ran this with inbo/grtsdb@v0.1.

@hansvancalster hansvancalster changed the title unexpected result for coarses resolutions unexpected result for coarser resolutions Aug 23, 2021
ThierryO added a commit that referenced this issue Jan 8, 2022
@hansvancalster
I updated grtsdb to version 0.2 and reran the above reprex, but the problem persists. Either there is still a bug, or I'm doing something wrong in the reprex.

library(grtsdb)
#> Warning: package 'grtsdb' was built under R version 4.1.2
library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(sf)
#> Linking to GEOS 3.9.0, GDAL 3.2.1, PROJ 7.2.1

plot_sample <- function(sample) {
  sample %>%
    count(x1c, x2c) %>%
    st_as_sf(coords = c("x1c", "x2c"), crs = 31370) %>%
    ggplot() +
    geom_sf(aes(size = n))
}

samplesize  <- 100
bbox <- rbind(x = c(22000, 258880), y = c(153050, 244030))

test_10m <- extract_sample(
  grtsdb = connect_db(here::here("data/c-mon/grts.sqlite")),
  samplesize = samplesize,
  bbox = bbox,
  cellsize = 10)

plot_sample(test_10m) # looks OK

add_level(bbox = bbox,
          cellsize = 640,
          grtsdb = connect_db(here::here("data/c-mon/grts.sqlite")),
          level = 9
          )
# adds levels 14 (20 m), 13, ..., 9 (640 m) to the database

test_20m <- extract_sample(
  grtsdb = connect_db(here::here("data/c-mon/grts.sqlite")),
  samplesize = samplesize,
  bbox = bbox,
  cellsize = 20
)

plot_sample(test_20m) # not correct

head(test_20m)
#>      x1c    x2c  ranking
#> 1 103130 242450 17825792
#> 2 103130 242430 17825793
#> 3 103150 242430 17825794
#> 4 103150 242450 17825795
#> 5 103110 242410 17825796
#> 6 103110 242390 17825797

test_40m <- extract_sample(
  grtsdb = connect_db(here::here("data/c-mon/grts.sqlite")),
  samplesize = samplesize,
  bbox = bbox,
  cellsize = 40
)

plot_sample(test_40m)

head(test_40m)
#>     x1c    x2c  ranking
#> 1 30620 159720 25165824
#> 2 30620 159720 25165825
#> 3 30620 159720 25165826
#> 4 30620 159720 25165827
#> 5 30620 159720 25165828
#> 6 30620 159720 25165829

Created on 2022-01-20 by the reprex package (v2.0.1.9000)
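
The symptoms in the `head()` output (repeated coordinates at 40 m, tightly clustered points at 20 m) can be put into numbers with a small check. This is a sketch in base R: `summarise_sample()` is a hypothetical helper, not part of grtsdb, and the example data frame simply retypes the six `test_40m` rows shown above.

```r
# Hypothetical helper (not part of grtsdb): count duplicated cell centres
# and report the spatial extent covered by a sample.
summarise_sample <- function(sample) {
  coords <- sample[, c("x1c", "x2c")]
  has_rows <- nrow(sample) > 0
  data.frame(
    n_rows = nrow(sample),
    n_unique_cells = nrow(unique(coords)),
    n_duplicated = sum(duplicated(coords)),
    # extent is undefined for an empty sample, so return NA in that case
    x_extent = if (has_rows) diff(range(sample$x1c)) else NA_real_,
    y_extent = if (has_rows) diff(range(sample$x2c)) else NA_real_
  )
}

# The six rows printed by head(test_40m) above.
example_40m <- data.frame(
  x1c = rep(30620, 6),
  x2c = rep(159720, 6),
  ranking = 25165824 + 0:5
)
summarise_sample(example_40m)
```

A spatially balanced sample of 100 cells would be expected to have no duplicated coordinates and an extent close to that of the bounding box, so running `summarise_sample(test_20m)` and `summarise_sample(test_40m)` should make the difference with `test_10m` explicit.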

Session info
sessioninfo::session_info()
#> - Session info  --------------------------------------------------------------
#>  hash: man singer: medium-light skin tone, fountain, drooling face
#> 
#>  setting  value
#>  version  R version 4.1.1 (2021-08-10)
#>  os       Windows 10 x64 (build 19042)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  Dutch_Belgium.1252
#>  ctype    Dutch_Belgium.1252
#>  tz       Europe/Paris
#>  date     2022-01-20
#>  pandoc   2.14.0.3 @ C:/Program Files/RStudio/bin/pandoc/ (via rmarkdown)
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version    date (UTC) lib source
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.1.0)
#>  backports     1.4.1      2021-12-13 [1] CRAN (R 4.1.2)
#>  bit           4.0.4      2020-08-04 [1] CRAN (R 4.1.0)
#>  bit64         4.0.5      2020-08-30 [1] CRAN (R 4.1.0)
#>  blob          1.2.2      2021-07-23 [1] CRAN (R 4.1.0)
#>  cachem        1.0.6      2021-08-19 [1] CRAN (R 4.1.1)
#>  class         7.3-19     2021-05-03 [2] CRAN (R 4.1.1)
#>  classInt      0.4-3      2020-04-07 [1] CRAN (R 4.1.0)
#>  cli           3.1.0      2021-10-27 [1] CRAN (R 4.1.1)
#>  colorspace    2.0-2      2021-06-24 [1] CRAN (R 4.1.0)
#>  crayon        1.4.2      2021-10-29 [1] CRAN (R 4.1.1)
#>  curl          4.3.2      2021-06-23 [1] CRAN (R 4.1.0)
#>  DBI           1.1.2      2021-12-20 [1] CRAN (R 4.1.2)
#>  digest        0.6.29     2021-12-01 [1] CRAN (R 4.1.2)
#>  dplyr       * 1.0.7      2021-06-18 [1] CRAN (R 4.1.0)
#>  e1071         1.7-9      2021-09-16 [1] CRAN (R 4.1.1)
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi         0.5.0      2021-05-25 [1] CRAN (R 4.1.2)
#>  farver        2.1.0      2021-02-28 [1] CRAN (R 4.1.0)
#>  fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.1.0)
#>  fs            1.5.2      2021-12-08 [1] CRAN (R 4.1.2)
#>  generics      0.1.1      2021-10-25 [1] CRAN (R 4.1.1)
#>  ggplot2     * 3.3.5      2021-06-25 [1] CRAN (R 4.1.0)
#>  glue          1.6.0      2021-12-17 [1] CRAN (R 4.1.2)
#>  grtsdb      * 0.2        2022-01-09 [1] https://inbo.r-universe.dev (R 4.1.2)
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 4.1.0)
#>  here          1.0.1      2020-12-13 [1] CRAN (R 4.1.0)
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.1.0)
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.1.1)
#>  httr          1.4.2      2020-07-20 [1] CRAN (R 4.1.0)
#>  KernSmooth    2.23-20    2021-05-03 [2] CRAN (R 4.1.1)
#>  knitr         1.37       2021-12-16 [1] CRAN (R 4.1.2)
#>  labeling      0.4.2      2020-10-20 [1] CRAN (R 4.1.0)
#>  lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.1.0)
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.1.0)
#>  memoise       2.0.1      2021-11-26 [1] CRAN (R 4.1.2)
#>  mime          0.12       2021-09-28 [1] CRAN (R 4.1.1)
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.1.0)
#>  pillar        1.6.4      2021-10-18 [1] CRAN (R 4.1.1)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.0)
#>  proxy         0.4-26     2021-06-07 [1] CRAN (R 4.1.0)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.0)
#>  R.cache       0.15.0     2021-04-30 [1] CRAN (R 4.1.1)
#>  R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.1.0)
#>  R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.1.0)
#>  R.utils       2.11.0     2021-09-26 [1] CRAN (R 4.1.0)
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.1.0)
#>  Rcpp          1.0.7      2021-07-07 [1] CRAN (R 4.1.0)
#>  reprex        2.0.1.9000 2021-10-08 [1] Github (tidyverse/reprex@9ca939f)
#>  rlang         0.4.12     2021-10-18 [1] CRAN (R 4.1.1)
#>  rmarkdown     2.11       2021-09-14 [1] CRAN (R 4.1.1)
#>  rprojroot     2.0.2      2020-11-15 [1] CRAN (R 4.1.0)
#>  RSQLite       2.2.8      2021-08-21 [1] CRAN (R 4.1.1)
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.0)
#>  scales        1.1.1      2020-05-11 [1] CRAN (R 4.1.0)
#>  sessioninfo   1.2.1      2021-11-02 [1] CRAN (R 4.1.1)
#>  sf          * 1.0-2      2021-07-26 [1] CRAN (R 4.1.0)
#>  stringi       1.7.5      2021-10-04 [1] CRAN (R 4.1.1)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.0)
#>  styler        1.6.2      2021-09-23 [1] CRAN (R 4.1.1)
#>  tibble        3.1.5      2021-09-30 [1] CRAN (R 4.1.1)
#>  tidyselect    1.1.1      2021-04-30 [1] CRAN (R 4.1.0)
#>  units         0.7-2      2021-06-08 [1] CRAN (R 4.1.0)
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.1.0)
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)
#>  withr         2.4.2      2021-04-18 [1] CRAN (R 4.1.0)
#>  xfun          0.29       2021-12-14 [1] CRAN (R 4.1.2)
#>  xml2          1.3.2      2020-04-23 [1] CRAN (R 4.1.0)
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.1.0)
#> 
#>  [1] C:/R/library
#>  [2] C:/R/R-4.1.1/library
#> 
#> ------------------------------------------------------------------------------

@ThierryO
A more condensed reprex is:

library(grtsdb)

samplesize  <- 100
bbox <- rbind(x = c(22000, 258880), y = c(153050, 244030))
db <- connect_db(file.path("~", "Downloads", "grts.sqlite"))

test_10m <- extract_sample(
  grtsdb = db, samplesize = samplesize, bbox = bbox, cellsize = 10
)
plot(test_10m[, 1:2], asp = 1)

test_20m <- extract_sample(
  grtsdb = db, samplesize = samplesize, bbox = bbox, cellsize = 20
)
plot(test_20m[, 1:2], asp = 1)

However, I can't reproduce it when creating a new database with a recent grtsdb version:

library(grtsdb)
samplesize  <- 100
bbox <- rbind(x = c(22000, 258880), y = c(153050, 244030))
db2 <- connect_db(file.path("~", "Downloads", "test.sqlite"))
test <- extract_sample(
  grtsdb = db2, samplesize = samplesize, bbox = bbox, cellsize = 500
)
plot(test[, 1:2], asp = 1)
test <- extract_sample(
  grtsdb = db2, samplesize = samplesize, bbox = bbox, cellsize = 2000
)
plot(test[, 1:2], asp = 1)
compact_db(db2)
test <- extract_sample(
  grtsdb = db2, samplesize = samplesize, bbox = bbox, cellsize = 500
)
plot(test[, 1:2], asp = 1)
test <- extract_sample(
  grtsdb = db2, samplesize = samplesize, bbox = bbox, cellsize = 2000
)
plot(test[, 1:2], asp = 1)
test <- extract_sample(
  grtsdb = db2, samplesize = samplesize, bbox = bbox, cellsize = 200
)
plot(test[, 1:2], asp = 1)

@hansvancalster
That's interesting. Is there a way to rewrite or restructure grts.sqlite (from the C-mon project) as if it had been written by a recent version of grtsdb?
I'm guessing this is not possible, because SQLite does not have a random seed setting, so rerunning the script that created grts.sqlite will not result in the same GRTS sample order.
The question is then whether the above problem of drawing a sample at higher hierarchical levels can be fixed for SQLite databases written with early versions of grtsdb (I think only the C-mon project's sample is affected?).

Could the difference in size between the two databases, rather than the grtsdb version that wrote them, interfere with the results of extract_sample() for coarser levels?
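
One way to start answering this would be to compare what is actually stored in the two files. The sketch below uses only DBI and RSQLite; `describe_db()` is a hypothetical helper, and since I don't know how grtsdb names or lays out its tables, it just lists every table with its row count. The two paths are the ones from the reprex above.

```r
library(DBI)

# Hypothetical helper (not part of grtsdb): list every table in an SQLite
# file together with its number of rows.
describe_db <- function(path) {
  stopifnot(file.exists(path)) # dbConnect() would otherwise create an empty file
  con <- dbConnect(RSQLite::SQLite(), normalizePath(path))
  on.exit(dbDisconnect(con), add = TRUE)
  tables <- dbListTables(con)
  n_rows <- vapply(
    tables,
    function(tbl) {
      query <- paste("SELECT COUNT(*) AS n FROM", dbQuoteIdentifier(con, tbl))
      as.numeric(dbGetQuery(con, query)$n)
    },
    numeric(1)
  )
  data.frame(table = tables, n_rows = unname(n_rows), stringsAsFactors = FALSE)
}

# Compare the old C-mon database with the freshly created one, if present.
for (path in file.path("~", "Downloads", c("grts.sqlite", "test.sqlite"))) {
  if (file.exists(path)) {
    print(describe_db(path))
  }
}
```

If the coarser-level tables in grts.sqlite have far fewer rows than expected, or differ in structure from those in test.sqlite, that would point to how the levels were written rather than to the overall size of the database.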
