Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R CMD check test failure #279

Closed
pitkant opened this issue Oct 9, 2023 · 13 comments
Closed

R CMD check test failure #279

pitkant opened this issue Oct 9, 2023 · 13 comments
Assignees
Labels

Comments

@pitkant
Copy link
Member

pitkant commented Oct 9, 2023

In current httr2 branch there seems to be a test failure in test_03_get_eurostat_geospatial:

══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test_03_get_eurostat_geospatial.R:121:3'): get_eurostat_geospatial df ──
Error in `if (!setting_geom) {
    ix = if (is.character(i)) 
        which(i == names(x))
    else i
    if (is.null(value)) 
        agr = agr[-ix]
    else {
        if (length(ix) == 0 || ix > length(names(x))) 
            agr = st_agr(c(as.character(agr), NA_character_))
        else agr[ix] = NA
    }
}`: missing value where TRUE/FALSE needed
Backtrace:
     ▆
  1. ├─testthat::expect_message(...) at test_03_get_eurostat_geospatial.R:121:2
  2. │ └─testthat:::expect_condition_matching(...)
  3. │   └─testthat:::quasi_capture(...)
  4. │     ├─testthat (local) .capture(...)
  5. │     │ └─base::withCallingHandlers(...)
  6. │     └─rlang::eval_bare(quo_get_expr(.quo), quo_get_env(.quo))
  7. └─eurostat::get_eurostat_geospatial(nuts_level = "all", output_class = "df")
  8.   ├─sf::st_drop_geometry(shp)
  9.   └─sf:::st_drop_geometry.sf(shp)
 10.     └─sf::st_set_geometry(x, NULL)
 11.       ├─sf::`st_geometry<-`(`*tmp*`, value = value)
 12.       └─sf:::`st_geometry<-.sf`(`*tmp*`, value = value)
 13.         ├─base::`[[<-`(`*tmp*`, attr(x, "sf_column"), value = `<NULL>`)
 14.         └─sf:::`[[<-.sf`(`*tmp*`, attr(x, "sf_column"), value = `<NULL>`)

Specifically this seems to be related to this part of the code:

https://github.com/rOpenGov/eurostat/blob/327964ab096d9207b8e669fe74773d8fb11987a6/R/get_eurostat_geospatial.R#L231:L242

The sf::st_drop_geometry(shp) was complemented with shp$geometry <- NULL in PR #264 . I'm not sure if the problem has existed until that moment.

Something similar was reported in sf packages GitHub issues but this was supposed to be fixed back in 2019: r-spatial/sf#1057

This occurs only when building and running the check --as-cran from the terminal. In RStudio checking the package --as-cran works fine for some reason.

Any ideas why this is happening? @dieghernan

@dieghernan
Copy link
Member

Hi @pitkant

When #264 was introduced, these lines:

if (!requireNamespace("sf", quietly = TRUE)) {
# Remove geometry without sf package
shp$geometry <- NULL
# nocov end

were a little bit odd to me. This is because shp$geometry <- NULL is still relying on sf under the hood, so although is under !requireNamespace("sf", quietly = TRUE) inf fact uses sf.

The error you are getting are in fact coming from sf, see:

https://github.com/r-spatial/sf/blob/acaf33ed7b2ec4de965196daea7e001657c6e39a/R/sf.R#L383-L404

I would go for another approach for dropping the sf-specific attributes by:

  1. Removing the sf class of the object
  2. After that, unselecting the columng with the geometry. Note that step 1 is needed since the sf geometry is sticky.

I would try this alternative implementation, see:

library(eurostat)
#> Warning: package 'eurostat' was built under R version 4.1.3

data("eurostat_geodata_60_2016")

shp <- eurostat_geodata_60_2016


# Start alternative -----
# Remove geometry without sf package

# New classes, set tibble as base for classes
new_class <- intersect(class(shp), class(tibble::tibble()))

# Identify sf column with geometry for removal
geom_col <- attr(shp, "sf_column")

# Unclass and remove specific sf columns
class(shp) <- new_class

shp <- shp[, setdiff(names(shp), geom_col)]
# End alternative ----


# Contrast
class(shp)
#> [1] "data.frame"
names(shp)
#>  [1] "id"         "LEVL_CODE"  "NUTS_ID"    "CNTR_CODE"  "NAME_LATN" 
#>  [6] "NUTS_NAME"  "MOUNT_TYPE" "URBN_TYPE"  "COAST_TYPE" "FID"       
#> [11] "geo"

# Check if same result as sf::st_drop_geometry()
with_sf <- sf::st_drop_geometry(shp)
identical(with_sf, shp)
#> [1] TRUE

Created on 2023-10-09 with reprex v2.0.2

@pitkant
Copy link
Member Author

pitkant commented Oct 9, 2023

Thank you! I have to admit I was sloppy by not checking properly whether shp$geometry <- NULL was relying on sf package under the hood. If you can make a PR please do but I can also take responsibility for implementing the fixes, especially after you provided such excellent code example to start with.

@pitkant
Copy link
Member Author

pitkant commented Oct 10, 2023

Actually now that I was testing this it seems that the new geo_names function added in commit 3a6e48f is messing with attr(shp, "sf_column"):

> shp <- eurostat::eurostat_geodata_60_2016
> attr(shp, "sf_column")
[1] "geometry"
> shp2 <- eurostat::eurostat_geodata_60_2016
> shp2 <- eurostat:::geo_names(shp2)
> attr(shp2, "sf_column")
NULL

While it's nice that there's a specific order for columns I can't actually recall what was the rationale for adding this function in the package.

@dieghernan
Copy link
Member

Hi @pitkant

The function was introduced after this comment #264 (comment) and issue #240

What I have found is that geo_names() only work properly if sf is attached. The problem comes with the reordering of columns, and I and not finding a solution. Is it possible to relax that requirement?

dieghernan added a commit that referenced this issue Oct 11, 2023
@pitkant
Copy link
Member Author

pitkant commented Oct 12, 2023

Yes, although it is a neat function in my opinion and adds a consistent and predictable structure to geospatial objects. I would lean towards making all geospatial functions in eurostat package depend on having sf package installed. It would certainly make our job much easier. How much value does it add to read the singular 2016 file from data folder as a data.frame object? Sure it can be used to get some regional ID codes and region names, but other than that?

@dieghernan
Copy link
Member

Agree, I would do the changes requiring sf so we can avoid workarounds. One question, in which branch? This is also in v4-dev

@pitkant
Copy link
Member Author

pitkant commented Oct 13, 2023

v4-dev would be perfect, yes. I will merge httr2 branch there as well soon and if all tests pass it will go to v4 (and CRAN release) and after CRAN release into master branch.

@pitkant
Copy link
Member Author

pitkant commented Oct 24, 2023

I now get the following errors in get_eurostat_geospatial:


══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test_03_get_eurostat_geospatial.R:110:3'): get_eurostat_geospatial nuts levels ──
Error in `expect_compare(">", act, exp)`: Result of comparison must be a single logical value
Backtrace:
    ▆
 1. └─testthat::expect_gt(nrow(gall), nrow(gn3)) at test_03_get_eurostat_geospatial.R:110:2
 2.   └─testthat:::expect_compare(">", act, exp)
 3.     └─rlang::abort("Result of comparison must be a single logical value")
── Error ('test_03_get_eurostat_geospatial.R:192:3'): get_eurostat_geospatial df ──
Error in `expect_compare(">", act, exp)`: Result of comparison must be a single logical value
Backtrace:
    ▆
 1. └─testthat::expect_gt(nrow(gall), nrow(gn3)) at test_03_get_eurostat_geospatial.R:192:2
 2.   └─testthat:::expect_compare(">", act, exp)
 3.     └─rlang::abort("Result of comparison must be a single logical value")
── Failure ('test_03_get_eurostat_geospatial.R:318:3'): Check column names POLYGONS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:319:3'): Check column names POLYGONS from GISCO ──
names(poly) (`actual`) not identical to `col_order` (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:327:3'): Check column names POLYGONS from GISCO ──
`poly_df` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:328:3'): Check column names POLYGONS from GISCO ──
names(poly_df) (`actual`) not identical to col_order[-length(col_order)] (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:347:3'): Check column names POLYGONS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:348:3'): Check column names POLYGONS from GISCO ──
names(poly) (`actual`) not identical to `col_order` (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:356:3'): Check column names POLYGONS from GISCO ──
`poly_df` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:357:3'): Check column names POLYGONS from GISCO ──
names(poly_df) (`actual`) not identical to col_order[-length(col_order)] (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:543:3'): Check column names BORDERS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:552:3'): Check column names BORDERS from GISCO ──
`poly_df` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:576:3'): Check column names BORDERS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:585:3'): Check column names BORDERS from GISCO ──
`poly_df` is not an S3 object

[ FAIL 14 | WARN 0 | SKIP 0 | PASS 182 ]
Error: Test failures
Execution halted

1 error ✖ | 0 warnings ✔ | 0 notes ✔
Error: R CMD check found ERRORs
Execution halted

These are related to gsc_api_cache function lines 25-26 producing err_dwnload object that inherits try-error and produces a NULL:

err_dwnload <- suppressWarnings(try(download.file(url, file.local, 
    quiet = isFALSE(verbose), mode = "wb"), silent = TRUE))

This is due to HTTP status 503 from GISCO:

Error in download.file(url, file.local, quiet = isFALSE(verbose), mode = "wb",  : 
  cannot open URL 'https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_BN_60M_2010_4326_LEVL_0.geojson'
In addition: Warning messages:
1: In download.file(url, file.local, quiet = isFALSE(verbose), mode = "wb",  :
  downloaded length 0 != reported length 0
2: In download.file(url, file.local, quiet = isFALSE(verbose), mode = "wb",  :
  cannot open URL 'https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_BN_60M_2010_4326_LEVL_0.geojson': HTTP status was '503 Service Unavailable'

That made me wonder if GISCO is using a DDoS protection system as the situation is similar to the situation this Stack Overflow post is describing: https://stackoverflow.com/a/47729625/14413481

@dieghernan Do you get similar errors when running R CMD check on v4-dev branch or is it just me?

@dieghernan
Copy link
Member

Hi @pitkant , I received today an email from CRAN related with this, also some users are reporting this rOpenGov/giscoR#69 See there

@dieghernan
Copy link
Member

Hi @pitkant

New release of giscoR is now on CRAN, the error should be fixed now. Please allow a couple of days on the action since I think it uses posit package manager instead of CRAN, it may take some time until it updates giscoR to 0.4.0 https://packagemanager.posit.co/client/#/repos/cran/packages/giscoR/611842/overview?search=giscoR%23package-details

@hannesaddec
Copy link

I now get the following errors in get_eurostat_geospatial:


══ Failed tests ════════════════════════════════════════════════════════════════
── Error ('test_03_get_eurostat_geospatial.R:110:3'): get_eurostat_geospatial nuts levels ──
Error in `expect_compare(">", act, exp)`: Result of comparison must be a single logical value
Backtrace:
    ▆
 1. └─testthat::expect_gt(nrow(gall), nrow(gn3)) at test_03_get_eurostat_geospatial.R:110:2
 2.   └─testthat:::expect_compare(">", act, exp)
 3.     └─rlang::abort("Result of comparison must be a single logical value")
── Error ('test_03_get_eurostat_geospatial.R:192:3'): get_eurostat_geospatial df ──
Error in `expect_compare(">", act, exp)`: Result of comparison must be a single logical value
Backtrace:
    ▆
 1. └─testthat::expect_gt(nrow(gall), nrow(gn3)) at test_03_get_eurostat_geospatial.R:192:2
 2.   └─testthat:::expect_compare(">", act, exp)
 3.     └─rlang::abort("Result of comparison must be a single logical value")
── Failure ('test_03_get_eurostat_geospatial.R:318:3'): Check column names POLYGONS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:319:3'): Check column names POLYGONS from GISCO ──
names(poly) (`actual`) not identical to `col_order` (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:327:3'): Check column names POLYGONS from GISCO ──
`poly_df` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:328:3'): Check column names POLYGONS from GISCO ──
names(poly_df) (`actual`) not identical to col_order[-length(col_order)] (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:347:3'): Check column names POLYGONS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:348:3'): Check column names POLYGONS from GISCO ──
names(poly) (`actual`) not identical to `col_order` (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:356:3'): Check column names POLYGONS from GISCO ──
`poly_df` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:357:3'): Check column names POLYGONS from GISCO ──
names(poly_df) (`actual`) not identical to col_order[-length(col_order)] (`expected`).

`actual` is NULL
`expected` is a character vector ('id', 'LEVL_CODE', 'NUTS_ID', 'CNTR_CODE', 'NAME_LATN', ...)
── Failure ('test_03_get_eurostat_geospatial.R:543:3'): Check column names BORDERS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:552:3'): Check column names BORDERS from GISCO ──
`poly_df` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:576:3'): Check column names BORDERS from GISCO ──
`poly` is not an S3 object
── Failure ('test_03_get_eurostat_geospatial.R:585:3'): Check column names BORDERS from GISCO ──
`poly_df` is not an S3 object

[ FAIL 14 | WARN 0 | SKIP 0 | PASS 182 ]
Error: Test failures
Execution halted

1 error ✖ | 0 warnings ✔ | 0 notes ✔
Error: R CMD check found ERRORs
Execution halted

These are related to gsc_api_cache function lines 25-26 producing err_dwnload object that inherits try-error and produces a NULL:

err_dwnload <- suppressWarnings(try(download.file(url, file.local, 
    quiet = isFALSE(verbose), mode = "wb"), silent = TRUE))

This is due to HTTP status 503 from GISCO:

Error in download.file(url, file.local, quiet = isFALSE(verbose), mode = "wb",  : 
  cannot open URL 'https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_BN_60M_2010_4326_LEVL_0.geojson'
In addition: Warning messages:
1: In download.file(url, file.local, quiet = isFALSE(verbose), mode = "wb",  :
  downloaded length 0 != reported length 0
2: In download.file(url, file.local, quiet = isFALSE(verbose), mode = "wb",  :
  cannot open URL 'https://gisco-services.ec.europa.eu/distribution/v2/nuts/geojson/NUTS_BN_60M_2010_4326_LEVL_0.geojson': HTTP status was '503 Service Unavailable'

That made me wonder if GISCO is using a DDoS protection system as the situation is similar to the situation this Stack Overflow post is describing: https://stackoverflow.com/a/47729625/14413481

@dieghernan Do you get similar errors when running R CMD check on v4-dev branch or is it just me?

DM me on technical infrastructure issues..

@pitkant
Copy link
Member Author

pitkant commented Oct 31, 2023

Thank you for your message @hannesaddec , I will do so in the future.

@dieghernan I ran the tests with new version of giscoR, everything worked great

@pitkant
Copy link
Member Author

pitkant commented Dec 20, 2023

Closed with the CRAN release of package version 4.0.0

@pitkant pitkant closed this as completed Dec 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants