Unable to read a file when setting vroom(col_names = FALSE, col_select = c(1, 2)) (Linux/Windows) #381

Closed
arisp99 opened this issue Oct 21, 2021 · 6 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@arisp99

arisp99 commented Oct 21, 2021

Hello!

When using GitHub Actions and setting vroom(col_names = FALSE), I am unable to read a file on Linux and Windows, but can on macOS. Unfortunately, I am using a Mac, so I have been unable to reproduce this error locally.

I am not sure if this is the place to report this or in https://github.com/r-lib/actions, but I suspect that {vroom} is the culprit. I have created an example package (https://github.com/arisp99/testactionspkg) which defines the following function:

read <- function(.file,
                 .col_names = FALSE,
                 .col_select = everything(),
                 .name_repair = "unique") {
  vroom::vroom(
    file = .file,
    col_names = .col_names,
    col_select = .col_select,
    show_col_types = FALSE,
    .name_repair = .name_repair
  )
}

When I attempt to use this function in a vignette, as testactionspkg::read(path, .col_names = FALSE), I get the following error on GitHub Actions for Windows and Ubuntu when running an R CMD check workflow derived from https://github.com/r-lib/actions:

* creating vignettes ... ERROR
Error: --- re-building 'read-file.Rmd' using rmarkdown
Quitting from lines 24-25 (read-file.Rmd) 
Error: Error: processing vignette 'read-file.Rmd' failed with diagnostics:
Names repair functions can't return `NA` values.
--- failed re-building 'read-file.Rmd'
SUMMARY: processing the following file failed:
  'read-file.Rmd'
Error: Error: Vignette re-building failed.
Execution halted
Error: Error in proc$get_built_file() : Build process failed
Calls: <Anonymous> ... build_package -> with_envvar -> force -> <Anonymous>
Execution halted
Error: Process completed with exit code 1.

Here is a link to the GHA run: https://github.com/arisp99/testactionspkg/runs/3969422512?check_suite_focus=true

A reprex for those working on Linux or Windows is below (this should fail, but I cannot test it...):

data <- tibble::tribble(
              ~X1,                ~X2,                ~X3,
        "Gene ID",    "PF3D7_0106300",    "PF3D7_0523000",
           "Gene",             "atp6",             "mdr1",
  "Mutation Name",   "atp6-Ala623Glu",    "mdr1-Asn86Tyr",
     "ExonicFunc", "missense_variant", "missense_variant",
      "AA Change",        "Ala623Glu",         "Asn86Tyr",
       "Targeted",              "Yes",              "Yes",
     "D10-JJJ-23",                "0",               "13",
     "D10-JJJ-43",                "0",                "0"
  )

vroom::vroom_write(data, "~/Desktop/file.csv")

vroom::vroom(
    file = "~/Desktop/file.csv",
    col_names = FALSE,
    show_col_types = FALSE,
  )

Any insight would be appreciated! Thanks!

@maelle

maelle commented Oct 22, 2021

The reprex does not fail for me though (Ubuntu)

data <- tibble::tribble(
              ~X1,                ~X2,                ~X3,
        "Gene ID",    "PF3D7_0106300",    "PF3D7_0523000",
           "Gene",             "atp6",             "mdr1",
  "Mutation Name",   "atp6-Ala623Glu",    "mdr1-Asn86Tyr",
     "ExonicFunc", "missense_variant", "missense_variant",
      "AA Change",        "Ala623Glu",         "Asn86Tyr",
       "Targeted",              "Yes",              "Yes",
     "D10-JJJ-23",                "0",               "13",
     "D10-JJJ-43",                "0",                "0"
  )

local_file <- withr::local_tempfile(fileext = ".csv")
#> Setting deferred event(s) on global environment.
#>   * Execute (and clear) with `withr::deferred_run()`.
#>   * Clear (without executing) with `withr::deferred_clear()`.
vroom::vroom_write(data, local_file)

vroom::vroom(
    file = local_file,
    col_names = FALSE,
    show_col_types = FALSE,
  )
#> # A tibble: 9 × 3
#>   X1            X2               X3              
#>   <chr>         <chr>            <chr>           
#> 1 X1            X2               X3              
#> 2 Gene ID       PF3D7_0106300    PF3D7_0523000   
#> 3 Gene          atp6             mdr1            
#> 4 Mutation Name atp6-Ala623Glu   mdr1-Asn86Tyr   
#> 5 ExonicFunc    missense_variant missense_variant
#> 6 AA Change     Ala623Glu        Asn86Tyr        
#> 7 Targeted      Yes              Yes             
#> 8 D10-JJJ-23    0                13              
#> 9 D10-JJJ-43    0                0

Created on 2021-10-22 by the reprex package (v2.0.0)

@arisp99
Author

arisp99 commented Oct 22, 2021

I just checked the above reprex using GHA and also found that it passed... However, when I run the following it fails:

data <- tibble::tribble(
              ~X1,                ~X2,                ~X3,
        "Gene ID",    "PF3D7_0106300",    "PF3D7_0523000",
           "Gene",             "atp6",             "mdr1",
  "Mutation Name",   "atp6-Ala623Glu",    "mdr1-Asn86Tyr",
     "ExonicFunc", "missense_variant", "missense_variant",
      "AA Change",        "Ala623Glu",         "Asn86Tyr",
       "Targeted",              "Yes",              "Yes",
     "D10-JJJ-23",                "0",               "13",
     "D10-JJJ-43",                "0",                "0"
  )

local_file <- withr::local_tempfile(fileext = ".csv")
vroom::vroom_write(data, local_file)

vroom::vroom(
    file = local_file,
    col_names = FALSE,
    col_select = c(1, 2),
    show_col_types = FALSE
  )

I have tested running the above code in two different repositories now. The first is the original repository I used in this issue (https://github.com/arisp99/testactionspkg), which uses an R CMD check GHA workflow. I also tested the above lines of code using a workflow that just runs the R code. This code fails on Ubuntu and Windows, but passes on macOS. The workflow is below (repo: https://github.com/arisp99/test-actions):

on:
  push:
    branches: [main, master]

name: Test Vroom

jobs:
  R-CMD-check:
    runs-on: ${{ matrix.config.os }}
    name: ${{ matrix.config.os }} (${{ matrix.config.r }})
    strategy:
      fail-fast: false
      matrix:
        config:
          - {os: macOS-latest,   r: 'release'}
          - {os: windows-latest, r: 'release'}
          - {os: ubuntu-latest,   r: 'devel', http-user-agent: 'release'}
          - {os: ubuntu-latest,   r: 'release'}
          - {os: ubuntu-latest,   r: 'oldrel-1'}

    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
      R_KEEP_PKG_SOURCE: yes

    steps:
      - uses: actions/checkout@v2

      - uses: r-lib/actions/setup-r@v1
        with:
          r-version: ${{ matrix.config.r }}
          http-user-agent: ${{ matrix.config.http-user-agent }}
          use-public-rspm: true

      - name: Run reprex
        run: |
          install.packages(c("vroom", "tibble", "withr"))
          data <- tibble::tribble(
            ~X1,                ~X2,                ~X3,
            "Gene ID",    "PF3D7_0106300",    "PF3D7_0523000",
            "Gene",             "atp6",             "mdr1",
            "Mutation Name",   "atp6-Ala623Glu",    "mdr1-Asn86Tyr",
            "ExonicFunc", "missense_variant", "missense_variant",
            "AA Change",        "Ala623Glu",         "Asn86Tyr",
            "Targeted",              "Yes",              "Yes",
            "D10-JJJ-23",                "0",               "13",
            "D10-JJJ-43",                "0",                "0"
          )

          local_file <- withr::local_tempfile(fileext = ".csv")
          vroom::vroom_write(data, local_file)

          vroom::vroom(
            file = local_file,
            col_names = FALSE,
            col_select = c(1, 2),
            show_col_types = FALSE
          )
        shell: Rscript {0}

@maelle thanks for checking the reprex! Does the modified code work on your local machine?

@maelle

maelle commented Oct 22, 2021

I get

data <- tibble::tribble(
              ~X1,                ~X2,                ~X3,
        "Gene ID",    "PF3D7_0106300",    "PF3D7_0523000",
           "Gene",             "atp6",             "mdr1",
  "Mutation Name",   "atp6-Ala623Glu",    "mdr1-Asn86Tyr",
     "ExonicFunc", "missense_variant", "missense_variant",
      "AA Change",        "Ala623Glu",         "Asn86Tyr",
       "Targeted",              "Yes",              "Yes",
     "D10-JJJ-23",                "0",               "13",
     "D10-JJJ-43",                "0",                "0"
  )

local_file <- withr::local_tempfile(fileext = ".csv")
#> Setting deferred event(s) on global environment.
#>   * Execute (and clear) with `withr::deferred_run()`.
#>   * Clear (without executing) with `withr::deferred_clear()`.
vroom::vroom_write(data, local_file)

vroom::vroom(
    file = local_file,
    col_names = FALSE,
    col_select = c(1, 2),
    show_col_types = FALSE
  )
#> Error: Names repair functions can't return `NA` values.

Created on 2021-10-22 by the reprex package (v2.0.0)

@arisp99
Author

arisp99 commented Oct 22, 2021

That is the error I have been seeing! Thanks @maelle!

To formalize the bug report: I would expect the following lines of code to return the first two columns of the .csv file. However, when setting both col_names = FALSE and col_select = c(1, 2), there is an error instead.

The code only works on macOS and fails on Linux/Windows. When setting either col_names or col_select individually, there is no error (see my GHA workflow here).

data <- tibble::tribble(
              ~X1,                ~X2,                ~X3,
        "Gene ID",    "PF3D7_0106300",    "PF3D7_0523000",
           "Gene",             "atp6",             "mdr1",
  "Mutation Name",   "atp6-Ala623Glu",    "mdr1-Asn86Tyr",
     "ExonicFunc", "missense_variant", "missense_variant",
      "AA Change",        "Ala623Glu",         "Asn86Tyr",
       "Targeted",              "Yes",              "Yes",
     "D10-JJJ-23",                "0",               "13",
     "D10-JJJ-43",                "0",                "0"
  )

local_file <- withr::local_tempfile(fileext = ".csv")
#> Setting deferred event(s) on global environment.
#>   * Execute (and clear) with `withr::deferred_run()`.
#>   * Clear (without executing) with `withr::deferred_clear()`.
vroom::vroom_write(data, local_file)

vroom::vroom(
    file = local_file,
    col_names = FALSE,
    col_select = c(1, 2),
    show_col_types = FALSE
  )
#> Error: Names repair functions can't return `NA` values.
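
The observation that each argument works on its own can be checked with a small sketch like the one below. It is only an illustration under assumptions: `try_read()` is a hypothetical helper (not part of vroom), the three-column tab-separated file is made up for the example, and the outcome of the combined call depends on the installed vroom version. The last lines show one possible workaround, reading every column with col_names = FALSE and subsetting afterwards.

```r
# Hypothetical example data: a small tab-separated file with three columns.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("foo\tbar\tbaz", "1\t2\t3"), tmp)

# Hypothetical helper (not part of vroom): returns "ok" if the read succeeds,
# otherwise the error message.
try_read <- function(...) {
  tryCatch(
    {
      vroom::vroom(tmp, delim = "\t", show_col_types = FALSE, ...)
      "ok"
    },
    error = function(e) paste("error:", conditionMessage(e))
  )
}

# Try each argument alone, then both together.
results <- c(
  col_names_only  = try_read(col_names = FALSE),
  col_select_only = try_read(col_select = c(1, 2)),
  both            = try_read(col_names = FALSE, col_select = c(1, 2))
)
print(results)

# Possible workaround: read all columns without col_select, then subset.
all_cols <- vroom::vroom(
  tmp,
  delim = "\t",
  col_names = FALSE,
  show_col_types = FALSE
)
first_two <- all_cols[, c(1, 2)]
print(first_two)
```

On an affected setup, only the `both` entry should report the names repair error; the workaround avoids the combination entirely, at the cost of indexing every column of the file.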

@arisp99 arisp99 changed the title from "Unable to read a file when setting vroom(col_names = FALSE) (Linux/Windows)" to "Unable to read a file when setting vroom(col_names = FALSE , col_select = c(1, 2)) (Linux/Windows)" Oct 22, 2021
@arisp99 arisp99 changed the title from "Unable to read a file when setting vroom(col_names = FALSE , col_select = c(1, 2)) (Linux/Windows)" to "Unable to read a file when setting vroom(col_names = FALSE, col_select = c(1, 2)) (Linux/Windows)" Oct 22, 2021
@jimhester
Collaborator

jimhester commented Oct 22, 2021

A more minimal reprex is

vroom::vroom(I("foo\tbar\n1\t\2\n"), col_names=FALSE, col_select = 1)
#> Error: Names repair functions can't return `NA` values.

Created on 2021-10-22 by the reprex package (v2.0.1)

@jimhester jimhester added the bug an unexpected problem or unintended behavior label Nov 9, 2021
@jimhester
Collaborator

Thank you for opening the issue and for supplying a reproducible example; it is a big help!

This should be fixed in the next release of vroom.
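
For anyone wanting to confirm whether their installed vroom still shows the problem after upgrading, a rough check along these lines may help. It is a sketch, not an official test: `still_affected` is a name made up for this example, the inline data uses a plain 2 in the second row, and no particular fixed version number is assumed.

```r
# TRUE if the installed vroom still errors on the minimal case,
# FALSE if the read succeeds.
still_affected <- tryCatch(
  {
    vroom::vroom(
      I("foo\tbar\n1\t2\n"),
      col_names = FALSE,
      col_select = 1,
      show_col_types = FALSE
    )
    FALSE
  },
  error = function(e) TRUE
)

# Report the result alongside the installed version.
message(
  "vroom ", as.character(utils::packageVersion("vroom")),
  if (still_affected) " still errors" else " reads the file",
  " with col_names = FALSE and col_select = 1"
)
```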

arisp99 added a commit to bailey-lab/miplicorn that referenced this issue Nov 10, 2021