length of NULL cannot be changed using readr::read_tsv (readr v1.1.1) #750

Closed
stianlagstad opened this issue Nov 25, 2017 · 10 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@stianlagstad

col_types_defuse = readr::cols_only(
  "cluster_id" = readr::col_character(),
  "splitr_sequence" = readr::col_character(),
  "splitr_count" = readr::col_integer(),
  "gene1" = readr::col_character(),
  "gene2" = readr::col_character(),
  "gene_chromosome1" = readr::col_character(),
  "gene_chromosome2" = readr::col_character(),
  "gene_name1" = readr::col_character(),
  "gene_name2" = readr::col_character(),
  "gene_strand1" = readr::col_character(),
  "gene_strand2" = readr::col_character(),
  "genomic_break_pos1" = readr::col_integer(),
  "genomic_break_pos2" = readr::col_integer(),
  "span_count" = readr::col_integer(),
  "orf" = readr::col_character(),
  "probability" = readr::col_number()
)
readr::read_tsv(
  file = "defuse_833ke_results.filtered.tsv",
  col_types = col_types_defuse
)

This leads to a lot of warning messages:

Warning messages:
1: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
2: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
...

Am I doing something wrong? I'm running this in the rocker/rstudio:devel Docker image.

defuse_833ke_results.filtered.tsv.tar.gz
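
When a call emits this many identical warnings, it can help to collect them programmatically and tabulate them instead of reading the console. The following is a minimal base R sketch; `collect_warnings()` and `noisy()` are hypothetical helpers written for illustration (they are not part of readr), and `noisy()` only stands in for the `read_tsv()` call above.

```r
# Evaluate `expr`, collecting warning messages instead of printing them.
collect_warnings <- function(expr) {
  messages <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      # Record the message, then suppress the default warning output.
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = messages)
}

# Hypothetical stand-in for the read_tsv() call: warns `n` times.
noisy <- function(n) {
  for (i in seq_len(n)) warning("length of NULL cannot be changed")
  n
}

res <- collect_warnings(noisy(3))
table(res$warnings)  # one row per distinct message, with its count
```

To apply this to the report above, one would pass the `readr::read_tsv(...)` call in place of `noisy(3)`; the parsed tibble would then be available as `res$value`.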

@cderv
Contributor

cderv commented Nov 26, 2017

To provide more feedback, I tried to reproduce this on my computer and got no warnings. I did not test it on rocker/rstudio:devel or on an R-devel version, which could be the cause of the issue; I only tested on R 3.4.2 with readr 1.1.1.
A full reprex to help test it further:

reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-11-26

# Download the file in temp folder
url <- "https://github.com/tidyverse/readr/files/1503502/defuse_833ke_results.filtered.tsv.tar.gz"
tmp_file <- tempfile()
download.file(url, tmp_file)
tmp_dir <- tempfile("untar-dir-")
untar(tmp_file, list = TRUE)
#> [1] "defuse_833ke_results.filtered.tsv"
untar(tmp_file, exdir = tmp_dir)
file_name <- list.files(tmp_dir)
path_to_file <- file.path(tmp_dir, file_name)

# Define col types
col_types_defuse = readr::cols_only(
  "cluster_id" = readr::col_character(),
  "splitr_sequence" = readr::col_character(),
  "splitr_count" = readr::col_integer(),
  "gene1" = readr::col_character(),
  "gene2" = readr::col_character(),
  "gene_chromosome1" = readr::col_character(),
  "gene_chromosome2" = readr::col_character(),
  "gene_name1" = readr::col_character(),
  "gene_name2" = readr::col_character(),
  "gene_strand1" = readr::col_character(),
  "gene_strand2" = readr::col_character(),
  "genomic_break_pos1" = readr::col_integer(),
  "genomic_break_pos2" = readr::col_integer(),
  "span_count" = readr::col_integer(),
  "orf" = readr::col_character(),
  "probability" = readr::col_number()
)

# read the file with custom col type
readr::read_tsv(
  file = path_to_file,
  col_types = col_types_defuse
)
#> # A tibble: 17 x 16
#>    cluster_id
#>         <chr>
#>  1       5267
#>  2      12586
#>  3         58
#>  4       2406
#>  5       8264
#>  6       3085
#>  7       2416
#>  8      11901
#>  9       2546
#> 10       8250
#> 11      11758
#> 12       9493
#> 13       8958
#> 14       2374
#> 15      11759
#> 16      11946
#> 17      15540
#> # ... with 15 more variables: splitr_sequence <chr>, splitr_count <int>,
#> #   gene1 <chr>, gene2 <chr>, gene_chromosome1 <chr>,
#> #   gene_chromosome2 <chr>, gene_name1 <chr>, gene_name2 <chr>,
#> #   gene_strand1 <chr>, gene_strand2 <chr>, genomic_break_pos1 <int>,
#> #   genomic_break_pos2 <int>, orf <chr>, span_count <int>,
#> #   probability <dbl>

# delete tmp file
unlink(tmp_dir, recursive = TRUE)
unlink(tmp_file)

Thanks for providing the file.

@stianlagstad
Author

Thank you for replying. It works without issue on R 3.4.2 for me as well.
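
Since the behavior differs between R 3.4.2 and the devel image, it may be useful to record the exact R build and package version in each environment. Below is a small sketch using base R only; `describe_env()` is a hypothetical helper name, and packages that are not installed are reported as `NA`.

```r
# Summarise the R build and the versions of the given packages.
describe_env <- function(pkgs = "readr") {
  installed <- vapply(pkgs, function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_  # package not installed in this environment
    }
  }, character(1))
  list(
    r_version = R.version.string,     # full R version string
    r_status  = R.version$status,     # empty string for a released R
    svn_rev   = R.version$`svn rev`,  # source revision of this R build
    packages  = installed
  )
}

env <- describe_env()
str(env)
```

Running this in both the release and the devel container gives two summaries that can be pasted side by side into the issue.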

@stianlagstad
Author

stianlagstad commented Nov 26, 2017

To make this a bit easier to reproduce, I created a Docker image.

Dockerfile:

FROM rocker/rstudio:devel

COPY defuse_833ke_results.filtered.tsv /tmp/defuse_833ke_results.filtered.tsv
COPY script.R /tmp/script.R
RUN chmod u+x /tmp/script.R

RUN R -e 'install.packages("readr")'
RUN ./tmp/script.R

script.R:

#!/usr/bin/env Rscript
setwd("/tmp")
col_types_defuse = readr::cols_only(
  "cluster_id" = readr::col_character(),
  "splitr_sequence" = readr::col_character(),
  "splitr_count" = readr::col_integer(),
  "gene1" = readr::col_character(),
  "gene2" = readr::col_character(),
  "gene_chromosome1" = readr::col_character(),
  "gene_chromosome2" = readr::col_character(),
  "gene_name1" = readr::col_character(),
  "gene_name2" = readr::col_character(),
  "gene_strand1" = readr::col_character(),
  "gene_strand2" = readr::col_character(),
  "genomic_break_pos1" = readr::col_integer(),
  "genomic_break_pos2" = readr::col_integer(),
  "span_count" = readr::col_integer(),
  "orf" = readr::col_character(),
  "probability" = readr::col_number()
)
readr::read_tsv(
  file = "defuse_833ke_results.filtered.tsv",
  col_types = col_types_defuse
)
warnings()

Building the image (docker build . -t readrtest) should give the warnings:

...
46: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
47: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
48: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
49: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
50: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
 ---> 350e29a74bee
Removing intermediate container b016e1b46252
Successfully built 350e29a74bee
Successfully tagged readrtest:latest

I've added the Dockerfile, the script, and the .tsv file in readr-error.tar.gz.

Changing the base image in the Dockerfile from rocker/rstudio:devel to rocker/rstudio does not reproduce the warnings.

jimhester added the bug (an unexpected problem or unintended behavior) label Dec 11, 2017
@jimhester
Collaborator

Using the current devel version of readr (ddbb5f4) and R-devel (r73889), I am able to run the code in a rocker/rstudio:devel container without warnings.

So either a change in R or a change in the package seems to have fixed this.

@stianlagstad
Author

stianlagstad commented Dec 12, 2017

A rerun of that Dockerfile reproduced the warnings (I made sure to delete all my local containers and images before rebuilding), so I guess I'm not getting the latest version of readr (sessionInfo() within the container tells me it's readr_1.1.1). Should I install readr by other means to avoid the warnings? (Or just wait a few days?)
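
One way to tell from inside the container whether a build newer than the CRAN release is installed is to compare version numbers rather than read them by eye. This is only a sketch: `is_newer_than()` is a hypothetical helper, and `1.1.1.9000` is an assumed development version number that follows the common `.9000` suffix convention, not a value taken from this thread.

```r
# TRUE when `installed` is strictly newer than `release`.
# Both arguments are version strings such as "1.1.1".
is_newer_than <- function(installed, release) {
  package_version(installed) > package_version(release)
}

is_newer_than("1.1.1", "1.1.1")       # FALSE: same as the CRAN release
is_newer_than("1.1.1.9000", "1.1.1")  # TRUE: assumed development version
```

With readr installed, `utils::packageVersion("readr") > "1.1.1"` performs the same comparison directly against the installed package.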

@batpigandme
Contributor

I believe Jim is referring to the development version of readr, which you can install using:

# install.packages("devtools")
devtools::install_github("tidyverse/readr")

@stianlagstad
Author

stianlagstad commented Dec 12, 2017

I have now tried installing it from GitHub, but I get the same warnings. This is the Dockerfile:

FROM rocker/rstudio:devel

RUN sudo apt-get update
RUN sudo apt-get install libssl-dev zlib1g-dev -y

COPY defuse_833ke_results.filtered.tsv /tmp/defuse_833ke_results.filtered.tsv
COPY script.R /tmp/script.R
RUN chmod u+x /tmp/script.R

RUN R -e 'install.packages("git2r")'
RUN R -e 'install.packages("devtools")'
RUN R -e 'library(devtools);devtools::install_github("tidyverse/readr")'
RUN ./tmp/script.R

Building that with docker rm $(docker ps -a -q) && docker rmi $(docker images -q) && docker build --no-cache=true . -t readrtest produced the following:

...
Step 10/10 : RUN ./tmp/script.R
 ---> Running in 9286c2bf0a13
There were 50 or more warnings (use warnings() to see the first 50)
# A tibble: 17 x 16
   cluster_id
        <chr>
 1       5267
 2      12586
 3         58
 4       2406
 5       8264
 6       3085
 7       2416
 8      11901
 9       2546
10       8250
11      11758
12       9493
13       8958
14       2374
15      11759
16      11946
17      15540
# ... with 15 more variables: splitr_sequence <chr>, splitr_count <int>,
#   gene1 <chr>, gene2 <chr>, gene_chromosome1 <chr>, gene_chromosome2 <chr>,
#   gene_name1 <chr>, gene_name2 <chr>, gene_strand1 <chr>, gene_strand2 <chr>,
#   genomic_break_pos1 <int>, genomic_break_pos2 <int>, orf <chr>,
#   span_count <int>, probability <dbl>
Warning messages:
1: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
2: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
3: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
4: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
5: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
6: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
7: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
8: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
9: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
10: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
11: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
12: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
13: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
14: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
15: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
16: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
17: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
18: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
19: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
20: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
21: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
22: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
23: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
24: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
25: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
26: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
27: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
28: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
29: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
30: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
31: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
32: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
33: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
34: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
35: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
36: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
37: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
38: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
39: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
40: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
41: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
42: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
43: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
44: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
45: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
46: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
47: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
48: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
49: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
50: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
 ---> cea2e539f359
Removing intermediate container 9286c2bf0a13
Successfully built cea2e539f359
Successfully tagged readrtest:latest

Since I have now installed readr from GitHub and made sure I pulled the latest rocker/rstudio image, I should have the latest of both, right?

@stianlagstad
Author

Could you try building the Dockerfile I posted above, @jimhester? I'm still seeing this issue.

jimhester added a commit that referenced this issue Apr 30, 2018
These are represented as NULL values, and trying to resize `R_NilValue`
throws a warning in R 3.5+

Fixes #750, #833
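
The commit message above describes the cause: columns with no value are represented as NULL, and resizing them warns in R 3.5+. The actual fix lives in readr's compiled code and is not shown in this thread; the following is only an R-level analogy of the guard pattern, with `resize_columns()` as a hypothetical helper, to illustrate the idea of leaving NULL columns alone instead of resizing them.

```r
# Resize every column to `n` elements, leaving NULL columns untouched.
resize_columns <- function(cols, n) {
  lapply(cols, function(col) {
    if (is.null(col)) {
      return(NULL)  # nothing to resize, so `length<-` is never called on NULL
    }
    length(col) <- n  # truncates, or pads with NA, to `n` elements
    col
  })
}

cols <- list(a = 1:5, b = NULL, c = letters[1:5])
out <- resize_columns(cols, 3)
str(out)
```

The design point is simply that the NULL check happens before any attempt to change the length, so the resize is never attempted on a NULL value.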
@stianlagstad
Author

I can confirm that this issue has been resolved by 47ea858. (Ref #833)

stianlagstad added a commit to stianlagstad/chimeraviz that referenced this issue Sep 16, 2018
@lock

lock bot commented Nov 18, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

lock bot locked and limited conversation to collaborators Nov 18, 2018