`length of NULL cannot be changed` using readr::read_tsv (readr v1.1.1) #750

Closed
stianlagstad opened this Issue Nov 25, 2017 · 10 comments

stianlagstad commented Nov 25, 2017

col_types_defuse = readr::cols_only(
  "cluster_id" = readr::col_character(),
  "splitr_sequence" = readr::col_character(),
  "splitr_count" = readr::col_integer(),
  "gene1" = readr::col_character(),
  "gene2" = readr::col_character(),
  "gene_chromosome1" = readr::col_character(),
  "gene_chromosome2" = readr::col_character(),
  "gene_name1" = readr::col_character(),
  "gene_name2" = readr::col_character(),
  "gene_strand1" = readr::col_character(),
  "gene_strand2" = readr::col_character(),
  "genomic_break_pos1" = readr::col_integer(),
  "genomic_break_pos2" = readr::col_integer(),
  "span_count" = readr::col_integer(),
  "orf" = readr::col_character(),
  "probability" = readr::col_number()
)
readr::read_tsv(
  file = "defuse_833ke_results.filtered.tsv",
  col_types = col_types_defuse
)

This leads to a lot of warning messages:

Warning messages:
1: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
2: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
...

Is there something I'm doing wrong? I'm doing this from the rocker/rstudio:devel docker image.

defuse_833ke_results.filtered.tsv.tar.gz


Contributor

cderv commented Nov 26, 2017

To provide more feedback, I tried to reproduce this on my computer and got no warnings. I only tested on R 3.4.2 with readr 1.1.1, not on rocker/rstudio:devel or R-devel, which may be the cause of the issue.
A full reprex to help test it further:

reprex::reprex_info()
#> Created by the reprex package v0.1.1.9000 on 2017-11-26

# Download the file in temp folder
url <- "https://github.com/tidyverse/readr/files/1503502/defuse_833ke_results.filtered.tsv.tar.gz"
tmp_file <- tempfile()
download.file(url, tmp_file)
tmp_dir <- tempfile("untar-dir-")
untar(tmp_file, list = TRUE)
#> [1] "defuse_833ke_results.filtered.tsv"
untar(tmp_file, exdir = tmp_dir)
file_name <- list.files(tmp_dir)
path_to_file <- file.path(tmp_dir, file_name)

# Define col types
col_types_defuse = readr::cols_only(
  "cluster_id" = readr::col_character(),
  "splitr_sequence" = readr::col_character(),
  "splitr_count" = readr::col_integer(),
  "gene1" = readr::col_character(),
  "gene2" = readr::col_character(),
  "gene_chromosome1" = readr::col_character(),
  "gene_chromosome2" = readr::col_character(),
  "gene_name1" = readr::col_character(),
  "gene_name2" = readr::col_character(),
  "gene_strand1" = readr::col_character(),
  "gene_strand2" = readr::col_character(),
  "genomic_break_pos1" = readr::col_integer(),
  "genomic_break_pos2" = readr::col_integer(),
  "span_count" = readr::col_integer(),
  "orf" = readr::col_character(),
  "probability" = readr::col_number()
)

# read the file with custom col type
readr::read_tsv(
  file = path_to_file,
  col_types = col_types_defuse
)
#> # A tibble: 17 x 16
#>    cluster_id
#>         <chr>
#>  1       5267
#>  2      12586
#>  3         58
#>  4       2406
#>  5       8264
#>  6       3085
#>  7       2416
#>  8      11901
#>  9       2546
#> 10       8250
#> 11      11758
#> 12       9493
#> 13       8958
#> 14       2374
#> 15      11759
#> 16      11946
#> 17      15540
#> # ... with 15 more variables: splitr_sequence <chr>, splitr_count <int>,
#> #   gene1 <chr>, gene2 <chr>, gene_chromosome1 <chr>,
#> #   gene_chromosome2 <chr>, gene_name1 <chr>, gene_name2 <chr>,
#> #   gene_strand1 <chr>, gene_strand2 <chr>, genomic_break_pos1 <int>,
#> #   genomic_break_pos2 <int>, orf <chr>, span_count <int>,
#> #   probability <dbl>

# delete tmp file
unlink(tmp_dir, recursive = TRUE)
unlink(tmp_file)

Thanks for providing the file.


stianlagstad commented Nov 26, 2017

Thank you for replying. It works without issue on R 3.4.2 for me as well.


stianlagstad commented Nov 26, 2017

To make this a bit easier to reproduce, I created a Docker image.

Dockerfile:

FROM rocker/rstudio:devel

COPY defuse_833ke_results.filtered.tsv /tmp/defuse_833ke_results.filtered.tsv
COPY script.R /tmp/script.R
RUN chmod u+x /tmp/script.R

RUN R -e 'install.packages("readr")'
RUN ./tmp/script.R

script.R:

#!/usr/bin/env Rscript
setwd("/tmp")
col_types_defuse = readr::cols_only(
  "cluster_id" = readr::col_character(),
  "splitr_sequence" = readr::col_character(),
  "splitr_count" = readr::col_integer(),
  "gene1" = readr::col_character(),
  "gene2" = readr::col_character(),
  "gene_chromosome1" = readr::col_character(),
  "gene_chromosome2" = readr::col_character(),
  "gene_name1" = readr::col_character(),
  "gene_name2" = readr::col_character(),
  "gene_strand1" = readr::col_character(),
  "gene_strand2" = readr::col_character(),
  "genomic_break_pos1" = readr::col_integer(),
  "genomic_break_pos2" = readr::col_integer(),
  "span_count" = readr::col_integer(),
  "orf" = readr::col_character(),
  "probability" = readr::col_number()
)
readr::read_tsv(
  file = "defuse_833ke_results.filtered.tsv",
  col_types = col_types_defuse
)
warnings()

Building the image (docker build . -t readrtest) should give the warnings:

...
46: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
47: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
48: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
49: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
50: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
 ---> 350e29a74bee
Removing intermediate container b016e1b46252
Successfully built 350e29a74bee
Successfully tagged readrtest:latest

I've added the Dockerfile, the script and the .tsv file in readr-error.tar.gz.

Changing the base image in the Dockerfile from rocker/rstudio:devel to rocker/rstudio makes the warnings disappear.

@jimhester jimhester added the bug label Dec 11, 2017


Member

jimhester commented Dec 12, 2017

Using the current devel version of readr ddbb5f4 and R-devel r73889 I am able to run the code in a rocker/rstudio:devel container without warnings.

So either a change in R or a change in the package seems to have fixed this.

@jimhester jimhester closed this Dec 12, 2017


stianlagstad commented Dec 12, 2017

A rerun of that dockerfile reproduced the warnings (I made sure to delete all my local containers and images before rebuilding), so I guess I'm not getting the latest version of readr (sessionInfo() within the container tells me it's readr_1.1.1). Should I install readr by other means to avoid the error? (Or just wait a few days?)
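
For anyone following along, here is a minimal sketch of how to confirm which R and readr versions a container actually picked up, without reading through the full sessionInfo() output. It uses only base R and utils; the requireNamespace() guard is an assumption added so the snippet also runs where readr is not installed.

```r
# Report the running R version (base R, always available).
r_version <- R.version.string
cat("R:     ", r_version, "\n")

# Report the installed readr version, if readr is installed at all.
if (requireNamespace("readr", quietly = TRUE)) {
  readr_version <- as.character(utils::packageVersion("readr"))
  cat("readr: ", readr_version, "\n")
} else {
  cat("readr is not installed in this library\n")
}
```

Adding these lines to the top of script.R would print the versions during the docker build, next to the warnings.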


Member

batpigandme commented Dec 12, 2017

I believe Jim is referring to the development version of readr, which you can install using:

# install.packages("devtools")
devtools::install_github("tidyverse/readr")

stianlagstad commented Dec 12, 2017

I have now tried installing readr from GitHub, but I still get the same warnings. This is the Dockerfile:

FROM rocker/rstudio:devel

RUN sudo apt-get update
RUN sudo apt-get install libssl-dev zlib1g-dev -y

COPY defuse_833ke_results.filtered.tsv /tmp/defuse_833ke_results.filtered.tsv
COPY script.R /tmp/script.R
RUN chmod u+x /tmp/script.R

RUN R -e 'install.packages("git2r")'
RUN R -e 'install.packages("devtools")'
RUN R -e 'library(devtools);devtools::install_github("tidyverse/readr")'
RUN ./tmp/script.R

Building that with docker rm $(docker ps -a -q) && docker rmi $(docker images -q) && docker build --no-cache=true . -t readrtest produced the following:

...
Step 10/10 : RUN ./tmp/script.R
 ---> Running in 9286c2bf0a13
There were 50 or more warnings (use warnings() to see the first 50)
# A tibble: 17 x 16
   cluster_id
        <chr>
 1       5267
 2      12586
 3         58
 4       2406
 5       8264
 6       3085
 7       2416
 8      11901
 9       2546
10       8250
11      11758
12       9493
13       8958
14       2374
15      11759
16      11946
17      15540
# ... with 15 more variables: splitr_sequence <chr>, splitr_count <int>,
#   gene1 <chr>, gene2 <chr>, gene_chromosome1 <chr>, gene_chromosome2 <chr>,
#   gene_name1 <chr>, gene_name2 <chr>, gene_strand1 <chr>, gene_strand2 <chr>,
#   genomic_break_pos1 <int>, genomic_break_pos2 <int>, orf <chr>,
#   span_count <int>, probability <dbl>
Warning messages:
1: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
2: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
3: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
4: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
5: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
6: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
7: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
8: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
9: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
10: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
11: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
12: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
13: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
14: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
15: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
16: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
17: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
18: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
19: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
20: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
21: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
22: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
23: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
24: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
25: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
26: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
27: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
28: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
29: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
30: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
31: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
32: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
33: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
34: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
35: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
36: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
37: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
38: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
39: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
40: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
41: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
42: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
43: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
44: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
45: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
46: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
47: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
48: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
49: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
50: In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed
 ---> cea2e539f359
Removing intermediate container 9286c2bf0a13
Successfully built cea2e539f359
Successfully tagged readrtest:latest

Since I have now installed readr from GitHub and made sure I pulled the latest rocker/rstudio image, I should have the latest version of both, right?


stianlagstad commented Jan 7, 2018

Could you try building the Dockerfile I posted above, @jimhester? I'm still seeing this issue.

jimhester added a commit that referenced this issue Apr 30, 2018

Do not try to resize skipped columns
These are represented as NULL values, and trying to resize `R_NilValue`
throws a warning in R 3.5+

Fixes #750, #833
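
The behavior this commit message describes can be sketched in plain R, independent of readr. This is only an illustration, not readr's actual C++ code: `x` stands in for a skipped column (a NULL value) and `msgs` is a hypothetical collector for any warning raised. It assumes, as the commit message states, that R 3.5+ warns when asked to resize NULL; on older R versions no message is recorded.

```r
# A skipped column is represented as NULL; try to resize it.
x <- NULL
msgs <- character()

withCallingHandlers(
  length(x) <- 3,
  warning = function(w) {
    # Record the warning text and suppress it so the script keeps running.
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

# The resize has no effect: x is still NULL.
print(is.null(x))
# On R 3.5+ this is expected to hold the "length of NULL cannot be changed" text.
print(msgs)
```

Because the resize never changes NULL anyway, skipping it for skipped columns (as the commit title says) avoids the warning without changing the result.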

@nevrome nevrome referenced this issue May 5, 2018

Closed

version 1.0 #41


stianlagstad commented May 22, 2018

I can confirm that this issue has been resolved by 47ea858. (Ref #833)

stianlagstad added a commit to stianlagstad/chimeraviz that referenced this issue Sep 16, 2018



lock bot commented Nov 18, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Nov 18, 2018
