length of NULL cannot be changed errors on installing R 3.5.0 #833

Closed
md0u80c9 opened this Issue Apr 25, 2018 · 16 comments

5 participants
@md0u80c9
Contributor

md0u80c9 commented Apr 25, 2018

Hi,

I'm in the early stages of trying to track this down - asking in case it is 'known' or anyone has any ideas before I spend too long on hunting it down.

I installed R 3.5.0 today and reinstalled packages along with it. I'm using readr 1.2.0, tibble 1.4.2.9001, rlang 0.2.0.9001.

My code sits in a package for loading my dataset. The code is 'known working' on my 3.4.4 machine and has been for some time so it looks like a regression which has occurred somewhere.

I've narrowed it down to calling read_csv which is generating the warnings (>50 warnings).

I'm using defined import columns. Because the column list is large, I'm managing it by splitting it into two definitions (columnTypes and followupColumnTypes), and then calling read_csv:

  columnTypes <- readr::cols(
    ProClinV1Id = readr::col_integer(),
    PatientId = readr::col_integer(),
    TeamCode = readr::col_character())

  followupColumnTypes <- readr::cols(
    LockedS8 = readr::col_skip(),
    LockedS8DateTime = readr::col_skip(),
    LockedS8UserName = readr::col_skip(),
    S8Status = readr::col_skip(),
    S8FollowUp = readr::col_factor(c('Y', 'N', 'NB', 'ND', NA)),
    S8FollowUpType = readr::col_factor(c('IP', 'T', 'O', 'P', NA)))

  importColumns <- columnTypes
  importColumns$cols <- c(columnTypes$cols,
                          followupColumnTypes$cols)

  importedData <- readr::read_csv(filename,
                                    col_names = TRUE,
                                    col_types = importColumns)

I'm getting >50 warnings of:

In read_tokens_(data, tokenizer, col_specs, col_names,  ... :
  length of NULL cannot be changed

Haven't yet managed to distill the example down further into a reprex - if it doesn't ring any bells with anyone I'll try to do that.

The imported data itself seems fine, so whatever is going on isn't affecting the output. It is, however, cluttering my debugging output to the point that I can't see the wood for the trees. I can work around it by disabling warnings on input, but that just ignores the issue.

The only major R thing I can think of is the byte-compiling by default causing a problem somewhere.
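
For the workaround mentioned above (silencing warnings on input), a narrower option than disabling all warnings is to muffle only this one message and let everything else through. A minimal sketch in base R, assuming the warning text matches exactly as reported; `muffle_null_length_warning` is a hypothetical helper name, not part of readr:

```r
# Hypothetical helper: evaluate `expr`, muffling only warnings whose message
# contains "length of NULL cannot be changed". Other warnings are untouched.
muffle_null_length_warning <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("length of NULL cannot be changed",
                conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Stand-in for the read_csv() call: raises the same warning text, then returns.
result <- muffle_null_length_warning({
  warning("length of NULL cannot be changed")
  42
})
```

In the package code the wrapped expression would be the `readr::read_csv(...)` call itself. This only hides the symptom; it does not address the underlying cause.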

@jimhester

Member

jimhester commented Apr 25, 2018

This seems to be similar to #750, if you can reproduce it with a small example dataset I will try and fix it.

@md0u80c9

Contributor

md0u80c9 commented Apr 25, 2018

Will try my best. First question is of course if size itself is the issue. Second will be to check different col types.

@md0u80c9

Contributor

md0u80c9 commented Apr 25, 2018

I have just reviewed issue 750, and I agree it sounds like the same issue. The interesting thing from that report was that the same code worked for some people and not for others, so it may be something environmental.

Could it be a platform / compiler issue? I’m using a Mac with High Sierra. I know there are some compiler warnings when installing - wonder if they shed any light on things.

Or a version dependency issue? Any dependencies to consider that may influence things?

@md0u80c9

Contributor

md0u80c9 commented Apr 25, 2018

Right - I have been trying to narrow this down. Can't make it a reprex just yet, but I think it has something to do with code handling col_skip().

I first reduced my defined cols list, which cut the number of warnings to 15. Within that reduced list, col_skip() was present 5 times.

It appears that each occurrence of col_skip has been causing the warning 3 times.

Furthermore, using cols_only with my subset of definitions generates >50 warnings again, which suggests that treating all the other columns as if they were col_skip() generates the additional warnings.

Does that help narrow it down at all?
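
One way to check a count like "3 warnings per col_skip()" without reading the console output by hand is to collect the warnings and tally them by message. A minimal sketch in base R; `count_warnings` is a hypothetical helper, and the loop below is only a stand-in for the real `read_csv()` call:

```r
# Hypothetical helper: evaluate `expr`, record every warning message raised,
# and return the value together with a per-message tally.
count_warnings <- function(expr) {
  messages <- character()
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning")  # keep going after recording the warning
    }
  )
  list(value = value, counts = table(messages))
}

# Stand-in expression that raises the same warning three times.
demo <- count_warnings({
  for (i in 1:3) warning("length of NULL cannot be changed")
  "done"
})
demo$counts
```

Running the real import through such a helper with one, two, and then three skipped columns would show whether the count scales with the number of col_skip() entries.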

@md0u80c9

Contributor

md0u80c9 commented Apr 25, 2018

I should point out that, if that is the case, it may be separate from (albeit possibly related to) the other error.

@md0u80c9

Contributor

md0u80c9 commented Apr 27, 2018

Hi,

I have tested this down to a bare minimum. I created the following as a CSV file:


# A tibble: 3 x 2
  Name    Number
  <chr>    <dbl>
1 Adam         1
2 Bob          2
3 Charlie      3

I then read the file in with:

read_csv('../minimal.csv', col_names = TRUE, col_types = cols(col_character(), col_skip()))

On my MacBook Pro running R 3.5.0 and the packages as described above I see:

# A tibble: 3 x 1
  Name   
  <chr>  
1 Adam   
2 Bob    
3 Charlie
Warning messages:
1: In read_tokens_(data, tokenizer, col_specs, col_names, locale_,  :
  length of NULL cannot be changed
2: In read_tokens_(data, tokenizer, col_specs, col_names, locale_,  :
  length of NULL cannot be changed

On my iMac running R 3.4 (I would need to check the package versions on the iMac again) this runs flawlessly.

The trigger is definitely from defining the column using col_skip(). The following work fine:

> read_csv('../minimal.csv', col_names = TRUE)
Parsed with column specification:
cols(
  Name = col_character(),
  Number = col_double()
)
# A tibble: 3 x 2
  Name    Number
  <chr>    <dbl>
1 Adam         1
2 Bob          2
3 Charlie      3
> read_csv('../minimal.csv', col_names = TRUE, col_types = cols(col_character(), col_integer()))
# A tibble: 3 x 2
  Name    Number
  <chr>    <int>
1 Adam         1
2 Bob          2
3 Charlie      3
> 

Does this help narrow it down?

@md0u80c9

Contributor

md0u80c9 commented Apr 27, 2018

Just in case it helps: I installed readr from GitHub using devtools::install_github, so it is compiled locally on the Mac. My clang version is:

clang --version
Apple LLVM version 9.1.0 (clang-902.0.39.1)
Target: x86_64-apple-darwin17.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

@md0u80c9

Contributor

md0u80c9 commented Apr 28, 2018

I tried this morning to narrow down any setup differences between the iMac and the MacBook Pro. I've ensured that both are running the same versions of tibble, rlang and readr from GitHub, and the same version of clang. The results are unchanged: the MacBook Pro shows the warnings and the iMac does not, on the same code. So the only remaining difference between the machines is the version of R.

jimhester added a commit that referenced this issue Apr 30, 2018

Do not try to resize skipped columns
These are represented as NULL values, and trying to resize `R_NilValue`
throws a warning in R 3.5+

Fixes #750, #833
@jimhester

Member

jimhester commented Apr 30, 2018

Thanks for the reprex, fixed by 47ea858

@jimhester jimhester closed this Apr 30, 2018

@md0u80c9

Contributor

md0u80c9 commented Apr 30, 2018

This is great - thank you!

@md0u80c9

Contributor

md0u80c9 commented Apr 30, 2018

Do you think this fixes issue 750 as well? I think they may be different based on the above.

@md0u80c9

Contributor

md0u80c9 commented Apr 30, 2018

Tested with the code that caused this problem initially and can confirm this fixes it.

@jciconsult

jciconsult commented Aug 9, 2018

I am using read_csv with a subset of columns (cols_only) in R 3.5.1 with everything on the latest version and I am getting the same warning message.

@thanosgatos

thanosgatos commented Aug 22, 2018

I have exactly the same issue as @jciconsult. Should we open a new issue for this?

@batpigandme

Member

batpigandme commented Aug 22, 2018

@thanosgatos, have you updated to the latest development version (i.e. devtools::install_github("tidyverse/readr"))? After doing so, I was unable to reproduce the warning.

suppressPackageStartupMessages(library(tidyverse))
mydat <- data_frame(one = c("Bob", "Mary", "Sue"),
                    two = c(1, 2, 3))
mydat
#> # A tibble: 3 x 2
#>   one     two
#>   <chr> <dbl>
#> 1 Bob       1
#> 2 Mary      2
#> 3 Sue       3
mycsv <- tempfile(fileext = ".csv")
write_csv(mydat, mycsv)
readat <- read_csv(mycsv, col_names = TRUE, cols(one = col_character(),
                                                 two = col_skip()))
readat
#> # A tibble: 3 x 1
#>   one  
#>   <chr>
#> 1 Bob  
#> 2 Mary 
#> 3 Sue

unlink(mycsv)

Created on 2018-08-22 by the reprex package (v0.2.0.9000).

If you open a new issue, please do so with a small reprex.

@thanosgatos

thanosgatos commented Aug 22, 2018

@batpigandme, I've been running the latest CRAN versions. Upgrading to the latest dev version from GitHub did indeed solve the issue.

Thank you!
