Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formulas are loaded as 0 instead of calculated values #495

Closed
hlenka opened this issue Jul 12, 2018 · 6 comments
Closed

Formulas are loaded as 0 instead of calculated values #495

hlenka opened this issue Jul 12, 2018 · 6 comments

Comments

@hlenka
Copy link

hlenka commented Jul 12, 2018


Hi,

I have a problem loading .xlsx files that contain formulas to R using readxl. The columns with formulas load as 0 instead of the value calculated using the formula. When I manually convert formulas to values in excel, the files are loaded properly.

Here are my example files:

original file with formulas: formulas.xlsx
file where formulas were manually converted to values: values.xlsx

My problem seems similar to the one reported in the issue #186. However, I can load the readxl_example.xlsx file provided within the issue #186 without problems (similar to Jenny).

I tried creating the reprex of my problem but when I render it, the code is not executed properly. When I run the code in RStudio, it works fine. I suspect there might be a problem with the file path specification, but I could not resolve it :(

library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.3.3
#> Warning: package 'tibble' was built under R version 3.3.3
#> Warning: package 'tidyr' was built under R version 3.3.3
#> Warning: package 'readr' was built under R version 3.3.3
#> Warning: package 'purrr' was built under R version 3.3.3
#> Warning: package 'dplyr' was built under R version 3.3.3
#> Warning: package 'forcats' was built under R version 3.3.3
library(readxl)
#> Warning: package 'readxl' was built under R version 3.3.3
  
  
file.names <- list.files (pattern=".xlsx")
file.names
#> character(0)


file.formulas <- read_excel("formulas.xlsx", skip=14, 
                          col_names = TRUE, col_types = NULL)
#> Error: `path` does not exist: 'formulas.xlsx'

file.formulas[1:10, 1:10]
#> Error in eval(expr, envir, enclos): object 'file.formulas' not found


file.values <- read_excel("values.xlsx", skip=14, 
                 col_names = TRUE, col_types = NULL)
#> Error: `path` does not exist: 'values.xlsx'

file.values[1:10, 1:10]
#> Error in eval(expr, envir, enclos): object 'file.values' not found
@jennybc
Copy link
Member

jennybc commented Jul 13, 2018

I tried creating the reprex of my problem but when I render it, the code is not executed properly. When I run the code in RStudio, it works fine. I suspect there might be a problem with the file path specification, but I could not resolve it :(

reprex defaults to working in the session temp directory. So you need to either give an absolute path to the xlsx file or you need to request that reprex work in current working directory (where I assume the xlsx file lives).

Option 1: give absolute path looks something like read_excel("~/path/to/formulas.xlsx", ...).

Option 2: give reprex permission to work in current directory: reprex(outfile = NA)

@hlenka
Copy link
Author

hlenka commented Jul 19, 2018

Thanks for the advice, Jenny.

Here's a functional reprex. The problem with zeros is in columns E, A and Ca.

library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.3.3
#> Warning: package 'tibble' was built under R version 3.3.3
#> Warning: package 'tidyr' was built under R version 3.3.3
#> Warning: package 'readr' was built under R version 3.3.3
#> Warning: package 'purrr' was built under R version 3.3.3
#> Warning: package 'dplyr' was built under R version 3.3.3
#> Warning: package 'forcats' was built under R version 3.3.3
library(readxl)
#> Warning: package 'readxl' was built under R version 3.3.3

  
file.names <- list.files (pattern=".xlsx")
file.names
#> [1] "formulas.xlsx" "values.xlsx"


file.formulas <- read_excel("formulas.xlsx", skip=14, 
                          col_names = TRUE, col_types = NULL)

file.formulas[1:10, 1:10]
#> # A tibble: 10 x 10
#>      obs time   elapsed  date    hhmmss  plant_id TIME   E     A     Ca   
#>    <dbl> <chr>  <chr>    <chr>   <chr>   <chr>    <chr>  <chr> <chr> <chr>
#>  1    NA s      s        <NA>    <NA>    <NA>     s      mol ~ µmol~ µmol~
#>  2     1 15313~ 0        201807~ 11:51:~ 1        15313~ 0     0     0    
#>  3     2 15313~ 120.5    201807~ 11:53:~ 1        15313~ 0     0     0    
#>  4     3 15313~ 222.5    201807~ 11:55:~ 1        15313~ 0     0     0    
#>  5     4 15313~ 327      201807~ 11:56:~ 1        15313~ 0     0     0    
#>  6     5 15313~ 438      201807~ 11:58:~ 1        15313~ 0     0     0    
#>  7     6 15313~ 553      201807~ 12:00:~ 1        15313~ 0     0     0    
#>  8     7 15313~ 625      201807~ 12:01:~ 1        15313~ 0     0     0    
#>  9     8 15313~ 765.599~ 201807~ 12:04:~ 1        15313~ 0     0     0    
#> 10     9 15313~ 826.599~ 201807~ 12:05:~ 1        15313~ 0     0     0


file.values <- read_excel("values.xlsx", skip=14, 
                 col_names = TRUE, col_types = NULL)

file.values[1:10, 1:10]
#> # A tibble: 10 x 10
#>      obs time   elapsed  date   hhmmss  plant_id TIME   E      A     Ca   
#>    <dbl> <chr>  <chr>    <chr>  <chr>   <chr>    <chr>  <chr>  <chr> <chr>
#>  1    NA s      s        <NA>   <NA>    <NA>     s      mol m~ µmol~ µmol~
#>  2     1 15313~ 0        20180~ 11:51:~ 1        15313~ 1.967~ 35.2~ 1956~
#>  3     2 15313~ 120.5    20180~ 11:53:~ 1        15313~ 1.899~ 32.4~ 1460~
#>  4     3 15313~ 222.5    20180~ 11:55:~ 1        15313~ 1.887~ 32.1~ 961.~
#>  5     4 15313~ 327      20180~ 11:56:~ 1        15313~ 1.864~ 31.5~ 713.~
#>  6     5 15313~ 438      20180~ 11:58:~ 1        15313~ 1.802~ 25.5~ 470.~
#>  7     6 15313~ 553      20180~ 12:00:~ 1        15313~ 1.720~ 16.0~ 281.~
#>  8     7 15313~ 625      20180~ 12:01:~ 1        15313~ 1.716~ 3.89~ 95.4~
#>  9     8 15313~ 765.599~ 20180~ 12:04:~ 1        15313~ 1.689~ 0.60~ 49.2~
#> 10     9 15313~ 826.599~ 20180~ 12:05:~ 1        15313~ 1.710~ -2.5~ 2.43~

Created on 2018-07-19 by the reprex package (v0.2.0).

@zhujiedong
Copy link

I have the same question, and I also use R to read LI-6800, and it fails with values of zero in the cells has formula, does it can be fixed?

@jennybc
Copy link
Member

jennybc commented Dec 14, 2018

The best I can tell is that you need to somehow put this workbook into different calculation mode, so that the calculated values of the formulas are actually stored to the .xlsx. That is the only way readxl will be able to get them. readxl is not going to be able to read and parse the formulas and then compute their values.

@jennybc jennybc closed this as completed Dec 14, 2018
@jennybc
Copy link
Member

jennybc commented Dec 14, 2018

Yeah, I was able to get your xlsx into a state where read_excel() returns what you expect by fiddling with "Calculate Sheet" and "Calculate Now" on the Formulas menu in the Calculation group, in Excel itself. So you will need to adjust your workflow so that this happens for all sheets produced by this vendor or instrument, before you attempt to read with readxl.

@hlenka
Copy link
Author

hlenka commented Dec 14, 2018

Thanks for looking into this, Jenny. I'll try to implement the solution using VBA. For the time being I just copy-pasted the formulas as values which was doable for the number of files I had to handle so far.

@lock lock bot locked and limited conversation to collaborators Dec 14, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants