Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extracting variables from a formula in an Excel spreadsheet #21

Open
aammd opened this issue Feb 26, 2016 · 1 comment
Open

Extracting variables from a formula in an Excel spreadsheet #21

aammd opened this issue Feb 26, 2016 · 1 comment

Comments

@aammd
Copy link
Member

aammd commented Feb 26, 2016

@BIOL548O/all ,

What can you do if you have a spreadsheet, but the data you need is inside a formula? If you read the sheet in to R, you'll get the formula's output, but not the value that's actually inside the formula. for example, say the cell contained:

=4.67*D2/C2

What are our options?! I asked some of the #rstats Twitter community for their advice:

OK here goes.
Is it possible to read a value out of a *formula* in an excel spreadsheet? either xlsx or google drive #rstats cc @JennyBryan

— Andrew MacDonald (@polesasunder) February 26, 2016
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

apparently, One Day Soon, we shall be able to extract formulae from Excel directly into R (and more besides), and on that day this will become very straightforward.

In the meantime, it was the BRC's own Matt Strimas-Mackey (@mstrimas) who had what I considered the most practical advice:

@polesasunder @JennyBryan In Excel, find & replace = with '= will convert formula to text string, export to csv, import to R, and use regex

— Matt Strimas-Mackey (@StrimasMackey) February 26, 2016
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>

Although, note that you could probably just read the new .xls file directly into R with readxl.

@aammd
Copy link
Member Author

aammd commented Feb 26, 2016

Another, even simpler option: convert formulae to text (as above, substituting '=) and use the excel function MID to rip the required data right of the cell.

I know it seems like bad practice to be editing data in excel spreadsheets, but I think this would be OK. If you wanted, you could save the new excel spreadsheet under a new name, and add a readme to the data-raw/ folder explaining your actions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant