get_file directly into environment with user-specified file format #35
Comments
What are your thoughts? Would this restriction (i.e., only ingested datasets are returned as data.frames) be too limiting?
Well, here's a lead: "Confirming what Phil said - if the original ingested file was Stata (.dta) or SPSS (.sav or *.por), we use R package "foreign" to directly convert that saved original file to an .RData dataframe. For all the other supported formats, the dataframe is generated by R from the tab-delimited file and the variable metadata in the database." -- https://groups.google.com/d/msg/dataverse-community/QDRnM6ztbt8/AYynuwocBAAJ

Let me dig a bit.

Update. I'm pretty sure this R code is called: https://github.com/IQSS/dataverse/blob/v4.18.1/src/main/java/edu/harvard/iq/dataverse/rserve/scripts/dataverse_r_functions.R

From this Java code: https://github.com/IQSS/dataverse/blob/v4.18.1/src/main/java/edu/harvard/iq/dataverse/rserve/RemoteDataFrameService.java#L125
@pdurbin, that helped a lot.

@kuriwaki, this shows how inexperienced I still am with Dataverse. I didn't realize they really meant "RData" instead of "Rds". So unless Dataverse also offers Rds files soon, I totally support your proposal. In addition, what do you think about a function that always returns a data.frame for an ingested tab file? In that case, it never passes through the rds stage. Something like

For those who don't know, RData saves the equivalent of an environment/workspace, not necessarily a single rectangular dataset. When it's restored, all the variables used by the developer populate the client's environment. The user is forced to (at least initially) use the old names. Besides the naming complication, multiple variables can be contained, which can lead to more confusion.

Excerpt from Efficient R programming
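To make the RData vs. Rds distinction in the comment above concrete, here is a minimal base-R sketch. It is not the book excerpt and not code from Dataverse or the dataverse package; the object names (`scores`, `notes`, `my_data`) are made up for illustration.

```r
# Two illustrative objects (names are hypothetical).
scores <- data.frame(id = 1:2, score = c(2.5, 4))
notes  <- "collected in 2019"

rdata_path <- tempfile(fileext = ".RData")
rds_path   <- tempfile(fileext = ".rds")

save(scores, notes, file = rdata_path)  # several objects, stored with their names
saveRDS(scores, rds_path)               # exactly one object, no name stored

# load() recreates the objects under their original names. Loading into a
# fresh environment avoids overwriting anything in the caller's workspace.
env <- new.env()
restored_names <- load(rdata_path, envir = env)
print(restored_names)

# readRDS() returns the object itself, so the caller chooses the name.
my_data <- readRDS(rds_path)
print(my_data)
```

With `load()`, the user has to discover what was restored and under which names, which is the complication described above. With `readRDS()`, there is one object and the name is the user's choice.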
Thank you. My intention with the As for ingested datasets.. my sense is that Re:
This just in. A request for RDS support in Dataverse from @reikoch at IQSS/dataverse#6678

@wibeasley @kuriwaki please feel free to comment on that issue! You both know way more about R than I do! 😄
This functionality is now called I reread this conversation after implementing that PR. Re the above comment (#35 (comment)) by @wibeasley:
What the issue is about:
Issue: I think most users who want to get data from the R dataverse package want to start working with the data in their R environment right away. However, `get_file` only returns raw binary output which is not usable on its own.

Proposal: The help page shows how to write the class `raw` object into a temp file and read it back in. The proposed feature is to add an optional argument in `get_file` or make a function that does this write-in / read-in-again process automatically. Users will enter a function that will be used to read in the tempfile. An example function that does this is below.

How does this sound?
Created on 2019-12-16 by the reprex package (v0.3.0)
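For readers who cannot see the original example, the write-in / read-in-again process described in the proposal can be sketched roughly as follows. This is an illustration only, not the function that was posted or later merged: `read_raw_with` and its arguments are hypothetical names, and a small in-memory CSV stands in for the raw vector that `get_file` would return, so the sketch runs without a Dataverse server.

```r
# Hypothetical helper (not part of the dataverse package): write a raw
# vector to a temporary file, then read it back with a user-supplied reader.
read_raw_with <- function(raw_bytes, read_fun, ext = "", ...) {
  stopifnot(is.raw(raw_bytes), is.function(read_fun))
  tmp <- tempfile(fileext = ext)
  on.exit(unlink(tmp), add = TRUE)  # remove the temp file when done
  writeBin(raw_bytes, tmp)          # raw bytes -> file on disk
  read_fun(tmp, ...)                # file on disk -> R object
}

# Stand-in for the output of get_file(): the raw bytes of a small CSV file.
csv_bytes <- charToRaw("id,score\n1,2.5\n2,4.0\n")

dat <- read_raw_with(csv_bytes, utils::read.csv, ext = ".csv")
str(dat)
```

Passing the reader as a function keeps the helper format-agnostic: the same wrapper would work with a Stata or SPSS reader, provided that reader takes a file path as its first argument and reads the file eagerly (the temp file is deleted on exit).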