Support reading from more general inputs #278
Question: Along with this support for more general inputs, will that also signal a transition to |
The |
@jennybc If you're happy with that logic, I'd be happy to file a PR. |
@MichaelChirico Yeah, that is an option we've contemplated re: "faking" read from a URL. I'm definitely interested in a PR that downloads to a temp file, then reads that. |
Does this include reading files from within a zip file using the
This would be a handy option for allowing users to access compressed files. |
@jknowles I rather doubt it. It's especially hard to see that working for xlsx, where we explicitly unpack that into individual XML files. |
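As a stopgap for the zip case, one can extract the workbook to a temporary directory and read it from disk. This is a minimal sketch, assuming the archive member is an ordinary .xlsx or .xls file; `read_excel_from_zip` is a hypothetical helper name, not part of readxl.

```r
# Hypothetical helper (not a readxl function): extract one member of a
# zip archive to a temporary directory, then read it as a normal file.
read_excel_from_zip <- function(zipfile, member, ...) {
  exdir <- tempfile("unzipped_")
  dir.create(exdir)
  # Remove the extracted copy once the data frame has been built.
  on.exit(unlink(exdir, recursive = TRUE), add = TRUE)

  # utils::unzip() returns the paths of the files it extracted.
  path <- utils::unzip(zipfile, files = member, exdir = exdir)
  stopifnot(length(path) == 1)

  readxl::read_excel(path, ...)
}
```

Extra arguments such as `sheet` or `range` are passed through to `readxl::read_excel()` unchanged.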
I started to tackle the URL piece of this as part of #454 and there is also another separate PR for URL download in #426. But Slack discussion has convinced me I really should tackle all of these together and soon-ish. But not for this week's release, which gets the security-patched libxls out there. Notes from team discussion:
|
Would appreciate if you could reproduce/summarize the part of the discussion regarding why to keep using Thanks again! |
It was a rather vague suggestion, based on past experience with things "just working" as expected and consistently across OSes. |
Has there been any update here? [Update, for others who land here after getting stuck]
|
Sorry. I am a little confused. I saw a commit that added the ability to read from URL in April 2018: Lines 6 to 10 in 839d023
Somehow that edit never got integrated. Was it taken out for a specific reason? |
@jennybc Thanks for clarifying. I am just not sure if you are postponing URL handling or giving up on it entirely. |
No, definitely not giving up on it. Looking back at that, it looks like I did not want it wrapped up with some of the other things I was doing in that PR. |
I think this is the deal: to do the temp file thing to "fake" reading from URL is easy and perhaps I should just do that. But the declared goal here is noticeably higher, which is why it hasn't been handled yet. |
@jennybc Thank you for clarifying again. Dealing with remote data is complicated, so I understand the concern. If my opinion makes any difference, you can add the quick temp file fix for now. If anyone runs into any issues with it, they can always just not use a URL (same as they would now). |
right now I'm reading excel files from a URL using
tf = tempfile(fileext = ".xlsx")
curl::curl_download(url, tf)
readxl::read_excel(tf, ...)
But I could also see a more generic approach using a
con = curl::curl(url)
read_excel.con(con, ...)
close(con)
or a
raw = curl::curl_fetch_memory(url)$content
read_excel.raw(raw, ...) |
library(openxlsx)
data <- read.xlsx(url)
This works well. |
If you are reading a public Google Sheet, you can also use googlesheets4 without having it transit through the xlsx format. Or a private Sheet, for that matter. |
@jennybc thank you. It works when I converted the Excel file to google sheets format. I used googlesheets4::read_sheet("https://docs.google.com/spreadsheets/d/1wyhnEPoRB-JMssCWD_xslh_9G1JQiPXRuT_I0ibW600/edit#gid=74876023") |
Hi, Based upon the thread content, I am not sure if anything has been done to allow input via rawConnection objects. Can you please confirm if that is the case or not? If it is possible to read from raw connections, can you please indicate how to properly do this? Thanks |
readxl cannot read from rawConnection objects. |
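For anyone who already has the workbook bytes in memory, the temp-file workaround discussed earlier in this thread can be adapted: write the raw vector to a temporary file and read that. This is a sketch under assumptions, not a readxl feature; `read_excel_raw` is a hypothetical helper name, and the default `.xlsx` extension is a guess that should be changed for `.xls` data.

```r
# Hypothetical helper (not a readxl function): write a raw vector to a
# temporary file, then hand that file to readxl::read_excel().
read_excel_raw <- function(raw_bytes, fileext = ".xlsx", ...) {
  stopifnot(is.raw(raw_bytes))

  tf <- tempfile(fileext = fileext)
  # Delete the temporary file once reading has finished.
  on.exit(unlink(tf), add = TRUE)

  writeBin(raw_bytes, tf)
  readxl::read_excel(tf, ...)
}

# Possible usage, with `url` supplied by the caller:
# bytes <- curl::curl_fetch_memory(url)$content
# df <- read_excel_raw(bytes)
```

The bytes still touch disk, so this does not avoid file I/O; it only hides it behind one call.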
Am I correct that the main roadblock to reading from, say, rawConnection (i.e. from memory), is the |
Right now readxl reads only from xlsx and xls files.
In the fullness of time, some functionality for supporting more general inputs will be pulled out of readr, at which point readxl can exploit that. We might implement an interim solution for some of these in the meantime.
This issue will cover all related feature requests (all of which I'm now closing):