Data Access API file download problem #4373

conjugateprior · 2017-12-10T03:06:11Z

I've been trying to figure out how to download one of my own data files (say id 109356) from http://hdl.handle.net/1902.1/FYXLAWZRIA.

The dataverse R package fails to let me do this. Bug report here. (tl;dr the metadata downloads but requesting the file itself gives a 503 error.)

So I turned to the docs to see if I could get it working with curl. The introduction suggests a line of the form

curl -H "X-Dataverse-key: MYKEY" https://demo.dataverse.org/api/datasets/:persistentId?persistentId=hdl:1902.1/FYXLAWZRIA

using the key MYKEY I generated. This failed with the message

{"status":"ERROR","message":"Bad api key 'MYKEY'"}

which seemed odd since I just generated that key and pasted it out of my own user account. So I generated a fresh one and got the same error.

curl "https://demo.dataverse.org/api/datasets/:persistentId?persistentId=hdl:1902.1/FYXLAWZRIA&key=MYKEY"

also fails.
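
For comparison, the two ways of passing the token tried above can be sketched in Python. This is only a minimal sketch: the function name `dataset_request` and its parameters are hypothetical, and only the `X-Dataverse-key` header, the `key` query parameter, and the URL shape are taken from the curl commands above. The base URL is a parameter so the same sketch can be pointed at whichever installation issued the token.

```python
from urllib.parse import urlencode


def dataset_request(base_url, persistent_id, api_key, key_in_header=True):
    """Build the URL and headers for a dataset lookup by persistent id.

    Hypothetical helper: it only assembles the request, it does not send it.
    """
    params = {"persistentId": persistent_id}
    headers = {}
    if key_in_header:
        # Same as: curl -H "X-Dataverse-key: MYKEY" ...
        headers["X-Dataverse-key"] = api_key
    else:
        # Same as appending &key=MYKEY to the query string.
        params["key"] = api_key
    # Keep ':' and '/' unescaped so the handle reads as in the curl commands.
    query = urlencode(params, safe=":/")
    return f"{base_url}/api/datasets/:persistentId?{query}", headers


url, headers = dataset_request(
    "https://demo.dataverse.org", "hdl:1902.1/FYXLAWZRIA", "MYKEY"
)
print(url)
print(headers)
```

The returned pair can be handed to any HTTP client; nothing here explains the "Bad api key" response, it only makes the two request shapes explicit.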

pdurbin · 2017-12-10T12:56:31Z

@conjugateprior thanks for opening this issue as well as the one about the documentation being confusing: #4374

how to download one of my own data files (say id 109356)

This question has come up so often that for #3584 we started showing "Download URL" on file landing pages. It looks like this for the file id you mentioned above:

Once you have the download URL for a file you can just paste it into your browser...

... and the file should begin to download, like this:

This should work fine from the command line with curl but I thought screenshots like this would be easier to follow.

Please note that if you're actually using curl, you'll need to add -J or --remote-header-name to have curl save the file with the name that shows in Dataverse (1995-1999 Levels of Source _ Target.tab) rather than the file id (109356):

curl -O -J https://dataverse.harvard.edu/api/access/datafile/109356 shows "curl: Saved to filename '1995-1999 Levels of Source _ Target.tab'".
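
What `-J` does can be sketched in Python: the server names the file in a `Content-Disposition` response header, and the client uses that name instead of the last path segment. This is a rough sketch under assumptions: the helper name `filename_from_content_disposition` is hypothetical, and the header value in the example is assumed from the filename curl reported above, not captured from a real response.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition header, or None."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_filename() reads the 'filename' parameter and strips the quotes.
    return msg.get_filename()


# Assumed header value, modelled on the name curl printed above.
example = 'attachment; filename="1995-1999 Levels of Source _ Target.tab"'
print(filename_from_content_disposition(example))
```

With a real HTTP client, the header value would come from the response to https://dataverse.harvard.edu/api/access/datafile/109356; falling back to the file id when the helper returns None mirrors what curl does without `-J`.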

conjugateprior · 2017-12-10T14:16:37Z

Great, thanks.

From this I conclude that:

1. The Data Access API is not actually necessary for this kind of data access.
2. Files get an id that is not relative to the DV they are in.
3. It is still unclear why API access does not work for my repo and, relatedly, why the API does not accept my token.

The first two would be great things to have, or have more prominently, in the API documentation.

Re #3584, I didn't know how public the file was, having previously been obliged to click through some T&Cs in the browser.

For reference, I was planning to embed the data request in an R package. That now works as

> library(httr)
> resp <- GET("https://dataverse.harvard.edu/api/access/datafile/109356")
> content(resp)

No encoding supplied: defaulting to UTF-8.
Parsed with column specification:
cols(
  CODE = col_character(),
  LEVEL = col_character(),
  DESCRIPT = col_character()
)
# A tibble: 17 x 3
     CODE                       LEVEL
    <chr>                       <chr>
 1 <CAPI>                    Capitals

[snip]

which is perfect.

So, this fixes the immediate problem and resolves the ticket. Thanks again.

pdurbin · 2017-12-10T14:29:57Z

@conjugateprior sure, but you mentioned a 503 before and when I dig in a bit more I think you're onto something. Check this out.

I go to the file landing page at https://dataverse.harvard.edu/file.xhtml?fileId=109356 and click "Download" and then "Original File Format (UNKNOWN)" (seeing "UNKNOWN" here is already somewhat suspicious to me):

Then I get a 503 error (when format=original) at https://dataverse.harvard.edu/api/access/datafile/109356?format=original&gbrecs=true (last I checked, gbrecs=true doesn't actually do anything):

So there's something strange going on.
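
To make the two requests easy to compare, here is a small Python sketch that builds both URLs. The function name `datafile_url` and its parameters are hypothetical; only the `/api/access/datafile/<id>` path and the `format=original` parameter come from the URLs in this thread, and `gbrecs` is left out since it reportedly has no effect.

```python
from urllib.parse import urlencode


def datafile_url(file_id, base_url="https://dataverse.harvard.edu", original=False):
    """Build a Data Access API download URL for a file id.

    Hypothetical helper: it only builds the URL and makes no request.
    """
    url = f"{base_url}/api/access/datafile/{file_id}"
    if original:
        # Ask for the originally uploaded file rather than the default download.
        url += "?" + urlencode({"format": "original"})
    return url


print(datafile_url(109356))
print(datafile_url(109356, original=True))
```

In this thread the first URL downloads fine and the second returns a 503, so fetching both and comparing status codes is a quick way to reproduce the difference.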

pdurbin · 2018-07-13T02:39:01Z

The "original file" download option is no longer available for the file above because of the work done on this issue: Make "download as original" disappear from download options, when there is no saved original. #4796

Here's how it looks now:

@conjugateprior, who opened this issue, seems fine with using the tab-delimited version of the file, so I think it's safe to close this issue. It's specific to a file for a particular installation of Dataverse (Harvard Dataverse).

pdurbin added the UX & UI: Design and Feature: File Upload & Handling labels Dec 10, 2017

pdurbin mentioned this issue Dec 11, 2017

get_dataset by hdl code IQSS/dataverse-client-r#17

Closed

pdurbin added the Vote to Close: pdurbin label Jul 13, 2018

djbrooke added this to Inbox 🗄 in IQSS/dataverse (TO BE RETIRED / DELETED in favor of project 34) May 8, 2019

mheppler removed the UX & UI: Design label Feb 14, 2020

pdurbin closed this as completed Oct 7, 2023
