New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PEcAn API: Workflow status, download input/output files, JSON workflow submission #2674
Conversation
@mdietze @robkooper I have made all the suggested modifications. Requesting you to review it once & let me know your views. |
filepath <- paste0(input$file_path, "/", input$file_name) | ||
|
||
if(! file.exists(filepath)){ | ||
res$status <- 404 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to return different error messages for the file not existing vs nrow(input)==0? Having these two cases return the same error doesn't tell the user what happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm...that's a bit inconvenient I guess because the content-type is specified to be application/octet-stream
as a decorator over the function, and changing that here may not be the easiest thing to do as of now. I think we can keep this on a hold for now & meanwhile I will see if something could be done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. Not the end of the world. Just didn't know if there was a way to send back a message in addition to the raw status code. Definitely not worth holding up this PR for.
|
||
* __Inputs:__ | ||
* [`GET /api/inputs/`](#get-apiinputs): Search for inputs needed for a PEcAn workflow based on `model_id` & `site_id` | ||
* [`GET /api/inputs/{input_id}`](#get-apiinputsinput_id) *: Download the desired input file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you test that this works correctly when an input_id points to multiple files (e.g. a folder full of met files, one for each year)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Umm...Currently I implemented it in a way assuming that the input_id points to a file & not a folder. Also, the current VM that I am working on doesn't seem to have folders containing multiple met files. So would be difficult to test. Can you suggest some way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@robkooper I'm not sure what's on the machine @tezansahu is testing the API on, but could you point him to an example CF met Input record where the files are annual.
@tezansahu if the API system only allows one file to be returned at a time, then I think you'd want to have an optional way to get file by name or by order in a sequence. For the latter I'm thinking something similar to how the search functions will return the first N queries, then you query again with an offset to get the next N queries. But that options doesn't seem great to me because you wouldn't really know which file you're getting so you wouldn't know how to name them. So I guess the simplest place to start (e.g. to get this PR in) is to just give a check on the number of files in an Input and if it's >1 and the file isn't requested by name then we need to throw an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have addressed this in the latest commits
* __Workflows:__ | ||
* [`GET /api/workflows/`](#get-apiworkflows): Retrieve a list of PEcAn workflows | ||
* [`POST /api/workflows/`](#post-apiworkflows): Submit a new PEcAn workflow | ||
* [`GET /api/workflows/{id}`](#get-apiworkflowsid): Obtain the details of a particular PEcAn workflow by supplying its ID | ||
* [`GET /api/workflows/{id}/status`](#get-apiworkflowsidstatus): Obtain the status of a particular PEcAn workflow | ||
* [`GET /api/workflows/{id}/file/{filename}`](#get-apiworkflowsidfilefilename): Download the desired file from a PEcAn workflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work with files in subdirectories, or just files in the workflow directory itself? Also, does /api/workflows/{id} return the list of files so that the user knows which to request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, /api/workflows/{id}
returns the list of files so that the user knows which to request. As of now, it fetches files in the workflow directory only....coz the ones in the workflow/runs/
are fetched by the /api/runs/{run_id}/input/{filename}
& those in the workflow/out/
are fetched by the /api/runs/{run_id}/input/{filename}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't want to hold up this PR further, but we should also find a way to fetch files in the PFT directory, as a lot of sensitivity analysis results get squirreled away in there. Likewise, the PDA and SDA analyses often create additional subdirectories when run.
"http://localhost:8000/api/inputs/99000000003", | ||
httr::authenticate("carya", "illinois") | ||
) | ||
writeBin(res$content, "test.2002.nc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we use the Input query like this, how are we supposed to know what the filename should be? If not provided in an example here, you should definitely make sure there are helper functions in the pecan package that does both the GET and the query for the filenames and the write. @robkooper beyond this PR, but I think it would be great if there were then core PEcAn functions that would not just use the API to grab files, but then stick them into the local DB as new dbfiles associated with the correct Input record. Would really like to get to the point where PEcAn nodes default to grabbing files from each other before attempting to download and reprocess the raw Inputs from their original sources (e.g. I could grab the met I need for a run from Shawn or Ankur if they've already run that site)
…der & a filename is specified
model_id
&site_id
format_name
&mimetype
AUTH_REQ
environment variable (See issue Use environment variables in the API for Authentication Requirement & Visibility of Details queried from betyDB #2639)Review Time Estimate
Types of changes
Checklist: