Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: read_civis.civis_script #159

Closed
patr1ckm opened this issue Feb 26, 2019 · 2 comments
Closed

ENH: read_civis.civis_script #159

patr1ckm opened this issue Feb 26, 2019 · 2 comments

Comments

@patr1ckm
Copy link
Contributor

patr1ckm commented Feb 26, 2019

Read civis interface to scripts

I'm proposing that read_civis.civis_script have two additional arguments: regex and using. This design mimics the current behavior of read_civis which is to return values whenever it is unambiguous to do so.

Default

The default is using = NULL, which returns a named list of file ids. This doesn't return values. It does return a helpful result if you don't know what the object types are that are posted in the script.

out <- read_civis(civis_script(234)) 
val <- read_civis(out$name, using = reader))
> out
$name
[1] 1

With a regex, read_civis.civis_script returns a list of ids maching the regex.

out <- read_civis(civis_script(234), regex = ‘name’)
# always returns a list
val <- read_civis(out[[1]], using = reader)
vals <- lapply(out, read_civis, using = reader)

using = reader

With using = reader, read_civis returns the values directly.

Here reader is used to read all outputs:

val <- read_civis(civis_script(234), using = reader)

A more relevant case is with regex, which you can use to filter outputs that match a file extension, and then parse them with the right using:

val <- read_civis(civis_script(234), regex = ‘.csv’, using = read.csv))
val <- read_civis(civis_script(234), regex = ‘.rds’)
val <- read_civis(civis_script(234), regex = ‘.json’, using = jsonlite::fromJSON)

JsonValues

A really helpful feature of the API is that json values are simply returned as-is in out$value. So any non-null using can simply return JsonValues directly without attempting to parse/download the output.

Thoughts on this @keithing ? Is it too magical?

@keithing
Copy link
Contributor

I like this a lot.

Is it too magical?

I don't think so. It looks explicit to me.

Some misc thoughts:

  • How does civis_script handle custom vs container jobs? Should it be civis_container instead?
  • How are you planning to handle run ids? Default to latest if it's NULL?
  • Maybe out of scope, but would it make sense to implement write_civis(civis_container(...)) too?

@patr1ckm
Copy link
Contributor Author

patr1ckm commented Feb 26, 2019

Yeah, these are all great questions.

  • How does civis_script handle custom vs container jobs? Should it be civis_container instead?

I'm using jobs_get to rerturn something for all job types. Then I can use scripts_list_*_runs_outputs based on the job type (I just have a lookup table).

The only confusion is that the job type is the same for custom and container. However, I can tell if a script is custom by whether it has a non-null fromTemplateId. So I can fully resolve what to do unambiguously for any job type returned from jobs_get.

  • How are you planning to handle run ids? Default to latest if it's NULL?

Yep!

  • Maybe out of scope, but would it make sense to implement write_civis(civis_container(...)) too?

Yes! I'm planning on thinking about writing to container outputs next.

patr1ckm added a commit that referenced this issue Feb 28, 2019
ENH civis_script
ENH read_civis.civis_script
ENH fetch_output
ENH fetch_output_file_ids
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants