New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: make TaskParameter type field optional #76

Open
pditommaso opened this Issue Aug 3, 2017 · 10 comments

Comments

Projects
None yet
5 participants
@pditommaso

pditommaso commented Aug 3, 2017

Currently the TaskParameter type is a required field. The caller needs to specify if the path is a file or directory both for input and output parameters.

However I have the impression this information add almost no value, because the server implementation needs in any case query the remote resource the apply the required operation (upload/download).

On the other hand, requiring this information as mandatory severely limits the ability of tools that dynamically discovers the type of output paths, such as Nextflow, to implement this specification.

My proposal is to make this field optional. When it is specified the semantic remains identical, and the server validates the inputs/outputs verifying the type matches. When the type is not specified the server just applies the operation to the path regardless if it is a file or directory.

@buchanae

This comment has been minimized.

Contributor

buchanae commented Aug 3, 2017

We've considered whether we should just remove this field. Previously there were other fields which made it necessary, but I believe they have been removed. If possible, I would prefer removing it instead of making it optional.

@geoffjentry

This comment has been minimized.

Contributor

geoffjentry commented Aug 3, 2017

Agree that I'd prefer to try to remove things vs. optionalizing them where possible.

@adamstruck

This comment has been minimized.

Member

adamstruck commented Aug 4, 2017

+1 for removing the field.

@pditommaso

This comment has been minimized.

pditommaso commented Aug 4, 2017

I'm fine to remove it as TaskParameter. It may still be needed in the task view in relation to #77 (comment) .

@buchanae

This comment has been minimized.

Contributor

buchanae commented Oct 24, 2017

As I understand, one argument for keeping this field is that it adds a level of type checking to the system: if you define the input/output as a Directory and the system encounters a File, the system should let the user know something unexpected occurred by returning an error.

A counterargument is that this type checking responsibility should be handled by workflow engines or other TES clients. In the case of Nextflow, this allows them to forgo type checking, which is a core feature of their system. Engines which want simple file/directory type checking would implement this by communicating with the storage layer directly.

Hope I captured the arguments accurately.

From a technical standpoint, we (the Funnel dev team) think it's possible to implement a system with a pre-defined input/output file type in Funnel, but we'd like the chance to get it fully implemented in order to discover any edge cases.

@pditommaso

This comment has been minimized.

pditommaso commented Oct 24, 2017

There's are three options here: 1) keep mandatory, 2) make it optional, 3) removed it.

I've explained in the original comment why having it as a mandatory field is a problem. I'm fine both with 2 and 3. Though I tend to think the best solution would be 2 (optional), because it would give the opportunity to enforce the type checking if needed/requested.

Also it could be useful in relation to #77 (eg. outputs all files/directories matching a glob pattern).

@adamstruck

This comment has been minimized.

Member

adamstruck commented Jun 8, 2018

I am also fine with both options 2 and 3. I'd like to get some version of this and #77 into the spec soon.

@geoffjentry

This comment has been minimized.

Contributor

geoffjentry commented Jun 10, 2018

@erikvdbergh Is this something you all have an opinion on?

@erikvdbergh

This comment has been minimized.

erikvdbergh commented Jun 13, 2018

I would want to keep it around but making it optional is fine. I'm curious @pditommaso how NextFlow handles HTTP inputs when this is not defined? A lot of http endpoints don't allow directory listing so for us it is a quick way to throw it back with an error.

I understand that for FTP, S3 and other endpoints this is a moot field, so I'd agree with making it optional. Keeping it around would also prevent clobbering of directories with files if the output type is unclear or in case of some errors.

@pditommaso

This comment has been minimized.

pditommaso commented Jun 13, 2018

how NextFlow handles HTTP inputs when this is not defined?

Nextflow does not allow glob patter for HTTP input files, exactly for the reason you have mentioned. However the proposal for an optional GLOB file type is related to *output* files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment