[Feature Request] More ISA-tab-like Inputs and Outputs #316

Closed
WaljaWanney opened this issue Jul 21, 2023 · 7 comments
Labels
Type: Feature Request This item is confirmed by the maintainers to be a request for a new feature

Comments

@WaljaWanney

Problem

In the ISA-tab study and assay files, there is a strict differentiation between Source, Sample, Extract, etc., and Protocols. In contrast, in SWATE, each input is considered a source, and each output is treated as a sample. Consequently, a sample in one file can, or even must, serve as the source in another file.

As SWATE currently works, users are encouraged to always use Source Name as input, whereas the input for an ISA-tab assay file would usually be Sample Name, and the output would be e.g. Extract Name.

Solution

Adding more Input/Output options.

Additional context

This is related to my issue #63 in Swate-templates: nfdi4plants/Swate-templates#63

@WaljaWanney WaljaWanney added the Type: Feature Request This item is confirmed by the maintainers to be a request for a new feature label Jul 21, 2023
@Freymaurer
Collaborator

The current goal for Input and Output columns in ISA-XLSX is the following:

Input and Output must always be declared with Input [x]/Output [x], where x can be any of:

type IOType =
    | Source
    | Sample
    | RawDataFile
    | DerivedDataFile
    | ImageFile
    | Material
    | FreeText of string

Anything that does not match one of the named cases will be parsed as FreeText of string. This means anything goes as input for x, but we do have additional support for the others mentioned above.
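
The fallback rule described above can be sketched roughly as follows. This is only an illustration, not the actual Swate or ARCtrl parser: `parseIOType`, `IOColumn`, and `tryParseHeader` are hypothetical names, and the exact spellings accepted inside the brackets are an assumption (the ISA-XLSX specification may use different header texts).

```fsharp
open System.Text.RegularExpressions

// IOType as listed in the comment above.
type IOType =
    | Source
    | Sample
    | RawDataFile
    | DerivedDataFile
    | ImageFile
    | Material
    | FreeText of string

// Hypothetical helper: map the text inside the brackets to an IOType.
// Assumption: the bracket text equals the case name exactly.
// Anything else falls through to FreeText, so no header is rejected.
let parseIOType (x: string) : IOType =
    match x.Trim() with
    | "Source" -> Source
    | "Sample" -> Sample
    | "RawDataFile" -> RawDataFile
    | "DerivedDataFile" -> DerivedDataFile
    | "ImageFile" -> ImageFile
    | "Material" -> Material
    | other -> FreeText other

// Hypothetical wrapper that records whether a column is an input or an output.
type IOColumn =
    | Input of IOType
    | Output of IOType

// Returns None for headers that are not of the form "Input [x]" / "Output [x]".
let tryParseHeader (header: string) : IOColumn option =
    let m = Regex.Match(header.Trim(), @"^(Input|Output)\s*\[(.*)\]$")
    if not m.Success then
        None
    else
        let ioType = parseIOType m.Groups.[2].Value
        match m.Groups.[1].Value with
        | "Input" -> Some (Input ioType)
        | _ -> Some (Output ioType)
```

Under these assumptions, `tryParseHeader "Input [Source]"` gives `Some (Input Source)`, while a header such as `"Output [Extract]"` is still accepted but ends up as `Some (Output (FreeText "Extract"))`, because Extract is not one of the named cases.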

Let me know if this solution works for you!

@andreaschrader
Member

I would like to leave a comment:
Personally, I like this solution and naming.

And a suggestion regarding the image you shared above:
Adding 'RessourceDataFile' already as a suggestion for the input would be very helpful and provide orientation:
Projects use unpublished input files from others that they are allowed to use for their project, place them in the resources directory, and use them in their scripts within the ARC.

I also see the 'FreeText' option. However, questions on this are recurring. Therefore my suggestion.

@WaljaWanney
Author

That looks great!
Just a comment: As far as I understand ISA, Material is the umbrella term for Source, Sample, etc. I still think it is good to have Material as a valid input/output.

What do you think about adding Extract? It is the term ISA uses for the material output of an assay.
Source -> Sample -> Extract -> RawDataFile -> DerivedDataFile seems like a pipeline that I would use in the lab documentation all the time.
My other option then would be:
Source -> Sample -> Sample -> RawDataFile -> DerivedDataFile
First option seems a bit clearer.

@HLWeil
Member

HLWeil commented Jul 21, 2023

Extract might be a good addition. It does have an explicit representation in ISA-Json too.

@UrsulaE

UrsulaE commented Nov 17, 2023

It is still not clear to me how this is supposed to be translated into real-life examples. If there are already templates that follow this format with Input[...] and Output[...] columns, please point the users towards them.

The ISA specification requires that sources be specified in a column called "Source name", and likewise samples in "Sample name". So would it be e.g. Source name[Image file], or should there be additional columns called Input[] or Output[] that belong to none of the categories Parameter, Characteristic, Factor, Component, or Protocol? Presumably, if an entire workflow were to be represented by a single Study or Assay, there could be several Input and Output columns in a table? Would Source name and Sample name then represent the initial input and final output? Or are we still with the policy of only one source column and one sample column?

@Freymaurer
Collaborator

Here is a link to the ISA-XLSX specification: https://github.com/nfdi4plants/ARC-specification/blob/main/ISA-XLSX.md#inputs-and-outputs (thanks to @HLWeil )

Should this not answer your questions, feel free to ping me 😄

@Freymaurer
Collaborator

Done for now! Until we update the IOTypes available!

5 participants