Should the pipeline interface include path to sample yaml outputs? #61

nsheff · 2018-10-24T15:40:52Z

In #32 we build a pipeline interface section called summary_results, which records the location of summarizer results.

What about something similar to report the location of sample yaml file from the pipeline?

stolarczyk · 2020-04-03T13:29:58Z

I'm not sure I understand. Could you give more context here?

Is the goal here to set the future location of the sample.yaml file in the pipeline interface, and then use that path to save the file there instead of in a default spot?

nsheff · 2020-04-03T14:24:33Z

well, this is a few years old... but yes I believe your interpretation is correct. I believe this is sort of accomplished by the output schema concept...

stolarczyk · 2020-04-03T14:33:06Z

more by the input schema, I think. {sample_name}.yaml file consists of key-value pairs of all public sample attributes, so input schema is related. Yet it just defines the type of the attrs, not their values.

stolarczyk · 2020-04-03T14:37:01Z

so what's the key for the sample yaml path in the pipeline interface, if we even want to proceed?

sample_yaml_path, sample_attrs_path, sample_path, sample_file_path, sample_file, sample_yaml?

nsheff · 2020-04-03T14:43:45Z

Well, there can be an input sample.yaml, which is in some sense an instance of the object specified by the input schema, and an output sample.yaml which is in some sense an instance of the object specified by the output schema.

we could produce both yamls. right now we only produce the first.
there could be:

pipelines:
  pipeline:
    input_sample_yaml_path: "{sample.sample_name}.yaml"
    output_sample_yaml_path: "{sample.sample_name}_output.yaml"

those are relative to the path specified in looper.output_dir, with the above values as defaults, but can be overridden in the pipeline interface?

just brainstorming here...

stolarczyk · 2020-04-03T14:49:38Z

I think it all makes sense.

So, in practice:
output sample yaml = input sample yaml + populated sample attrs defined in the pipeline output schema ?

nsheff · 2020-04-03T16:16:46Z

output sample yaml = input sample yaml + populated sample attrs defined in the pipeline output schema ?

makes sense to me... I think it's a superset of the input yaml. the only reason we make the input yaml is because it's used as an input to the pipeline.

In fact... why even make the input yaml? if the output yaml is a superset of it, then it could be used as an input to the pipeline as well...

so now, with this model, there is only 1 sample.yaml, which is exactly what you say: input yaml + populated sample attrs defined in the output schema.

one question: would this yaml include a property given in the sample table that is not specified in the input schema?

stolarczyk · 2020-04-03T20:01:27Z

that's right, input sample.yaml is probably not necessary in such a case.

one question: would this yaml include a property given in the sample table that is not specified in the input schema?

I'd say yes, I can imagine writing a small schema (for example with just required attrs), that does not necessarily cover all the sample attributes. Still I'd expect all my columns from sample table to be accessible in the yaml file.
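To make the model discussed above concrete, here is a minimal Python sketch of how the contents of the single sample yaml could be assembled. It is not looper's implementation: the function name `build_sample_yaml_contents`, the JSON-schema-like `properties` layout of the output schema, and the convention that private attributes start with an underscore are all assumptions made for illustration.

```python
def build_sample_yaml_contents(sample_attrs, output_schema, pipeline_outputs):
    """Return the key-value pairs to write to the single sample yaml.

    sample_attrs: every column from the sample table, whether or not
        the input schema mentions it.
    output_schema: hypothetical dict with a "properties" mapping that
        names the attributes the pipeline produces.
    pipeline_outputs: values produced by the pipeline, keyed by name.
    """
    # Start from all public sample attributes, so no column from the
    # sample table is lost (assumption: private names start with "_").
    contents = {k: v for k, v in sample_attrs.items() if not k.startswith("_")}
    # Add only those outputs that the output schema declares.
    for name in output_schema.get("properties", {}):
        if name in pipeline_outputs:
            contents[name] = pipeline_outputs[name]
    return contents


# "batch" is not covered by any schema, yet it is still kept.
sample = {"sample_name": "s1", "read1": "s1.fastq", "batch": "A", "_private": 1}
schema = {"properties": {"aligned_bam": {"type": "string"}}}
outputs = {"aligned_bam": "s1.bam", "undeclared": "x"}
result = build_sample_yaml_contents(sample, schema, outputs)
```

In this sketch the result is a superset of the input attributes, which is why a separate input yaml would not be needed.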

nsheff · 2020-04-03T20:11:15Z

I'd say yes,

I agree.

stolarczyk · 2020-04-04T23:01:54Z

for future reference:

sample yaml file path is constructed from a template in pipeline interface file

pipelines:
  pipeline:
    # path is relative to looper.output_dir
    sample_yaml_path: >
     {sample.sample_name}.yaml

if the sample_yaml_path section is missing, the file is saved in the submission directory as <sample_name>.yaml
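The resolution rule described above could look roughly like the following Python sketch. The function name `resolve_sample_yaml_path`, its parameters, and the use of `str.format` to fill in the template are assumptions for illustration only; the actual looper code may differ.

```python
import os
from types import SimpleNamespace


def resolve_sample_yaml_path(pipeline_iface, sample, output_dir, submission_dir):
    """Return where the sample yaml would be written (hypothetical helper).

    pipeline_iface: dict holding the settings of one pipeline.
    sample: any object exposing a sample_name attribute.
    """
    template = pipeline_iface.get("sample_yaml_path")
    if template is None:
        # Section missing: fall back to <sample_name>.yaml in the
        # submission directory.
        return os.path.join(submission_dir, f"{sample.sample_name}.yaml")
    # A folded yaml scalar (">") keeps a trailing newline, so strip it
    # before filling in the {sample.<attr>} placeholders.
    relative = template.strip().format(sample=sample)
    # The populated template is relative to looper.output_dir.
    return os.path.join(output_dir, relative)


sample = SimpleNamespace(sample_name="s1")
custom = resolve_sample_yaml_path(
    {"sample_yaml_path": "{sample.sample_name}.yaml\n"}, sample, "out", "sub"
)
default = resolve_sample_yaml_path({}, sample, "out", "sub")
```

Here `custom` resolves inside the output directory and `default` inside the submission directory.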

nsheff added this to the 0.12 milestone Feb 19, 2019

nsheff mentioned this issue Apr 16, 2019

Project.get_outputs() functionality #165

Closed

nsheff mentioned this issue Aug 26, 2019

sample_structure yaml #216

Closed

This was referenced Mar 14, 2020

Dividing PEP 2.0 from looper interface revamp #234

Closed

How should we define what a pipeline produces? #237

Closed

stolarczyk added a commit that referenced this issue Apr 4, 2020

enrich sample with processed attrs read from output schema; #61

fb84b6e

stolarczyk added a commit that referenced this issue Apr 4, 2020

implement sample yaml loc specification possibility; #61

8ba4275

stolarczyk closed this as completed Apr 4, 2020

stolarczyk mentioned this issue Apr 29, 2020

looper documentation; complete(?) todo list #254

Closed

17 tasks

stolarczyk mentioned this issue Aug 26, 2020

Moving _get_sample_yaml_path to a sample #288

Closed
