Clarify the semantics of the interaction of repeated tasks with reports and plots #103

jonrkarr · 2021-02-02T17:09:23Z

As noted in the specifications (NOTE: This example produces three dimensional results ...) repeated tasks implicitly produce multi-dimensional results. The specifications show examples that generate three dimensions (time, model variable, iteration over a range of a repeated task). More generally, repeated tasks can produce yet more dimensions. If a repeated task has multiple subtasks, there should be an additional dimension (e.g., time, model variable, iteration over a range of a repeated task, subtask within the repeated task). For spatial simulations that produce multiple dimensions on their own, the results could have dimensions (time, x, y, z, model variable, iteration over a range of a repeated task, subtask within the repeated task).

The semantics of the above for variables, data generators and especially reports and plots is under-specified.

- Reports are ill-defined, especially because reports are implicitly encouraged to use CSV.
- Plots are ill-defined. For example, should simulation tools display one curve per iteration and subtask?

A few changes could make reports better defined:

- Explicitly indicate that multi-dimensional reports need to be stored in a format such as HDF5.
- Define a convention for the order of the dimensions of multi-dimensional reports:
  1. Data set
  2. Model dimensions (e.g., {time} for non-spatial simulations or {time, x, y, z} for spatial simulations)
  3. Subtask within the listOfSubTasks of a repeated task
  4. Iteration through the range of a repeated task
- Define conventions for labeling the axes of reports and each slice of the subtask and iteration dimensions.
  - One useful ontology for labeling axes is SIO.
- Embrace conventions for annotating the dimensions of reports. This is another weakness of CSV/TSV that HDF5 overcomes.
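
To make the proposed dimension order concrete, here is a minimal sketch in plain Python for a non-spatial simulation. The axis names, the `build_report` helper, and the `value` callback are hypothetical and only stand in for real simulator output; nothing here is defined by SED-ML.

```python
# Hypothetical axis names, in the order proposed above
# (data set, model dimensions, subtask, iteration).
AXIS_ORDER = ("data_set", "time", "subtask", "iteration")


def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


def build_report(data_sets, n_time, n_subtasks, n_iterations, value):
    """Build a nested list indexed as [data set][time][subtask][iteration].

    `value` is a hypothetical callback standing in for a simulator result.
    """
    return [
        [
            [
                [value(d, t, s, i) for i in range(n_iterations)]
                for s in range(n_subtasks)
            ]
            for t in range(n_time)
        ]
        for d in data_sets
    ]


report = build_report(
    ["S1", "S2"], n_time=5, n_subtasks=2, n_iterations=3,
    value=lambda d, t, s, i: 0.0,
)
labeled_shape = dict(zip(AXIS_ORDER, shape_of(report)))
print(labeled_shape)
```

A spatial simulation would, under the same assumption, insert x, y, and z axes after time and before the subtask axis.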

Addressing the issues with plots requires more work.

In addition to these changes, more examples (including expected results) would be helpful.


luciansmith · 2021-06-11T17:53:51Z

The current spec reads, in the 'Report' introduction:

"The encoding of simulation results is not part of SED-ML Level 1 Version 4, but it is recommended that
2D output be exported as CSV files, using the label as column headers, and that output with more
dimensions be exported as HDF5, again using the label to uniquely identify the data sets."
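
For the 2D case, the recommendation quoted above could look roughly like the following sketch, which uses only Python's standard `csv` module. The `write_report_csv` helper and the example labels are assumptions for illustration, not part of the specification.

```python
import csv
import io


def write_report_csv(columns, stream):
    """Write a 2D report: one column per data set, labels as the header row.

    `columns` maps each data set label to a list of values; all lists
    must have the same length.
    """
    labels = list(columns)
    lengths = {len(columns[label]) for label in labels}
    if len(lengths) > 1:
        raise ValueError("all data sets must have the same number of rows")
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(labels)
    writer.writerows(zip(*(columns[label] for label in labels)))


buffer = io.StringIO()
write_report_csv({"time": [0.0, 1.0], "S1": [10.0, 9.5]}, buffer)
print(buffer.getvalue())
```

This layout has no room for subtask or iteration axes, which is why output with more dimensions needs a format such as HDF5.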

You've proposed a lot more detail above, and I'm not sure where to put it, i.e. in the relevant sections, or maybe in a 'best practices' appendix? Or is the short description above enough?

jonrkarr · 2021-06-11T18:20:10Z

Since it's "not part of SED-ML Level 1 Version 4", I guess this is enough. Clarifying this should be a high priority for a future version. The lack of clear output is one of the biggest barriers to adoption.

What about plots for repeated tasks?

luciansmith · 2021-06-11T19:17:34Z

I would assume that plotting a repeated task would plot all the x,y pairs on a single x,y plane? Do we need more of a description than that?

And I'm more than happy to write more than that brief note, but I'm not sure what the most important parts are to add. What would be your 'highest priority' facts/conventions to add to that description?

jonrkarr · 2021-06-11T20:08:50Z

Plotting a curve for each individual simulation sounds reasonable. This clarifies that the plot shouldn't be something else, such as a density plot.

Similar to the discussion for mathematical calculations, x, y, and z data generators for a curve/surface need to have the same shape.
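
As a rough sketch of both points (same-shape x and y data generators, one curve per individual simulation), assuming the results are nested lists indexed as [time][subtask][iteration]. That layout and the `curves` helper are hypothetical, chosen only for illustration.

```python
def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        if not nested:
            break
        nested = nested[0]
    return tuple(shape)


def curves(x, y):
    """Yield ((subtask, iteration), xs, ys), one curve per simulation run.

    Assumes x and y are indexed as [time][subtask][iteration] and
    rejects inputs whose shapes differ.
    """
    if shape_of(x) != shape_of(y):
        raise ValueError(f"shape mismatch: {shape_of(x)} vs {shape_of(y)}")
    if len(shape_of(x)) != 3:
        raise ValueError("expected [time][subtask][iteration] data")
    n_time, n_subtasks, n_iterations = shape_of(x)
    for s in range(n_subtasks):
        for i in range(n_iterations):
            xs = [x[t][s][i] for t in range(n_time)]
            ys = [y[t][s][i] for t in range(n_time)]
            yield (s, i), xs, ys


# Three time points, one subtask, two iterations.
x = [[[t for i in range(2)] for s in range(1)] for t in range(3)]
y = [[[t * (i + 1) for i in range(2)] for s in range(1)] for t in range(3)]
for key, xs, ys in curves(x, y):
    print(key, xs, ys)
```

Each yielded triple would then be drawn as its own line on the same axes, rather than being aggregated into something like a density plot.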

Define the HDF5 format, and explain how to plot multidimensional data.

luciansmith · 2021-06-11T20:44:02Z

Added a whole HDF5 section (see #52) as well as clarifying how to plot multidimensional data.

matthiaskoenig mentioned this issue Mar 27, 2021

Remove SimpleRepeatedTask #129

Closed

luciansmith added the L1V4 label Mar 29, 2021

luciansmith added a commit that referenced this issue Jun 11, 2021

Fixes for #52, #103, and #58

517c093

Define the HDF5 format, and explain how to plot multidimensional data.

luciansmith added the draft fix label Jun 11, 2021

luciansmith closed this as completed Aug 3, 2021

This was referenced Feb 15, 2022

Allow stylistic control over plots of multidimensional data generators from repeated tasks #207

Open

Address fundamental issues with data model that L1V4 punts on with textual explanations #208

Open
