LabProcess Type Draft #669

Merged
merged 8 commits into from Jan 29, 2024

floWetzels
Contributor

Description

This PR adds the new type draft LabProcess. A LabProcess represents the specific application of a LabProtocol to some input (biological material or data) to produce some output (biological material or data). The new type inherits these inputs and outputs from the Schema.org type Action (i.e. the properties object and result). Additionally, it adds properties for referencing the executed protocol (executesProtocol) and parameter values of the process as key-value pairs (parameterValues). This design is heavily inspired by the ISA process model.
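As a rough illustration of the shape described above, the following Python sketch builds a JSON-LD-style record for one LabProcess. The property names `object`, `result`, `executesProtocol` and `parameterValues` come from this description. The helper name `make_lab_process`, all identifiers, and the use of the schema.org context for a Bioschemas draft type are assumptions made for the example, not part of the draft.

```python
import json


def make_lab_process(name, protocol_id, inputs, outputs, parameters):
    """Build a JSON-LD-style dict for one LabProcess (illustrative only)."""
    return {
        # Assumption: LabProcess is a Bioschemas draft, not a schema.org term;
        # a real record may need a different or extended context.
        "@context": "https://schema.org/",
        "@type": "LabProcess",
        "name": name,
        "executesProtocol": {"@id": protocol_id},
        "object": [{"@id": i} for i in inputs],    # inputs, inherited from Action
        "result": [{"@id": o} for o in outputs],   # outputs, inherited from Action
        "parameterValues": [
            {"@type": "PropertyValue", "name": key, "value": value}
            for key, value in parameters.items()
        ],
    }


# All identifiers and values below are hypothetical.
process = make_lab_process(
    name="Sample preparation run 1",
    protocol_id="#protocol/sample-preparation",
    inputs=["#sample/raw-01"],
    outputs=["#sample/prepared-01"],
    parameters={"incubation time": 30},
)
print(json.dumps(process, indent=2))
```

Referencing inputs, outputs and the protocol by `@id` keeps each process record small and lets several processes point at the same sample or protocol node.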

Motivation and context

Our overarching goal is to establish a distinct separation between LabProtocol (akin to a Recipe / SOP) and LabProcess (akin to the Action described by such LabProtocol, analogous to a lab notebook in a real-world scenario). The following details elaborate on the necessity for this differentiation and the specific use case we aim to address.

From our perspective, and in harmony with the ISA data model, we propose that a LabProtocol aligns more suitably with a HowTo than with a CreativeWork. This clarification better reflects the instructional nature of a LabProtocol in guiding experimental procedures. A LabProcess, in contrast, aligns with an Action.
Thus, this PR goes hand-in-hand with the recent changes on LabProtocol (#661).

We use the very generic Schema.org type PropertyValue to describe parameter values of processes in a structured way. This allows users to better annotate a wide range of laboratory processes, as the PropertyValue type covers any structured key-value pair or key-value-unit triplet (basically allowing any parameter that can be formalized). We hope that this improves findability of research data objects in the following use cases:

Use Case 1 (Findability for comparative analysis)

A process graph encodes structured information for complex experimental setups consisting of multiple experimental steps. It therefore enables search for formal parameters (fixed parameters as well as factors) of specific processes.

Use Case 2 (Findability for fine-grained data acquisition)

A process graph enables semantic web markup and therefore findability of subsets of the data files since relevant metadata is not simply attached to the overall dataset.

Use Case 3 (Findability for Input-based dataset search)

A process graph enables search for samples or datafiles that were input of a specific experimental process, in addition to classic output-driven search.
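To make the use cases more concrete, here is a minimal sketch of parameter-based search (Use Case 1) and input-based search (Use Case 3) over a small in-memory process graph. The parameter entries mirror PropertyValue as a key-value pair or key-value-unit triplet (`name`, `value`, optional `unitText`). The process names, identifiers, values and the two helper functions are invented for illustration; a real deployment would query harvested markup rather than a Python list.

```python
# A tiny in-memory process graph; all names and values are made up.
PROCESSES = [
    {
        "name": "growth",
        "object": ["seeds-01"],
        "result": ["plant-01"],
        "parameterValues": [
            {"name": "temperature", "value": 22, "unitText": "degree Celsius"},
        ],
    },
    {
        "name": "growth",
        "object": ["seeds-02"],
        "result": ["plant-02"],
        "parameterValues": [
            {"name": "temperature", "value": 30, "unitText": "degree Celsius"},
        ],
    },
    {
        "name": "sequencing",
        "object": ["plant-01"],
        "result": ["reads-01.fastq"],
        "parameterValues": [
            {"name": "library layout", "value": "paired"},  # no unit needed
        ],
    },
]


def find_by_parameter(processes, name, value):
    """Use Case 1: processes that carry a given parameter value."""
    return [
        p for p in processes
        if any(pv["name"] == name and pv["value"] == value
               for pv in p["parameterValues"])
    ]


def find_by_input(processes, input_id):
    """Use Case 3: processes that took a given sample or data file as input."""
    return [p for p in processes if input_id in p["object"]]


warm = find_by_parameter(PROCESSES, "temperature", 30)
print([p["result"] for p in warm])

used_plant = find_by_input(PROCESSES, "plant-01")
print([p["name"] for p in used_plant])
```

Because the parameters sit on the individual process rather than on the overall dataset, the first query returns only the outputs of the warm growth run, which is the kind of subset selection Use Case 2 refers to.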

Have these been tested?

We don't have any experience in the test setup for this repository, so please let us know what needs to be done!

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New content (non-breaking change which adds new content)
  • Modified content (non-breaking change which modifies existing content)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Future TO-DOs

  • The provided example can be improved by small adaptations if the range of labEquipment of the type LabProtocol is extended to include PropertyValue.
  • In the future, it would be desirable to specifically link to LabProcess or LabProtocol objects from a Dataset. Currently, there is no semantically sound property that describes this relation (only about is a rough match).
  • Is the specification repository self-contained or do we need an additional PR for the website?

@sgenehr

sgenehr commented Jan 23, 2024

I really agree with the distinction of LabProtocol and LabProcess and would like to contribute to this draft.
This distinction is important for the description of the research process as a whole from a prospective and retrospective point of view. For that matter, a type of LabProcessDocumentation could be useful for the description of the content of research notebooks, whereas the LabProcess type could cover the granular description of singular activities performed in the lab.

@HLWeil
Contributor

HLWeil commented Jan 25, 2024

Hey @sgenehr, thank you for your input on this topic. I'm not quite sure I correctly got your proposition. Would the LabProcessDocumentation be a collection of LabProcess, enriched by some properties to add additional descriptions?

@sgenehr

sgenehr commented Jan 25, 2024

Yes, from the current proposal, I would understand a LabProcess to be a singular event or activity performed during an experiment.
A lab experiment consists of multiple activities performed in a sequence. The sequence should be provided by a HowTo, such as the LabProtocol. This way, each LabProcess can be executed according to a HowToStep mentioned in the protocol.
A LabProcessDocumentation or maybe LabExperimentDocumentation would represent a report on the sequence of activities that were actually performed, so it should also allow for activities that were not foreseen by the LabProtocol, i.e. Processes that were not executed according to a HowToStep.

A Documentation could also specify which items were partOf the experiment, while each LabProcess specifies the input and output relations for that particular activity.

@HLWeil
Contributor

HLWeil commented Jan 26, 2024

Ah okay, thanks for the clarification!

We did not intend the LabProcess to necessarily be a singular event. It can be a string of events, which is also reflected in its reference to the LabProtocol, which is a HowTo and therefore (as you stated) consists of (possibly multiple) HowToSteps.

In the annotation of research experiments we have in mind (based on ISA), the most important thing is to string inputs and outputs together. By this, the experimental flow can be traced back from the final data to the source material it all started from, with the necessary annotations like biological species and experimental factors being placed where they are actually applied. The LabProcess is of course the glue connecting input and output. So we propose that it be atomic not with regard to the steps performed, but with regard to when you can actually name an input and an output for a series of steps.

As a short example, applying an RNA extraction protocol might be brought up. This of course consists of many steps, namely switching between different buffers, applying reagents, centrifuging, etc. You start with your input sample of cells and end up with your output RNA extract. This could be annotated using a single LabProcess.
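The input/output chaining described above, with the RNA extraction treated as a single LabProcess, could be sketched roughly as follows. Each process is reduced to its `object` (inputs) and `result` (outputs). The identifiers, the surrounding "cell culture" and "sequencing" processes, and the `trace_back` helper are hypothetical and only serve to show how final data can be followed back to its source material.

```python
# Each entry stands for one LabProcess, reduced to inputs and outputs.
# Identifiers are invented for illustration.
PROCESSES = [
    {"name": "cell culture", "object": ["cell-line-A"], "result": ["cells-01"]},
    {"name": "RNA extraction", "object": ["cells-01"], "result": ["rna-extract-01"]},
    {"name": "sequencing", "object": ["rna-extract-01"], "result": ["reads-01.fastq"]},
]


def trace_back(processes, item):
    """Walk from an output back to the source material(s) it derives from.

    Returns the names of the processes passed on the way (latest first)
    and the items that no recorded process produced.
    """
    produced_by = {out: p for p in processes for out in p["result"]}
    steps, sources = [], []
    visited = set()
    stack = [item]
    while stack:
        current = stack.pop()
        if current in visited:      # guard against cycles and repeated inputs
            continue
        visited.add(current)
        process = produced_by.get(current)
        if process is None:
            sources.append(current)  # nothing produced it: a source material
            continue
        steps.append(process["name"])
        stack.extend(process["object"])
    return steps, sources


steps, sources = trace_back(PROCESSES, "reads-01.fastq")
print(steps)
print(sources)
```

Note that the many bench steps inside the RNA extraction never appear in the graph; only the point where an input and an output can be named becomes a node.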

Of course this still does not represent a full Lab Experiment, so there might be some overarching type used that functions as a contextualizing collection for the LabProcess.
Also, it's a good point that the execution of a LabProtocol does not necessarily go as planned. But then maybe you could also consider the LabProtocol you executed to be a new version, or another new LabProtocol that references the original one.

@ljgarcia ljgarcia requested a review from gtsueng January 26, 2024 09:23
@ljgarcia
Contributor

Hi @gtsueng I think this PR is ready to go, could you please check it from DDE point of view? Thanks

Contributor

@gtsueng gtsueng left a comment

It looks good from the DDE side of things @ljgarcia

@ljgarcia ljgarcia merged commit 310555a into BioSchemas:master Jan 29, 2024