SEP 014 -- Using SBOL to Model the Design-Build-Test Cycle #31
|
Hi all,
I very much like this proposal. I have some concern though about attaching
experimental data to ComponentDef directly.
On Thu, Jul 13, 2017 at 10:06 PM, cjmyers ***@***.***> wrote:
> 1. I’m actually open to the idea of productionStatus being a new field,
> since if we need this for ModuleDefinition, then we would need to add type
> anyway. That being said, we may
Very definitely a new field is much better than using type. Indeed, I
think it would be a very good idea to also get the topology information
(double / single stranded, circular) into its own field. Arguably, this is
an even more fundamental property than production status or even `role`.
Something to keep in mind for version 3.
> 1. I’m not excited about Test being linked to from CD, since I would
> prefer they indicate the host/environment of the test using a MD. However,
> I imagine that we might need to allow this for those reluctant to use MD
> class. However, I’m really unsure what it means to test a CD independent of
> a host/environment. On second thought, maybe this is a good way to motivate
> the MD class to those working at the CD level.
This is a major concern. In my own practice, it is absolutely crucial
that, e.g., sequencing data are NOT directly linked to a plasmid record but
are instead linked to a SAMPLE object. SAMPLE.content then links to either
ComponentDefinition directly (a naked DNA sample) or it links to a CELL
which links to ComponentDefinition (a clone of a cell containing a
plasmid). Sample obviously links to a LOCATION where I can find it and it
has some basic history of how it was derived from other samples.
An experimentalist needs to know which sample an experiment was performed
on. Each clone (of cells or DNA derived from those cells) potentially has
unknown mutations, samples become corrupted or mixed up, and so on, and we may
need to re-validate them or want to re-use them later. For publication,
this level of detail may be stripped away. So in this particular case,
direct attachment of TEST to Component *might* make sense.
In any case, ComponentDefinition should not become mixed up with this
concept of an experimental sample. It already is too broadly defined as it
stands. If anyone is interested, all this is implemented in rotmic and has
been in use and served us well for several years. So please go check out
the data model.
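As a rough illustration of the linkage described above, here is a minimal Python sketch. It is not rotmic's actual schema: every class and field name below (`Sample`, `Cell`, `content`, `location`, `derived_from`, `data_files`) is a hypothetical stand-in chosen to mirror the description, in which data attach to a sample and the sample's content resolves to a ComponentDefinition either directly or through a cell.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class ComponentDefinition:
    """Stand-in for an SBOL ComponentDefinition (design-level record only)."""
    identity: str


@dataclass
class Cell:
    """Hypothetical: a clone of a host cell carrying a plasmid."""
    name: str
    plasmid: ComponentDefinition


@dataclass
class Sample:
    """Hypothetical: a physical tube; experimental data attach here."""
    name: str
    content: Union[ComponentDefinition, Cell]
    location: str
    derived_from: Optional["Sample"] = None
    data_files: List[str] = field(default_factory=list)

    def design(self) -> ComponentDefinition:
        """Resolve the sample's content to the underlying design record."""
        if isinstance(self.content, Cell):
            return self.content.plasmid
        return self.content


# Made-up example records: a glycerol stock of a clone, and DNA prepared from it.
plasmid = ComponentDefinition("pExample")
glycerol_stock = Sample("clone 3", Cell("E. coli clone 3", plasmid), "freezer A, box 2")
miniprep = Sample("clone 3 miniprep", plasmid, "fridge 1", derived_from=glycerol_stock)
miniprep.data_files.append("clone3_sequencing.ab1")
```

In this sketch the sequencing file is recorded on the miniprep sample, never on the plasmid record itself, and `design()` recovers the plasmid when it is needed.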
Greetings
Raik
--
___________________________________
Raik Grünberg
http://www.raiks.de/contact.html
___________________________________
|
I was thinking a bit more about the Test class, and I thought of a potential problem. The current idea assumes one protocol produces one data file. I think that may not be true. One experiment could produce more than one set of data, or certainly multiple representations of the data. For example, you might want to attach to a “Test” a link to the raw data, a link to the processed data, and a link to a graphical representation of the processed data. To address this issue, perhaps what we want to do is:
- Formalize the Attachment class that SynBioHub is using as proper SBOL, and allow all TopLevel (or Identified) objects to be able to reference Attachments.
- Reduce the Test class to a protocol and it would then be able to have 0 or more Attachments for the data.
However, if Test is only a protocol, then I was wondering if PROV-O actually is the solution. Namely, we could do something like:
Essentially use PROV-O to stitch the entire design-build-test flow together. |
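Responding to Raik first.
> Sample... has some basic history of how it was derived from other samples.
Assuming this SEP is enacted, sample history can in fact be captured using the PROV-O classes which are already part of the data model.
> In my own practice, it is absolutely crucial that, e.g., sequencing data are NOT directly linked to a plasmid record but are instead linked to a SAMPLE object.
This is one of the motivations for this SEP, and your experience corroborates that of myself and the other authors. It is necessary to distinguish what the user intended to build (design) from what the user actually built (build). In this SEP, we represent a sample by using a ComponentDefinition with productionStatus:build.
> An experimentalist needs to know which sample an experiment was performed on. Each clone (of cells or DNA derived from those cells) potentially has unknown mutations, samples become corrupted or mixed up etc pp and we may need to re-validate them or want to re-use them later.
What you describe is encompassed by this SEP. See Example 1 <https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#example>. A Test can be associated with a ComponentDefinition representing a clone.
> In any case, ComponentDefinition should not become mixed up with this concept of an experimental sample. It already is too broadly defined as it stands.
There are two main reasons for using a ComponentDefinition to represent a sample:
- In some cases, it may be necessary to use SequenceAnnotations or Components to describe the substructure of a sample, especially when the sample does not match the target. Therefore it is advantageous to use ComponentDefinitions to represent both a design and a build (sample). For further discussion, see the third paragraph under Production Status <https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#indicating-the-production-status>.
- The consensus sequence for a given plasmid clone or sample is represented by the Sequence object that is associated with the ComponentDefinition representing the build. See Example 1 <https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#example>. The target sequence is represented by a Sequence associated with a design.
Thanks,
Bryan |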
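Now responding to Chris.
> One experiment I could imagine could produce more than one set of data, or certainly multiple representations of the data.
Agreed.
> Formalize the Attachment class that SynBioHub is using as proper SBOL, and allow all TopLevel (or Identified) objects to be able to reference Attachments.
Specification of the Attachment class goes beyond the scope of this SEP. However, this SEP is compatible with that vision. See Relation to Other Proposals <https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#otherproposals> for discussion of Attachments and associated metadata.
> Reduce the Test class to a protocol and it would then be able to have 0 or more Attachments for the data.
The latest revision to this SEP does essentially this. All metadata has been stripped from the Test class. Currently, a Test refers directly to external files through its attachments property. No metadata is specified. Tooling will have to infer the data type of the attachment through a file extension, but in the short term this should be workable.
However, the attachments property could be easily co-opted in the future to refer to Attachment objects which contain important metadata about an external file link. |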
Hi Bryan,
I put sbol-dev in CC because this is a general design issue that others
should look at, too.
I very much agree with adding a `builtstatus` field to ComponentDefinition.
And especially when it comes to publication, it may often be the most
straightforward to directly attach experimental info to a
ComponentDefinition.
But ComponentDefinition should *not* be made to represent a physical sample.
It should also not be made to represent a clone. `ComponentDefinition` is
meant to represent a molecule (in 99% of cases) or (in 100%) a part of a
molecular design. This is a completely different concept than the
representation of a tube in some freezer.
How do you want to encode what buffer a DNA molecule is stored in? How do
you want to encode the concentration of it? How do you want to encode the
fact that there is a mixture of molecules (each with its own concentration)
within this sample? None of that should be found in a ComponentDefinition
record unless you want to cause maximal confusion. Vice versa, what would
be the meaning of a sequence feature attached to a glycerol stock? This is
different territory and we may choose not to deal with it but we should not
further broaden the use of ComponentDefinition just because we cannot agree
on adding a new class.
My suggestion is that your SEP should clearly state that
ComponentDefinition is NOT meant to represent a sample or a clone. We could
then draft a further SEP to define a sbol.Sample class. At this point, most
of the sub-fields should be left undefined because needs are quite
different and often sample information stays in-house. But at least
programmers would know where to attach this kind of information.
Greetings
Raik
On Thu, Jul 20, 2017 at 6:15 AM, bbartley ***@***.***> wrote:
Responding to Raik first.
Sample... has some basic history of how it was derived from other samples.
Assuming this SEP is enacted, sample history can in fact be captured using
the PROVO classes which are already part of the data model.
In my own practice, it is absolutely crucial that, e.g., sequencing data
are NOT directly linked to a plasmid record but are instead linked to a
SAMPLE object.
This is one of the motivations for this SEP, and your experience
corroborates that of myself and the other authors. It is necessary to
distinguish what the user intended to build (design) from what the user
actually built (build). In this SEP, we represent a sample by using a
ComponentDefinition with productionStatus:build.
An experimentalist needs to know which sample an experiment was performed
on. Each clone (of cells or DNA derived from those cells) potentially has
unknown mutations, samples become corrupted or mixed up etc pp and we may
need to re-validate them or want to re-use them later.
What you describe is encompassed by this SEP. See Example 1
<https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#example>. A Test
can be associated with a ComponentDefinition representing a clone.
In any case, ComponentDefinition should not become mixed up with this
concept of an experimental sample. It already is too broadly defined as it
stands.
There are two main reasons for using a ComponentDefinition to represent a
sample:
- In some cases, it may be necessary to use SequenceAnnotations or
Components to describe the substructure of a sample, especially when the
sample does not match the target. Therefore it is advantageous to use
ComponentDefinitions to represent both a design and a build (sample).
For further discussion, see the third paragraph under Production Status
<https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#indicating-the-production-status>
.
- The consensus sequence for a given plasmid clone or sample is
represented by the Sequence object that is associated with the
ComponentDefinition representing the build. See Example 1
<https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#example>.
The target sequence is represented by a Sequence associated with a
design.
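The design/build distinction in the two points above can be sketched in Python as follows. This is an illustrative sketch only, not the pySBOL API: the dataclass, its field names, and the `differences` helper are assumptions made for the example, and the sequences are made up.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ComponentDefinition:
    """Simplified stand-in; field names are assumptions, not the SBOL API."""
    identity: str
    production_status: str            # "design" or "build"
    sequence: str                     # target (design) or consensus (build)
    was_derived_from: Optional[str] = None


def differences(design: ComponentDefinition,
                build: ComponentDefinition) -> List[Tuple[int, str, str]]:
    """List (position, target base, observed base) for equal-length sequences."""
    if len(design.sequence) != len(build.sequence):
        raise ValueError("sketch only handles substitutions, not indels")
    return [(i, t, o)
            for i, (t, o) in enumerate(zip(design.sequence, build.sequence))
            if t != o]


# The design holds the target sequence; the build holds the consensus
# sequence obtained from sequencing one clone, linked back to the design.
target = ComponentDefinition("plasmid_design", "design", "ATGGCTAGC")
clone = ComponentDefinition("plasmid_clone1", "build", "ATGGCTTGC",
                            was_derived_from=target.identity)
mismatches = differences(target, clone)
```

Here `mismatches` records where the clone departs from its target, which is the kind of substructure a build may need to annotate when the sample does not match the design.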
Thanks,
Bryan
|
Hi Raik,
I agree with you. We did discuss this at HARMONY, and while a ComponentDefinition could represent an isolated DNA molecule (plasmid), it would not represent it within its context (tube, host, etc.). This would be accomplished using the ModuleDefinition class. The ModuleDefinition could represent a tube or a strain. For example, if the ModuleDefinition is representing a host cell, it could then include the Component for the plasmid that has been transformed into this host.
Cheers,
Chris
|
That is beyond the scope of this SEP. These issues are important, but we won't quickly agree on how to represent a sample. The issue addressed by this SEP is more fundamental: does a ComponentDefinition represent a concept in the user's head, or does it describe the structure of an entity in the real world? Design, Build, Test: which stage of the synthetic biology life cycle are we in? (Also, I think I'm using the word clone slightly differently than you. I'm using it to refer to a plasmid clone, such as you might isolate during the sequence verification process. I'm not using it to refer to a cell clone or freezer stock.) Thanks |
It looks like the UML for Test has not been updated yet.
I think Attachment class should be included in this SEP. It is a prerequisite to this being useful, and it is a simple class, so it would be nice to include it. Also, the “attachments” property should be added to TopLevel and not just Test. I’m not keen on Test having this property, since it will be redundant with the TopLevel property.
On Jul 20, 2017, at 6:16 AM, bbartley ***@***.***> wrote:
Now responding to Chris
One experiment I could imagine could produce more than one set of data, or certainly multiple representations of the data.
Agreed.
Formalize the Attachment class that SynBioHub is using as proper SBOL, and allow all TopLevel (or Identified) objects to be able to reference Attachments.
Specification of the Attachment class goes beyond the scope of this SEP. However, this SEP is compatible with that vision. See Relation to Other Proposals <https://github.com/SynBioDex/SEPs/blob/master/sep_014.md#otherproposals> for discussion of Attachments and associated metadata.
Reduce the Test class to a protocol and it would then be able to have 0 or more Attachments for the data.
The latest revision to this SEP does essentially this. All metadata has been stripped from the Test class. Currently, a Test refers directly to external files through its attachments property. No metadata is specified. Tooling will have to infer the data type of the attachment through a file extension, but in the short term this should be workable.
However, the attachments property could be easily co-opted in the future to refer to Attachment objects which contain important metadata about an external file link.
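To show what inferring the data type from a file extension could look like for tooling, here is a minimal Python sketch. The extension table, the labels in it, the `guess_kind` helper, and the example.org URIs are all assumptions for illustration; neither the SEP nor SBOL defines them.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical mapping; neither the SEP nor SBOL defines these labels.
EXTENSION_TO_KIND = {
    ".ab1": "sequencing trace",
    ".fasta": "sequence",
    ".fcs": "flow cytometry",
    ".csv": "tabular data",
}


def guess_kind(attachment_uri: str) -> str:
    """Guess what an attachment contains from its file extension alone."""
    suffix = PurePosixPath(urlparse(attachment_uri).path).suffix.lower()
    return EXTENSION_TO_KIND.get(suffix, "unknown")


# Made-up attachment links, as a Test might carry in its attachments property.
attachments = [
    "https://example.org/data/clone1_forward.ab1",
    "https://example.org/data/plate_reader.CSV",
    "https://example.org/data/notes",
]
kinds = [guess_kind(uri) for uri in attachments]
```

The third link has no extension and falls through to "unknown", which is the weak spot of this approach and one reason explicit metadata on an Attachment object may be preferable later.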
|
UML updated. Perhaps others can comment on whether an Attachment class should be included in this SEP. |
The problem is that Attachments are not simple. We can't just take the SynBioHub idea of Attachment and formalize it into SBOL directly. For example, SynBioHub attachments don't provide any information about where to retrieve the attachment from, only the file hash. We also need to decide how to represent the type of the file (e.g. MIME types), etc. Also, in SynBioHub Attachments can be attached to absolutely anything, so it's not just related to the Test class, which I think makes it beyond the scope of this SEP. |
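The open questions listed above (where to retrieve the file, how to identify it, how to state its type) can be made concrete with a small Python sketch. This is not SynBioHub's Attachment and not a proposed SBOL class: the field names, the choice of SHA-256, and the helper functions are assumptions made purely for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Attachment:
    """Hypothetical record; field names are assumptions, not an SBOL class."""
    source: str      # where to retrieve the file
    sha256: str      # content hash, to verify what was retrieved
    media_type: str  # e.g. a MIME type
    size: int        # length in bytes


def describe(source: str, content: bytes, media_type: str) -> Attachment:
    """Build an Attachment record for content already in memory."""
    digest = hashlib.sha256(content).hexdigest()
    return Attachment(source, digest, media_type, len(content))


def matches(attachment: Attachment, content: bytes) -> bool:
    """Check retrieved bytes against the recorded hash."""
    return hashlib.sha256(content).hexdigest() == attachment.sha256


# Made-up file content and location.
record = describe("https://example.org/data/growth.csv",
                  b"time,od600\n0,0.05\n", "text/csv")
```

Keeping the hash alongside the source lets a client confirm that what it downloaded is the file the record describes, which a hash alone or a link alone cannot do.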
James: are you willing to put forward an SEP for attachments in short order, then? We should get that one approved before approving the experimental data one, since it will depend on it.
|
Hi Chris,
> I agree with you. We did discuss this at Harmony, and while a
> ComponentDefinition could represent an isolated DNA molecule (plasmid), it
> would not represent it within its context (tube, host, etc.). This would be
> accomplished using ModuleDefinition class. The ModuleDefinition could
> represent a tube or a strain. For example, if the ModuleDefinition is
> representing a host cell, it could then include the Component for the
> plasmid that has been transformed into this host.
ModuleDefinition is meant to group design elements together for purposes of
abstraction or so that we can say something that applies to the whole
rather than the parts. This still has little to do with an Eppendorf tube
in some freezer. Sample management is a logistical problem, not a design
problem. We are complicating the life of both application programmers and
library developers if our classes have several unrelated purposes. That
means the programmer has to untangle the actual meaning of an object from
its fields and sub-fields. Perhaps a sample class could be derived from
ModuleDefinition but it certainly needs its own class.
Greetings
Raik
|
Hi Raik, Chris,
What we found at HARMONY is that we don't have a clear consensus about how to represent samples. Chris is not alone in arguing that ModuleDefinition might be used to represent details about a sample. That is why I deliberately chose to limit the scope of this SEP. Its purpose is not to describe samples in detail. However, it does support sequence verification workflows. From my point of view, that is fundamental.
Best |
Hi Chris,
From your point of view, why is an I see this current revision as entirely workable, with an easy migration path towards adding
@jakebeal I think your input might be a critical tie breaker on this. Do you see the current SEP as workable, or would you like us to work out the semantics of Attachments? |
Here is my take. I believe that SBOL's value comes primarily as an "integration hub" for linking different aspects of biological engineering workflows. This means that, while we do not want to get down into the weeds of LIMS systems, metrology, and experimental data exchange, we do need to be able to represent the critical engineering decisions associated with them.
To this end, I see a high degree of value in being able to distinguish between "idealized" engineered artifacts (design) and realized instances. Critically, PROV-O lets us link these cleanly, as well as potentially attaching protocol descriptions to explain how we got from a design to a sample. PROV-O also lets us cleanly link an intended design to a realized design. I also see it as worthwhile to allow this distinction to be attached to both ModuleDefinition and ComponentDefinition.
The key point of this distinction is not the fine details of what happens in the lab (I agree those are best left to LIMS systems), but to have a clean representation of critical engineering decisions. For example, consider Raik's example of DNA being stored in a particular liquid buffer. We should be able to represent this in two different ways:
Thinking about it from this perspective, my expectation is that when it comes to physical samples, ModuleDefinition is more likely to be useful for talking about experiments with actual cells, while ComponentDefinition is more likely to be used for talking about a construction process and verification.
So far, so good, and I think without any controversy. As I am working out more use cases, however, I am becoming uncomfortable with the particulars of this proposal, and my discomforts are leading me to an alternative that I think is still quite simple. Here are some of my sources of discomfort:
These are pointing me toward a conclusion that while I think the (extremely simple) information we're trying to encode is the right information, we need to make an adjustment in the representation. Since this comment is getting super-long, I will follow with another comment with my new proposal. |
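Here is my alternate proposal, which tries to capture the same information with the following two differences:
1. A cleaner distinction between intention and reality
2. Designs aren't forked until you actually know they differ from their original.
New classes, with their fields:
- Sample: this represents something physical
  - field: specification [1]: link to a ComponentDefinition or ModuleDefinition
  - field: data [0 .. *]: links to Data objects
In my proposal, the Sample class plays exactly the same role as the productionStatus field in the current proposal. A Sample is equivalent to a derived ComponentDefinition / ModuleDefinition with its productionStatus set to build. The difference is that we don't have to copy all of the sub-structure of the CD/MD, just link to it. If the reality turns out to be different, then we can fork the CD/MD then, using PROV-O to link just as we would have before. We can also use PROV-O to link the Sample to its intended CD/MD, in order to represent the whole process: "Sample X was supposed to be an instance of ModuleDefinition A, but instead I ended up with ModuleDefinition A'"
The data field is identical to tests, just renamed to follow my proposed adjustment to that class.
- Protocol: this is a placeholder class like Model, linking to an external protocol specification
  - field: source [1]: URI
  - field: language [1]: URI
The fields of this class (and Data, below) are modeled exactly after Model. At some later point we may add more fields, but not in this proposal. The idea is that Protocol gets used as part of PROV-O links talking about the derivation of a Sample from a ComponentDefinition or ModuleDefinition or of one Sample from another Sample.
- Data: this is a placeholder class like Model, linking to an external data object / file / or collection
  - field: source [1]: URI
  - field: format [1]: URI
Mostly there I just renamed Test to expand the notion that data can come from any stage of sample manipulation, not just a "testing" stage. We don't need the protocol field because it can be embedded with PROV-O if desired, just as for the samples. I also propose dropping the fields focused on data transport. |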
Sample is not such a good name then. It's more of an "experimental
realization". So perhaps "Experiment" or "Implementation"?
On Jul 21, 2017 22:44, "Jacob Beal" <notifications@github.com> wrote:
Here is my alternate proposal, which tries to capture the same information
with the following two differences:
1. A cleaner distinction between intention and reality
2. Designs aren't forked until you actually know they differ from their
original.
New classes, with their fields:
- Sample: this represents something physical
- field: specification [1]: link to a ComponentDefinition or
ModuleDefinition
- field: data [0 .. *]: links to Data objects
In my proposal, the Sample class plays exactly the same role as the
productionStatus field in the current proposal. A Sample is equivalent to a
derived ComponentDefinition / ModuleDefinition with its productionStatus
set to build. The difference is that we don't have to copy all of the
sub-structure of the CD/MD, just link to it. If the reality turns out to be
different, then we can fork the CD/MD then, using PROV-O to link just as we
would have before. We can also use PROV-O to link the Sample to its
intended CD/MD, in order to represent the whole process: "Sample X was
supposed to be an instance of ModuleDefinition A, but instead I ended up
with ModuleDefinition A'"
The data field is identical to tests, just renamed to follow my proposed
adjustment to that class.
- Protocol: this is a placeholder class like Model, linking to an
external protocol specification
- field: source [1]: URI
- field: language [1]: URI
The fields of this class (and Data, below) are modeled exactly after Model.
At some later point we may add more fields, but not in this proposal. The
idea is that Protocol gets used as part of PROV-O links talking about the
derivation of a Sample from a ComponentDefinition or ModuleDefinition or of
one Sample from another Sample.
- Data: this is a placeholder class like Model, linking to an external
data object / file / or collection
- field: source [1]: URI
- field: format [1]: URI
Mostly there I just renamed Test to expand the notion that data can come
from any stage of sample manipulation, not just a "testing" stage. We don't
need the protocol field because it can be embedded with PROV-O if desired,
just as for the samples. I also propose dropping the fields focused on data
transport.
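The three classes listed above can be written down as a short Python sketch. The class names, field names, and cardinalities follow the proposal text; representing links as plain URI strings, using dataclasses, and the example.org URIs are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Protocol:
    """Placeholder like Model: points to an external protocol specification."""
    source: str    # URI of the protocol document
    language: str  # URI identifying the protocol language


@dataclass
class Data:
    """Placeholder like Model: points to an external data file or collection."""
    source: str  # URI of the data
    format: str  # URI identifying the data format


@dataclass
class Sample:
    """Something physical, linked to the definition it is meant to realize."""
    specification: str                              # URI of a CD or MD, exactly one
    data: List[Data] = field(default_factory=list)  # zero or more


# Made-up URIs: one sample of a designed plasmid with one sequencing file.
sample = Sample(specification="https://example.org/designs/plasmid_A")
sample.data.append(Data(source="https://example.org/data/clone1.ab1",
                        format="https://example.org/formats/ab1"))
miniprep_protocol = Protocol(source="https://example.org/protocols/miniprep",
                             language="https://example.org/languages/plain-text")
```

Note that the Sample carries no sub-structure of its own; it only points at the ComponentDefinition or ModuleDefinition it is supposed to realize. The Protocol is not a field of Sample here because, per the proposal, it would be referenced from PROV-O links describing how the sample was derived.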
|
I'm not deeply attached to the name. Let's talk about the data model first, however, and then make sure we get the best synonym. |
I think this could work. You are saying "this is an Implementation of (link
to CD or module), this is what we did, and here are the data recorded with
it or validating it."
|
Exactly. |
Hi Jake, I'm not sure all your criticisms are fair, and therefore I don't see a need for a new proposal. Please see my response to your comments below.
I don't understand this. If something is "physical", it is real.
I feel like this use case is perfectly accommodated by the current SEP. I thought I had explained it in the text, but I see now I just half explained it. Anyway, I don't think this is really a problem, and I can update the SEP to explain this in more detail. In your proposal, you state:
This is already explicitly stated in the current proposal. There is also a pretty clear UML diagram of this in Example 1:
This argument has little weight with me now. Some of us argued at HARMONY for taking a more explicit, knowledge-representation approach. There were 3 possibilities discussed:
We went with 3, which was a concession to you, Jake! Now it appears we are back to something like option 2.
Can you provide an example of sample manipulation that would not qualify as a One thing I would like to emphasize. Our current proposal defines clear semantics about where data should be attached. A Test class represents empirical data. A Model represents simulation data. These each occupy a special place in the Design-Build-Test-Learn cycle (see Example 3). I think it is very important that Test and Model remain conceptually distinct and explicit. What would make sense to me is deriving both Test and Model from an abstract Data class. Furthermore, this SEP has another clear semantic about where data should be attached. Structural data (including sequence verification data) should be attached to CD. Characterization data should be attached to MD. This means client tooling has a very good idea where to look for certain kinds of data. I feel like this is an important consideration, since we seem to be discussing adding Data or Attachments to arbitrary SBOL objects. Regards, |
Let me focus on the heart of my concern, which is my discomfort with exactly this part of the proposal:
With this best practice, we would be recommending effectively using a ComponentDefinition or ModuleDefinition only as a pointer to another "master" copy, by means of the PROV-O link. That is a very different usage than we have ever had previously. Critically, the ComponentDefinition (or, equivalently ModuleDefinition) is no longer "self-contained," in the sense that you can find out what it is just by looking at child Components, Sequences, etc. Moreover, wasDerivedFrom can have multiple links, per SEP012 --- what does it mean if we link an "empty" ComponentDefinition to multiple sources by wasDerivedFrom? How do we even reason about this or effectively detect it? Is this new usage limited only to "design"/"build" relations or can it relate between two designs as well? I know that my position was different last month, but as I've been working through my use cases, I've been getting progressively more uncomfortable with the repurposing of ComponentDefinition and ModuleDefinition as a sort of proxy pointer. I feel that this is a larger change of the meaning of the data model than is being accounted for, but if we prohibit this usage, then we have lots of cloning and the problems of describing something before we can verify it. This is the core of my concerns, and I believe this issue needs to be addressed one way or another. |
Hi Jake,
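This is not the only reason we are using CD or MD to represent builds. The other reason we are using CD and MD to represent builds (discussed in the SEP, and the comments above) is as follows:
- In some cases, it may be necessary to use SequenceAnnotations or Components to describe the substructure of a sample or annotate it, especially when the sample does not match the target. This is similar to a use case you cited earlier: Sometimes we build something, it works, then we sequence it and find out what worked was actually a beneficial mutant. We then add that to the collection of ComponentDefinitions, where it goes from being physical back to being a design.
- The consensus sequence for a given plasmid clone or sample is represented by the Sequence object that is associated with the ComponentDefinition representing the build. See Example 1. The target sequence is represented by a Sequence associated with a design.
With regards to concerns about the wasDerivedFrom field, indeed there is ambiguity in how a wasDerivedFrom property may be interpreted. Those I think are deeper issues that go beyond the scope of this SEP. The examples you cite sound like edge cases to me. Also, nothing in this proposal is outside the recommended usage of wasDerivedFrom. The W3C spec is as follows:
"A derivation is a transformation of an entity into another, an update of an entity resulting in a new one, or the construction of a new entity based on a pre-existing entity."
In earlier versions of the SEP, these ambiguities were not an issue, because we favored explicit naming of classes, similar to the approach you took here with the Sample class. Also, I'd like to point out that specification of your Sample class...
Sample: this represents something physical
field: specification [1]: link to a ComponentDefinition or ModuleDefinition
field: data [0 .. *]: links to Data objects
...is pretty much identical to the Build class in the original proposal! Except design has changed to specification, test has changed to data, and Build has changed to Sample. So much for design-build-test! This is very ironic to me. |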
Hi Brian, Hi Jake,
I think it is perfectly fine to change opinions -- that's the difference
between open discussion and ideological debate ;)
So let's please remember that we are all looking for a good solution here
and fair or unfair has nothing to do with it. Please let's be nice to each
other and stay focused on solving the problem.
The next thing to remark is that we have essentially two proposals now but
only one of them is documented in a SEP and the discussion thread has
become long enough for others to get lost. So I propose that Jake and I
write up an alternative SEP. Brian, is your original proposal with a
special class still around? We should obviously have a careful look at it.
I still think it is a good idea to have a "built status" or "production
status" property directly in Component and Module (-definition). The cases
I would distinguish are:
(1) design -- not implemented yet
(2) under construction (building)-- being implemented which can easily take
months
(3) build completed
Once built, the molecule or system (e.g. bacterial clone or cell line) will
undergo testing. One or more validation experiments will be run. In the
easy case of cloning, validation (sequencing) may show that the plasmid is
not what we want. Then this result should still be attached to the *same*
component making clear that "the build failed". In the very rare event that
you think the build failed but the result is useful anyway, you can create
a new version of the ComponentDefinition (prov-O derived from the old one)
and attach the same experimental result.
There are other possible outcomes, namely two validation experiments may
give conflicting results or the results are incomplete. If you go from
plasmid construction to circuits, there is no clear-cut distinction between
"success" and "failure" anyway but you still want to be able to attach
results of some sort. I would argue we should first focus on "Build
Validation" experiments where there is a relatively clear-cut
interpretation of results (implemented as designed or not). "Functional
Validation" is another, related, problem.
So what we probably agree on is that
(1) we need some way to put a status on a CD and MD.
(2) we need a representation of a "Validation Experiment" (Brian calls it
Test, I would prefer a less generic name) with links to protocols and
result data
And here comes the problem and the disagreement: Experiments are not
performed on abstract designs but on actual physical **batches**. Typical
examples are cell clones after plasmid construction, or cell culture
batches after a genome engineering experiment or one particular batch of
enzyme mixed with one particular batch of cell-free extract. For cell
clones, there are even "batches of batches" that may start to differ or may
become corrupted by virus/phage infection etc. SBOL has no concept or
representation for any of this because this is not in the domain of design
any longer.
The suggestion coming out of your HARMONY discussion is to re-use shallow
copies of ComponentDefinition or ModuleDefinition to represent physical
batches (/clones /cell lines /samples) in the lab. That's, in my and Jake's
opinion, a bad choice. I don't want to re-iterate the arguments here but
let me just say that this kind of experimental / logistical information is
in my eyes out of scope for classes describing a design.
Greetings and have a nice weekend everyone,
Raik
…On Sat, Jul 22, 2017 at 8:26 AM, bbartley ***@***.***> wrote:
Hi Jake,
With this best practice, we would be recommending effectively using a
ComponentDefinition or ModuleDefinition only as a pointer to another
"master" copy, by means of the PROV-O link
This is not the only reason we are using CD or MD to represent builds. The
other reason we are using CD and MD to represent builds (discussed in the
SEP, and the comments above) is as follows:
- In some cases, it may be necessary to use SequenceAnnotations or
Components to describe the substructure of a sample or annotate it,
especially when the sample does not match the target. This is similar to a
use case you cited earlier: Sometimes we build something, it works, then we
sequence it and find out what worked was actually a beneficial mutant. We
then add that to the collection of ComponentDefinitions, where it goes from
being physical back to being a design.
- The consensus sequence for a given plasmid clone or sample is
represented by the Sequence object that is associated with the
ComponentDefinition representing the build. See Example 1. The target
sequence is represented by a Sequence associated with a design.
With regards to concerns about the wasDerivedFrom field, indeed there is
ambiguity in how a wasDerivedFrom property may be interpreted. Those I
think are deeper issues that go beyond the scope of this SEP. The examples
you cite sound like edge cases to me. Also, nothing in this proposal is
outside the recommended usage of wasDerivedFrom. The W3C spec is as
follows:
"A derivation is a transformation of an entity into another, an update of
an entity resulting in a new one, or the construction of a new entity based
on a pre-existing entity."
In earlier versions of the SEP, these ambiguities were not an issue,
because we favored explicit naming of classes, similar to the approach you
took here with the Sample class. Also, I'd like to point out that
specification of your Sample class...
Sample: this represents something physical
field: specification [1]: link to a ComponentDefinition or ModuleDefinition
field: data [0 .. *]: links to Data objects
...is pretty much identical to the Build class in the original proposal!
Except design has changed to specification, test has changed to data, and
Build has changed to Sample. So much for design-build-test! This is very
ironic to me.
--
___________________________________
Raik Grünberg
http://www.raiks.de/contact.html
___________________________________
|
Hi, Brian:
I agree, and I also agree that we must be able to describe the contents of samples. However, there is more than one way to achieve this. The fact that CD and MD are convenient in other ways does not affect my concerns about the change of semantics needed in order to use an empty CD/MD as a pointer. As I approach the question of solutions, it is indeed true that my thoughts do have a good deal of commonality with your original proposal. I would not view this as reverting, but as "spiraling up" to a view that includes the good parts of both the old and new proposal. The key differences in what I am proposing are (again, not worrying about names):
There are other minor differences, but indeed, I have come around to the view expressed by both yourself and Raik that it is valuable to have not just a field but a whole separate class to represent a physical object, so that we can have lightweight "pointers" for representing large numbers of samples. |
No problem with changing opinions. However, this feels like we are going in circles instead of converging. I hope that we are indeed "spiraling up" as Jake said. At this stage, an entirely new proposal might solve some issues, but at the same time it will likely introduce new issues, or worse re-introduce old issues which have already been discussed. Fundamentally, a CD represents structure. IMHO, I should be able to use a CD to describe real, physical, manufactured structures, as well as theoretical, conceptual structures. When we start talking about |
I absolutely agree with you that a CD represents structure, and that we should be able to use it to describe real, physical structures. That is exactly why I want to not use "shallow" CDs as pointers to "real" CDs. Likewise for MDs. I think we need to separate the "pointer" as a separate class, whatever the right name turns out to be, whether it be "Sample" or "Build" or "Aliquot" or "PhysicalThing" or whatever else might be the best fit for a representation of a physically instantiated design that somehow points to a CD or MD that describes it fully. |
We are talking about physical implementations (or experimental
realizations) of a given design. This is not related to structure at all.
Different implementations (for example different clones) need to be
distinguishable because they may or may not be validated by experiments.
They may all originate from the same experiment or they may be created with
different methods in different labs but they all point to the same design
(CD or MD).
Let me try from another angle: We need a new class "Implementation" for the
same reason that we have "Component" (a.k.a. SubPart) instead of creating a
new "ComponentDefinition" each time a part is re-used in a sequence design.
Or again from another angle: we are crossing a boundary here from design to
experiment. Using ComponentDefinition or ModuleDefinition to enumerate
bacterial colonies, cell lines or enzyme batches in the lab is just a
really bad idea.
Good night,
Raik
|
Hi,
I’m going to take a shot at seeing if I can try to unify the proposals. How about?
Experiment
Design : URI reference to CD or MD design
Build: URI reference to CD or MD build (may be same or different than design)
Tests: [0..*] links to Test objects
Test
Protocol : URI reference to a protocol
Data : [0..*] links to Data objects
Data
Source : URI (perhaps a reference to an attachment object)
Format : URI
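
A minimal Python sketch of the three classes outlined above, assuming plain dataclasses and string URIs. The class and field names follow the outline; the example URIs, the `build_matches_design` helper, and the use of dataclasses are illustrative assumptions and not part of any SBOL library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Data:
    source: str   # URI, perhaps a reference to an attachment object
    format: str   # URI identifying the file format


@dataclass
class Test:
    protocol: str                                   # URI reference to a protocol
    data: List[Data] = field(default_factory=list)  # [0..*] links to Data objects


@dataclass
class Experiment:
    design: str                                      # URI of the CD or MD design
    build: str                                       # URI of the CD or MD build
    tests: List[Test] = field(default_factory=list)  # [0..*] links to Test objects

    def build_matches_design(self) -> bool:
        # Hypothetical helper: the build may reference the same object
        # as the design, or a different one.
        return self.build == self.design


# Hypothetical URIs, for illustration only.
seq_run = Test(
    protocol="http://example.org/protocol/sanger",
    data=[Data(source="http://example.org/attachment/trace1",
               format="http://example.org/format/ab1")],
)
exp = Experiment(
    design="http://example.org/cd/plasmid_design",
    build="http://example.org/cd/plasmid_clone_3",
    tests=[seq_run],
)
```

Here the Experiment is the organizing record: design and build are kept as separate references, so a build that differs from the intended design can still be described.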
Chris
|
Hi Chris,
I also think the proposals can be unified. However, your suggestion is
still missing the (I think) most important point of the discussion: We need
a class for "physical implementation" that is not CD or MD (but referring
to it). This is because we need to be able to say: "These results apply to
this experimental batch / clone / particular batch of cells". Without this
concept, CD or MD are turned into representing experimental clones and
batches (as it is done in this SEP) which is completely violating the scope
of what CD and MD are supposed to represent and will confuse us for years
to come (and trigger an avalanche of validation rules that are not needed
if we keep this clearly separated). One model would be:
Implementation
* design -> ComponentDefinition
* validation -> ValidationExperiment
ValidationExperiment
* data
* protocol
* validation_result: "confirmed" / "failed" / "ambiguous" / "unknown"
ComponentDefinition
* productionStatus: built
* implementations -> ...
* prov-o: derived_from -> ProvO record pointing to original design **if
different**
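
The model above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the `ValidationResult` enum, the `clone_id` label, the helper function, and the example URIs are hypothetical names; the four result values are the ones listed on the `validation_result` line.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class ValidationResult(Enum):
    CONFIRMED = "confirmed"
    FAILED = "failed"
    AMBIGUOUS = "ambiguous"
    UNKNOWN = "unknown"


@dataclass
class ValidationExperiment:
    data: List[str]   # URIs of result data
    protocol: str     # URI of the protocol
    validation_result: ValidationResult = ValidationResult.UNKNOWN


@dataclass
class Implementation:
    clone_id: str   # hypothetical label for one physical clone or batch
    design: str     # URI of the ComponentDefinition being implemented
    validation: List[ValidationExperiment] = field(default_factory=list)


def confirmed_implementations(impls, design_uri):
    """Return implementations of design_uri with at least one confirmed validation."""
    return [
        impl for impl in impls
        if impl.design == design_uri
        and any(v.validation_result is ValidationResult.CONFIRMED
                for v in impl.validation)
    ]


design = "http://example.org/cd/plasmid_design"  # hypothetical URI
clones = [
    Implementation("clone_1", design, [
        ValidationExperiment(["http://example.org/data/trace1"],
                             "http://example.org/protocol/sanger",
                             ValidationResult.CONFIRMED)]),
    Implementation("clone_2", design, [
        ValidationExperiment(["http://example.org/data/trace2"],
                             "http://example.org/protocol/sanger",
                             ValidationResult.FAILED)]),
    Implementation("clone_3", design),  # not yet validated
]
confirmed = confirmed_implementations(clones, design)
```

Several clones point to the same design URI, and each result is attached to one particular clone rather than to the design itself.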
Greetings
Raik
|
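I had an insight --- I think the productionStatus field (possibly renamed) needs to be on the Implementation / Sample / Build, rather than on the ComponentDefinition / ModuleDefinition.
The reason we are making these "pointers" is to be able to make distinctions like "this is a real thing" vs. "this is an intention." In all of the proposals that have been made, we would be using a single CD/MD to provide the full-detail description of both a design and many actual samples --- the "no-content copies" are then allowing us to distinguish the physical/virtual nature of the different instances. So no matter what we do, we need to have a productionStatus associated with each sample, rather than with the full-detail CD/MD.
We can do this without having a "no-content copy" if we associate the field with the pointer to the design in the Implementation / Sample / Build, something like:
Implementation
- design [1] -> ComponentDefinition / ModuleDefinition
- designStatus [1] --> (#designIntent, #confirmedInSample)
- test [0 .. *] -> Test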
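
One way to read this suggestion, sketched in Python under assumptions: the status lives on the lightweight Implementation record, next to its pointer to the design, so many samples can share one full-detail CD/MD without copying it. The two status values are the `#designIntent` and `#confirmedInSample` values named in this thread; the enum, the example URIs, and the dataclass layout are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DesignStatus(Enum):
    DESIGN_INTENT = "#designIntent"
    CONFIRMED_IN_SAMPLE = "#confirmedInSample"


@dataclass
class Implementation:
    design: str                   # URI of the full-detail CD or MD (exactly one)
    design_status: DesignStatus   # status of this sample relative to the design
    test: List[str] = field(default_factory=list)  # URIs of Test objects [0..*]


design_uri = "http://example.org/cd/plasmid_design"  # hypothetical URI

samples = [
    Implementation(design_uri, DesignStatus.CONFIRMED_IN_SAMPLE,
                   ["http://example.org/test/seq_run_1"]),
    Implementation(design_uri, DesignStatus.DESIGN_INTENT),
]

# Every sample refers to the same full-detail description; only the status differs.
shared_designs = {s.design for s in samples}
```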
|
I think status flags may be useful on each level:
(1) The `Test` or `ValidationExperiment` could use a flag telling us
whether this particular test has turned out as expected (e.g. a single
sequencing trace is what we expect).
(2) The `Implementation` instance certainly needs a flag telling us
whether, judging from *all* validation runs (e.g. forward AND reverse
sequencing), this particular clone is confirmed to be identical to the
intended design (though we may later still find out that there is a problem
with it).
(3) And the MD or CD may also have a flag telling us whether there is
supposed to be any correct implementation available, for example, or
whether this DNA construct has been stuck at the design stage forever. So
that would be Brian's `productionStatus`. This would be quite important to
quickly filter through many designs or to mark "built" designs without
revealing all the details about clones and samples.
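
A small Python sketch of how the three levels of flags might roll up. The combination rules are assumptions made for illustration, since the thread does not fix them: a clone counts as confirmed only if it has at least one test and all of its tests turned out as expected, and a design is flagged as having a correct implementation if any clone is confirmed. All names are hypothetical.

```python
from typing import Dict, List


def implementation_confirmed(test_flags: List[bool]) -> bool:
    """Level 2: a clone is confirmed only if it has tests and all of them passed."""
    return len(test_flags) > 0 and all(test_flags)


def design_has_correct_implementation(clones: Dict[str, List[bool]]) -> bool:
    """Level 3: the design is flagged as built if any clone is confirmed."""
    return any(implementation_confirmed(flags) for flags in clones.values())


# Level 1: one boolean per test (e.g. forward and reverse sequencing runs).
clones = {
    "clone_1": [True, False],   # one read did not match the design
    "clone_2": [True, True],    # both reads as expected
    "clone_3": [],              # not tested yet
}

per_clone = {name: implementation_confirmed(flags) for name, flags in clones.items()}
design_built = design_has_correct_implementation(clones)
```

The design-level flag can then be used to filter many designs quickly without exposing the per-clone details.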
Greetings
Raik
…On Sun, Jul 23, 2017 at 3:01 PM, Jacob Beal ***@***.***> wrote:
I had an insight --- I think the productionStatus field (possibly
renamed) needs to be on the Implementation / Sample / Build, rather than
on the ComponentDefinition / ModuleDefinition.
The reason we are making these "pointers" is to be able to make
distinctions like "this is a real thing" vs. "this is an intention." In all
of the proposals that have been made, we would be using a single CD/MD to
provide the full-detail description of both a design and many actual
samples --- the "no-content copies" are then allowing us to distinguish the
physical/virtual nature of the different instances. So no matter what we
do, we need to have a productionStatus associated with each sample,
rather than with the full-detail CD/MD.
We can do this without having a "no-content copy" if we associate the
field with the pointer to the design in the Implementation / Sample /
Build, something like:
Implementation
- design [1] -> ComponentDefinition / ModuleDefinition
- designStatus [1] --> (#designIntent, #confirmedInSample)
- test [0 .. *] -> Test
|
Some good suggestions from everybody here.
Cheers, |
The iGEM registry / SynBioHub has several different status fields that map to these in some way. For example, I think igem#experience : Works maps very nicely to the Test stage. There are also #partStatus, #sampleStatus, and #status. It would be nice to know what the full range of values are for these fields.
The PartStatus field is not really useful, it only has two values “Deleted” and “Released HQ 2013”.
The SampleStatus field is a bit more useful, it has values, “Discontinued”, “For Reference Only”, “In Stock”, “It’s Complicated”, “No Part Sequence”, and “Not in Stock”.
The Status field has values, “Available”, “Deleted”, “Informational”, “Planning”, and “Unavailable”.
The Experience field has values, “Fails”, “Issues”, “None”, and “Works”.
The huge problem with all of these fields is they are not precisely defined, so they are really inconsistently used. Furthermore, they are not kept up-to-date. In any case, they require value judgements that can be quite arbitrary. One thing Doug has really been pushing in the LCP project is to “never say something works”. This, in his opinion, is completely meaningless. It is better to report metrics, which I agree with. I think we should be very careful about baking into the standard non-quantitative statements about functionality.
Chris
|
I disagree. The “Experiment” class is the physical implementation class you are looking for. Maybe you just don’t like the name, which is fine. We can call it Implementation, but I think I like Experiment better, since an implementation is really about an experiment. Namely, you design something, you build it, and then you test it. To me, this collection of steps is conducting an experiment, and the Experiment class I propose links them all together. Your approach is missing a link to the physical realization. My proposal has design and build separate because the design may not be correctly realized. It might be you get a different sequence than intended or the realization perhaps has scars that are not part of the design, or some other artifact from construction.
So, I’m standing by my proposal. I think it is the cleanest approach so far. It avoids making changes to existing classes, making it easier to use right away. This feature also means that we avoid duplicate CDs/MDs as the original proposal was forcing us to do. It makes it really easy to determine if a design has been tested: simply look for Experiments referencing this design. Finally, it is not really all that different from what you have below except that I see the “Experiment” as the organizing class.
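
The lookup described above can be sketched in Python as follows. This is a minimal illustration, assuming Experiment records held in a plain list with string URIs; the function names and example URIs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Experiment:
    design: str                                     # URI of the CD or MD design
    build: str                                      # URI of the CD or MD build
    tests: List[str] = field(default_factory=list)  # URIs of Test objects


def experiments_for(design_uri: str, experiments: List[Experiment]) -> List[Experiment]:
    """Return every Experiment that references the given design."""
    return [e for e in experiments if e.design == design_uri]


def has_been_tested(design_uri: str, experiments: List[Experiment]) -> bool:
    """A design counts as tested if some referencing Experiment has a Test."""
    return any(e.tests for e in experiments_for(design_uri, experiments))


# Hypothetical records, for illustration only.
records = [
    Experiment("http://example.org/cd/A", "http://example.org/cd/A_clone1",
               ["http://example.org/test/1"]),
    Experiment("http://example.org/cd/B", "http://example.org/cd/B"),
]
```

Because the existing classes are untouched, the check needs nothing beyond the Experiment records themselves.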
|
On Wed, Jul 26, 2017 at 11:45 AM, cjmyers ***@***.***> wrote:
> I disagree. The “Experiment” class is the physical implementation class
> you are looking for. Maybe you just don’t like the name, which is fine. We
> can call it Implementation, but I think I like Experiment better, since an
> implementation is really about an experiment. Namely, you design something,
> you build it, and then you test it. To
OK, then this is where the confusion comes from. Physical implementations
or realizations are particular clones, batches, cell lines etc which are
NOT experiments. They are the result of experimental work -- for example, a
single cloning experiment will generate lots of distinct clones or batches
of DNA, some correct, some not. These batches each need to be validated by
experiments and they can be used in later experiments or become the basis
of further construction work. Some clones/cell lines/implementations may
seem correct now but later experiments reveal that they have an issue. So we
need to be able to trace them.
If we rename your "Experiment" class to "Implementation" and your / Brian's
"Test" to "Experiment" the two drafts are almost identical.
> me, this collection of steps is conducting an experiment, and the
> Experiment class I propose links them all together. Your approach is
> missing a link to the physical realization. My proposal has design and
> build separate because the design may not be correctly realized. It might
> be you get a different sequence than intended or the realization perhaps
> has scars that are not part of the design, or some other artifact from
> construction.
Modern experimental practice is not very tolerant of such unintended
effects. E.g. with gene synthesis and lab-internal cloning you either get
exactly what you want or the result goes to the bin. So in this particular
context (arguably most important for SBOL), your result is either correctly
built or not. If you really want to continue with an incorrect build, you
had better make a new version of the CD, and provO could take care of linking
up to the old one.
However, commonly a particular clone or cell line may have acquired
mutations or changes **outside** of your design area which you are either
unaware of (yet) or which you don't care about. E.g. off-target cutting in
genome engineering or mutations on the plasmid backbone. Or your design
only said "knock out this gene" and 10 different clones from your CRISPR
experiment have the correct knockout but all of course look slightly
different at the sequence level. If you do not want to use provO for this
but want to have an explicit field for "here is a more detailed description
of this particular clone", then I am fine with that. As more experiments
are run on a particular clone, the "build" Component or Module will also
become more detailed. So there is still room for provO versioning to be
used.
So in summary, I agree with your outline and would mostly change names:

Implementation (e.g. a bacterial clone)
    design: URI reference to CD or MD design
    build: URI reference to CD or MD build (may be same or different than
        design) -- leave out if identical?
    tests: [0..*] links to ValidationExperiment objects
    build-status: URI (design / under_construction / built)
    validation-status: URI (not_tested / correct / incorrect / ambiguous)

ValidationExperiment (e.g. a single sequencing run)
    Protocol: URI reference to a protocol
    Data: [0..*] links to Data objects
    evaluation: set of URI tags that depend on type of experiment (e.g.
        confirmed / corrupt / incomplete)

Data
    Source: URI (perhaps a reference to an attachment object)
    Format: URI
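
To make the outline above concrete, here is a minimal sketch in Python. The class and field names follow the outline; everything else is an assumption made only for illustration: the example.org URIs, the use of plain strings for URIs and status tags, and the hypothetical helper `validated_implementations`. It is not part of the SEP and not an SBOL library API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Data:
    source: str  # URI, perhaps a reference to an attachment object
    format: str  # URI identifying the data format


@dataclass
class ValidationExperiment:
    protocol: str                                        # URI reference to a protocol
    data: List[Data] = field(default_factory=list)       # [0..*] Data objects
    evaluation: List[str] = field(default_factory=list)  # tags, e.g. "confirmed"


@dataclass
class Implementation:
    design: str                           # URI of the CD or MD design
    build: Optional[str] = None           # URI of the CD or MD build; None if identical
    tests: List[ValidationExperiment] = field(default_factory=list)
    build_status: str = "design"          # design / under_construction / built
    validation_status: str = "not_tested" # not_tested / correct / incorrect / ambiguous


def validated_implementations(impls, design_uri):
    """Hypothetical helper: implementations of a design that are marked correct."""
    return [i for i in impls
            if i.design == design_uri and i.validation_status == "correct"]


# Two clones of one (made-up) plasmid design; only clone1 was confirmed by sequencing.
design = "http://example.org/cd/plasmid_x/1"
seq_run = ValidationExperiment(
    protocol="http://example.org/protocol/sanger_sequencing",
    data=[Data("http://example.org/attachment/clone1.ab1",
               "http://example.org/format/ab1")],
    evaluation=["confirmed"],
)
clone1 = Implementation(design, tests=[seq_run],
                        build_status="built", validation_status="correct")
clone2 = Implementation(design, build_status="built")

confirmed = validated_implementations([clone1, clone2], design)
print(len(confirmed))
```

The point of the sketch is that both clones point to the same design while carrying their own build and validation state, so the CD itself never has to stand in for an individual clone.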
I agree that the iGEM tagging is not a very good example. However, for the
more narrow scope of whether or not something has been correctly built
(never mind whether it actually works as intended), we can define useful
and universal tags. The trick is to keep the scope indeed limited to
"construction as specified by design" and not to get dragged into "this
works" or "this doesn't work".
I have no strong opinion about the Data class. I guess a generic container
for attachments would be extremely useful also for other SBOL classes and
this would be the same as the data object, would it not?
Greetings
Raik
> So, I’m standing by my proposal. I think it is the cleanest approach so
> far. It avoids making changes to existing classes, making it easier to use
> right away. This feature also means that we avoid duplicate CDs/MDs as the
> original proposal was forcing us to do. It makes it really easy to
> determine if a design has been tested, simply look for Experiments
> referencing this design. Finally, it is not really all that different from
> what you have below except that I see the “Experiment” as the organizing
> class.
> On Jul 23, 2017, at 10:36 AM, Raik Grünberg ***@***.***> wrote:
>
> > Hi Chris,
> >
> > I also think the proposals can be unified. However, your suggestion is
> > still missing the (I think) most important point of the discussion: We
> > need a class for "physical implementation" that is not CD or MD (but
> > referring to it). This is because we need to be able to say: "These
> > results apply to this experimental batch / clone / particular batch of
> > cells". Without this concept, CD or MD are turned into representing
> > experimental clones and batches (as it is done in this SEP) which is
> > completely violating the scope of what CD and MD are supposed to
> > represent and will confuse us for years to come (and trigger an
> > avalanche of validation rules that are not needed if we keep this
> > clearly separated). One model would be:
> >
> > Implementation
> > * design -> ComponentDefinition
> > * validation -> ValidationExperiment
> >
> > ValidationExperiment
> > * data
> > * protocol
> > * validation_result: "confirmed" / "failed" / "ambiguous" / "unknown"
> >
> > ComponentDefinition
> > * productionStatus: built
> > * implementations -> ...
> > * prov-o: derived_from -> ProvO record pointing to original design
> >   **if different**
> >
> > Greetings
> > Raik
> >
> > On Sun, Jul 23, 2017 at 10:35 AM, cjmyers ***@***.***> wrote:
> >
> > > Hi,
> > >
> > > I’m going to take a shot at seeing if I can try to unify the proposals.
> > > How about?
> > >
> > > Experiment
> > > Design : URI reference to CD or MD design
> > > Build: URI reference to CD or MD build (may be same or different than
> > > design)
> > > Tests: [0..*] links to Test objects
> > >
> > > Test
> > > Protocol : URI reference to a protocol
> > > Data : [0..*] links to Data objects
> > >
> > > Data
> > > Source : URI (perhaps a reference to an attachment object)
> > > Format : URI
> > >
> > > Chris
> > >
> > > On Jul 22, 2017, at 11:41 PM, Raik Grünberg ***@***.***> wrote:
> > >
> > > > We are talking about physical implementations (or experimental
> > > > realizations) of a given design. This is not related to structure at
> > > > all. Different implementations (for example different clones) need to
> > > > be distinguishable because they may or may not be validated by
> > > > experiments. They may all originate from the same experiment or they
> > > > may be created with different methods in different labs but they all
> > > > point to the same design (CD or MD).
> > > >
> > > > Let me try from another angle: We need a new class "Implementation"
> > > > for the same reason that we have "Component" (a.k.a. SubPart) instead
> > > > of creating a new "ComponentDefinition" each time a part is re-used
> > > > in a sequence design. Or again from another angle: we are crossing a
> > > > boundary here from design to experiment. Using ComponentDefinition or
> > > > ModuleDefinition to enumerate bacterial colonies, cell lines or
> > > > enzyme batches in the lab is just a really bad idea.
> > > >
> > > > Good night,
> > > > Raik
> > > >
> > > > On Sat, Jul 22, 2017 at 9:49 PM, Jacob Beal ***@***.***> wrote:
> > > >
> > > > > I absolutely agree with you that a CD represents structure, and
> > > > > that we should be able to use it to describe real, physical
> > > > > structures. That is exactly why I want to *not* use "shallow" CDs
> > > > > as pointers to "real" CDs. Likewise for MDs.
> > > > >
> > > > > I think we need to separate the "pointer" as a separate class,
> > > > > whatever the right name turns out to be, whether it be "Sample" or
> > > > > "Build" or "Aliquot" or "PhysicalThing" or whatever else might be
> > > > > the best fit for a representation of a physically instantiated
> > > > > design that somehow points to a CD or MD that describes it fully.
|
I think I've discovered the source of our disagreement here. You have a much narrower view of this class than I do. I see a validation experiment as only one possible type of test. My view is that this class should include all experiment types: not just "is this what was intended?" but also "what is its performance?". I think that, if done right, one general class can be used to tie together all elements of design-build-test and not just design-build-validate.
I don't think we are trying to restrict ourselves to just representing samples. The goal of this SEP is representing experiments.
Does this make sense?
Chris
|
On Wed, Jul 26, 2017 at 2:41 PM, cjmyers ***@***.***> wrote:
> I think I've discovered the source of our disagreement here. You have a
> much narrower view of this class than I do. I see a validation experiment
> as only one possible type of test. My view is this class should include all
> experiment types not just is this what was intended but also what is its
> performance. I think if done right one general class can be used to tie
> together all elements of design build test and not just design build
> validate.
Yes, this may be possible right away. I was advocating a more narrow "Build
Validation Experiment" because I think that's the immediate problem we face
and it is very well defined. I am afraid that if we enter discussions about
"performance evaluation", we will keep on discussing for three years
(wouldn't be the first time). But I am happy to be proven wrong.
> I don't think we are trying to restrict ourselves to just representing
> samples. The goal of this SEP is representing experiments.
Let's please eliminate the word "sample" from this discussion thread. We
are NOT talking about samples here. We are talking about clones and cell
lines and batches but not about the physical tubes that contain them. Since
I am one of the few people here who is actually doing experiments, let me
make some definitions:

Batch = Implementation = Physical Realization: one particular realization
of a design.
Typical examples: a clonal population of cells, a batch of DNA extracted
from one single clonal population, a batch of protein purified from one
clonal culture, one clone of a manipulated cell line, one batch of
cell-free extract. A batch is most often distributed over many samples.

Sample: one particular physical container, typically with a label,
typically stored in one clearly defined location or shipped from A to B.
A sample contains one or more types of molecules or cells from one or from
different (but clearly defined) batches at defined concentrations,
typically mixed with clearly defined solvents (e.g. water, buffers) or
media. Samples accumulate a history of experimental manipulations.

Experiment = Test: one particular set of manipulations carried out in one
lab. Methods are defined by a protocol. The starting point is one or more
samples. The results are data and sometimes new samples.
Experiments are usually performed on one out of many samples of a batch.
The results then presumably apply to the whole batch and all samples of it.
So it is generally OK to attach experiments directly to a batch leaving out
the sample information because this is what people most likely care about
outside of your own lab and surroundings. Sample logistics and LIMS type of
information is something we may want to discuss at some point but I propose
we do not discuss it here and now. "Batch" is as far down as we need to go
to represent the design - build - test cycle.
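
As a rough illustration of these three definitions, here is a minimal Python sketch. The class names follow the definitions above, but all field names, the example.org URIs, the freezer locations and the hypothetical helper `experiments_for_batch` are assumptions made for the example, not a proposed data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Batch:
    """One particular physical realization of a design (e.g. one clone)."""
    name: str
    design: str  # URI of the CD or MD design this batch realizes


@dataclass
class Sample:
    """One physical container holding material from one or more batches."""
    label: str
    location: str
    batches: List[Batch] = field(default_factory=list)


@dataclass
class Experiment:
    """One set of manipulations: starts from samples, yields data."""
    protocol: str
    samples: List[Sample] = field(default_factory=list)
    data: List[str] = field(default_factory=list)


def experiments_for_batch(batch, experiments):
    """Experiments run on any sample of the batch.

    The results are presumed to apply to the whole batch, so the sample
    level is skipped when reporting.
    """
    return [e for e in experiments
            if any(batch in s.batches for s in e.samples)]


design = "http://example.org/cd/plasmid_x/1"
clone_a = Batch("clone A", design)
clone_b = Batch("clone B", design)

# Clone A is distributed over two tubes; only one of them is sequenced.
tube_a1 = Sample("A-1", "freezer 1, box 3", [clone_a])
tube_a2 = Sample("A-2", "freezer 2, box 1", [clone_a])
tube_b1 = Sample("B-1", "freezer 1, box 3", [clone_b])

sequencing = Experiment("http://example.org/protocol/sequencing",
                        samples=[tube_a1],
                        data=["http://example.org/data/run1"])

print(len(experiments_for_batch(clone_a, [sequencing])))
print(len(experiments_for_batch(clone_b, [sequencing])))
```

The sequencing run was done on a single tube, yet it is reported against clone A as a whole, while clone B of the same design stays untested. That is the sense in which "Batch" is as far down as the model needs to go.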
Greetings
Raik
…
Does this make sense?
Chris
Sent from my iPhone
> On Jul 26, 2017, at 1:02 PM, Raik Grünberg ***@***.***>
wrote:
>
> On Wed, Jul 26, 2017 at 11:45 AM, cjmyers ***@***.***>
wrote:
>
> > I disagree. The “Experiment” class is the physical implementation class
> > you are looking for. Maybe you just don’t like the name, which is
fine. We
> > can call it Implementation, but I think I like Experiment better,
since an
> > implementation is really about an experiment. Namely, you design
something,
> > you build it, and then you test it. To
>
>
> OK, then this is where the confusion comes from. Physical implementations
> or realizations are particular clones, batches, cell lines etc which are
> NOT experiments. They are the result of experimental work -- for
example, a
> single cloning experiment will generate lots of distinct clones or
batches
> of DNA, some correct, some not. These batches each need to be validated
by
> experiments and they can be used in later experiments or become the basis
> of further construction work. Some clones/cell lines/implementations may
> seem correct now but later experiments reveal that they have an issue.So
we
> need to be able to trace them.
>
> If we rename your "Experiment" class to "Implementation" and your /
Brian's
> "Test" to "Experiment" the two drafts are almost identical.
>
>
> > me, this collection of steps is conducting an experiment, and the
> > Experiment class I propose links them all together. Your approach is
> > missing a link to the physical realization. My proposal has design and
> > build separate because the design may not be correctly realized. It
might
> > be you get a different sequence than intended or the realization
perhaps
> > has scars that are not part of the design, or some other artifact from
> > construction.
> >
>
> Modern experimental practice is not very tolerant of such unintended
> effects. E.g. with gene synthesis and lab-internal cloning you either get
> exactly what you want or the result goes to the bin. So in this
particular
> context (arguably most important for SBOL), your result is either
correctly
> built or not. If you really want to continue with an incorrect built, you
> better make a new version of CD and provO could take care of linking up
to
> the old one.
>
> However, commonly a particular clone or cell line may have acquired
> mutations or changes **outside** of your design area which you are either
> unaware of (yet) or which you don't care about. E.g. offsite-cutting in
> genome engineering or mutations on the plasmid backbone. Or your design
> only said "knock out this gene" and 10 different clones from your CRISPR
> experiment have the correct knockout but all of course look slightly
> different at the sequence level. If you do not want to use provO for this
> but want to have an explicit field for "here is a more detailed
description
> of this particular clone", then I am fine with that. As more experiments
> are run on a particular clone, the "build" Component or Module will also
> become more detailed. So there is still room for provO versioning to be
> used.
>
> So in summary, I agree with your outline and would mostly change names:
>
> Implementation (e.g. a bacterial clone)
> design : URI reference to CD or MD design
> build: URI reference to CD or MD build (may be same or different than
> design) -- leave out if identical?
> tests: [0..*] links to ValidationExperiment objects
> build-status: URI (design / under_construction / built )
> validation-status: URI (not_tested / correct / incorrect / ambiguous )
>
> ValidationExperiment (e.g. a single sequencing run)
> Protocol : URI reference to a protocol
> Data : [0..*] links to Data objects
> evaluation: set of URI tags that depend on type of experiment (e.g.
> confirmed / corrupt / incomplete)
>
> Data
> Source : URI (perhaps a reference to an attachment object)
> Format : URI
>
> I agree that the iGEM tagging is not a very good example. However, for
the
> more narrow scope of whether or not something has been correctly built
> (never mind whether it actually works as intended), we can define useful
> and universal tags. The trick is to keep the scope indeed limited to
> "construction as specified by design" and not to get dragged into "this
> works" or "this doesn't work".
>
> I have no strong opinion about the Data class. I guess a generic
container
> for attachments would be extremely useful also for other SBOL classes and
> this would be the same as the data object, would it not?
>
> Greetings
> Raik
>
>
>
> > So, I’m standing by my proposal. I think it is the cleanest approach so
> > far. It avoids making changes to existing classes, making it easier to
use
> > right away. This feature also means that we avoid duplicate CDs/MDs as
the
> > original proposal was forcing us to do. It makes it really easy to
> > determine if a design has been tested, simply look for Experiments
> > referencing this design. Finally, it is not really all that different
from
> > what you have below except that I see the “Experiment” as the
organizing
> > class.
> >
> > > On Jul 23, 2017, at 10:36 AM, Raik Grünberg <
***@***.***>
> > wrote:
> > >
> > > Hi Chris,
> > >
> > > I also think the proposals can be unified. However, your suggestion
is
> > > still missing the (I think) most important point of the discussion:
We
> > need
> > > a class for "physical implementation" that is not CD or MD (but
referring
> > > to it). This is because we need to be able to say: "These results
apply
> > to
> > > this experimental batch / clone / particular batch of cells". Without
> > this
> > > concept, CD or MD are turned into representing experimental clones
and
> > > batches (as it is done in this SEP) which is completely violating the
> > scope
> > > of what CD and MD are supposed to represent and will confuse us for
years
> > > to come (and trigger an avalanche of validation rules that are not
needed
> > > if we keep this clearly separated). One model would be:
> > >
> > > Implementation
> > > * design -> ComponentDefintion
> > > * validation -> ValidationExperiment
> > >
> > > ValidationExperiment
> > > * data
> > > * protocol
> > > * validation_result: "confirmed" / "failed" / "ambiguous" / "unknown"
> > >
> > > ComponentDefinition
> > > * productionStatus: built
> > > * implementations -> ...
> > > * prov-o: derrived_from -> ProvO record pointing to original design
**if
> > > different**
> > >
> > > Greetings
> > > Raik
> > >
> > >
> > > On Sun, Jul 23, 2017 at 10:35 AM, cjmyers ***@***.***>
> > wrote:
> > >
> > > > Hi,
> > > >
> > > > I’m going to take a shot at seeing if I can try to unify the
proposals.
> > > > How about?
> > > >
> > > > Experiment
> > > > Design : URI reference to CD or MD design
> > > > Build: URI reference to CD or MD build (may be same or different
than
> > > > design)
> > > > Tests: [0..*] links to Test objects
> > > >
> > > > Test
> > > > Protocol : URI reference to a protocol
> > > > Data : [0..*] links to Data objects
> > > >
> > > > Data
> > > > Source : URI (perhaps a reference to an attachment object)
> > > > Format : URI
> > > >
> > > > Chris
> > > >
> > > > > On Jul 22, 2017, at 11:41 PM, Raik Grünberg <
> > ***@***.***>
> > > > wrote:
> > > > >
> > > > > We are talking about physical implementations (or experimental
> > > > > realizations) of a given design. This is not related to
structure at
> > all.
> > > > > Different implementations (for example different clones) need to
be
> > > > > distinguishable because they may or may not be validated by
> > experiments.
> > > > > They may all originate from the same experiment or they may be
> > created
> > > > with
> > > > > different methods in different labs but they all point to the
same
> > design
> > > > > (CD or MD).
> > > > >
> > > > > Let me try from another angle: We need a new class
"Implementation"
> > for
> > > > the
> > > > > same reason that we have "Component" (a.k.a. SubPart) instead of
> > > > creating a
> > > > > new "ComponentDefinition" each time a part is re-used in a
sequence
> > > > design.
> > > > > Or again from another angle: we are crossing a boundary here from
> > design
> > > > to
> > > > > experiment. Using ComponentDefinition or ModuleDefinition to
> > enumerate
> > > > > bacterial colonies, cell lines or enzyme batches in the lab is
just a
> > > > > really bad idea.
> > > > >
> > > > > Good night,
> > > > > Raik
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Jul 22, 2017 at 9:49 PM, Jacob Beal <***@***.***> wrote:
> > > > >
> > > > > > I absolutely agree with you that a CD represents structure, and that we
> > > > > > should be able to use it to describe real, physical structures. That is
> > > > > > exactly why I want to *not* use "shallow" CDs as pointers to "real" CDs.
> > > > > > Likewise for MDs.
> > > > > >
> > > > > > I think we need to separate the "pointer" as a separate class, whatever
> > > > > > the right name turns out to be, whether it be "Sample" or "Build" or
> > > > > > "Aliquot" or "PhysicalThing" or whatever else might be the best fit for a
> > > > > > representation of a physically instantiated design that somehow points to a
> > > > > > CD or MD that describes it fully.
> > > > > >
> > > > > > —
> > > > > > You are receiving this because you commented.
> > > > > > Reply to this email directly, view it on GitHub
> > > > > > <#31 (comment)-
> > 317206311>,
> > > > or mute
> > > > > > the thread
> > > > > > <https://github.com/notifications/unsubscribe-auth/
> > > > ABxs3cQqZUzAqHb7GfhF1MrbPCVplRBVks5sQlIygaJpZM4OTE4a>
> > > > > > .
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > ___________________________________
> > > > > Raik Grünberg
> > > > > http://www.raiks.de/contact.html
> > > > > ___________________________________
> > > > > —
> > > > > You are receiving this because you commented.
> > > > > Reply to this email directly, view it on GitHub <
> > > > #31 (comment)
>,
> > or
> > > > mute the thread <https://github.com/notifications/unsubscribe-
auth/
> > > > ADWD94YmSaCd809ZuznZzKHuTaXnlJfqks5sQmyegaJpZM4OTE4a>.
> > > > >
> > > >
> > > > —
> > > > You are receiving this because you commented.
> > > > Reply to this email directly, view it on GitHub
> > > > <#31 (comment)-
317237866>,
> > or mute
> > > > the thread
> > > > <https://github.com/notifications/unsubscribe-
> > auth/ABxs3Uxg6tEbLk7xvNI_j_2oeei8wvCNks5sQwXQgaJpZM4OTE4a>
> > > > .
> > > >
> > >
> > >
> > >
> > > --
> > > ___________________________________
> > > Raik Grünberg
> > > http://www.raiks.de/contact.html
> > > ___________________________________
> > > —
> > > You are receiving this because you commented.
> > > Reply to this email directly, view it on GitHub <
> > #31 (comment)>,
or
> > mute the thread <https://github.com/notifications/unsubscribe-auth/
> > ADWD940ZwHbyZ6uTlNl5MtHb4vHuf9Rlks5sQxQwgaJpZM4OTE4a>.
> > >
> >
> > —
> > You are receiving this because you commented.
> > Reply to this email directly, view it on GitHub
> > <#31 (comment)>,
or mute
> > the thread
> > <https://github.com/notifications/unsubscribe-auth/ABxs3elXsu5naz_
8dvJNZOzHJ1sx7-uWks5sRwrBgaJpZM4OTE4a>
> > .
> >
>
>
>
> --
> ___________________________________
> Raik Grünberg
> http://www.raiks.de/contact.html
> ___________________________________
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub, or mute the thread.
>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#31 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABxs3Qwpd7-Obm0YV6g7lZWT4SvLJkWfks5sRzP5gaJpZM4OTE4a>
.
--
___________________________________
Raik Grünberg
http://www.raiks.de/contact.html
___________________________________
|
Closing in accordance with changes to SEP issue tracking rules detailed in SEP 001 bcbbcab#diff-44cec2aabf4c066f9a54ac4ef6634b9b |
Linking experimental data with SBOL designs is becoming critical to a number of important projects. Therefore this SEP introduces a Design-Build-Test data model for SBOL. SEP 14 is here.
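As a rough illustration of the kind of model discussed in the comments, here is a minimal Python sketch. It is not the SEP 14 specification. The `protocol`, `data`, `source`, and `format` fields follow the property list quoted in the thread, and the class name `Implementation` follows the suggestion made there. The class name `ExperimentalTest`, the `identity`, `design`, and `tests` fields, and all `example.org` URIs are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Data:
    source: str  # URI, perhaps a reference to an attachment object
    format: str  # URI identifying the data format


@dataclass
class ExperimentalTest:  # hypothetical name for the class holding Protocol and Data
    protocol: str  # URI reference to a protocol
    data: List[Data] = field(default_factory=list)  # [0..*] links to Data objects


@dataclass
class Implementation:  # name suggested in the thread; the fields are assumptions
    identity: str  # URI of this physical instance (a clone, a sample, ...)
    design: str  # URI of the CD or MD that this instance realizes
    tests: List[ExperimentalTest] = field(default_factory=list)  # assumed link


# Two clones realize the same design; only one has sequencing data so far.
DESIGN = "http://example.org/cd/plasmid_1"  # made-up URI
clone_a = Implementation("http://example.org/impl/clone_a", DESIGN)
clone_b = Implementation("http://example.org/impl/clone_b", DESIGN)
clone_a.tests.append(
    ExperimentalTest(
        protocol="http://example.org/protocol/sanger_sequencing",
        data=[
            Data(
                source="http://example.org/attachment/clone_a.ab1",
                format="http://example.org/format/ab1",
            )
        ],
    )
)

# Implementations stay distinguishable even though they share one design.
tested = [impl.identity for impl in (clone_a, clone_b) if impl.tests]
print(tested)
```

The point of the sketch is that experimental data hang off the physical instance, not off the design record, so an untested clone and a validated clone of the same design do not get confused.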