SEP V005: Ambiguities and Variants #9
A number of potential variants of existing glyphs have been proposed, and we need to put them to an up-or-down vote. We also need to clarify the position and/or interior of some of the existing glyphs.
There are ten glyphs potentially affected by these proposals:
Each glyph variant detailed below in its specification has been provided with an individual rationale for that glyph. Examples are also embedded within each proposal.
The assembly scar glyph is an "equal sign" image, the pattern produced by the union of a 5' sticky end and 3' sticky end glyph. The scar will cover the backbone, creating a visual break suggesting the potential disruption associated with a scar:
The coding sequence glyph is a "box" with one side bent out arrow-like to show direction. Its recommended backbone alignment is to the middle:
A block arrow variant is already commonly used in diagrams:
Its recommended alignment will also be to the middle.
Restriction Enzyme Recognition Site (Cleavage Site)
Recommended backbone alignment is centered on backbone:
5' Overhang Site
The 5' overhang site glyph is an image of a strand of DNA extended on the 5' edge of its forward strand:
With a double-stranded backbone:
5' Sticky Restriction Site
The 5' sticky restriction site glyph is an image of the lines along which two strands of DNA will be cut into 5' sticky ends. Vertical position with respect to the backbone is between a double backbone and in a break in a single backbone:
The insulator glyph is a box inside another box that isolates it from its environment.
The backbone will be positioned below the glyph, as insulators are often used with respect to a construct associated with a particular strand (e.g., a promoter):
The operator glyph will be replaced by an open "cup" as in the binding sites of the proposed protein language:
Origin of Replication
The origin of replication glyph is a circle suggesting the "bulge" opened in a piece of circular DNA when replication is beginning:
The user defined component glyph is a plain rectangle. The backbone is RECOMMENDED to be placed at the bottom:
See examples in individual glyph proposals.
All proposals either provide clarity on existing ambiguous glyphs or else propose new non-conflicting variants.
The following proposed options have been considered, but do not have strong support and are thus being removed from consideration unless they pick up significant advocacy. They may be revisited in the future.
Assembly Scar might be on either side of or above the backbone:
CDS backbone alignment might be to the middle:
A number of variants have been proposed; their alignment will match that of CDS except when otherwise noted.
User Defined rectangle:
Other alternatives include a chevron and asymmetric "halved" versions of the current CDS or block arrow:
Restriction Enzyme Recognition Site (Cleavage Site)
Site on top of backbone:
5' Sticky Restriction Site
Vertical position with respect to the backbone might be above the backbone:
The insulator might also have no fill, only the outer box filled, or both boxes filled:
The position of the backbone might also be centered, or hovering below:
Two possible alternate glyphs have also been proposed:
The operator glyph was a box marking a place:
The glyph is proposed to be generalized to Binding Site, which also suggests it might be an open "cup" as in the binding sites of the proposed protein language. The bottom and hover options for alignment do not currently have support:
Its recommended backbone alignment might be middle, bottom, or hovering above:
The notion of binding site might also simply be indicated by generalizing Restriction Enzyme Recognition Site to simply be a generic Recognition Site:
Origin of Replication
The origin of replication might also be above the backbone:
A number of variants have been proposed. Some add asymmetry by:
Other variants make function more symbolic by:
The user defined component might be aligned at the middle, or hovering under the glyphs:
My initial take:
The touching/symmetric on backbone question is complicated by the fact that the backbone can often look like a part of the glyph, since the line may be the same thickness as lines in the glyph (should perhaps render them like this in the SEP to make this point more clear). This means that the scar actually looks like three lines, which is very strange looking. One solution might be that the area between the two lines of the scar is assumed to be filled. This would then cover up the backbone, making it look better. If we do something like this, I would be fine with scar, restriction site, overhang, sticky end all being symmetric on the backbone.
I'm not really sure I like CDS and Operator touching the backbone rather than symmetric. For Operator in particular, if we go with the cup, then we have the bottom line blending in with the backbone, so it will really look like two ends sticking up (or down). If it is symmetric on the strand with a fill, I think it would look pretty good. CDS I could really go either way, but I think it is fine to allow it either touching or symmetric.
Agree we need CDS block arrow option.
My biggest comment is that I would like User Defined to be repurposed to Engineered Region, then have User Defined be truly user defined. Namely, if there is No Glyph Assigned, then one should come up with their own non-conflicting glyph. We suggest a new User Defined default glyph for rendering software, but it should be a glyph that would certainly never need to be repurposed to support another type of feature. Something akin to the diamond/question-mark proposed in V003 for unspecified.
Following this thought, I've now put in a version that has it not "filled" but "empty", specifically putting a break in the backbone, much like the proposed "broken backbone" version of sticky restriction site. This is consistent with how @swapnilb showed Pigeon illustrating overhang sites as well.
I see your point on operator, and am OK with it being symmetric. That would also be consistent with its proposed usage in protein language. CDS, on the other hand, is a "large" glyph and I find it much more compact and interpretable to have it "up" on the same side as the promoter it is typically joined with.
Remember, these are only RECOMMENDED relations, so if somebody has good reason to make a different positioning choice within the bounding box, it is always allowed.
This will certainly be a point of discussion for the follow-on for SEP V003, but that should be a new discussion in a separate SEP. This SEP is only considering the positioning of the box glyph (whatever it might mean) with respect to the backbone.
I'm unclear: are you responding to this as a proposal for a hairpin loop, or as a proposal for "terminator"? This is not a proposal for a hairpin loop glyph, but an alternative terminator glyph proposal.
Under the styling rules of SBOLv, for purposes of these glyphs these two notions are indistinguishable.
I agree that these two should have the same vertical positioning. Do you think that positioning should be symmetric with or above the backbone?
My biggest comment is that I would like User Defined to be repurposed to Engineered Region, then have User Defined be truly user defined. [snip] This will certainly be a point of discussion for the follow-on for SEP V003, but that should be a new discussion in a separate SEP. This SEP is only considering the positioning of the box glyph (whatever it might mean) with respect to the backbone.
Ok, I agree it should be touching the line, and we can discuss what it is used for later. Chris
@jakebeal As to positioning, I think it should be left to users, with the explanation of what position is generally intended to convey, given in the proposal.
I don't see what you mean by annular ring is same as circle. Is/Isn't fill part of the definition? I am saying that the Origin of X (Replication and Transfer as of now) are/should be both annular rings, meaning that fill is restricted to annular ring. The disk circumscribed is not filled. Of course, users may obfuscate the removed disk by coloring it the same as the ring.
@swapnilb Actually, per the adopted SEP V001, we need to recommend some preferred position:
Users are, of course, free to ignore the recommendation if they have good reason to do so.
On the difference between circle and annular ring: I do see how fill does distinguish. The current proposal for ORI-T is for a circle-with-arrow, which would match the current version of ORI. That also matches the language of the original proposal in the linked thread, as well as the literature version linked from the thread. This does not, of course, preclude proposing to change these glyphs from circle to ring, but I would want to hear what the argument is in favor of doing so.
OK, that's correct. (But annular ring and circle are not indistinguishable under styling conventions.)
Does this rule out having two recommendations?
In the statement:
It is not "one or more horizontal lines" but instead is "a" horizontal line, meaning one.
That's what was discussed at HARMONY, and we had a long discussion about this question there. The issues with multiple recommendations are:
Furthermore, to your statement:
Actually, it does: "a" is a singular article. If more than one was allowed, then the spec would say: "at least one."
Actually, it doesn't; see my example above: "x^2 - 2 = 0 has a solution." "There MUST be a RECOMMENDED route." As is clear, ordinarily, "a" means "one or more."
When there is need to narrow the meaning of ordinary words, it is a good practice to use more specific, explicitly constraining language. When no constraints are intended, no additional language need be used, as the meaning is ordinarily clear. This is standard practice in writing specs.
Regarding the issue at hand; it is not always possible to break ambiguity and provide a useful recommendation grounded in any thoughtful analysis. By forcing a recommendation, it makes the spec less robust, not more. Therefore, it should be possible to NOT provide a recommendation, also a standard option in most formal documents.
Therefore, I propose that it be possible to NOT recommend a position, and that we do so for the OriT glyph.
@swapnilb Please feel free to open a new SEP and new discussion on the list to propose changing the cardinality of backbone position recommendation from SEP V001's specification of "exactly one" to a modified cardinality of "zero or more."
For now, however, my understanding, based on the community's prior discussions, is that the cardinality of backbone position recommendation, as specified in SEP V001, is "exactly one," and I propose to continue discussion of this SEP based on that understanding.
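To make the two readings of the cardinality concrete, here is a minimal sketch of how a checker might treat them. Everything in it is an assumption for illustration: the alignment labels, the glyph keys, and the function names are hypothetical and do not come from SEP V001 or any SBOL tooling. The sample data mirrors the positions discussed in this thread (CDS at middle, User Defined at bottom, and no recommendation for OriT as proposed above).

```python
from typing import Dict, List

# Hypothetical alignment labels, chosen only for this illustration.
ALIGNMENTS = {"top", "middle", "bottom"}


def violations_exactly_one(recs: Dict[str, List[str]]) -> List[str]:
    """Glyphs that break the "exactly one recommended position" reading."""
    bad = []
    for glyph, positions in recs.items():
        if len(positions) != 1 or positions[0] not in ALIGNMENTS:
            bad.append(glyph)
    return sorted(bad)


def violations_zero_or_more(recs: Dict[str, List[str]]) -> List[str]:
    """Glyphs that break the "zero or more recommended positions" reading."""
    bad = []
    for glyph, positions in recs.items():
        # Any count is allowed, but every listed position must be a known label.
        if any(p not in ALIGNMENTS for p in positions):
            bad.append(glyph)
    return sorted(bad)


# Hypothetical glyph keys; positions follow the discussion in this thread.
example = {
    "cds": ["middle"],
    "user-defined": ["bottom"],
    "ori-t": [],  # no recommendation, as proposed for OriT
}

print(violations_exactly_one(example))
print(violations_zero_or_more(example))
```

Under the "exactly one" reading the empty OriT entry is flagged, while under "zero or more" it passes, which is the whole of the disagreement in this sketch.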
I have updated with my current understanding of the state of discussion:
Need more discussion:
Can people please weigh in on Assembly Scar, Insulator, and ORI?
@cjmyers I would be happy with those for scar and ORI.
For insulator --- what is your reason for suggesting symmetric vs. bottom? I have thought of insulators as being more related to a particular strand (i.e. the things you're trying to insulate), but don't have a strong preference.
Also, regarding insulator: any thoughts on which of the four fills makes most sense?
I have resolved Assembly Scar, ORI, and insulator's alignment per this discussion, and have also added an example showing how assembly scar covers a double-stranded backbone.
On the interior for insulator, I suggest that we choose the option where the inner box is the interior. My reasoning:
Please flag concerns if you disagree with any of these changes; otherwise, I think we are nearly ready for a vote.