Distinguish between small and large subunit rRNA in organelles #493
Comments
👋 Hello! Just wanted to let you know that RNAcentral has now switched to using Sequence Ontology terms as the main classification of RNA types. You can now browse RNAcentral by SO term using the RNA type facet. Are there any updates on this issue regarding rRNAs? As we continue to improve the RNAcentral algorithms for assigning SO terms, rationalising parts of the ncRNA SO subtree would really help us and our users. Many thanks for looking into this! |
+1 to this request |
Hi guys, I think it makes sense to use the first structure that you proposed. This would involve creating several terms and moving a couple of terms. Below are my proposed changes for how to make this happen. Please look through all of my proposed changes, especially definitions and locations to help me make sure that I have not made any mistakes. I used markdown indenting. It looks a little funny, but I think you can see how it is structured without a problem.
Best, Dave |
Hi Dave Looking good. Three comments:
|
Do you want to add any groups for prokaryotic vs eukaryotic rRNAs? The current large_subunit_rRNA and small_subunit_rRNA SO terms are parents of both, and I guess it's fine to consider the prokaryotic rRNAs to be "cytoplasmic". But if it's valuable to designate the rRNA terms that are specific for organelles, then it seems reasonable to do the same for prokaryotic subunits? |
@davidwsant Thank you for looking into this Dave! Your proposed subtree makes sense to me (reproducing below in a condensed form):
As pointed out by Steven, I do still think that it's important to have a new grouping for the current @murphyte As you say, I think it's fine to call prokaryotic rRNAs cytoplasmic. I can see that it can be useful to distinguish the eu- and prokaryotic rRNAs but I don't have a strong opinion on this. @sjm41 I am a bit confused by your point 3 about origins vs location of function. The mito_rRNAs are encoded in the mito genome and function in the mitochondria (same for the plastids), so the origin and the location seem to be the same (unless I misunderstood your point)? |
Hi @AntonPetrov My point 3 was thinking about the name and (more importantly!) the definition of the proposed "cytoplasmic rRNA" group/names, and making the def consistent with the defs for mitochondrial/plastid rRNA. I agree it makes sense to name/group the 'main' rRNAs as "cytoplasmic rRNAs". And it would also make sense to define these terms as "functioning in the cytoplasm" (primarily), rather than following the pattern for the proposed organellar/plastid defs above and define them as "derived from the genome of the nucleus" (which would prohibit usage for prokaryotic rRNAs). So, from that point of view, and looking at the current def of "rRNA (SO:0000252)" (RNA that comprises part of a ribosome, and that can provide both structural scaffolding and catalytic activity), I'd suggest new/changed defs along these lines: name: cytoplasmic_rRNA name: organellar_rRNA name: mt_rRNA name: plastid_rRNA |
I appreciate that all of you have weighed in on this. Thanks for pointing out the typo. I admit, I make lots of typos and I appreciate it when other people catch them. It looks like I forgot to copy over my notes for a proposed new term cytoplasmic rRNA, as Anton suggested. Here is what I currently have for that term: name: cytoplasmic_rRNA I don't think we even need to mention that it comes from the nuclear genome in eukaryotes. Do you guys think that part is important? As for small_subunit_rRNA and large_subunit_rRNA, do you think we should just make the child terms have a second is_a relationship to cytoplasmic_rRNA, mt_rRNA or plastid_rRNA? I have made a powerpoint picture to show how I think this setup would be. The circle is just to group all of the large_subunit_rRNA children together because including all of the arrows made it too noisy. What do you guys think of this setup? |
I don't think we even need to mention that it comes from the nuclear genome in eukaryotes. Do you guys think that part is important? I like your ppt suggestion - works for me! |
I like the ppt too - thank you!
I think this is a good idea ☝️. If I understand it correctly, this will make it clear that a 5S rRNA can be in cytoplasm, or mitochondria, or plastids, which makes sense. Many thanks for looking into this, it will be very useful for us at RNAcentral! Also, I agree with name changes proposed by Steven @sjm41 |
Thank you guys for looking over this. I actually think there is a problem with the setup. I didn't realize that 5S rRNA was in cytoplasmic ribosomes as well as mitochondrial ribosomes. If it is, then I don't want to include an is_a to cytoplasmic_rRNA. Instances of 5S mitochondrial rRNA would then be annotated with both rRNA_5S and with mitochondrial_large_subunit_rRNA while instances of cytoplasmic 5S rRNA would be annotated with both rRNA_5S and cytoplasmic_rRNA. Is this correct? Are any of the other rRNA subtypes found in both cytoplasmic rRNA and in organellar rRNA?
Thanks, Dave |
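To make the dual-annotation idea above concrete, here is a minimal Python sketch, not anything taken from the SO tooling. The sequence identifiers are hypothetical, and the term names are the ones used in this thread; each instance simply carries a set of SO term names, and a query asks for instances that carry all of the requested terms.

```python
# Hypothetical sequence IDs mapped to the set of SO term names assigned to them.
# The IDs are made up for illustration; the term names come from this thread.
ANNOTATIONS = {
    "seq_mito_5S": {"rRNA_5S", "mitochondrial_large_subunit_rRNA"},
    "seq_cyto_5S": {"rRNA_5S", "cytoplasmic_rRNA"},
    "seq_mito_LSU": {"mitochondrial_large_subunit_rRNA"},
}


def with_all_terms(annotations, *terms):
    """Return the sorted IDs of instances annotated with every term in `terms`."""
    wanted = set(terms)
    return sorted(
        seq_id for seq_id, assigned in annotations.items() if wanted <= assigned
    )


if __name__ == "__main__":
    # All 5S rRNAs, regardless of compartment.
    print(with_all_terms(ANNOTATIONS, "rRNA_5S"))
    # Only the mitochondrial 5S rRNA.
    print(with_all_terms(ANNOTATIONS, "rRNA_5S", "mitochondrial_large_subunit_rRNA"))
```

With this layout, rRNA_5S itself needs no is_a link to cytoplasmic_rRNA; the compartment is carried by the second annotation.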
Hi Dave I didn't know about the 5S rRNA in mitochondria.... Seems mammalian mitoribosomes were thought to contain 5S (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC25503/ https://pubmed.ncbi.nlm.nih.gov/21685364/), but that was subsequently disproved (e.g. https://www.pnas.org/content/113/43/12198) But seems 5S rRNA is present in mitoribosomes of plants and protozoa, and of plastid ribosomes: What's unclear to me is whether there is a distinct 5S rRNA species (encoded by the mito/plastid genome) that is incorporated into those ribosomes, or if it's the same 5S rRNA as being used in cytoribosomes (encoded by the nuclear genome). If they are distinct, then we can deal with that within the structure proposed in your ppt by having 2 distinct 5S rRNA entries with different parentage. Need to do some more reading.... @AntonPetrov - are there examples of the mito/plastid 5S rRNA in RNAcentral? Can we ask the rRNA experts in the RNAcentral consortium about this? |
Thanks for looking into this a little further for me. In that case, I think the structure in the powerpoint is still correct. Here is what I have for the definitions and relationships of new terms. Sorry about the double spaces, those are an artifact of the way I copied and pasted. id: new id: SO:new id: SO:0002128 id: SO:new id: SO:new id: SO:new id: SO:new id: SO:new In addition to these new terms, I will add "is_a: SO:new! cytoplasmic_rRNA" relationships to these terms:
Does this all look correct? Do you have any requests for changes to these definitions or names? Best, Dave |
@sjm41 Good points about the mito-5S taxonomic distribution. This 5S has distinct structural features and has to be modelled by a specialised Rfam family (RF02547). It is encoded in the mito-genomes, but not all of them, as you pointed out. Let's ask @aspetr01 - Anton, does the following diagram make sense to you? As you can see from this long thread we are trying to revisit the rRNA subtree in Sequence Ontology. Any input will be greatly appreciated! I am copying the image from the Powerpoint by @davidwsant below for ease of reference: |
@davidwsant said: I think @AntonPetrov is confirming that there are two different types of 5S rRNA - the 'regular' 5S rRNA that is part of the large subunit of cytoribosomes, and a distinct 5S rRNA that is part of (some) mitochondrial and plastid ribosomes. If we're going to keep the current rRNA_5S (SO:0000652) term to refer solely to the cyto form, then don't we need a new term (with different parentage) for the mito/plastid 5S rRNA? I also noticed the parentage for "rRNA_21S" will need changing from 'cytoplasmic rRNA' to 'mt_rRNA' (def could be improved too): rRNA_21S (SO:0001171) |
Yes, rRNA_21S will need to be moved. I think a better parent for rRNA_21S would be mitochondrial_large_subunit_rRNA. What new definition would you suggest? As for the rRNA_5S, if we want to discuss the mito/plastid ribosomes then we should perhaps change the name to cytoplasmic_5S_rRNA, but keep the definition the same. No one has ever asked for a 5S rRNA for plastid or mitochondria specifically in the past. The wiki page references look like they are mostly about Amoeba. I don't know that it is necessary to add a term that is unlikely to be used. If we do add terms for these, I am thinking we would name them "mitochondrial_5S_rRNA" and "plastid_5S_rRNA", and include information about how they are found in plants, along with the references that @sjm41 included above. I think we will need a few more peer reviewed publications if we want to add these terms. @AntonPetrov, @keilbeck what is your take on this? -Dave |
Hi @davidwsant I think a better parent for rRNA_21S would be mitochondrial_large_subunit_rRNA. I believe there's only one small subunit mito rRNA (https://www.nature.com/articles/srep04089 - is that right @AntonPetrov ? @aspetr01 ?), so maybe "mitochondrial_small_subunit_rRNA" could/should just become "mitochondrial_12S_rRNA", or else keep the current name and state the size in the definition. I can't easily find info on the composition and sizes of plastid rRNAs - some limited info available for chloroplasts, but whether that applies to all plastids is unclear to me. So maybe best to leave the plastid rRNAs as just "plastid_small_subunit_rRNA" and "plastid_large_subunit_rRNA". What new definition would you suggest [for rRNA_21S]? As for the rRNA_5S, if we want to discuss the mito/plastid ribosomes then we should perhaps change the name to cytoplasmic_5S_rRNA, but keep the definition the same. Also, here's another good reference for the mito/plastid 5S rRNAs: |
Hi @davidwsant , @AntonPetrov Did my previous comment make sense? Can we move forward with implementing these changes? |
Sorry for the delay. As suggested by @aspetr01 over email, the use of sedimentation constants (16S, 18S etc) is not great because in many species like D. melanogaster the rRNAs are fragmented, but the reliance on molecule size in the current SO classification implies that the ribosomal subunits are made of continuous sequences. I am not advocating the removal or renaming of the existing terms because they are widely used, just pointing out that they only apply to the non-fragmented rRNAs and it might be a good idea to refrain from adding more terms with sedimentation constants. As long as the updated tree includes a consistent set of large and small subunits for organelles and non-organelles (which it does), I will be very happy with the changes. |
Hi @sjm41, You mentioned moving the rRNA_16S to mitochondrial_large_subunit_rRNA. Everything I have looked at suggests that this functions as the small subunit. This is present in bacteria and archaea, which would fall under "cytoplasmic rRNA". I think rather than changing the name of rRNA_16S and adding a second term, it might be better to add a child term 'mitochondrial_16S_rRNA' that would have parents rRNA_16S and mitochondrial_small_subunit_rRNA. How does that sound? The reason I want to make child terms rather than changing the name of a term and adding another sister term is that these terms are already in use, and changing 'rRNA_5S' to 'cytoplasmic_5S_rRNA' would make any instances currently using this term for mitochondrial 5S rRNA incorrect. The same would go for the 16S. Adding a child term allows us to add a new level of specificity, but will not make other annotations using the existing terms incorrect. If plastid or cytoplasmic 5S rRNA are specifically requested later we can add them, but the other terms are in use and will remain correct for those instances. Here is the new layout if we do things this way: As for the new definition that you are proposing for 21S rRNA, it states that 21S rRNA is only in yeast mitochondria. Is that correct? If so, then we should update the definition as suggested. How does this look overall? Dave |
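The child-term argument above can be sketched in Python as a small is_a graph. This is only an illustration under stated assumptions: the two parents of mitochondrial_16S_rRNA are the ones proposed in the comment, while every other edge (for example rRNA_16S under small_subunit_rRNA) is assumed here for the sake of a runnable example and is not a statement of the final SO structure.

```python
# term -> direct is_a parents.
# Only the two parents of mitochondrial_16S_rRNA come from the proposal above;
# all other edges are assumptions made for illustration.
IS_A = {
    "rRNA": [],
    "small_subunit_rRNA": ["rRNA"],
    "rRNA_16S": ["small_subunit_rRNA"],
    "organellar_rRNA": ["rRNA"],
    "mt_rRNA": ["organellar_rRNA"],
    "mitochondrial_small_subunit_rRNA": ["mt_rRNA"],
    "mitochondrial_16S_rRNA": ["rRNA_16S", "mitochondrial_small_subunit_rRNA"],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a edges upward."""
    seen = set()
    stack = list(is_a[term])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a[parent])
    return seen


def matches(annotated_term, query_term, is_a=IS_A):
    """True if an instance annotated with `annotated_term` satisfies `query_term`."""
    return annotated_term == query_term or query_term in ancestors(annotated_term, is_a)


if __name__ == "__main__":
    # An existing annotation to rRNA_16S is untouched by adding the child term.
    print(matches("rRNA_16S", "rRNA_16S"))
    # A new, more specific annotation is still found by a query for rRNA_16S.
    print(matches("mitochondrial_16S_rRNA", "rRNA_16S"))
    # The generic term does not become mitochondrial by accident.
    print(matches("rRNA_16S", "mt_rRNA"))
```

The point of the sketch is that adding a term only adds edges below existing terms, so anything already annotated with rRNA_16S keeps exactly the ancestors it had before.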
Thanks @AntonPetrov , @davidwsant . This is all trickier than it looked at first.... I'm afraid I've several points to make....would it be easier to arrange a video call to discuss?
|
I think you are right and that we should have a zoom meeting to discuss. I still would prefer keeping large_subunit_rRNA and small_subunit_rRNA as direct children of rRNA. If we move them to the new structure suggested, we probably need to change the names to include "cytoplasmic" so that it is unambiguous. These terms have been in use since before the ontology editors we used added creation dates to terms (sometime in 2012). I'm afraid if we rename these terms it will make several annotations false. @AntonPetrov, what do you think about this? Best, Dave Sorry, forgot to log out of my work GitHub account, this is from @davidwsant |
I’m available at 4pm UK time on Thursday or Friday next week - slight preference for Thursday.
Cheers,
Steven.
|
Both times suggested by @sjm41 work for me as well - it will indeed be easier to discuss during a call. I like Steven's version as it's very streamlined and logical but I see @davidwsant point about the danger of renaming such old terms. Look forward to our discussion! |
Let's plan on Thursday. I will send a zoom invite via email. --Dave |
… structure. See GitHub Issue #493
Hi all, Thank you for your several comments and for even meeting through zoom to go over all of the changes. Definitions have been updated to have a similar structure for all definitions (special thank you to Steven Marygold). New terms have been added to distinguish rRNA from genomes of organelles vs the nuclear genome. Commit 9f923ba has been pushed to GitHub. The SO Browser should reflect these changes within 24 hours. GitHub Issue #513 will address the issue of the rRNA genes that mirror these terms. Best, Dave Sant |
Many thanks to you David for your patience and time with this! |
SO term name and accession
mt_rRNA (SO:0002128)
Parent term name and accession
rRNA (SO:0000252)
Suggested new parent term name and accession
The current location of mt_rRNA as a sibling term to LSU and SSU rRNA does not allow one to distinguish between cytoplasmic rRNAs and organellar mt_rRNAs, which have very distinct, usually more compact, structures compared to the cytoplasmic counterparts.
Perhaps the rRNA subtree could look something like this:
rRNA
----cytoplasmic rRNA
--------small subunit cytoplasmic rRNA
--------large subunit cytoplasmic rRNA
----organellar rRNA
--------mt_rRNA
------------small subunit mitochondrial rRNA
------------large subunit mitochondrial rRNA
--------plastid_rRNA
------------small subunit plastid rRNA
------------large subunit plastid rRNA
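For readers who want to work with the subtree above programmatically, here is a minimal Python sketch that encodes it as a nested dict and looks up the lineage of a term. The names are copied from the tree as written in this issue; the helper functions `lineage` and `leaves` are hypothetical and not part of any SO or RNAcentral library.

```python
# The proposed subtree from this issue: term name -> dict of child terms.
PROPOSED_TREE = {
    "rRNA": {
        "cytoplasmic rRNA": {
            "small subunit cytoplasmic rRNA": {},
            "large subunit cytoplasmic rRNA": {},
        },
        "organellar rRNA": {
            "mt_rRNA": {
                "small subunit mitochondrial rRNA": {},
                "large subunit mitochondrial rRNA": {},
            },
            "plastid_rRNA": {
                "small subunit plastid rRNA": {},
                "large subunit plastid rRNA": {},
            },
        },
    }
}


def lineage(tree, target, path=()):
    """Return the tuple of terms from the root down to `target`, or None."""
    for name, children in tree.items():
        here = path + (name,)
        if name == target:
            return here
        found = lineage(children, target, here)
        if found is not None:
            return found
    return None


def leaves(tree):
    """Yield every term that has no children."""
    for name, children in tree.items():
        if children:
            yield from leaves(children)
        else:
            yield name


if __name__ == "__main__":
    print(" > ".join(lineage(PROPOSED_TREE, "large subunit mitochondrial rRNA")))
    print(list(leaves(PROPOSED_TREE)))
```

In this layout every leaf states both the compartment and the subunit, which is what would let a search distinguish SSU from LSU within mitochondria and plastids.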
There could be other solutions, for example, one could introduce mito_large_subunit_rRNA and plastid_large_subunit_rRNA as child terms under large_subunit_rRNA (similar for SSU).
Reason for the change
In addition to clarifying the rRNA subtree hierarchy, the change would allow RNAcentral to enable better searches for rRNAs, as we can classify mtRNAs into LSU and SSU subtypes for mitochondria and plastids.
My colleague Blake Sweeney @blakesweeney is in the process of switching RNAcentral to using Sequence Ontology as the main RNA types classification, and it would really help our users if we could consistently annotate rRNAs which is one of the largest sequence classes in RNAcentral. Here is our current rRNA subtree but it would be so much nicer if mtRNAs could be organised as SSU and LSU:
Relevant Publications
https://pubmed.ncbi.nlm.nih.gov/3044395/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4008552/
Please let me know if you have any questions and many thanks in advance for looking into this!