# **Ribosomal RNA Orientation**

We recently noticed that in our chloroplast genome the ribosomal RNA (rRNA) genes appear on the same side in both inverted repeats (IRs). This seems suspicious, so to be sure, we are going to check other chloroplasts from model organisms to see whether this phenomenon occurs elsewhere.

To inspect the orientation of the features more easily, we use [OGDRAW](https://chlorobox.mpimp-golm.mpg.de/OGDraw.html) to generate figures that show the orientation of the annotations of each chloroplast. Fortunately, OGDRAW can directly load accessions from the NCBI RefSeq database.

Here is how our chloroplast looks:

<img src="Chloroplasts_images/sacha_most_recent_cp.jpg">

Specifically, the suspicious features are **rrn23, rrn16, rrn4.5, rrn5, and trnI-GAU**.

Let us look at some other chloroplasts and count in how many of them we find this phenomenon.

### **Arabidopsis thaliana**

<img src="Chloroplasts_images/arabidopsis.jpg">

In _Arabidopsis thaliana_ we find perfect symmetry in the IRs.

### **Glycine max**

<img src="Chloroplasts_images/glycine_max.jpg">

In _Glycine max_ we find perfect symmetry in the IRs.

### **Nicotiana tabacum**

<img src="Chloroplasts_images/nicotiana.jpg">

In _Nicotiana tabacum_ we also observe perfect symmetry in the IRs.

### **Manihot esculenta**

<img src="Chloroplasts_images/manihot.jpg">

Even though rrn23 is missing from one of the IRs, we observe symmetry in the other features of _Manihot esculenta_.

### **Jatropha curcas**

<img src="Chloroplasts_images/jatropha.jpg">

We observe symmetry in _Jatropha curcas_.

### **Ricinus communis**

<img src="Chloroplasts_images/ricinus.jpg">

We find that the _Ricinus communis_ chloroplast presents the same phenomenon that we found in ours.

## **Summary**

| Species | Perfect symmetry in IRs |
| ------- | ----------------------- |
| _A. thaliana_ | Yes |
| _G. max_ | Yes |
| _N. tabacum_ | Yes |
| _M. esculenta_ | Yes* |
| _J. curcas_ | Yes |
| _R. communis_ | No | 


*Missing __rrn23__ in one of the IRs

This could still make evolutionary sense, since the phenomenon could be restricted to the _Acalyphoideae_ subfamily within the _Euphorbiaceae_, to which both _R. communis_ and our species belong.

Unfortunately, this is not discussed by [Rivarola _et al._, 2011](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0021743), who report the _Ricinus communis_ chloroplast genome. Nevertheless, their Figure 1 seems to show IRs with perfect symmetry (as we found in the other chloroplasts), contrary to what we obtained with OGDRAW.

So what is the truth? In this case, it is best to go to the source, so let us look at the GenBank record at NCBI.

First, we download the _Ricinus communis_ chloroplast genome from RefSeq ([NC_016736.1](https://www.ncbi.nlm.nih.gov/nuccore/NC_016736.1)), which is the sequence we used to produce the figure shown above.

In [4]:
from Bio import Entrez, SeqIO

In [3]:
Entrez.email = "svillanu@eafit.edu.co"

In [5]:
# Download the GenBank record from NCBI and parse it into a SeqRecord
handle = Entrez.efetch(db="nucleotide", id="NC_016736.1", retmode="text", rettype="gb")
ricinus = SeqIO.read(handle, "gb")
handle.close()

Now that we have loaded the record, let us look at the rRNA features:

In [6]:
for i,ft in enumerate(ricinus.features):
    if ft.type == "rRNA":
        print(i, ft.qualifiers["gene"][0], ft.location)

189 rrn16 [107228:108719](+)
195 rrn23 [111123:113933](+)
198 rrn4.5 [114031:114134](+)
201 rrn5 [114358:114479](+)
240 rrn5 [138333:138454](+)
243 rrn4.5 [138678:138781](+)
245 rrn23 [138879:141689](+)
252 rrn16 [144093:145584](+)


As can be seen above, all rRNAs are annotated on the same strand, which explains why OGDRAW draws the figure the way it does. However, this is not consistent with Rivarola _et al._ In theory, because the two IRs are inverted copies of each other, one copy of each rRNA gene should be on the opposite strand from the other copy.
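The coordinates printed above already give a hint. In an inverted repeat, a gene spanning `[s, e)` in one copy has its partner at `[K - e, K - s)` in the other copy for a single constant `K`, so `s + e'` and `e + s'` should both equal `K` for every gene. A direct repeat would instead keep the same gene order and a constant offset between the starts. The sketch below checks this for the eight rRNA features; the coordinates are copied by hand from the listing above, and the variable names are ours, for illustration only.

```python
# Coordinates copied from the rRNA listing above (Biopython: 0-based, half-open).
# Each gene maps to (first copy, second copy), each given as (start, end).
rrna_copies = {
    "rrn16":  ((107228, 108719), (144093, 145584)),
    "rrn23":  ((111123, 113933), (138879, 141689)),
    "rrn4.5": ((114031, 114134), (138678, 138781)),
    "rrn5":   ((114358, 114479), (138333, 138454)),
}

mirror_constants = set()
for gene, ((s1, e1), (s2, e2)) in rrna_copies.items():
    # For mirrored intervals: start + partner end == end + partner start == K.
    mirror_constants.add(s1 + e2)
    mirror_constants.add(e1 + s2)
    print(gene, s1 + e2, e1 + s2)

# One shared constant means the two copies are laid out as mirror images.
print("mirror-symmetric layout:", len(mirror_constants) == 1)
```

With these coordinates every sum equals 252812, so the two sets of features are laid out as mirror images. This only concerns positions and lengths; it says nothing about the bases themselves, so the sequences still have to be compared.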

So the annotation is inconsistent with the publication, but is **the sequence** inconsistent as well? Is the annotation correctly describing a sequence that has both copies of the ribosomal genes on the same strand? Or is the sequence correct, with the rRNAs oriented as reported by Rivarola _et al._, and the annotation wrong?

To test this, we are going to extract the sequences associated with both copies of each ribosomal feature and compare them. Since both copies are annotated on the `+` strand, both are extracted exactly as written in the record. If the former hypothesis is true, the extracted sequences should be identical. If the latter is true, one extracted sequence should be the reverse complement of the other.
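The logic of this test can be sketched without Biopython. The helper names `reverse_complement` and `compare_copies` and the toy sequences below are made up for illustration; they are not part of the analysis itself.

```python
# Dependency-free sketch of the test described above.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def compare_copies(copy_a: str, copy_b: str) -> str:
    """Classify how two sequences read from the same strand relate to each other."""
    if copy_a == copy_b:
        return "identical"           # both copies really point the same way
    if copy_a == reverse_complement(copy_b):
        return "reverse_complement"  # copies are inverted: one strand label is wrong
    return "different"

# Toy sequence, invented for this example (not taken from the genome).
first_copy = "ATGGCCATTCA"
print(compare_copies(first_copy, first_copy))                      # identical
print(compare_copies(first_copy, reverse_complement(first_copy)))  # reverse_complement
print(compare_copies(first_copy, "ATGGCCATTCC"))                   # different
```

A sequence that happens to equal its own reverse complement would satisfy both conditions; for genes as long as the rRNAs this is not a practical concern.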

Testing with rrn23:

In [9]:
ricinus.features[195].extract(ricinus).seq

Seq('TTCAAACGAGGAAAGGCTTACGGTGGATACCTAGGCACCCAGAGACGAGGAAGG...CCT', IUPACAmbiguousDNA())

In [12]:
ricinus.features[245].extract(ricinus).seq.reverse_complement()

Seq('TTCAAACGAGGAAAGGCTTACGGTGGATACCTAGGCACCCAGAGACGAGGAAGG...CCT', IUPACAmbiguousDNA())

In [13]:
ricinus.features[195].extract(ricinus).seq == ricinus.features[245].extract(ricinus).seq.reverse_complement()

True

Indeed, the extracted sequences are reverse complements of each other, which means that the rRNAs of the IRA are incorrectly annotated: they point to the wrong strand (it should be `-` instead of `+`).
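To make the proposed correction concrete, the sketch below formats the second-copy coordinates as GenBank-style location strings, first as currently annotated (`+`) and then with the strand we would expect (`-`). It assumes the usual conventions: Biopython locations are 0-based and half-open, while GenBank flat files use 1-based, inclusive coordinates and wrap minus-strand features in `complement(...)`. `to_genbank_location` is a hypothetical helper, not a Biopython function, and the coordinates are copied by hand from the rRNA listing above.

```python
def to_genbank_location(start: int, end: int, strand: int) -> str:
    """Format a 0-based, half-open interval as a GenBank-style location string."""
    span = f"{start + 1}..{end}"  # GenBank is 1-based and end-inclusive
    return f"complement({span})" if strand == -1 else span

# Second copies of the rRNA genes, coordinates copied from the listing above.
second_copies = {
    "rrn5":   (138333, 138454),
    "rrn4.5": (138678, 138781),
    "rrn23":  (138879, 141689),
    "rrn16":  (144093, 145584),
}

for gene, (start, end) in second_copies.items():
    as_annotated = to_genbank_location(start, end, strand=+1)
    as_expected = to_genbank_location(start, end, strand=-1)
    print(f"{gene}: {as_annotated} -> {as_expected}")
```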

Just to confirm, let us check with the other rRNAs:

In [14]:
ricinus.features[189].extract(ricinus).seq == ricinus.features[252].extract(ricinus).seq.reverse_complement()

True

In [15]:
ricinus.features[198].extract(ricinus).seq == ricinus.features[243].extract(ricinus).seq.reverse_complement()

True

In [16]:
ricinus.features[201].extract(ricinus).seq == ricinus.features[240].extract(ricinus).seq.reverse_complement()

True
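The four pairwise checks above rely on hard-coded feature indices. A more general sketch groups features by gene name and flags every duplicated gene whose two copies are reverse complements in sequence yet carry the same strand label. To stay self-contained, it uses a toy genome string and plain `(gene, start, end, strand)` tuples instead of Biopython features; all names and data below are hypothetical.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase, unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_strand_conflicts(genome, features):
    """Return genes whose two copies are inverted in sequence but share a strand label.

    `features` is a list of (gene, start, end, strand) tuples, 0-based and half-open.
    """
    by_gene = defaultdict(list)
    for gene, start, end, strand in features:
        by_gene[gene].append((start, end, strand))

    conflicts = []
    for gene, copies in by_gene.items():
        if len(copies) != 2:
            continue  # only genes present in exactly two copies are checked
        (s1, e1, strand1), (s2, e2, strand2) = copies
        inverted = genome[s1:e1] == reverse_complement(genome[s2:e2])
        if inverted and strand1 == strand2:
            conflicts.append(gene)
    return conflicts

# Toy genome: flank + repeat + spacer + inverted copy of the repeat + flank.
repeat = "ATGGCCATTCA"
genome = "GGGG" + repeat + "TTTTT" + reverse_complement(repeat) + "CC"
features = [
    ("geneA", 4, 15, +1),
    ("geneA", 20, 31, +1),  # mislabelled on purpose: should be -1
    ("geneB", 0, 4, +1),    # single-copy gene, skipped by the check
]
print(find_strand_conflicts(genome, features))  # ['geneA']
```

A check along these lines could also be run on a newly annotated genome to see whether the same strand labels were inherited from a reference.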

## **Conclusion**

The _Ricinus communis_ chloroplast genome record contains incorrect strand annotations for the second copy of the rRNA genes. Because we used this record as the reference to annotate our new _Plukenetia volubilis_ chloroplast, our annotation inherited the error, which is an instance of a larger issue previously discussed in the genomics field (_e.g._ [Schnoes _et al._](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000605)). For our purposes, this annotation problem should disappear if we stop using `NC_016736.1` as a reference in the annotation process.