<a href="https://colab.research.google.com/github/AlexanderPico/retrondb-notebooks/blob/main/getting-started.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting Started
Welcome to your first Python notebook for the Retron Database. This notebook shows you how to connect to the database and perform basic operations: retrieving, adding, updating, querying, and removing retron records.

### Basic installation
If you haven't already, install the following packages, restart the kernel (Kernel > Restart), and proceed to the next section.

In [None]:
!pip install pymongo
!pip install dnspython
!pip install python-dotenv
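
After restarting the kernel, it can help to confirm that the packages are importable before moving on. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for this example. Note that two of the packages are imported under a different name than the one used with pip.

```python
import importlib.util

# pip name -> import name. dnspython is imported as "dns" and
# python-dotenv is imported as "dotenv".
REQUIRED = {"pymongo": "pymongo", "dnspython": "dns", "python-dotenv": "dotenv"}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]

missing = missing_packages()
print("All packages found." if not missing else f"Still missing: {missing}")
```

If anything is reported as missing, rerun the install cell above and restart the kernel again.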

### Connecting to the database
A username and password are required to access the database. A team member should provide these either as a .env file, which you place in the same folder as this notebook, or as secret values that you enter manually when prompted below.

_Note: we are connecting to the "sandbox" database in this notebook, so you don't have to worry about messing up the real data :). In non-tutorial notebooks, the default parameter is used to connect to "retronDB."_
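
The general pattern of "use the .env file if present, otherwise prompt" can be sketched as follows. This is an illustration only, not how `retrondb` is implemented: the variable names `RETRONDB_USER` and `RETRONDB_PASSWORD` and both helper functions are hypothetical, and the parser handles only simple `KEY=VALUE` lines. In practice, `load_dotenv()` from the python-dotenv package installed above covers the file-reading step more robustly.

```python
from getpass import getpass
from pathlib import Path

def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values

def get_credentials(path=".env", prompt=getpass):
    """Prefer values from the .env file; otherwise prompt without echoing input."""
    env = read_env_file(path)
    # RETRONDB_USER and RETRONDB_PASSWORD are hypothetical variable names.
    user = env.get("RETRONDB_USER") or prompt("Username: ")
    password = env.get("RETRONDB_PASSWORD") or prompt("Password: ")
    return user, password
```

Calling `get_credentials()` in a notebook cell would read `.env` from the notebook's folder and fall back to a hidden input prompt for any value that is missing. Either way, keep the .env file out of version control.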

In [14]:
import retrondb as rdb
db_retrons = rdb.connect_retronDB('sandbox')  # sandbox database for demos and tutorials, not the actual database


Success: Connected to sandbox with 95 retrons



### Retrieving retrons from the database 

In [15]:
# Get all retrons in database as a Pandas DataFrame
rdb.get_all_retrons(db_retrons)

Unnamed: 0,_id,node,ncrna,ensemble prediction,rtdna (sequencing values),rt/cladea,retron (sub)b,msr/msd familiyc,rt-dna production,bacterial editing,mammalian editing group
0,62f1b857959907d83045e6a1,1,AATAATCTTACGCGGATAGAAATGTAATTATCGGTTGTTAGGAGAT...,,,1,I-A,IA/IIA1,57,25,1
1,62f1b857959907d83045e6a2,28,ACATACGGGGCGGGAACGCGGAATTGGACAACGTTATTTGACGTAC...,,,1,I-A,IA/IIA1,93,32,2
2,62f1b857959907d83045e6a3,64,CTTACAGACGGGCTGCCTAGGGGTCAACTGGACATAAGATCGGGGC...,((((((((.........((((((((((......)))).)))))).(...,0.00003;0.00003;0.00003;0.00006;0.00006;0.0000...,,,,26,78,2
3,62f1b857959907d83045e6a4,116,AGGGCCTACGCTCCCTGGTCGCAATATTGGGCTATGGGAGTCTTGC...,((((((((.....(((((.........)))))((((..(((...))...,0.04947541;0.050365995;0.051252061;0.055244473...,,,,2,38,2
4,62f1b857959907d83045e6a5,161,GAACGGTTCCCTTATGACCCGCTTGGAAAAAAGCAGCCGAGAACGC...,,,,,,21,60,4
...,...,...,...,...,...,...,...,...,...,...,...
90,62f1b857959907d83045e6fb,1928,AACGGCTGTTAAGAATCGGGTGGCAACCTCTAGGCGCTTTCACTAA...,(((((((((((((.((((((..(((((((((.(((....))).)))...,,,,,54,64,2
91,62f1b857959907d83045e6fc,798,TGTCGAGGGCGAAACTTCTACAGCGCACTACACAATCAAGCTGCAC...,((((((((((((....((((....(((((((..............)...,,,,,17,16,1
92,62f1b857959907d83045e6fd,1190,ACGAACTTTGCTGTTGGATATAGCGCCTAGTTACAACATGTAACTT...,((((((((((((......(((((((((........(((.....)))...,,,,,100,42,1
93,62f1b857959907d83045e6fe,1191,TTAGGAGGATCGTACTGACGCTCCATCCACTTTCTAACTCACCCTC...,(((((((((((.(((((((.(((((((((.((...(((...........,,,,,90,14,1


### Add a new retron to the database
You can manually add records one at a time as dictionaries (below), or in batches from CSV files (see [importing-retrons.ipynb](importing-retrons.ipynb)).

In [16]:
retron0 = {
        "node":"0",
        "ncrna":"ACTATAAACGCACAGAACCAGACGCATGGCTGAGATGTCTATTATGTGCGAGGGAACCCAATCTTCCTGCACCAGCTAGACGTTACGCGCCGGCCGCAGCGTGAACCTACGAACCATATAAGAGTGCAAAACCAATGAACCCTTACCCTAAGATACCCGTGATCTTTTCAAAAGCACACCTAATTACCTATACTAAAATCACTTCCC",
        "ensemble prediction":"((((((((.........((((((((((......)))).)))))).(((.((((...)))).)))..(((((....))))).......((((((((((((((((((((((.((((((((.....)))))))).)))))))))))))))))))))).....))))))))",
        "rt-dna production":"94",
        "bacterial editing":"93",
        "mammalian editing group":"1"
        }
rdb.add_retron(db_retrons, retron0)

Added retron to the database.


Unnamed: 0,_id,node,ncrna,ensemble prediction,rt-dna production,bacterial editing,mammalian editing group
0,62f1b972f6b57a67a7fefa85,0,ACTATAAACGCACAGAACCAGACGCATGGCTGAGATGTCTATTATG...,((((((((.........((((((((((......)))).)))))).(...,94,93,1
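
Because records are entered by hand as dictionaries, a quick sanity check before calling `add_retron()` can catch typos. The sketch below is not part of `retrondb`; `check_retron` and `is_balanced` are hypothetical helpers. It assumes that every record needs a `node` value, that `ncrna` contains only nucleotide letters, and that `ensemble prediction`, when present, is a dot-bracket string like the one in the example above.

```python
def is_balanced(structure):
    """True if a dot-bracket string has only '(', ')', '.' and matching parentheses."""
    depth = 0
    for ch in structure:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
        elif ch != ".":
            return False
    return depth == 0

def check_retron(record):
    """Return a list of problems found in a retron dictionary (empty if none)."""
    problems = []
    if not record.get("node"):
        problems.append("missing 'node'")
    seq = record.get("ncrna", "")
    if not seq or set(seq.upper()) - set("ACGTU"):
        problems.append("'ncrna' is empty or has non-nucleotide characters")
    structure = record.get("ensemble prediction")
    if structure is not None and not is_balanced(structure):
        problems.append("'ensemble prediction' has unbalanced parentheses")
    return problems
```

A call such as `check_retron(retron0)` returns an empty list when none of these checks fail, so the record could be added only in that case.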


### Updating retron records
You can also update or add new properties to a retron already in the database. See [updating-retrons.ipynb](updating-retrons.ipynb) for more examples.

In [17]:
retron0update = {
        "node":"0",
        "mammalian editing group":"2"
        }
rdb.update_retron(db_retrons, retron0update)

Updated retron in the database.


Unnamed: 0,_id,node,ncrna,ensemble prediction,rt-dna production,bacterial editing,mammalian editing group
0,62f1b972f6b57a67a7fefa85,0,ACTATAAACGCACAGAACCAGACGCATGGCTGAGATGTCTATTATG...,((((((((.........((((((((((......)))).)))))).(...,94,93,2
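
The update above behaves like a merge keyed on `node`: properties named in the update are overwritten or added, and everything else is left alone. The following in-memory sketch models that behavior with plain dictionaries. It is an analogy, not the `retrondb` implementation; `apply_update` and the `notes` property are hypothetical.

```python
def apply_update(records, update):
    """Merge `update` into the record with the same 'node' and return that record."""
    for record in records:
        if record["node"] == update["node"]:
            record.update(update)  # overwrites existing keys, adds new ones
            return record
    raise KeyError(f"no retron with node {update['node']!r}")

records = [{"node": "0", "mammalian editing group": "1"}]
merged = apply_update(records, {"node": "0",
                                "mammalian editing group": "2",
                                "notes": "retested"})  # 'notes' is a made-up property
print(merged)
# {'node': '0', 'mammalian editing group': '2', 'notes': 'retested'}
```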


### Query retrons by properties 
You can retrieve retrons from the database by node ID, using `get_retron()`, or by any property, using `get_retrons_by()`. See [making-queries.ipynb](making-queries.ipynb) for more examples.

In [18]:
rdb.get_retrons_by(db_retrons,'mammalian editing group', '2')

Unnamed: 0,_id,node,ncrna,ensemble prediction,rtdna (sequencing values),rt/cladea,retron (sub)b,msr/msd familiyc,rt-dna production,bacterial editing,mammalian editing group
0,62f1b857959907d83045e6a2,28,ACATACGGGGCGGGAACGCGGAATTGGACAACGTTATTTGACGTAC...,,,1.0,I-A,IA/IIA1,93,32,2
1,62f1b857959907d83045e6a3,64,CTTACAGACGGGCTGCCTAGGGGTCAACTGGACATAAGATCGGGGC...,((((((((.........((((((((((......)))).)))))).(...,0.00003;0.00003;0.00003;0.00006;0.00006;0.0000...,,,,26,78,2
2,62f1b857959907d83045e6a4,116,AGGGCCTACGCTCCCTGGTCGCAATATTGGGCTATGGGAGTCTTGC...,((((((((.....(((((.........)))))((((..(((...))...,0.04947541;0.050365995;0.051252061;0.055244473...,,,,2,38,2
3,62f1b857959907d83045e6ae,400,GTTGAGCGTGTTTCACCCAGCCCGCGACCGAACGGAGTCTCTGTCG...,,,,,,40,25,2
4,62f1b857959907d83045e6b2,486,TCAAGTTTGGAGGTATAAGAACCCGAGATGTACGCTGGCAGTCGTT...,,,,,,51,3,2
5,62f1b857959907d83045e6b6,613,AGGGACGTTAGCCACACACCTCCCTTCCATCCAACCACGGCTCAAA...,,,,,,18,49,2
6,62f1b857959907d83045e6b8,749,GAGGGGCGGATCCTAAAATGATCTCTGCCATTCAAATGAGGTGCCT...,,,,,,38,68,2
7,62f1b857959907d83045e6b9,783,TCTTAGTTTATCCCATCGATTCCACGTTCCATAACGTATGTCCCAG...,,,,,,95,51,2
8,62f1b857959907d83045e6bf,816,CGTTTCAGCGCGCGTAAGAACCGCATCTCATCGTGAGGTTATTAAC...,,,,,,54,71,2
9,62f1b857959907d83045e6c1,842,GAATGCCTCTGGCTGTGGTCTTCCGTGCCCGGGTGATGTCTACAGC...,,,,,,1,57,2
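
Note that the query above passes the value as the string `'2'`, matching how values are written in the dictionaries earlier in this notebook. If the values are stored as strings, as those examples suggest, then a query with the integer `2` would not match, because MongoDB equality does not treat a string and a number as equal. The sketch below models this with an exact-match filter over plain dictionaries; `filter_by` is a hypothetical stand-in, not the `retrondb` function.

```python
def filter_by(records, prop, value):
    """Return the records whose `prop` is exactly equal to `value`."""
    return [r for r in records if r.get(prop) == value]

# A few node / group pairs taken from the tables above.
records = [
    {"node": "1", "mammalian editing group": "1"},
    {"node": "28", "mammalian editing group": "2"},
    {"node": "161", "mammalian editing group": "4"},
]

print(filter_by(records, "mammalian editing group", "2"))
# [{'node': '28', 'mammalian editing group': '2'}]
print(filter_by(records, "mammalian editing group", 2))
# []
```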


### Remove retrons
Similarly, you can remove retrons either by their node ID or by property matches.

In [19]:
rdb.remove_retron(db_retrons, "0")

Removed a retron from the database.


Unnamed: 0,_id,node,ncrna,ensemble prediction,rt-dna production,bacterial editing,mammalian editing group
0,62f1b972f6b57a67a7fefa85,0,ACTATAAACGCACAGAACCAGACGCATGGCTGAGATGTCTATTATG...,((((((((.........((((((((((......)))).)))))).(...,94,93,2


### And so much more...
Check out the other demo notebooks, including:
* [making-queries](making-queries.ipynb)
* [importing-retrons](importing-retrons.ipynb)
* [updating-retrons](updating-retrons.ipynb)
* [saving-retrondb](saving-retrondb.ipynb)

And example workflow notebooks, like:
* [workflow-summarize](workflow-summarize.ipynb)

For the complete documentation of the `retrondb` package, see the [web manual](https://alexanderpico.github.io/retrondb-notebooks/retrondb.html).