## Complementing a Strand of DNA

https://rosalind.info/problems/revc/

In DNA strings, symbols 'A' and 'T' are complements of each other, as are 'C' and 'G'.

The reverse complement of a DNA string s is the string s^c formed by reversing the symbols of s, then taking the complement of each symbol (e.g., the reverse complement of "GTCA" is "TGAC").

**Given:** A DNA string s of length at most 1000 bp.

**Return:** The reverse complement s^c of s.

**Sample Dataset**
```AAAACCCGGT```

**Sample Output**
```ACCGGGTTTT```
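
The definition can be checked against both examples in the statement with a minimal sketch. The names `COMPLEMENT` and `reverse_complement` are made up for illustration, and the sketch assumes the input contains only the uppercase symbols A, C, G and T.

```python
# Translation table mapping each base to its complement (uppercase DNA only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Complement every symbol, then reverse the result."""
    return dna.translate(COMPLEMENT)[::-1]


print(reverse_complement("GTCA"))        # TGAC
print(reverse_complement("AAAACCCGGT"))  # ACCGGGTTTT
```

Complementing and reversing are independent steps, so the order in which they are applied does not change the result.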

In [14]:
with open('rosalind_revc.txt') as dna:
    # strip() drops the trailing newline so it does not end up at the start of the reversed output
    nucleotides = dna.read().strip().upper()

# Translation table: each base is mapped to its complement in a single pass
complement = str.maketrans('ACGT', 'TGCA')

# Complement every base, then reverse the string
print(nucleotides.translate(complement)[::-1])


GGGACGCTGGGCTTAACTAACGTCGGGCCGTCAGATCTGCTCCTGGTTGAGGCCTAAATCCATGCCTTATCGTGTACGTGAACTTCCATATTGCAGACATCACGTCATGAAAGGAGGGTCACTCGAGCTTGGAGCCTCCCCACTGGGGGGCACATGGGTACACCCGCGATCCGCTGGCCCTGTCGCACAAAAGACTCCCATAACTTGGTACCTGACTCTGTTGGAATCCCTTAACCTCTTACATGCTAATATGAGCCTCGCGCGTCACAAAGCGATAAAGGACGGTACTTGCTGGTATTACCAAGGAACAGCAGGCGGGTACGGGACCCCACATGAGTGAATCAGGTGTCCGGAATATCATATAACGATTTGCTATCCTAAGAAAGCACGGCATCGGATCACAAACATGCAGGGAACCTATAAGGTAGCCTTTGAGATGGCGAACTACTTTGAAAGCGGGGTCATGCATTTCTGAGTCTCGTCCTCGAACGACCGGGTGTACGGACTGCCACTGCGTGATATCGCTCCGATAGAGAAAACGAGCGACCATGAGCGATAGTGGCTGATCGGAGGGACGGCTAACACTTTATCCCAAACACCTGTGCCTAGATAAGGCTGTTTGATGCACTACTTACTTGTACGGATGGGTGCCTGAGGATCAATTGGTGACGGCGGAGTTAAGCCCTGTACAACGTGAGAGAATCACCGGAATCGTTTTACTTCATAAGCAACTGTCCTAGCGTCGATGAGCTATCGTGGTCGGTGTACTTGCGCTAAACATTTCTGCCGCTACTCTACGTGTATGTGGTTATCCGTCATCATG


In [1]:
pip install biopython

Collecting biopython
  Downloading biopython-1.79-cp38-cp38-macosx_10_9_x86_64.whl (2.3 MB)
Installing collected packages: biopython
Successfully installed biopython-1.79
Note: you may need to restart the kernel to use updated packages.


## Computing GC Content

https://rosalind.info/problems/gc/

The GC-content of a DNA string is given by the percentage of symbols in the string that are 'C' or 'G'. For example, the GC-content of "AGCTATAG" is 37.5%. Note that the reverse complement of any DNA string has the same GC-content.
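
As a worked example of the definition, the sketch below counts 'C' and 'G' directly. The function name `gc_content` is made up for illustration, and uppercase input is assumed. "CTATAGCT" is the reverse complement of "AGCTATAG", so both calls should give the same percentage.

```python
def gc_content(dna: str) -> float:
    """Percentage of symbols in dna that are 'C' or 'G'."""
    if not dna:
        return 0.0  # assumption: treat an empty string as 0% rather than raising
    return 100 * (dna.count("C") + dna.count("G")) / len(dna)


print(gc_content("AGCTATAG"))  # 37.5  (3 of 8 symbols are C or G)
print(gc_content("CTATAGCT"))  # 37.5  (reverse complement, same GC-content)
```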

DNA strings must be labeled when they are consolidated into a database. A commonly used method of string labeling is called FASTA format. In this format, the string is introduced by a line that begins with '>', followed by some labeling information. Subsequent lines contain the string itself; the next line that begins with '>' marks the label of the following string.

In Rosalind's implementation, a string in FASTA format will be labeled by the ID "Rosalind_xxxx", where "xxxx" denotes a four-digit code between 0000 and 9999.
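
The format described above can be read without any library. The following is a minimal sketch, not a full FASTA parser: `parse_fasta` is a hypothetical helper, the whole text after '>' is taken as the ID, and the two records in the example are invented toy data rather than Rosalind input.

```python
def parse_fasta(text: str) -> dict:
    """Map each label (the text after '>') to its concatenated sequence lines."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line closes the previous record and starts a new one.
            current_id = line[1:].strip()
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines are joined in order of appearance.
            records[current_id] += line
    return records


# Toy input with made-up IDs and sequences.
example = """>Rosalind_0001
ACGT
ACG
>Rosalind_0002
GGCC
"""

print(parse_fasta(example))
# {'Rosalind_0001': 'ACGTACG', 'Rosalind_0002': 'GGCC'}
```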

**Given:** At most 10 DNA strings in FASTA format (of length at most 1 kbp each).

**Return:** The ID of the string having the highest GC-content, followed by the GC-content of that string. Rosalind allows for a default absolute error of 0.001 in all decimal answers unless otherwise stated.
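
The tolerance can be read as an absolute-error check. The sketch below is one way to express it; `within_tolerance` is a hypothetical helper, and the submitted values are invented to show answers on both sides of the 0.001 limit.

```python
def within_tolerance(answer: float, expected: float, tol: float = 0.001) -> bool:
    """True when answer differs from expected by at most tol in absolute terms."""
    return abs(answer - expected) <= tol


expected = 60.919540  # GC-content from the sample output

print(within_tolerance(60.9195, expected))  # True
print(within_tolerance(60.92, expected))    # True
print(within_tolerance(60.93, expected))    # False
```

In practice this means that printing the GC-content with three or more decimal places is enough.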

**Sample Dataset**

\>Rosalind_6404

CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCC
TCCCACTAATAATTCTGAGG

\>Rosalind_5959

CCATCGGTAGCGCATCCTTAGTCCAATTAAGTCCCTATCCAGGCGCTCCGCCGAAGGTCT
ATATCCATTTGTCAGCAGACACGC

\>Rosalind_0808

CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGAC
TGGGAACCTGCGGGCAGTAGGTGGAAT

**Sample Output**
```
Rosalind_0808
60.919540
```

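In [30]:
from Bio import SeqIO
# GC is available in Biopython 1.79 (installed above); newer releases replace it
# with gc_fraction, which returns a fraction rather than a percentage
from Bio.SeqUtils import GC

best_gc = -1.0  # lower than any real GC-content, so the first record always wins
best_id = ""

# The context manager closes the file once parsing is done
with open("rosalind_gc.txt") as handle:
    for record in SeqIO.parse(handle, "fasta"):
        gc = GC(record.seq)  # GC-content as a percentage, computed once per record
        if gc > best_gc:
            best_gc = gc
            best_id = record.id

print(best_id)
print(best_gc)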

Rosalind_2060
52.46406570841889
