# Computing GC Content

The GC-content of a DNA string is the percentage of symbols in the string that are 'C' or 'G'. For example, the GC-content of "AGCTATAG" is 37.5%. Note that the reverse complement of any DNA string has the same GC-content.

DNA strings must be labeled when they are consolidated into a database. A commonly used method of string labeling is called FASTA format. In this format, each string is introduced by a line that begins with '>', followed by some labeling information. Subsequent lines contain the string itself; the next line that begins with '>' marks the label of the next string.
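
As a quick check of the GC-content definition and the reverse-complement remark above, here is a minimal sketch. The helper names `gc_content` and `reverse_complement` are illustrative choices and are not part of the problem statement.

```python
def gc_content(dna: str) -> float:
    """Return the percentage of symbols in dna that are 'G' or 'C'."""
    if not dna:
        raise ValueError("GC-content is undefined for an empty string")
    return (dna.count("G") + dna.count("C")) * 100 / len(dna)


def reverse_complement(dna: str) -> str:
    """Reverse the string and swap A<->T and C<->G."""
    return dna[::-1].translate(str.maketrans("ACGT", "TGCA"))


example = "AGCTATAG"
print(gc_content(example))                      # 37.5
print(gc_content(reverse_complement(example)))  # 37.5
```

Complementing turns every 'G' into a 'C' and every 'C' into a 'G', so the combined count of the two symbols is unchanged, and reversing does not change the counts at all.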

In Rosalind's implementation, a string in FASTA format will be labeled by the ID "Rosalind_xxxx", where "xxxx" denotes a four-digit code between 0000 and 9999.
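
One way to read this format is to treat every line starting with '>' as the start of a new record and append all following lines to that record's string. The sketch below assumes the label is everything after '>' on the header line; `parse_fasta` and the two short records are made up for illustration.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each label (the text after '>') to its sequence, joined across lines."""
    records: Dict[str, str] = {}
    label = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            label = line[1:]
            records[label] = ""
        elif label is not None:
            records[label] += line
    return records


# Hypothetical records, only to show the shape of the result
fasta_text = """>Rosalind_0001
ACGT
TTGA
>Rosalind_0002
GGCC
"""
print(parse_fasta(fasta_text.splitlines()))
# {'Rosalind_0001': 'ACGTTTGA', 'Rosalind_0002': 'GGCC'}
```

Splitting on header lines, rather than slicing a fixed number of characters, keeps the parser working even if a label is not exactly 13 characters long.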

- Given: At most 10 DNA strings in FASTA format (of length at most 1 kbp each).
- Return: The ID of the string having the highest GC-content, followed by the GC-content of that string. Rosalind allows for a default error of 0.001 in all decimal answers unless otherwise stated, so an answer within 0.001 of the correct value is accepted.
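
The error allowance can be sketched as a simple comparison, assuming that "an error of 0.001" means the absolute difference between the submitted and the correct value is at most 0.001. The function name `within_tolerance` is hypothetical; the first pair of numbers is the sample answer at full floating-point precision and at six decimals.

```python
def within_tolerance(answer: float, expected: float, tol: float = 0.001) -> bool:
    """True if answer differs from expected by at most tol (absolute error)."""
    return abs(answer - expected) <= tol


print(within_tolerance(60.91954022988506, 60.919540))  # True
print(within_tolerance(60.9, 60.919540))               # False
```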

> **Sample Dataset**
>
```
>Rosalind_6404
CCTGCGGAAGATCGGCACTAGAATAGCCAGAACCGTTTCTCTGAGGCTTCCGGCCTTCCC
TCCCACTAATAATTCTGAGG
>Rosalind_5959
CCATCGGTAGCGCATCCTTAGTCCAATTAAGTCCCTATCCAGGCGCTCCGCCGAAGGTCT
ATATCCATTTGTCAGCAGACACGC
>Rosalind_0808
CCACCCTCGTGGTATGGCTAGGCATTCAGGAACCGGAGAACGCTTCAGACCAGCCCGGAC
TGGGAACCTGCGGGCAGTAGGTGGAAT
```

> **Sample Output**
>
```
Rosalind_0808
60.919540
```

In [26]:
#1: print the ID and GC-content of the sequence with the highest GC-content
def getMaxGc(fileName):
    # parse the FASTA file into {ID: sequence}; a sequence may span several lines
    records = {}
    with open(fileName) as f:
        for line in f:
            line = line.strip()
            if line.startswith('>'):
                seqId = line[1:]  # the ID is everything after '>'
                records[seqId] = ""
            elif line:
                records[seqId] += line

    # GC-content of each sequence, as a percentage
    gc = {seqId: (seq.count("G") + seq.count("C")) * 100 / len(seq)
          for seqId, seq in records.items()}

    # pick the ID with the highest GC-content
    maxId = max(gc, key=gc.get)
    print(maxId)
    print(gc[maxId])

In [27]:
getMaxGc('Q5.txt')

Rosalind_0808
60.91954022988506
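
The printed value has more digits than the sample output but is well within the allowed error of 0.001. If output in the same six-decimal form as the sample is preferred, one option is a format specifier; the variable name `gc` below is only for illustration.

```python
gc = 60.91954022988506  # value printed by the notebook cell above
print(f"{gc:.6f}")      # 60.919540
```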
