### Computing GC Content
**Problem:**
The GC-content of a DNA string is given by the percentage of symbols in the string that are 'C' or 'G'. For example, the GC-content of "AGCTATAG" is 37.5%. Note that the reverse complement of any DNA string has the same GC-content.
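The definition above can be checked by hand with a few lines of plain Python. This is a minimal sketch, not the approach used later in this notebook; the helper names `gc_percent` and `reverse_complement` are illustrative and assume the string contains only the symbols A, C, G and T.

```python
def gc_percent(dna: str) -> float:
    """Return the percentage of symbols in `dna` that are 'G' or 'C'."""
    if not dna:
        raise ValueError("GC-content is undefined for an empty string")
    dna = dna.upper()
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def reverse_complement(dna: str) -> str:
    """Complement each base (A<->T, C<->G), then reverse the string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


example = "AGCTATAG"
print(gc_percent(example))                      # 37.5
print(reverse_complement(example))              # CTATAGCT
print(gc_percent(reverse_complement(example)))  # 37.5
```

Complementing swaps every G with a C and vice versa, so the combined count of G and C is unchanged, and reversing does not alter counts at all. That is why both calls report the same value.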

DNA strings must be labeled when they are consolidated into a database. A commonly used method of string labeling is called FASTA format. In this format, the string is introduced by a line that begins with '>', followed by some labeling information. Subsequent lines contain the string itself; the next line that begins with '>' marks the label of the following string.

In Rosalind's implementation, a string in FASTA format will be labeled by the ID "Rosalind_xxxx", where "xxxx" denotes a four-digit code between 0000 and 9999.
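To make the layout concrete, here is a rough sketch of reading FASTA text by hand. The function name `parse_fasta` and the two short records are made up for illustration (they are not Rosalind data), and the sketch assumes that the ID is the first whitespace-delimited token after '>'.

```python
def parse_fasta(text: str) -> dict:
    """Map each record ID to its sequence, joining wrapped sequence lines."""
    chunks = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; keep only the ID token.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            chunks[current_id] = []
        elif current_id is not None:
            # Any other line continues the sequence of the current record.
            chunks[current_id].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


# Hypothetical input: two records, the first wrapped over two lines.
sample = """>Rosalind_0001
ACGT
ACGT
>Rosalind_0002
GGCC
"""
print(parse_fasta(sample))
# {'Rosalind_0001': 'ACGTACGT', 'Rosalind_0002': 'GGCC'}
```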

**Given:** At most 10 DNA strings in FASTA format (of length at most 1 kbp each).

**Return:** The ID of the string having the highest GC-content, followed by the GC-content of that string. Rosalind allows for a default error of 0.001 in all decimal answers unless otherwise stated.
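The tolerance means the answer does not need to be printed to full floating-point precision. The sketch below assumes the 0.001 allowance is an absolute error (the problem statement does not say whether it is absolute or relative); `within_tolerance` is a hypothetical helper, and the value used is the 37.5% example from the problem statement.

```python
import math


def within_tolerance(answer: float, expected: float, tol: float = 1e-3) -> bool:
    """Return True if `answer` is within `tol` of `expected` (absolute error)."""
    return math.isclose(answer, expected, rel_tol=0.0, abs_tol=tol)


gc_value = 100 * 3 / 8  # GC-content of "AGCTATAG"

# Six decimal places is comfortably inside a 0.001 tolerance.
print(f"{gc_value:.6f}")                    # 37.500000
print(within_tolerance(37.5004, gc_value))  # True
print(within_tolerance(37.51, gc_value))    # False
```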

In [1]:
# Import relevant libraries
# Note: gc_fraction requires Biopython 1.80 or newer.

from Bio import SeqIO
from Bio.SeqUtils import gc_fraction

In [11]:
# File path assignment
fasta_file_path = 'problem5_input.txt'

In [13]:
# Function definition

def gc_computation(fasta_file):
    '''
    Compute the GC-content (as a percentage) of every DNA sequence in a FASTA
    file and return the ID of the sequence with the highest GC-content,
    followed by that GC-content on a new line.
    '''

    # To store the ID and GC content of each sequence
    gc_contents = {}

    for record in SeqIO.parse(fasta_file, "fasta"):
        # Compute the GC content as a percentage
        gc_content = gc_fraction(record.seq) * 100
        # Store the GC content with the corresponding ID
        gc_contents[record.id] = gc_content

    # Find the sequence with the highest GC content
    highest_gc_id = max(gc_contents, key=gc_contents.get)
    highest_gc_value = gc_contents[highest_gc_id]

    # Return the ID and its GC-content on separate lines
    return f"{highest_gc_id}\n{highest_gc_value}"

In [14]:
# Call function on given input file
print(gc_computation(fasta_file_path))

Rosalind_6229
50.39370078740157
