# Sequence I/O

BioPython provides the `Bio.SeqIO` module to read and write sequences to and from files. It supports nearly all file formats commonly used in bioinformatics. Most bioinformatics software takes a different approach for each file format, but BioPython deliberately presents parsed sequence data through a single interface: the `SeqRecord` object.

## `SeqRecord`

The `Bio.SeqRecord` module provides the `SeqRecord` class, which holds the sequence data together with its metadata. Its main attributes are:

- `seq`: The actual sequence.
- `id`: The primary identifier of the sequence.
- `name`: The name of the sequence.
- `description`: Human-readable information about the sequence.
- `annotations`: A dictionary of additional information about the sequence.

In [None]:
from Bio.SeqRecord import SeqRecord
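
The attributes listed above can also be set by hand when building a record. The following is a minimal sketch, assuming Biopython is installed; the sequence, identifier, name, and description are hypothetical values chosen only for illustration.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# All values below are made up for illustration; they do not come from a real database.
record = SeqRecord(
    Seq("ATGGCCATTGTAATGGGCCGC"),              # seq: the sequence itself
    id="demo_001",                             # id: primary identifier
    name="demo",                               # name: short name
    description="hypothetical example sequence",
    annotations={"molecule_type": "DNA"},      # annotations: free-form dictionary
)

print(record.id)
print(record.name)
print(record.description)
print(len(record.seq))
print(record.annotations["molecule_type"])
```

Each attribute is a plain Python attribute, so it can be read or reassigned after the record is created.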

## FASTA

FASTA is the most basic file format for storing sequence data. Originally, FASTA was a software package for DNA and protein sequence alignment, developed in the early days of bioinformatics and used mostly to search for sequence similarity.

The `Bio.SeqIO` module provides a `parse()` function to process sequence files.

In [None]:
from Bio.SeqIO import parse

The `parse()` function takes two arguments: the first is the file handle and the second is the file format.

In [None]:
with open('path/to/orchid.fasta', 'r') as file:
    records = parse(file, 'fasta')
    ...

Here, `parse()` returns an iterator that yields one `SeqRecord` per iteration.
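
To make the iteration concrete, here is a small sketch that parses FASTA text held in memory instead of a file on disk. The two records are hypothetical, and `io.StringIO` stands in for the file handle so the example runs without any input file.

```python
from io import StringIO
from Bio.SeqIO import parse

# Hypothetical FASTA content; a real file would be opened with open(...) instead.
fasta_text = (
    ">seq1 first hypothetical record\n"
    "ATGCATGC\n"
    ">seq2 second hypothetical record\n"
    "GGGTTTAAA\n"
)

handle = StringIO(fasta_text)

# parse() is lazy: records are read from the handle only as the loop advances,
# so the iteration has to happen while the handle is still open.
summary = []
for rec in parse(handle, "fasta"):
    summary.append((rec.id, len(rec.seq)))

print(summary)
```

Because the records are produced one at a time, large files can be processed without loading every sequence into memory at once.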

Writing a collection of `SeqRecord` objects to a file is as simple as calling `SeqIO.write()`.

```python
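from Bio import SeqIO
# Open in write mode ('w'); seq_record may be one SeqRecord or an iterable of them
with open('target.fasta', 'w') as file:
    SeqIO.write(seq_record, file, 'fasta')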
```
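
As a quick check that writing and reading are symmetric, the sketch below writes two hypothetical records to an in-memory buffer and parses them back. The record contents are invented for illustration, and `io.StringIO` is used in place of a real output file.

```python
from io import StringIO
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical records used only to demonstrate a write/read round trip.
records = [
    SeqRecord(Seq("ATGCATGC"), id="seq1", description="hypothetical record one"),
    SeqRecord(Seq("GGGTTTAAA"), id="seq2", description="hypothetical record two"),
]

buffer = StringIO()
count = SeqIO.write(records, buffer, "fasta")  # returns the number of records written

buffer.seek(0)  # rewind so the same buffer can be read from the start
round_trip = [rec.id for rec in SeqIO.parse(buffer, "fasta")]

print(count)
print(round_trip)
```

The return value of `SeqIO.write()` is a convenient sanity check that every record reached the output.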

## GenBank

GenBank is a richer sequence format for genes, with fields for various kinds of annotations. Since BioPython provides a single function to parse any kind of file, reading a GenBank file only requires passing `'genbank'` as the format.

In [None]:
from Bio.SeqIO import parse
with open('path/to/orchid.gbk', 'r') as file:
    records = parse(file, 'genbank')
    record = next(records)
    print(record.id)
    print(record.name)
    print(record.seq)
    print(record.description)
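
The annotation fields that GenBank carries end up in the record's `annotations` dictionary. The sketch below illustrates this without needing a GenBank file on disk: it writes a hypothetical record to an in-memory buffer in GenBank format and parses it back. It assumes a recent Biopython release, where the `molecule_type` annotation is required when writing GenBank output; the locus name, identifier, and sequence are invented.

```python
from io import StringIO
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical record; GenBank output needs a molecule_type annotation.
original = SeqRecord(
    Seq("ATGGCCATTGTAATGGGCCGC"),
    id="DEMO0001.1",
    name="DEMO0001",
    description="hypothetical demo locus",
    annotations={"molecule_type": "DNA"},
)

buffer = StringIO()
SeqIO.write(original, buffer, "genbank")

buffer.seek(0)
parsed = next(SeqIO.parse(buffer, "genbank"))  # first (and only) record

print(parsed.name)
print(parsed.seq)
print(parsed.annotations["molecule_type"])
```

The parsing call is the same one used for FASTA; only the format string changes.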