# Analysis of the Computational Time for Codon Optimization with Optipyzer and Others
As sequencing costs decrease and research becomes more complex, the computational time needed to process molecular biological data becomes increasingly important. To this end, we sought to compare the time taken to perform codon optimization with *optipyzer* and with other tools. As in the GC content analysis, we compared *optipyzer* to IDT because it is the only other software capable of processing many sequences at once.

For *optipyzer*, we can time the codon optimization natively in Python. Because IDT is browser-only, we use the browser's developer tools to time the network requests made for optimization.
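One way to turn browser measurements into a number is to export the network log from the developer tools as a HAR file and sum the durations of the relevant requests. The sketch below is only an illustration of that idea, not the procedure used here: the helper name `total_request_seconds`, the URL fragment, and the synthetic entries are all hypothetical, and it assumes the standard HAR layout in which each entry reports its elapsed `time` in milliseconds.

```python
def total_request_seconds(har: dict, url_fragment: str) -> float:
    """Sum the duration of all HAR entries whose URL contains url_fragment."""
    entries = har["log"]["entries"]
    matching = [e for e in entries if url_fragment in e["request"]["url"]]
    # HAR entries report their total elapsed time in milliseconds.
    return sum(e["time"] for e in matching) / 1000.0


# Synthetic HAR-like data for illustration only (not real measurements).
example_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.org/optimize"}, "time": 1500.0},
            {"request": {"url": "https://example.org/static/app.js"}, "time": 200.0},
            {"request": {"url": "https://example.org/optimize"}, "time": 2500.0},
        ]
    }
}

print(total_request_seconds(example_har, "/optimize"))
```

With a real export, the dictionary would come from `json.load` on the saved HAR file, and the URL fragment would be whatever endpoint the developer tools show for the optimization request.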

We identified 55 functional, randomly generated protein sequences (Keefe and Szostak, 2001). Because these sequences were originally expressed in *Escherichia coli*, we optimized them for expression in *Homo sapiens*. They were optimized on two platforms: IDT and *optipyzer*. The time to optimize the full set of sequences was recorded over 5 optimization runs.

In [1]:
# install dependencies
!pip install optipyzer biopython tqdm pandas


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.2.1[0m[39;49m -> [0m[32;49m22.3.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [3]:
# read in sequences
from Bio import SeqIO

records = list(SeqIO.parse("inputs/keefe_szostak.fasta", "fasta"))

In [None]:
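import time

from optipyzer.api import API
from tqdm import tqdm

NUM_SAMPLES = 5  # number of timed passes over the full sequence set
SPECIES = "human"

optimizer = API()
optimization_times = []  # wall-clock seconds for each pass
optimized_sequences = {}  # record id -> most recent optimization result

# run optimization, timing each full pass over all sequences
for _ in range(NUM_SAMPLES):
    # perf_counter is a monotonic clock intended for measuring elapsed time
    start = time.perf_counter()
    for record in tqdm(records):
        result = optimizer.optimize(
            str(record.seq),
            {SPECIES: 1},
            seq_type="protein",
            seed=99,
        )
        optimized_sequences[record.id] = result
    end = time.perf_counter()
    optimization_times.append(end - start)

# print results
average_time = sum(optimization_times) / len(optimization_times)
print("Average time to optimize: {} seconds".format(average_time))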

In [6]:
print(optimization_times)

[101.56162810325623, 106.98994612693787, 133.95217776298523, 93.40713691711426, 83.43699097633362]
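To make the recorded times easier to compare across platforms, it can help to reduce them to a mean, a spread, and a per-sequence figure. The following is a minimal sketch that copies the five values printed above and assumes each run covered all 55 sequences, as described earlier; the variable names are illustrative.

```python
import statistics

# Times (in seconds) copied from the notebook output above.
optimization_times = [
    101.56162810325623,
    106.98994612693787,
    133.95217776298523,
    93.40713691711426,
    83.43699097633362,
]
NUM_SEQUENCES = 55  # assumes every run optimized the full set of sequences

mean_time = statistics.mean(optimization_times)
std_time = statistics.stdev(optimization_times)  # sample standard deviation
per_sequence = mean_time / NUM_SEQUENCES

print(f"Mean time per run: {mean_time:.2f} s (sample SD {std_time:.2f} s)")
print(f"Mean time per sequence: {per_sequence:.2f} s")
```

Note that each run's time includes everything inside the timed loop, so the per-sequence figure reflects the whole round trip for an optimization call rather than the algorithm alone.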
