# Multiple Sequence Alignment (MSA) with MUSCLE

Description:

Performs multiple sequence alignment on the sequences in clean.fasta by calling MUSCLE through Python's subprocess module.

Parses the resulting alignment with Biopython's AlignIO module.


Purpose:

Demonstrates running an external MSA tool (MUSCLE) from Python.

Shows how to read and explore aligned sequences for downstream analysis.


Steps:

1. Specify the MUSCLE executable path and input/output files.


2. Run MUSCLE alignment using subprocess.


3. Parse the aligned sequences from Aligned.fasta.


4. Print sequence IDs and aligned sequences.


5. Display the number of sequences and alignment length.



Output:

Aligned.fasta file containing the aligned sequences.

Console output showing sequence details and alignment summary.

In [1]:
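# 29-08-2025 Friday: Multiple Sequence Alignment (MSA)


import subprocess

# Path to the MUSCLE 3.8.31 executable, plus the input and output FASTA files
muscle_exe = "C:\\Users\\User\\Documents\\bioinformatics_tools\\muscle3.8.31_i86win32.exe"
input_file = "clean.fasta"
output_file = "Aligned.fasta"

# MUSCLE 3.x reads the unaligned sequences from -in and writes the alignment to -out
cmd = [muscle_exe, "-in", input_file, "-out", output_file]
# check=True raises CalledProcessError if MUSCLE exits with a non-zero status
subprocess.run(cmd, check=True)

print("MUSCLE alignment complete!")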


MUSCLE alignment complete!
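
The cell above assumes that the MUSCLE executable and clean.fasta both exist and that the installed MUSCLE is a 3.x release. The following is a minimal sketch of a more defensive wrapper, not part of the original notebook: `build_muscle_cmd`, `check_inputs`, `run_muscle`, and the `major_version` parameter are hypothetical names introduced here for illustration. It relies on the fact that MUSCLE 5 replaced the `-in`/`-out` options with `-align`/`-output`.

```python
import subprocess
from pathlib import Path


def build_muscle_cmd(muscle_exe, input_file, output_file, major_version=3):
    """Return the argument list for a MUSCLE run (hypothetical helper)."""
    if major_version >= 5:
        # MUSCLE 5 uses -align/-output instead of -in/-out
        return [str(muscle_exe), "-align", str(input_file),
                "-output", str(output_file)]
    return [str(muscle_exe), "-in", str(input_file), "-out", str(output_file)]


def check_inputs(muscle_exe, input_file):
    """Return a list of problems found; an empty list means both files exist."""
    problems = []
    if not Path(muscle_exe).is_file():
        problems.append(f"MUSCLE executable not found: {muscle_exe}")
    if not Path(input_file).is_file():
        problems.append(f"Input FASTA not found: {input_file}")
    return problems


def run_muscle(muscle_exe, input_file, output_file, major_version=3):
    """Validate the paths, then run MUSCLE and return the CompletedProcess."""
    problems = check_inputs(muscle_exe, input_file)
    if problems:
        raise FileNotFoundError("; ".join(problems))
    cmd = build_muscle_cmd(muscle_exe, input_file, output_file, major_version)
    # capture_output collects MUSCLE's messages so they can be inspected later;
    # check=True raises CalledProcessError on a non-zero exit status
    return subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Calling `run_muscle(muscle_exe, input_file, output_file)` with the variables from the cell above would then fail early with a readable message if either path is wrong, instead of surfacing a lower-level error from subprocess.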


In [2]:
# Parsing aligned results

from Bio import AlignIO

# Read the single alignment that MUSCLE wrote in FASTA format
alignment = AlignIO.read("Aligned.fasta", "fasta")

print("Number of sequences:", len(alignment))
# get_alignment_length is a method, so it must be called to return the column count
print("Alignment length:", alignment.get_alignment_length())

# Print each record's ID followed by its gapped sequence
for record in alignment:
    print(record.id)
    print(record.seq)


Number of sequences: 4
Alignment length: 15
Sequence1
AT-GCGTACGTTAGC
Sequence2
AT-GCGTACGTTAGT
Sequence3
AT-GCGTACGCTAGC
Sequence4
ATGGGGTTTCCTAG-
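
As one example of the downstream exploration mentioned in the Purpose section, the aligned sequences can be scanned column by column to see which positions are identical in every sequence. This is a minimal sketch under stated assumptions: `classify_columns` is a hypothetical helper, not a Biopython function, and the four sequences are copied by hand from the output above so that the example runs without Biopython or the Aligned.fasta file.

```python
def classify_columns(sequences, gap="-"):
    """Split alignment column indices into conserved, variable, and gapped."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved, variable, gapped = [], [], []
    # zip(*sequences) yields one tuple of characters per alignment column
    for index, column in enumerate(zip(*sequences)):
        if gap in column:
            gapped.append(index)
        elif len(set(column)) == 1:
            conserved.append(index)
        else:
            variable.append(index)
    return conserved, variable, gapped


# Aligned sequences copied from the notebook output above
aligned = {
    "Sequence1": "AT-GCGTACGTTAGC",
    "Sequence2": "AT-GCGTACGTTAGT",
    "Sequence3": "AT-GCGTACGCTAGC",
    "Sequence4": "ATGGGGTTTCCTAG-",
}

conserved, variable, gapped = classify_columns(list(aligned.values()))
print("Conserved columns (0-based):", conserved)
print("Variable columns (0-based):", variable)
print("Columns containing a gap (0-based):", gapped)
```

With a Biopython alignment object, the same helper could be fed `[str(record.seq) for record in alignment]`. For the four sequences shown, 8 of the 15 columns are fully conserved, 5 vary without gaps, and 2 contain a gap.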
