# SPAdes Colab
Run the cells top to bottom by clicking the play buttons.


## Install SPAdes
##### Installation can take 15-25 minutes or longer. During this time your computer may slow down because progress messages are streamed continuously over the internet. You can suppress these messages by appending `> /dev/null 2>&1` to the command line, but doing so does not make the process finish any sooner. We kept the messages because they confirm that the process is still running. Once SPAdes is installed, you can run as many assemblies as you wish without repeating the installation. However, if you stay away for too long, Colab may disconnect automatically and all your work will be lost. It is therefore important to move your mouse occasionally so that Colab knows you are still there and keeps the connection active.
https://github.com/ablab/spades#downloading-spades-linux-binaries
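
The shell redirection `> /dev/null 2>&1` described above has a direct counterpart when a command is launched from Python instead of with `!`. The following is a minimal sketch of that idea, not part of this notebook's install step; `run_quietly` is a hypothetical helper name, and the demonstration uses a harmless Python one-liner rather than the real download.

```python
import subprocess
import sys


def run_quietly(command):
    """Run a command, discard everything it prints, and return its exit code."""
    completed = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,  # same effect as `> /dev/null`
        stderr=subprocess.DEVNULL,  # same effect as `2>&1` after that redirect
    )
    return completed.returncode


# Demonstration: the child process prints a line, but nothing reaches the notebook.
exit_code = run_quietly([sys.executable, "-c", "print('hidden')"])
print("exit code:", exit_code)
```

As with the shell version, silencing the output only hides the progress messages; the command still takes just as long, and the exit code is the only remaining sign of success or failure.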

In [None]:
#2022-10-21 Steven Tang
#!wget http://cab.spbu.ru/files/release3.9.0/SPAdes-3.9.0-Linux.tar.gz
#!tar -xzf SPAdes-3.9.0-Linux.tar.gz

#2023-09-15 Renald Legaspi
#Updated: SPAdes 3.9 to 3.15, since 3.9 no longer runs on Colab, which now uses a different version of Python.
#Fix: No longer installs the Linux tarball because of a segmentation fault issue. SPAdes is now compiled from source.
# !wget http://cab.spbu.ru/files/release3.15.5/SPAdes-3.15.5.tar.gz
# !tar -xzf SPAdes-3.15.5.tar.gz
# !cd SPAdes-3.15.5
# !./SPAdes-3.15.5/spades_compile.sh

#2023-09-18 Steven Tang
#Fix: Use precompiled SPAdes that works with Colab
!wget https://github.com/steventango/colab-spades/releases/download/v3.15.5/SPAdes-3.15.5-Colab.tar.gz
!tar -xzf SPAdes-3.15.5-Colab.tar.gz


from datetime import datetime
from google.colab import files
from pathlib import Path
import subprocess

## Upload Paired-End FASTA Files

### Upload PE1

In [None]:
pe1 = files.upload()
pe1_filename, pe1_data = next(iter(pe1.items()))
with open(pe1_filename, 'wb') as f:
    f.write(pe1_data)

### Upload PE2

In [None]:
pe2 = files.upload()
pe2_filename, pe2_data = next(iter(pe2.items()))
with open(pe2_filename, 'wb') as f:
    f.write(pe2_data)
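
Before starting a long assembly, it can help to confirm that the two uploads really form a pair. The sketch below assumes the files are plain (uncompressed) FASTA, so every record starts with a `>` header line; it would not be reliable for FASTQ, where quality lines can also begin with `>`. `count_fasta_records` and `check_pairing` are hypothetical helper names, and the demo files are made up for illustration.

```python
import tempfile
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def check_pairing(pe1_path, pe2_path):
    """Return the number of read pairs, or raise if the two files disagree."""
    n1 = count_fasta_records(pe1_path)
    n2 = count_fasta_records(pe2_path)
    if n1 != n2:
        raise ValueError(f"PE1 has {n1} records but PE2 has {n2}")
    return n1


# Demonstration with two tiny made-up files.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_pe1 = Path(demo_dir) / "demo_1.fasta"
    demo_pe2 = Path(demo_dir) / "demo_2.fasta"
    demo_pe1.write_text(">read1/1\nACGT\n>read2/1\nGGCC\n")
    demo_pe2.write_text(">read1/2\nTTAA\n>read2/2\nCCGG\n")
    pair_count = check_pairing(demo_pe1, demo_pe2)
print("read pairs:", pair_count)
```

In this notebook the check could be called as `check_pairing(pe1_filename, pe2_filename)` once both uploads have finished.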

## Run SPAdes

In [None]:
# Careful mode (--careful) tries to reduce the number of mismatches and short indels.
# It also runs MismatchCorrector, a post-processing tool that uses BWA.
# Recommended mostly for small and/or low-complexity genomes.

#2022-10-21 Steven Tang
#careful_mode = True

#2023-09-15 Renald Legaspi
#Updated: Careful mode may cause spades.py to crash due to insufficient RAM
careful_mode = False

#2023-09-15 Renald Legaspi
#Colab no longer provides python2, so 'python /path/spades.py' is used instead of 'python2 /path/spades.py'

output_directory = f"{Path(pe1_filename).stem}_{Path(pe2_filename).stem}_{datetime.now().isoformat()}"

process = subprocess.run(
    f'python ./bin/spades.py -1 "{pe1_filename}" -2 "{pe2_filename}" -o "{output_directory}" {"--careful" if careful_mode else ""}',
    capture_output=True,
    text=True,
    shell=True
)

print(process.stdout)
print(process.stderr)
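
The cell above builds one shell string and quotes the file names by hand, which can break if an uploaded file name itself contains a double quote. One alternative, shown here only as a sketch, is to build the command as a list of arguments so that no shell quoting is needed. `build_spades_command` is a hypothetical helper; the flags `-1`, `-2`, `-o` and `--careful` and the `./bin/spades.py` path are the ones already used in this notebook, and the file names in the demonstration are placeholders.

```python
def build_spades_command(pe1, pe2, out_dir, careful=False,
                         spades_path="./bin/spades.py"):
    """Build the SPAdes command as an argument list (no shell quoting needed)."""
    command = ["python", spades_path, "-1", pe1, "-2", pe2, "-o", out_dir]
    if careful:
        # Only add the flag when careful mode is requested.
        command.append("--careful")
    return command


# Demonstration with placeholder file names.
command = build_spades_command(
    "reads_1.fasta", "reads_2.fasta", "assembly_out", careful=True
)
print(command)
```

A list built this way can be passed to `subprocess.run(command, capture_output=True, text=True)` without `shell=True`.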

## Download contigs.fasta

In [None]:
files.download(f'{output_directory}/contigs.fasta')
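
If SPAdes stopped early, for example because it ran out of RAM, `contigs.fasta` may not exist and the download step has nothing to fetch. A small guard can make that situation easier to diagnose. This is a sketch under that assumption; `require_contigs` is a hypothetical helper, and the demonstration writes a made-up one-record file into a temporary directory.

```python
import tempfile
from pathlib import Path


def require_contigs(output_directory):
    """Return the path to contigs.fasta, or raise if the run did not produce it."""
    contigs = Path(output_directory) / "contigs.fasta"
    if not contigs.is_file():
        raise FileNotFoundError(
            f"{contigs} not found; check the SPAdes output for errors"
        )
    return contigs


# Demonstration with a temporary directory standing in for the SPAdes output.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "contigs.fasta").write_text(">NODE_1\nACGT\n")
    found = require_contigs(demo_dir).name
print(found)
```

In this notebook the result could be handed to the download step as `files.download(str(require_contigs(output_directory)))`.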