<a href="https://colab.research.google.com/github/gbouras13/dnaapler/blob/main/run_dnaapler.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Dnaapler

[dnaapler](https://github.com/gbouras13/dnaapler) is a simple tool that reorients complete circular microbial genomes.

This notebook runs the `dnaapler all` command, which should suffice for most users. If you need more advanced commands, please use a local install.


**To run the code cells, press the play buttons on the top left of each block**

Main Instructions

* Cell 1 installs dnaapler. This must be run first.
* Once Cell 1 has been run, you can run Cell 2 to run dnaapler as many times as you wish.

Other instructions

* Please make sure you change the runtime to CPU (GPU is not required).
* To do this, go to the top toolbar, then to Runtime -> Change runtime type -> Hardware accelerator
* To upload your input FASTA file, click on the folder icon to the left and use the file upload button (with the upwards facing arrow)


In [None]:
#@title 1. Install dnaapler

#@markdown This cell installs dnaapler.

%%time
import os
from sys import version_info
python_version = f"{version_info.major}.{version_info.minor}"
PYTHON_VERSION = python_version
DNAAPLER_VERSION = "0.8.1"


print(PYTHON_VERSION)

if not os.path.isfile("MAMBA_READY"):
  print("installing mamba...")
  os.system("wget -qnc https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh")
  os.system("bash Mambaforge-Linux-x86_64.sh -bfp /usr/local")
  os.system("mamba config --set auto_update_conda false")
  os.system("touch MAMBA_READY")

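if not os.path.isfile("DNAAPLER_READY"):
  print("installing dnaapler ...")
  status = os.system(f"mamba install -y -c conda-forge -c bioconda python='{PYTHON_VERSION}' dnaapler=={DNAAPLER_VERSION}")
  # Only mark the install as done if mamba exited successfully,
  # so that a failed install is retried the next time this cell is run
  if status == 0:
    os.system("touch DNAAPLER_READY")
  else:
    print("dnaapler installation failed; please re-run this cell")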


3.10
CPU times: user 1.06 ms, sys: 0 ns, total: 1.06 ms
Wall time: 1.07 ms
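
Because `os.system` does not raise an error when a command fails, it can be worth confirming that the install actually put dnaapler on the `PATH` before moving on to Cell 2. The following is a minimal sketch of such a check, not part of the original notebook; the helper name `tool_available` is hypothetical, and it relies only on the standard library's `shutil.which`.

```python
import shutil


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


# Hypothetical post-install check: report whether dnaapler can be found.
if tool_available("dnaapler"):
    print("dnaapler is on PATH")
else:
    print("dnaapler was not found; re-run Cell 1 and check its output for errors")
```

If the check reports that dnaapler was not found, deleting the `DNAAPLER_READY` marker file before re-running Cell 1 forces the install step to run again.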


In [None]:
#@title 2. Run dnaapler all

#@markdown First, upload your genome(s) as a nucleotide FASTA file.

#@markdown Click on the folder icon to the left and use the file upload button.

#@markdown Once it is uploaded, write the file name in the INPUT_FILE field on the right.

#@markdown Then provide a directory for dnaapler's output using DNAAPLER_OUT_DIR.
#@markdown The default is 'output_dnaapler'.

#@markdown You can click FORCE to overwrite the output directory.
#@markdown This may be useful if your earlier dnaapler run has crashed for whatever reason.
#@markdown You can also change the E-value threshold with EVALUE. Otherwise it defaults to 1e-10.

#@markdown The results of dnaapler will appear in the file browser (folder icon) in the left-hand panel.
#@markdown The output directory will also be zipped so you can download it in one go.

#@markdown The file to download is DNAAPLER_OUT_DIR.zip, where DNAAPLER_OUT_DIR is the name you provided.

#@markdown If you do not see the output directory,
#@markdown refresh the file browser by either clicking the folder with the refresh icon below "Files"
#@markdown or double clicking and selecting "Refresh".


%%time
import os
import sys
import subprocess
import zipfile
INPUT_FILE = '' #@param {type:"string"}

if os.path.exists(INPUT_FILE):
    print(f"Input file {INPUT_FILE} exists")
else:
    print(f"Error: File {INPUT_FILE} does not exist")
    print(f"Please check the spelling and that you have uploaded it correctly")
    sys.exit(1)

DNAAPLER_OUT_DIR = 'output_dnaapler'  #@param {type:"string"}
DNAAPLER_PREFIX = 'dnaapler'  #@param {type:"string"}
EVALUE = '1e-10'  #@param {type:"string"}
allowed_databases = ['all', 'dnaa', 'repa', 'terl', 'dnaa,repa', 'repa,terl']


FORCE = True  #@param {type:"boolean"}


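# Construct the command
# shlex.quote protects values that contain spaces or shell metacharacters
import shlex
command = f"dnaapler all -i {shlex.quote(INPUT_FILE)} -t 4 -o {shlex.quote(DNAAPLER_OUT_DIR)} -p {shlex.quote(DNAAPLER_PREFIX)} -e {shlex.quote(EVALUE)}"

# Append -f to overwrite an existing output directory
if FORCE:
  command = f"{command} -f"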



# Execute the command
try:
    print("Running dnaapler")
    subprocess.run(command, shell=True, check=True)
    print("dnaapler completed successfully.")
    print(f"Your output is in {DNAAPLER_OUT_DIR}.")
    print(f"Zipping the output directory so you can download it all in one go.")

    zip_filename = f"{DNAAPLER_OUT_DIR}.zip"

    # Zip the contents of the output directory
    with zipfile.ZipFile(zip_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
        for root, dirs, files in os.walk(DNAAPLER_OUT_DIR):
            for file in files:
                zipf.write(os.path.join(root, file), os.path.relpath(os.path.join(root, file), DNAAPLER_OUT_DIR))
    print(f"Output directory has been zipped to {zip_filename}")


except subprocess.CalledProcessError as e:
    print(f"Error occurred: {e}")








Input file NC_007458.fasta exists
Running dnaapler
dnaapler completed successfully.
Your output is in output_dnaapler.
Zipping the output directory so you can download it all in one go.
Output directory has been zipped to output_dnaapler.zip
CPU times: user 50 ms, sys: 47 µs, total: 50.1 ms
Wall time: 5.49 s
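
The two main steps of Cell 2, building the `dnaapler all` call and zipping the output directory, can also be written as small reusable functions. The sketch below is illustrative only and is not part of the original notebook: the function names `build_dnaapler_command` and `zip_directory` are hypothetical, the file name `my genome.fasta` is made up, and the flags (`-i`, `-t`, `-o`, `-p`, `-e`, `-f`) are the ones already used in the cell above. Passing the command as a list, rather than as a single string with `shell=True`, means file names containing spaces need no quoting.

```python
import os
import zipfile


def build_dnaapler_command(input_file, out_dir, prefix, evalue, threads=4, force=False):
    """Build the `dnaapler all` call as an argument list (no shell involved)."""
    command = [
        "dnaapler", "all",
        "-i", input_file,
        "-t", str(threads),
        "-o", out_dir,
        "-p", prefix,
        "-e", str(evalue),
    ]
    if force:
        # -f overwrites an existing output directory
        command.append("-f")
    return command


def zip_directory(directory, zip_filename):
    """Zip every file under `directory`, storing paths relative to it."""
    with zipfile.ZipFile(zip_filename, "w", zipfile.ZIP_DEFLATED) as zipf:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                full_path = os.path.join(root, name)
                zipf.write(full_path, os.path.relpath(full_path, directory))
    return zip_filename


# "my genome.fasta" is a hypothetical file name used only for illustration.
command = build_dnaapler_command(
    "my genome.fasta", "output_dnaapler", "dnaapler", "1e-10", force=True
)
print(command)
# To run it, pass the list directly: subprocess.run(command, check=True)
```

As in the original cell, the zip file should be written outside the directory being zipped, otherwise the archive would be walked while it is still being written.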
