A Python script that automates some of the repetitive steps of phage genome assembly and quality control.
Open the seafaculty account on the 2017 (or later) SEA VM, and make sure you are connected to the internet.
Open a terminal to the home directory and enter the following commands, one at a time, inputting passwords and selecting default values where appropriate.
sudo apt-get install python-dev
sudo pip install biopython
git clone https://github.com/Danos2000/phageAssembler
sudo cp ~/phageAssembler/phageAssembler.py /usr/local/bin/
sudo chmod +x /usr/local/bin/phageAssembler.py
To check that everything worked, type the following command. You should get an "error: too few arguments" message, which means you're ready to assemble and can skip ahead to Usage.
phageAssembler.py
This script uses Newbler (aka GS De Novo Assembler), local BLAST from NCBI, and AceUtil to perform various steps of the assembly/QC process. All of these are already installed on the 2017 Science Education Alliance Virtual Machine (SEA VM), maintained and distributed by the University of Pittsburgh and the Howard Hughes Medical Institute. If you are in the SEA-PHAGES program, you have access to download the 2017 (or later) SEA VM here. If you are not a member of the SEA-PHAGES program, but are interested in using this script for academic purposes, contact us to request help.
In addition to what's already included in the SEA VM, you'll need Biopython. To get it, open the seafaculty account on the SEA VM, open a Terminal, and type or paste the following commands. Enter the seafaculty password if prompted, and press Enter to accept the default options if asked.
sudo apt-get install python-dev
sudo pip install biopython
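To confirm that Biopython installed correctly, a quick import check can help. The following is a minimal sketch and not part of phageAssembler itself; the helper name `module_available` is hypothetical. It relies on the fact that Biopython is imported under the package name `Bio`.

```python
import importlib


def module_available(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Biopython is imported under the package name "Bio".
    if module_available("Bio"):
        print("Biopython is installed.")
    else:
        print("Biopython not found; try the install commands above again.")
```

Save it to a file and run it with the same Python interpreter you plan to use for the assembler, since a package installed for one interpreter is not necessarily visible to another.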
Within the 2017 SEA Ubuntu Virtual Machine, open a terminal. You can do this by clicking the terminal icon in the Launch Bar, or by clicking on the top item in the Launch Bar and searching for "Terminal".
In the terminal type/paste the following command.
git clone https://github.com/Danos2000/phageAssembler
The code will be copied to a new folder called phageAssembler in your home directory.
This step is optional, but if you'd like to launch the program easily from any directory in the future, we recommend running the following command in your terminal.
sudo cp ~/phageAssembler/phageAssembler.py /usr/local/bin/
To check if everything worked, try running the program by typing the following command.
phageAssembler.py
If it works, you should get a "usage" message that ends in an error for having too few arguments. All good, you just gave it no input!
If that doesn't work, the copy step may have failed. Try the following command instead.
python2.7 ~/phageAssembler/phageAssembler.py
Again, it should result in an error for too few arguments.
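If it is unclear whether the copy step succeeded, one way to check is to look for the script on your PATH. This is only an illustrative sketch; the function name `find_on_path` is hypothetical, and it treats a file as runnable only if it exists and has its executable bit set.

```python
import os


def find_on_path(program, search_path=None):
    """Return the full path of an executable found on PATH, or None."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        # The file must exist and carry the executable permission bit.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    location = find_on_path("phageAssembler.py")
    if location:
        print("Found: " + location)
    else:
        print("phageAssembler.py is not on your PATH; run it by its full path instead.")
```

If the file is present in /usr/local/bin/ but this check still reports it missing, the executable bit may not be set, in which case the `sudo chmod +x /usr/local/bin/phageAssembler.py` command from the quick-start steps may help.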
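Open a terminal and navigate to the directory that contains your reads file (for example, the Desktop). Then run the script on the FASTQ file, as in this example: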
phageAssembler.py -n 80000 MyPhageGenome_AllReads.fastq
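Before running an assembly, it can be useful to know how many reads a FASTQ file holds. The sketch below is a hypothetical helper, not part of phageAssembler; it assumes the common four-lines-per-record FASTQ layout (no line-wrapped sequences) and will give a wrong count for files that do not follow it.

```python
import sys


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming exactly four lines per record."""
    line_count = 0
    with open(path) as handle:
        for _ in handle:
            line_count += 1
    # Each read occupies four lines: header, sequence, separator, qualities.
    return line_count // 4


if __name__ == "__main__" and len(sys.argv) > 1:
    print("%s contains %d reads" % (sys.argv[1], count_fastq_reads(sys.argv[1])))
```

Saved under a name of your choosing (say, a hypothetical `count_reads.py`), it could be run as `python count_reads.py MyPhageGenome_AllReads.fastq`.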
Be warned that this is just a hacked-together script that may or may not work. Some brief admonitions:
- Do not try to run this script from within a shared folder on your VM. It will likely fail as it won't have proper permissions to write in the directory.
- Newbler assembly should take only a few minutes for 50000 reads, depending on your system. If you use a very large number of reads, the assembly may grind to a halt.
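As a rough pre-flight check for the first admonition, you can test whether the current directory is writable before starting. This is a minimal sketch with a hypothetical function name, `can_write_here`; it reports what the operating system says about permissions, which may not capture every quirk of VM shared folders.

```python
import os


def can_write_here(directory="."):
    """Return True if the current user may create files in the directory."""
    # Creating files needs both write and search (execute) permission.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


if __name__ == "__main__":
    if can_write_here():
        print("This directory is writable.")
    else:
        print("Cannot write here; move to a directory such as the Desktop.")
```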
- Dan Russell - PhagesDB, seaphages.org, and the University of Pittsburgh
This project is licensed under the MIT License - see the LICENSE.md file for details
- Thanks to Becky Garlena for testing and troubleshooting.
- Thanks to Steve Cresawn, Charlie Bowman, and Chris Shaffer for help developing the SEA VM.