#### MUMmer pipeline
Commands for running promer/nucmer for each _Drosophila_ species

**Versions**  
MUMmer: 4.0.0beta2  
Python: 3.7.3  
gnuplot: 4.6.0  
Ghostscript: 9.22


Create a Muller element reference for _D. melanogaster_  
Reference: /projects/genetics/ellison_lab/genomes/dna/dmel-all-chromosome-r6.22.fasta

After running extract_scaff_rename_muller:  
dmel-all-chromosome-r6.22_extract_ABCDEF_renamed.fasta **(this file is used for downstream analysis)**
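
The renaming step for the _D. melanogaster_ reference can be sketched roughly as below. The arm-to-element assignment (X = A, 2L = B, 2R = C, 3L = D, 3R = E, 4 = F) is the standard one for _D. melanogaster_. The helper name `rename_dmel_arms`, the toy records, and the assumption that the FASTA record IDs are the bare arm names are illustrative only and are not taken from the script used here.

```python
# Standard D. melanogaster chromosome arm -> Muller element assignment.
DMEL_ARM_TO_MULLER = {
    "X": "Muller_A",
    "2L": "Muller_B",
    "2R": "Muller_C",
    "3L": "Muller_D",
    "3R": "Muller_E",
    "4": "Muller_F",
}


def rename_dmel_arms(records):
    """Keep only the six major arms and return (name, seq) pairs in A-F order.

    `records` is an iterable of (sequence_id, sequence) pairs.
    Hypothetical helper, assuming record IDs are bare arm names such as "2L".
    """
    renamed = {}
    for seq_id, seq in records:
        if seq_id in DMEL_ARM_TO_MULLER:
            renamed[DMEL_ARM_TO_MULLER[seq_id]] = seq
    # Muller_A ... Muller_F sort alphabetically into the desired order.
    return [(name, renamed[name]) for name in sorted(renamed)]


# Toy input standing in for the reference FASTA records (not real sequence).
toy = [("2L", "ACGT"), ("X", "GGCC"), ("Y", "TTTT"), ("4", "AACC")]
print(rename_dmel_arms(toy))
```

Anything that is not one of the six arms (here the toy "Y" record) is dropped, which matches the "extract_ABCDEF" part of the output file name.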

**_D. ananassae_**  
Reference: dana_pacbio_GCA_003285975.fa  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dana_new/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_3 |
| Muller_B | HiC_scaffold_5 |
| Muller_C | HiC_scaffold_4 |
| Muller_D | HiC_scaffold_7 |
| Muller_E | HiC_scaffold_6 |
| Muller_F | HiC_scaffold_2 |

In [None]:
python extract_scaff_rename_muller_plus_other_scaffs.py -i dana_pacbio_GCA_003285975.FINAL.fasta -a HiC_scaffold_3 --rcMB HiC_scaffold_5 --rcMC HiC_scaffold_4 --rcMD HiC_scaffold_7 -e HiC_scaffold_6 -f HiC_scaffold_2 --o2 dana_muller_o.fasta -s dana_other_scaffolds.fasta

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dana_new/slurm-%A_%a.out

source activate MUMMER
promer --mum -b 10 -l 50 --coords -p promer_mum_dana_rcMCMD /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dana_muller_o.fasta

mummerplot -p promer_mum_dana_rcMCMD -t postscript promer_mum_dana_rcMCMD.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot promer_mum_dana_rcMCMD.gp
ps2pdf promer_mum_dana_rcMCMD.ps
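
Every species below repeats the same four steps (align, mummerplot, gnuplot, ps2pdf), and the file names all derive from the `-p` prefix. A minimal sketch of generating the commands from one prefix, so the `.delta`, `.gp` and `.ps` names cannot drift apart, is shown here. The function name `alignment_commands` is hypothetical, only options already used in this notebook appear, and the environment setup (`source activate`, `GNUPLOT_PS_DIR`) is left out.

```python
import shlex


def alignment_commands(aligner, prefix, reference, query):
    """Return the shell commands for one alignment and its dot plot.

    Only options that already appear in this notebook are used.
    """
    if aligner == "promer":
        align = ["promer", "--mum", "-b", "10", "-l", "50", "--coords"]
    elif aligner == "nucmer":
        align = ["nucmer", "--mum", "-t", "3", "-c", "400", "-L", "400", "-b", "100"]
    else:
        raise ValueError("aligner must be 'promer' or 'nucmer'")
    align += ["-p", prefix, reference, query]
    steps = [
        align,
        # mummerplot reads <prefix>.delta and writes <prefix>.gp
        ["mummerplot", "-p", prefix, "-t", "postscript", prefix + ".delta"],
        ["gnuplot", prefix + ".gp"],
        ["ps2pdf", prefix + ".ps"],
    ]
    return [" ".join(shlex.quote(arg) for arg in step) for step in steps]


for command in alignment_commands(
    "promer",
    "promer_mum_dana_rcMCMD",
    "dmel-all-chromosome-r6.22_extract_ABCDEF.fasta",
    "dana_muller_o.fasta",
):
    print(command)
```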

**_D. biarmipes_**  
Reference: Dbia_r2_illumina_patched.fa  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_dbia/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_10 |
| Muller_B | HiC_scaffold_7 |
| Muller_C | HiC_scaffold_6 |
| Muller_D | HiC_scaffold_3 |
| Muller_E | HiC_scaffold_2 |
| Muller_F | HiC_scaffold_9 |

In [None]:
python extract_scaff_rename_muller.py -i Dbia_r2_illumina_patched.scaffolds.fasta -o dbia_muller_o.fasta -t dbia_muller_o.fasta -a HiC_scaffold_10 -b HiC_scaffold_7 -c HiC_scaffold_6 -d HiC_scaffold_3 -e HiC_scaffold_2 -f HiC_scaffold_9 
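
The scaffold table and the orientation choices can be kept in one structure and turned into the script's arguments, which avoids mixing up `-a` ... `-f` with `--rcMA` ... `--rcMF` when retyping them. This is only a sketch: the function name `muller_arguments` and the dictionary layout are hypothetical, while the option names are the ones listed in the script's help text.

```python
# Short option for each Muller element, and the long option that also
# reverse complements the scaffold (both taken from the script's help text).
FORWARD_FLAG = {"A": "-a", "B": "-b", "C": "-c", "D": "-d", "E": "-e", "F": "-f"}
REVCOMP_FLAG = {element: "--rcM" + element for element in "ABCDEF"}


def muller_arguments(assignment):
    """Build scaffold arguments from {element: (scaffold, reverse_complement)}."""
    args = []
    for element in "ABCDEF":
        scaffold, reverse = assignment[element]
        flag = REVCOMP_FLAG[element] if reverse else FORWARD_FLAG[element]
        args += [flag, scaffold]
    return args


# D. biarmipes assignment from the table above; nothing is reverse complemented.
dbia = {
    "A": ("HiC_scaffold_10", False),
    "B": ("HiC_scaffold_7", False),
    "C": ("HiC_scaffold_6", False),
    "D": ("HiC_scaffold_3", False),
    "E": ("HiC_scaffold_2", False),
    "F": ("HiC_scaffold_9", False),
}
print(" ".join(muller_arguments(dbia)))
```

Flipping a scaffold then means changing one `False` to `True` rather than editing the command by hand.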

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dana_new2/slurm-%A_%a.out

source activate MUMMER
promer --mum -b 10 -l 50 --coords -p promer_mum_dana_rcMBMCMD /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dana_muller_o.fasta

mummerplot -p promer_mum_dana_rcMBMCMD -t postscript promer_mum_dana_rcMBMCMD.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot promer_mum_dana_rcMBMCMD.gp
ps2pdf promer_mum_dana_rcMBMCMD.ps

**_D. eugracilis_**   
Reference: Deug_r2_illumina_patched.fa  
Final files: /projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_deug

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_4 |
| Muller_B | HiC_scaffold_6 |
| Muller_C | HiC_scaffold_7 |
| Muller_D | HiC_scaffold_13 |
| Muller_E | HiC_scaffold_12 |
| Muller_F | HiC_scaffold_10 |

In [None]:
python extract_scaff_rename_muller.py -i Deug_r2_illumina_patched.scaffolds.fasta -o deug_muller_o.fasta -t deug_muller_o.fasta -a HiC_scaffold_4 -b HiC_scaffold_6 -c HiC_scaffold_7 -d HiC_scaffold_13 -e HiC_scaffold_12 -f HiC_scaffold_10 

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_deug/slurm-%A_%a.out

source activate MUMMER
nucmer --mum -t 3 -c 400 -L 400 -b 100 -p nucmer_mum_deug /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta deug_muller_ordered.fasta

mummerplot -p nucmer_mum_deug -t postscript nucmer_mum_deug.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot nucmer_mum_deug.gp
ps2pdf nucmer_mum_deug.ps
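
Besides looking at the dot plot, a rough numeric cross-check of which scaffolds might need a `--rcM*` flag is to total the aligned bases per strand from the `.delta` file. The sketch below assumes the usual delta layout: a `>ref qry reflen qrylen` header per sequence pair, then seven-field alignment lines beginning with reference start, reference end, query start, query end, and a query start larger than the query end for reverse-strand matches. The function name `strand_summary` and the toy lines are hypothetical, and the layout should be checked against a real `.delta` file before relying on this.

```python
def strand_summary(delta_lines):
    """Sum aligned query bases per (reference, query, strand).

    Assumes delta-format lines: '>' headers name the sequence pair and
    alignment headers have exactly seven whitespace-separated fields.
    """
    totals = {}
    pair = None
    for line in delta_lines:
        fields = line.split()
        if line.startswith(">"):
            pair = (fields[0][1:], fields[1])
        elif pair is not None and len(fields) == 7:
            q_start, q_end = int(fields[2]), int(fields[3])
            strand = "+" if q_start <= q_end else "-"
            key = pair + (strand,)
            totals[key] = totals.get(key, 0) + abs(q_end - q_start) + 1
        # Single-number lines (indel offsets) and the two file header
        # lines fall through and are ignored.
    return totals


# Toy delta content (made-up numbers, not output from a real run).
toy_delta = """ref.fasta qry.fasta
NUCMER
>Muller_A Muller_A 1000 1200
1 100 900 801 2 2 0
0
201 300 700 601 1 1 0
0
>Muller_B Muller_B 800 900
1 50 1 50 0 0 0
0
""".splitlines()

for key, bases in sorted(strand_summary(toy_delta).items()):
    print(key, bases)
```

A pair whose aligned bases are mostly on the "-" strand is a candidate for reverse complementing, but the dot plot remains the thing to trust.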

**_D. ficusphila_**  

Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_dfic</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_6 |
| Muller_B | HiC_scaffold_1 |
| Muller_C | HiC_scaffold_13 |
| Muller_D | HiC_scaffold_5 |
| Muller_E | HiC_scaffold_11 |
| Muller_F | HiC_scaffold_7 |

In [None]:
python extract_scaff_rename_muller.py -i Dfic_r2_illumina.reformat.scaffolds.fasta -o dfic_muller_o.fasta -t dfic_muller_o.fasta -a HiC_scaffold_6 -b HiC_scaffold_1 --rcMC HiC_scaffold_13 --rcMD HiC_scaffold_5 -e HiC_scaffold_11 -f HiC_scaffold_7 

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_dfic/slurm-%A_%a.out

source activate MUMMER
promer --mum -b 10 -l 50 --coords -p promer_mum_dfic_rcMCMD /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dfic_muller_o.fasta

mummerplot -p promer_mum_dfic_rcMCMD -t postscript promer_mum_dfic_rcMCMD.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot promer_mum_dfic_rcMCMD.gp
ps2pdf promer_mum_dfic_rcMCMD.ps

**_D. takahashii_**  
Reference: Dtak_r2_illumina_patched.fa  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_dtak/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_2 |
| Muller_B | HiC_scaffold_6 |
| Muller_C | HiC_scaffold_7 |
| Muller_D | HiC_scaffold_10 |
| Muller_E | HiC_scaffold_4 |
| Muller_F | HiC_scaffold_8 |

In [None]:
python extract_scaff_rename_muller.py -i Dtak_r2_illumina.reformat.scaffolds.fasta -o dtak_muller_o.fasta -t dtak_muller_o.fasta --rcMA HiC_scaffold_2 --rcMB HiC_scaffold_6 --rcMC HiC_scaffold_7 --rcMD HiC_scaffold_10 --rcME HiC_scaffold_4 --rcMF HiC_scaffold_8 

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_dtak/slurm-%A_%a.out

source activate MUMMER
promer --mum -b 10 -l 50 --coords -p promer_mum_dtak /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dtak_muller_o.fasta

mummerplot -p promer_mum_dtak -t postscript promer_mum_dtak.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot promer_mum_dtak.gp
ps2pdf promer_mum_dtak.ps


**_D. yakuba_**  
Reference: Dyak.pass.minimap2.racon.x3.pilon.x3.fasta  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_dyak/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_1 |
| Muller_B | HiC_scaffold_5 |
| Muller_C | HiC_scaffold_6 |
| Muller_D | HiC_scaffold_4 |
| Muller_E | HiC_scaffold_3 |
| Muller_F | HiC_scaffold_2 |

In [None]:
python extract_scaff_rename_muller.py -i Dyak.pass.minimap2.racon.x3.pilon.x3.FINAL.fasta -o dyak_muller_o.fasta --o2 dyak_muller_o.fasta -a HiC_scaffold_1 -b HiC_scaffold_5 -c HiC_scaffold_6 --rcMD HiC_scaffold_4 --rcME HiC_scaffold_3 -f HiC_scaffold_2

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_dyak/slurm-%A_%a.out

nucmer --mum -t 3 -c 400 -L 400 -b 100 -p nucmer_mum_dyak_rcMCMDME /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dyak_muller_o.fasta

mummerplot -p nucmer_mum_dyak_rcMCMDME -t postscript nucmer_mum_dyak_rcMCMDME.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot nucmer_mum_dyak_rcMCMDME.gp
ps2pdf nucmer_mum_dyak_rcMCMDME.ps

**_D. elegans_**  
Reference: Dele_r2_illumina_patched.fa   
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_dele/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_4 |
| Muller_B | HiC_scaffold_11 |
| Muller_C | HiC_scaffold_3 |
| Muller_D | HiC_scaffold_1 |
| Muller_E | HiC_scaffold_10 |
| Muller_F | HiC_scaffold_6 |

In [None]:
python extract_scaff_rename_muller.py -i Dele_r2_illumina_patched.scaffolds.fasta -o dele_renamed.fasta -t dele_muller_o.fasta -a HiC_scaffold_4 --rcMB HiC_scaffold_11 -c HiC_scaffold_3 -d HiC_scaffold_1 -e HiC_scaffold_10 --rcMF HiC_scaffold_6 

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_dele/slurm-%A_%a.out

source activate MUMMER
promer --mum -b 10 -l 50 --coords -p promer_mum_dele_rcMBMF /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dele_muller_o.fasta

mummerplot -p promer_mum_dele_rcMBMF -t postscript promer_mum_dele_rcMBMF.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot promer_mum_dele_rcMBMF.gp
ps2pdf promer_mum_dele_rcMBMF.ps


**_D. triauraria_**  
Reference: tria_20181107_r2_illumina.reformat.scaffolds.fasta  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dmel_dtria/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_5 |
| Muller_B | HiC_scaffold_1 |
| Muller_C | HiC_scaffold_2 |
| Muller_D | HiC_scaffold_7 |
| Muller_E | HiC_scaffold_8 |
| Muller_F | HiC_scaffold_6 |

In [None]:
python extract_scaff_rename_muller.py -i tria_20181107_r2_illumina.reformat.scaffolds.fasta -o dtria_muller_o.fasta -t dtria_muller_o.fasta --rcMA HiC_scaffold_5 --rcMB HiC_scaffold_1 -c HiC_scaffold_2 -d HiC_scaffold_7 -e HiC_scaffold_8 -f HiC_scaffold_6 


In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_dtria/slurm-%A_%a.out

source activate MUMMER
promer --mum -b 10 -l 50 --coords -p promer_mum_dtria_rcMAMB /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dtria_muller_o.fasta

mummerplot -p promer_mum_dtria_rcMAMB -t postscript promer_mum_dtria_rcMAMB.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot promer_mum_dtria_rcMAMB.gp
ps2pdf promer_mum_dtria_rcMAMB.ps

**_D. erecta_**  
Reference: dere_pacbio_GCA_003286155.scaffolds.fasta  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dere_new/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_2 |
| Muller_B | HiC_scaffold_7 |
| Muller_C | HiC_scaffold_6 |
| Muller_D | HiC_scaffold_3 |
| Muller_E | HiC_scaffold_4 |
| Muller_F | HiC_scaffold_5 |

In [None]:
python extract_scaff_rename_muller.py -i dere_pacbio_GCA_003286155.FINAL.fasta -o dere_muller_o.fasta --o2 dere_muller_o.fasta --rcMA HiC_scaffold_2 -b HiC_scaffold_7 -c HiC_scaffold_6 --rcMD HiC_scaffold_3 --rcME HiC_scaffold_4 --rcMF HiC_scaffold_5

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dere_new/slurm-%A_%a.out

source activate MUMMER
nucmer --mum -t 3 -c 400 -L 400 -b 100 -p nucmer_mum_dere /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dere_muller_o.fasta

mummerplot -p nucmer_mum_dere -t postscript nucmer_mum_dere.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot nucmer_mum_dere.gp
ps2pdf nucmer_mum_dere.ps

**_D. simulans_**  
Reference: Dsim.pass.minimap2.racon.x3.pilon.x3.fasta  
Final files: <tt>/projects/genetics/ellison_lab/nicole/mummer_projects/species/dsim_new/</tt>

| Muller element | Scaffold |
| --- | --- |
| Muller_A | HiC_scaffold_1 |
| Muller_B | HiC_scaffold_7 |
| Muller_C | HiC_scaffold_6 |
| Muller_D | HiC_scaffold_2 |
| Muller_E | HiC_scaffold_4 |
| Muller_F | HiC_scaffold_5 |

In [None]:
python extract_scaff_rename_muller.py -i Dsim.pass.minimap2.racon.x3.pilon.x3.FINAL.fasta -o dsim_muller_o.fasta --o2 dsim_muller_o.fasta --rcMA HiC_scaffold_1 --rcMB HiC_scaffold_7 --rcMC HiC_scaffold_6 -d HiC_scaffold_2 --rcME HiC_scaffold_4 -f HiC_scaffold_5 

In [None]:
#!/bin/bash

#SBATCH --partition=main    # Partition (job queue)
#SBATCH --job-name=mummer         # Assign an 8-character name to your job, no spaces
#SBATCH --nodes=1                # Number of compute nodes
#SBATCH --ntasks=1               # Processes (usually = cores) on each node
#SBATCH --cpus-per-task=28       # Threads per process (or per core)
#SBATCH --export=ALL             # Export your current environment settings to the job environment
#SBATCH --time=12:00:00
#SBATCH --mem=20G
#SBATCH --output=/scratch/nt365/mummer/mummer_projects/dmel_dsim_rcMAMBMCME/slurm-%A_%a.out

source activate MUMMER
nucmer --mum -t 3 -c 400 -L 400 -b 100 -p nucmer_mum_dsim_rcMAMBMCME /home/nt365/projects/mummer/script/dmel-all-chromosome-r6.22_extract_ABCDEF.fasta dsim_muller_o.fasta

mummerplot -p nucmer_mum_dsim_rcMAMBMCME -t postscript nucmer_mum_dsim_rcMAMBMCME.delta

export GNUPLOT_PS_DIR=/home/nt365/pkg/gnuplot-4.6.0/term/PostScript 
gnuplot nucmer_mum_dsim_rcMAMBMCME.gp
ps2pdf nucmer_mum_dsim_rcMAMBMCME.ps

4.29.2020  
Remade the mummer projects species folder, using the references named in "-i" from the following locations:  
Dana, Dere, Dyak, Dsim: from my 3D-DNA output. Dana ref from /projects/genetics/ellison_lab/nicole/3D-DNA/3D-DNA_dana10/new_assembly  
Others: from the corresponding folder within /projects/genetics/ellison_lab/nicole/3D-DNA/  
Tiru-assembled references: from the corresponding species folder in /projects/genetics/ellison_lab/tiru_results
 

#### Python script to extract scaffolds and rename as Muller element. Can reverse complement scaffolds as well if necessary.

<tt>**/projects/genetics/ellison_lab/nicole/mummer_projects/script/**</tt>

Instructions for "extract_scaff_rename_muller.py"  
The script renames HiC scaffolds from _Drosophila_ species to the appropriate Muller elements and reverse complements a scaffold if necessary.

Be sure to give two output files: the first receives the Muller elements in the order they are found, and the second receives the Muller elements reordered as A, B, C, D, E, F.
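
The difference between the two outputs can be illustrated with a small stand-alone sketch that uses plain strings instead of Biopython records. The names `reverse_complement` and `rename_scaffolds` and the toy sequences are hypothetical; the logic mirrors the script only in outline (forward scaffolds are collected first, then the reverse-complemented ones, and the second output is the same set sorted A to F).

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse complement a DNA string (A, C, G, T, N only in this sketch)."""
    return seq.translate(COMPLEMENT)[::-1]


def rename_scaffolds(records, forward, reverse):
    """Return (found_order, muller_order) lists of (name, sequence).

    `records` is a list of (scaffold_id, sequence) pairs; it is read twice.
    `forward` / `reverse` map scaffold ID -> Muller element letter, and
    scaffolds listed in `reverse` are reverse complemented.
    """
    found = []
    for seq_id, seq in records:
        if seq_id in forward:
            found.append(("Muller_" + forward[seq_id], seq))
    for seq_id, seq in records:
        if seq_id in reverse:
            found.append(("Muller_" + reverse[seq_id], reverse_complement(seq)))
    ordered = sorted(found, key=lambda item: item[0])
    return found, ordered


# Toy scaffolds (not real sequence); HiC_scaffold_9 is not assigned and is dropped.
toy_records = [
    ("HiC_scaffold_2", "AAAC"),
    ("HiC_scaffold_1", "GGTT"),
    ("HiC_scaffold_9", "CCCC"),
]
found, ordered = rename_scaffolds(
    toy_records,
    forward={"HiC_scaffold_2": "B"},
    reverse={"HiC_scaffold_1": "A"},
)
print(found)    # first output: order in which elements were found
print(ordered)  # second output: A-F order
```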

In a bash shell, type:  
python extract_scaff_rename_muller.py -h

Output will be:  
Options:
  -h, --help            show this help message and exit
  -i FASTA              scaffolds.fasta
  -o OUTFILE            output1.fasta
  -t OUTFILE2, --o2=OUTFILE2
                        output2.fasta - this will be the correctly ordered
                        file
  -a MA                 scaffold corresp. to Muller A
  -b MB                 scaffold corresp. to Muller B
  -c MC                 scaffold corresp. to Muller C
  -d MD                 scaffold corresp. to Muller D
  -e ME                 scaffold corresp. to Muller E
  -f MF                 scaffold corresp. to Muller F
  -u MAR, --rcMA=MAR    scaffold corresp. to Muller A - use to reverse
                        complement scaff
  -v MBR, --rcMB=MBR    scaffold corresp. to Muller B - use to reverse
                        complement scaff
  -w MCR, --rcMC=MCR    scaffold corresp. to Muller C - use to reverse
                        complement scaff
  -x MDR, --rcMD=MDR    scaffold corresp. to Muller D - use to reverse
                        complement scaff
  -y MER, --rcME=MER    scaffold corresp. to Muller E - use to reverse
                        complement scaff
  -z MFR, --rcMF=MFR    scaffold corresp. to Muller F - use to reverse
                        complement scaff


After running the script, enter the OUTFILE2 name in the `<OUTFILE2>` slot of the nucmer_slurm.sh script in this folder. Update the prefix for the nucmer output files, the folder for the SLURM output, and the nucmer parameters as necessary.  
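
Filling the slot by hand is easy to get wrong across ten species, so the substitution could be scripted. The sketch below is only an illustration: the contents of nucmer_slurm.sh are not shown in this notebook, so `TEMPLATE` is a made-up stand-in, and every placeholder other than `<OUTFILE2>` (`<PREFIX>`, `<REFERENCE>`, `<SLURM_DIR>`) is an assumption, as is the function name `fill_template`.

```python
import re

# Hypothetical stand-in for nucmer_slurm.sh; the real file is not shown here.
TEMPLATE = (
    "#SBATCH --output=<SLURM_DIR>/slurm-%A_%a.out\n"
    "nucmer --mum -t 3 -c 400 -L 400 -b 100 -p <PREFIX> <REFERENCE> <OUTFILE2>\n"
)


def fill_template(template, values):
    """Replace each <NAME> placeholder; raise if any placeholder is left unfilled."""
    text = template
    for name, value in values.items():
        text = text.replace("<" + name + ">", value)
    leftover = re.findall(r"<[A-Z0-9_]+>", text)
    if leftover:
        raise ValueError("unfilled placeholders: " + ", ".join(sorted(set(leftover))))
    return text


print(fill_template(TEMPLATE, {
    "SLURM_DIR": "/scratch/nt365/mummer/mummer_projects/dmel_dsim_rcMAMBMCME",
    "PREFIX": "nucmer_mum_dsim_rcMAMBMCME",
    "REFERENCE": "dmel-all-chromosome-r6.22_extract_ABCDEF.fasta",
    "OUTFILE2": "dsim_muller_o.fasta",
}))
```

Raising on leftover placeholders is a deliberate choice here: a job submitted with a literal `<OUTFILE2>` in it would otherwise fail only after it reaches the queue.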


In [None]:
#/projects/genetics/ellison_lab/nicole/mummer_projects/script/
from optparse import OptionParser
from Bio import SeqIO

parser = OptionParser()
parser.add_option("-i", dest="fasta", type="string", action="store", help = "scaffolds.fasta")
parser.add_option("-o", dest="outfile", type = "string", action = "store", help = "output1.fasta") 
parser.add_option("-t","--o2", dest="outfile2", type = "string", action = "store", help = "output2.fasta - this will be the correctly ordered file")
parser.add_option("-a", dest="MA", type = "string", help = "scaffold corresp. to Muller A")
parser.add_option("-b", dest="MB", type = "string", help = "scaffold corresp. to Muller B")
parser.add_option("-c", dest="MC", type = "string", help = "scaffold corresp. to Muller C")
parser.add_option("-d", dest="MD", type = "string", help = "scaffold corresp. to Muller D")
parser.add_option("-e", dest="ME", type = "string", help = "scaffold corresp. to Muller E")
parser.add_option("-f", dest="MF", type = "string", action = "store", help = "scaffold corresp. to Muller F")
parser.add_option("-u","--rcMA", dest="MAR", type = "string", action = "store", help = "scaffold corresp. to Muller A - use to reverse complement scaff")
parser.add_option("-v", "--rcMB", dest="MBR", type = "string", action = "store", help = "scaffold corresp. to Muller B - use to reverse complement scaff")
parser.add_option("-w", "--rcMC", dest="MCR", type = "string", action = "store", help = "scaffold corresp. to Muller C - use to reverse complement scaff")
parser.add_option("-x", "--rcMD", dest="MDR", type = "string", action = "store", help = "scaffold corresp. to Muller D - use to reverse complement scaff")
parser.add_option("-y", "--rcME", dest="MER", type = "string", action = "store", help = "scaffold corresp. to Muller E - use to reverse complement scaff")
parser.add_option("-z", "--rcMF", dest="MFR", type = "string", action = "store", help = "scaffold corresp. to Muller F - use to reverse complement scaff")

(options, args) = parser.parse_args()
f = open(options.outfile, 'w')

# First pass: write scaffolds kept in their original orientation,
# renamed to the matching Muller element, in the order they are found.
for record in SeqIO.parse(options.fasta, "fasta"):
    if record.id == options.MA:
        f.write(">" + "Muller_A" + '\n' + str(record.seq) + '\n')
    elif record.id == options.MB:
        f.write(">" + "Muller_B" + '\n' + str(record.seq) + '\n')
    elif record.id == options.MC:
        f.write(">" + "Muller_C" + '\n' + str(record.seq) + '\n')
    elif record.id == options.MD:
        f.write(">" + "Muller_D" + '\n' + str(record.seq) + '\n')
    elif record.id == options.ME:
        f.write(">" + "Muller_E" + '\n' + str(record.seq) + '\n')
    elif record.id == options.MF:
        f.write(">" + "Muller_F" + '\n' + str(record.seq) + '\n')

# Second pass: write scaffolds that must be reverse complemented.
for record in SeqIO.parse(options.fasta, "fasta"):
    if record.id == options.MAR:
        f.write(">" + "Muller_A" + '\n' + str(record.seq.reverse_complement()) + '\n')
    elif record.id == options.MBR:
        f.write(">" + "Muller_B" + '\n' + str(record.seq.reverse_complement()) + '\n')
    elif record.id == options.MCR:
        f.write(">" + "Muller_C" + '\n' + str(record.seq.reverse_complement()) + '\n')
    elif record.id == options.MDR:
        f.write(">" + "Muller_D" + '\n' + str(record.seq.reverse_complement()) + '\n')
    elif record.id == options.MER:
        f.write(">" + "Muller_E" + '\n' + str(record.seq.reverse_complement()) + '\n')
    elif record.id == options.MFR:
        f.write(">" + "Muller_F" + '\n' + str(record.seq.reverse_complement()) + '\n')

f.close()

# Read the first output back and write the elements in A-F order.
r = open(options.outfile2, 'w')

for record in SeqIO.parse(options.outfile, "fasta"):
    if record.id == "Muller_A":
        A = record
    if record.id == "Muller_B":
        B = record
    if record.id == "Muller_C":
        C = record
    if record.id == "Muller_D":
        D = record
    if record.id == "Muller_E":
        E = record
    if record.id == "Muller_F":
        F = record

r.write(">" + str(A.id) + '\n' + str(A.seq) + '\n')
r.write(">" + str(B.id) + '\n' + str(B.seq) + '\n')
r.write(">" + str(C.id) + '\n' + str(C.seq) + '\n')
r.write(">" + str(D.id) + '\n' + str(D.seq) + '\n')
r.write(">" + str(E.id) + '\n' + str(E.seq) + '\n')
r.write(">" + str(F.id) + '\n' + str(F.seq) + '\n')
r.close()
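
As written, the script expects all six Muller elements to be matched (the final writes use A through F unconditionally), and it opens the second output for writing before reading the first one back, so the two paths appear to need to differ. A small pre-flight check along those lines is sketched below. The function name `check_options` and its argument layout are hypothetical and are not part of the script.

```python
def check_options(outfile, outfile2, forward, reverse):
    """Return a list of problems with a planned run (empty list: none found).

    `forward` / `reverse` map Muller element letter -> scaffold ID or None,
    mirroring the -a ... -f and --rcMA ... --rcMF options.
    """
    problems = []
    if outfile == outfile2:
        problems.append(
            "-o and -t name the same file; the second output is opened "
            "for writing before the first is read back"
        )
    for element in "ABCDEF":
        given = [s for s in (forward.get(element), reverse.get(element)) if s]
        if not given:
            problems.append("Muller_%s has no scaffold assigned" % element)
        elif len(given) == 2:
            problems.append(
                "Muller_%s is given both forward and reverse complemented" % element
            )
    return problems


# Example with a deliberately incomplete assignment (Muller_F missing).
example_forward = {"A": "HiC_scaffold_4", "C": "HiC_scaffold_3",
                   "D": "HiC_scaffold_1", "E": "HiC_scaffold_10"}
example_reverse = {"B": "HiC_scaffold_11"}
for problem in check_options("renamed.fasta", "muller_o.fasta",
                             example_forward, example_reverse):
    print(problem)
```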