Looping a script #325

peflanag · 2019-10-29T10:22:45Z

Hi,

I have quite a few files I want to run through Snippy. I wrote a script, made it executable with chmod u+x, and then ran ./MyFile

However, it failed with "unknown option: mask". I have copied my script below. It would be great if someone could tell me what I'm doing wrong.

Cheers.

#!/bin/bash

sampleLoc=/home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Fastq
sampleName=( "IMRL28" "IMRL29" "IMRL30" "IMRL137" "IMRL138" "IMRL39" "IMRL140" "IMRL141" "IMRL142" "MABC143" "IMRL144" "IMRL145" "IMRL146" "IMRL147" "IMRL148" "IMRL149" "IMRL150" "IMRL151" "IMRL152" "IMRL153" "IMRL154" "IMRL156" "IMRL157" "IMRL158" "IMRL159" "IMRL160" "IMRL161" "IMRL162" "IMRL163" "IMRL164" "IMRL165" "IMRL167" "IMRL168" "IMRL169" "IMRL170" "IMRL171" "IMRL172" "IMRL173" "IMRL174" "IMRL175" "IMRL176" "IMRL177" "IMRL178" "IMRL179" "IMRL180" "IMRL181" "IMRL182" "IMRL183" "IMRL184" "IMRL185" "IMRL186" "IMRL187" "IMRL188" "IMRL189" "IMRL190" "IMRL190" "IMRL191" "IMRL192" "IMRL193" "IMRL194" "IMRL194" "IMRL195" "IMRL196" "IMRL197" "IMRL198" "IMRL199" "IMRL200" "IMRL201" "IMRL202" "IMRL203" "IMRL204" "IMRL205" "IMRL206" "IMRL207" "IMRL208" "IMRL209" "IMRL210" "IMRL211" "IMRL212" "IMRL213" "IMRL214" "IMRL215" "IMRL216" "IMRL217" "IMRL219" "IMRL220" "IMRL221" "IMRL222" "IMRL223" "IMRL225" "IMRL226" "IMRL228" "IMRL232" "IMRL233" "IMRL234" "IMRL235" "IMRL236" "IMRL237" "IMRL238" "IMRL239" "IMRL241" "IMRL243" "IMRL244" "IMRL251" "IMRL252" "IMRL253" "IMRL254" )

for sample in "${sampleName[@]}"

do

snippy --outdir /home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Snippy_Output/ --ref /home/linux-biostation/Documents/Peter_F_Work/H37rv_Ref3.fasta --mask /home/linux-biostation/Documents/Peter_F_Work/Mtb_NC_000962.3_mask.bed --R1 $sampleLoc/${sample}R1.fastq.gz --R2 $sampleLoc/${sample}R2.fastq.gz

done

peflanag · 2019-10-29T10:26:31Z

Actually, I just realised I don't add --mask to the snippy command in the first place; I use it after I run snippy. Sorry!

peflanag · 2019-10-29T11:11:59Z

Actually, I need to reopen this. How do I tell snippy to add the output of each sample to the same output folder? It wants to replace the first sample's files with the subsequent ones. Is there any way to tell it, in the script above, to just keep adding?

Cheers

peflanag · 2019-10-29T11:29:19Z

Actually, I figured it out. I use ${sample} as the final part of the path I give to --outdir!
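The fix described here, one output folder per sample, might look roughly like the sketch below. It is only an illustration under stated assumptions: the function name run_snippy_batch, the SNIPPY override variable, and reads named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz are all hypothetical and not part of Snippy. Only the --outdir, --ref, --R1 and --R2 options come from the script above.

```shell
#!/bin/bash
# Sketch only: run snippy once per sample, each into its own output folder.
# Assumptions (not from the thread): the function name, the SNIPPY override
# variable, and reads named <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.

run_snippy_batch() (
    fastq_dir=$1
    out_root=$2
    ref=$3
    shift 3
    for sample in "$@"; do
        r1="$fastq_dir/${sample}_R1.fastq.gz"
        r2="$fastq_dir/${sample}_R2.fastq.gz"
        if [ ! -r "$r1" ] || [ ! -r "$r2" ]; then
            echo "skipping $sample: reads not found or unreadable" >&2
            continue
        fi
        # One directory per sample, so later samples never overwrite earlier ones.
        ${SNIPPY:-snippy} --outdir "$out_root/$sample" --ref "$ref" \
            --R1 "$r1" --R2 "$r2"
    done
)

# Example call (all paths are placeholders):
# run_snippy_batch /path/to/Fastq /path/to/Snippy_Output /path/to/ref.fasta IMRL28 IMRL29
```

Setting SNIPPY=echo before the call prints the arguments that would be used without running anything, which is a cheap way to check the paths first.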

tseemann · 2019-10-29T20:49:02Z

@peflanag have you considered using snippy-multi ?

https://github.com/tseemann/snippy/blob/master/README.md#using-snippy-multi

Also make sure you are using snippy >= 4.4

peflanag · 2019-10-30T07:56:19Z

Hi @tseemann, I haven't, because I already had a .sh file with the sample names in it, so for convenience I didn't want to retype everything in Excel to make the tab file. I'm also unsure how to go about it. The manual says the isolates have to be labelled Isolate1, Isolate1b and so on, but I have over 120 samples, more than there are letters in the alphabet. So do I go from ...Isolate1z to Isolate2, Isolate2b...? In the manual, though, Isolate2 seems to correspond to single-end reads.

I'm basically trying to compare MTBseq and Snippy but can't seem to get either to finish locally or on a college cluster. I'm running snippy 4.4.3.

Cheers,

P

tseemann · 2019-10-30T08:16:38Z

You do not need to name the isolates that way.
The documentation is just an example.
Usually we write scripts to generate the input.tab file automatically.

peflanag · 2019-10-30T08:20:20Z

Oh right. So I just write the "name" and then the "path" in Excel for each sample? I didn't know you could write scripts to generate the input.tab file automatically. Is that something you can share? Or is there somewhere online that explains it? I'm not really tech savvy when it comes to writing scripts. I'm a microbiologist!

Cheers,

P

tseemann · 2019-10-30T08:27:19Z

For your case, you need a 3 column Excel spreadsheet

ID
full path to R1
full path to R2

Save it as 'tab delimited text' (usually .txt extension)
Once in Unix you may have to run dos2unix on it to fix the line endings.
Good luck.

PS. there is no magic script. you could achieve the same by using excel functions probably, as long as the file names are named consistently.
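As a rough illustration of generating such a file with a script rather than Excel, the sketch below builds the three columns (ID, R1 path, R2 path, no header) from a folder of paired reads. It assumes files are named <ID>_<anything>_R1.fastq.gz with a matching _R2.fastq.gz, as in IMRL28_S15_L001_R1.fastq.gz; the function name make_sample_tab is hypothetical.

```shell
#!/bin/bash
# Sketch only: build a 3-column, tab-separated, header-less sample sheet
# (ID, path to R1, path to R2) from a folder of paired FASTQ files.
# Assumptions: make_sample_tab is a hypothetical name, and files are named
# <ID>_<anything>_R1.fastq.gz with a matching ..._R2.fastq.gz.

make_sample_tab() (
    fastq_dir=$1
    for r1 in "$fastq_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue              # the glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"   # derive the mate's file name
        if [ ! -e "$r2" ]; then
            echo "no R2 found for $r1, skipping" >&2
            continue
        fi
        name=${r1##*/}                        # strip the directory part
        id=${name%%_*}                        # text before the first underscore
        printf '%s\t%s\t%s\n' "$id" "$r1" "$r2"
    done
)

# Pass a full (absolute) folder path so the sheet contains full paths:
# make_sample_tab /full/path/to/Fastq > input.tab
```

Because the file is written directly on the Unix side, it should already have Unix line endings, so no dos2unix step ought to be needed for it.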

peflanag · 2019-10-30T08:30:30Z

Cheers. Just wondering, is dos2unix only needed if I am using Windows? I'm using a Mac.

peflanag · 2019-10-30T08:31:36Z

Oh, and do I put the column titles (ID etc.) at the top of the columns or leave them out?

tseemann · 2019-10-30T08:47:33Z

No titles. Just like the docs show.

peflanag · 2019-10-30T08:48:08Z

cheers!

peflanag · 2019-10-30T09:48:28Z

Hey @tseemann sorry to bother you again. I've made the tab file and I ran the script but it said the sample was an unreadable file. Any idea what might be wrong?

Cheers,

P

ClusterIDs.txt

peflanag · 2019-10-30T09:51:15Z

I should point out that this is the first file in the folder so I'm guessing all the others will have the same errors

peflanag · 2019-10-31T13:50:06Z

Hi @tseemann just wondering if you happened to see the message above?

All the best,

Peter

tseemann · 2019-10-31T19:57:59Z

Snippy can't read that file. Usually this is

a unix permissions problem (on that file or any of the directories enclosing it)
a bad symlink
trying to run on Windows 10 Linux Subsystem
a file 0 bytes long

what does stat <FULL PATH TO THE FILE> say?
what does snippy --version say?
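The causes listed above that can be tested from the shell (permissions, bad symlink, empty file) could be checked for a single file with something like the following. This is a sketch using only the standard test operators; the function name diagnose_file and its messages are made up for illustration and are not Snippy's own checks.

```shell
#!/bin/bash
# Sketch only: report which of the usual causes applies to one file.
# The function name diagnose_file and its messages are hypothetical.

diagnose_file() (
    f=$1
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        echo "broken symlink"
    elif [ ! -e "$f" ]; then
        # Also reported when an enclosing directory cannot be searched.
        echo "not found"
    elif [ ! -r "$f" ]; then
        echo "not readable"
    elif [ ! -s "$f" ]; then
        echo "empty"
    else
        echo "ok"
    fi
)

# diagnose_file /full/path/to/sample_R2.fastq.gz
```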

peflanag · 2019-11-01T07:33:06Z

Hi @tseemann

I ran the stat /Path/To/File and attached a screenshot below. Snippy is running v4.4.3

I'm not sure what a Windows Linux Subsystem is. I have a macOS machine and an Ubuntu machine. The PCs at work are restricted and I don't have command line access on them. I made the file in Excel on my Mac, saved it as tab-delimited, which produced a text file, and copied it to the Linux system to run.

tseemann · 2019-11-03T05:55:09Z

I mean the "R2" file that it says is "unreadable", not the .txt file.
If you created the file in MacOS, you will need to do mac2unix on the Linux machine to fix the file endings. That is probably the cause of your error.

peflanag · 2019-11-04T10:25:37Z

Hi @tseemann, I ran mac2unix but it didn't work. I have attached a screenshot of the MiniSeq compressed fastq file that I opened. I'm not sure how it should look for snippy, though.

tseemann · 2019-11-05T02:17:39Z

No, i mean run mac2unix on your tab-separated file you are giving to snippy-multi

peflanag · 2019-11-05T06:21:03Z

I ran “mac2unix tab.file” and it didn’t work. The screenshot is just to show you what the compressed R1 file looks like in case there’s something wrong with that?

tseemann · 2019-11-06T01:50:34Z

It looks fine.
Maybe tell me

how you installed snippy
exactly the input files and commands you are running

peflanag · 2019-11-06T09:09:40Z

Hey @tseemann, so I installed snippy with the brew install method from the GitHub page, and then I ran this command:

snippy-multi Path/To/Tab.text --ref Path/To/Ref --mask Path/To/mask.bed > runme.sh

I also tried it without the --mask option.

I appreciate the help.

tseemann · 2019-11-09T09:22:11Z

Ok, i can see cr lf at the end of each line.
This means you are still in Windows/DOS text format.
Please run dos2unix XXXXXXX on that file so it is in Unix text format.

peflanag · 2019-11-11T08:47:16Z

Hey @tseemann, so I tried that but again it seems to fail! I don't know what I'm doing wrong. Should I try making the tab file on the Linux machine instead?

peflanag · 2019-11-11T08:52:53Z

So it turns out making the file on the Linux machine works, but now I have a new error! Any ideas what this means? I don't know what it means by missing read/contig data.

Cheers,

P

tseemann · 2019-11-11T20:20:51Z

You seem to have created a CSV file (comma separated) not TSV (tab separated).

This hack might fix it:

tr "," "\t" < ClusterIDs.csv > ClusterIDs.tab
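After a conversion like this, it may be worth confirming that every line really does have three tab-separated columns. The sketch below uses awk for that; the function name bad_tab_lines is hypothetical.

```shell
#!/bin/bash
# Sketch only: list lines of a sample sheet that do not have exactly three
# tab-separated fields. The function name bad_tab_lines is hypothetical.
# Prints nothing and returns 0 when every line has three fields.

bad_tab_lines() {
    awk -F '\t' '
        BEGIN   { bad = 0 }
        NF != 3 { printf "line %d: %d field(s)\n", NR, NF; bad = 1 }
        END     { exit bad }
    ' "$1"
}

# bad_tab_lines ClusterIDs.tab
```

A comma-separated file shows up as one field per line, so this would flag every line of a CSV that had not been converted.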

peflanag · 2019-11-11T20:46:23Z

Hey @tseemann, so I tried that but it still hasn't worked. I have received a new iMac Pro that is solely for bioinformatics, since the Linux machine I am using is quite old. Once I have set it up I am going to try that tomorrow and hope for the best!

In the meantime I thought I would attach a screenshot below to see if I am doing it right!

P

tseemann · 2019-11-11T20:56:08Z

You seem to have your old problem back.
I can't read screenshots very well.
Can you cut + paste the errors from now on please?

peflanag · 2019-11-11T21:03:50Z

Sure, it's this:

~$ snippy-multi '/home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/ClusterIDs.tab' --ref '/home/linux-biostation/Documents/Peter_F_Work/H37rv_Ref3.fasta' --mask '/home/linux-biostation/Documents/Peter_F_Work/Mtb_NC_000962.3_mask.bed' > runme.sh
Reading: /home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/ClusterIDs.tab
ERROR: [IMRL28] unreadable file '/home/linux-biostation/Documents/Peter_F_Work/IMRL_Cluster_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz'
(base) linux-biostation@Linux-BioStation:~$

peflanag · 2019-11-11T21:04:30Z

I don't know why it has a line through it. I am logged into the linux machine from home so trying to paste it from there

peflanag · 2019-11-11T21:07:54Z

When I ran this command,

tr "," "\t" < ClusterIDs.csv > ClusterIDs.tab

there was no error. It just returned to (base) linux-biostation@Linux-BioStation:$

tseemann · 2019-11-12T01:03:13Z

What does this say

ls -lsa /home/linux-biostation/Documents/Peter_F_Work/IMRL_Cluster_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz

peflanag · 2019-11-12T11:14:45Z

Hey @tseemann, I ran the command above and got this:

(base) linux-biostation@Linux-BioStation:~$ ls -lsa '/home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz' 
157236 -rw-r--r-- 1 linux-biostation linux-biostation 161006911 Oct 29 10:00 /home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz
(base) linux-biostation@Linux-BioStation:~$

Not sure what it means. I've attached a screenshot because it was highlighted in red.
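A further check that might be worth running at this point is whether every path listed in the sample sheet actually exists, exactly as written. Comparing the two paths quoted in this thread, the earlier error message refers to a folder named IMRL_Cluster_Snippy, whereas the ls above was run on IMRL_Clusters_Snippy, and a one-letter difference like that is the kind of thing such a check would flag. The sketch below is an assumption-laden illustration; missing_reads is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch only: print every read path in a 3-column tab-separated sample sheet
# that cannot be read, prefixed by its sample ID.
# The function name missing_reads is hypothetical.

missing_reads() (
    sheet=$1
    status=0
    tab=$(printf '\t')
    # The "|| [ -n "$id" ]" part also handles a last line with no newline.
    while IFS="$tab" read -r id r1 r2 || [ -n "$id" ]; do
        for f in "$r1" "$r2"; do
            if [ ! -r "$f" ]; then
                printf '%s\t%s\n' "$id" "$f"
                status=1
            fi
        done
    done < "$sheet"
    # The body runs in a subshell, so exit only sets the function's status.
    exit "$status"
)

# missing_reads ClusterIDs.tab
```

A stray carriage return at the end of a line would stay attached to the last path, so that path would also be reported as unreadable here.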

tseemann · 2019-11-13T01:57:24Z

The file seems ok. It's red because it's gzip compressed I think.
Did you remember to run mac2unix and dos2unix on the file you are providing as --input ?

peflanag · 2019-11-13T06:17:33Z

I’m running it on a Mac this time so I didn’t run mac2unix or dos2unix

tseemann · 2019-11-13T08:08:57Z

You still need to run mac2unix as you are running it on the BSD Unix system underneath the hood of MacOS.

Details: basically all your text files must have nl at the end of each line. Not a cr and not cr nl.
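To see which of these three cases a file falls into without installing anything, the standard tr and wc tools are enough. The sketch below classifies a file as lf, crlf or cr and writes an LF-only copy to standard output; it is a simplified stand-in for what dos2unix and mac2unix do, and both function names are hypothetical.

```shell
#!/bin/bash
# Sketch only: classify a text file's line endings and normalise them to LF.
# Function names are hypothetical; only standard tr, wc and cat are used.

line_ending_type() (
    f=$1
    # $(( ... )) strips the padding some wc implementations add.
    n_cr=$(( $(tr -cd '\r' < "$f" | wc -c) ))
    n_lf=$(( $(tr -cd '\n' < "$f" | wc -c) ))
    if [ "$n_cr" -eq 0 ]; then
        echo "lf"       # Unix: nl only
    elif [ "$n_lf" -eq 0 ]; then
        echo "cr"       # classic Mac: cr only
    else
        echo "crlf"     # DOS/Windows (or mixed): cr and nl both present
    fi
)

to_lf() (
    case $(line_ending_type "$1") in
        crlf) tr -d '\r' < "$1" ;;      # drop the cr, keep the nl
        cr)   tr '\r' '\n' < "$1" ;;    # turn each cr into nl
        *)    cat "$1" ;;
    esac
)

# line_ending_type ClusterIDs.txt
# to_lf ClusterIDs.txt > ClusterIDs.unix.txt
```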

peflanag · 2019-11-13T08:39:20Z

Hi @tseemann, cheers for that! I didn't know I still had to do that on a Mac. I tried mac2unix "file" and dos2unix "file" on the Mac but got the reply

-bash: mac2unix: command not found

and the same for dos2unix. Do I have to install something to run it?

Cheers

tseemann · 2019-11-13T08:51:26Z

Yes, you need to install those tools somehow (assuming Nullarbor is still not working). I don't know if they are in Conda or not. Sorry. Or copy the file to a Linux system and run it there.

peflanag · 2019-11-13T08:58:46Z

So I managed to install it through Homebrew and ran the snippy-multi command again, but I'm still getting the same error. I have pasted it below. Also, when I ran od -a file | head -n 20, it looks like mac2unix or dos2unix didn't work.

(base) med176028:~ peterflanagan$ brew install dos2unix
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
cf-tool
==> Updated Formulae
abcmidi                azure-cli              dependency-check       flow                   highlight              lazygit                now-cli                suil                   vfuse
ansifilter             bitrise                dvc                    goreleaser             jfrog-cli-go           mariadb-connector-c    scc                    telegraf               xa
aws-cdk                bullet                 embulk                 haxe                   just                   mdbook                 sord                   tokei                  yadm
aws-google-auth        cheat                  exploitdb              hcloud                 kubernetes-helm        netlify-cli            starship               ungit

==> Downloading https://homebrew.bintray.com/bottles/dos2unix-7.4.1.catalina.bottle.tar.gz
######################################################################## 100.0%
==> Pouring dos2unix-7.4.1.catalina.bottle.tar.gz
🍺  /usr/local/Cellar/dos2unix/7.4.1: 24 files, 370.0KB
(base) med176028:~ peterflanagan$ mac2unix /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab 
mac2unix: converting file /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab to Unix format...
(base) med176028:~ peterflanagan$ conda activate Snippy
(Snippy) med176028:~ peterflanagan$ snippy --check
[08:51:14] This is snippy 4.4.5
[08:51:14] Written by Torsten Seemann
[08:51:14] Obtained from https://github.com/tseemann/snippy
[08:51:14] Detected operating system: darwin
[08:51:14] Enabling bundled darwin tools.
[08:51:14] Found bwa - /Users/peterflanagan/miniconda3/envs/Snippy/bin/bwa
[08:51:14] Found bcftools - /Users/peterflanagan/miniconda3/envs/Snippy/bin/bcftools
[08:51:14] Found samtools - /Users/peterflanagan/miniconda3/envs/Snippy/bin/samtools
[08:51:14] Found java - /Users/peterflanagan/miniconda3/envs/Snippy/bin/java
[08:51:14] Found snpEff - /Users/peterflanagan/miniconda3/envs/Snippy/bin/snpEff
[08:51:14] Found samclip - /Users/peterflanagan/miniconda3/envs/Snippy/bin/samclip
[08:51:14] Found seqtk - /Users/peterflanagan/miniconda3/envs/Snippy/bin/seqtk
[08:51:14] Found parallel - /Users/peterflanagan/miniconda3/envs/Snippy/bin/parallel
[08:51:14] Found freebayes - /Users/peterflanagan/miniconda3/envs/Snippy/bin/freebayes
[08:51:14] Found freebayes-parallel - /Users/peterflanagan/miniconda3/envs/Snippy/bin/freebayes-parallel
[08:51:14] Found fasta_generate_regions.py - /Users/peterflanagan/miniconda3/envs/Snippy/bin/fasta_generate_regions.py
[08:51:14] Found vcfstreamsort - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vcfstreamsort
[08:51:14] Found vcfuniq - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vcfuniq
[08:51:14] Found vcffirstheader - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vcffirstheader
[08:51:14] Found gzip - /usr/bin/gzip
[08:51:14] Found vt - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vt
[08:51:14] Found snippy-vcf_to_tab - /Users/peterflanagan/miniconda3/envs/Snippy/bin/snippy-vcf_to_tab
[08:51:14] Found snippy-vcf_report - /Users/peterflanagan/miniconda3/envs/Snippy/bin/snippy-vcf_report
[08:51:14] Checking version: samtools --version is >= 1.7 - ok, have 1.9
[08:51:14] Checking version: bcftools --version is >= 1.7 - ok, have 1.9
[08:51:14] Checking version: freebayes --version is >= 1.1 - ok, have 1.3
[08:51:14] Checking version: java -version is >= 1.8 - ok, have 1.8
[08:51:15] Checking version: snpEff -version is >= 4.3 - ok, have 4.3
[08:51:15] Checking version: bwa is >= 7.12 - ok, have 7.17
[08:51:15] Dependences look good!
(Snippy) med176028:~ peterflanagan$ snippy-multi /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab --ref /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/H37rv_Ref3.fasta --mask /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/Mtb_NC_000962.3_mask.bed --cpus 16 > runme.sh
Reading: /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab
'RROR: [IMRL28] unreadable file '/Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/Fastq/IMRL28_S15_L001_R2.fastq.gz
(Snippy) med176028:~ peterflanagan$ 
(Snippy) med176028:~ peterflanagan$

peflanag · 2019-11-13T09:04:36Z

I also tried it on the linux machine and I don't think it worked.

(base) linux-biostation@Linux-BioStation:~$ mac2unix '/home/linux-biostation/Desktop/MacClusterIDs.txt' 
mac2unix: converting file /home/linux-biostation/Desktop/MacClusterIDs.txt to Unix format ...
(base) linux-biostation@Linux-BioStation:~$ od -a '/home/linux-biostation/Desktop/MacClusterIDs.txt' | head -n 20
0000000   I   M   R   L   2   8  ht   /   h   o   m   e   /   l   i   n
0000020   u   x   -   b   i   o   s   t   a   t   i   o   n   /   D   o
0000040   c   u   m   e   n   t   s   /   P   e   t   e   r   _   F   _
0000060   W   o   r   k   /   I   M   R   L   _   C   l   u   s   t   e
0000100   r   _   S   n   i   p   p   y   /   F   a   s   t   q   /   I
0000120   M   R   L   2   8   _   S   1   5   _   L   0   0   1   _   R
0000140   1   .   f   a   s   t   q   .   g   z  ht   /   h   o   m   e
0000160   /   l   i   n   u   x   -   b   i   o   s   t   a   t   i   o
0000200   n   /   D   o   c   u   m   e   n   t   s   /   P   e   t   e
0000220   r   _   F   _   W   o   r   k   /   I   M   R   L   _   C   l
0000240   u   s   t   e   r   _   S   n   i   p   p   y   /   F   a   s
0000260   t   q   /   I   M   R   L   2   8   _   S   1   5   _   L   0
0000300   0   1   _   R   2   .   f   a   s   t   q   .   g   z  nl   I
0000320   M   R   L   2   9  ht   /   h   o   m   e   /   l   i   n   u
0000340   x   -   b   i   o   s   t   a   t   i   o   n   /   D   o   c
0000360   u   m   e   n   t   s   /   P   e   t   e   r   _   F   _   W
0000400   o   r   k   /   I   M   R   L   _   C   l   u   s   t   e   r
0000420   _   S   n   i   p   p   y   /   F   a   s   t   q   /   I   M
0000440   R   L   2   9   _   S   9   _   L   0   0   1   _   R   1   .
0000460   f   a   s   t   q   .   g   z  ht   /   h   o   m   e   /   l
(base) linux-biostation@Linux-BioStation:~$

tseemann · 2019-11-13T21:57:12Z

The file looks fine on the Unix.

Maybe I am wrong and you need to leave it Mac format when running on a Mac?
I am very confused.

Can you run snippy by itself on just ONE sample anywhere?

peflanag · 2019-11-14T07:45:43Z

Hey @tseemann, so I ran that IMRL28 sample on the Linux machine and it ran fine, but when I go to run snippy-multi it just doesn't like it! I haven't tried on the iMac Pro yet, because I discovered last night while trying to run MTBseq that, with the macOS Catalina update, Apple seems to have done something where root is read-only rather than writable and data is on a separate "virtual?" HD. I don't quite understand it, but from what I've been reading it's messing up conda and conda envs. Needless to say, I have wiped the iMac Pro to start everything from scratch.

But I don't understand why the file won't work on the Linux machine. I have about 125 samples to run. Just to check: I'm using Excel to make the file as attached. Then I click Save As and select tab-delimited, which saves it as a text file, and then I run mac2unix "file" or dos2unix -c mac "file".

MacClusterIDs.xlsx

MacClusterIDs.txt

tseemann · 2019-11-14T21:28:26Z

If you are running it on MacOS can you try NOT using mac2unix?
Also, it would probably just be easier if you used nullarbor instead of manual snippy work?

conda create -n nullarbor nullarbor
conda activate nullarbor
nullarbor.pl --ref XXX --input MacClusterIDs.txt --name IMRL --outdir nullarbor
# etc

peflanag · 2019-11-14T21:32:09Z

Hey, I’ll try that tomorrow. I’ll save the Excel file as a tab-delimited file, which will be .txt, and run snippy-multi. If that doesn’t work then I’ll try nullarbor. I was having issues with conda after the macOS Catalina update, so hopefully it’ll install. Currently running snippy manually on each sample!

peflanag · 2019-11-19T10:22:20Z

Hi @tseemann, so I tried what you suggested above and it still doesn't work. This is the printout from my terminal:

(nullarbor) peterflanagan@med176028 Clusters_Snippy % nullarbor.pl --ref /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/H37rv_Ref3.fasta --input /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/MacClusters.txt --name IMRL_Clusters --outdir /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Output
[10:17:24] Hello peterflanagan
[10:17:24] This is nullarbor.pl 2.0.20191013
[10:17:24] Send complaints to Torsten Seemann
[10:17:24] Scanning --ref for problematic sequence IDs...
[10:17:24] Using reference genome: /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/H37rv_Ref3.fasta
[10:17:24] ERROR: Isolate 'IMRL28' - can not read sequence #2 of 1 files: '/Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R2.fastq.g'
(nullarbor) peterflanagan@med176028 Clusters_Snippy %

I have, however, managed to write a .sh loop to run snippy and it works. The only problem is that it doesn't generate a .vcf file and I don't know why, so I can't run snippy-core afterwards. I have attached the .sh loop that I wrote. Maybe I left something out? If I run snippy manually on a sample, it does make the .vcf file.

Peter

IMRL_Clusters.txt

tseemann · 2019-11-24T04:20:05Z

Your shell script seems fine. Are you saying NO .vcf files are generated at all when you run Snippy in that shell loop?
What does the {sample_name}/snippy.log file say?

peflanag · 2019-11-24T10:08:31Z

Yup! It generates everything in the folders for each sample except a .vcf file

tseemann · 2019-11-24T23:07:14Z

What does the {sample_name}/snippy.log file say?

peflanag · 2019-11-25T08:24:59Z

Hey @tseemann, so there's no snippy.log file but there is a snps.log file. I have attached it below, along with a screenshot of what is in the folder.

echo snippy 4.4.5

cd /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy

/usr/local/bin/snippy --cpus 16 --outdir /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Output/IMRL28 --ref /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/H37rv_Ref3.fasta --R1 /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz -R2 /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R2.fastq.gz

samtools faidx reference/ref.fa

bwa index reference/ref.fa

[bwa_index] Pack FASTA... 0.03 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.86 seconds elapse.
[bwa_index] Update BWT... 0.03 sec
[bwa_index] Pack forward-only FASTA... 0.02 sec
[bwa_index] Construct SA from BWT and Occ... 0.28 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index reference/ref.fa
[main] Real time: 1.224 sec; CPU: 1.221 sec

mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa

ln -sf reference/ref.fa .

ln -sf reference/ref.fa.fai .

mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz

bwa mem -Y -M -R '@RG\tID:IMRL28\tSM:IMRL28' -t 16 reference/ref.fa /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R2.fastq.gz | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /private/var/folders/yw/9tmv96yx73ld2r712hclgjv00000gn/T --threads 15 -m 266M | samtools fixmate -m - - | samtools sort -l 0 -T /private/var/folders/yw/9tmv96yx73ld2r712hclgjv00000gn/T --threads 15 -m 266M | samtools markdup -T /private/var/folders/yw/9tmv96yx73ld2r712hclgjv00000gn/T -r -s - - > snps.bam

READ 3922924 WRITTEN 3768726
EXCLUDED 123679 EXAMINED 3799245
PAIRED 3766766 SINGLE 32479
DULPICATE PAIR 130732 DUPLICATE SINGLE 23466
DUPLICATE TOTAL 154198

samtools index snps.bam

fasta_generate_regions.py reference/ref.fa.fai 144681 > reference/ref.txt

File "/usr/local/bin/fasta_generate_regions.py", line 7
print "usage: ", sys.argv[0], " "
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("usage: ", sys.argv[0], " ")?

tseemann · 2019-11-25T21:58:46Z

You have a problem with python2 vs python3.

I think the only way you will get this working is to create a brand new miniconda3 installation on Linux and install snippy from bioconda.

Follow these instructions precisely: https://bioconda.github.io/user/install.html#install-conda
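The log above shows /usr/local/bin/snippy and /usr/local/bin/fasta_generate_regions.py being run, so one thing that may be worth checking after a reinstall is whether the tools on PATH resolve inside the active conda environment. The sketch below lists any that do not; the function name tools_outside_prefix is hypothetical, while CONDA_PREFIX is the variable conda sets when an environment is activated.

```shell
#!/bin/bash
# Sketch only: report tools whose first match on PATH lies outside a given
# directory prefix (for example the active conda environment).
# The function name tools_outside_prefix is hypothetical.

tools_outside_prefix() (
    prefix=$1
    shift
    for tool in "$@"; do
        if ! path=$(command -v "$tool" 2>/dev/null); then
            echo "$tool: not found"
            continue
        fi
        case $path in
            "$prefix"/*) ;;                 # inside the prefix, nothing to say
            *) echo "$tool: $path" ;;       # resolved somewhere else
        esac
    done
)

# With a conda environment active:
# tools_outside_prefix "$CONDA_PREFIX" snippy fasta_generate_regions.py python
```

No output would mean every listed tool was found inside the environment.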

peflanag · 2019-11-25T22:03:22Z

Cool, I’ll uninstall miniconda3 tomorrow, reinstall as above, and test!

peflanag · 2019-11-26T09:17:11Z

@tseemann you're a legend! I don't want to jinx it, but it seems to be working and making all the required files in the folders now!

Thanks for your help!

peflanag closed this as completed Oct 29, 2019

peflanag reopened this Oct 29, 2019

peflanag closed this as completed Oct 29, 2019

tseemann self-assigned this Oct 29, 2019

tseemann added the question label Oct 29, 2019

peflanag reopened this Oct 31, 2019

tseemann closed this as completed Nov 3, 2019
