Looping a script #325

Closed
peflanag opened this issue Oct 29, 2019 · 57 comments

@peflanag

Hi,

I have quite a few files I want to run through Snippy. I wrote a script, made it executable using chmod u+x, and then ran ./MyFile.

However, it failed with "unknown option: mask". I have copied my script below. It would be great if someone could tell me what I'm doing wrong.

Cheers.

#/bin/bash

sampleLoc=/home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Fastq
sampleName=( "IMRL28" "IMRL29" "IMRL30" "IMRL137" "IMRL138" "IMRL39" "IMRL140" "IMRL141" "IMRL142" "MABC143" "IMRL144" "IMRL145" "IMRL146" "IMRL147" "IMRL148" "IMRL149" "IMRL150" "IMRL151" "IMRL152" "IMRL153" "IMRL154" "IMRL156" "IMRL157" "IMRL158" "IMRL159" "IMRL160" "IMRL161" "IMRL162" "IMRL163" "IMRL164" "IMRL165" "IMRL167" "IMRL168" "IMRL169" "IMRL170" "IMRL171" "IMRL172" "IMRL173" "IMRL174" "IMRL175" "IMRL176" "IMRL177" "IMRL178" "IMRL179" "IMRL180" "IMRL181" "IMRL182" "IMRL183" "IMRL184" "IMRL185" "IMRL186" "IMRL187" "IMRL188" "IMRL189" "IMRL190" "IMRL190" "IMRL191" "IMRL192" "IMRL193" "IMRL194" "IMRL194" "IMRL195" "IMRL196" "IMRL197" "IMRL198" "IMRL199" "IMRL200" "IMRL201" "IMRL202" "IMRL203" "IMRL204" "IMRL205" "IMRL206" "IMRL207" "IMRL208" "IMRL209" "IMRL210" "IMRL211" "IMRL212" "IMRL213" "IMRL214" "IMRL215" "IMRL216" "IMRL217" "IMRL219" "IMRL220" "IMRL221" "IMRL222" "IMRL223" "IMRL225" "IMRL226" "IMRL228" "IMRL232" "IMRL233" "IMRL234" "IMRL235" "IMRL236" "IMRL237" "IMRL238" "IMRL239" "IMRL241" "IMRL243" "IMRL244" "IMRL251" "IMRL252" "IMRL253" "IMRL254" )

for sample in ${sampleName[*]}

do

snippy --outdir /home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Snippy_Output/ --ref /home/linux-biostation/Documents/Peter_F_Work/H37rv_Ref3.fasta --mask /home/linux-biostation/Documents/Peter_F_Work/Mtb_NC_000962.3_mask.bed --R1 $sampleLoc/${sample}R1.fastq.gz --R2 $sampleLoc/${sample}R2.fastq.gz

done

@peflanag
Author

Actually, I just realised I don't add --mask to the snippy command itself; I use it after I run snippy. Sorry!

@peflanag
Author

Actually, I need to reopen this. How do I tell snippy to put the output of each sample in the same output folder? It wants to replace the first sample's files with the subsequent ones. Is there any way to tell it in the script above to just keep adding?

Cheers

@peflanag peflanag reopened this Oct 29, 2019
@peflanag
Author

Actually, figured it out. I use ${sample} as the end path in my --outdir command!
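
For readers following along, the fix described above can be sketched as a loop that gives every sample its own --outdir. This is a minimal sketch rather than the author's actual script: the paths are placeholders, the sample list is shortened, the R1/R2 file naming is copied from the original post, and the `run=echo` dry-run switch and the `run_samples` function name are illustrative additions. It also leaves out --mask, which the author found is not a snippy option.

```shell
#!/bin/bash
# Placeholder paths (assumptions); replace them with your own.
sampleLoc=/path/to/Fastq
outRoot=/path/to/Snippy_Output
ref=/path/to/H37rv_Ref3.fasta
sampleName=( "IMRL28" "IMRL29" "IMRL30" )   # shortened list for illustration

run=echo   # dry run: only prints each command; set run= (empty) to execute

run_samples() {
    local sample
    # "${sampleName[@]}" expands to one word per array element.
    for sample in "${sampleName[@]}"; do
        # One output folder per sample, so later runs do not overwrite earlier ones.
        $run snippy --outdir "$outRoot/$sample" --ref "$ref" \
            --R1 "$sampleLoc/${sample}R1.fastq.gz" \
            --R2 "$sampleLoc/${sample}R2.fastq.gz"
    done
}

run_samples
```

Running it as written only prints the three snippy commands, which makes it easy to check the paths before committing to a long run.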

@tseemann tseemann self-assigned this Oct 29, 2019
@tseemann
Owner

tseemann commented Oct 29, 2019

@peflanag have you considered using snippy-multi ?

https://github.com/tseemann/snippy/blob/master/README.md#using-snippy-multi

Also make sure you are using snippy >= 4.4

@peflanag
Author

Hi @tseemann, I haven't, because I already had a .sh file with the file names in it, so for ease I didn't want to retype everything in Excel to make the tab file. I'm also uncertain how to go about it. The manual says the isolates have to be labelled Isolate1, Isolate1b and so on, but I have over 120 samples, more than there are letters in the alphabet. So do I go from ...Isolate1z to Isolate2, Isolate2b...? But in the manual Isolate2 seems to correspond to single-end reads.

I'm basically trying to compare MTBseq and Snippy but can't seem to get either to finish locally or on a college cluster. I'm running snippy 4.4.3.

Cheers,

P

@tseemann
Owner

You do not need to name the isolates that way.
The documentation is just an example.
Usually we write scripts to generate the input.tab file automatically.

@peflanag
Author

Oh right. So I just write the "name" and then "path" in excel for each sample? I didn't know you can write scripts to generate the input.tab file automatically. Is that something you can share? Or is there somewhere online I can look to explain that? I'm not really tech savvy when it comes to writing scripts. I'm a microbiologist!

Cheers,

P

@tseemann
Owner

tseemann commented Oct 30, 2019

For your case, you need a 3 column Excel spreadsheet

  1. ID
  2. full path to R1
  3. full path to R2

Save it as 'tab delimited text' (usually .txt extension)
Once in Unix you may have to run dos2unix on it to fix the line endings.
Good luck.

PS. there is no magic script. you could achieve the same by using excel functions probably, as long as the file names are named consistently.
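
As an illustration of generating the three-column file without Excel, here is a minimal sketch. It is not a script shipped with snippy. It assumes the reads are named like the files later in this thread (for example IMRL28_S15_L001_R1.fastq.gz), that R1 and R2 names differ only in "_R1" versus "_R2", and that the sample ID is everything before the first underscore. The function name `make_tab` is hypothetical.

```shell
#!/bin/bash
# make_tab DIR: print "ID<TAB>R1<TAB>R2" for every *_R1.fastq.gz found in DIR.
# Pass DIR as a full (absolute) path so the paths in the output are full paths.
make_tab() {
    local dir="$1" r1 r2 base id
    for r1 in "$dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue                 # the glob matched nothing
        r2="${r1%_R1.fastq.gz}_R2.fastq.gz"      # expected partner file
        if [ ! -e "$r2" ]; then
            echo "WARNING: no R2 found for $r1" >&2
            continue
        fi
        base="${r1##*/}"                         # strip the directory part
        id="${base%%_*}"                         # keep text before the first "_"
        printf '%s\t%s\t%s\n' "$id" "$r1" "$r2"
    done
}
```

A call such as `make_tab /full/path/to/Fastq > input.tab` writes the file directly on the Unix side, so it has no header row and uses plain newline line endings.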

@peflanag
Author

Cheers. Just wondering about dos2unix: is that only needed if I am using Windows? I'm using a Mac.

@peflanag
Author

Oh and do I put the title names ID etc in the tops of the columns or leave them out?

@tseemann
Owner

No titles. Just like the docs show.

@peflanag
Author

cheers!

@peflanag
Author

Hey @tseemann sorry to bother you again. I've made the tab file and I ran the script but it said the sample was an unreadable file. Any idea what might be wrong?

Cheers,

P

ClusterIDs.txt

Screenshot 2019-10-30 at 09 46 26

@peflanag
Author

I should point out that this is the first file in the folder so I'm guessing all the others will have the same errors

@peflanag
Author

Hi @tseemann just wondering if you happened to see the message above?

All the best,

Peter

@peflanag peflanag reopened this Oct 31, 2019
@tseemann
Owner

Snippy can't read that file. Usually this is

  • a unix permissions problem (on that file or any of the directories enclosing it)
  • a bad symlink
  • trying to run on Windows 10 Linux Subsystem
  • a file 0 bytes long

  1. what does stat <FULL PATH TO THE FILE> say?
  2. what does snippy --version say?
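
The causes listed above can be checked for every path in the tab file in one pass. The following is a hedged sketch, not part of snippy: `check_tab` is a hypothetical helper that reads a tab-separated file with an ID in the first column and file paths in the next one or two columns. It reports paths that do not exist (which includes broken symlinks and mistyped directory names), are not readable, or are empty, and it also flags a stray carriage return at the end of a path.

```shell
#!/bin/bash
# check_tab FILE: report problem paths in columns 2 and 3 of a tab-separated file.
# Returns 0 if every path looks usable, 1 otherwise.
check_tab() {
    local file="$1" bad=0 id r1 r2 path tab cr
    tab=$(printf '\t')
    cr=$(printf '\r')
    # "|| [ -n "$id" ]" also handles a last line with no trailing newline.
    while IFS="$tab" read -r id r1 r2 || [ -n "$id" ]; do
        for path in "$r1" "$r2"; do
            [ -n "$path" ] || continue
            case "$path" in
                *"$cr")
                    echo "[$id] path ends with a carriage return (fix the line endings)" >&2
                    bad=1
                    continue ;;
            esac
            if [ ! -e "$path" ]; then
                echo "[$id] no such file (or broken symlink): $path" >&2; bad=1
            elif [ ! -r "$path" ]; then
                echo "[$id] no read permission: $path" >&2; bad=1
            elif [ ! -s "$path" ]; then
                echo "[$id] file is 0 bytes long: $path" >&2; bad=1
            fi
        done
    done < "$file"
    return "$bad"
}
```

For example, `check_tab ClusterIDs.tab && echo "all paths OK"` prints nothing but the final message when every file can be read.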

@peflanag
Author

peflanag commented Nov 1, 2019

Hi @tseemann

I ran the stat /Path/To/File and attached a screenshot below. Snippy is running v4.4.3

I'm not sure what a Windows Linux Subsystem is. I have a macOS machine and an Ubuntu machine. The PCs at work are restricted and I don't have command line access on them. I made the file in Excel on my Mac, saved it as tab delimited, which made a text file, and copied it to the Linux system to run.

Screenshot 2019-11-01 at 07 28 30

@tseemann
Owner

tseemann commented Nov 3, 2019

  1. I mean the "R2" file that it says is "unreadable", not the .txt file.

  2. If you created the file in macOS, you will need to run mac2unix on it on the Linux machine to fix the line endings. That is probably the cause of your error.

@tseemann tseemann closed this as completed Nov 3, 2019
@peflanag
Author

peflanag commented Nov 4, 2019

Hi @tseemann, I ran mac2unix but it didn't work. I have attached a screenshot of the MiniSeq compressed fastq file that I opened. I'm not sure how it should look for snippy though.

Screenshot 2019-11-04 at 10 24 00

@tseemann
Owner

tseemann commented Nov 5, 2019

No, i mean run mac2unix on your tab-separated file you are giving to snippy-multi

@peflanag
Author

peflanag commented Nov 5, 2019

I ran “mac2unix tab.file” and it didn’t work. The screenshot is just to show you what the compressed R1 file looks like in case there’s something wrong with that?

@tseemann
Owner

tseemann commented Nov 6, 2019

It looks fine.
Maybe tell me

  1. how you installed snippy
  2. exactly the input files and commands you are running

@peflanag
Author

peflanag commented Nov 6, 2019

Hey @tseemann, so I did the brew install version of snippy that's on the GitHub page, and then I ran this command:

snippy-multi Path/To/Tab.text --ref Path/To/Ref --mask Path/To/mask.bed > runme.sh

and I also tried without the --mask command.

I appreciate the help.

@tseemann
Owner

tseemann commented Nov 9, 2019

Ok, i can see cr lf at the end of each line.
This means you are still in Windows/DOS text format.
Please run dos2unix XXXXXXX on that file so it is in Unix text format.

@peflanag
Author

Hey @tseemann so I tried that but again it seems to fail! I don't know what I'm doing wrong? should I try making the tab file on the linux machine instead?

Screenshot 2019-11-11 at 08 45 05

@peflanag
Author

So it turns out making the file on the Linux machine works, but now I have a new error! Any ideas what this means? I don't know what it means by missing read/contig data.

Cheers,

P

Screenshot 2019-11-11 at 08 50 54

@tseemann
Owner

You seem to have created a CSV file (comma separated) not TSV (tab separated).

This hack might fix it:

tr "," "\t" < ClusterIDs.csv > ClusterIDs.tab
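
A quick way to confirm whether a file really is tab separated is to count the fields per line. This is a small sketch under the assumption that a paired-end input file should have exactly three tab-separated fields on every line (ID, R1, R2). The function name `count_cols` is hypothetical.

```shell
#!/bin/bash
# count_cols FILE: summarise how many lines have each number of tab-separated fields.
count_cols() {
    awk -F'\t' '
        { n[NF]++ }                                   # tally lines by field count
        END { for (k in n) print "fields=" k " lines=" n[k] }
    ' "$1"
}
```

A file saved as CSV reports `fields=1` for every line, because it contains commas and no tabs. A mixture of field counts points to blank lines or missing columns.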

@peflanag
Author

Hey @tseemann so I tried that but it still hasn't worked. I have attached a screenshot below. I have received a new iMac Pro that is solely for bioinformatics since the linux machine I am using is quite old. Once I set it up I am going to try that tomorrow and hope for the best!

But I thought I would attach the screenshot below to see if I am doing it right!

P

Screenshot 2019-11-11 at 20 44 09

@tseemann
Owner

You seem to have your old problem back.
I can't read screenshots very well.
Can you cut + paste the errors from now on please?

@peflanag
Author

peflanag commented Nov 11, 2019

Sure, it's this:

~$ snippy-multi '/home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/ClusterIDs.tab' --ref '/home/linux-biostation/Documents/Peter_F_Work/H37rv_Ref3.fasta' --mask '/home/linux-biostation/Documents/Peter_F_Work/Mtb_NC_000962.3_mask.bed' > runme.sh
Reading: /home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/ClusterIDs.tab
ERROR: [IMRL28] unreadable file '/home/linux-biostation/Documents/Peter_F_Work/IMRL_Cluster_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz'
(base) linux-biostation@Linux-BioStation:~$ 

@peflanag
Author

I don't know why it has a line through it. I am logged into the linux machine from home so trying to paste it from there

@peflanag
Author

peflanag commented Nov 11, 2019

When I ran this command,

tr "," "\t" < ClusterIDs.csv > ClusterIDs.tab

there was no error. it just returned back to (base) linux-biostation@Linux-BioStation:$

@tseemann
Owner

What does this say

ls -lsa /home/linux-biostation/Documents/Peter_F_Work/IMRL_Cluster_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz

@peflanag
Author

peflanag commented Nov 12, 2019

Hey @tseemann, I ran the command above and got this:

(base) linux-biostation@Linux-BioStation:~$ ls -lsa '/home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz' 
157236 -rw-r--r-- 1 linux-biostation linux-biostation 161006911 Oct 29 10:00 /home/linux-biostation/Documents/Peter_F_Work/IMRL_Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz
(base) linux-biostation@Linux-BioStation:~$ 

Not sure what it means. I've attached a screenshot cause it was highlighted in red.

Screenshot from 2019-11-12 11-14-44

@tseemann
Owner

The file seems ok. It's red because it's gzip compressed I think.
Did you remember to run mac2unix and dos2unix on the file you are providing as --input ?

@peflanag
Author

I’m running it on a Mac this time so I didn’t run mac2unix or dos2unix

@tseemann
Owner

You still need to run mac2unix as you are running it on the BSD Unix system underneath the hood of MacOS.

Details: basically all your text files must have nl at the end of each line. Not a cr and not cr nl.
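
The rule above can be checked and enforced with standard tools when dos2unix and mac2unix are not installed. This is a sketch only. The function names `eol_type` and `to_unix` are made up for illustration, and a file that mixes both styles is treated as cr nl, which may not be what you want.

```shell
#!/bin/bash
# eol_type FILE: print "lf", "crlf" or "cr" depending on the line endings found.
eol_type() {
    local cr lf
    cr=$(( $(tr -cd '\r' < "$1" | wc -c) ))   # number of carriage returns
    lf=$(( $(tr -cd '\n' < "$1" | wc -c) ))   # number of newlines
    if [ "$cr" -eq 0 ]; then
        echo lf
    elif [ "$lf" -eq 0 ]; then
        echo cr
    else
        echo crlf
    fi
}

# to_unix FILE: rewrite FILE in place so every line ends with a newline only.
to_unix() {
    local tmp
    tmp=$(mktemp) || return 1
    case "$(eol_type "$1")" in
        crlf) tr -d '\r' < "$1" > "$tmp" ;;       # drop the cr of each cr nl pair
        cr)   tr '\r' '\n' < "$1" > "$tmp" ;;     # turn each lone cr into nl
        *)    rm -f "$tmp"; return 0 ;;           # already nl only
    esac
    cat "$tmp" > "$1" && rm -f "$tmp"             # keep the original file's permissions
}
```

Running `eol_type ClusterIDs.tab` before and after a conversion shows whether the conversion actually changed the file.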

@peflanag
Author

Hi @tseemann, cheers for that! I didn't know I still had to do that on a Mac. I tried mac2unix "file" and dos2unix "file" on the Mac but got the reply

-bash: mac2unix: command not found

and the same for dos2unix. Do I have to install something to run it?

Cheers

@tseemann
Owner

Yes, you need to install those tools somehow (assuming Nullarbor is still not working). I don't know if they are in Conda or not. Sorry. Or copy the file to a Linux system and run it there.

@peflanag
Author

peflanag commented Nov 13, 2019

So I managed to install it through Homebrew and ran the snippy command again, but I'm still getting the same error. I have pasted it below. Also, when I ran od -a file | head -n 20, it looks like mac2unix or dos2unix didn't work.

(base) med176028:~ peterflanagan$ brew install dos2unix
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
cf-tool
==> Updated Formulae
abcmidi                azure-cli              dependency-check       flow                   highlight              lazygit                now-cli                suil                   vfuse
ansifilter             bitrise                dvc                    goreleaser             jfrog-cli-go           mariadb-connector-c    scc                    telegraf               xa
aws-cdk                bullet                 embulk                 haxe                   just                   mdbook                 sord                   tokei                  yadm
aws-google-auth        cheat                  exploitdb              hcloud                 kubernetes-helm        netlify-cli            starship               ungit

==> Downloading https://homebrew.bintray.com/bottles/dos2unix-7.4.1.catalina.bottle.tar.gz
######################################################################## 100.0%
==> Pouring dos2unix-7.4.1.catalina.bottle.tar.gz
🍺  /usr/local/Cellar/dos2unix/7.4.1: 24 files, 370.0KB
(base) med176028:~ peterflanagan$ mac2unix /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab 
mac2unix: converting file /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab to Unix format...
(base) med176028:~ peterflanagan$ conda activate Snippy
(Snippy) med176028:~ peterflanagan$ snippy --check
[08:51:14] This is snippy 4.4.5
[08:51:14] Written by Torsten Seemann
[08:51:14] Obtained from https://github.com/tseemann/snippy
[08:51:14] Detected operating system: darwin
[08:51:14] Enabling bundled darwin tools.
[08:51:14] Found bwa - /Users/peterflanagan/miniconda3/envs/Snippy/bin/bwa
[08:51:14] Found bcftools - /Users/peterflanagan/miniconda3/envs/Snippy/bin/bcftools
[08:51:14] Found samtools - /Users/peterflanagan/miniconda3/envs/Snippy/bin/samtools
[08:51:14] Found java - /Users/peterflanagan/miniconda3/envs/Snippy/bin/java
[08:51:14] Found snpEff - /Users/peterflanagan/miniconda3/envs/Snippy/bin/snpEff
[08:51:14] Found samclip - /Users/peterflanagan/miniconda3/envs/Snippy/bin/samclip
[08:51:14] Found seqtk - /Users/peterflanagan/miniconda3/envs/Snippy/bin/seqtk
[08:51:14] Found parallel - /Users/peterflanagan/miniconda3/envs/Snippy/bin/parallel
[08:51:14] Found freebayes - /Users/peterflanagan/miniconda3/envs/Snippy/bin/freebayes
[08:51:14] Found freebayes-parallel - /Users/peterflanagan/miniconda3/envs/Snippy/bin/freebayes-parallel
[08:51:14] Found fasta_generate_regions.py - /Users/peterflanagan/miniconda3/envs/Snippy/bin/fasta_generate_regions.py
[08:51:14] Found vcfstreamsort - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vcfstreamsort
[08:51:14] Found vcfuniq - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vcfuniq
[08:51:14] Found vcffirstheader - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vcffirstheader
[08:51:14] Found gzip - /usr/bin/gzip
[08:51:14] Found vt - /Users/peterflanagan/miniconda3/envs/Snippy/bin/vt
[08:51:14] Found snippy-vcf_to_tab - /Users/peterflanagan/miniconda3/envs/Snippy/bin/snippy-vcf_to_tab
[08:51:14] Found snippy-vcf_report - /Users/peterflanagan/miniconda3/envs/Snippy/bin/snippy-vcf_report
[08:51:14] Checking version: samtools --version is >= 1.7 - ok, have 1.9
[08:51:14] Checking version: bcftools --version is >= 1.7 - ok, have 1.9
[08:51:14] Checking version: freebayes --version is >= 1.1 - ok, have 1.3
[08:51:14] Checking version: java -version is >= 1.8 - ok, have 1.8
[08:51:15] Checking version: snpEff -version is >= 4.3 - ok, have 4.3
[08:51:15] Checking version: bwa is >= 7.12 - ok, have 7.17
[08:51:15] Dependences look good!
(Snippy) med176028:~ peterflanagan$ snippy-multi /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab --ref /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/H37rv_Ref3.fasta --mask /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/Mtb_NC_000962.3_mask.bed --cpus 16 > runme.sh
Reading: /Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/ClusterIDs_copy.tab
'RROR: [IMRL28] unreadable file '/Users/peterflanagan/IMRL_Seq_Data/Illumina/Snippy_IMRL_Clusters/Fastq/IMRL28_S15_L001_R2.fastq.gz
(Snippy) med176028:~ peterflanagan$ 
(Snippy) med176028:~ peterflanagan$ 

@peflanag
Author

peflanag commented Nov 13, 2019

I also tried it on the linux machine and I don't think it worked.

(base) linux-biostation@Linux-BioStation:~$ mac2unix '/home/linux-biostation/Desktop/MacClusterIDs.txt' 
mac2unix: converting file /home/linux-biostation/Desktop/MacClusterIDs.txt to Unix format ...
(base) linux-biostation@Linux-BioStation:~$ od -a '/home/linux-biostation/Desktop/MacClusterIDs.txt' | head -n 20
0000000   I   M   R   L   2   8  ht   /   h   o   m   e   /   l   i   n
0000020   u   x   -   b   i   o   s   t   a   t   i   o   n   /   D   o
0000040   c   u   m   e   n   t   s   /   P   e   t   e   r   _   F   _
0000060   W   o   r   k   /   I   M   R   L   _   C   l   u   s   t   e
0000100   r   _   S   n   i   p   p   y   /   F   a   s   t   q   /   I
0000120   M   R   L   2   8   _   S   1   5   _   L   0   0   1   _   R
0000140   1   .   f   a   s   t   q   .   g   z  ht   /   h   o   m   e
0000160   /   l   i   n   u   x   -   b   i   o   s   t   a   t   i   o
0000200   n   /   D   o   c   u   m   e   n   t   s   /   P   e   t   e
0000220   r   _   F   _   W   o   r   k   /   I   M   R   L   _   C   l
0000240   u   s   t   e   r   _   S   n   i   p   p   y   /   F   a   s
0000260   t   q   /   I   M   R   L   2   8   _   S   1   5   _   L   0
0000300   0   1   _   R   2   .   f   a   s   t   q   .   g   z  nl   I
0000320   M   R   L   2   9  ht   /   h   o   m   e   /   l   i   n   u
0000340   x   -   b   i   o   s   t   a   t   i   o   n   /   D   o   c
0000360   u   m   e   n   t   s   /   P   e   t   e   r   _   F   _   W
0000400   o   r   k   /   I   M   R   L   _   C   l   u   s   t   e   r
0000420   _   S   n   i   p   p   y   /   F   a   s   t   q   /   I   M
0000440   R   L   2   9   _   S   9   _   L   0   0   1   _   R   1   .
0000460   f   a   s   t   q   .   g   z  ht   /   h   o   m   e   /   l
(base) linux-biostation@Linux-BioStation:~$ 

@tseemann
Owner

The file looks fine on the Unix.

Maybe I am wrong and you need to leave it Mac format when running on a Mac?
I am very confused.

Can you run snippy by itself on just ONE sample anywhere?

@peflanag
Author

Hey @tseemann, so I ran that IMRL28 sample in the Linux environment and it ran fine, but when I go to run snippy-multi it just doesn't like it! I haven't tried on the iMac Pro yet because I just discovered last night when trying to run MTBseq that with the macOS Catalina update, Apple seems to have done something where root permissions are only read and not write and data is on a separate "virtual?" HD. I don't quite understand it, but from what I've been reading it's messing up conda and conda envs. Needless to say I have wiped the iMac Pro to start everything from scratch.

But I don't understand why the file won't work on the Linux machine. I have about 125 samples to run. Just to check: I'm using Excel to make the file as attached. Then I click Save As and select tab-delimited, which saves it as a text file, and then I run mac2unix "file" or dos2unix -c mac "file".

MacClusterIDs.xlsx

MacClusterIDs.txt

@tseemann
Owner

If you are running it on MacOS can you try NOT using mac2unix?
Also, it would probably just be easier if you used nullarbor instead of manual snippy work?

conda create -n nullarbor_env nullarbor
conda activate nullarbor_env
nullarbor.pl --ref XXX --input MacClusterIDs.txt --name IMRL --outdir nullarbor
# etc

@peflanag
Author

Hey, I’ll try that tomorrow. I’ll save the Excel file as a tab-delimited file, which will be .txt, and run snippy-multi. If that doesn’t work then I’ll try nullarbor. I was having issues with conda after the macOS Catalina update so hopefully it’ll install. Currently running snippy manually on each sample!

@peflanag
Author

hi @tseemann so i tried what you suggested above and it still doesn't work. This is the print out of my terminal:

(nullarbor) peterflanagan@med176028 Clusters_Snippy % nullarbor.pl --ref /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/H37rv_Ref3.fasta --input /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/MacClusters.txt --name IMRL_Clusters --outdir /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Output
[10:17:24] Hello peterflanagan
[10:17:24] This is nullarbor.pl 2.0.20191013
[10:17:24] Send complaints to Torsten Seemann
[10:17:24] Scanning --ref for problematic sequence IDs...
[10:17:24] Using reference genome: /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/H37rv_Ref3.fasta
[10:17:24] ERROR: Isolate 'IMRL28' - can not read sequence #2 of 1 files: '/Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R2.fastq.g'
(nullarbor) peterflanagan@med176028 Clusters_Snippy %

I have, however, managed to write a .sh loop to run snippy and it works. The only problem is that it doesn't generate a .vcf file and I don't know why, so I can't run snippy-core afterwards. I have attached the .sh loop that I wrote. Maybe I left something out? If I run snippy manually on a sample then it makes the .vcf file.

Peter

IMRL_Clusters.txt

@tseemann
Owner

tseemann commented Nov 24, 2019

Your shell script seems fine. Are you saying NO .vcf files are generated at all when you run Snippy in that shell loop?
What does the {sample_name}/snippy.log file say?

@peflanag
Author

Yup! It generates everything in the folders for each sample except a .vcf file

@tseemann
Owner

What does the {sample_name}/snippy.log file say?

@peflanag
Author

Hey @tseemann, so there's no snippy.log file, but there is a snps.log file. I have attached it below, along with a screenshot of what is in the folder.

echo snippy 4.4.5

cd /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy

/usr/local/bin/snippy --cpus 16 --outdir /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Output/IMRL28 --ref /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/H37rv_Ref3.fasta --R1 /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz -R2 /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R2.fastq.gz

samtools faidx reference/ref.fa

bwa index reference/ref.fa

[bwa_index] Pack FASTA... 0.03 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.86 seconds elapse.
[bwa_index] Update BWT... 0.03 sec
[bwa_index] Pack forward-only FASTA... 0.02 sec
[bwa_index] Construct SA from BWT and Occ... 0.28 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index reference/ref.fa
[main] Real time: 1.224 sec; CPU: 1.221 sec

mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa

ln -sf reference/ref.fa .

ln -sf reference/ref.fa.fai .

mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz

bwa mem -Y -M -R '@RG\tID:IMRL28\tSM:IMRL28' -t 16 reference/ref.fa /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R1.fastq.gz /Users/peterflanagan/IMRL_Sequencing/Illumina/Clusters_Snippy/Fastq/IMRL28_S15_L001_R2.fastq.gz | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /private/var/folders/yw/9tmv96yx73ld2r712hclgjv00000gn/T --threads 15 -m 266M | samtools fixmate -m - - | samtools sort -l 0 -T /private/var/folders/yw/9tmv96yx73ld2r712hclgjv00000gn/T --threads 15 -m 266M | samtools markdup -T /private/var/folders/yw/9tmv96yx73ld2r712hclgjv00000gn/T -r -s - - > snps.bam

READ 3922924 WRITTEN 3768726
EXCLUDED 123679 EXAMINED 3799245
PAIRED 3766766 SINGLE 32479
DULPICATE PAIR 130732 DUPLICATE SINGLE 23466
DUPLICATE TOTAL 154198

samtools index snps.bam

fasta_generate_regions.py reference/ref.fa.fai 144681 > reference/ref.txt

File "/usr/local/bin/fasta_generate_regions.py", line 7
print "usage: ", sys.argv[0], " "
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print("usage: ", sys.argv[0], " ")?

Screenshot 2019-11-25 at 08 23 48

@tseemann
Owner

You have a problem with python2 vs python3.

I think the only way you will get this working is to create a brand new miniconda3 installation on Linux and install snippy from bioconda.

follow these precisely: https://bioconda.github.io/user/install.html#install-conda
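
To see which copy of a helper script is being picked up, and which interpreter it asks for, a check along these lines may help. It is a sketch only. `script_interpreter` is a hypothetical helper that prints where a script sits on PATH and its first line (the shebang).

```shell
#!/bin/bash
# script_interpreter NAME: print the full path of NAME on PATH, then its first line.
script_interpreter() {
    local path
    path=$(command -v "$1") || { echo "$1: not found on PATH" >&2; return 1; }
    echo "$path"
    head -n 1 "$path"     # the shebang shows which interpreter the script requests
}
```

For example, `script_interpreter fasta_generate_regions.py` followed by `python --version` shows whether the script found first on PATH lives inside the conda environment or elsewhere (the log above shows /usr/local/bin), and which Python version a plain `python` resolves to.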

@peflanag
Author

Cool I’ll uninstall miniconda 3 tomorrow and reinstall as above and test!

@peflanag
Author

@tseemann you're a legend! I dont want to jinx it but it seems to be working and making all the required files in the folders now!

Thanks for your help!

Screenshot 2019-11-26 at 09 16 00
