No scaffolding being done on the Scaff10xV4 #8

Open
harish0201 opened this issue Apr 12, 2019 · 4 comments

Comments

@harish0201

harish0201 commented Apr 12, 2019

Hi,

I have been trying to scaffold PacBio contigs with scaff10x, using the latest release (version 4, uploaded last month).

The align* series of files are empty, but the tool neither throws an error nor does any scaffolding. Please see the attached screenshot (scaff10x).

It seems that during the update the tool names in the scaff-bin folder were changed, but the main script still calls them by the older names, which causes the error. (I observed this for bwa.)
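To confirm which intermediate files came out empty without relying on a screenshot, something like the following could be run in the working directory. This is only a sketch: `empty_outputs` is a hypothetical helper (not part of Scaff10X), and the `align*` pattern is taken from the file names mentioned above.

```python
from pathlib import Path


def empty_outputs(workdir, pattern="align*"):
    """Return names of files matching `pattern` in `workdir` that are zero bytes long."""
    return sorted(
        p.name
        for p in Path(workdir).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


if __name__ == "__main__":
    # Report every empty align* file in the current directory.
    for name in empty_outputs("."):
        print(f"empty intermediate file: {name}")
```

An empty list means every matching file has at least some content; it says nothing about whether that content is valid.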

This is the command I have been using:

~/Tools/Scaff10X/src/scaff10x -nodes 48 -longread 1 -gap 100 -matrix 2000 -reads 10 -score 20 -edge 50000 -link 8 -block 50000 final_MP_scaff.fa genome-BC2_1.fastq.gz genome-BC2_2.fastq.gz out.scaffolds.fa &

I'm also a bit unsure whether this is because the assembly is fragmented. Please see the stats below:

file final_MP_scaff.fa
num_seqs 20896
sum_len 896879817
min_len 2002
avg_len 42921.1
max_len 1276294
Q1 14596
Q2 21642
Q3 37521
sum_gap 1782762
N50 86627
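For anyone reading the stats: N50 is conventionally the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. A minimal sketch of that calculation follows; the function name `n50` and the example lengths are illustrative only and are not taken from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are taken from longest to shortest; the N50 is the length of the
    sequence at which the running total first reaches half of the overall total.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")


if __name__ == "__main__":
    # Made-up lengths, purely to show the call.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

An N50 that is small relative to the total length, as in the stats above, is one sign of a fragmented assembly.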

Edit: On closer look, it seems that for some reason the bwa indices are not being created.
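A quick way to check whether the index step ran is to look for the files that `bwa index` normally writes next to the reference (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`). The sketch below assumes the classic bwa index layout; `missing_bwa_index_files` is a hypothetical helper, not something shipped with Scaff10X.

```python
import sys
from pathlib import Path

# Suffixes of the files that `bwa index <ref>` writes alongside the reference.
BWA_INDEX_SUFFIXES = (".amb", ".ann", ".bwt", ".pac", ".sa")


def missing_bwa_index_files(reference):
    """Return the names of the expected bwa index files that do not exist."""
    ref = Path(reference)
    return [
        ref.name + suffix
        for suffix in BWA_INDEX_SUFFIXES
        if not Path(str(ref) + suffix).is_file()
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_index.py <reference file passed to bwa>
    missing = missing_bwa_index_files(sys.argv[1])
    print("missing index files:", ", ".join(missing) if missing else "none")
```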

@EdHarry
Contributor

EdHarry commented Apr 12, 2019

I've just pushed an update which should have better error handling. Could you try the latest version out?

BTW, unless you want to keep the barcode processed reads, it will be more efficient to list your 10x fq files in an input.fofn text file and pass that in with -data input.fofn.

@harish0201
Author

Thank you for the update!

I did try listing the files in the fofn and passing it in, but I then broke the run down into its individual steps to see where the errors occurred!

I have manually generated the bwa indices, and I'm currently running the steps manually to troubleshoot further where the issues are.

I'll keep you updated.

@mwhj

mwhj commented Apr 15, 2019

I have the same problem! I've been pulling my hair out thinking I've been doing something wrong. It would be really brilliant to get this fixed.

@harish0201
Author

harish0201 commented Apr 20, 2019

Alright, I finally have a 10X dataset that I need to scaffold with. These are the errors on the latest commit:

[main] CMD: /home/harish/Tools/Scaff10X/src/scaff-bin/bwa mem -t 30 tarseq.fastq /data/analysis/dr.n.v.singh/genome/10X/scaff/genome-BC2_1.fastq.gz /data/analysis/dr.n.v.singh/genome/10X/scaff/genome-BC2_2.fastq.gz
[main] Real time: 43975.365 sec; CPU: 1319572.223 sec
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `mv genome.fasta /data/analysis/plant/genome/10X/scaff/(null)'
Error running command: mv genome.fasta /data/analysis/plant/genome/10X/scaff/(null)
Input target assembly file2: /data/analysis/plant/genome/10X/scaff/polished.fa
Input read1 file: /data/analysis/plant/genome/10X/scaff/genome-BC2_1.fastq.gz
Input read2 file: /data/analysis/plant/genome/10X/scaff/genome-BC2_2.fastq.gz
/home/harish/Tools/Scaff10X/src/scaff-bin/bwa mem -t 30 tarseq.fastq /data/analysis/plant/genome/10X/scaff/genome-BC2_1.fastq.gz /data/analysis/plant/genome/10X/scaff/genome-BC2_2.fastq.gz | egrep tarseq_ | awk '($2<100)&&($5>=0){print $1,$2,$3,$4,$5}' | egrep -v '^@' > align.dat

It still generates a genome.fasta, which seems to be the post-scaffolding output, but there's very little reduction, which is kinda curious. I'll have to check with the older version.
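A note on the `(null)` at the end of the `mv` target: that is what glibc's printf prints when a NULL string is formatted with `%s`, so my guess (not confirmed anywhere in this thread) is that an output file name was never set before the command was built. The shell then rejects the unquoted parentheses, which gives the syntax error. Below is a minimal sketch of the kind of guard a wrapper script could apply before handing a command to `sh -c`; `check_shell_command` and `mv_command` are hypothetical helpers, not part of Scaff10X.

```python
import shlex


def check_shell_command(command):
    """Return `command` unchanged, or raise ValueError if it contains the literal '(null)'."""
    if "(null)" in command:
        raise ValueError(f"unset argument in command: {command!r}")
    return command


def mv_command(src, dst):
    """Build an `mv` command string with both paths quoted for the shell."""
    if not dst:
        raise ValueError("destination path is empty")
    return f"mv {shlex.quote(src)} {shlex.quote(dst)}"


if __name__ == "__main__":
    # Placeholder paths, only to show the quoting.
    print(mv_command("genome.fasta", "out dir/scaffolds.fa"))
```

Quoting alone would only hide the problem here (the file would be moved to a path literally named `(null)`), which is why the sketch also checks for the marker explicitly.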

Regarding retaining the debarcoded reads: I plan to use them for polishing as well :) Thanks for the heads-up.
