No fusion found using tests data #16

Open
chizhenfen opened this issue Jan 1, 2019 · 20 comments

Comments

@chizhenfen

Hi Shifu,
No fusions are found when I run GeneFuse on the test data:
./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 R1.fq.gz -2 R2.fq.gz -h r1r2n.html
15:51:11 start with 4 threads
15:51:50 mapper indexing done
15:52:20 sequence number before filtering: 0
15:52:20 removeByComplexity: 0
15:52:20 removeByDistance: 0
15:52:20 removeIndels: 0
15:54:5 matcher indexing done
15:54:5 removeAlignables: 0
15:54:5 found 0 fusions

./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h genefuser1r2n.html
15:55:45 start with 4 threads
15:56:25 mapper indexing done
15:56:36 sequence number before filtering: 0
15:56:36 removeByComplexity: 0
15:56:36 removeByDistance: 0
15:56:36 removeIndels: 0
15:58:25 matcher indexing done
15:58:25 removeAlignables: 0
15:58:25 found 0 fusions

Dataset was downloaded from: http://opengene.org/dataset.html

Thanks.

@sfchen
Member

sfchen commented Jan 2, 2019

I just tried again with the command: ./genefuse -r ~/data/ref/hg19.fa -1 ~/data/fq/genefuse.R1.fq.gz -2 ~/data/fq/genefuse.R2.fq.gz -h test.html -j test.json -f genes/druggable.hg19.csv

and got:

15:1:4 start with 4 threads
15:1:47 mapper indexing done
15:2:38 sequence number before filtering: 1329
15:2:38 removeByComplexity: 0
15:2:38 removeByDistance: 39
15:2:38 removeIndels: 67
15:4:3 matcher indexing done
15:4:3 removeAlignables: 8

Perhaps you used an incorrect reference? I used hg19 downloaded from UCSC. Did you check the downloaded files using MD5?

@dickyornot

Hi sfchen,

I have the same problem. I used all the demo files you provided, including the reference genome, but I still got nothing in my result.

@MarcHiggins

Hi sfchen,

I have experienced the same problem as others have reported here. Have you gotten to the bottom of this?

Thanks.

@sfchen
Member

sfchen commented May 16, 2019

Can you check the MD5 of the downloaded FASTQ files?

@sfchen
Member

sfchen commented May 16, 2019

http://opengene.org/dataset.html

You should download the following files:
Paired-end FASTQ files for GeneFuse testing (Illumina platform)
genefuse.R1.fq.gz (size: 62 M, MD5: 171e6dfa0af37fe95c826005bc5fcdf9)
genefuse.R2.fq.gz (size: 66 M, MD5: e756cf01e256186dccaa9e700d85a342)
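The checksums listed above can be compared against the local downloads with a few lines of shell. This is only a sketch: `check_md5` is a hypothetical helper, and it assumes `md5sum` from GNU coreutils is available (macOS ships `md5` instead) and that the FASTQ files sit in the current directory.

```shell
#!/bin/sh
# check_md5 FILE EXPECTED: compare a file's MD5 with the published value.
# (Hypothetical helper, not part of GeneFuse.)
check_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <file>"; keep only the hash.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "OK       $file"
    else
        echo "MISMATCH $file (got $actual)"
        return 1
    fi
}

# Checksums as listed on the dataset page quoted above.
if [ -f genefuse.R1.fq.gz ]; then
    check_md5 genefuse.R1.fq.gz 171e6dfa0af37fe95c826005bc5fcdf9
fi
if [ -f genefuse.R2.fq.gz ]; then
    check_md5 genefuse.R2.fq.gz e756cf01e256186dccaa9e700d85a342
fi
```

A mismatch usually points to an interrupted or corrupted download, in which case fetching the file again is the first thing to try.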

@MarcHiggins

Hi sfchen,

Yes, those are the same MD5 values I get for the downloaded FASTQs. The command I run is: ./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html >result

I downloaded the .fasta file from Ensembl.

Thanks.

@sfchen
Member

sfchen commented May 16, 2019

The druggable.hg19.csv is in the genes folder

Have you checked the error message?

@sfchen
Member

sfchen commented May 16, 2019

I mean, you should run:

./genefuse -r Homo_sapiens_assembly19.fasta -f genes/druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html >result
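Since a wrong path to the gene list was one suspicion at this point in the thread, a quick pre-flight check can rule it out before a long run. A minimal sketch, assuming the file names from the command above; `require_file` is a hypothetical helper and not part of GeneFuse.

```shell
#!/bin/sh
# require_file PATH: report an input that is missing or empty.
# (Hypothetical helper for illustration.)
require_file() {
    if [ ! -s "$1" ]; then
        echo "missing or empty input: $1" >&2
        return 1
    fi
}

# Paths taken from the command quoted above; adjust to your layout.
missing=0
for f in Homo_sapiens_assembly19.fasta genes/druggable.hg19.csv \
         genefuse.R1.fq.gz genefuse.R2.fq.gz; do
    require_file "$f" || missing=1
done

if [ "$missing" -eq 0 ]; then
    echo "all inputs present"
else
    echo "fix the paths above before running genefuse" >&2
fi
```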

@MarcHiggins

I downloaded druggable.hg19.csv from the genes folder via wget. No errors are reported in the result file.

@sfchen
Member

sfchen commented May 16, 2019

Errors are written to STDERR, not STDOUT, so you will not find them in the result file.

Can you just run the command without redirecting to result?

@MarcHiggins

I do not get any STDERR or STDOUT files, whether or not I redirect to result. I am running the binary, in case that makes a difference. Thank you for your help, by the way.

@MarcHiggins

Apologies, I meant I do not get an STDOUT file at all.

@sfchen
Member

sfchen commented May 16, 2019

You used >, which redirected STDOUT to the file you specified.

@MarcHiggins

But even if I leave out the >, there is no STDERR file. That is what I meant, not the lack of STDOUT. Apologies for the confusion.

@sfchen
Member

sfchen commented May 16, 2019

You didn't redirect STDERR, so it is printed on the terminal.

You can use the following command to redirect STDERR as well:

./genefuse -r Homo_sapiens_assembly19.fasta -f druggable.hg19.csv -1 genefuse.R1.fq.gz -2 genefuse.R2.fq.gz -h report.html >result 2>err.log
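How the two streams are split can be seen with a stand-in command. The sketch below does not run GeneFuse; `emit_both` is a hypothetical function that writes one line to each stream, and the output goes to a temporary directory.

```shell
#!/bin/sh
# Stand-in for a program that writes to both streams (not GeneFuse itself).
emit_both() {
    echo "normal output"          # goes to STDOUT
    echo "error message" >&2      # goes to STDERR
}

workdir=$(mktemp -d)

# STDOUT to one file, STDERR to another. Note there is no "&" between the
# two redirections: a lone "&" would send the command to the background
# and end it there, leaving "2>err.log" as a separate, empty command.
emit_both >"$workdir/result" 2>"$workdir/err.log"

# Alternatively, send both streams into a single file.
emit_both >"$workdir/all.log" 2>&1

cat "$workdir/result"    # normal output
cat "$workdir/err.log"   # error message
```

If nothing is redirected, both streams simply appear in the terminal, so no file is created for either of them.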

@MarcHiggins

I have run it again and no errors are reported. I notice, however, that the FASTQ files don't have the 15 million lines you mention in a different thread; they have more like 800,000. Maybe this is the issue?
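Line and read counts for a gzipped FASTQ are easy to check directly. A rough sketch, assuming the usual four-lines-per-read FASTQ layout; `count_reads` is a hypothetical helper. For paired-end data, R1 and R2 should report the same number of reads.

```shell
#!/bin/sh
# count_reads FILE.fq.gz: number of reads, assuming 4 lines per record.
# (Hypothetical helper for illustration.)
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

for fq in genefuse.R1.fq.gz genefuse.R2.fq.gz; do
    if [ -f "$fq" ]; then
        echo "$fq: $(count_reads "$fq") reads"
    fi
done
```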

@MarcHiggins

Hi sfchen, I have also run genefuse on FASTQs which I know contain translocations. Your software did not call them. This is more to let you know than a specific request or question.

@sfchen
Member

sfchen commented May 17, 2019

Thanks, I will try to reproduce it.

@tkcaccia

Hi sfchen,

I am in the same situation. I do not find any gene fusions in the test dataset.

@amacbride

I found the answer in #31: the NCBI version of hg19 has a different chromosome naming convention, so it doesn't work. The version downloadable from UCSC is fine:

wget http://hgdownload.cse.ucsc.edu/goldenPath/hg19/bigZips/hg19.fa.gz
(then decompress it with gunzip)
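The naming difference is visible in the FASTA header lines: UCSC uses names such as chr1, while Ensembl-style references use bare names such as 1. The sketch below lists the sequence names of a reference and reports whether they all carry the chr prefix. The helper names and the `hg19.fa` path are assumptions for illustration; that GeneFuse needs chr-style names is inferred from this thread, not checked against its source.

```shell
#!/bin/sh
# fasta_contigs FILE: print the sequence names from FASTA header lines.
fasta_contigs() {
    # Header lines start with ">"; the name is the first word after it.
    grep '^>' "$1" | sed 's/^>//' | awk '{print $1}'
}

# uses_chr_prefix FILE: succeed if every sequence name starts with "chr".
uses_chr_prefix() {
    ! fasta_contigs "$1" | grep -qv '^chr'
}

ref=hg19.fa   # hypothetical path to the decompressed reference
if [ -f "$ref" ]; then
    fasta_contigs "$ref" | head -n 5
    if uses_chr_prefix "$ref"; then
        echo "UCSC-style names (chr1, chr2, ...)"
    else
        echo "some names lack the chr prefix"
    fi
fi
```

Running the same check on the reference that produced zero fusions should show quickly whether it falls into the second case.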
