"Local Graph Exploded" warning for hisat2 build using --snp,--haplotype,--exon and --ss. #404

Open
SJJHK opened this issue Feb 2, 2023 · 0 comments

SJJHK commented Feb 2, 2023

Hi there,

We hope you are well.

Quite a few people (n=9) have commented on this problem in #267.

I am reposting my problem here as a separate issue because it was previously part of that conversation thread.

The warnings described in #267, of exploding graphs with a length of 57344, also occur for us (HISAT2 version 2.2.1).

Please see the attached output.

This was the command used:

hisat2-build --exon V41_extractexon --ss V41_extractsplice --snp genome.snp --haplotype genome.haplotype -f Gencode_V41_Comp_ERCC_Merge_Genome.fa snp_haplotype_test

After "Generation 22", the warnings appear.
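For anyone wanting to check their own build log in the same way, here is a minimal sketch for counting the warnings and finding where they start. It assumes the warning lines contain the word "exploded" (as in the issue title); the exact wording in the log may differ, and the function names `count_exploded` and `first_exploded_line` are hypothetical helpers, not part of HISAT2.

```shell
#!/usr/bin/env bash
# Sketch: summarise "exploded" warnings in a hisat2-build log.
# Assumption: warning lines contain the word "exploded" (case-insensitive);
# adjust the pattern if the log uses different wording.

count_exploded() {
    # Print the number of lines in file "$1" that mention "exploded".
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -ci 'exploded' "$1" || true
}

first_exploded_line() {
    # Print the line number of the first matching line, or nothing if none.
    grep -ni 'exploded' "$1" | head -n 1 | cut -d: -f1
}
```

Usage, after sourcing the functions: `count_exploded hisat2-build-log-jan24-2023.txt` prints the number of matching lines, and `first_exploded_line hisat2-build-log-jan24-2023.txt` prints where the first one occurs, which can then be compared against the position of the "Generation 22" line.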

We use the human GENCODE release 41 comprehensive annotation (GRCh38.p13).

Is it safe to treat this as a "warning" rather than an "error", given that the build continues through to completion?

We are able to align a sample file against the new index and view the resulting .bam. We also compared it with the same sample file aligned against a standard index build, with a novel splice site infile supplied during alignment. We saw a 1% reduction in alignment rate when comparing the two, although two files are hardly a meaningful comparison.
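The alignment-rate comparison can be sketched as below. This assumes each alignment's summary was saved to a text file containing a line with the phrase "overall alignment rate" next to a percentage, which is how HISAT2 reports it; the file names in the usage note and the helper names `alignment_rate` and `rate_difference` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare the overall alignment rate of two HISAT2 runs.
# Assumption: each summary file has a line mentioning
# "overall alignment rate" together with a value such as "95.12%".

alignment_rate() {
    # Print the percentage (without the % sign) from summary file "$1".
    grep -i 'overall alignment rate' "$1" \
        | grep -oE '[0-9]+(\.[0-9]+)?%' \
        | head -n 1 \
        | tr -d '%'
}

rate_difference() {
    # Print rate of "$1" minus rate of "$2", in percentage points.
    awk -v a="$(alignment_rate "$1")" -v b="$(alignment_rate "$2")" \
        'BEGIN { printf "%.2f\n", a - b }'
}
```

For example, `rate_difference standard_index.summary.txt snp_haplotype_index.summary.txt` would print the difference in percentage points between the two runs. Repeating this over several samples would give a fairer picture than a single pair.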

Two other questions:

  1. Early in the build output, the Local Sequence Length is reported as 57344. Is it a coincidence that the exploding graph warning reports the same length?
  2. Early in the build, after the input FASTA file is identified and the reference sizes are read, a line appears stating the time to read SNPs and splice sites (1:29), but the haplotype file is never mentioned. Is this an issue? The "--haplotype" section of the manual states: "See the above option, --snp, about how to extract haplotypes. This option is not required, but haplotype information can keep the index construction from exploding and reduce the index size substantially."

Thanks in advance for replying to this issue (or "non-issue").

hisat2-build-log-jan24-2023.txt
