Are non-human reference genomes supported? #6
Comments
HATCHet can be applied with any reference genome of a diploid organism. However, the current implementation needs to know some details of the reference, in particular the lengths of its chromosomes, which is why there is a list of supported references. If you can provide the name and details of your reference genome (especially the chromosome lengths), I will be happy to add it to the supported references right away.
Thanks! I could not find the list of supported references. The reference genome I am working with is Sus scrofa 11.1.
We are going to release a new version in the next couple of days that will automatically support any reference genome, without the need to specify its name. I will update this post as soon as it is released.
@programmingprincess The current version of HATCHet now supports any reference genome. Please let us know if you can test the current version with your samples and reference genome; we will keep the issue open for now.
@simozacca I just tested the current version with the reference genome, and it looks good! Thanks for the follow-up 👍
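For anyone who needs to report chromosome lengths for a custom reference, here is a minimal sketch of reading them from a FASTA index (.fai) file, such as the one produced by samtools faidx. It assumes the standard tab-separated .fai layout (NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH). The function name and the example index contents are made up for illustration; this is not part of HATCHet.

```python
import io


def read_fai_lengths(handle):
    """Return {sequence_name: length} from an open FASTA index (.fai) file."""
    lengths = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        # .fai columns: NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH
        lengths[fields[0]] = int(fields[1])
    return lengths


# Made-up example index; these are NOT real Sus scrofa values.
example_fai = io.StringIO(
    "chrA\t1000\t6\t60\t61\n"
    "chrB\t250\t1029\t60\t61\n"
)
chrom_lengths = read_fai_lengths(example_fai)
for name, length in chrom_lengths.items():
    print(f"{name}\t{length}")
```

With a real file, the same function could be called as read_fai_lengths(open("genome.fasta.fai")), where the path is whatever your own index is called.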
Hello, and thanks for the tool. I am having trouble using it with a non-human reference. The files genome.fasta, genome.fasta.fai, and genome.dict are all in the same directory. I tried running the whole pipeline with a hatchet.ini file as discussed here:
And I tried just
Both give this error:
I installed hatchet the following way:
Tell Ben I say hello. I knew him when he was at Brown, and sat in on some of his lab meetings.
P.S. I'm guessing the offending code is this:
This appears to be hard-coded to expect human chromosomes. I work with an insect whose chromosome names are X, II, III, IV, and ....associated_contig_k... (for k in 1:N).
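To illustrate the kind of problem described here, the following is a hypothetical sketch of a name check that only accepts human-style chromosome names (1 to 22, X, Y, with an optional "chr" prefix). It is not HATCHet's actual code, which is not shown in this thread; the set and function names are invented for the example.

```python
# Hypothetical human-only whitelist; NOT taken from HATCHet's source.
HUMAN_NAMES = {str(i) for i in range(1, 23)} | {"X", "Y"}


def matches_human_naming(name):
    """True if the name looks like a human chromosome, with or without 'chr'."""
    bare = name[3:] if name.startswith("chr") else name
    return bare in HUMAN_NAMES


# Main chromosome names reported for the insect assembly in this comment.
insect_names = ["X", "II", "III", "IV"]
unmatched = [n for n in insect_names if not matches_human_naming(n)]
print(unmatched)
```

Under such a check, only X would be recognized, and the Roman-numeral chromosomes would be rejected even though the reference itself is valid.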
Unfortunately, the current version of HATCHet does not support genomes with different chromosome names than those in the human genome. Also, please note that HATCHet can only work properly with diploid genomes. We will consider the extension in future developments, thank you!
Thanks for letting me know. When you say "diploid genomes", are you referring to the ploidy of the organism or of the assembly? The insect I work with is diploid (well, some cells polyploidize, but most somatic cells are diploid). The current assembly is collapsed into one haploid complement.
I was referring to the ploidy of the organism; sorry, I did not know that this insect was diploid. We think that we might be able to extend HATCHet to work with arbitrary chromosomes and we are working on it. Hopefully we will have new developments soon.
I am attempting to run the example script (https://github.com/raphael-group/hatchet/blob/master/script/runHATCHet.sh) on pig data. In the deBAF step, I am getting the error:
The given reference cannot be used because the chromosome names are inconsistent!
The command being used is
python /home/jiaqiwu6/hatchet/utils/deBAF.py -N /scratch/data/oncopig/kidney_RG.dedup.bam -T /scratch/data/oncopig/tumor1_RG.dedup.bam /scratch/data/oncopig/tumor2_RG.dedup.bam /scratch/data/oncopig/tumor3_RG.dedup.bam /scratch/data/oncopig/tumor4_RG.dedup.bam /scratch/data/oncopig/tumor5_RG.dedup.bam /scratch/data/oncopig/cell_line_RG.dedup.bam -S Normal tumor1 tumor2 tumor3 tumor4 tumor5 tumor0 -r /scratch/data/oncopig/ref/sus11.1.fa -j 22 -q 20 -Q 20 -U 20 -c 4 -C 300 -O /scratch/data/oncopig/hatchet_script/baf/normal.baf -o /scratch/data/oncopig/hatchet_script/baf/bulk.baf -v
I wonder if this is because I used -g hg19 in the binBAM step. Are non-human reference genomes supported (if so, where can I specify this)? Alternatively, can we exclude certain chromosomes from the computation entirely? Thanks in advance.
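One way to investigate a chromosome-name error like the one above is to compare the sequence names in the BAM header with those in the reference. The sketch below assumes the header text has been obtained separately (for example with samtools view -H) and that the reference names come from the .fai or .dict file. The function names and example data are hypothetical, and this is not necessarily the check that deBAF performs.

```python
def sq_names(header_text):
    """Collect sequence names from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def name_mismatches(bam_names, ref_names):
    """Return (names only in the BAM, names only in the reference), sorted."""
    bam, ref = set(bam_names), set(ref_names)
    return sorted(bam - ref), sorted(ref - bam)


# Made-up header and reference names, for illustration only.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chr2\tLN:500\n"
)
example_ref_names = ["1", "2"]

only_in_bam, only_in_ref = name_mismatches(
    sq_names(example_header), example_ref_names
)
print("Only in BAM:", only_in_bam)
print("Only in reference:", only_in_ref)
```

If both lists are empty, the names agree; otherwise the output shows which side uses a different convention (for example a "chr" prefix on one side only).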