Error: Have found no compatible reference specifications #61

Open
santhoshnh opened this issue Aug 6, 2021 · 11 comments

Comments

@santhoshnh

Hi,
I installed HLA-LA through conda, then downloaded and indexed the graph. When I run HLA-LA.pl, I get the error below:

Have found no compatible reference specifications in /home/admin/anaconda3/opt/hla-la/src/../graphs/PRG_MHC_GRCh38_withIMGT/knownReferences - create a file for this BAM file and try again. at /home/admin/anaconda3/bin/HLA-LA.pl line 309

Below I am attaching the samtools idxstats results for my BAM file.
Kindly help me resolve the issue.
samtools_idsstats.txt

@TonyLupara

Hi, I've corrected your file so that HLA-LA can use it as an extraction specification. It tells HLA-LA which reads to extract from your BAM file for genotyping. I marked the chr6 MHC region and all unmapped reads for extraction.

Add this file to the graphs/PRG_MHC_GRCh38_withIMGT/knownReferences folder.

reference_extraction.txt

Let me know if it works.

@santhoshnh
Author

Hi, I am getting the following error after adding reference_extraction.txt to the graphs/PRG_MHC_GRCh38_withIMGT/knownReferences folder:

Incorrect header for /home/admin/anaconda3/opt/hla-la/src/../graphs/PRG_MHC_GRCh38_withIMGT/knownReferences/reference_extraction.txt at /home/admin/anaconda3/bin/HLA-LA.pl line 255, line 1

@TonyLupara

Use this one; it should work.

reference_extraction_UNIX.txt

@santhoshnh
Author

Thank you, it is working. What is the difference between this file and the previous one? I would like to be able to create one myself if I use a different genome.

@TonyLupara

The difference is the line break type. When you edit a text file on Windows, it is saved with CR LF (Windows) line breaks. I used Notepad++ to change them to LF (Unix) line breaks.
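
For anyone without Notepad++, the same conversion can be scripted. This is only a minimal sketch in Python: the function names are made up for illustration, and the file names in the usage note are the attachment names from this thread.

```python
def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CR LF (Windows) and lone CR line breaks to LF (Unix)."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write an LF-only copy to dst_path."""
    # Binary mode keeps Python from translating line breaks on its own.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_unix_line_endings(data))
```

Calling `convert_file("reference_extraction.txt", "reference_extraction_UNIX.txt")` would write an LF-only copy next to the original.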

@santhoshnh
Author

Thank you. Can I use a BAM file obtained by mapping FASTA reads to a reference genome?

@TonyLupara

You can. Run samtools idxstats on your BAM file, extract the contig ID and length columns, build a proper extraction file (in Excel, for example), and switch it to Unix line breaks.
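
The first two steps can also be scripted instead of done in Excel. The sketch below is a rough starting point, not HLA-LA's own tooling: the function names are hypothetical, and it only produces the contig ID and length columns with LF line breaks. The header row and the extraction columns are not generated here; they are assumed to be copied from a file HLA-LA already accepts, such as the reference_extraction_UNIX.txt attached above.

```python
import subprocess


def parse_idxstats(text: str) -> list[tuple[str, int]]:
    """Return (contig ID, length) pairs from `samtools idxstats` output.

    Each line is tab-separated: name, length, mapped reads, unmapped reads.
    The final "*" line (reads not assigned to any contig) is skipped.
    """
    contigs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, length = line.split("\t")[:2]
        if name == "*":
            continue
        contigs.append((name, int(length)))
    return contigs


def idxstats_for_bam(bam_path: str) -> str:
    """Run `samtools idxstats` on an indexed BAM and return its text output."""
    result = subprocess.run(
        ["samtools", "idxstats", bam_path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def write_contig_table(contigs, out_path: str) -> None:
    """Write contig ID and length, tab-separated, with LF line breaks."""
    # newline="\n" stops Python from writing CR LF on Windows.
    with open(out_path, "w", newline="\n") as out:
        for name, length in contigs:
            out.write(f"{name}\t{length}\n")
```

A typical run would be `write_contig_table(parse_idxstats(idxstats_for_bam("sample.bam")), "contigs.txt")`, where `sample.bam` and `contigs.txt` are placeholder names.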

@chenxf611

chenxf611 commented Oct 11, 2021

Hi, I mapped my PacBio HLA reads to hg38 and then ran HLA-LA.pl with the --longReads pacbio option. I got the same error:

Have found no compatible reference specifications in ./graphs/PRG_MHC_GRCh38_withIMGT/knownReferences - create a file for this BAM file and try again.

My question is: which reference should I use to map my reads in the first place, regular hg38 or an HLA-specific reference?

Thanks

Jack

@TonyLupara

Use whatever reference you think is correct for your situation. For example, I use GRCh38_full_analysis_set_plus_decoy_hla.fa for my processing, as I believe it increases the mapping quality (mapQ) of reads aligned to alternative contigs. Note that reads should be mapped to a reference with alt contigs using an alt-aware mapper (for example, bwa-mem), ideally with postprocessing. Otherwise, reads that map to multiple locations will have a mapQ of zero.
More details here:
https://github.com/lh3/bwa/blob/master/README-alt.md
https://lh3.github.io/2017/11/13/which-human-reference-genome-to-use

If you don't know how to use an alt-aware mapper, it's better to work with hg38_analysis_set without alt contigs. For example:
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz

@AndresCongenica

AndresCongenica commented Apr 3, 2023

Hi, I am trying to evaluate different tools and have chosen to use NF-Core's reference file for cutting down to relevant regions. I am facing the same issue as the rest in the thread: "Have found no compatible reference specifications in..."

Kindly see my idxstats file attached and please let me know of anything else you may require.
idx_stats.txt

@AlexanderDilthey
Member

Hi all,

If you have freedom to choose the reference you map against, I would recommend using the standard 1000 Genomes reference: http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/GRCh38_reference_genome/GRCh38_full_analysis_set_plus_decoy_hla.fa

@AndresCongenica, it looks like all of the contigs in your file contain a "HLA" substring, which may indicate that all reads mapping to any of these should be extracted. However, I am not sure where any such BAM would come from - would it be based on extracting alignment records from a BAM that is based on mapping against a whole-genome reference (which would probably be fine), or based on mapping a set of whole-genome reads against a reference containing only the HLA reference contigs (which may be a process prone to attracting false-positive alignments)?

Best wishes

Alex
