Distance between vcf position and chrom start is smaller than read length. #38

Closed
MaoYafei opened this issue Feb 5, 2020 · 3 comments

MaoYafei commented Feb 5, 2020

Hi All,

I tried to run Paragraph on my test dataset but got the error below:

$ python3 ../../../Tools/paragraph/bin/multigrmpy.py -i ../pbsv_sample_Bonobo_sv.vcf.gz -m sample.txt -r ../ref.fa -o test &
[1] 39497
$ 2020-02-04 16:37:34,250 ERROR VCF to JSON conversion failed.
2020-02-04 16:37:34,303 ERROR Traceback (most recent call last):
2020-02-04 16:37:34,303 ERROR File "../../../Tools/paragraph/bin/multigrmpy.py", line 52, in load_graph_description header, records, event_list = convert_vcf_to_json(args, alt_paths=True)
2020-02-04 16:37:34,303 ERROR File "/net/eichler/vol26/projects/primate_sv/nobackups/Tools/paragraph/lib/python3/grm/vcf2paragraph/init.py", line 133, in convert_vcf_to_json header, records, block_ids = parse_vcf_lines(args.input, args.read_length, args.split_type)
2020-02-04 16:37:34,304 ERROR File "/net/eichler/vol26/projects/primate_sv/nobackups/Tools/paragraph/lib/python3/grm/vcf2paragraph/init.py", line 209, in parse_vcf_lines raise Exception("Distance between vcf position and chrom start is smaller than read length.")
2020-02-04 16:37:34,304 ERROR Exception: Distance between vcf position and chrom start is smaller than read length.
2020-02-04 16:37:34,305 ERROR Traceback (most recent call last):
2020-02-04 16:37:34,305 ERROR File "../../../Tools/paragraph/bin/multigrmpy.py", line 261, in run graph_files = load_graph_description(args)
2020-02-04 16:37:34,305 ERROR File "../../../Tools/paragraph/bin/multigrmpy.py", line 52, in load_graph_description header, records, event_list = convert_vcf_to_json(args, alt_paths=True)
2020-02-04 16:37:34,305 ERROR File "/net/eichler/vol26/projects/primate_sv/nobackups/Tools/paragraph/lib/python3/grm/vcf2paragraph/init.py", line 133, in convert_vcf_to_json header, records, block_ids = parse_vcf_lines(args.input, args.read_length, args.split_type)
2020-02-04 16:37:34,306 ERROR File "/net/eichler/vol26/projects/primate_sv/nobackups/Tools/paragraph/lib/python3/grm/vcf2paragraph/init.py", line 209, in parse_vcf_lines raise Exception("Distance between vcf position and chrom start is smaller than read length.")
2020-02-04 16:37:34,306 ERROR Exception: Distance between vcf position and chrom start is smaller than read length.
Traceback (most recent call last):
  File "../../../Tools/paragraph/bin/multigrmpy.py", line 353, in <module>
    main()
  File "../../../Tools/paragraph/bin/multigrmpy.py", line 349, in main
    run(args)
  File "../../../Tools/paragraph/bin/multigrmpy.py", line 261, in run
    graph_files = load_graph_description(args)
  File "../../../Tools/paragraph/bin/multigrmpy.py", line 52, in load_graph_description
    header, records, event_list = convert_vcf_to_json(args, alt_paths=True)
  File "/net/eichler/vol26/projects/primate_sv/nobackups/Tools/paragraph/lib/python3/grm/vcf2paragraph/init.py", line 133, in convert_vcf_to_json
    header, records, block_ids = parse_vcf_lines(args.input, args.read_length, args.split_type)
  File "/net/eichler/vol26/projects/primate_sv/nobackups/Tools/paragraph/lib/python3/grm/vcf2paragraph/init.py", line 209, in parse_vcf_lines
    raise Exception("Distance between vcf position and chrom start is smaller than read length.")
Exception: Distance between vcf position and chrom start is smaller than read length.

Here is my manifest file:

$ cat sample.txt
id path idxdepth
bonobo_10 aln_realigned_reads.bam aln_realigned_reads.bam.json
$ ls aln_realigned_reads.bam
aln_realigned_reads.bam
$ ls aln_realigned_reads.bam.json
aln_realigned_reads.bam.json

Do you have any idea about it?

Thank you so much.

Best,
Yafei

Contributor

traxexx commented Feb 10, 2020

Hi @MaoYafei, could you please check whether your VCF contains a record with a POS value smaller than your read length? That should be the reason. The graph alignment needs the flanking sequence to be at least one read length long. If the variant is too close to the start of the chromosome, there is no way to build a good graph for it.
I will update the error message to make it more precise.
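
One way to act on this check, as a minimal sketch and not code from the Paragraph repository: scan the VCF, report records whose POS is smaller than the read length, and write the remaining records to a new file. The function names, the output file name, and the read length of 150 are hypothetical. The comparison `pos < read_length` follows the wording above; the exact boundary Paragraph uses is not stated in this thread, so treat it as an assumption.

```python
import gzip


def open_text(path, mode="rt"):
    """Open a plain-text or gzip-compressed file in text mode."""
    return gzip.open(path, mode) if path.endswith(".gz") else open(path, mode)


def split_by_flank(vcf_lines, read_length):
    """Split VCF lines into (kept, dropped).

    Header lines (starting with '#') and blank lines are always kept.
    A record is dropped when its POS (second tab-separated column) is
    smaller than read_length. This threshold is an assumption based on
    the discussion, not taken from Paragraph's source.
    """
    kept, dropped = [], []
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            kept.append(line)
            continue
        pos = int(line.split("\t", 2)[1])
        (dropped if pos < read_length else kept).append(line)
    return kept, dropped


def filter_vcf(in_path, out_path, read_length):
    """Write records that pass the check to out_path; return dropped lines."""
    with open_text(in_path) as handle:
        kept, dropped = split_by_flank(handle, read_length)
    with open_text(out_path, "wt") as handle:
        handle.writelines(kept)
    return dropped


if __name__ == "__main__":
    # Hypothetical values: set read_length to match your BAM and
    # whatever read length Paragraph is run with.
    removed = filter_vcf("../pbsv_sample_Bonobo_sv.vcf.gz", "filtered_sv.vcf", 150)
    for record in removed:
        print("too close to chromosome start:", record.rstrip())
```

The filtered output here is an uncompressed VCF. If the input to multigrmpy.py needs to be compressed and indexed, that step would have to be done separately.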

@netwon123

Could you solve this problem?

@pingxinxing

Hello, have you solved this problem successfully? What is the solution?
