Some error raised and some questions #299

Closed
codeunsolved opened this issue Apr 9, 2019 · 2 comments

@codeunsolved

Hi, LUMPY heroes:

I recently ran LUMPY (v0.2.13) on NA12878 for benchmarking. It printed what looks like an error, but the VCF file was still generated, so I am not sure whether the VCF file is valid.

I followed the "LUMPY (traditional) usage" workflow from the README. Here are my workflow and log:

[INFO] >> 1 BAMs found!
[INFO] >> Caller  : lumpy
[INFO] • No.1 ******.bam 94.34MB
[INFO] >> Output  : /******/TEST/test_lumpy/******
[INFO] • Run lumpy v0.2.13
[INFO] [RUN_LUMPY] SampleID: ******, RLEN: 150, min_non_overlap: 101, discordant_z: 5, back_distance: 10, weight: 1, min_mapping_threshold: 20, mw: 4, tt: 0
[INFO] [RUN_LUMPY] Step 0-1: Generate discordants BAM for PE
[INFO] CMD: samtools view -b -F 1294 /******/Downloads/NGS-Data/******/******.bam > /******/TEST/test_lumpy/******/******.discordants.unsorted.bam
[INFO] CMD: samtools sort /******/TEST/test_lumpy/******/******.discordants.unsorted.bam > /******/TEST/test_lumpy/******/******.discordants.bam
[INFO] CMD: rm /******/TEST/test_lumpy/******/******.discordants.unsorted.bam
[INFO] [RUN_LUMPY] Step 0-2: Generate splitters BAM for SR
[INFO] CMD: samtools view -h /******/Downloads/NGS-Data/******/******.bam             | /******/miniconda3/envs/py2.7/bin/python /******/Projects/NGS/lumpy-sv/scripts/extractSplitReads_BwaMem -i stdin             | samtools view -Sb - > /******/TEST/test_lumpy/******/******.splitters.unsorted.bam

[INFO] CMD: samtools sort /******/TEST/test_lumpy/******/******.splitters.unsorted.bam > /******/TEST/test_lumpy/******/******.splitters.bam
[INFO] CMD: rm /******/TEST/test_lumpy/******/******.splitters.unsorted.bam
[INFO] [RUN_LUMPY] Step 1: Generate histo
[INFO] CMD: samtools view /******/Downloads/NGS-Data/******/******.bam         | tail -n 100000         | /******/miniconda3/envs/py2.7/bin/python /******/Projects/NGS/lumpy-sv/scripts/pairend_distro.py         -r 150         -X 4         -N 10000         -o /******/TEST/test_lumpy/******/******.histo

[INFO] mean:218.3957	stdev:85.4206071245
[ERROR] Removed 0 outliers with isize >= 844
[INFO] [RUN_LUMPY] histo mean: 218.3957 stdev: 85.4206071245
[INFO] [RUN_LUMPY] Step 2: Run LUMPY with paired-end and split-reads
[INFO] CMD: /******/Projects/NGS/lumpy-sv/bin/lumpy     -mw 4     -tt 0     -pe 'id:******,bam_file:/******/TEST/test_lumpy/******/******.discordants.bam,histo_file:/******/TEST/test_lumpy/******/******.histo,mean:218.3957,stdev:85.4206071245,read_length:150,min_non_overlap:101,discordant_z:5,back_distance:10,weight:1,min_mapping_threshold:20'     -sr 'id:******,bam_file:/******/TEST/test_lumpy/******/******.splitters.bam,back_distance:10,weight:1,min_mapping_threshold:20'     > /******/TEST/test_lumpy/******/******.vcf

[ERROR] 417	0
1	1000000
2	1000000
3	1000000
4	1000000
5	1000000
6	1000000
7	1000000
8	1000000
9	1000000
10	1000000
11	1000000
12	1000000
13	1000000
13	2000000
13	4000000
13	8000000
13	16000000
13	32000000
14	1000000
14	2000000
14	4000000
14	8000000
14	16000000
14	32000000
15	1000000
15	2000000
15	4000000
15	8000000
15	16000000
15	32000000
16	1000000
17	1000000
18	1000000
19	1000000
20	1000000
20	2000000
21	1000000
21	2000000
21	4000000
21	8000000
21	16000000
22	1000000
22	2000000
22	4000000
22	8000000
22	16000000
22	32000000
X	1000000
Y	1000000
Y	2000000
Y	4000000
MT	1000000
GL000226.1	1000000
GL000229.1	1000000
GL000239.1	1000000
GL000245.1	1000000
GL000197.1	1000000
GL000246.1	1000000
GL000232.1	1000000
GL000237.1	1000000
GL000204.1	1000000
GL000198.1	1000000
GL000208.1	1000000
GL000228.1	1000000
GL000214.1	1000000
GL000221.1	1000000
GL000218.1	1000000
GL000220.1	1000000
GL000199.1	1000000
GL000217.1	1000000
GL000216.1	1000000
GL000205.1	1000000
GL000219.1	1000000
GL000224.1	1000000
GL000195.1	1000000
GL000212.1	1000000
GL000222.1	1000000
GL000193.1	1000000
GL000225.1	1000000
GL000192.1	1000000
[INFO] [TIME] Lumpy: 1min 01sec
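
For readers following Step 0-1 in the log, the `-F 1294` passed to `samtools view` is a bitmask of SAM flags to exclude. Below is a minimal sketch that decodes it. The bit values come from the SAM specification's FLAG field; the short names in `SAM_FLAGS` and the helper `decode_flag` are labels chosen here for illustration, not samtools identifiers.

```python
# Bit values follow the FLAG field of the SAM specification.
# The short names are illustrative labels, not official identifiers.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(value):
    """Return the names of the SAM flag bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SAM_FLAGS.items()) if value & bit]


# Reads with any of these bits set are dropped by `samtools view -F 1294`.
excluded = decode_flag(1294)
print(excluded)
```

Under this reading, the discordants BAM keeps reads that are mapped, have a mapped mate, are not flagged as properly paired, and are neither secondary alignments nor duplicates.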

At the end of the log, LUMPY wrote the following to stderr:

417 0
1 1000000
2 1000000
3 1000000
......

Questions:
(1) Is this output OK, or did something unexpected occur that I should pay attention to?
(2) I also found that every REF in the VCF is 'N'. Is that the expected output as well?
(3) I want to plot a ROC curve using SU from the FORMAT field as the threshold. Is that reasonable? It seems to be what the paper does.
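
To make questions (2) and (3) concrete, here is a rough sketch of the kind of check and threshold sweep involved. It is only an illustration under stated assumptions: the example records are made up, the `GT:SU:PE:SR` FORMAT layout is assumed, and the true/false labels are hypothetical (matching calls against a truth set is a separate step not shown here). The function names `parse_vcf_records` and `sweep_su_thresholds` are invented for this sketch.

```python
def parse_vcf_records(lines):
    """Yield (ref, fields) per data line; fields maps FORMAT keys to the
    first sample's values."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        cols = line.rstrip("\n").split("\t")
        ref = cols[3]                 # REF column
        keys = cols[8].split(":")     # FORMAT column
        values = cols[9].split(":")   # first sample column
        yield ref, dict(zip(keys, values))


def sweep_su_thresholds(labelled_calls):
    """labelled_calls: iterable of (su, is_true_positive) pairs.

    Returns (threshold, true_positives, false_positives) for each distinct
    SU value, keeping calls with SU >= threshold.
    """
    calls = list(labelled_calls)
    points = []
    for threshold in sorted({su for su, _ in calls}):
        kept = [is_tp for su, is_tp in calls if su >= threshold]
        tp = sum(kept)
        points.append((threshold, tp, len(kept) - tp))
    return points


# Made-up records for illustration only; not taken from a real LUMPY run.
example_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE",
    "1\t1000\t1\tN\t<DEL>\t.\t.\tSVTYPE=DEL\tGT:SU:PE:SR\t./.:12:8:4",
    "1\t5000\t2\tN\t<DUP>\t.\t.\tSVTYPE=DUP\tGT:SU:PE:SR\t./.:4:4:0",
    "2\t7000\t3\tN\t<DEL>\t.\t.\tSVTYPE=DEL\tGT:SU:PE:SR\t./.:7:3:4",
]

records = list(parse_vcf_records(example_vcf))
ref_alleles = {ref for ref, _ in records}            # question (2): which REF values occur
su_values = [int(fields["SU"]) for _, fields in records]

# Hypothetical truth labels, one per call, assumed to come from elsewhere.
labels = [True, False, True]
roc_points = sweep_su_thresholds(zip(su_values, labels))

print(ref_alleles)
print(roc_points)
```

Each tuple in `roc_points` gives one operating point; dividing by the truth-set size and the number of calls would turn the counts into rates for plotting.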

Many thanks!

@ryanlayer
Collaborator

ryanlayer commented Apr 16, 2019 via email

@codeunsolved
Author

Thank you! Thank you for your help, Dr. Layer.
