Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests to check generated data correctness #61

Closed
6 of 7 tasks
ayamada opened this issue Apr 20, 2017 · 6 comments
Closed
6 of 7 tasks

Add tests to check generated data correctness #61

ayamada opened this issue Apr 20, 2017 · 6 comments

Comments

@ayamada
Copy link
Contributor

ayamada commented Apr 20, 2017

(Came from #60 (review) )

Add many tests by #60, but some tests doesn't check generated(converted) result data.
Should check these data (but may need correct result data).

  1. ;; TODO: examine result

  2. (are [?args] (not-throw? (with-out-file temp-out (cli/pileup ?args)))

  3. ;; TODO: examine contents of temp-bam

  4. ;; TODO: check out-file

  5. (deftest about-create-dict

  6. ;; TODO: need bam-file include PCR duplications

  7. ;; TODO: Add test sorter/sort-by-pos with bam file includes many blocks

    • Need lightweight bam-file with many blocks (require to over 1500000) for sorter test coverage.
    • I commit ayamada@f2ed61b , it can work, but it cause very long time in lein cloverage, may not merge it.
@ayamada
Copy link
Contributor Author

ayamada commented Apr 21, 2017

I numbered items of this issue.

Seems (1)(2)(3)(4)(5) can be assigned to current cljam result for controlled test result.
(But it is very simple tests only comparing files, not perfect.)

Shall I create PR to compare contorolled test result for test (1)(2)(3)(4)(5) ?

ayamada added a commit to ayamada/cljam that referenced this issue Apr 21, 2017
ayamada added a commit to ayamada/cljam that referenced this issue Apr 21, 2017
ayamada added a commit to ayamada/cljam that referenced this issue Apr 21, 2017
totakke added a commit that referenced this issue Apr 24, 2017
Issue #61 (2) Examine result files of `cljam pileup` in test
ayamada added a commit to ayamada/cljam that referenced this issue Apr 24, 2017
totakke added a commit that referenced this issue Apr 25, 2017
Issue #61 (6) Examine result file of `dedupe` in test
totakke added a commit that referenced this issue Apr 25, 2017
Issue #61 (5) Examine result file of `create-dict` in test
ayamada added a commit to ayamada/cljam that referenced this issue Apr 25, 2017
ayamada added a commit to ayamada/cljam that referenced this issue Apr 25, 2017
@ayamada
Copy link
Contributor Author

ayamada commented Apr 25, 2017

Supplied tests for (1)(2)(4)(5)(6).

I have not good idea for (3).

I commit ayamada@f2ed61b for (7), work well, but it cause very long time in lein cloverage, may not merge it.

totakke added a commit that referenced this issue Apr 26, 2017
Issue #61 (1) Examine result file of `cljam normalize` in test
totakke added a commit that referenced this issue Apr 27, 2017
Issue #61 (4) Examine result file of `create-mpileup` based on (2)
@alumi
Copy link
Member

alumi commented Apr 27, 2017

About (3)

For small BAM files, it might be easier to check levels manually with IGV, samtools tview or something.

For more complex BAM files, we need some programatic approach.

One option is to fork samtools.
samtools doesn't have a feature to output levels, but tview calculates them internally in lpileup.

https://github.com/samtools/htslib/blob/95b1034/htslib/sam.h#L527
https://github.com/samtools/samtools/blob/74a3e0a/bam_lpileup.c#L141

bam_pileup1_t has level and a pointer to bam1_t which has original BAM alignment information in bam1_core_t.

It may be possible to output SAM records or pairs of QNAME and level from that struct.

@ayamada
Copy link
Contributor Author

ayamada commented Apr 27, 2017

@alumi Thanks for your explanation.
It seems to be difficult for me to verify, tentatively, only check result of small BAM in ayamada@a134664 .
If this is OK, I will make PR later.

@alumi
Copy link
Member

alumi commented Apr 28, 2017

I checked ayamada/cljam@a134664 and the levels seem to be correct. Thanks!


just FYI

I wrote rough patch, and tested the bam file with commands like samtools tview -p ref:1-1000 -d T test.sorted.bam 2>/dev/null | sort | uniq >> test.sorted.sam.

I got following SAM and it's equal to levels in ayamada/cljam@a134664.

r001	163	ref	7	30	8M4I4M1D3M	=	37	39	TTAGATAAAGAGGATACTG	*	XX:B:S,12561,2,20,112	LV:i:0
r002	0	ref	9	30	1S2I6M1P1I1P1I4M2I	*	0	0	AAAAGATAAGGGATAAA	*	LV:i:1
r003	0	ref	9	30	5H6M	*	0	0	AGCTAA	*	LV:i:2
r004	0	ref	16	30	6M14N1I5M	*	0	0	ATAGCTCTCAGC	*	LV:i:2
r003	16	ref	29	30	6H5M	*	0	0	TAGGC	*	LV:i:0
r001	83	ref	37	30	9M	=	7	-39	CAGCGCCAT	*	LV:i:0
x1	0	ref2	1	30	20M	*	0	0	AGGTTTTATAAAACAAATAA	????????????????????	LV:i:0
x2	0	ref2	2	30	21M	*	0	0	GGTTTTATAAAACAAATAATT	?????????????????????	LV:i:1
x3	0	ref2	6	30	9M4I13M	*	0	0	TTATAAAACAAATAATTAAGTCTACA	??????????????????????????	LV:i:2
x4	0	ref2	10	30	25M	*	0	0	CAAATAATTAAGTCTACAGAGCAAC	?????????????????????????	LV:i:3
x5	0	ref2	12	30	24M	*	0	0	AATAATTAAGTCTACAGAGCAACT	????????????????????????	LV:i:4
x6	0	ref2	14	30	23M	*	0	0	TAATTAAGTCTACAGAGCAACTA	???????????????????????	LV:i:5
diff --git bam_tview.c bam_tview.c
index 206ac8b..7b16d93 100644
--- bam_tview.c
+++ bam_tview.c
@@ -62,7 +62,7 @@ int base_tv_init(tview_t* tv, const char *fn, const char *fn_fa,
 {
     assert(tv!=NULL);
     assert(fn!=NULL);
-    tv->mrow = 24; tv->mcol = 80;
+    tv->mrow = 8000; tv->mcol = 80;
     tv->color_for = TV_COLOR_MAPQ;
     tv->is_dot = 1;

@@ -113,6 +113,7 @@ void base_tv_destroy(tview_t* tv)
     sam_close(tv->fp);
 }

+samFile* gOUT;

 int tv_pl_func(uint32_t tid, uint32_t pos, int n, const bam_pileup1_t *pl, void *data)
 {
@@ -166,6 +167,10 @@ int tv_pl_func(uint32_t tid, uint32_t pos, int n, const bam_pileup1_t *pl, void
     for (j = 0; j <= max_ins; ++j) {
         for (i = 0; i < n; ++i) {
             const bam_pileup1_t *p = pl + i;
+            bam1_t* d = bam_dup1(p->b);
+            uint32_t lv = p->level - 1;
+            bam_aux_append(d, "LV", 'i', 4, (uint8_t*)&lv);
+            sam_write1(gOUT, tv->header, d);
             int row = TV_MIN_ALNROW + p->level - tv->row_shift;
             if (j == 0) {
                 if (!p->is_del) {
@@ -409,7 +414,7 @@ int bam_tview_main(int argc, char *argv[])
         error("cannot create view");
         return EXIT_FAILURE;
     }
-
+    gOUT = sam_open_format("-", "w", 0);
     if ( position )
     {
         int tid, beg, end;
@@ -417,7 +422,7 @@ int bam_tview_main(int argc, char *argv[])
         if (name_lim) *name_lim = '\0';
         else beg = 0; // region parsing failed, but possibly a seq named "foo:a"
         tid = bam_name2id(tv->header, position);
-        if (tid >= 0) { tv->curr_tid = tid; tv->left_pos = beg; }
+        if (tid >= 0) { tv->curr_tid = tid; tv->left_pos = beg; tv->mcol = end;}
     }
     else if ( tv->fai )
     {
@@ -437,6 +442,6 @@ int bam_tview_main(int argc, char *argv[])
     tv->my_drawaln(tv, tv->curr_tid, tv->left_pos);
     tv->my_loop(tv);
     tv->my_destroy(tv);
-
+    sam_close(gOUT);
     return EXIT_SUCCESS;
 }
diff --git bam_tview_html.c bam_tview_html.c
index e3aecda..a7927b5 100644
--- bam_tview_html.c
+++ bam_tview_html.c
@@ -327,7 +327,7 @@ tview_t* html_tv_init(const char *fn, const char *fn_fa, const char *samples,
         }
     tv->row_count=0;
     tv->screen=NULL;
-    tv->out=stdout;
+    tv->out=stderr;
     tv->attributes=0;
     base_tv_init(base,fn,fn_fa,samples,fmt);
     /* initialize callbacks */

@ayamada
Copy link
Contributor Author

ayamada commented Apr 28, 2017

@alumi I see, thank you!
And, I created #70 .

alumi added a commit that referenced this issue Apr 29, 2017
Issue #61 (3) Examine result file of `cljam level` test
@alumi alumi mentioned this issue Jun 2, 2017
@alumi alumi closed this as completed Jun 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants