Curcake SRR8767348; ignoring read without sequence #28

Closed
akesarwani opened this issue Nov 4, 2019 · 5 comments

@akesarwani

For one of the curlcake samples from Liu_et_Nat_Com_2019 (SRR8767348.fastq.gz), feature extraction from the fastq produced odd results that I am unable to interpret. Could someone please help?

[M::mm_idx_gen::0.038*0.26] collected minimizers
[M::mm_idx_gen::0.043*0.35] sorted minimizers
[M::main::0.043*0.35] loaded/built the index for 4 target sequence(s)
[M::mm_mapopt_update::0.045*0.34] mid_occ = 3
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 4
[M::mm_idx_stat::0.045*0.35] distinct minimizers: 1864 (99.09% are singletons); average occurrences: 1.009; average spacing: 5.316
[M::worker_pipeline::83.752*2.98] mapped 396018 sequences
[M::worker_pipeline::165.269*1.99] mapped 348987 sequences
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax map-ont /projects/ke-lab/kesara/ONT/downloads/Liu_et_Nat_Com_2019/curlcake/reference/GSE124309_FASTA_sequences_of_Curlcakes.fa /projects/ke-lab/kesara/ONT/results/epinano/SRR8767348.U2T.fastq
[M::main] Real time: 165.282 sec; CPU: 329.645 sec; Peak RSS: 2.254 GB
[bam_sort_core] merging from 0 files and 4 in-memory blocks...
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.6490
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.8810
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.48744
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.94999
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.205713
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.261486
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.272770
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.311546
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.373700
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.389571
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.526071
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.532636
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.576609
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.667801
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.716665
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.52678
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.217676
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.716054
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.210763
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.513356
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.744910
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.10752
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.218171
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.463532
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.548731
[INFO][Sam2Tsv]Count: 5,728 Elapsed: 11 seconds(0.10%) Remains: 3 hours(99.90%) Last: cc6m_2244_t7_ecorv:10
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.269086
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.159458
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.116329
[INFO][Sam2Tsv]Count: 12,016 Elapsed: 22 seconds(1.20%) Remains: 30 minutes(98.80%) Last: cc6m_2244_t7_ecorv:120
[INFO][Sam2Tsv]Count: 19,229 Elapsed: 33 seconds(6.76%) Remains: 7 minutes(93.24%) Last: cc6m_2244_t7_ecorv:676
[INFO][Sam2Tsv]Count: 21,801 Elapsed: 44 seconds(8.54%) Remains: 7 minutes(91.46%) Last: cc6m_2244_t7_ecorv:854
[INFO][Sam2Tsv]Count: 28,585 Elapsed: 55 seconds(11.28%) Remains: 7 minutes(88.72%) Last: cc6m_2244_t7_ecorv:1,128
[INFO][Sam2Tsv]Count: 35,531 Elapsed: 1 minute(14.17%) Remains: 6 minutes(85.83%) Last: cc6m_2244_t7_ecorv:1,417
[INFO][Sam2Tsv]Count: 47,470 Elapsed: 1 minute(17.57%) Remains: 6 minutes(82.43%) Last: cc6m_2244_t7_ecorv:1,757
[INFO][Sam2Tsv]Count: 49,954 Elapsed: 1 minute(18.05%) Remains: 6 minutes(81.95%) Last: cc6m_2244_t7_ecorv:1,805
[INFO][Sam2Tsv]Count: 63,706 Elapsed: 1 minute(0.06%) Remains: 1 day(99.94%) Last: cc6m_2459_t7_ecorv:6
[INFO][Sam2Tsv]Count: 66,561 Elapsed: 1 minute(0.08%) Remains: 1 day(99.92%) Last: cc6m_2459_t7_ecorv:8
[INFO][Sam2Tsv]Count: 69,608 Elapsed: 2 minutes(0.09%) Remains: 1 day(99.91%) Last: cc6m_2459_t7_ecorv:9
[INFO][Sam2Tsv]Count: 74,088 Elapsed: 2 minutes(0.09%) Remains: 1 day(99.91%) Last: cc6m_2459_t7_ecorv:9
[INFO][Sam2Tsv]Count: 74,844 Elapsed: 2 minutes(0.09%) Remains: 1 day(99.91%) Last: cc6m_2459_t7_ecorv:9
[INFO][Sam2Tsv]Count: 75,057 Elapsed: 2 minutes(0.10%) Remains: 1 day(99.90%) Last: cc6m_2459_t7_ecorv:10
[INFO][Sam2Tsv]Count: 75,547 Elapsed: 2 minutes(0.10%) Remains: 1 day(99.90%) Last: cc6m_2459_t7_ecorv:10
[INFO][Sam2Tsv]Count: 75,935 Elapsed: 3 minutes(0.10%) Remains: 2 days(99.90%) Last: cc6m_2459_t7_ecorv:10
[INFO][Sam2Tsv]Count: 77,641 Elapsed: 3 minutes(0.10%) Remains: 2 days(99.90%) Last: cc6m_2459_t7_ecorv:10
[INFO][Sam2Tsv]Count: 82,926 Elapsed: 3 minutes(0.11%) Remains: 2 days(99.89%) Last: cc6m_2459_t7_ecorv:11
[INFO][Sam2Tsv]Count: 86,991 Elapsed: 3 minutes(0.14%) Remains: 1 day(99.86%) Last: cc6m_2459_t7_ecorv:14
[INFO][Sam2Tsv]Count: 87,087 Elapsed: 3 minutes(0.15%) Remains: 1 day(99.85%) Last: cc6m_2459_t7_ecorv:15
[INFO][Sam2Tsv]Count: 87,252 Elapsed: 3 minutes(0.15%) Remains: 1 day(99.85%) Last: cc6m_2459_t7_ecorv:15
[INFO][Sam2Tsv]Count: 87,500 Elapsed: 4 minutes(0.15%) Remains: 1 day(99.85%) Last: cc6m_2459_t7_ecorv:15
[INFO][Sam2Tsv]Count: 88,780 Elapsed: 4 minutes(0.15%) Remains: 2 days(99.85%) Last: cc6m_2459_t7_ecorv:15
[INFO][Sam2Tsv]Count: 94,268 Elapsed: 4 minutes(0.16%) Remains: 1 day(99.84%) Last: cc6m_2459_t7_ecorv:16
[INFO][Sam2Tsv]Count: 97,359 Elapsed: 4 minutes(0.16%) Remains: 2 days(99.84%) Last: cc6m_2459_t7_ecorv:16
[INFO][Sam2Tsv]Count: 102,909 Elapsed: 4 minutes(0.16%) Remains: 2 days(99.84%) Last: cc6m_2459_t7_ecorv:16
[INFO][Sam2Tsv]Count: 108,371 Elapsed: 5 minutes(0.16%) Remains: 2 days(99.84%) Last: cc6m_2459_t7_ecorv:16
[INFO][Sam2Tsv]Count: 111,018 Elapsed: 5 minutes(0.17%) Remains: 2 days(99.83%) Last: cc6m_2459_t7_ecorv:17
[INFO][Sam2Tsv]Count: 116,979 Elapsed: 5 minutes(0.17%) Remains: 2 days(99.83%) Last: cc6m_2459_t7_ecorv:17
[INFO][Sam2Tsv]Count: 122,575 Elapsed: 5 minutes(0.29%) Remains: 1 day(99.71%) Last: cc6m_2459_t7_ecorv:29
[INFO][Sam2Tsv]Count: 127,852 Elapsed: 5 minutes(0.48%) Remains: 20 hours(99.52%) Last: cc6m_2459_t7_ecorv:48
[INFO][Sam2Tsv]Count: 133,454 Elapsed: 5 minutes(1.72%) Remains: 5 hours(98.28%) Last: cc6m_2459_t7_ecorv:172
[INFO][Sam2Tsv]Count: 139,083 Elapsed: 6 minutes(2.98%) Remains: 3 hours(97.02%) Last: cc6m_2459_t7_ecorv:298
[INFO][Sam2Tsv]Count: 144,749 Elapsed: 6 minutes(4.21%) Remains: 2 hours(95.79%) Last: cc6m_2459_t7_ecorv:421
[INFO][Sam2Tsv]Count: 151,006 Elapsed: 6 minutes(5.95%) Remains: 1 hour(94.05%) Last: cc6m_2459_t7_ecorv:595
[INFO][Sam2Tsv]Count: 154,925 Elapsed: 6 minutes(6.84%) Remains: 1 hour(93.16%) Last: cc6m_2459_t7_ecorv:684
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.117494
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.286166
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.3889
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.369619
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.415095
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.439087
[INFO][Sam2Tsv]Count: 157,227 Elapsed: 6 minutes(7.30%) Remains: 1 hour(92.70%) Last: cc6m_2459_t7_ecorv:730
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.674372
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.87373
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.89719
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.98935
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.170353
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.224054
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.248600
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.280525
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.377275
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.519807
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.592519
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.622145
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.639466
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.691410
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.16925
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.139264
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.215124
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.291455
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.412762
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.317610
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.439003
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.678649
[INFO][Sam2Tsv]Count: 158,510 Elapsed: 7 minutes(7.65%) Remains: 1 hour(92.35%) Last: cc6m_2459_t7_ecorv:765
[WARN][Sam2Tsv]Ignoring read without sequence: SRR8767348.98271
[INFO][Sam2Tsv]Count: 159,277 Elapsed: 7 minutes(7.72%) Remains: 1 hour(92.28%) Last: cc6m_2459_t7_ecorv:772
[INFO][Sam2Tsv]Count: 163,214 Elapsed: 7 minutes(8.68%) Remains: 1 hour(91.32%) Last: cc6m_2459_t7_ecorv:868
[INFO][Sam2Tsv]Count: 169,601 Elapsed: 7 minutes(10.12%) Remains: 1 hour(89.88%) Last: cc6m_2459_t7_ecorv:1,012
[INFO][Sam2Tsv]Count: 177,563 Elapsed: 7 minutes(11.88%) Remains: 58 minutes(88.12%) Last: cc6m_2459_t7_ecorv:1,188
[INFO][Sam2Tsv]Count: 186,905 Elapsed: 8 minutes(13.70%) Remains: 51 minutes(86.30%) Last: cc6m_2459_t7_ecorv:1,370
[INFO][Sam2Tsv]Count: 196,915 Elapsed: 8 minutes(15.27%) Remains: 46 minutes(84.73%) Last: cc6m_2459_t7_ecorv:1,527
[INFO][Sam2Tsv]Count: 209,441 Elapsed: 8 minutes(17.26%) Remains: 40 minutes(82.74%) Last: cc6m_2459_t7_ecorv:1,726
[INFO][Sam2Tsv]Count: 226,472 Elapsed: 8 minutes(20.34%) Remains: 33 minutes(79.66%) Last: cc6m_2459_t7_ecorv:2,034
[INFO][Sam2Tsv]Count: 240,011 Elapsed: 8 minutes(22.47%) Remains: 30 minutes(77.53%) Last: cc6m_2595_t7_ecorv:5
[INFO][Sam2Tsv]Count: 245,567 Elapsed: 9 minutes(22.50%) Remains: 31 minutes(77.50%) Last: cc6m_2595_t7_ecorv:8
[INFO][Sam2Tsv]Count: 251,220 Elapsed: 9 minutes(22.52%) Remains: 31 minutes(77.48%) Last: cc6m_2595_t7_ecorv:10
[INFO][Sam2Tsv]Count: 254,828 Elapsed: 9 minutes(22.52%) Remains: 32 minutes(77.48%) Last: cc6m_2595_t7_ecorv:10
[INFO][Sam2Tsv]Count: 260,504 Elapsed: 9 minutes(22.52%) Remains: 32 minutes(77.48%) Last: cc6m_2595_t7_ecorv:10
[INFO][Sam2Tsv]Count: 265,000 Elapsed: 9 minutes(22.52%) Remains: 33 minutes(77.48%) Last: cc6m_2595_t7_ecorv:10
[INFO][Sam2Tsv]Count: 270,525 Elapsed: 9 minutes(22.53%) Remains: 34 minutes(77.47%) Last: cc6m_2595_t7_ecorv:11
[INFO][Sam2Tsv]Count: 275,950 Elapsed: 10 minutes(22.53%) Remains: 34 minutes(77.47%) Last: cc6m_2595_t7_ecorv:11
[INFO][Sam2Tsv]Count: 279,381 Elapsed: 10 minutes(22.53%) Remains: 35 minutes(77.47%) Last: cc6m_2595_t7_ecorv:11
[INFO][Sam2Tsv]Count: 285,038 Elapsed: 10 minutes(22.56%) Remains: 36 minutes(77.44%) Last: cc6m_2595_t7_ecorv:14
[INFO][Sam2Tsv]Count: 290,857 Elapsed: 10 minutes(22.78%) Remains: 36 minutes(77.22%) Last: cc6m_2595_t7_ecorv:36
[INFO][Sam2Tsv]Count: 295,898 Elapsed: 10 minutes(24.16%) Remains: 34 minutes(75.84%) Last: cc6m_2595_t7_ecorv:174
[INFO][Sam2Tsv]Count: 301,757 Elapsed: 11 minutes(25.74%) Remains: 31 minutes(74.26%) Last: cc6m_2595_t7_ecorv:332
[INFO][Sam2Tsv]Count: 307,762 Elapsed: 11 minutes(27.27%) Remains: 30 minutes(72.73%) Last: cc6m_2595_t7_ecorv:485
[INFO][Sam2Tsv]Count: 314,040 Elapsed: 11 minutes(28.77%) Remains: 28 minutes(71.23%) Last: cc6m_2595_t7_ecorv:635
[INFO][Sam2Tsv]Count: 320,957 Elapsed: 11 minutes(30.39%) Remains: 26 minutes(69.61%) Last: cc6m_2595_t7_ecorv:797
[INFO][Sam2Tsv]Count: 325,295 Elapsed: 11 minutes(31.31%) Remains: 26 minutes(68.69%) Last: cc6m_2595_t7_ecorv:889
[INFO][Sam2Tsv]Count: 333,474 Elapsed: 12 minutes(32.80%) Remains: 24 minutes(67.20%) Last: cc6m_2595_t7_ecorv:1,038
[INFO][Sam2Tsv]Count: 342,027 Elapsed: 12 minutes(34.40%) Remains: 23 minutes(65.60%) Last: cc6m_2595_t7_ecorv:1,198
[INFO][Sam2Tsv]Count: 350,753 Elapsed: 12 minutes(35.76%) Remains: 22 minutes(64.24%) Last: cc6m_2595_t7_ecorv:1,334
[INFO][Sam2Tsv]Count: 359,807 Elapsed: 12 minutes(37.43%) Remains: 21 minutes(62.57%) Last: cc6m_2595_t7_ecorv:1,501
[INFO][Sam2Tsv]Count: 369,442 Elapsed: 12 minutes(38.62%) Remains: 20 minutes(61.38%) Last: cc6m_2595_t7_ecorv:1,620
[INFO][Sam2Tsv]Count: 380,514 Elapsed: 12 minutes(40.22%) Remains: 19 minutes(59.78%) Last: cc6m_2595_t7_ecorv:1,780
[INFO][Sam2Tsv]Count: 395,411 Elapsed: 13 minutes(42.44%) Remains: 17 minutes(57.56%) Last: cc6m_2595_t7_ecorv:2,002
[INFO][Sam2Tsv]Count: 406,515 Elapsed: 13 minutes(43.68%) Remains: 17 minutes(56.32%) Last: cc6m_2595_t7_ecorv:2,126
[INFO][Sam2Tsv]Count: 427,821 Elapsed: 13 minutes(47.08%) Remains: 15 minutes(52.92%) Last: cc6m_2709_t7_ecorv:8
[INFO][Sam2Tsv]Count: 432,891 Elapsed: 13 minutes(47.10%) Remains: 15 minutes(52.90%) Last: cc6m_2709_t7_ecorv:10
[INFO][Sam2Tsv]Count: 437,761 Elapsed: 13 minutes(47.10%) Remains: 15 minutes(52.90%) Last: cc6m_2709_t7_ecorv:10
[INFO][Sam2Tsv]Count: 442,793 Elapsed: 14 minutes(47.11%) Remains: 15 minutes(52.89%) Last: cc6m_2709_t7_ecorv:11
[INFO][Sam2Tsv]Count: 447,936 Elapsed: 14 minutes(47.12%) Remains: 16 minutes(52.88%) Last: cc6m_2709_t7_ecorv:12
[INFO][Sam2Tsv]Count: 453,185 Elapsed: 14 minutes(47.14%) Remains: 16 minutes(52.86%) Last: cc6m_2709_t7_ecorv:14
[INFO][Sam2Tsv]Count: 458,257 Elapsed: 14 minutes(47.14%) Remains: 16 minutes(52.86%) Last: cc6m_2709_t7_ecorv:14
[INFO][Sam2Tsv]Count: 463,289 Elapsed: 14 minutes(47.18%) Remains: 16 minutes(52.82%) Last: cc6m_2709_t7_ecorv:18
[INFO][Sam2Tsv]Count: 468,680 Elapsed: 15 minutes(47.18%) Remains: 16 minutes(52.82%) Last: cc6m_2709_t7_ecorv:18
[INFO][Sam2Tsv]Count: 473,745 Elapsed: 15 minutes(47.18%) Remains: 17 minutes(52.82%) Last: cc6m_2709_t7_ecorv:18
[INFO][Sam2Tsv]Count: 478,580 Elapsed: 15 minutes(47.20%) Remains: 17 minutes(52.80%) Last: cc6m_2709_t7_ecorv:20
[INFO][Sam2Tsv]Count: 483,621 Elapsed: 15 minutes(47.26%) Remains: 17 minutes(52.74%) Last: cc6m_2709_t7_ecorv:26
[INFO][Sam2Tsv]Count: 488,529 Elapsed: 15 minutes(47.37%) Remains: 17 minutes(52.63%) Last: cc6m_2709_t7_ecorv:37
[INFO][Sam2Tsv]Count: 493,220 Elapsed: 16 minutes(47.97%) Remains: 17 minutes(52.03%) Last: cc6m_2709_t7_ecorv:97
[INFO][Sam2Tsv]Count: 498,707 Elapsed: 16 minutes(49.44%) Remains: 16 minutes(50.56%) Last: cc6m_2709_t7_ecorv:244
[INFO][Sam2Tsv]Count: 504,196 Elapsed: 16 minutes(50.93%) Remains: 15 minutes(49.07%) Last: cc6m_2709_t7_ecorv:392
[INFO][Sam2Tsv]Count: 509,455 Elapsed: 16 minutes(52.17%) Remains: 15 minutes(47.83%) Last: cc6m_2709_t7_ecorv:516
[INFO][Sam2Tsv]Count: 515,568 Elapsed: 16 minutes(53.67%) Remains: 14 minutes(46.33%) Last: cc6m_2709_t7_ecorv:666
[INFO][Sam2Tsv]Count: 519,924 Elapsed: 16 minutes(54.76%) Remains: 14 minutes(45.24%) Last: cc6m_2709_t7_ecorv:775
[INFO][Sam2Tsv]Count: 526,384 Elapsed: 17 minutes(56.48%) Remains: 13 minutes(43.52%) Last: cc6m_2709_t7_ecorv:947
[INFO][Sam2Tsv]Count: 533,537 Elapsed: 17 minutes(58.19%) Remains: 12 minutes(41.81%) Last: cc6m_2709_t7_ecorv:1,118
[INFO][Sam2Tsv]Count: 538,545 Elapsed: 17 minutes(59.36%) Remains: 11 minutes(40.64%) Last: cc6m_2709_t7_ecorv:1,235
[INFO][Sam2Tsv]Count: 546,542 Elapsed: 17 minutes(61.37%) Remains: 11 minutes(38.63%) Last: cc6m_2709_t7_ecorv:1,436
[INFO][Sam2Tsv]Count: 555,744 Elapsed: 17 minutes(63.39%) Remains: 10 minutes(36.61%) Last: cc6m_2709_t7_ecorv:1,638
[INFO][Sam2Tsv]Count: 560,945 Elapsed: 18 minutes(64.39%) Remains: 9 minutes(35.61%) Last: cc6m_2709_t7_ecorv:1,738
[INFO][Sam2Tsv]Count: 572,867 Elapsed: 18 minutes(66.60%) Remains: 9 minutes(33.40%) Last: cc6m_2709_t7_ecorv:1,959
[INFO][Sam2Tsv]Count: 586,593 Elapsed: 18 minutes(68.71%) Remains: 8 minutes(31.29%) Last: cc6m_2709_t7_ecorv:2,170
[INFO][Sam2Tsv]. Completed. N=745,740. That took:19 minutes

@Huanle
Collaborator

Huanle commented Nov 5, 2019

@akesarwani Can you send me the command that generated these errors?
Thanks.

@akesarwani
Author

akesarwani commented Nov 5, 2019

Please see the entire script below. The same script worked for the other 3 curlcakes but not for Curlcake SRR8767348.

#1 trim the first and last few bad quality bases from raw fastq with NanoFilt (feel free to replace NanoFilt with a custom script)
gunzip -c ${fq} | NanoFilt -q 0 --headcrop 5 --tailcrop 3 --readtype 1D --logfile ${prefix}.nanofilt.log > ${prefix}.h5t3.fastq
#NanoFilt -q 0 --headcrop 5 --tailcrop 3 --readtype 1D --logfile ${prefix}.nanofilt.log ${fq} > ${prefix}.h5t3.fastq # for uncompressed fastq

#2 'U' to 'T' conversion
awk '{ if (NR%4 == 2) {gsub(/U/,"T",$1); print $1} else print }' ${prefix}.h5t3.fastq > ${prefix}.U2T.fastq

#3 mapping to reference using minimap2
minimap2 -ax map-ont ${ref} ${prefix}.U2T.fastq | samtools view -bhS - | samtools sort -@ ${PBS_NP} -o ${prefix}.bam && samtools index ${prefix}.bam

#4 calling variants for each single read-to-reference alignment
#reads mapped to the reverse strand of the reference sequence will be flipped
java -jar /home/kesara/tools/jvarkit/dist/sam2tsv.jar -r ${ref} -o ${prefix}.bam.tsv ${prefix}.bam

module load python/2.7.10
#5 convert results from step 4 and generate per_read variants information; the input file can be split by read into smaller files to speed this step up.
python ${script}/per_read_var.py ${prefix}.bam.tsv > ${prefix}.per_read.var.csv

#6 summarize results from step 4 and generate variants information according to the reference sequences (i.e., per_site variants); the input file can be split by ref into smaller ones to speed this step up.
python ${script}/per_site_var.py ${prefix}.bam.tsv > ${prefix}.per_site.var.csv

#7 slide per_site variants with window size of 5, so that fast5 event table information can be combined
python ${script}/slide_per_site_var.py ${prefix}.ref.per_site.var.csv > ${prefix}.per_site.var.sliding.win.csv

@Huanle
Collaborator

Huanle commented Nov 8, 2019

Hi @akesarwani,
Based on the warning messages generated by minimap2, the input fastq file seems to have empty read entries.
May I ask you to double-check whether this is the case?
If so, can you re-download the sequences?
Thanks.
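
One way to run the check suggested above is to scan the fastq for records whose sequence line is empty. This is a minimal sketch, assuming the usual four-lines-per-record FASTQ layout; `empty_fastq_records` and `open_fastq` are hypothetical helper names, and the demo reads are made up.

```python
import gzip
import io


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def empty_fastq_records(handle):
    """Yield names of reads whose sequence line is empty.

    Assumes exactly four lines per record (header, sequence, '+', quality).
    """
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = handle.readline()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        if not seq.strip():
            # "@name description" -> "name"
            yield header[1:].split()[0]


# Made-up demo input: r2 has an empty sequence and quality line.
demo = io.StringIO("@r1\nACGU\n+\n!!!!\n@r2\n\n+\n\n@r3\nGG\n+\n!!\n")
print(list(empty_fastq_records(demo)))
```

On a real file, `list(empty_fastq_records(open_fastq(path)))` returning an empty list would suggest the fastq itself is not the source of the warnings.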

@akesarwani
Author

I noticed that the reads that triggered the warning "[WARN][Sam2Tsv]Ignoring read without sequence:" were each mapped to two locations.
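
A possible explanation, not confirmed in this thread: by default minimap2 writes `*` in the SEQ and QUAL columns of secondary alignments, so a read mapped to two locations would carry one record with no sequence. The sketch below tallies SAM records by alignment type and by whether SEQ is `*`. The function names and the demo records (read name, positions, `LN` value) are made up for illustration; only the reference names come from the log above.

```python
from collections import Counter


def alignment_kind(flag):
    """Classify a SAM FLAG value using the standard flag bits."""
    if flag & 0x4:
        return "unmapped"
    if flag & 0x100:
        return "secondary"
    if flag & 0x800:
        return "supplementary"
    return "primary"


def tally_missing_seq(sam_lines):
    """Count records per (alignment kind, SEQ is '*') pair."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag, seq = int(fields[1]), fields[9]
        counts[(alignment_kind(flag), seq == "*")] += 1
    return counts


# Made-up records: one primary hit with sequence, one secondary hit without.
demo = [
    "@SQ\tSN:cc6m_2244_t7_ecorv\tLN:2000\n",
    "read_a\t0\tcc6m_2244_t7_ecorv\t10\t60\t4M\t*\t0\t0\tACGT\t!!!!\n",
    "read_a\t256\tcc6m_2459_t7_ecorv\t10\t0\t4M\t*\t0\t0\t*\t*\n",
]
for (kind, missing), n in sorted(tally_missing_seq(demo).items()):
    print(kind, "SEQ missing" if missing else "SEQ present", n)
```

To run this on real data, the lines could come from `samtools view -h` piped into the script and read from `sys.stdin`. If every record with a missing SEQ turns out to be secondary, the warnings would be expected behaviour rather than a sign of a damaged download.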

@Huanle
Collaborator

Huanle commented Nov 22, 2019

Hi @akesarwani , is multi-mapping common in your case?
If so, maybe you should proceed with uniquely mapped reads.
Otherwise, if you really want to keep those multi-mapping reads, maybe you can keep only the primary alignments.
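
Keeping only the primary alignments could look like the following. This is a sketch on SAM text, assuming the standard FLAG bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary); `keep_primary` is a hypothetical helper, not part of EpiNano, and the demo records are made up.

```python
# FLAG bits to drop: unmapped, secondary, supplementary.
EXCLUDE = 0x4 | 0x100 | 0x800


def keep_primary(sam_lines):
    """Yield header lines and primary, mapped alignment records only."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep the header untouched
            continue
        flag = int(line.split("\t", 2)[1])
        if not (flag & EXCLUDE):
            yield line


# Made-up records covering each case.
demo = [
    "@SQ\tSN:ref\tLN:100\n",
    "read_a\t0\tref\t10\t60\t4M\t*\t0\t0\tACGT\t!!!!\n",    # primary
    "read_a\t256\tref\t50\t0\t4M\t*\t0\t0\t*\t*\n",          # secondary
    "read_b\t2048\tref\t70\t60\t4M\t*\t0\t0\tACGT\t!!!!\n",  # supplementary
    "read_c\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\n",          # unmapped
]
for line in keep_primary(demo):
    print(line, end="")
```

The same selection should be achievable directly on the BAM with `samtools view -b -F 0x904` before running sam2tsv, since 0x904 is the sum of the three excluded bits.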

Huanle closed this as completed Dec 17, 2019