Perl tests fail with perl version 5.18 #470

Closed
cmdcolin opened this issue Apr 29, 2014 · 10 comments

cmdcolin commented Apr 29, 2014

Results of `prove -I src/perl5 -lr tests/` on Ubuntu 14 with perl 5.18.2:

```
tests/perl_tests/00.compile.t .............. ok
tests/perl_tests/add-json.pl.t ............. ok
# using temp dir /tmp/Yl3wANo8Si
tests/perl_tests/bam-to-json.pl.t .......... ok
# writing output to /tmp/Qs_enmpmW8
```

Failed test 'got the right genes trackdata'

at tests/perl_tests/biodb-to-json.pl.t line 44.

Structures begin differing at:

$got->[3] = 'ctgA'

$expected->[3] = 'example'

[

0,

1049,

9000,

1,

'ctgA',

'EDEN',

'gene',

'EDEN',

'example',

'protein kinase',

[

[

1,

1299,

9000,

1,

[

[

2,

2999,

3300,

1,

'five_prime_UTR',

'example',

'EDEN.3',

'ctgA'

],

[

3,

7600,

9000,

1,

'three_prime_UTR',

'example',

'ctgA',

'EDEN.3'

],

[

4,

3300,

3902,

1,

'example',

'CDS',

'ctgA',

'EDEN.3',

'0'

],

[

5,

1299,

1500,

1,

'five_prime_UTR',

'ctgA',

'EDEN.3',

'example'

],

[

6,

6999,

7600,

1,

'1',

'EDEN.3',

'ctgA',

'CDS',

'example'

],

[

7,

4999,

5500,

1,

'example',

'CDS',

'1',

'ctgA',

'EDEN.3'

]

],

'Eden splice form 3',

'example',

'EDEN.3',

'mRNA',

'EDEN',

'ctgA',

'EDEN.3'

],

[

8,

1049,

9000,

1,

'example',

'EDEN.2',

[

[

9,

7608,

9000,

1,

'three_prime_UTR',

'EDEN.2',

'ctgA',

'example'

],

[

10,

6999,

7608,

1,

'0',

'EDEN.2',

'ctgA',

'CDS',

'example'

],

[

10,

1200,

1500,

1,

'0',

'EDEN.2',

'ctgA',

'CDS',

'example'

],

[

11,

1049,

1200,

1,

'example',

'ctgA',

'EDEN.2',

'five_prime_UTR'

],

[

10,

4999,

5500,

1,

'0',

'EDEN.2',

'ctgA',

'CDS',

'example'

]

],

'Eden splice form 2',

'EDEN.2',

'EDEN',

'ctgA',

'mRNA'

],

[

12,

1049,

9000,

1,

'Eden splice form 1',

[

[

13,

1200,

1500,

1,

'example',

'CDS',

'EDEN.1',

'ctgA',

'0'

],

[

4,

4999,

5500,

1,

'example',

'CDS',

'ctgA',

'EDEN.1',

'0'

],

[

14,

2999,

3902,

1,

'example',

'CDS',

'EDEN.1',

'ctgA',

'0'

],

[

9,

7608,

9000,

1,

'three_prime_UTR',

'EDEN.1',

'ctgA',

'example'

],

[

15,

1049,

1200,

1,

'ctgA',

'EDEN.1',

'example',

'five_prime_UTR'

],

[

16,

6999,

7608,

1,

'example',

'CDS',

'0',

'EDEN.1',

'ctgA'

]

],

'EDEN.1',

'example',

'mRNA',

'EDEN.1',

'ctgA',

'EDEN'

]

]

]

writing output to /tmp/ygsDVImhob

Failed test 'got the right genes trackdata'

at tests/perl_tests/biodb-to-json.pl.t line 111.

Structures begin differing at:

$got->[3] = 'gene'

$expected->[3] = 'example'

[

0,

1049,

9000,

1,

'gene',

'ctgA',

'EDEN',

[

[

1,

1049,

9000,

1,

'mRNA',

'EDEN.1',

'ctgA',

'EDEN',

'Eden splice form 1',

[

[

2,

2999,

3902,

1,

'example',

'CDS',

'ctgA',

'EDEN.1',

'0'

],

[

3,

4999,

5500,

1,

'example',

'CDS',

'EDEN.1',

'ctgA',

'0'

],

[

4,

1200,

1500,

1,

'example',

'0',

'ctgA',

'EDEN.1',

'CDS'

],

[

5,

6999,

7608,

1,

'0',

'EDEN.1',

'ctgA',

'CDS',

'example'

],

[

6,

1049,

1200,

1,

'example',

'ctgA',

'EDEN.1',

'five_prime_UTR'

],

[

7,

7608,

9000,

1,

'three_prime_UTR',

'example',

'EDEN.1',

'ctgA'

]

],

'EDEN.1',

'example'

],

[

8,

1049,

9000,

1,

'mRNA',

'EDEN',

'ctgA',

'EDEN.2',

[

[

9,

7608,

9000,

1,

'EDEN.2',

'ctgA',

'example',

'three_prime_UTR'

],

[

10,

6999,

7608,

1,

'example',

'0',

'EDEN.2',

'ctgA',

'CDS'

],

[

11,

1200,

1500,

1,

'CDS',

'0',

'EDEN.2',

'ctgA',

'example'

],

[

7,

1049,

1200,

1,

'five_prime_UTR',

'example',

'EDEN.2',

'ctgA'

],

[

4,

4999,

5500,

1,

'example',

'0',

'ctgA',

'EDEN.2',

'CDS'

]

],

'Eden splice form 2',

'example',

'EDEN.2'

],

[

12,

1299,

9000,

1,

'EDEN',

'ctgA',

'EDEN.3',

'mRNA',

'example',

'EDEN.3',

[

[

13,

3300,

3902,

1,

'ctgA',

'EDEN.3',

'0',

'CDS',

'example'

],

[

14,

2999,

3300,

1,

'example',

'EDEN.3',

'ctgA',

'five_prime_UTR'

],

[

15,

7600,

9000,

1,

'three_prime_UTR',

'example',

'ctgA',

'EDEN.3'

],

[

10,

6999,

7600,

1,

'example',

'1',

'EDEN.3',

'ctgA',

'CDS'

],

[

9,

1299,

1500,

1,

'EDEN.3',

'ctgA',

'example',

'five_prime_UTR'

],

[

16,

4999,

5500,

1,

'CDS',

'ctgA',

'EDEN.3',

'1',

'example'

]

],

'Eden splice form 3'

]

],

'protein kinase',

'example',

'EDEN'

]

Looks like you failed 2 tests of 15.

tests/perl_tests/biodb-to-json.pl.t ........
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/15 subtests
tests/perl_tests/conf_format.t ............. ok

using temp dir /tmp/4fvuNkBf6A

using temp dir /tmp/J3HXHaH5XA

tests/perl_tests/draw-basepair-track.pl.t .. ok
tests/perl_tests/fakefasta.t ............... ok
tests/perl_tests/featurestream.t ........... ok

Failed test 'exonerate mRNA has its subfeatures'

at tests/perl_tests/flatfile-to-json.pl.t line 95.

got: ''

expected: 'ARRAY'

{

'featureCount' => 2,

'formatVersion' => 1,

'histograms' => {

'meta' => [

{

'arrayParams' => {

'chunkSize' => 10000,

'length' => 1,

'urlTemplate' => 'hist-50000-{Chunk}.jsonz'

},

'basesPerBin' => '50000'

}

],

'stats' => [

{

'basesPerBin' => '50000',

'max' => 2,

'mean' => 2

}

]

},

'intervals' => {

'classes' => [

{

'attributes' => [

'Start',

'End',

'Strand',

'Id',

'Subfeatures',

'Note',

'Source',

'Name',

'Type',

'Phase',

'Seq_id'

],

'isArrayAttr' => {

'Subfeatures' => 1

}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Seq_id',

'Type',

'Phase',

'Source'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Seq_id',

'Phase',

'Type',

'Source'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Source',

'Phase',

'Type',

'Seq_id'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Subfeatures',

'Id',

'Seq_id',

'Type',

'Note',

'Source',

'Name'

],

'isArrayAttr' => {

'Subfeatures' => 1

}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Source',

'Seq_id',

'Type'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Source',

'Phase',

'Type',

'Seq_id'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Phase',

'Type',

'Seq_id',

'Source'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Source',

'Phase',

'Type',

'Seq_id'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Strand',

'Type',

'Seq_id',

'Source'

],

'isArrayAttr' => {}

},

{

'attributes' => [

'Start',

'End',

'Chunk'

],

'isArrayAttr' => {

'Sublist' => 1

}

}

],

'count' => 2,

'lazyClass' => 10,

'maxEnd' => 23000,

'minStart' => 12999,

'nclist' => [

[

0,

12999,

17200,

1,

'cds-Apple2',

[

[

1,

13499,

13800,

1,

'ctgA',

'CDS',

0,

'predicted'

],

[

2,

14999,

15500,

1,

'ctgA',

1,

'CDS',

'predicted'

],

[

3,

16499,

17000,

1,

'predicted',

2,

'CDS',

'ctgA'

]

],

'mRNA with CDSs but no UTRs',

'predicted',

'Apple2',

'mRNA',

0,

'ctgA'

],

[

4,

17399,

23000,

1,

[

[

5,

17399,

17999,

1,

'exonerate',

'ctgA',

'UTR'

],

[

6,

17999,

18800,

1,

'exonerate',

0,

'CDS',

'ctgA'

],

[

7,

18999,

19500,

1,

1,

'CDS',

'ctgA',

'exonerate'

],

[

8,

20999,

21200,

1,

'exonerate',

2,

'CDS',

'ctgA'

],

[

9,

21200,

23000,

1,

'UTR',

'ctgA',

'exonerate'

]

],

'rna-Apple3',

'ctgA',

'mRNA',

'mRNA with both CDSs and UTRs',

'exonerate',

'Apple3'

]

],

'urlTemplate' => 'lf-{Chunk}.jsonz'

}

}

Can't use string ("Apple3") as an ARRAY ref while "strict refs" in use at tests/perl_tests/flatfile-to-json.pl.t line 97.

Tests were run but no plan was declared and done_testing() was not seen.

tests/perl_tests/flatfile-to-json.pl.t .....
Dubious, test returned 25 (wstat 6400, 0x1900)
Failed 1/5 subtests

Failed test 'got right type in parent feature (full record)'

at tests/perl_tests/genbank.t line 75.

got: 'Homo sapiens'

expected: 'mRNA'

[

0,

5001,

10950,

1,

[

'Eukaryota',

'Metazoa',

'Chordata',

'Craniata',

'Vertebrata',

'Euteleostomi',

'Mammalia',

'Eutheria',

'Euarchontoglires',

'Primates',

'Haplorrhini',

'Catarrhini',

'Hominidae',

'Homo'

],

'Homo sapiens (human)',

'MIM:138350',

'glutathione S-transferase mu 1, transcript variant 1',

'9606',

'Homo sapiens glutathione S-transferase mu 1 (GSTM1), RefSeqGene on chromosome 1.',

'genomic DNA',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'glutathione S-transferase mu 1, transcript variant 1',

[

'NG_009246.1',

'GI:219521909'

],

[

'RefSeq; RefSeqGene'

],

'NG_009246.1',

{

'genbank_division' => 'PRI',

'locus_name' => 'NG_009246',

'modification_date' => '25-JUN-2013',

'molecule_type' => 'DNA linear',

'sequence_length' => '12950 bp'

},

[

[

1,

5001,

5114,

0,

'alignment:Splign:1.39.8',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'GSTM1',

'exon',

'1',

'NG_009246.1'

],

[

2,

5079,

5114,

0,

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1',

'annotated by transcript or proteomic data',

'NG_009246.1',

'MIM:138350',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'CDS',

'NP_000552.2',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'2.5.1.18',

'GSTM1'

],

[

3,

5375,

5450,

0,

'MIM:138350',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'CDS',

'NP_000552.2',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'1',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'GSTM1',

'2.5.1.18',

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1',

'annotated by transcript or proteomic data',

'NG_009246.1'

],

[

4,

5878,

5942,

0,

'NG_009246.1',

'annotated by transcript or proteomic data',

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1',

'1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'GSTM1',

'2.5.1.18',

'MIM:138350',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'NP_000552.2',

'CDS'

],

[

5,

6253,

6334,

0,

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1',

'annotated by transcript or proteomic data',

'NG_009246.1',

'MIM:138350',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'NP_000552.2',

'CDS',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'1',

'2.5.1.18',

'GSTM1'

],

[

6,

6430,

6530,

0,

'GSTM1',

'2.5.1.18',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'1',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'CDS',

'NP_000552.2',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'MIM:138350',

'annotated by transcript or proteomic data',

'NG_009246.1',

'glutathione S-transferase Mu 1 isoform 1',

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4'

],

[

7,

7476,

7571,

0,

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1',

'NG_009246.1',

'annotated by transcript or proteomic data',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'MIM:138350',

'NP_000552.2',

'CDS',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'2.5.1.18',

'GSTM1'

],

[

8,

7659,

7769,

0,

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'2.5.1.18',

'GSTM1',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'MIM:138350',

'NP_000552.2',

'CDS',

'annotated by transcript or proteomic data',

'NG_009246.1',

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1'

],

[

9,

10411,

10500,

0,

'NG_009246.1',

'annotated by transcript or proteomic data',

'isoform 1 is encoded by transcript variant 1; glutathione S-transferase M1; S-(hydroxyalkyl)glutathione lyase; GST class-mu 1; glutathione S-alkyltransferase; glutathione S-aryltransferase; glutathione S-aralkyltransferase; HB subunit 4; GST HB subunit 4',

'glutathione S-transferase Mu 1 isoform 1',

'similar to AA sequence (same species):RefSeq:NP_000552.2',

'1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'2.5.1.18',

'GSTM1',

'MIM:138350',

'MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLDAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK',

'NP_000552.2',

'CDS'

],

[

10,

5375,

5450,

0,

'GSTM1',

'alignment:Splign:1.39.8',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'exon',

'NG_009246.1',

'2'

],

[

11,

5878,

5942,

0,

'NG_009246.1',

'3',

'GSTM1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'alignment:Splign:1.39.8',

'exon'

],

[

12,

6253,

6525,

0,

'UniSTS:87865',

'STS',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'GSTM1',

'NG_009246.1',

'RH64476'

],

[

13,

6253,

6334,

0,

'exon',

'GSTM1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'alignment:Splign:1.39.8',

'NG_009246.1',

'4'

],

[

12,

6297,

6454,

0,

'UniSTS:158567',

'STS',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'GSTM1',

'NG_009246.1',

'GDB:655882'

],

[

13,

6430,

6530,

0,

'exon',

'GSTM1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'alignment:Splign:1.39.8',

'NG_009246.1',

'5'

],

[

13,

7476,

7571,

0,

'exon',

'GSTM1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'alignment:Splign:1.39.8',

'NG_009246.1',

'6'

],

[

14,

7659,

7769,

0,

'exon',

'GSTM1',

'alignment:Splign:1.39.8',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'NG_009246.1',

'7'

],

[

15,

10286,

11032,

0,

'STS',

'UniSTS:186432',

'NG_009246.1',

'G67222'

],

[

16,

10411,

10950,

0,

'exon',

'alignment:Splign:1.39.8',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'GSTM1',

'8',

'NG_009246.1'

],

[

17,

10594,

10942,

0,

'SHGC-12332',

'NG_009246.1',

'STS',

'UniSTS:33074',

'GSTM1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1'

],

[

18,

10594,

10880,

0,

'GSTM1',

'NG_009246.1',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'STS',

'UniSTS:33073'

],

[

12,

10632,

10780,

0,

'UniSTS:139106',

'STS',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'GSTM1',

'NG_009246.1',

'G62022'

]

],

'mRNA',

'GSTM1',

'REVIEWED REFSEQ: This record has been curated by NCBI staff. The

reference sequence was derived from AC000031.6 and AC000032.7.

This sequence is a reference standard in the RefSeqGene project.

Summary: Cytosolic and membrane-bound forms of glutathione

S-transferase are encoded by two distinct supergene families. At

present, eight distinct classes of the soluble cytoplasmic

mammalian glutathione S-transferases have been identified: alpha,

kappa, mu, omega, pi, sigma, theta and zeta. This gene encodes a

glutathione S-transferase that belongs to the mu class. The mu

class of enzymes functions in the detoxification of electrophilic

compounds, including carcinogens, therapeutic drugs, environmental

toxins and products of oxidative stress, by conjugation with

glutathione. The genes encoding the mu class of enzymes are

organized in a gene cluster on chromosome 1p13.3 and are known to

be highly polymorphic. These genetic variations can change an

individual's susceptibility to carcinogens and toxins as well as

affect the toxicity and efficacy of certain drugs. Null mutations

of this class mu gene have been linked with an increase in a number

of cancers, likely due to an increased susceptibility to

environmental toxins and carcinogens. Multiple protein isoforms are

encoded by transcript variants of this gene. [provided by RefSeq,

Jul 2008].',

'Homo sapiens',

'GSTM1',

'NM_000561.3',

'NG_009246'

]

Failed test 'type set correctly in subfeature'

at tests/perl_tests/genbank.t line 103.

got: 'NG_009246.1'

expected: 'exon'

[

1,

5001,

5114,

0,

'alignment:Splign:1.39.8',

'GST1; GSTM1-1; GSTM1a-1a; GSTM1b-1b; GTH4; GTM1; H-B; MU; MU-1',

'GSTM1',

'exon',

'1',

'NG_009246.1'

]

Looks like you failed 2 tests of 21.

tests/perl_tests/genbank.t .................
Dubious, test returned 2 (wstat 512, 0x200)
Failed 2/21 subtests

Failed test 'got right data from volvox test data run'

at tests/perl_tests/generate-names.pl.t line 39.

Structures begin differing at:

$got->{a1e/3.json}{rs116260263}{exact}[0][1] = '12'

$expected->{a1e/3.json}{rs116260263}{exact}[0][1] = '11'

Failed test 'same data after incremental run'

at tests/perl_tests/generate-names.pl.t line 58.

Structures begin differing at:

$got->{e8b/f.json}{rs17878802}{exact}[0][1] = '12'

$expected->{e8b/f.json}{rs17878802}{exact}[0][1] = '11'

Failed test 'same data after incremental run with --safeMode'

at tests/perl_tests/generate-names.pl.t line 74.

Structures begin differing at:

$got->{7bf/e.json}{rs117304270}{exact}[0][1] = '12'

$expected->{7bf/e.json}{rs117304270}{exact}[0][1] = '11'

Looks like you failed 3 tests of 4.

tests/perl_tests/generate-names.pl.t .......
Dubious, test returned 3 (wstat 768, 0x300)
Failed 3/4 subtests
tests/perl_tests/json.t .................... ok

loaded 2559 test features

tests/perl_tests/lazy_nclist.t ............. ok
tests/perl_tests/maker2jbrowse.t ........... ok
tests/perl_tests/nclist.t .................. ok
WARNING: multiple reference sequences found named 'NC_001133', using only the first one.
WARNING: multiple reference sequences found named 'NC_001133', using only the first one.

/tmp/kClf3A11Ba

tests/perl_tests/prepare-refseqs.pl.t ...... ok
tests/perl_tests/remove-track.pl.t ......... ok

writing output to /tmp/DgNP6Qb2ry

Failed test 'ucsc_to_json.pl made the right output'

at tests/perl_tests/ucsc-to-json.pl.t line 33.

Structures begin differing at:

$got->{tracks/knownGene/chr1/lf-5.jsonz}[0][5] = 'B2RMP9'

$expected->{tracks/knownGene/chr1/lf-5.jsonz}[0][5] = 'uc001cfh.1'

Track nonExistentTrack not found in the UCSC track database (trackDb.txt.gz) file. Is it a real UCSC track? at bin/ucsc-to-json.pl line 194.

To format the jaxQtlAsIs track, you must have both files tests/data/hg19/database//jaxQtlAsIs.sql and tests/data/hg19/database//jaxQtlAsIs.txt.gz

Looks like you failed 1 test of 6.

tests/perl_tests/ucsc-to-json.pl.t .........
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/6 subtests

using temp dir /tmp/3OTz0xjUUl

tests/perl_tests/wig-to-json.pl.t .......... ok

Test Summary Report

tests/perl_tests/biodb-to-json.pl.t (Wstat: 512 Tests: 15 Failed: 2)
Failed tests: 3, 11
Non-zero exit status: 2
tests/perl_tests/flatfile-to-json.pl.t (Wstat: 6400 Tests: 5 Failed: 1)
Failed test: 5
Non-zero exit status: 25
Parse errors: No plan found in TAP output
tests/perl_tests/genbank.t (Wstat: 512 Tests: 21 Failed: 2)
Failed tests: 11, 18
Non-zero exit status: 2
tests/perl_tests/generate-names.pl.t (Wstat: 768 Tests: 4 Failed: 3)
Failed tests: 1-3
Non-zero exit status: 3
tests/perl_tests/ucsc-to-json.pl.t (Wstat: 256 Tests: 6 Failed: 1)
Failed test: 2
Non-zero exit status: 1
Files=19, Tests=199, 13 wallclock secs ( 0.07 usr 0.02 sys + 8.75 cusr 1.49 csys = 10.33 CPU)
Result: FAIL

@cmdcolin

Thomas on #bioperl provided the following tip:

17:20 < trs> cdiesh: Perl version?
17:22 < trs> I don't know what the test code is doing, but if it's relying on the order of keys %hash or values %hash somewhere being stable, that'll fail on Perl >= 5.18.0
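
A minimal sketch of the behaviour described in the tip. The `%feature` hash is hypothetical (its attribute names are borrowed from the test output above), and sorting the keys is just one common way to get a repeatable order; it is not taken from the JBrowse code.

```perl
use strict;
use warnings;

# Hypothetical feature record; attribute names mirror those in the test output.
my %feature = (
    Seq_id => 'ctgA',
    Source => 'example',
    Type   => 'gene',
    Name   => 'EDEN',
);

# The order returned by keys() is unspecified, and on perl >= 5.18 it can
# differ from one run of the same script to the next.
my @unstable = keys %feature;

# Sorting gives the same order on every run and on every perl version.
my @stable = sort keys %feature;

print join( ',', @stable ), "\n";    # Name,Seq_id,Source,Type
```

Any code (or test) that needs a fixed order has to impose one explicitly, as `@stable` does here.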

@cmdcolin

cmdcolin commented May 1, 2014

Confirmed: the failures reproduce on Mac OS X with perl 5.18 installed via perlbrew. The tests pass on Mac OS X with perl 5.16.

@cmdcolin

cmdcolin commented May 11, 2014

Here's what the nclist in sample_data/json/volvox/tracks/Genes/ctgA/trackData.json looks like on a system with perl 5.16:

EDIT: to include the whole intervals->nclist subtree

```
"intervals": {
  "nclist": [
    [0, 1049, 9000, 1, "example", "ctgA", "EDEN", "EDEN", "protein kinase", "gene", [
      [1, 1049, 9000, 1, "example", "ctgA", "EDEN.1", "EDEN.1", "Eden splice form 1", "EDEN", "mRNA", [
        [2, 4999, 5500, 1, "example", "ctgA", "EDEN.1", "0", "CDS"],
        [2, 1200, 1500, 1, "example", "ctgA", "EDEN.1", "0", "CDS"],
        [2, 2999, 3902, 1, "example", "ctgA", "EDEN.1", "0", "CDS"],
        [2, 6999, 7608, 1, "example", "ctgA", "EDEN.1", "0", "CDS"],
        [3, 7608, 9000, 1, "example", "ctgA", "three_prime_UTR", "EDEN.1"],
        [3, 1049, 1200, 1, "example", "ctgA", "five_prime_UTR", "EDEN.1"]
      ]],
      [1, 1299, 9000, 1, "example", "ctgA", "EDEN.3", "EDEN.3", "Eden splice form 3", "EDEN", "mRNA", [
        [3, 1299, 1500, 1, "example", "ctgA", "five_prime_UTR", "EDEN.3"],
        [2, 3300, 3902, 1, "example", "ctgA", "EDEN.3", "0", "CDS"],
        [2, 6999, 7600, 1, "example", "ctgA", "EDEN.3", "1", "CDS"],
        [3, 2999, 3300, 1, "example", "ctgA", "five_prime_UTR", "EDEN.3"],
        [3, 7600, 9000, 1, "example", "ctgA", "three_prime_UTR", "EDEN.3"],
        [2, 4999, 5500, 1, "example", "ctgA", "EDEN.3", "1", "CDS"]
      ]],
      [1, 1049, 9000, 1, "example", "ctgA", "EDEN.2", "EDEN.2", "Eden splice form 2", "EDEN", "mRNA", [
        [3, 1049, 1200, 1, "example", "ctgA", "five_prime_UTR", "EDEN.2"],
        [2, 6999, 7608, 1, "example", "ctgA", "EDEN.2", "0", "CDS"],
        [2, 1200, 1500, 1, "example", "ctgA", "EDEN.2", "0", "CDS"],
        [3, 7608, 9000, 1, "example", "ctgA", "three_prime_UTR", "EDEN.2"],
        [2, 4999, 5500, 1, "example", "ctgA", "EDEN.2", "0", "CDS"]
      ]]
    ]]
  ],
  "classes": [
    { "isArrayAttr": { "Subfeatures": 1 }, "attributes": ["Start", "End", "Strand", "Source", "Seq_id", "Load_id", "Name", "Note", "Type", "Subfeatures"] },
    { "isArrayAttr": { "Subfeatures": 1 }, "attributes": ["Start", "End", "Strand", "Source", "Seq_id", "Load_id", "Name", "Note", "Parent_id", "Type", "Subfeatures"] },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Source", "Seq_id", "Parent_id", "Phase", "Type"] },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Source", "Seq_id", "Type", "Parent_id"] },
    { "isArrayAttr": { "Sublist": 1 }, "attributes": ["Start", "End", "Chunk"] }
  ],
  "maxEnd": 9000,
  "count": 1,
  "lazyClass": 4,
  "urlTemplate": "lf-{Chunk}.json",
  "minStart": 1049
}
```

Here's what the same nclist block looks like on a machine with perl 5.18 (note that each feature's attribute values are reordered, with some now following the subfeatures array):

```
"intervals": {
  "urlTemplate": "lf-{Chunk}.json",
  "count": 1,
  "lazyClass": 17,
  "maxEnd": 9000,
  "classes": [
    { "isArrayAttr": { "Subfeatures": 1 }, "attributes": ["Start", "End", "Strand", "Note", "Load_id", "Type", "Source", "Subfeatures", "Name", "Seq_id"] },
    { "attributes": ["Start", "End", "Strand", "Name", "Seq_id", "Parent_id", "Subfeatures", "Source", "Load_id", "Type", "Note"], "isArrayAttr": { "Subfeatures": 1 } },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Seq_id", "Type", "Parent_id", "Source"] },
    { "attributes": ["Start", "End", "Strand", "Type", "Seq_id", "Phase", "Parent_id", "Source"], "isArrayAttr": {} },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Source", "Parent_id", "Type", "Seq_id"] },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Source", "Parent_id", "Type", "Seq_id"] },
    { "attributes": ["Start", "End", "Strand", "Seq_id", "Source", "Phase", "Parent_id", "Type"], "isArrayAttr": {} },
    { "isArrayAttr": { "Subfeatures": 1 }, "attributes": ["Start", "End", "Strand", "Note", "Load_id", "Type", "Subfeatures", "Source", "Parent_id", "Name", "Seq_id"] },
    { "attributes": ["Start", "End", "Strand", "Type", "Seq_id", "Phase", "Parent_id", "Source"], "isArrayAttr": {} },
    { "attributes": ["Start", "End", "Strand", "Type", "Seq_id", "Source", "Parent_id"], "isArrayAttr": {} },
    { "attributes": ["Start", "End", "Strand", "Source", "Phase", "Parent_id", "Seq_id", "Type"], "isArrayAttr": {} },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Parent_id", "Phase", "Source", "Seq_id", "Type"] },
    { "attributes": ["Start", "End", "Strand", "Seq_id", "Parent_id", "Phase", "Source", "Type"], "isArrayAttr": {} },
    { "attributes": ["Start", "End", "Strand", "Parent_id", "Source", "Subfeatures", "Seq_id", "Name", "Note", "Load_id", "Type"], "isArrayAttr": { "Subfeatures": 1 } },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Seq_id", "Source", "Parent_id", "Phase", "Type"] },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Source", "Parent_id", "Type", "Seq_id"] },
    { "isArrayAttr": {}, "attributes": ["Start", "End", "Strand", "Source", "Parent_id", "Phase", "Seq_id", "Type"] },
    { "isArrayAttr": { "Sublist": 1 }, "attributes": ["Start", "End", "Chunk"] }
  ],
  "minStart": 1049,
  "nclist": [
    [0, 1049, 9000, 1, "protein kinase", "EDEN", "gene", "example", [
      [1, 1299, 9000, 1, "EDEN.3", "ctgA", "EDEN", [
        [2, 2999, 3300, 1, "ctgA", "five_prime_UTR", "EDEN.3", "example"],
        [3, 4999, 5500, 1, "CDS", "ctgA", "1", "EDEN.3", "example"],
        [4, 7600, 9000, 1, "example", "EDEN.3", "three_prime_UTR", "ctgA"],
        [5, 1299, 1500, 1, "example", "EDEN.3", "five_prime_UTR", "ctgA"],
        [3, 6999, 7600, 1, "CDS", "ctgA", "1", "EDEN.3", "example"],
        [6, 3300, 3902, 1, "ctgA", "example", "0", "EDEN.3", "CDS"]
      ], "example", "EDEN.3", "mRNA", "Eden splice form 3"],
      [7, 1049, 9000, 1, "Eden splice form 1", "EDEN.1", "mRNA", [
        [8, 1200, 1500, 1, "CDS", "ctgA", "0", "EDEN.1", "example"],
        [9, 7608, 9000, 1, "three_prime_UTR", "ctgA", "example", "EDEN.1"],
        [10, 2999, 3902, 1, "example", "0", "EDEN.1", "ctgA", "CDS"],
        [11, 6999, 7608, 1, "EDEN.1", "0", "example", "ctgA", "CDS"],
        [5, 1049, 1200, 1, "example", "EDEN.1", "five_prime_UTR", "ctgA"],
        [12, 4999, 5500, 1, "ctgA", "EDEN.1", "0", "example", "CDS"]
      ], "example", "EDEN", "EDEN.1", "ctgA"],
      [13, 1049, 9000, 1, "EDEN", "example", [
        [14, 1200, 1500, 1, "ctgA", "example", "EDEN.2", "0", "CDS"],
        [2, 7608, 9000, 1, "ctgA", "three_prime_UTR", "EDEN.2", "example"],
        [6, 6999, 7608, 1, "ctgA", "example", "0", "EDEN.2", "CDS"],
        [15, 1049, 1200, 1, "example", "EDEN.2", "five_prime_UTR", "ctgA"],
        [16, 4999, 5500, 1, "example", "EDEN.2", "0", "ctgA", "CDS"]
      ], "ctgA", "EDEN.2", "Eden splice form 2", "EDEN.2", "mRNA"]
    ], "EDEN", "ctgA"]
  ]
}
```

@cmdcolin cmdcolin added this to the Release 1.11.4 milestone May 12, 2014
@cmdcolin

cmdcolin commented May 12, 2014

To be clear about the source of the problem, here is an example.

With perl 5.18, the attribute list of class 0 is:

[1:"Start", 2:"End", 3:"Strand", 4:"Note", 5:"Load_id", 6:"Type", 7:"Source", 8:"Subfeatures", 9:"Name", 10:"Seq_id"]

and the feature data in nclist follows that order:

[1:1049, 2:9000, 3:1, 4:"protein kinase", 5:"EDEN", 6:"gene", 7:"example", 8:subfeatures, 9:"EDEN", 10:"ctgA"]

With perl 5.16, the attribute list of class 0 is:

["Start", "End", "Strand", "Source", "Seq_id", "Load_id", "Name", "Note", "Type", "Subfeatures"]

and the feature data in nclist follows that order:

    [1049, 9000, 1, 'example', 'ctgA', 'EDEN', 'EDEN', 'protein kinase', 'gene']

The test code assumes that the attribute order of class 0 matches a fixed, pre-defined layout, but that assumption is invalid on perl 5.18. The test code will be updated.
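
One way a test could avoid depending on attribute order is to resolve each value by name through the class list. The sketch below is an assumption-laden illustration, not the actual fix: `feature_to_hash` is a hypothetical helper, and it assumes (as the trackData.json dumps in this thread suggest) that element 0 of each feature array is an index into `classes`.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of JBrowse): turn one nclist feature array
# into a name => value hash by consulting the class list stored next to it.
# Assumes element 0 of the feature is the class index.
sub feature_to_hash {
    my ( $classes, $feature ) = @_;
    my ( $class_idx, @values ) = @$feature;
    my @names = @{ $classes->[$class_idx]{attributes} };
    my %h;
    @h{@names} = @values;    # hash slice pairs each name with its value
    return \%h;
}

# Class 0 and the gene feature as written by perl 5.16 ...
my $classes_516 = [ { attributes => [qw(Start End Strand Source Seq_id Load_id Name Note Type Subfeatures)] } ];
my $gene_516 = [ 0, 1049, 9000, 1, 'example', 'ctgA', 'EDEN', 'EDEN', 'protein kinase', 'gene', [] ];

# ... and by perl 5.18 (subfeatures replaced by an empty list for brevity).
my $classes_518 = [ { attributes => [qw(Start End Strand Note Load_id Type Source Subfeatures Name Seq_id)] } ];
my $gene_518 = [ 0, 1049, 9000, 1, 'protein kinase', 'EDEN', 'gene', 'example', [], 'EDEN', 'ctgA' ];

my $h516 = feature_to_hash( $classes_516, $gene_516 );
my $h518 = feature_to_hash( $classes_518, $gene_518 );

# Comparing by name works for both layouts; a fixed array position would not.
print "$h516->{Type} $h518->{Type}\n";    # gene gene
```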

cmdcolin added a commit that referenced this issue May 12, 2014
cmdcolin added a commit that referenced this issue May 12, 2014
…x (lookup class structure). Problem remains where many different ArrayRepr classes are generated. See issue #470
@cmdcolin

cmdcolin commented May 13, 2014

Here is the full output using perl 5.18 on Ubuntu (large file, 127 kB). It fails the flatfile-to-json and generate-names tests:
http://pastebin.com/ZDt5nBm0

Note: here is an example of the problem in flatfile-to-json, where many NCList ArrayRepr classes are created dynamically that differ only in the order of their attributes:

```
{
  'attributes' => [ 'Start', 'End', 'Strand', 'Type', 'Id', 'Name', 'Source', 'Subfeatures', 'Score', 'Seq_id' ],
  'isArrayAttr' => { 'Subfeatures' => 1 }
},
{
  'attributes' => [ 'Start', 'End', 'Strand', 'Seq_id', 'Score', 'Type', 'Id', 'Name', 'Source', 'Subfeatures' ],
  'isArrayAttr' => { 'Subfeatures' => 1 }
},
```
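
A sketch of how duplicate classes like these could be avoided in principle: derive the class from a sorted attribute list, so that two features with the same set of attributes always share one class regardless of hash order. The `class_for` function and its registry are hypothetical and much simpler than the real ArrayRepr code (for instance, they ignore the fixed leading Start/End/Strand fields).

```perl
use strict;
use warnings;

my ( @classes, %class_index );

# Return the class index for a feature hash, creating the class on first use.
# Sorting the attribute names makes the lookup key independent of hash order.
sub class_for {
    my ($feature) = @_;
    my @attrs = sort keys %$feature;
    my $key   = join "\0", @attrs;
    unless ( exists $class_index{$key} ) {
        push @classes, { attributes => \@attrs };
        $class_index{$key} = $#classes;
    }
    return $class_index{$key};
}

# Same attribute set written in two different orders: one shared class.
my $c1 = class_for( { Type => 'CDS', Seq_id => 'ctgA', Source => 'example' } );
my $c2 = class_for( { Source => 'example', Type => 'CDS', Seq_id => 'ctgA' } );

# A genuinely different attribute set still gets its own class.
my $c3 = class_for( { Type => 'mRNA', Seq_id => 'ctgA', Source => 'example', Name => 'EDEN.1' } );

printf "%d classes\n", scalar @classes;    # 2 classes
```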

For generate-names.pl, a data item in the 'exact' match part of the names structure does not match the expected value. I don't know exactly what this item represents, even after looking at the source code:

  Failed test 'got right data from volvox test data run'
  at tests/perl_tests/generate-names.pl.t line 39.
    Structures begin differing at:
         $got->{e8b/f.json}{rs17878802}{exact}[0][1] = '12'
    $expected->{e8b/f.json}{rs17878802}{exact}[0][1] = '11'

  Failed test 'same data after incremental run'
  at tests/perl_tests/generate-names.pl.t line 58.
    Structures begin differing at:
         $got->{f4d/1.json}{rs4998557}{exact}[0][1] = '12'
    $expected->{f4d/1.json}{rs4998557}{exact}[0][1] = '11'

  Failed test 'same data after incremental run with --safeMode'
  at tests/perl_tests/generate-names.pl.t line 74.
    Structures begin differing at:
         $got->{262/f.json}{rs80265967}{exact}[0][1] = '12'
    $expected->{262/f.json}{rs80265967}{exact}[0][1] = '11'
Looks like you failed 3 tests of 4.

@cmdcolin cmdcolin removed this from the Release 1.11.4 milestone May 14, 2014
@cmdcolin cmdcolin removed the urgent label May 14, 2014
@cmdcolin cmdcolin added this to the 1.11.5 milestone May 22, 2014
@cmdcolin cmdcolin changed the title Perl tests fail with perl version 5.18 Perl tests fail with perl version 5.18 or with bioperl 1.6.923 (updated) Jun 30, 2014
@cmdcolin cmdcolin changed the title Perl tests fail with perl version 5.18 or with bioperl 1.6.923 (updated) Perl tests fail with perl version 5.18 Jun 30, 2014
@cmdcolin cmdcolin modified the milestones: 1.11.5, 1.11.6 Sep 4, 2014
@cmdcolin cmdcolin removed this from the 1.11.6 milestone Jan 23, 2015
@cmdcolin cmdcolin added the high priority related to a high-level project goal label Apr 22, 2015
@cmdcolin cmdcolin removed the high priority related to a high-level project goal label Jun 12, 2015
@enuggetry enuggetry changed the title Perl tests fail with perl version 5.18 Perl tests fail with perl version 5.18 Mar 1, 2016
@cmdcolin
Contributor Author

There is a pretty unfortunate consequence of this issue: running flatfile-to-json with Perl 5.18 or later takes much longer and produces much larger output.

This was alluded to in earlier comments here. Because the hash order is randomized, many permutations of the same feature class are generated (e.g. some features are represented as start,end,name,id,parent in trackData.json, while others are represented as name,end,start,parent,id, with the data values reordered to match).

The data still works at runtime, but this inflates the size of the files and makes the script take longer to run.

Here's a short example, parsing a 280 MB GFF file:

Perl 5.14, takes about 5 minutes

time bin/flatfile-to-json.pl --gff file.gff --sortMem 1000000000 --trackLabel test_5_14
205.80s user 10.97s system 73% **cpu 4:54.56 total**

Perl 5.18, takes almost 4 hours

time bin/flatfile-to-json.pl --gff file.gff --sortMem 1000000000 --trackLabel test_5_18
13749.42s user 240.97s system 99% **cpu 3:54:43.13 total**

On top of that, the disk usage is vastly larger.

In the perl 5.14 data directory, the disk size is 366MB for this track. In the 5.18 instance, the disk space is 21 GB (gigabytes)

Therefore there is about a 66x increase in user CPU time (roughly 48x in wall-clock time) and a 57x increase in disk space consumption!
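The ratios can be rechecked from the figures quoted in the two `time` outputs and the disk sizes. This is just arithmetic on those numbers; `to_seconds` is a helper written for this sketch, and the disk ratio assumes 1 GB = 1000 MB.

```perl
use strict;
use warnings;

# Convert a "h:mm:ss.ss" or "m:ss.ss" wall-clock string into seconds.
sub to_seconds {
    my ($clock) = @_;
    my $total = 0;
    $total = $total * 60 + $_ for split /:/, $clock;
    return $total;
}

# User CPU time reported by `time` for Perl 5.18 vs Perl 5.14.
my $user_ratio = 13749.42 / 205.80;

# Elapsed (wall-clock) time for the same two runs.
my $wall_ratio = to_seconds('3:54:43.13') / to_seconds('4:54.56');

# 21 GB vs 366 MB, assuming 1 GB = 1000 MB.
my $disk_ratio = ( 21 * 1000 ) / 366;

printf "user CPU: %.1fx, wall clock: %.1fx, disk: %.1fx\n",
    $user_ratio, $wall_ratio, $disk_ratio;
```

The 66x figure corresponds to user CPU time. Measured by elapsed time the slowdown is closer to 48x, partly because the 5.14 run only used 73% CPU.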

Because of this, it might be advisable to (a) add a prominent warning telling users to use a Perl version earlier than 5.18, since 5.18 is the release that introduced hash order randomization, and/or (b) fix this bug.

It seems odd to be reporting this only now, but I think it is reproducible and it is a bad experience for the end user. Perl 5.18 and later has probably only recently become the default Perl on newer operating systems, so more users will run into this.

@cmdcolin
Contributor Author

Possible solution: replace every occurrence of "keys %hash" with "sort keys %hash".
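To illustrate why sorting helps, here is a minimal sketch of the idea. The feature hashes and the `class_signature` helper are made up for this example and are not taken from the codebase. Sorting the keys gives every feature with the same set of attribute names the same attribute order, so they all map to one class rather than one class per permutation.

```perl
use strict;
use warnings;

# Two features carrying the same set of attribute names.
my %feature_a = ( start => 1049, end => 9000, name => 'EDEN',   id => 'g1', parent => '' );
my %feature_b = ( start => 1299, end => 9000, name => 'EDEN.3', id => 'm1', parent => 'g1' );

# Hypothetical helper: build a class signature from a feature's attribute names.
# Sorting the keys makes the signature independent of hash iteration order.
sub class_signature {
    my ($feature) = @_;
    return join ',', sort keys %$feature;
}

# Count how many distinct classes the features would produce.
my %classes_seen;
$classes_seen{ class_signature($_) }++ for \%feature_a, \%feature_b;

# With sorted keys both features map to a single class.
print scalar( keys %classes_seen ), " class(es): ",
    join( ' | ', sort keys %classes_seen ), "\n";
```

Without the `sort`, the order returned by `keys` is not guaranteed to be the same for two different hashes on Perl 5.18 and later, even when they hold the same keys, which is how the shuffled classes shown earlier can arise.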

@rbuels rbuels added urgent this should be worked on ASAP high priority related to a high-level project goal labels Jan 25, 2018
@rbuels rbuels added this to the 1.12.4 milestone Jan 25, 2018
@rbuels rbuels self-assigned this Jan 25, 2018
@cmdcolin
Contributor Author

Here is a test GFF that, as far as I recall, demonstrated the very long run time and disk space blowup (not all GFFs seem to do this): ftp://ftp.ncbi.nlm.nih.gov/genomes/Scleropages_formosus/GFF/ref_ASM162426v1_top_level.gff3.gz

@rbuels
Collaborator

rbuels commented Jan 25, 2018

Looks like the changes you made in #912 fix that performance regression. Nicely done.

@rbuels
Collaborator

rbuels commented Jan 25, 2018

Fixed! Merged the PR. Thanks so much @cmdcolin

@rbuels rbuels closed this as completed Jan 25, 2018