flatfile-to-json.pl problem with the chromosome name numbered "0" #610

Closed
ifantasy opened this Issue Jun 25, 2015 · 2 comments


When I formatted a GFF3 file with flatfile-to-json.pl, the following error was reported: "Died at ....../src/perl5/Bio/JBrowse/FeatureStream.pm line 110".

0   ensembl gene    16437   18189   .   +   .   ID=gene:Solyc00g005000.2;assembly_name=SL2.40;biotype=protein_coding;logic_name=genemodel_itag2.3;version=1
0   ensembl transcript  16437   18189   .   +   .   ID=transcript:Solyc00g005000.2.1;Parent=gene:Solyc00g005000.2;assembly_name=SL2.40;biotype=protein_coding;description=Solyc00g005000.2.1;external_name=SOLYC00G005000.2;logic_name=genemodel_itag2.3;version=1
0   ensembl exon    16437   17275   .   +   .   Name=Solyc00g005000.2.1.exon1;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40;constitutive=1;ensembl_end_phase=1;ensembl_phase=-1;rank=1;version=1
0   .   five_prime_UTR  16437   16479   .   +   .   Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40
0   ensembl CDS 16480   17275   .   +   0   ID=CDS:Solyc00g005000.2.1;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40
0   ensembl CDS 17336   17940   .   +   2   ID=CDS:Solyc00g005000.2.1;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40
0   ensembl exon    17336   18189   .   +   .   Name=Solyc00g005000.2.1.exon2;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;rank=2;version=1
0   .   three_prime_UTR 17941   18189   .   +   .   Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40

However, no error was reported when the chromosome name was "1" or any other number.

1   ensembl gene    16437   18189   .   +   .   ID=gene:Solyc00g005000.2;assembly_name=SL2.40;biotype=protein_coding;logic_name=genemodel_itag2.3;version=1
1   ensembl transcript  16437   18189   .   +   .   ID=transcript:Solyc00g005000.2.1;Parent=gene:Solyc00g005000.2;assembly_name=SL2.40;biotype=protein_coding;description=Solyc00g005000.2.1;external_name=SOLYC00G005000.2;logic_name=genemodel_itag2.3;version=1
1   ensembl exon    16437   17275   .   +   .   Name=Solyc00g005000.2.1.exon1;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40;constitutive=1;ensembl_end_phase=1;ensembl_phase=-1;rank=1;version=1
1   .   five_prime_UTR  16437   16479   .   +   .   Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40
1   ensembl CDS 16480   17275   .   +   0   ID=CDS:Solyc00g005000.2.1;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40
1   ensembl CDS 17336   17940   .   +   2   ID=CDS:Solyc00g005000.2.1;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40
1   ensembl exon    17336   18189   .   +   .   Name=Solyc00g005000.2.1.exon2;Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40;constitutive=1;ensembl_end_phase=-1;ensembl_phase=1;rank=2;version=1
1   .   three_prime_UTR 17941   18189   .   +   .   Parent=transcript:Solyc00g005000.2.1;assembly_name=SL2.40

So, how should I handle chromosomes named "0", which occur in some genome assemblies?

Thank you!

@ifantasy ifantasy changed the title from flatfile-to-json.pl problem with chromosome name numbered "0" to flatfile-to-json.pl problem with the chromosome name numbered "0" Jun 25, 2015

cmdcolin Jun 25, 2015

Contributor

It looks like the error comes from this code:

my @namerec = (
    \@names,
    $self->{track_label},
    $names[0],
    $f->{seq_id} || die,    # <--- kills it
    $f->{start}-1, #< to zero-based
    $f->{end}+0
    );

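Perl treats the string "0" as false, so for a chromosome named "0" the "||" falls through to "die". If you remove the part that says "|| die", that would probably fix it!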

I am not sure what the reason for that code is, but I hope that helps.

Also, make sure to run generate-names.pl on your data folder when your chromosome names are plain numbers like this. Otherwise, links like ?loc=1 can be ambiguous, because JBrowse will not understand that you want to go to chromosome 1.
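
To illustrate why the "|| die" check above trips on a chromosome named "0", here is a minimal standalone sketch. It is not JBrowse code and not necessarily how the issue was eventually fixed. The sub names seq_id_or_die_truthy and seq_id_or_die_defined are hypothetical, and the second one is just one possible alternative: die only when seq_id is undefined or empty.

```perl
use strict;
use warnings;

# Same pattern as the snippet above: dies for any false value,
# and in Perl the string "0" is false.
sub seq_id_or_die_truthy {
    my ($f) = @_;
    return $f->{seq_id} || die "no seq_id\n";
}

# Hypothetical alternative (an assumption, not taken from JBrowse):
# die only when seq_id is undefined or an empty string, so "0" passes.
sub seq_id_or_die_defined {
    my ($f) = @_;
    my $id = $f->{seq_id};
    die "no seq_id\n" unless defined $id && length $id;
    return $id;
}

for my $name ('0', '1') {
    my $f = { seq_id => $name };
    # eval returns undef when the sub dies, so defined() tells us which path ran.
    my $truthy  = defined( eval { seq_id_or_die_truthy($f) } )  ? 'ok' : 'died';
    my $defined = defined( eval { seq_id_or_die_defined($f) } ) ? 'ok' : 'died';
    print "seq_id '$name': truthy check $truthy, defined check $defined\n";
}
```

With this sketch, the truthy check dies only for seq_id "0", while the defined-and-non-empty check accepts both "0" and "1".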

@cmdcolin cmdcolin added this to the 1.12.0 milestone Nov 30, 2015

cmdcolin Dec 2, 2015

Contributor

I think this should be fixed now via 2ff5a4a. Feel free to reopen if it occurs again.

@cmdcolin cmdcolin closed this Dec 2, 2015