incorrect association of a node in the synthetic tree: wrong taxonomic node #154

Closed
mtholder opened this Issue Jan 27, 2015 · 14 comments

mtholder commented Jan 27, 2015

https://tree.opentreeoflife.org/opentree/argus/otol.draft.22@860705/Halicnemia
shows up labelled as Halicnemia_ott776728 but it actually has the taxonomic composition of the taxonomic parent: Heteroxyidae_ott668403
see curl calls below.

requesting the genus:

$ curl -X POST -H "Content-Type":"application/json" -H "Accept":"application/json" http://api.opentreeoflife.org/v2/taxonomy/subtree --data '{"ott_id": 776728}' 

gives a subset:

{
  "subtree" : "(Halicnemia_sp._BELUM<GBR>_Mc5427_ott4939607,Halicnemia_sp._BELUM<GBR>_Mc4307_ott4939608,Halicnemia_papillosa_ott2835824,Halicnemia_diazae_ott2835823,Halicnemia_geniculata_ott2835822,Halicnemia_salomonensis_ott2835821,Halicnemia_arcuata_ott2835820,Halicnemia_patera_ott776726,Halicnemia_sp._A_CM-2010_ott145568)Halicnemia_ott776728;"
}

of what you get when you ask for the synth tree:

$ curl -X POST -H "Content-Type":"application/json" -H "Accept":"application/json" http://api.opentreeoflife.org/v2/tree_of_life/subtree --data '{"node_id": "860705", "tree_id": "otol.draft.22"}'

Note the newick field and its root label:

{
  "newick" : "((Parahigginsia_phakelloides_ott2835918,Parahigginsia_phakellioides_ott2835865)Parahigginsia_ott2835866,(Alloscleria_tenuispinosa_ott2835914)Alloscleria_ott2835915,(Negombo_kellyae_ott2835911,Negombo_jogashimensis_ott2835936,Negombo_acanthosanidastera_ott2835862,Negombo_tenuistellata_ott2835858)Negombo_ott2835859,(Acanthoclada_prostrata_ott2835939)Acanthoclada_ott2835931,(Julavis_jamaicensis_ott2835881,Julavis_levis_ott2835880)Julavis_ott2835854,(Microxistyla_petrina_ott2835896)Microxistyla_ott2835867,(Heteroxya_corticata_ott2835861,Heteroxya_sp_1_CM-2013_ott5223187)Heteroxya_ott2835860,(Desmoxya_lunata_ott2835932,Desmoxya_pelagiae_ott2835917)Desmoxya_ott2835871,Halicnemia_arcuata_ott2835820,Halicnemia_sp_A_CM-2010_ott145568,Halicnemia_salomonensis_ott2835821,Halicnemia_sp_BELUM_GBR_Mc4307_ott4939608,Halicnemia_sp_BELUM_GBR_Mc5427_ott4939607,Halicnemia_geniculata_ott2835822,Halicnemia_patera_ott776726,Halicnemia_papillosa_ott2835824,Halicnemia_diazae_ott2835823)Halicnemia_ott776728;",
  "tree_id" : "otol.draft.22"
}

That tree has the same structure as the subtree of the parent taxon:

$ curl -X POST -H "Content-Type":"application/json" -H "Accept":"application/json" http://api.opentreeoflife.org/v2/taxonomy/subtree --data '{"ott_id": 668403}'

which results in:

{
  "subtree" : "((Heteroxya_sp._1_CM-2013_ott5223187,Heteroxya_corticata_ott2835861)Heteroxya_ott2835860,(Negombo_acanthosanidastera_ott2835862,Negombo_tenuistellata_ott2835858,Negombo_jogashimensis_ott2835936,Negombo_kellyae_ott2835911)Negombo_ott2835859,(Desmoxya_lunata_ott2835932,Desmoxya_pelagiae_ott2835917)Desmoxya_ott2835871,(Microxistyla_petrina_ott2835896)Microxistyla_ott2835867,(Parahigginsia_phakellioides_ott2835865,Parahigginsia_phakelloides_ott2835918)Parahigginsia_ott2835866,(Julavis_levis_ott2835880,Julavis_jamaicensis_ott2835881)Julavis_ott2835854,(Acanthoclada_prostrata_ott2835939)Acanthoclada_ott2835931,(Alloscleria_tenuispinosa_ott2835914)Alloscleria_ott2835915,(Halicnemia_sp._BELUM<GBR>_Mc5427_ott4939607,Halicnemia_sp._BELUM<GBR>_Mc4307_ott4939608,Halicnemia_papillosa_ott2835824,Halicnemia_diazae_ott2835823,Halicnemia_geniculata_ott2835822,Halicnemia_salomonensis_ott2835821,Halicnemia_arcuata_ott2835820,Halicnemia_patera_ott776726,Halicnemia_sp._A_CM-2010_ott145568)Halicnemia_ott776728)Heteroxyidae_ott668403;"
}
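
A comparison like the one above can be checked mechanically by pulling the OTT ids of the leaves out of each Newick string and comparing the sets. The following is only a minimal sketch: the three Newick strings are small made-up stand-ins for the `subtree` and `newick` fields shown above (not the real responses), the function names are hypothetical, and the regular expressions assume plain unquoted labels that end in `_ott<number>` with no branch lengths.

```python
import re

# A leaf label directly follows "(" or "," and is followed by "," or ")".
LEAF_RE = re.compile(r"[(,][^(),;]*_ott(\d+)(?=[,)])")
# The root label sits between the last ")" and the terminating ";".
ROOT_RE = re.compile(r"\)[^(),;]*_ott(\d+);\s*$")


def leaf_ott_ids(newick):
    """Return the set of OTT ids found on leaf labels of a Newick string."""
    return {int(ott) for ott in LEAF_RE.findall(newick)}


def root_ott_id(newick):
    """Return the OTT id on the root label, or None if the root is unlabelled."""
    match = ROOT_RE.search(newick)
    return int(match.group(1)) if match else None


# Hypothetical toy data: genus G (ott10) sits inside family F (ott20).
genus_taxonomy = "(D_ott4,E_ott5)G_ott10;"
family_taxonomy = "((A_ott1,B_ott2)C_ott3,(D_ott4,E_ott5)G_ott10)F_ott20;"
# Synthetic subtree labelled as the genus but containing the family's leaves.
synth_subtree = "((B_ott2,A_ott1)C_ott3,E_ott5,D_ott4)G_ott10;"

print("root label of synth subtree:", root_ott_id(synth_subtree))
print("matches genus leaf set:", leaf_ott_ids(synth_subtree) == leaf_ott_ids(genus_taxonomy))
print("matches family leaf set:", leaf_ott_ids(synth_subtree) == leaf_ott_ids(family_taxonomy))
```

Comparing on the numeric OTT id rather than the full label sidesteps the label differences visible above (for example `sp._BELUM<GBR>` in the taxonomy output versus `sp_BELUM_GBR` in the synthetic tree output).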

mtholder added a commit to mtholder/ncl that referenced this issue Jan 27, 2015

start of new example to check OT taxonomic nodes
In light of OpenTreeOfLife/treemachine#154
we need to check the labelled internal nodes.

mtholder Jan 27, 2015

Member

From a new pull from NCL and then running:

$ example/check-taxo-nodes/checktaxonnodes -frelaxedphyliptree ../../draftversion2.tre ../Taxonomy.tre >out 2>err

I find 22 cases of named internal nodes in the synthetic tree with leaf sets that differ from the definition of that name in OTT:

ott1026107
ott103935
ott1051412
ott172860
ott197414
ott2840942
ott327448
ott389506
ott411973
ott411975
ott411977
ott438716
ott4795965
ott5247549
ott5424357
ott605182
ott776728
ott803358
ott890366
ott938413
ott966429
ott99242

I have not checked these manually (yet).

mtholder Jan 27, 2015

Member

It looks like 19 are cases where the synthetic tree node has the taxonomic content of a node in OTT but carries the wrong name:

    Found identical leaf sets for the synthetic tree "Chlamydiales ott966429" and the taxonomic node "Chlamydiae ott370886".
    Found identical leaf sets for the synthetic tree "Acidobacteria ott1051412" and the taxonomic node "Acidobacteria sup ott952528".
    Found identical leaf sets for the synthetic tree "Dehalococcoidaceae ott438716" and the taxonomic node "Dehalococcoidia ott346927".
    Found identical leaf sets for the synthetic tree "Aegilops speltoides ott605182" and the taxonomic node "Aegilops ott267024".
    Found identical leaf sets for the synthetic tree "Aegilops speltoides subsp speltoides ott327448" and the taxonomic node "Aegilops ott267024".
    Found identical leaf sets for the synthetic tree "Cycadaceae ott99242" and the taxonomic node "Cycadales ott614464".
    Found identical leaf sets for the synthetic tree "Bacillariophytina ott5247549" and the taxonomic node "Bacillariophyta ott5342311".
    Found identical leaf sets for the synthetic tree "Halicnemia ott776728" and the taxonomic node "Heteroxyidae ott668403".
    Found identical leaf sets for the synthetic tree "Tetrapocillon ott2840942" and the taxonomic node "Guitarridae ott2840923".
    Found identical leaf sets for the synthetic tree "Trikentrion ott172860" and the taxonomic node "Cyamoninae ott172859".
    Found identical leaf sets for the synthetic tree "Columbidae ott938413" and the taxonomic node "Columbiformes ott363030".
    Found identical leaf sets for the synthetic tree "Kurtoidei ott411975" and the taxonomic node "Gobiomorpharia ott5553755".
    Found identical leaf sets for the synthetic tree "Kurtidae ott411977" and the taxonomic node "Gobiomorpharia ott5553755".
    Found identical leaf sets for the synthetic tree "Kurtus ott411973" and the taxonomic node "Gobiomorpharia ott5553755".
    Found identical leaf sets for the synthetic tree "Peristediidae ott803358" and the taxonomic node "Triglioidei ott5557288".
    Found identical leaf sets for the synthetic tree "Rathbunella ott197414" and the taxonomic node "Bathymasteridae ott300544".
    Found identical leaf sets for the synthetic tree "Diapriidae ott890366" and the taxonomic node "Proctotrupoidea ott483914".
    Found identical leaf sets for the synthetic tree "Eumunida ott389506" and the taxonomic node "Chirostylidae ott389507".
    Found identical leaf sets for the synthetic tree "Marine Group I ott4795965" and the taxonomic node "Thaumarchaeota ott102415".

The other 3 are:

    Could not find this set of leaves in the synth "Mycale ott1026107" in any taxonomic node.
    Could not find this set of leaves in the synth "Higginsia ott103935" in any taxonomic node.
    Could not find this set of leaves in the synth "Dorylaimia ott5424357" in any taxonomic node.

(once again I have not manually checked these yet).
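
For readers without the NCL example at hand, the two kinds of report above (a leaf set that matches a differently named taxon, or one that matches no taxon at all) can be reproduced in outline as follows. This is a rough sketch and not the `checktaxonnodes` code: the function names and the toy trees are made up for illustration, and the parser assumes both trees contain the same tips, that every label ends in `ott<number>`, and that the Newick has no quoted labels, branch lengths, or comments.

```python
import re

TOKEN_RE = re.compile(r"[(),;]|[^(),;]+")
OTT_RE = re.compile(r"ott(\d+)$")


def ott_id(label):
    """Extract the trailing OTT id from a label such as 'Halicnemia_ott776728'."""
    match = OTT_RE.search(label.strip())
    if match is None:
        raise ValueError("label has no OTT id: %r" % label)
    return int(match.group(1))


def named_leaf_sets(newick):
    """Map the OTT id of each labelled internal node to its leaf OTT ids."""
    result = {}
    stack = []     # one growing leaf set per currently open "("
    closed = None  # leaf set of the clade just closed by ")", if any
    for tok in TOKEN_RE.findall(newick.strip()):
        if tok == "(":
            stack.append(set())
            closed = None
        elif tok == ")":
            closed = stack.pop()
            if stack:
                stack[-1] |= closed  # leaves also belong to the enclosing clade
        elif tok in ",;":
            closed = None
        elif closed is not None:
            # A label right after ")" names the internal node just closed.
            result[ott_id(tok)] = frozenset(closed)
            closed = None
        elif stack:
            stack[-1].add(ott_id(tok))  # a leaf inside the innermost open clade
    return result


def misnamed_nodes(synth_newick, taxonomy_newick):
    """Return {synth node id: taxonomy ids with the same leaf set} for mismatches.

    An empty list means no taxonomic node has that leaf set.
    """
    synth = named_leaf_sets(synth_newick)
    taxo = named_leaf_sets(taxonomy_newick)
    by_leaves = {}
    for tax_id, leaves in taxo.items():
        by_leaves.setdefault(leaves, []).append(tax_id)
    report = {}
    for node_id, leaves in synth.items():
        if taxo.get(node_id) != leaves:
            report[node_id] = sorted(by_leaves.get(leaves, []))
    return report


# Hypothetical toy trees: in the synthetic tree, G (ott10) holds F's leaves.
taxonomy = "((A_ott1,B_ott2)C_ott3,(D_ott4,E_ott5)G_ott10)F_ott20;"
synth = "(((A_ott1,B_ott2)C_ott3,D_ott4,E_ott5)G_ott10)F_ott20;"
print(misnamed_nodes(synth, taxonomy))
```

Several taxonomy nodes can share one leaf set (monotypic taxa), which is why the report keeps a list of candidate ids for each mismatched node rather than a single match.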

mtholder Jan 27, 2015

Member

I just confirmed that Mycale_titubans_ott403492 is not in https://tree.opentreeoflife.org/opentree/argus/ottol@1026107/Mycale-genus-ncbi-86015-in-family-Mycalidae-

The taxonomy has no name for a subgroup of Mycale_ott1026107 that excludes only Mycale_titubans_ott403492.

So this could be a different bug than the one described earlier in the thread involving Heteroxyidae_ott668403.

We may want to separate this into 2 different issues.

josephwb May 6, 2015

Member

Should be fixed as of 3cba417. Will know for sure shortly 😉

chinchliff May 12, 2015

Member

Any update on this? It is referenced in the doc, and I am curious whether we have confirmation of Joseph's statement that it should be fixed.

mtholder May 12, 2015

Member

His fix dealt with all but 2 cases (either that, or there is a bug in the otcetera code that identifies problems).

josephwb May 13, 2015

Member

I haven't chased down what's what with this. We turned down verbosity recently in the synth log; I'll see if the info is in there.
Weirdly, there are no problems when synthesis is run on individual clades (Embryophytes, Metazoa, etc.), but they appear in the big analysis.

mtholder May 19, 2015

Member

Version 3 of synth is down to 2 cases. One is at the species level:
https://devtree.opentreeoflife.org/opentree/argus/opentree3.0@3678897
which excludes:
https://devtree.opentreeoflife.org/opentree/opentree3.0@3854155/Arthronema-gygaxiana--Limnothrix-redekei-NIVA-CYA-227/1

That latter grouping is odd. If you back up to https://devtree.opentreeoflife.org/opentree/opentree3.0@3855022
you see that it currently lists no studies supporting it. (I know that #180 is probably more relevant to that facet of this bug.)

josephwb May 20, 2015

Member

I implemented my missing-children fix for "Limnothrix redekei ott704052" and re-ran synthesis (i.e. the full analysis). The problem went away (and didn't generate new ones!). The "Cottioidei ott237343" issue remains (as do the 9 unsupported nodes).

chinchliff May 20, 2015

Member

Oh. Well that's good. So maybe we can resolve this issue and remove the
associated disclaimer from the manuscript?

chinchliff May 20, 2015

Member

Actually, I suppose not, since the issue with Cottioidei remains. We know
it isn't missing children. I wonder what else could be happening that would
affect taxonomic composition like that?

josephwb May 20, 2015

Member

I don't know about the unsupported nodes. Could come from adding missing children. Unfortunately, I cannot do the unsupported test with the partial synth tree, partial taxonomy tree, and pruned inputs: something goes wrong with pruning. Maybe @mtholder can take a look?

@josephwb josephwb closed this Mar 25, 2016
