
@caufieldjh
Contributor

@caufieldjh caufieldjh commented Oct 20, 2022

@caufieldjh caufieldjh marked this pull request as ready for review October 21, 2022 19:29
@caufieldjh caufieldjh merged commit 81a97db into main Oct 21, 2022
@caufieldjh caufieldjh deleted the bioportal_parsing branch October 21, 2022 19:32
return (id_map, cat_map)


def obo_handle(old_id: str) -> str:


We should really be fixing the sources rather than writing exception-handling code like this. Is there a way to get a report of all the "exceptions" fixed this way, so we can try to correct them in the ontologies?

Contributor Author


I'd love a way to automate fixing this across the original ~1,000 Bioportal entries (because that's mostly what this is for), but for now, all IDs are written out to one of three different reports, as needed:

  • IDs of unexpected format, e.g.:
ID
OBO:ExO_0000030
OBO:ExO_0000151
OBO:ExO_0000152
  • IDs with remapped categories:
Old ID	New Category
OBO:ExO_0000030	biolink:NamedThing
OBO:ExO_0000151	biolink:NamedThing
OBO:ExO_0000152	biolink:NamedThing
  • IDs remapped to new IDs:
Old ID	New ID
OBO:ExO_0000030	EXO:0000030
OBO:ExO_0000151	EXO:0000151
OBO:ExO_0000152	EXO:0000152

So that last report would be most useful for finding the easily-solved exceptions, but the first report may also contain some candidates for repair.
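
For illustration only, here is a minimal sketch of the kind of remapping shown in the third report. It is not the code in this PR: the regular expression, the function names `remap_obo_style_id` and `build_remap_report`, and the rule of upper-casing the prefix are all assumptions inferred from the three example rows above.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Assumed pattern: "OBO:" + prefix + "_" + local ID, e.g. "OBO:ExO_0000030".
OBO_STYLE = re.compile(r"^OBO:([A-Za-z][A-Za-z0-9]*)_(\S+)$")


def remap_obo_style_id(old_id: str) -> Optional[str]:
    """Return a CURIE with an upper-cased prefix, or None if the ID does not match."""
    match = OBO_STYLE.match(old_id)
    if match is None:
        return None
    prefix, local_id = match.groups()
    return f"{prefix.upper()}:{local_id}"


def build_remap_report(
    ids: Iterable[str],
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split IDs into (old, new) rows and a list of IDs that could not be remapped."""
    remapped: List[Tuple[str, str]] = []
    unmatched: List[str] = []
    for old_id in ids:
        new_id = remap_obo_style_id(old_id)
        if new_id is None:
            unmatched.append(old_id)
        else:
            remapped.append((old_id, new_id))
    return remapped, unmatched
```

Rows in `remapped` correspond to the "Old ID / New ID" report. Anything left in `unmatched` would be a candidate for manual review or a fix in the source ontology.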


Remind me, why are these not correctly understood to be ExO:0000030?

Contributor Author


In this case, it's to align with the Bioportal ID (EXO). I'm thinking about adding a profile option to use "OBO mode" so the Bioportal prefixes can still be used for mapping but will be normalized to the preferred forms like ExO.
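
A rough sketch of what such an "OBO mode" could look like, purely as an assumption about a feature that does not exist yet: the `PREFERRED_PREFIXES` table, the `normalize_prefix` function, and the `obo_mode` flag are hypothetical names, and only the EXO to ExO pair comes from this thread.

```python
# Hypothetical table from Bioportal-style prefixes to preferred OBO forms.
# Only the EXO -> ExO pair is taken from this discussion.
PREFERRED_PREFIXES = {"EXO": "ExO"}


def normalize_prefix(curie: str, obo_mode: bool = False) -> str:
    """Rewrite the prefix of a CURIE to its preferred form when obo_mode is on."""
    if not obo_mode or ":" not in curie:
        return curie  # leave the Bioportal-aligned form untouched
    prefix, local_id = curie.split(":", 1)
    # Prefixes without an entry in the table pass through unchanged.
    return f"{PREFERRED_PREFIXES.get(prefix, prefix)}:{local_id}"
```

With the flag off, `EXO:0000030` stays as it is for mapping against Bioportal. With it on, the same ID would come out as `ExO:0000030`.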


3 participants