Universal tables #9

zakandrewking · 2014-09-04T01:25:49Z

Loading universal components
1. If KEGG_ID is new, then add a new universal component
2. If KEGG_ID is not new, then connect to the existing universal component
  - New column in model_compartmentalized_component for old_bigg_id
3. If no KEGG_ID, then add a new universal component, and flag the row
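
The three rules above can be sketched in Python roughly as follows. This is only an illustration under assumptions: `UniversalComponent`, `load_component`, and the in-memory `registry` and `unmatched` containers are hypothetical names, not the real loader or schema, and `old_bigg_id` is stored directly on the component here rather than in `model_compartmentalized_component`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class UniversalComponent:
    """Hypothetical in-memory stand-in for a universal component row."""
    uid: int
    kegg_id: Optional[str]
    flagged: bool = False  # True when the component was loaded without a KEGG_ID
    old_bigg_ids: List[str] = field(default_factory=list)


def load_component(registry: Dict[str, UniversalComponent],
                   unmatched: List[UniversalComponent],
                   old_bigg_id: str,
                   kegg_id: Optional[str]) -> UniversalComponent:
    """Apply rules 1-3 to one incoming model component."""
    next_uid = len(registry) + len(unmatched) + 1
    if kegg_id is None:
        # Rule 3: no KEGG_ID, so add a new component and flag it for curation.
        comp = UniversalComponent(next_uid, None, flagged=True)
        unmatched.append(comp)
    elif kegg_id in registry:
        # Rule 2: KEGG_ID already seen, so connect to the existing component.
        comp = registry[kegg_id]
    else:
        # Rule 1: KEGG_ID is new, so add a new component.
        comp = UniversalComponent(next_uid, kegg_id)
        registry[kegg_id] = comp
    # Remember the identifier the source model used (the old_bigg_id idea).
    comp.old_bigg_ids.append(old_bigg_id)
    return comp


registry: Dict[str, UniversalComponent] = {}
unmatched: List[UniversalComponent] = []
a = load_component(registry, unmatched, "glc-D", "C00031")
b = load_component(registry, unmatched, "glc_D", "C00031")
c = load_component(registry, unmatched, "mystery_met", None)
```

Here `a` and `b` end up as the same component because they share a KEGG ID, while `c` becomes a separate flagged entry.
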
Loading universal reactions
- Compare new reactions by stoichiometry
  - Check each metabolite and coefficient
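
A minimal sketch of the stoichiometry comparison, assuming each reaction is represented as a mapping from metabolite ID to coefficient. The function names, the `PGI` example entry, and the tolerance are hypothetical and not taken from the actual code.

```python
from typing import Dict, Mapping, Optional

Stoichiometry = Mapping[str, float]  # metabolite id -> coefficient


def same_stoichiometry(a: Stoichiometry, b: Stoichiometry,
                       tol: float = 1e-9) -> bool:
    """True when both reactions use the same metabolites with equal coefficients."""
    if set(a) != set(b):
        return False
    return all(abs(a[m] - b[m]) <= tol for m in a)


def find_universal_reaction(new: Stoichiometry,
                            universal: Dict[str, Stoichiometry]) -> Optional[str]:
    """Return the id of the first universal reaction matching `new`, else None."""
    for reaction_id, stoich in universal.items():
        if same_stoichiometry(new, stoich):
            return reaction_id
    return None


universal = {"PGI": {"g6p_c": -1.0, "f6p_c": 1.0}}
match = find_universal_reaction({"g6p_c": -1, "f6p_c": 1}, universal)
no_match = find_universal_reaction({"g6p_c": -2, "f6p_c": 2}, universal)
```

This sketch treats a reversed or rescaled reaction as a different reaction; whether those should match is a separate decision.
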

steve-federowicz · 2014-09-20T20:33:00Z

Hey, so I was just talking with Justin about this and ran into a couple of issues.

The first is that most models don't have associated KEGG IDs. Ideally they all would, and maybe I am wrong, but I'm pretty sure only 10-20% of what we load will have them.

The second is that we probably need our own non-external unique IDs for universal components, because there are going to be many more types of components than just metabolites. The above scheme would work, but it would produce a massive number of flagged rows.

zakandrewking · 2014-09-22T02:33:33Z

I can't check on this right now, but I remember going into Simpheny and seeing KEGG ids for almost every metabolite I checked. Definitely the central metabolic ones. I wonder if they never made it into GRMIT?

It's OK to have a massive number of flagged rows. We have to deal with this someday, and there are many automated approaches to consider. Andreas has done something very similar.

For non-metabolites, we should try to come up with external IDs where possible. For anything that's part of a template reaction, we can get fancy. For instance, a transcription elongation reaction could be linked to the reaction template AND to an external gene ID. But we don't have to solve that immediately.

pillmill · 2014-09-22T16:08:36Z

The KEGG and CAS IDs were imported from Simpheny.

draeger · 2014-09-22T16:14:26Z

Hi guys,

IMHO, having our own ID scheme in addition to providing references to
external KEGG IDs would be a great idea. If BiGG IDs were consistent
and unique, other users could refer to us instead of pointing to KEGG
etc. It would be very nice if many models contained references to
BiGG, ultimately increasing our access count when people look those up.

Cheers
Andreas

Dr. Andreas Draeger
University of California, San Diego, La Jolla, CA 92093-0412, USA
Bioengineering Dept., Systems Biology Research Group, Office #2506
Phone: +1-858-534-9717, Fax: +1-858-822-3120, twitter: @dr_drae

steve-federowicz · 2014-09-23T19:24:24Z

Ok, so how does this sound as a temporary solution?

We start by loading models that we know came from Simpheny and have KEGG IDs.
- This ensures that most of the primary metabolic universal components will have metabolite entries that contain a valid KEGG ID.

Since we already have a column for the KEGG ID in the metabolite table, a simple `select * from metabolite where kegg_id is null;` should do the trick for keeping track of metabolites that need curation.
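
For illustration, the curation query can be tried against a throwaway SQLite database. The `metabolite` table and `kegg_id` column come from the comment above; the other columns and the sample rows are assumptions made up for this sketch, and the real database may use a different engine and schema.

```python
import sqlite3

# In-memory database with a simplified, assumed metabolite table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metabolite (id INTEGER PRIMARY KEY, name TEXT, kegg_id TEXT)"
)
conn.executemany(
    "INSERT INTO metabolite (name, kegg_id) VALUES (?, ?)",
    [
        ("D-Glucose", "C00031"),
        ("Pyruvate", "C00022"),
        ("uncurated metabolite", None),  # loaded without a KEGG ID
    ],
)

# Metabolites that still need curation are the ones with no KEGG ID.
needs_curation = conn.execute(
    "SELECT name FROM metabolite WHERE kegg_id IS NULL"
).fetchall()
conn.close()
```
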

What do you think?

Actually, on re-reading, this is essentially what you originally proposed?

zakandrewking · 2014-09-24T00:53:08Z

Bingo 🎱 (don't read into that)

jslu9 · 2014-09-24T00:59:50Z

Yeah, right now I have it set up so that a universal metabolite with a
missing KEGG ID gets its KEGG ID filled in when another metabolite with
the same name and a KEGG ID is uploaded into the database.
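
A rough sketch of that backfill rule, assuming universal metabolites are keyed by name in a plain dictionary. `upload_metabolite` is a hypothetical helper, and the choice never to overwrite an existing KEGG ID is an assumption of this sketch, not something stated above.

```python
from typing import Dict, Optional


def upload_metabolite(universal: Dict[str, Optional[str]],
                      name: str,
                      kegg_id: Optional[str]) -> None:
    """Add a metabolite by name, or fill in a missing KEGG ID on an existing one."""
    if name not in universal:
        # First time this name is seen: store it, with or without a KEGG ID.
        universal[name] = kegg_id
    elif universal[name] is None and kegg_id is not None:
        # Same name, and the stored entry lacks a KEGG ID: backfill it.
        universal[name] = kegg_id
    # Otherwise keep the stored KEGG ID unchanged (assumed behavior).


universal: Dict[str, Optional[str]] = {}
upload_metabolite(universal, "Pyruvate", None)       # from a model without KEGG IDs
upload_metabolite(universal, "Pyruvate", "C00022")   # later model supplies the ID
```
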

zakandrewking · 2014-10-15T17:55:18Z

- universal metabolite table and page
- universal reaction table and page

jslu9 · 2014-10-17T18:16:34Z

Just talked to Jon about putting KEGG IDs into his newly updated SBML models, and he said that he personally doesn't think KEGG IDs are good universal IDs. He mentioned that MetaNetX is better.

steve-federowicz · 2014-10-17T22:37:35Z

Hmmmm ok, I would be fine with MetaNetX. I was always pretty impressed with their atom mapping work. I'm not sure it has been fully implemented within MetaNetX yet, but I think it will be, and at that point it will be a pretty sophisticated and dominant resource. The downside is that KEGG has a lot of visibility outside of constraint-based sysbio, so if we go with MetaNetX IDs we potentially lose some visibility. The upside is that Costas and MetaNetX aren't going anywhere and will only continue to get better. Costas is also a friendly lab, so if an official collaboration needed to happen or larger things were to move forward, it would likely be a good situation.

zakandrewking · 2014-10-17T22:48:40Z

We aren't technically using KEGG IDs as universal IDs: we are using KEGG to
generate universal BiGG IDs. So we don't have to limit ourselves to one
type of external reference ID. It's worth thinking about this more.

Does Jon already have MetaNetX IDs for his models?
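
One way to picture a universal BiGG ID that carries several kinds of external references is sketched below. `UniversalMetabolite`, `add_reference`, the source labels, and the example BiGG ID are all hypothetical, and the MetaNetX value is a placeholder rather than a real identifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UniversalMetabolite:
    """A universal BiGG ID plus any number of external reference IDs."""
    bigg_id: str
    external_ids: Dict[str, List[str]] = field(default_factory=dict)

    def add_reference(self, source: str, identifier: str) -> None:
        """Attach an external ID under its source name, skipping duplicates."""
        ids = self.external_ids.setdefault(source, [])
        if identifier not in ids:
            ids.append(identifier)


glucose = UniversalMetabolite("glc__D")  # illustrative BiGG-style ID
glucose.add_reference("kegg", "C00031")
glucose.add_reference("kegg", "C00031")  # duplicate, ignored
glucose.add_reference("metanetx", "MNX_PLACEHOLDER")  # placeholder, not a real ID
```

Keeping the references in a mapping means a new external database can be added later without changing the universal ID itself.
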

jslu9 · 2014-10-18T02:35:50Z

I don't think so, actually. He might have some, but they're not in his SBML
files for certain. But today I discussed with Jon pulling out the KEGG IDs and
CAS numbers and then putting them into his cobrapy objects. He'll be sending
his new models (with KEGG IDs) once he's done updating his Python script.

draeger · 2014-10-19T18:29:43Z

Let's talk about all this on Tuesday during code talk. I think this is
very important and deserves a few words of direct discussion.


zakandrewking added this to the submission milestone Sep 4, 2014

jslu9 closed this as completed Oct 17, 2014

jslu9 reopened this Oct 17, 2014

jslu9 closed this as completed Nov 3, 2014
