OSP Molecules catalogue #10

Open
jgonlop3 opened this Issue May 17, 2017 · 31 comments

jgonlop3 commented May 17, 2017

Hi, I'm new to OSP, and I think a catalogue of molecules with preconfigured properties would be a good idea for new users like me. With this information, I think PK-Sim could be very useful for clinical use.

jgonlop3 changed the title from Molecules catalogue to OSP Molecules catalogue May 17, 2017

msevestre May 17, 2017

Member

Great idea

jgonlop3 May 17, 2017

I don't have much of a programming background, but I'm very interested in this topic. How can I help? Any ideas on how to lead and plan the work?

msevestre referenced this issue May 17, 2017

Closed

OSP Portable #9

msevestre May 17, 2017

Member

@jgonlop3
The first major task is to compile this ever-growing list of preconfigured molecules.

For each molecule, the following parameters are required:

  • Name
  • Lipophilicity
  • MolWeight
  • Solubility at Ref pH
  • Ref pH
  • pKa values

Then, ideally, some processes describing how the drug interacts with different enzymes, transporters, etc.

The actual format could be:

  • a JSON file. Here is an example of how a compound is read by PK-Sim to perform batch simulations:
{
  "Name": "inhibitor",
  "Lipophilicity": 2,
  "FractionUnbound": 0.6,
  "MolWeight": 3.0E-7,
  "SolubilityAtRefpH": 1.0E-7,
  "RefpH": 9,
  "PkaTypes": [{
      "Type": "Acid",
      "Value": 8
  }],
   "PartialProcesses": [{
      "MoleculeName": "CYP3A4",
      "InternalName": "MixedInhibition",
      "DataSource": "Lab",
      "ParameterValues": {"Ki_c": 1,"Ki_u": 1}
  }],
  "CalculationMethods": ["Cellular partition coefficient method - Rodgers and Rowland"]
}
  • a template database file: It is possible to export a compound to a user template database
  • ...

Having a JSON file with all the compound properties would offer many advantages. One of them is that the file would be on GitHub, so anyone could participate and add to or modify the catalogue of molecules.

So far, no programming knowledge is required. Only a text editor :)

In PK-Sim, a simple change would have to be implemented to allow for:

  • Export of one or more compounds to JSON (so that they can be added to the catalog)
  • Import a compound from the molecule catalog
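
As a rough illustration of the import side, a catalogue stored as a JSON array of compounds could be read with any JSON library. The sketch below uses Python. The single-array layout and the `load_catalogue` and `find_compound` helpers are assumptions made for illustration, not PK-Sim behaviour; the field names and the list of required parameters come from the example and list above.

```python
import json

# Hypothetical catalogue: a JSON array of compound objects that use the
# field names from the batch-simulation example above.
CATALOGUE_TEXT = """
[
  {
    "Name": "inhibitor",
    "Lipophilicity": 2,
    "FractionUnbound": 0.6,
    "MolWeight": 3.0E-7,
    "SolubilityAtRefpH": 1.0E-7,
    "RefpH": 9,
    "PkaTypes": [{"Type": "Acid", "Value": 8}]
  }
]
"""

# Parameters listed as required in the comment above.
REQUIRED_FIELDS = ["Name", "Lipophilicity", "MolWeight",
                   "SolubilityAtRefpH", "RefpH", "PkaTypes"]


def load_catalogue(text):
    """Parse the catalogue and fail early if a required field is missing."""
    compounds = json.loads(text)
    for compound in compounds:
        missing = [f for f in REQUIRED_FIELDS if f not in compound]
        if missing:
            name = compound.get("Name", "<unnamed>")
            raise ValueError(f"{name}: missing fields {missing}")
    return compounds


def find_compound(compounds, name):
    """Return the first compound with the given name, or None."""
    return next((c for c in compounds if c["Name"] == name), None)


catalogue = load_catalogue(CATALOGUE_TEXT)
inhibitor = find_compound(catalogue, "inhibitor")
print(inhibitor["Name"], inhibitor["Lipophilicity"])
```

A check like this could also run automatically on every proposed change to the catalogue file, so that incomplete entries are caught before they are merged.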

jgonlop3 May 17, 2017

Perfect! I think it's easy to do. I'll start today! I'll make a LibreOffice Calc file with these properties. I think Calc is easier for non-programmers than JSON. I agree with your comment: the ideal situation would be a PK-Sim option to import/export the properties of molecules.

msevestre May 17, 2017

Member

@Yuri05 @TheiBa Do you have any suggestions?

TheiBa May 17, 2017

Member

TheiBa May 17, 2017

Member

and further qualified models will be developed and posted as discussed in #11

msevestre May 17, 2017

Member

TheiBa May 17, 2017

Member

jgonlop3 May 17, 2017

Yes, I agree with @msevestre. No quality = not useful. But it's a good starting point, as @TheiBa said. There is a lot of information about molecular properties in databases, but we need to use the properties that give the best results with OSP. I am thinking of starting with some of these reference values and then validating the whole model in PK-Sim. The set of properties for any molecule must be coordinated with an adequate set of anatomical and physiological values, etc. In summary, we need to establish initial molecular values that work with the reference PBPK model implemented in PK-Sim.

msevestre May 18, 2017

Member

@jgonlop3

"I'll make a LibreOffice Calc file with these properties"

Be aware that the number of properties is not fixed. That's why an Excel/Calc table won't work.
For example, in the JSON text I pasted above, you can see that there is also a PartialProcess defined for CYP3A4 of type MixedInhibition.
A compound can have many different processes defined and, as you can imagine, this will be very hard to model in a table.
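
To make the table problem concrete, here is a small sketch in Python that flattens two compounds into spreadsheet-style rows. The second compound, its process names and its parameter names are made up for illustration; only the CYP3A4 entry is taken from the example above.

```python
# Two compounds: the one from the example above, and a hypothetical one
# with three processes (names are placeholders, not real PK-Sim processes).
compounds = [
    {
        "Name": "inhibitor",
        "PartialProcesses": [
            {"MoleculeName": "CYP3A4", "InternalName": "MixedInhibition",
             "ParameterValues": {"Ki_c": 1, "Ki_u": 1}},
        ],
    },
    {
        "Name": "hypothetical_drug",
        "PartialProcesses": [
            {"MoleculeName": "EnzymeA", "InternalName": "ProcessTypeA",
             "ParameterValues": {"p1": 1}},
            {"MoleculeName": "TransporterB", "InternalName": "ProcessTypeB",
             "ParameterValues": {"p1": 1, "p2": 2}},
            {"MoleculeName": "EnzymeC", "InternalName": "ProcessTypeA",
             "ParameterValues": {"p1": 1}},
        ],
    },
]


def flatten(compound):
    """Turn one compound into a flat row, one group of columns per process."""
    row = {"Name": compound["Name"]}
    for i, process in enumerate(compound.get("PartialProcesses", []), start=1):
        row[f"Process{i}_Molecule"] = process["MoleculeName"]
        row[f"Process{i}_Type"] = process["InternalName"]
        for key, value in process["ParameterValues"].items():
            row[f"Process{i}_{key}"] = value
    return row


rows = [flatten(c) for c in compounds]
# The table needs the union of all columns, most of which stay empty per row.
columns = sorted({column for row in rows for column in row})
for row in rows:
    print(row["Name"], "fills", len(row), "of", len(columns), "columns")
```

Every additional process or parameter adds new columns for the whole table, whereas the nested JSON structure simply gains one more list entry.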

jgonlop3 May 18, 2017

Yes, I know the limitations of a table. I will try to build a database (SQLite or LibreOffice Base are good options for this purpose). First of all, I need to design the tables and relations. I thought of a spreadsheet like Calc at first because of its simplicity. But I think you are right and a database is the better option. I will make a database.

msevestre May 18, 2017

Member

@jgonlop3 the problem with a database is that it is very hard to extend if you are not a programmer.
Why not try the JSON file option? This is a simple text file that can be edited in almost any decent text editor.
The beauty of this approach is that anyone could submit a patch on GitHub, and it would be fairly easy to see what has changed between two versions (text file).
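
As a small illustration of that last point, the Python sketch below compares two versions of a compound entry line by line, which is essentially what a GitHub diff view shows. The changed lipophilicity value is invented for the example.

```python
import difflib
import json

# Two versions of the same (shortened) compound entry. The new
# lipophilicity value is made up to produce a visible change.
old = {"Name": "inhibitor", "Lipophilicity": 2, "RefpH": 9}
new = {"Name": "inhibitor", "Lipophilicity": 2.5, "RefpH": 9}


def as_lines(compound):
    """Serialize with stable key order so that only real changes show up."""
    return json.dumps(compound, indent=2, sort_keys=True).splitlines()


diff = list(difflib.unified_diff(as_lines(old), as_lines(new),
                                 fromfile="before", tofile="after",
                                 lineterm=""))
print("\n".join(diff))
```

Only the changed line is marked with `-` and `+`; a binary spreadsheet or database file would not give this kind of readable history.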

For the moment, however, you can start with a simple table as you suggested, which would have to be converted to another format once it is somewhat ready. Thoughts?

jgonlop3 May 18, 2017

@msevestre I don't like the JSON format because it looks like programming code! It is difficult for me to read and not very intuitive... but you're right again! It is simple, and that is the key. I will do it in JSON.

msevestre May 18, 2017

Member

msevestre May 18, 2017

Member

See Open-Systems-Pharmacology/PK-Sim#168 for the upcoming implementation in PK-Sim that will allow users to create a JSON block automatically.

jgonlop3 May 18, 2017

Thank you @msevestre, I'll read it.

jgonlop3 May 19, 2017

I have a question: I am writing a JSON file for vancomycin. I took the properties from Chemicalize.com. I have a lot of useful information, but I don't know where to put it within the JSON structure. @msevestre, can you help me? I am trying to simplify and take only the most important properties, but I don't know where to place this information in the JSON example you sent. For example: how can I write the solubility at different pH values? I think this is very easy for you... ;)

msevestre May 19, 2017

Member

@jgonlop3
What PK-Sim supports right now is fairly limited.

For example, you can only enter one solubility, at one given pH. These properties are called SolubilityAtRefpH and RefpH, respectively. We may have to extend the JSON format in the future to support all the capabilities of PK-Sim. For example:

{
"Name": "inhibitor",
"Lipophilicity": 2,
"FractionUnbound": 0.6,
"MolWeight": 3.0E-7,
"SolubilityAtRefpH": 1.0E-7,
"RefpH": 9,
...
}

jgonlop3 May 19, 2017

jgonlop3 May 19, 2017

jgonlop3 May 19, 2017

Regarding solubility data: How can I write 2 solubilities at 2 different pHs? Can you give me an example?

msevestre May 19, 2017

Member

@jgonlop3

"Is it planned that PK-Sim can export data of molecules in json format?"

Yes: Open-Systems-Pharmacology/PK-Sim#168
This will be released in 7.2.0. As soon as the feature is implemented, however, you'll be able to try it out with the pre-release.

"Regarding solubility data: How can I write 2 solubilities at 2 different pHs? Can you give me an example?"

This is not supported by PK-Sim. Instead, PK-Sim uses RefpH, the reference solubility and the solubility gain per charge to predict Sol = f(pH).

Here is an extract of Chapter 15 describing how PK-Sim handles solubility:

The solubility can be specified together with the type of measurement or the medium used (first column, Experiment). The corresponding unit can be chosen from the drop-down menu in the second column (Solubility at Ref-pH). For charged compounds, the pH value at which the solubility of the compound was measured should be given in the third column (Ref-pH). In the fourth column, the Solubility gain per charge can be modified, which defines the factor by which the solubility increases with each ionization step. In order to calculate the charge of the molecule, the fraction of each microspecies is calculated according to the Henderson-Hasselbalch equation for a given pH. This is done across the entire pH-range such that the fractions are used to calculate the probability with which a molecule is in a certain ionization state. Based on this information, the pH-dependent solubility of molecules with one or more ionizable groups is calculated. By clicking on Show Graph, the pH-dependent solubility across the whole pH range calculated based on the experimental solubility at the defined pH is shown. For neutral compounds the input fields Ref-pH and Solubility gain per charge and the graph are irrelevant.
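
The idea in this extract can be sketched for the simplest case, a compound with a single acidic group. This is a simplified illustration in Python, not PK-Sim's actual implementation: the way the neutral and ionized fractions are combined (a fraction-weighted sum) and the gain value of 1000 are assumptions. The pKa of 8, the Ref-pH of 9 and the solubility of 1.0E-7 are taken from the example compound earlier in this thread.

```python
def ionized_fraction_acid(ph, pka):
    """Henderson-Hasselbalch: fraction of a monoprotic acid that is ionized."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def relative_solubility(ph, pka, gain_per_charge):
    """Assumed weighting: the neutral species counts 1, the ionized
    species counts `gain_per_charge` (one ionization step)."""
    ionized = ionized_fraction_acid(ph, pka)
    return (1.0 - ionized) + ionized * gain_per_charge


def solubility_at(ph, ref_solubility, ref_ph, pka, gain_per_charge):
    """Scale the solubility measured at ref_ph to another pH."""
    return (ref_solubility
            * relative_solubility(ph, pka, gain_per_charge)
            / relative_solubility(ref_ph, pka, gain_per_charge))


# Solubility curve for the example compound, in the same unit as the input.
for ph in (2.0, 7.4, 9.0, 11.0):
    print(ph, solubility_at(ph, 1.0e-7, 9.0, 8.0, 1000.0))
```

With one measured solubility at the Ref-pH, a curve over the whole pH range follows from the pKa and the gain factor, which is why a second measured solubility cannot be entered.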

jgonlop3 commented May 19, 2017

Thanks @msevestre

jgonlop3 May 22, 2017

In PK-Sim I can select different units, but in the JSON file you sent me as an example this is not specified. Where can I record the units (for example mg/L) of the solubility at Ref-pH in the JSON file?

"SolubilityAtRefpH": 0.177827, <- Which units?????
"RefpH": 7.4,

msevestre May 22, 2017

Member

This is a very good point. The JSON compound format does not support units at the moment. That means that, for now, all values should be defined in the base unit of their dimension.
PK-Sim and MoBi use the following dimension and unit definitions:

https://github.com/Open-Systems-Pharmacology/OSPSuite.Dimensions/blob/master/OSPSuite.Dimensions.xml

For example, Concentration (molar) has a base unit of umol/l:
https://github.com/Open-Systems-Pharmacology/OSPSuite.Dimensions/blob/master/OSPSuite.Dimensions.xml#L165

When the export is implemented in PK-Sim, values will be exported in the correct unit by default.

Does that make sense?
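
Until units are part of the format, values have to be converted by hand before they are written to the file. A minimal Python sketch for the Concentration (molar) dimension mentioned above follows. The `to_base_unit` helper and its unit table are hypothetical, and which dimension applies to each JSON field still has to be looked up in the dimensions file linked above.

```python
# Conversion factors into umol/l, the base unit of "Concentration (molar)".
# The table only covers a few common units and is not taken from the XML.
TO_UMOL_PER_L = {
    "mol/l": 1e6,
    "mmol/l": 1e3,
    "umol/l": 1.0,
    "nmol/l": 1e-3,
}


def to_base_unit(value, unit):
    """Convert a molar concentration to the base unit umol/l."""
    try:
        factor = TO_UMOL_PER_L[unit]
    except KeyError:
        raise ValueError(f"Unknown unit: {unit}") from None
    return value * factor


print(to_base_unit(1.5, "mmol/l"), "umol/l")
```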

jgonlop3 May 22, 2017

Understood. Is it possible to add a comment with the units in the JSON file? Only as an explanation. If it is not possible, don't worry. I understand.

PavelBal May 23, 2017

Member

Why JSON? I am not a fan of XML, but wouldn't it be more consistent to use XML for compound definitions? OSPS uses XML for exporting model blocks and whole simulations (MATLAB/R), so no new structure would have to be devised. Furthermore, schemas could help in validating the molecules catalogue.

The advantage of JSON being more lightweight should not play any role, as the data will be used locally.

Yuri05 May 23, 2017

Member

Defining a schema would be a good idea in any case.
The JSON format also supports schema definitions, and validation can even be performed online:
http://www.jsonschemavalidator.net/
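
For illustration, here is what a small schema for the compound example from earlier in this thread might look like, together with a deliberately tiny checker written in Python. The schema content is an assumption (no schema has been agreed on here), and the checker only understands the `type`, `required` and `properties` keywords; a real validator, such as the online tool above or the `jsonschema` Python package, covers the full specification.

```python
# Hypothetical JSON Schema-style description of a compound.
COMPOUND_SCHEMA = {
    "type": "object",
    "required": ["Name", "Lipophilicity", "MolWeight",
                 "SolubilityAtRefpH", "RefpH"],
    "properties": {
        "Name": {"type": "string"},
        "Lipophilicity": {"type": "number"},
        "MolWeight": {"type": "number"},
        "SolubilityAtRefpH": {"type": "number"},
        "RefpH": {"type": "number"},
    },
}

# Mapping from schema type names to Python types (subset only).
PYTHON_TYPES = {"string": str, "number": (int, float), "object": dict}


def validate(instance, schema):
    """Return a list of error messages; an empty list means the check passed."""
    if not isinstance(instance, PYTHON_TYPES[schema["type"]]):
        return [f"expected {schema['type']}"]
    errors = []
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property '{name}'")
    for name, subschema in schema.get("properties", {}).items():
        if name in instance:
            value = instance[name]
            expected = PYTHON_TYPES[subschema["type"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, expected):
                errors.append(f"'{name}' should be of type {subschema['type']}")
    return errors


good = {"Name": "inhibitor", "Lipophilicity": 2, "MolWeight": 3.0e-7,
        "SolubilityAtRefpH": 1.0e-7, "RefpH": 9}
bad = {"Name": "incomplete", "Lipophilicity": "high"}
print(validate(good, COMPOUND_SCHEMA))
print(validate(bad, COMPOUND_SCHEMA))
```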

msevestre May 23, 2017

Member
  • JSON is the format that we use for batch simulations (see links above). We already support this format out of the box.
  • JSON is much more readable than XML.
  • Any new API now favors JSON over XML.
  • It will be loaded right from github.com (albeit with a local fallback option). We will use this file as a standard API endpoint to retrieve the compound list online.

That being said, the format we have now, which I copied as an example above, will probably need to be extended. In particular, I think we need to add:

  • Unit for each parameter value (this will serve documentation purposes)
  • ValueDescription (text describing where the value comes from; an absolute must so that the compound value can be trusted)
  • Description of the compound itself
  • Alternative values (a way to define more than one lipophilicity, for example, with different sources)
  • fu: for which species?
  • ...probably much more
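
One possible shape for such an extended entry is sketched below in Python. Every class and field name here is hypothetical and the values are placeholders; the sketch only shows how a unit, a value description and alternative values could sit next to each other, not what the final structure will be.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ParameterValue:
    value: float
    unit: str               # for documentation purposes
    value_description: str  # where the value comes from


@dataclass
class Compound:
    name: str
    description: str
    # Alternative values: several entries for one parameter, one per source.
    lipophilicity: list = field(default_factory=list)


compound = Compound(
    name="example compound",
    description="Placeholder entry, not real data",
    lipophilicity=[
        ParameterValue(2.0, "Log Units", "hypothetical source A"),
        ParameterValue(2.3, "Log Units", "hypothetical source B"),
    ],
)

# asdict converts nested dataclasses too, so the result is plain JSON data.
print(json.dumps(asdict(compound), indent=2))
```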

@jgonlop3 While JSON will remain, the actual data structure will change now that we are moving from internal use to external use. It might be wise to wait a bit until the structure is finalized.

CcVn Jun 18, 2017

Dear all,

The "incremental" approach proposed by TheiBa sounds reasonable. DrugBank is a good source to start with: I have used it quite often as a starting point when building a new PBPK model from scratch for reference compounds, and the quality of the information seems sufficient for a first version of a model. The compound file can then be gradually completed and optimized with other validated in vitro or ex vivo data found elsewhere, up to a real "validation" against clinical data (first PK in healthy people, then in disease populations, etc.). This approach may require a dedicated, standardized field within the compound file indicating the degree of quality/validation of the file, from the lowest grade (pure in silico data) to the highest grade (file validated with various clinical datasets). This way, people would know how much confidence they can place in their simulations.

Best regards,
