Provide compound ID for all files #55

davidlmobley · 2017-08-25T14:21:38Z

I think we should probably move towards a model where all ligands (or guests) in each benchmark set have an appropriate, unique, paper-specific numerical compound ID, rather than the current model where this is dependent on what set we're looking at. For example:

CB7 Tables 1&2: Has unique CID we assigned
GDCC Table 3: Has unique CID we assigned, but this will break if we want to provide structures docked into hosts, as there are two hosts but only one set of compound IDs
GDCC Table 4: Has unique CID we assigned
CD Tables 5 and 6: Has unique CID we assigned
lysozyme Tables 7 and 8: No CIDs, uses compound names only
BRD4(1) Table 9: Uses heterogeneous identifiers -- "Compound 4", "alprazolam", "Bzt-7", "JQ1(+)" etc.; this is probably the worst offender since some of these are pretty unsuitable as filenames due to special characters and/or spaces (e.g. some tools can't load files with spaces in their filenames and/or handle some of these special characters).

@GHeinzelmann @nhenriksen - thoughts? My preference I think is to make sure every set has a unique numerical compound ID in the tables and that this is used for all of the relevant files.
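As a rough illustration of the idea, here is a minimal sketch of how heterogeneous names could be mapped to numerical IDs and used for filenames. The four names are the ones quoted above for BRD4(1); their ordering, the `ligand-<ID>.mol2` naming pattern, and all function names are assumptions made for this example, not what the repository actually uses.

```python
import re

# Names quoted in this issue for BRD4(1) Table 9; the order is arbitrary here.
names = ["Compound 4", "alprazolam", "Bzt-7", "JQ1(+)"]

# Conservative whitelist: letters, digits, dot, underscore, hyphen.
SAFE = re.compile(r"[A-Za-z0-9._-]+")


def is_filename_safe(name):
    """Return True if the name has no spaces or special characters."""
    return SAFE.fullmatch(name) is not None


def assign_ids(names, start=1):
    """Map each name to a sequential numerical compound ID."""
    return {name: cid for cid, name in enumerate(names, start=start)}


def id_filename(cid, prefix="ligand", ext="mol2"):
    """Build a filename from a compound ID (hypothetical naming pattern)."""
    return f"{prefix}-{cid}.{ext}"


ids = assign_ids(names)
unsafe = [n for n in names if not is_filename_safe(n)]

for name, cid in ids.items():
    print(f"{cid}\t{id_filename(cid)}\t{name}")
```

The name-to-ID mapping printed at the end is the kind of thing that could live in a table or a separate markdown file, so the original names stay recoverable even if they are dropped from filenames.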

GHeinzelmann · 2017-08-25T15:26:26Z

That sounds good, and I think it can be done quickly. I'll change the ligand names to an assigned ID (from 1 to 10), and update the associated tables in the paper and in the README file.

GHeinzelmann · 2017-08-25T18:01:25Z

I'm working on the BRD4(1) benchmarks table in the main paper, and it won't fit on the page if I keep the ligand names but also add an extra ligand ID column (as done in the CD tables). Should I drop the ligand names altogether? They might not be essential since we are also providing the references.

davidlmobley · 2017-08-25T18:02:22Z

I'm all for dropping the ligand names, or if you really want to keep track of them, put them in footnotes or in a separate markdown file you link to.

GHeinzelmann · 2017-08-25T18:04:58Z

No, we can drop them. I only gave the ligands names so the table would look the same as the lysozyme one. I'll just give a number for each, which will also make the table look better (it was a little off-center before since it was too wide). Then I'll change the README table and the ligand file names.

davidlmobley · 2017-08-28T17:07:46Z

Resolved for bromodomains in #48; still needs to be done for lysozyme.
