Notes from Informatics Meeting Dec 2015 #364

Open
mattodd opened this issue Dec 2, 2015 · 41 comments

@mattodd
Member

mattodd commented Dec 2, 2015

Hangout (#363) took place. Thanks to everyone for giving up their time. Timezones are a challenge, so feel free to hold franchised discussions separately, but please report any thoughts/ideas back here for those who can't make them.

(I'm pasting notes here so that anyone can correct, or supplement, or add questions. If branching action items are needed, please create those separately. Once this post remains inactive for more than a week, it can be closed.)

ChemInfo: @lpatiny outlined the many capabilities of the Cheminfo system, and some future plans related to the development of an ELN incorporating many of these features. Overarching idea is that data are stored on a server, while calculations are carried out locally on the client computer - e.g. NMR data can be stored inside an ELN, but processing can be carried out live in the browser. Cheminfo can do a number of things of immediate use for the OSM project, all the way to allowing calculations of inter-atomic distances from rendered 3D models. There is an upper size limit for browser-based activities of 50K molecules, beyond which (for e.g. visualising Chembl) a web server can be used.
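
As a rough illustration of the inter-atomic distance calculation mentioned above, here is a minimal Python sketch. It is not Cheminfo's code (which runs in the browser); it simply reads the atom block of a V2000 molfile and measures the distance between two atoms. The water coordinates and the function names `read_atoms` and `distance` are made up for the example.

```python
import math

# Hypothetical example input: a hand-written V2000 molfile for water.
WATER_MOLFILE = """\
water
  hypothetical example

  3  2  0  0  0  0  0  0  0  0999 V2000
    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0
  1  3  1  0
M  END
"""


def read_atoms(molfile):
    """Return [(symbol, (x, y, z)), ...] from the atom block of a V2000 molfile."""
    lines = molfile.splitlines()
    # Line 4 is the counts line; its first three columns hold the atom count.
    atom_count = int(lines[3][0:3])
    atoms = []
    for line in lines[4:4 + atom_count]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def distance(atoms, i, j):
    """Distance between atoms i and j (1-based, as in the molfile bond block)."""
    return math.dist(atoms[i - 1][1], atoms[j - 1][1])


atoms = read_atoms(WATER_MOLFILE)
print(f"O-H distance: {distance(atoms, 1, 2):.4f}")
```

The same few lines would work on any molfile exported from a drawing package, provided it carries 3D coordinates rather than a flat 2D depiction.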

Q: Different Chemical Drawing Packages: what if we’re all using different drawing packages? Luc mentioned that there is already a comprehensive file converter (is this openbabel?). The mol file is the recommended type, which can be read by everything. Can this be included in future ELNs? The idea is to make sure that chemical information, from anyone, can be read? Chemdraw is still heavily embedded in labs, and people feel happy using it to generate camera-ready figures.

Q: What is the primary identifier for molecules? @cdsouthan suggested this is the PubChem CID since it can handle complexities such as salts and ee’s. A feature @mattodd wanted to have, which would help the bench chemist, would be for a system to spot when a molecule being worked on was already associated with a PubChem CID. Could this be captured by Luc’s system?
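
One way such a check could work, sketched in Python under some assumptions: the notebook already holds an InChIKey for the structure, and the lookup goes through PubChem's PUG REST interface (the URL pattern and the `IdentifierList` response shape below are as I understand that service; the function names are hypothetical). An InChIKey match is exact, so salts and stereoisomers would need separate handling.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def cid_lookup_url(inchikey):
    """Build the PUG REST URL that lists CIDs for an InChIKey."""
    return f"{PUG_REST}/compound/inchikey/{quote(inchikey)}/cids/JSON"


def parse_cids(payload):
    """Extract CIDs from a decoded PUG REST response; empty list if none."""
    return list(payload.get("IdentifierList", {}).get("CID", []))


def find_cids(inchikey):
    """Query PubChem (network call); any failure is treated as 'no match'."""
    try:
        with urlopen(cid_lookup_url(inchikey), timeout=10) as response:
            return parse_cids(json.load(response))
    except OSError:  # covers HTTP errors (e.g. not found) and network problems
        return []
```

An ELN could call `find_cids` when a structure is saved and show the CID next to the compound if the list is not empty.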

@MedChemProf : The Workflow Problem. At the moment it's essentially impossible to find relevant synthetic chemistry that people have carried out. @mattodd: This is a recurrent and major problem in OSM. We urgently need an ELN that is substructure-searchable. That this might be close is the only reason we’ve not carried on with the manual collection of synthetic data, which was impossibly cumbersome (though very useful for writing up papers). @MedChemProf : Need clearly defined workflows nonetheless - when people make molecules, or want to make molecules, what are the steps? (Do I have this right, Chase? We should e.g. do a flowchart, as a how-to but maybe somewhere like the wiki?)

Responsibility for data. The data generated by the OSM consortium are by default CC-BY. They can be anywhere, physically. However, who takes responsibility for maintaining the existence of the datasets? @lpatiny's view is that the data are kept available through there being many copies. This is likely to be the most effective strategy. But in the case of a researcher working with OSM at University X, using public ELN Y, who accepts responsibility for making the data permanently available? Should this nontrivial problem fall to the parent institution (“As an institution carrying out primary research we guarantee the permanence of the research record”) or should OSM attempt to guarantee this (perhaps through Luc’s strategy of redundancy) by becoming a legal entity in itself and sourcing funds, then using e.g. CLOCKSS? To be discussed further. Mat: There might be other reasons why OSM should become a non-profit in its own right (longevity, ability to raise funds from more diverse sources etc.), so we could talk about this later.

Desirable extras for Luc’s system:
i) When writing a procedure in English, for the system to understand which molecules are being discussed, i.e. an active link between table of reagents and the procedure.
ii) Automatic import of safety-related information from reagent table into the ELN.
iii) Auto-search of other resources, e.g. Chemspider synthetic pages, for related chemistry.
iv) @drc007 emphasized that though consideration of data location and provenance was important, the most important thing was further development of the front end of the cheminfo system, along the lines currently being pursued.

Open Science Prize:
The pre-application to the Wellcome Trust by Mat, Luc and others for the MolTrakr idea (#361) is for a major undertaking. The full proposal to the WT would be written if the pre-app is approved (people here are very welcome to be part of that if it's of interest, by the way). The open science prize is smaller, and relies on developing a prototype in 2016. There is a heavy emphasis on proposals that "unleash the power of open content and data to advance biomedical research". A possibility discussed at the end of the meeting between Chris Swain, Chris Southan, Luc and Mat involved a variant of the “who’s making what and where?” problem. Could we develop a tool (as part of Luc’s system, or as a component of the MolTrakr platform) that, through “knowing” what chemistry is being undertaken in a med chem project, is able to connect researchers working on the same or related chemistry in real time? If, for example, a Friedel-Crafts reaction is being carried out in Sydney, a flag would go up to say that a chemist in Indiana, who also has an open lab notebook, was working on the same chemistry earlier that day. This would create a network that is social (because we want to see what other people are doing), but built on raw data captured in lab notebooks. There is also the novel emphasis on reactions that are underway, rather than final compounds. It's also unlikely that anything proprietary could be used for this, since openness provides a strength: more open lab notebooks = more people able to work with you to enhance your research. There is an obvious possible "ResearchGate" model here for monetisation, along the lines of "27 people are working on this chemistry today! Sign up for details.", though I would not want to pursue this as part of this prize. (Chris Southan said this reminded him of Biostars. Mat: not quite - that’s more of a Stackexchange site for expertise. We’d be talking about something different that helps you find people who are doing things related to what you’re doing, using passive, behind-the-scenes searching rather than Q&A. More like the ads that appear in Gmail, without that uneasy feeling.)

Immediate ways forward: Luc’s ELN will be ready for testing at the end of 2015. However, we can already start posting data in the right format for automated indexing and extraction, via posting RXL, mol and JCAMP files. I'll make a separate issue on what’s needed from the bench chemist. Done: #365.

Not discussed in this meeting:

  1. How best to store data in the Master Sheet? Strings are important for now, but hopefully generation of molecule-related data/strings is more automated in future.

  2. The SGC has a ChemReg system. OSM have been offered a trial. Does anyone have any time to evaluate, as a place to store biological activity data for OSM compounds?

  3. (unrelated, but occurred to me as we were talking). The HRMS calculator Luc showed: which molecular formulae best match the HRMS peak found. Could be adapted for elemental analysis calculator? Given % found from elemental analysis, and a suspected molecular formula of the analyte, which solvents/salts/water and in what ratio would provide the best fit to the observed data? This would automate the (sometimes rather humiliating) manual shoehorning we sometimes have to do with elemental analysis data in which we try to make the data fit some realistic combination of molecules.
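
The elemental analysis idea in point 3 could be prototyped along these lines. This is only a sketch in Python, not Luc's HRMS calculator: it handles simple formulae without brackets, tries one candidate adduct at a time over a fixed grid of molar ratios, and scores each combination by the summed absolute deviation from the found percentages. The function names, the ratio grid and the rounded atomic masses are all choices made for the example.

```python
import re

# Rounded standard atomic masses for a few common elements (illustrative subset).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
               "S": 32.06, "Cl": 35.45, "F": 18.998, "Br": 79.904}


def parse_formula(formula):
    """Turn a simple formula such as 'C2H6O' into {'C': 2, 'H': 6, 'O': 1}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def percent_composition(parts):
    """Mass % of each element for a list of (formula, molar ratio) pairs."""
    totals = {}
    for formula, ratio in parts:
        for element, n in parse_formula(formula).items():
            totals[element] = totals.get(element, 0.0) + ratio * n * ATOMIC_MASS[element]
    mass = sum(totals.values())
    return {element: 100.0 * m / mass for element, m in totals.items()}


def best_fit(analyte, found, adducts, ratios=(0, 0.25, 0.5, 0.75, 1, 1.5, 2)):
    """Return (error, adduct, ratio) minimising the summed absolute deviation."""
    best = None
    for adduct in adducts:
        for ratio in ratios:
            calc = percent_composition([(analyte, 1), (adduct, ratio)])
            error = sum(abs(calc.get(el, 0.0) - pct) for el, pct in found.items())
            if best is None or error < best[0]:
                best = (error, adduct, ratio)
    return best
```

For example, `best_fit("C8H10N4O2", {"C": ..., "H": ..., "N": ...}, ["H2O", "CH2Cl2", "C4H8O2"])` would report which of water, dichloromethane or ethyl acetate, and at what ratio, best explains the found C/H/N values. A real tool would also need to combine several adducts at once and report how close the runner-up fits are.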

@drc007

drc007 commented Dec 2, 2015

On 2 Dec 2015, at 00:07, Mat Todd notifications@github.com wrote:

Q: Different Chemical Drawing Packages: what if we’re all using different drawing packages? Luc mentioned that there is already a comprehensive file converter (is this openbabel http://openbabel.org/wiki/Main_Page?). The mol file is the recommended type, which can be read by everything. Can this be included in future ELNs? The idea is to make sure that chemical information, from anyone, can be read? Chemdraw is still heavily embedded in labs, and people feel happy using it to generate camera-ready figures.

A quick comment about OpenBabel (full disclosure: I’m a contributor to OpenBabel): it is released under the GPL license. I think that you are simply calling it as a web service, so this probably does not cause any issues, but I thought I’d mention it.

There are many chemical drawing packages that now have both desktop and web flavours, (even Chemdraw now has a mobile version) so I think users will expect to be able to use the same chemical drawing package for all their needs.

ChemDoodle web components are already moving along these lines https://web.chemdoodle.com/ and there is clear overlap with the sort of things Luc has so nicely demonstrated. Released under the GNU General Public License.

Similarly Marvin JS https://www.chemaxon.com/products/marvin/marvin-js/ offers drawing and querying tools; it requires a commercial license.

Elemental http://www.dotmatics.com/products/elemental/ is another JavaScript-based chemical drawing application; it is free, but I'm not sure of the licensing.

Ketcher http://lifescience.opensource.epam.com/ketcher/ is a JavaScript-based drawing package, free and open source; I think it requires a server back end to provide some of the functionality. Uses the GNU Affero General Public License.

JSME is a JavaScript port of JME. JSME is released under a BSD license, but I’m not sure about JME (which is only free for non-commercial applications) or its situation with respect to open source. No desktop version.

@lpatiny
Member

lpatiny commented Dec 2, 2015

JME does not matter, it is old java stuff so it became useless and was replaced by JSME.

ChemDoodle web component: we had trouble dealing with the GPL license. The companies expect that everything that "touches" the web component has to be GPL. This means that if you include any of their components on a webpage, the server should be GPL, as well as the server to which you send the request. We have therefore rewritten the JCAMP converter and a much better JCAMP visualizer.

I hope indeed that a call to a web server that runs openbabel is not an issue. @drc007 maybe you could check this. If one webpage uses a web service backed by openbabel, should the original server also be GPL? So it could not be Windows, for example?

In general I'm quite tired of the GPL license; it seems a lawyer's problem and I don't feel like losing any time on those things. This is the reason we do everything in MIT (or BSD).

To draw online we will probably go towards openChemLib 👍
http://www.cheminfo.org/?viewURL=http%3A%2F%2Fcouch.cheminfo.org%2Fcheminfo%2Fd9498d0a2ea400ea71efec8840fc186e%2Fview.json&loadversion=true&fillsearch=6.1.2+OCL+molecule+editor

It can enforce stereochemistry! You need to specify if it is a racemate, diastereoisomers, ... (molfile v3000 enhanced stereochemistry) and it is MIT.

@drc007

drc007 commented Dec 2, 2015

@lpatiny I'm not a lawyer so I can't give a ruling; I just know that this sort of thing has been an issue in the past.

@drc007

drc007 commented Dec 2, 2015

@lpatiny does openChemLib support query atoms etc?

@lpatiny
Member

lpatiny commented Dec 2, 2015

Yes. We don't have a demonstration yet, but this is something we need to show very soon. We will provide the Wikipedia search engine as a demonstration of substructure search in JavaScript in the web browser, with query features.

@drc007

drc007 commented Dec 2, 2015

@lpatiny Excellent.

I presume it is all written in javascript?

@lpatiny
Member

lpatiny commented Dec 2, 2015

Well, technically not ... it is written in Java: https://github.com/Actelion/openchemlib
But we converted it to JavaScript using GWT: https://github.com/cheminfo/openchemlib-js

@MedChemProf
Member

@mattodd - Thank you for the excellent meeting summary. I did go back to re-read the "How-To's" that were already available on the wiki on compound number registration. There was a single line that I previously must have missed that has made searching more productive. In the instructions, it does mention to check the "Use simple text search" box. By doing that I was able to get much more relevant and focused search results. While reaction searching is still not possible, running several compound searches using SMILES or InChI was possible to find compounds of interest as either starting materials or products. By looking at the Master Sheet, a chemist could make educated guesses at intermediates necessary to make those particular products and then search the notebooks for those particular intermediates to view the experimental conditions.

What I am still a little unclear on is whether or not we should be reproducing our experimentals in the OSM LabTrove ELN or only maintain them in the publicly available LabArchives ELN that we are currently using? Please let me know what is preferred by the group.

@lpatiny - I also wanted to thank you for your demo of the cheminfo.org tools available for use in the OSM program. I was looking over the cheminfo.org site in Firefox instead of Chrome and I noticed that under the 'Chemistry' menu selection, there was a subsection called 'Parsing Data' and then under that there was a choice for 'SDF 3D Plot'. Would it be possible to add the 'SDF 3D Plot' as an option to the tools arrayed in the OSM project? I have been importing the data into another program to examine the data graphically, but I think it would be advantageous to have it available on the web interface.
Also, can metrics such as '# Heavy Atoms', 'Ligand Efficiency (LE)' and 'Lipophilic Efficiency (LipE)' be calculated and added to the available numbers for the compounds in the Master Sheet?
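
For reference, these three metrics are simple enough to sketch in a few lines of Python. The sketch assumes the usual textbook definitions (LE ≈ 1.37 × pIC50 / heavy atoms, in kcal/mol per heavy atom near 300 K, with IC50 standing in for a binding constant; LipE = pIC50 − logP) and assumes the Master Sheet can supply a molecular formula, an IC50 in nM and a logP value. The function names are hypothetical.

```python
import math
import re


def heavy_atom_count(formula):
    """Count non-hydrogen atoms in a simple molecular formula, e.g. 'C9H8O4'."""
    return sum(int(number or 1)
               for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula)
               if element != "H")


def pic50(ic50_nanomolar):
    """pIC50 = -log10(IC50 in mol/L), with the IC50 supplied in nM."""
    return 9.0 - math.log10(ic50_nanomolar)


def ligand_efficiency(ic50_nanomolar, heavy_atoms):
    """LE ~ 1.37 * pIC50 / heavy atoms (kcal/mol per heavy atom, near 300 K)."""
    return 1.37 * pic50(ic50_nanomolar) / heavy_atoms


def lipophilic_efficiency(ic50_nanomolar, logp):
    """LipE (also written LLE) = pIC50 - logP."""
    return pic50(ic50_nanomolar) - logp
```

A compound with 20 heavy atoms, an IC50 of 100 nM and a logP of 3 would come out at LE ≈ 0.48 and LipE = 4. Which logP to use (measured, or calculated and by which method) would need to be agreed on and recorded alongside the numbers.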

Finally, @drc007 @cdsouthan @mattodd , I am more than willing to pitch in on the informatics grant process. Please let me know what might be needed or where you may want some help.

@alintheopen
Member

Hi all, looks very interesting.
I have emailed Sydney University's data storage team to see if we can make one of our repositories completely open. This way the Sydney team could save all non-proprietary (and proprietary) data files to the research data store (RDS) and save relevant links to the ELN. If so, my next question is to ask if other institutions/individuals can push files to the RDS. Alternatively, should we use a GitHub repository or Dropbox folder or something else entirely for this purpose?

Also, it would be really helpful to have a 'how to' document to assist in the preparation of a 'model' open entry using all of the file types suggested in this thread and others. I'd be delighted to try out any and all suggestions and of course open to feedback. Cheers Alice

@cdsouthan
Member

Good meeting and write-up NOBA being wowed by the @lpatiny open toolbox. I will check in with the PubChem folk about piping results into BioAssay and auto-alerts along the lines of "a new CID within 0.85 Tanimoto of our lead OSMXXXX structure was submitted by source Y this week", and possibly also "tell me when new (but non-OSM) bioassay data were added to these CIDs that are within 0.90 Tanimoto of those OSM leads".
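
The alert rule itself is easy to state in code. Below is a minimal Python sketch, assuming fingerprints are available as sets of "on" bit positions (how they are generated, and whether PubChem's own fingerprints are used, is left open). The names `tanimoto` and `new_neighbours` and the toy fingerprints in the usage note are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def new_neighbours(lead_fp, submissions, threshold=0.85):
    """Return (id, similarity) for submissions at or above the threshold, best first."""
    hits = [(cid, tanimoto(lead_fp, fp)) for cid, fp in submissions.items()]
    return sorted((hit for hit in hits if hit[1] >= threshold),
                  key=lambda hit: -hit[1])
```

With a lead fingerprint of bits 1 to 10, a submission carrying bits 1 to 11 scores 10/11 ≈ 0.91 and would trigger the alert, while one that swaps bit 10 for bit 11 scores 9/11 ≈ 0.82 and would not. Choosing the threshold so that the alerts stay useful rather than noisy is the harder part.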

The aspect of structurally alerting (beyond designs and end products) to what synthetic steps global open teams are grappling with is, AWAK, more of a challenge. I know NextMove are working on this commercially https://www.nextmovesoftware.com/hazelnut.html but there seem to be some open standards evolving on reaction schema. The other angle is ChemSpider synthetic pages http://cssp.chemspider.com/ where groups can deposit and pick up.

@mattodd
Member Author

mattodd commented Dec 7, 2015

The open science prize. There's a webinar on Dec 10th - see lower down here. It's 3am Sydney time. Anyone else able to go along and see how good a fit we are and pick up tips for an application?

@MedChemProf
Member

@mattodd My apologies for missing the meeting on the 10th (I have been away for the past several weeks and I did not have the means to attend on that particular date.) I am back now and trying to play catch up. I was able to follow a brief exchange on Twitter between @cdsouthan and @drc007 in regards to @aclarkxyz and his open source reaction XMDS paradigm (http://cheminf20.org/2015/11/29/reactions-in-xmds-2/). I have had some exchanges with Alex Clark and I am a big fan of his mobile applications. Might this be a direction to take with the Open Science Prize since there is already a working model of the searchable reaction information?

@mattodd
Member Author

mattodd commented Jan 11, 2016

Hi Chase - sorry for the delay. I'm not qualified to comment on different technical approaches. We can specify what we're looking for, and the advantages of a system we'd like (in terms of how it can impact research relevant to health) and then be relatively agnostic about what that system should look like, provided it generates a prototype before the start of Phase 2 of the competition (as I understand it). But we need to be ramping up efforts in this direction soon, yes. Deadline is end of February.

@MedChemProf
Member

@mattodd - I know that @lpatiny was already working on a more robust ELN, but since I did not know the exact status, I wanted to propose a possible Google Hangout Meeting to discuss the status. The reason that I bring this up, as was mentioned earlier, is the clock is ticking down on time to propose an entry for the Open Science Prize. I briefly discussed the reaction searching issue with Dr. Alex Clark who has already been developing some tools along this line through his company Molecular Materials Informatics. He seems interested in contributing to the proposal, so I thought it would be best to bring in some of the more knowledgeable members of the OSM to discuss.
Would people be interested in possibly meeting Monday, January 18, 2016 at 4:00:00 PM EST (UTC-5 hours)? Please let me know.

@mattodd
Member Author

mattodd commented Jan 14, 2016

Hi Chase. Sounds perfect, yes. You mean this time?

If so, we should try to get people together online, yes. Would you be happy to host, Chase? Available @lpatiny @drc007 @cdsouthan ?

I guess at this stage we are wanting to decide on the essential features of a striking proposal - something that delivers something new and useful for health research - and something for which we can deliver a prototype as part of the first stage of the competition. I'd want to avoid too many other hypotheticals for this call (i.e. all the stuff that we'd like for OSM but don't yet have).

I do think that an ELN, or something equivalent, that can identify people working on related chemistry in real time would be new and useful. i.e. a system that knows what you're working on and links you with others currently working on that science. I'm thinking (as a model) of the adverts in Gmail, but related to chemistry and therefore not as sinister.

@drc007

drc007 commented Jan 14, 2016

I'm happy to take part.

@drc007

drc007 commented Jan 14, 2016

I've forwarded this thread to Alexander Savelyev, who might be able to provide insights into how the Indigo toolkit might also be useful for reaction enumeration and searching. http://lifescience.opensource.epam.com/indigo/

@mattodd
Member Author

mattodd commented Jan 14, 2016

Looks good. I'd not seen Indigo.

@MedChemProf
Member

@mattodd Yes, I meant to list the time that you posted (http://www.timeanddate.com/worldclock/meetingdetails.html?year=2016&month=1&day=18&hour=21&min=0&sec=0&p1=240&p2=43&p3=136). I will host on Google Hangouts as you suggested. I need to go back in to figure out how to send out the link, but I will do that in the next few days.
@drc007 I also thought the Indigo toolkit looked quite promising, I just wish their site had some more screenshots of the actual products.

@lpatiny
Member

lpatiny commented Jan 14, 2016

It is ok for me, I will give you an update of the ELN on Monday.

@AlexanderSavelyev

Hello everyone. I'm happy to join, but unfortunately the proposed time is too late for me. It will be 00:00 local time in Saint Petersburg (Russia).

http://www.timeanddate.com/worldclock/meetingdetails.html?year=2016&month=1&day=18&hour=21&min=0&sec=0&p1=240&p2=43&p3=136&p4=352

Please let me know if I can help by giving some descriptions and notes for the Indigo toolkit

@drc007

drc007 commented Jan 18, 2016

@MedChemProf Will you post the hangouts link on here? Or did I miss a message?

@lpatiny
Member

lpatiny commented Jan 18, 2016

@MedChemProf
Member

It looks like I am using Hangouts on Air, which might be different from Hangouts. I did not know another way to schedule the meeting in Hangouts. Please let me know if you can see the meeting.

@aclarkxyz

I'm watching you, but not sure how to activate my mic :-)

@MedChemProf
Member

OK. No idea anyone was on.

@lpatiny
Member

lpatiny commented Jan 18, 2016

@MedChemProf
Member

Try: https://hangouts.google.com/hangouts/_/mlxvyemnvezx6bwpavznaqwbsya?

@MedChemProf
Member

Link to his slides:

https://docs.google.com/presentation/d/1CMbBp9jti9qQ9hG9YsTZyygFt6ok7H8K0Pa2hBnxvcE/edit?usp=sharing

@drc007

drc007 commented Jan 19, 2016

A few thoughts for "Live Searching":
Scientist X is making the same/similar compound.
Several people are doing the same/similar reaction (similar starting material).
Several people are doing the same transformation (on a different substrate).
You are using solvent X; have you considered the more environmentally friendly Y?
Vendor X is offering reagent Y at a reduced rate.
It looks like the catalyst from vendor Y usually gives higher yields.
Scientist X reported an exotherm when adding reagent Y; take care.
Scientist X reported difficulties in removing excess reagent/solvent; consider modifying the protocol.
Thinking of making analogues? Click here for a diverse selection of 10 commercially available analogues.

@mattodd
Member Author

mattodd commented Jan 19, 2016

Yes indeed, great scenarios. Both useful to researchers and potentially commercially valuable. The thing I like about this system is that it does not depend on people spotting connections, or on text-heavy Q&A ("How do I purify this diamine?", as in known platforms like Stack Overflow), but on a machine-based method of spotting similarity in real time.

@cdsouthan
Member

Agreed. Setting up auto-alert triggers that have useful specificity will be one of the challenges, but tacklable nonetheless. Note we can already select the analogues via any similarity cut against vendor sources directly in PubChem and/or likely synthetic descriptions in patents via SureChEMBL (n.b. ZINC has just refreshed to 23 million CIDs).

@lpatiny
Member

lpatiny commented Jan 22, 2016

We are still improving the system. Now as described during the hangout we can add the chemical structure from the reagents table as well as create the product based on the reaction scheme.
We are currently formatting the result so that it is nicely printable.
If somebody could help with CSS formatting of the output it would be great! Any students there?
It would also be nice to have a couple of students who would like to test the ELN as soon as it is testable ... If you know some, please ask them to subscribe to this Telegram group: https://telegram.me/joinchat/BcodBAZN5OLov9QI3DAEWA

@alintheopen
Member

hi @mattoddchem - could this be a suitable TSP project?


@mattodd
Member Author

mattodd commented Jan 27, 2016

Hmm - not sure, @alintheopen. I think we'd need some chemical experimental content for this to count as a TSP project. However, we should put out a call for community volunteers. @lpatiny - for that, would you be able to create a new Issue here on OSM and write a few lines about what you are looking for in the CSS formatting, as well as what you'd need (approximately) from ELN testers? I don't understand what you mean myself, so can't explain it to anyone else (which is OK), but it's important that volunteers know the amount of work involved. We can then appeal to the community.

@MedChemProf
Member

@mattodd @lpatiny @drc007 @alintheopen @cdsouthan @aclarkxyz I created a Google Drive folder containing the following items for the OSM submission to the Open Science Prize:

  1. Project Proposal (I started writing a rough skeleton.)
  2. Team Information Document (I will need those contributing to update their information in the document for inclusion when I submit.)
  3. To-Do List (I think we are already overdue on some of the items, but we can adjust as needed.)
  4. Folder with Supporting Documents (Open Science Prize Guidelines, Open Science Prize FAQ, Recording of Open Science Prize Online Meeting, and Open Science Prize suggested open source resources.)

For all of those interested in contributing to the proposal and project, I can open the folder up to you for editing. At the moment, the shared documents are housed in my Google Drive account with the edit setting 'Shared with Specific People'. I would appreciate any suggestions on how best to share this appropriately. Should it just be opened up to all for editing? Or do we have a subset of users? Any help appreciated, and I will then change things accordingly. I was not sure how things were set up or where documents were stored when you were writing drafts for publication. Thanks in advance.

@mattodd
Member Author

mattodd commented Feb 1, 2016

Hi Chase - great to get this started. I would set this document up with the setting "anyone with the link can view" and post the link to a new Issue here on Github so that we can have a separate discussion thread purely on this. Then share with specific people, enabling those people to edit. Is that OK as a first step?

@lpatiny
Member

lpatiny commented Feb 1, 2016

I don't have edit access to the document, but here is some information about the technical part:

The new ELN will be built on open-source (MIT or BSD) projects. The main projects are CouchDB (http://couchdb.apache.org/) and the visualizer (https://github.com/npellet/visualizer).
CouchDB not only supports multi-master replication and document revisions but is also designed to handle large file attachments, which are common when analysing products.
The ELN will require only a modern web browser such as Microsoft Edge or Google Chrome, and the whole system is based on JavaScript. It will store not only reactions and attached literature but also analytical results associated with the products. Tools for visualizing and assigning NMR spectra directly in the browser will also be developed.
image
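
For readers unfamiliar with the revision handling mentioned above, here is a toy sketch of the idea. It is an in-memory model written for illustration only, not CouchDB client code and not the ELN's implementation: `TinyRevStore` and `ConflictError` are hypothetical names. It mimics two CouchDB behaviours: revision identifiers of the form `N-hash`, and rejection of an update that does not carry the document's current revision (CouchDB answers such a request with a 409 Conflict).

```python
import hashlib
import json
from typing import Dict, Optional, Tuple


class ConflictError(Exception):
    """Raised when an update is based on a stale (or missing) revision."""


class TinyRevStore:
    """Toy in-memory stand-in for revision-checked document updates."""

    def __init__(self) -> None:
        self._docs: Dict[str, Tuple[str, dict]] = {}  # doc_id -> (rev, body)

    def put(self, doc_id: str, body: dict, rev: Optional[str] = None) -> str:
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The writer must present the revision it last read.
        if rev != current_rev:
            raise ConflictError(f"{doc_id}: expected {current_rev}, got {rev}")
        generation = 1 if current_rev is None else int(current_rev.split("-")[0]) + 1
        payload = json.dumps(body, sort_keys=True).encode()
        digest = hashlib.sha256(payload).hexdigest()[:8]
        new_rev = f"{generation}-{digest}"
        self._docs[doc_id] = (new_rev, body)
        return new_rev

    def get(self, doc_id: str) -> Tuple[str, dict]:
        """Return the current (revision, body) pair for a document."""
        return self._docs[doc_id]
```

Two people editing the same experiment would each read a revision; whoever saves second gets a conflict instead of silently overwriting the other's changes, and has to re-read and merge first.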

@lpatiny
Member

lpatiny commented Feb 1, 2016

Customisable reports will be available, either to create an experimental section or to print the experiment to take to the bench. Here is an example of such a report:
image

@mattodd
Member Author

mattodd commented Feb 1, 2016

Looks very nice @lpatiny . You should now have edit rights. Can we move discussion over to #371?
