What's Needed for a Test of Luc Patiny's New System? #372

Open
mattodd opened this issue Feb 2, 2016 · 22 comments

@mattodd
Member

mattodd commented Feb 2, 2016

Hi @lpatiny . What we'd love to do is this:

  1. Post data about a chemical reaction we've done to a public folder. This would include everything your ELN would need to show the reaction in question (graphically), include the description of the reaction, and include all the acquired data (NMR, MS, IR etc)
  2. Show that your system can extract the data and generate the ELN page on the fly.
  3. Do the same with a similar dataset, for a related reaction, with data from some other team, e.g. @MedChemProf
  4. Show, as a proof of concept, that your system spots a similarity between the two reactions, to illustrate the possibilities in "OSM Proposal to the Open Science Prize" (#371). At this stage the similarity could be merely a similarity between the structures of the product molecules.
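
As a rough illustration of step 4, the sketch below scores how similar two product structures are using a Tanimoto coefficient. It is a toy under loud assumptions: the products are supplied as SMILES strings, and the "fingerprint" is just the set of character n-grams in each string, a crude stand-in for the structure-based fingerprints a real cheminformatics toolkit would compute. The function names and the example SMILES are hypothetical and say nothing about how Luc's system actually works.

```python
from __future__ import annotations


def ngram_fingerprint(smiles: str, n: int = 3) -> set[str]:
    """Toy fingerprint: the set of character n-grams of a SMILES string.

    ASSUMPTION: this is only a placeholder for a real structural fingerprint.
    """
    if len(smiles) < n:
        return {smiles} if smiles else set()
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def tanimoto(a: set[str], b: set[str]) -> float:
    """Tanimoto (Jaccard) coefficient: shared features / all features."""
    if not a and not b:
        return 0.0  # two empty fingerprints carry no evidence of similarity
    return len(a & b) / len(a | b)


def product_similarity(smiles_a: str, smiles_b: str) -> float:
    """Similarity of two products, each given as a SMILES string."""
    return tanimoto(ngram_fingerprint(smiles_a), ngram_fingerprint(smiles_b))


if __name__ == "__main__":
    # Hypothetical example products (ethyl vs. methyl benzoate), not from this thread.
    print(product_similarity("CCOC(=O)c1ccccc1", "COC(=O)c1ccccc1"))
```

A score near 1 would flag two notebook entries as candidates for a closer look; where to set the threshold is a design choice that would need tuning on real data.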

So: first step. After discussions here it looks like the simplest place to put our example data, for one chemical reaction, is on a public Dropbox folder. Is that OK?
If no, where should we instead put the data?
If yes, can you tell us (bench chemists) how you want the data arranged - i.e. names of folders and filenames, and types of data.

@mattodd mattodd self-assigned this Feb 2, 2016
@alintheopen
Member

Should we discuss next steps for this as part of the next Open Science Prize
hangout?


@mattodd
Member Author

mattodd commented Feb 15, 2016

Yes, good idea.


MATTHEW TODD | Associate Professor
School of Chemistry | Faculty of Science

THE UNIVERSITY OF SYDNEY
Rm 519, F11 | The University of Sydney | NSW | 2006
T +61 2 9351 2180 | F +61 2 9351 3329 | M +61 415 274104
E matthew.todd@sydney.edu.au | W
http://sydney.edu.au/science/people/matthew.todd.php
W http://opensourcemalaria.org/ | W http://opensourcetb.org/ | W
http://opensourcepharma.net/

CRICOS 00026A
This email plus any attachments to it are confidential. Any unauthorised
use is strictly prohibited. If you receive this email in error, please
delete it and any attachments.

@drc007

drc007 commented Feb 28, 2016

I was struck by the idea of having pictures next to the notebook in the video. I know it might sound trivial, but could we really have a place to store a photo of the notebook owner in the ELN?

@mattodd
Member Author

mattodd commented Feb 28, 2016

Good idea and I think probably easy to implement?


@cdsouthan
Member

FYI http://www-rinchi.ch.cam.ac.uk/website2015/

The aim of the RInChI project, in the same vein as InChI, is to create a unique data string to describe a reaction.

@drc007

drc007 commented Mar 8, 2016

Just FYI when thinking about a web-based system.
http://www.cambridgemedchemconsulting.com/news/index_files/f0da025f6c2148999e7b530dc89b4c1b-206.html

@lpatiny
Member

lpatiny commented Mar 8, 2016

It seems strange that Safari ranges from 9 to 62%. Safari only runs on OS X and iPhone, and if we check the Wikimedia statistics (https://stats.wikimedia.org/wikimedia/squids/SquidReportClients.htm), Safari is less than 5%.
I wonder what the profile of the visitors is, such that some sites have a majority of Mac and iPhone users.

@mattodd
Member Author

mattodd commented Mar 8, 2016

Right, so Luc's new ELN is right to focus on Chrome as the starting point,
but we can't (yet) neglect the others entirely. Interesting data, thanks.

On 8 March 2016 at 19:21, Chris Swain notifications@github.com wrote:

Just FYI when thinking about a web-based system.
http://www.cambridgemedchemconsulting.com/news/index_files/f0da025f6c2148999e7b530dc89b4c1b-206.html



@drc007

drc007 commented Mar 8, 2016

I guess this is the profile of those involved in drug discovery, not the general public.

If you walk around a bioinformatics lab it will be nearly all Mac, so it is not surprising that some software vendors see a high proportion of Mac users.

Similarly, computational chemistry used to be dominated by Silicon Graphics etc.; now it is Mac and Linux.


@MedChemProf
Member

@mattodd I just received the below email:
"Dear Dr. Smith,
Thank you for applying for the Open Science Prize, your application “SCINDR - The 'SCience INtroDuction Robot' that will Connect Open Scientists and Incentivize New Ones” has now been reviewed. The competition was very strong, and we are sorry to inform you that your application has not been successful. We apologise but due to the high volume of applications we received, we are unable to provide detailed feedback on individual applications.
We wish you all the very best in taking forward your idea, and hope you secure funding for your project from an alternative source.
Kind regards, The Open Science Prize Team"

@mattodd
Member Author

mattodd commented Mar 23, 2016

Too bad! It would have been interesting to receive feedback - it's a great
proposal and much needed. Ah well. Good job everyone on pulling it
together, and we shall have to think of a way to disassemble the idea and
re-assemble it as part of something else. The core functionality will be a
major bonus to lab-based researchers in drug discovery. Something that
really brings an ELN alive, in my view.


@cdsouthan
Member

:( Can we surface the whole thing on a blog and/or figshare?

@MedChemProf
Member

@lpatiny @mattodd I just wanted to surface the ELN again, focusing on the issue of displaying NMR data on the website. At the moment, when our research group runs an NMR sample, the instrument sends us a PDF overview, a zipped data file, and a jcamp.dx file. All of these files get placed in our ELN. Furthermore, we process the JCAMP file to produce expansions using a commercially available software package called MestReNova. All works well.
My issue is that when I visit the http://www.cheminfo.org site and use some of the tools that have a 'drag-and-drop' display of the JCAMP file, I am only able to view the FID. @lpatiny has mentioned that the JCAMP file needs to be post-FT, so I brought the issue to the service that provides our NMR data. They looked at the JCAMP file and stated that it did contain the “##DATA TYPE= NMR SPECTRUM” section, as evidenced by the MestReNova software being able to display the spectrum. They also stated that the “##DATA TYPE= NMR SPECTRUM” section happens to be in the second block of information in the file and that the first block contains the FID. So it appears that the web tool may be grabbing the first block and not looking for the other section.
Is there any way the tool could be modified to look beyond the first block, to accommodate the way the JCAMP file is packaged? I know an obvious thing to do is export a second JCAMP file from MestReNova, but in the end this adds steps that lengthen any workflow. Thanks.
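
To make the request concrete, here is a minimal sketch of "looking beyond the first block". It assumes only what the JCAMP-DX format prescribes in general: each block runs from a `##TITLE=` record to a `##END=` record, blocks may be nested inside an outer link block, and labels are compared ignoring case, spaces, dashes, slashes and underscores. The function names are hypothetical, and this is not the code behind the cheminfo.org tools.

```python
from __future__ import annotations

import re

# A labelled data record looks like "##LABEL= value".
_LDR = re.compile(r"^\s*##([^=]*)=(.*)$")


def _normalize_label(label: str) -> str:
    """Uppercase the label and drop spaces, dashes, slashes and underscores."""
    return re.sub(r"[\s\-/_]", "", label).upper()


def split_blocks(text: str) -> list[dict]:
    """Split JCAMP-DX text into blocks delimited by ##TITLE= ... ##END=.

    A stack handles nested (compound) files. Each returned dict holds the
    block's 'data_type' (None if absent) and its raw 'lines'. Blocks are
    returned in the order their ##END= is reached.
    """
    finished: list[dict] = []
    stack: list[dict] = []
    for line in text.splitlines():
        match = _LDR.match(line)
        label = _normalize_label(match.group(1)) if match else None
        if label == "TITLE":
            stack.append({"data_type": None, "lines": []})
        if not stack:
            continue  # ignore anything outside a block
        stack[-1]["lines"].append(line)
        if label == "DATATYPE":
            # Drop a trailing "$$ comment" and collapse whitespace.
            value = match.group(2).split("$$")[0]
            stack[-1]["data_type"] = " ".join(value.split()).upper()
        elif label == "END":
            finished.append(stack.pop())
    return finished


def find_spectrum_blocks(text: str) -> list[dict]:
    """Return every block whose data type is NMR SPECTRUM, wherever it sits."""
    return [b for b in split_blocks(text) if b["data_type"] == "NMR SPECTRUM"]
```

With a file packaged as described above, `find_spectrum_blocks(open(path).read())` would return the post-FT block even though the FID block comes first; decoding the data table inside that block is a separate step not covered here.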

@lpatiny
Member

lpatiny commented Jul 15, 2016

Could you please upload or send me one of the JCAMP files?

@MedChemProf
Member

@lpatiny GitHub would not let me paste in the jcamp file, but here is a link to one of mine: https://mynotebook.labarchives.com/attachments/OTA0MS41fDM2NzcvNjk1NS9FbnRyeVBhcnQvMTY5OTk4NjYzfDIyOTUxLjU=/1/original?sf324=343

Here is also the link to the same spectrum as the zipped data file directly from the instrument: https://mynotebook.labarchives.com/attachments/OTA0NC4xfDM2NzcvNjk1Ny9FbnRyeVBhcnQvMzE3ODgzMDU0fDIyOTU4LjE=/1/original?sf324=343

Please let me know if the links do not work for you. Thanks for looking into this.

@lpatiny
Member

lpatiny commented Jul 20, 2016

Indeed, this JCAMP file contains the FID as well as the spectrum after Fourier transform. We currently don't handle this kind of file, but we should be able to add this feature.
On the other hand, your files are not compressed; you could use the DIFDUP compression format when you export the JCAMP files.
If you can, you could also export the two files separately from the Bruker spectrometer. We have an example script that exports the files.
You can use the following link as a starting point for saving the files from TopSpin:
https://github.com/cheminfo/eln-couch/blob/master/tools/bruker/python/saveref
The script distinguishes between 1D and 2D data.

@MedChemProf
Member

@lpatiny The problem is that we do not own the machine, nor do we process or export the data on it ourselves. We borrow time on another university's instrument and they cut us a break on costs. While they are usually very helpful, I am not sure just how much we can ask them to change their normal protocols on a very busy machine.

@lpatiny
Member

lpatiny commented Jul 20, 2016

This will be solved by a new module that we are currently creating.

@MedChemProf
Member

@lpatiny Thank you for your help on this.

@drc007

drc007 commented Sep 5, 2016

@mattodd
Member Author

mattodd commented Sep 5, 2016

@drc007 @MedChemProf @lpatiny
[Image: screen shot 2016-09-05 at 10 42 22 pm]
Ugh!

@MedChemProf
Member

@mattodd I finally got a chance to read through the preprint and, to be honest, nothing really surprised me. More to the point, it underscores that the problems we have been encountering in OSM are general in nature and in need of a solution. My view of the marketed (free and paid) ELNs is that they each solve only parts of the problem and are not truly comprehensive. (The complete packages of ELN, LIMS, etc. tend to be available only at companies and institutions that have dedicated IT/programming staff to support those complex systems.) With all that being said, research is becoming more and more collaborative across disciplines and geography, making ELNs an inevitability. Early collaborative drug discovery and research would benefit greatly from an open ELN with something like SCINDR monitoring for overlap and duplication. I think as long as we are careful to define the boundaries for the OSM Open ELN/SCINDR very specifically, we will eventually find funding.
