Improve docs
mcs07 committed Dec 10, 2014
1 parent 5d6dbd6 commit 7202538
Showing 8 changed files with 211 additions and 26 deletions.
7 changes: 3 additions & 4 deletions chemspipy/api.py
@@ -44,6 +44,7 @@ def __init__(self, security_token=None, user_agent=None, api_url=None):
:param string security_token: (Optional) Your ChemSpider security token.
:param string user_agent: (Optional) Identify your application to ChemSpider servers.
:param string api_url: (Optional) Alternative API server.
"""
log.debug('Initializing ChemSpider')
self.api_url = api_url if api_url else 'http://www.chemspider.com'
@@ -397,13 +398,11 @@ def search(self, query):
return Results(self, self.async_simple_search, (query,))

# TODO: Wrappers for subscriber role asynchronous searches
# TODO: Ordered results


class ChemSpider(CustomApi, MassSpecApi, SearchApi, SpectraApi, InchiApi):
"""Provides access to the ChemSpider API.
See :class:`BaseChemSpider` further information.
"""
"""Provides access to the ChemSpider API."""

def __repr__(self):
return 'ChemSpider(%r)' % self.security_token
2 changes: 1 addition & 1 deletion chemspipy/objects.py
@@ -3,7 +3,7 @@
chemspipy.objects
~~~~~~~~~~~~~~~~~
Objects returned by ChemSpiPy API methods.
:copyright: Copyright 2014 by Matt Swain.
:license: MIT, see LICENSE file for more details.
6 changes: 6 additions & 0 deletions chemspipy/search.py
@@ -160,3 +160,9 @@ def __repr__(self):
return 'Results(%r)' % self._results
else:
return 'Results(%r)' % self.status


# TODO: fetch method that gets the property values for every Compound in the list of results.
# Do this by running get_extended_mol_compound_info_list and then inserting info into Compounds
# Do multiple requests in chunks of 250 Compounds if necessary
# Compound will need a method to insert info from JSON response
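
The chunking step described in the TODO above could be sketched roughly as follows. This is only an illustration: ``chunked`` is a hypothetical helper that is not part of ChemSpiPy, and the batch size of 250 is taken from the comment rather than from any confirmed API limit.

```python
def chunked(items, size=250):
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in ChemSpider IDs; a real fetch would take these from the search results.
csids = list(range(1, 601))

# Each chunk would be passed to one bulk request.
print([len(chunk) for chunk in chunked(csids)])
```

Each chunk could then be sent in a single bulk request and the returned info inserted into the matching Compound objects.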
78 changes: 69 additions & 9 deletions docs/source/guide/compound.rst
@@ -3,12 +3,72 @@
Compound
========

TODO

- Many methods return Compound objects
- This is a simple wrapper around a ChemSpider ID that allows further information to be retrieved
- Once retrieved, properties are cached so subsequent access on the same Compound object should be faster.
- Behind the scenes, Compound objects just use other API endpoints:
- get_extended_compound_info
- get_record_mol
- get_compound_thumbnail
Many ChemSpiPy search methods return :class:`~chemspipy.Compound` objects, which provide more functionality than a
simple list of ChemSpider IDs. The primary benefit is easy access to further compound properties after
performing a search.

Creating a Compound
-------------------

The easiest way to create a :class:`~chemspipy.Compound` for a given ChemSpider ID is to use the ``get_compound``
method::

>>> compound = cs.get_compound(2157)

Alternatively, a :class:`~chemspipy.Compound` can be instantiated directly::

>>> compound = Compound(cs, 2157)

Either way, no requests are made to the ChemSpider servers until specific :class:`~chemspipy.Compound` properties are
requested::

>>> print(compound.molecular_formula)
C_{9}H_{8}O_{4}
>>> print(compound.molecular_weight)
180.15742
>>> print(compound.smiles)
CC(=O)OC1=CC=CC=C1C(=O)O
>>> print(compound.common_name)
Aspirin

Properties are cached locally after the first time they are retrieved, speeding up subsequent access and reducing the
number of unnecessary requests to the ChemSpider servers.
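
The lazy loading and caching behaviour described above can be illustrated with a minimal sketch. The ``LazyCompound`` class and ``fake_fetch`` function below are hypothetical stand-ins written for this example, not ChemSpiPy's actual implementation; the property values are the ones shown for aspirin above.

```python
class LazyCompound:
    """Hypothetical stand-in for a Compound-like object (not ChemSpiPy's class)."""

    def __init__(self, csid, fetch_info):
        self.csid = csid
        self._fetch_info = fetch_info  # callable: csid -> dict of properties
        self._info = None              # nothing is fetched at construction time

    def _get_info(self):
        # Fetch on first access only, then reuse the cached dictionary.
        if self._info is None:
            self._info = self._fetch_info(self.csid)
        return self._info

    @property
    def common_name(self):
        return self._get_info()['common_name']

    @property
    def molecular_formula(self):
        return self._get_info()['molecular_formula']


calls = []


def fake_fetch(csid):
    # Stand-in for a network request; records each call so caching is visible.
    calls.append(csid)
    return {'common_name': 'Aspirin', 'molecular_formula': 'C_{9}H_{8}O_{4}'}


compound = LazyCompound(2157, fake_fetch)
print(len(calls))                  # no request has been made yet
print(compound.common_name)        # first access triggers the fetch
print(compound.molecular_formula)  # served from the cache
print(len(calls))                  # still only one fetch in total
```

In this sketch, two properties backed by the same response share a single cached dictionary, so only one request is needed for both.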

Searching for Compounds
-----------------------

See the :ref:`searching documentation <searching>` for full details.

Compound properties
-------------------

- ``csid``: ChemSpider ID.
- ``image_url``: URL of a PNG image of the 2D chemical structure.
- ``molecular_formula``: Molecular formula.
- ``smiles``: SMILES string.
- ``inchi``: InChI string.
- ``inchikey``: InChIKey.
- ``average_mass``: Average mass.
- ``molecular_weight``: Molecular weight.
- ``monoisotopic_mass``: Monoisotopic mass.
- ``nominal_mass``: Nominal mass.
- ``alogp``: AlogP.
- ``xlogp``: XlogP.
- ``common_name``: Common Name.
- ``mol_2d``: MOL file containing 2D coordinates.
- ``mol_3d``: MOL file containing 3D coordinates.
- ``mol_raw``: Unprocessed MOL file.
- ``image``: 2D depiction as binary data in PNG format.
- ``spectra``: List of spectra.

Implementation details
----------------------

Each :class:`~chemspipy.Compound` object is a simple wrapper around a ChemSpider ID. Behind the scenes, the property
methods use the ``get_extended_compound_info``, ``get_record_mol`` and ``get_compound_thumbnail`` API methods
to retrieve the relevant information. It is possible to use these API methods directly if required::

>>> cs.get_extended_compound_info(2157)
{u'smiles': u'CC(=O)Oc1ccccc1C(=O)O', u'common_name': u'Aspirin', u'nominal_mass': 180.0, u'molecular_formula': u'C_{9}H_{8}O_{4}', u'inchikey': u'BSYNRYMUTXBXSQ-UHFFFAOYAW', u'molecular_weight': 180.1574, u'inchi': u'InChI=1/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)', u'average_mass': 180.1574, u'csid': 2157, u'alogp': 0.0, u'xlogp': 0.0, u'monoisotopic_mass': 180.042252}

Results are returned as a Python dictionary that is derived directly from the ChemSpider API XML response.
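
Because the result is a plain dictionary, its values can be post-processed with ordinary Python. As a small sketch, the ``_{...}`` subscript markup in ``molecular_formula`` can be stripped with a regular expression. The ``plain_formula`` helper is hypothetical and not part of ChemSpiPy, and the ``info`` dictionary is a shortened copy of the response shown above.

```python
import re


def plain_formula(formula):
    """Strip ``_{...}`` subscript markup, e.g. 'C_{9}H_{8}O_{4}' -> 'C9H8O4'."""
    return re.sub(r'_\{(\d+)\}', r'\1', formula)


# Shortened copy of the dictionary shown above.
info = {'csid': 2157, 'common_name': 'Aspirin', 'molecular_formula': 'C_{9}H_{8}O_{4}'}

print(plain_formula(info['molecular_formula']))
```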
2 changes: 1 addition & 1 deletion docs/source/guide/gettingstarted.rst
@@ -22,7 +22,7 @@ Then connect to ChemSpider by creating a ``ChemSpider`` instance using your secu

>>> cs = ChemSpider('<YOUR-SECURITY-TOKEN>')

All your interaction with the ChemSpider database should now happen through this ChemSpider object.
All your interaction with the ChemSpider database should now happen through this ChemSpider object, ``cs``.

Retrieve a Compound
-------------------
24 changes: 24 additions & 0 deletions docs/source/guide/misc.rst
@@ -0,0 +1,24 @@
.. _misc:

Miscellaneous
=============

Constructing API URLs
---------------------

See the `ChemSpider API documentation`_ for more details.

>>> cs.construct_api_url('MassSpec', 'GetExtendedCompoundInfo', csid='2157')
u'http://www.chemspider.com/MassSpec.asmx/GetExtendedCompoundInfo?csid=2157'
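
For illustration, a URL of the same shape can be assembled with the standard library alone. This is a sketch under assumptions: ``build_api_url`` is a hypothetical helper, not ChemSpiPy's ``construct_api_url``, and the ``<api>.asmx/<endpoint>`` layout is inferred from the output shown above. It uses Python 3's ``urllib.parse``.

```python
from urllib.parse import urlencode


def build_api_url(base_url, api, endpoint, **params):
    """Hypothetical helper: join base URL, API name and endpoint, then add a query string."""
    url = '%s/%s.asmx/%s' % (base_url, api, endpoint)
    if params:
        url += '?' + urlencode(params)
    return url


print(build_api_url('http://www.chemspider.com', 'MassSpec',
                    'GetExtendedCompoundInfo', csid='2157'))
```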

Data sources
------------

Get a list of data sources in ChemSpider::

>>> cs.get_databases()
['Abacipharm', 'Abblis Chemicals', 'Abcam', 'ABI Chemicals', 'Abmole Bioscience', 'ACB Blocks', 'Accela ChemBio', ... ]



.. _`ChemSpider API documentation`: http://www.chemspider.com/AboutServices.aspx
115 changes: 105 additions & 10 deletions docs/source/guide/searching.rst
@@ -3,13 +3,108 @@
Searching
=========

TODO

- The main search method
- Explain how search is done in the background
- Check if results are ready - get message, duration, count
- Iterate results just like a python list
- TODO: Ordered results (csid, mass_defect, molecular_weight, reference_count, datasource_count, pubmed_count, rsc_count)
- search_by_formula
- search_by_mass
- simple_search (?)
ChemSpiPy provides a number of different ways to search ChemSpider.

Compound search
---------------

The main ChemSpiPy search method functions in a similar way to the main search box on the ChemSpider website. Just
provide any type of query, and ChemSpider will interpret it and provide the most relevant results::

>>> cs.search('O=C(OCC)C')
Results([Compound(8525)])
>>> cs.search('glucose')
Results([Compound(5589), Compound(58238), Compound(71358), Compound(96749), Compound(9312824), Compound(9484839)])
>>> cs.search('2157')
Results([Compound(2157)])

The supported query types include systematic names, synonyms, trade names, registry numbers, molecular formula, SMILES,
InChI and InChIKey.

The :class:`~chemspipy.Results` object that is returned can be treated just like any regular Python list. For example,
you can iterate over the results::

>>> for result in cs.search('Glucose'):
... print(result.csid)
5589
58238
71358
96749
9312824
9484839

The :class:`~chemspipy.Results` object also provides the time taken to perform the search, and a message that explains
how the query type was resolved::

>>> r = cs.search('Glucose')
>>> print(r.duration)
0:00:00.017
>>> print(r.message)
Found by approved synonym

Asynchronous searching
----------------------

Certain types of search can sometimes take slightly longer, which can be inconvenient if the search method blocks the
Python interpreter until the search results are returned. Fortunately, the ChemSpiPy search method works asynchronously.

Once a search is executed, ChemSpiPy immediately returns the :class:`~chemspipy.Results` object, which is actually
empty at first::

>>> results = cs.search('O=C(OCC)C')
>>> print(results.ready())
False

In a background thread, ChemSpiPy is making the search request and waiting for the response. But in the meantime, it is
possible to continue performing other tasks in the main Python interpreter process. Call ``ready()`` at any
point to check if the results have been returned and are available.

Any attempt to access the results will just block until the results are ready, like a simple synchronous search. To
manually block the main thread until the results are ready, use the ``wait()`` method::

>>> results.wait()
>>> results.ready()
True

For more detailed information about the status of a search, use the ``status`` property::

>>> results.status
u'Created'
>>> results.wait()
>>> results.status
u'ResultReady'

The possible statuses are ``Unknown``, ``Created``, ``Scheduled``, ``Processing``, ``Suspended``,
``PartialResultReady``, ``ResultReady``.
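
The general pattern described in this section, where a background thread fills in a results object that blocks on access, can be sketched with the standard ``threading`` module. The ``BackgroundResults`` class and ``slow_search`` function are hypothetical and written only for this example; ChemSpiPy's own :class:`~chemspipy.Results` implementation is not shown here and may differ.

```python
import threading
import time


class BackgroundResults:
    """Hypothetical sketch of a results object filled in by a background thread."""

    def __init__(self, search_func, query):
        self._results = []
        self._done = threading.Event()
        thread = threading.Thread(target=self._run, args=(search_func, query))
        thread.daemon = True
        thread.start()

    def _run(self, search_func, query):
        # Runs in the background thread. Error handling is omitted in this
        # sketch: if the search fails, the results simply stay empty.
        try:
            self._results = search_func(query)
        finally:
            self._done.set()

    def ready(self):
        # Non-blocking check.
        return self._done.is_set()

    def wait(self):
        # Block the calling thread until the search has finished.
        self._done.wait()

    def __iter__(self):
        # Any access to the results blocks until they are available.
        self.wait()
        return iter(self._results)

    def __len__(self):
        self.wait()
        return len(self._results)


def slow_search(query):
    # Stand-in for a slow network request.
    time.sleep(0.2)
    return [5589, 58238]


results = BackgroundResults(slow_search, 'Glucose')
# Other work could happen here while the search runs.
results.wait()
print(results.ready())
print(list(results))
```

A ``threading.Event`` keeps ``ready()`` non-blocking while letting ``wait()`` and iteration block without polling.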

Simple search
-------------

The asynchronous search is designed to be as simple as possible, but the additional overhead might be
overkill in some cases. The ``simple_search`` method provides a simpler synchronous alternative. Use it in the same way::

>>> cs.simple_search('Glucose')
[Compound(5589), Compound(58238), Compound(71358), Compound(96749), Compound(9312824), Compound(9484839)]

In this case, the main Python thread will be blocked until the search results are returned, and the results are
returned as a regular Python list. A maximum of 100 results are returned.

Search by formula
-----------------

Searching by molecular formula is supported by the main ``search`` method, but there is the possibility that a formula
could be interpreted as a name or SMILES or another query type. To specifically search by formula, use::

>>> cs.search_by_formula('C44H30N4Zn')
[Compound(436642), Compound(3232330), Compound(24746832), Compound(26995124)]

Search by mass
--------------

It is also possible to search ChemSpider by mass by specifying a certain range::

>>> cs.search_by_mass(680, 0.001)
[Compound(8298180), Compound(12931939), Compound(12931969), Compound(21182158)]

The first parameter specifies the desired molecular mass, while the second parameter specifies the allowed ± range of
values.
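
To make the ± range concrete, the sketch below computes the window implied by a mass and a range and filters some made-up candidate masses against it. The helper names are hypothetical, the candidate values are invented for illustration, and treating the bounds as inclusive is an assumption; the actual matching is done by the ChemSpider servers.

```python
def mass_window(mass, mass_range):
    """Return the (low, high) bounds implied by mass ± mass_range."""
    return (mass - mass_range, mass + mass_range)


def within_window(candidates, mass, mass_range):
    # Inclusive bounds are an assumption made for this sketch.
    low, high = mass_window(mass, mass_range)
    return [m for m in candidates if low <= m <= high]


# Invented candidate masses, for illustration only.
print(within_window([679.9985, 680.0005, 680.002], 680, 0.001))
```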
3 changes: 2 additions & 1 deletion docs/source/index.rst
@@ -36,8 +36,9 @@ A step-by-step guide to getting started with ChemSpiPy.
guide/intro
guide/install
guide/gettingstarted
guide/searching
guide/compound
guide/searching
guide/misc
guide/advanced
guide/contributing
