library for interfacing with NCBI's Entrez eUtils web service
Python
Switch branches/tags
Nothing to show
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
EUtils
tests
CHANGELOG
LICENSE
MANIFEST
MANIFEST.in
PKG-INFO
README
dtd2py
dtd2py.py
setup.py
sourcegen.py

README

                        EUtils 1.0p1

EUtils is a client library for the Entrez databases at NCBI.

NCBI provides the EUtils web service so that software can query Entrez
directly, rather than going through the web interface and dealing with
the hassles of web scraping.  For more information see

  http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html

This package provides two levels of interface.  The lowest one makes a
programmatic interface to construct the query URL and make the
request.  The higher level ones support history tracking and parsing
of query results.  These greatly simplify working with the EUtils
server.

This package is distributed under the Biopython License Agreement.
See the file 'LICENSE' for details.

TODO:
  - More documentation (will write docs for money)
  - More examples (see the 'tests' directory for some examples)
  - Work with NCBI to clarify some problems in their server

 == Requirements ==

This was developed using Python 2.2.1.  It make work with earlier
versions of Python.  No other packages are needed.

 == Installation ==

EUtils uses the Python's standard Distutils package.  Here's how to
install the package for most operating systems

  % tar xzf EUtils-1.0p1.tar.gz
  % cd EUtils-1.0p1
  % python setup.py build
  % su
  Password:
  # python setup.py install
  # exit
  %

If you want to run the self-tests as well, the following runs the
tests before the install and after the install.

  % tar xzf EUtils-1.0p1.tar.gz
  % cd EUtils-1.0b1
  % cd tests
  % env PYTHONPATH=.. python runtests.py
  % cd ..
  % python setup.py build
  % su
  Password:
  # python setup.py install
  # exit
  % cd test
  % python runtests.py
  %

The last line of the test output should be

     All 6 regression test files ran successfully.


If you are on MS Windows this will not work.  You will need to set the
PYTHONPATH as needed.  Some Unix operating systems may need one of

  zcat EUtils-1.0p1.tar.gz | tar xf -
  gzcat EUtils-1.0p1.tar.gz | tar xf -



 == Debugging ==

The following two flags are useful when debugging the web interface
calls.

To print the URL when making a request

  Eutils.ThinClient.DUMP_URL = 1

To print the text returned from the server

  Eutils.ThinClient.DUMP_RESULT = 1


 == DTD Parsing and POM ==

The DTD parsing code is built on the POM code from pyNMS.  These are
'dtd2py.py', 'sourcegen.py' and 'POM.py'.  Used with permission.

POM is the 'Python Object Model' for parsing XML.  It uses the DTD to
build a parser for XML and convert it into a form more like Python.
Most of the functions in package use the POM as an intermediate format
to convert the data into true data structure but some, like 'fetch'ing
from a publication database, return the POM directly.

>>> from EUtils import DBIdsClient
>>> import EUtils
>>> from EUtils import DBIdsClient
>>> import EUtils
>>> dbids = EUtils.DBIds("pubmed", ["9390282"])
>>> pom = DBIdsClient.from_dbids(dbids).fetch()
>>> print pom
<PubmedArticleSet>
    <PubmedArticle>
        <MedlineCitation Owner="NLM" Status="Completed">
            <MedlineID>
98051955
            </MedlineID>
            <PMID>
9390282
            </PMID>
            <DateCreated>
                <Year>
1998
                </Year>
                <Month>
01
                </Month>
                <Day>
15
                </Day>
            </DateCreated>
            <DateCompleted>
                <Year>
1998
                </Year>
                <Month>
01
                </Month>
                <Day>
15
                </Day>
            </DateCompleted>
            <DateRevised>
                <Year>
2000
                </Year>
                <Month>
12
                </Month>
                <Day>
18
                </Day>
            </DateRevised>
            <Article>
                <Journal>
                    <JournalIssue>
                        <PubDate>
                            <Year>
1997
                            </Year>
                        </PubDate>
                    </JournalIssue>
                </Journal>
                <ArticleTitle>
Using Tcl for molecular visualization and analysis.
                </ArticleTitle>
                <Pagination>
                    <MedlinePgn>
85-96
                    </MedlinePgn>
                </Pagination>
                <Abstract>
                    <AbstractText>
Reading and manipulating molecular structure data is a standard task
in every molecular visualization and analysis program, but is rarely
available in a form readily accessible to the user. Instead, the
development of new methods for analysis, display, and interaction is
often achieved by writing a new program, rather than building on
pre-existing software. We present the Tcl-based script language used
in our molecular modeling program, VMD, and show how it can access
information about the molecular structure, perform analysis, and
graphically display and animate the results. The commands are
available to the user and make VMD a useful environment for studying
biomolecules.
                    </AbstractText>
                </Abstract>
                <Affiliation>
Beckman Institute, Urbana, IL 61801, USA.
                </Affiliation>
                <AuthorList CompleteYN="Y">
                    <Author>
                        <LastName>
Dalke
                        </LastName>
                        <ForeName>
A
                        </ForeName>
                        <Initials>
A
                        </Initials>
                    </Author>
                    <Author>
                        <LastName>
Schulten
                        </LastName>
                        <ForeName>
K
                        </ForeName>
                        <Initials>
K
                        </Initials>
                    </Author>
                </AuthorList>
  ..........
>>> len(pom)
1
>>> for author in pom[0]["MedlineCitation"]["Article"]["AuthorList"]:
...     print author["LastName"].tostring()
...
Dalke
Schulten
>>>

See pynms.sourceforge.net for more details and license.

The DTDs come from NCBI at
  http://www.ncbi.nlm.nih.gov/entrez/query/DTD/index.html

The POM builders are created from the DTDS with the following:
  1. Go to the top-level EUtils directory
  2. ./dtd2py EUtils/DTDs/*.dtd