Add two files in Medline format to be used for the Bio.Medline unit tests.
commit 02a376490878da5be0800837dc1feff314f4a51c 1 parent df16d52
authored August 02, 2008
42  Tests/Medline/pubmed_result1.txt
@@ -0,0 +1,42 @@
+
+PMID- 12230038
+OWN - NLM
+STAT- MEDLINE
+DA  - 20020916
+DCOM- 20030606
+LR  - 20041117
+PUBM- Print
+IS  - 1467-5463 (Print)
+VI  - 3
+IP  - 3
+DP  - 2002 Sep
+TI  - The Bio* toolkits--a brief overview.
+PG  - 296-302
+AB  - Bioinformatics research is often difficult to do with commercial software. The
+      Open Source BioPerl, BioPython and Biojava projects provide toolkits with
+      multiple functionality that make it easier to create customised pipelines or
+      analysis. This review briefly compares the quirks of the underlying languages and
+      the functionality, documentation, utility and relative advantages of the Bio
+      counterparts, particularly from the point of view of the beginning biologist
+      programmer.
+AD  - tacg Informatics, Irvine, CA 92612, USA. hjm@tacgi.com
+FAU - Mangalam, Harry
+AU  - Mangalam H
+LA  - eng
+PT  - Journal Article
+PL  - England
+TA  - Brief Bioinform
+JT  - Briefings in bioinformatics
+JID - 100912837
+SB  - IM
+MH  - *Computational Biology
+MH  - Computer Systems
+MH  - Humans
+MH  - Internet
+MH  - *Programming Languages
+MH  - *Software
+MH  - User-Computer Interface
+EDAT- 2002/09/17 10:00
+MHDA- 2003/06/07 05:00
+PST - ppublish
+SO  - Brief Bioinform. 2002 Sep;3(3):296-302.
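The record above follows a simple layout: a tag of up to four characters, a `- ` separator, then the value; long values wrap onto lines indented by six spaces, and some tags (such as `MH`) repeat. The following is a minimal sketch of how such a record could be read using only the standard library. It is not the Bio.Medline implementation; `parse_record` and `SAMPLE` are hypothetical names introduced for illustration, and the column widths are inferred from the fixture shown here.

```python
from collections import defaultdict

# A shortened excerpt of the fixture above (without the leading "+" diff markers).
SAMPLE = """\
PMID- 12230038
TI  - The Bio* toolkits--a brief overview.
AB  - Bioinformatics research is often difficult to do with commercial software. The
      Open Source BioPerl, BioPython and Biojava projects provide toolkits with
MH  - *Computational Biology
MH  - Computer Systems
"""


def parse_record(text):
    """Parse one Medline-style record into {tag: [values]} (hypothetical helper).

    Assumes the layout seen in the fixture: the tag occupies columns 0-3,
    the value starts at column 6, and continuation lines begin with six spaces.
    """
    fields = defaultdict(list)
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines around the record
        if line.startswith("      ") and tag is not None:
            # Continuation line: append to the most recent value of the current tag.
            fields[tag][-1] += " " + line.strip()
        else:
            tag = line[:4].strip()
            fields[tag].append(line[6:].strip())
    return dict(fields)


record = parse_record(SAMPLE)
print(record["PMID"])  # every tag maps to a list, because tags may repeat
print(record["MH"])
```

Storing every value in a list keeps repeated tags such as `MH`, `AU` and `PT` without special-casing them; a caller that expects a single value can take the first element.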
248  Tests/Medline/pubmed_result2.txt
@@ -0,0 +1,248 @@
+
+PMID- 16403221
+OWN - NLM
+STAT- MEDLINE
+DA  - 20060220
+DCOM- 20060314
+PUBM- Electronic
+IS  - 1471-2105 (Electronic)
+VI  - 7
+DP  - 2006
+TI  - A high level interface to SCOP and ASTRAL implemented in python.
+PG  - 10
+AB  - BACKGROUND: Benchmarking algorithms in structural bioinformatics often involves
+      the construction of datasets of proteins with given sequence and structural
+      properties. The SCOP database is a manually curated structural classification
+      which groups together proteins on the basis of structural similarity. The ASTRAL 
+      compendium provides non redundant subsets of SCOP domains on the basis of
+      sequence similarity such that no two domains in a given subset share more than a 
+      defined degree of sequence similarity. Taken together these two resources provide
+      a 'ground truth' for assessing structural bioinformatics algorithms. We present a
+      small and easy to use API written in python to enable construction of datasets
+      from these resources. RESULTS: We have designed a set of python modules to
+      provide an abstraction of the SCOP and ASTRAL databases. The modules are designed
+      to work as part of the Biopython distribution. Python users can now manipulate
+      and use the SCOP hierarchy from within python programs, and use ASTRAL to return 
+      sequences of domains in SCOP, as well as clustered representations of SCOP from
+      ASTRAL. CONCLUSION: The modules make the analysis and generation of datasets for 
+      use in structural genomics easier and more principled.
+AD  - Bioinformatics, Institute of Cell and Molecular Science, School of Medicine and
+      Dentistry, Queen Mary, University of London, London EC1 6BQ, UK.
+      j.a.casbon@qmul.ac.uk
+FAU - Casbon, James A
+AU  - Casbon JA
+FAU - Crooks, Gavin E
+AU  - Crooks GE
+FAU - Saqi, Mansoor A S
+AU  - Saqi MA
+LA  - eng
+PT  - Evaluation Studies
+PT  - Journal Article
+DEP - 20060110
+PL  - England
+TA  - BMC Bioinformatics
+JT  - BMC bioinformatics
+JID - 100965194
+SB  - IM
+MH  - *Database Management Systems
+MH  - *Databases, Protein
+MH  - Information Storage and Retrieval/*methods
+MH  - Programming Languages
+MH  - Sequence Alignment/*methods
+MH  - Sequence Analysis, Protein/*methods
+MH  - Sequence Homology, Amino Acid
+MH  - *Software
+MH  - *User-Computer Interface
+PMC - PMC1373603
+EDAT- 2006/01/13 09:00
+MHDA- 2006/03/15 09:00
+PHST- 2005/06/17 [received]
+PHST- 2006/01/10 [accepted]
+PHST- 2006/01/10 [aheadofprint]
+AID - 1471-2105-7-10 [pii]
+AID - 10.1186/1471-2105-7-10 [doi]
+PST - epublish
+SO  - BMC Bioinformatics. 2006 Jan 10;7:10.
+
+PMID- 16377612
+OWN - NLM
+STAT- MEDLINE
+DA  - 20060223
+DCOM- 20060418
+LR  - 20061115
+PUBM- Print-Electronic
+IS  - 1367-4803 (Print)
+VI  - 22
+IP  - 5
+DP  - 2006 Mar 1
+TI  - GenomeDiagram: a python package for the visualization of large-scale genomic
+      data.
+PG  - 616-7
+AB  - SUMMARY: We present GenomeDiagram, a flexible, open-source Python module for the 
+      visualization of large-scale genomic, comparative genomic and other data with
+      reference to a single chromosome or other biological sequence. GenomeDiagram may 
+      be used to generate publication-quality vector graphics, rastered images and
+      in-line streamed graphics for webpages. The package integrates with datatypes
+      from the BioPython project, and is available for Windows, Linux and Mac OS X
+      systems. AVAILABILITY: GenomeDiagram is freely available as source code (under
+      GNU Public License) at http://bioinf.scri.ac.uk/lp/programs.html, and requires
+      Python 2.3 or higher, and recent versions of the ReportLab and BioPython
+      packages. SUPPLEMENTARY INFORMATION: A user manual, example code and images are
+      available at http://bioinf.scri.ac.uk/lp/programs.html.
+AD  - Plant Pathogen Programme, Scottish Crop Research Institute, Invergowrie, Dundee
+      DD2 5DA, Scotland, UK. lpritc@scri.ac.uk
+FAU - Pritchard, Leighton
+AU  - Pritchard L
+FAU - White, Jennifer A
+AU  - White JA
+FAU - Birch, Paul R J
+AU  - Birch PR
+FAU - Toth, Ian K
+AU  - Toth IK
+LA  - eng
+PT  - Journal Article
+PT  - Research Support, Non-U.S. Gov't
+DEP - 20051223
+PL  - England
+TA  - Bioinformatics
+JT  - Bioinformatics (Oxford, England)
+JID - 9808944
+SB  - IM
+MH  - Chromosome Mapping/*methods
+MH  - *Computer Graphics
+MH  - *Database Management Systems
+MH  - *Databases, Genetic
+MH  - Information Storage and Retrieval/methods
+MH  - *Programming Languages
+MH  - *Software
+MH  - *User-Computer Interface
+EDAT- 2005/12/27 09:00
+MHDA- 2006/04/19 09:00
+PHST- 2005/12/23 [aheadofprint]
+AID - btk021 [pii]
+AID - 10.1093/bioinformatics/btk021 [doi]
+PST - ppublish
+SO  - Bioinformatics. 2006 Mar 1;22(5):616-7. Epub 2005 Dec 23.
+
+PMID- 14871861
+OWN - NLM
+STAT- MEDLINE
+DA  - 20040611
+DCOM- 20050104
+LR  - 20061115
+PUBM- Print-Electronic
+IS  - 1367-4803 (Print)
+VI  - 20
+IP  - 9
+DP  - 2004 Jun 12
+TI  - Open source clustering software.
+PG  - 1453-4
+AB  - SUMMARY: We have implemented k-means clustering, hierarchical clustering and
+      self-organizing maps in a single multipurpose open-source library of C routines, 
+      callable from other C and C++ programs. Using this library, we have created an
+      improved version of Michael Eisen's well-known Cluster program for Windows, Mac
+      OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to
+      the C Clustering Library, thereby combining the flexibility of a scripting
+      language with the speed of C. AVAILABILITY: The C Clustering Library and the
+      corresponding Python C extension module Pycluster were released under the Python 
+      License, while the Perl module Algorithm::Cluster was released under the Artistic
+      License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well 
+      as the corresponding command-line program, were released under the same license
+      as the original Cluster code. The complete source code is available at
+      http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively,
+      Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available
+      as part of the Biopython distribution.
+AD  - Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1
+      Shirokanedai, Minato-ku, Tokyo, 108-8639 Japan. mdehoon@ims.u-tokyo.ac.jp
+FAU - de Hoon, M J L
+AU  - de Hoon MJ
+FAU - Imoto, S
+AU  - Imoto S
+FAU - Nolan, J
+AU  - Nolan J
+FAU - Miyano, S
+AU  - Miyano S
+LA  - eng
+PT  - Comparative Study
+PT  - Evaluation Studies
+PT  - Journal Article
+PT  - Validation Studies
+DEP - 20040210
+PL  - England
+TA  - Bioinformatics
+JT  - Bioinformatics (Oxford, England)
+JID - 9808944
+SB  - IM
+MH  - *Algorithms
+MH  - *Cluster Analysis
+MH  - Gene Expression Profiling/*methods
+MH  - Pattern Recognition, Automated/methods
+MH  - *Programming Languages
+MH  - Sequence Alignment/*methods
+MH  - Sequence Analysis, DNA/*methods
+MH  - *Software
+EDAT- 2004/02/12 05:00
+MHDA- 2005/01/05 09:00
+PHST- 2004/02/10 [aheadofprint]
+AID - 10.1093/bioinformatics/bth078 [doi]
+AID - bth078 [pii]
+PST - ppublish
+SO  - Bioinformatics. 2004 Jun 12;20(9):1453-4. Epub 2004 Feb 10.
+
+PMID- 14630660
+OWN - NLM
+STAT- MEDLINE
+DA  - 20031121
+DCOM- 20040722
+LR  - 20061115
+PUBM- Print
+IS  - 1367-4803 (Print)
+VI  - 19
+IP  - 17
+DP  - 2003 Nov 22
+TI  - PDB file parser and structure class implemented in Python.
+PG  - 2308-10
+AB  - The biopython project provides a set of bioinformatics tools implemented in
+      Python. Recently, biopython was extended with a set of modules that deal with
+      macromolecular structure. Biopython now contains a parser for PDB files that
+      makes the atomic information available in an easy-to-use but powerful data
+      structure. The parser and data structure deal with features that are often left
+      out or handled inadequately by other packages, e.g. atom and residue disorder (if
+      point mutants are present in the crystal), anisotropic B factors, multiple models
+      and insertion codes. In addition, the parser performs some sanity checking to
+      detect obvious errors. AVAILABILITY: The Biopython distribution (including source
+      code and documentation) is freely available (under the Biopython license) from
+      http://www.biopython.org
+AD  - Department of Cellular and Molecular Interactions, Vlaams Interuniversitair
+      Instituut voor Biotechnologie and Computational Modeling Lab, Department of
+      Computer Science, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels,
+      Belgium. thamelry@vub.ac.be
+FAU - Hamelryck, Thomas
+AU  - Hamelryck T
+FAU - Manderick, Bernard
+AU  - Manderick B
+LA  - eng
+PT  - Comparative Study
+PT  - Evaluation Studies
+PT  - Journal Article
+PT  - Research Support, Non-U.S. Gov't
+PT  - Validation Studies
+PL  - England
+TA  - Bioinformatics
+JT  - Bioinformatics (Oxford, England)
+JID - 9808944
+RN  - 0 (Macromolecular Substances)
+SB  - IM
+MH  - Computer Simulation
+MH  - Database Management Systems/*standards
+MH  - *Databases, Protein
+MH  - Information Storage and Retrieval/*methods/*standards
+MH  - Macromolecular Substances
+MH  - *Models, Molecular
+MH  - *Programming Languages
+MH  - Protein Conformation
+MH  - *Software
+EDAT- 2003/11/25 05:00
+MHDA- 2004/07/23 05:00
+PST - ppublish
+SO  - Bioinformatics. 2003 Nov 22;19(17):2308-10.
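The second fixture holds several records, each preceded by an empty line. A test that consumes it would presumably need to split the file into records before checking individual fields. The sketch below shows one way to do that with the standard library only; it assumes blank lines are the sole record separator, as in the file above, and `iter_records`, `pmid_of` and `SAMPLE` are hypothetical names, not part of Bio.Medline.

```python
def iter_records(text):
    """Yield each record as a list of lines (hypothetical helper).

    Assumes records are separated by one or more blank lines,
    which matches the fixture shown in this commit.
    """
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the record collected so far.
            yield current
            current = []
    if current:
        # The last record is not followed by a blank line.
        yield current


def pmid_of(lines):
    """Return the PMID value of a record, or None if the tag is absent."""
    for line in lines:
        if line.startswith("PMID-"):
            return line[6:].strip()
    return None


# Two abbreviated records taken from the fixture above.
SAMPLE = "\nPMID- 16403221\nOWN - NLM\n\nPMID- 16377612\nOWN - NLM\n"

pmids = [pmid_of(rec) for rec in iter_records(SAMPLE)]
print(pmids)
```

Applied to the full `pubmed_result2.txt`, this approach should yield four records, one per `PMID-` line in the diff.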
