new dev version (#30)
rasbt committed Jun 1, 2017
1 parent a35b158 commit 0aef567
Showing 4 changed files with 57 additions and 3 deletions.
2 changes: 1 addition & 1 deletion biopandas/__init__.py
@@ -4,5 +4,5 @@
# Project Website: http://rasbt.github.io/biopandas/
# Code Repository: https://github.com/rasbt/biopandas

-__version__ = '0.2.1'
+__version__ = '0.2.2dev'
__author__ = "Sebastian Raschka <mail@sebastianraschka.com>"
20 changes: 19 additions & 1 deletion docs/sources/CHANGELOG.md
@@ -3,6 +3,24 @@
The CHANGELOG for the current development version is available at
[https://github.com/rasbt/biopandas/blob/master/docs/sources/CHANGELOG.md](https://github.com/rasbt/biopandas/blob/master/docs/sources/CHANGELOG.md).

### 0.2.2dev (TBD)

##### Downloads

- [Source code (zip)](https://github.com/rasbt/biopandas/archive/v0.2.2.zip)
- [Source code (tar.gz)](https://github.com/rasbt/biopandas/archive/v0.2.2.tar.gz)

##### New Features

- -

##### Changes

- Add meaningful error message if attempting to overwrite the `df` attributes of `PandasMol2` and `PandasPdb` directly.
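
A guard of this kind can be built with a property whose setter raises instead of assigning. The following is only a minimal sketch: the class `StructureFrame`, the backing attribute `_df`, and the wording of the message are assumptions made for illustration, not the actual `PandasPdb` or `PandasMol2` code.

```python
class StructureFrame:
    """Hypothetical stand-in for a class that exposes a guarded `df` attribute."""

    def __init__(self, data=None):
        # Internal storage; `df` below is the public, guarded view of it.
        self._df = {} if data is None else data

    @property
    def df(self):
        return self._df

    @df.setter
    def df(self, value):
        # Refuse direct assignment and tell the user what to do instead.
        raise AttributeError(
            "Direct assignment to `df` is not supported; "
            "assign to `_df` if you really want to replace the data."
        )


frame = StructureFrame({"ATOM": [1, 2, 3]})
try:
    frame.df = {}
except AttributeError as err:
    message = str(err)  # the explanatory text shown to the user
```

Without the explicit setter, Python would still raise an `AttributeError` for a read-only property, but with a generic message that does not say how to proceed.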

##### Bug Fixes


### 0.2.1 (2017-05-11)

##### Downloads
@@ -18,7 +36,7 @@ The CHANGELOG for the current development version is available at

- The `amino3to1` method of `biopandas.pdb.PandasPdb` objects now returns a pandas `DataFrame` instead of a pandas `Series` object. The returned data frame has two columns, `'chain_id'` and `'residue_name'`, where the former contains the chain ID of the amino acid and the latter contains its 1-letter amino acid code.
- Significant speed improvements of the `distance` method of both `PandasPdb` and `PandasMol2` (now about 300 percent faster than previously).
- Add meaningful error message if attempting to overwrite the `df` attributes of `PandasMol2` and `PandasPdb` directly.


##### Bug Fixes

1 change: 0 additions & 1 deletion files.txt

This file was deleted.

37 changes: 37 additions & 0 deletions paper.md
@@ -0,0 +1,37 @@
---
title: 'BioPandas: Working with molecular structures in pandas DataFrames'
tags:
- bioinformatics
- computational biology
- protein structure analysis
- protein-ligand docking
- virtual screening
authors:
  - name: Sebastian Raschka
    orcid: 0000-0001-6989-4493
    affiliation: 1
affiliations:
  - name: Michigan State University, East Lansing, USA
    index: 1
date: 31 May 2017
---



# Summary

BioPandas is a Python library that reads molecular structures from 3D-coordinate files, such as PDB (Bernstein *and others* 1977) and MOL2 (Tripos 2007), into pandas DataFrames (McKinney 2010) for convenient data analysis and data mining.

Beyond parsing protein and small-molecule data into data frames, BioPandas provides utility functions for structure analysis, including computing the root-mean-square deviation (RMSD) between structures and converting protein structures into primary amino acid sequences.
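
To make these two computations concrete, here is a minimal standard-library sketch. The function names `rmsd` and `to_one_letter` are hypothetical and are not the BioPandas API; the RMSD is computed over atoms paired in order, without any structural superposition, and the mapping covers only the 20 standard amino acids.

```python
import math

# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Atoms are compared pairwise in the given order; no alignment is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def to_one_letter(residues, unknown="?"):
    """Convert a sequence of three-letter residue names to a one-letter string."""
    return "".join(THREE_TO_ONE.get(name.upper(), unknown) for name in residues)
```

For example, `to_one_letter(["MET", "ALA", "GLY"])` gives `"MAG"`, and residue names outside the table are replaced by the `unknown` placeholder.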

BioPandas also provides functions for reading and parsing millions of small-molecule structures from multi-MOL2 files quickly and efficiently in virtual screening applications. Built-in functions for filtering molecules by the presence of functional groups and by the pairwise distances between those groups make BioPandas a particularly attractive utility library for virtual screening and protein-ligand docking.
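
The idea of streaming through a multi-MOL2 file can be sketched with a generator that splits the input at each `@<TRIPOS>MOLECULE` record. This is an illustrative sketch, not the BioPandas implementation: `split_multimol2` and `within_distance` are hypothetical names, and the code assumes that the line following each `@<TRIPOS>MOLECULE` record holds the molecule name.

```python
import math


def split_multimol2(lines):
    """Yield (molecule_name, record_lines) for each molecule in a multi-MOL2 stream.

    `lines` can be any iterable of strings, such as an open file handle, so
    only one molecule is held in memory at a time.
    """
    name, block, expect_name = None, [], False
    for line in lines:
        if line.startswith("@<TRIPOS>MOLECULE"):
            if block:
                yield name, block  # emit the previous molecule
            name, block, expect_name = None, [line], True
            continue
        if not block:
            continue  # skip anything before the first molecule record
        if expect_name:
            name, expect_name = line.strip(), False
        block.append(line)
    if block:
        yield name, block  # emit the last molecule


def within_distance(point_a, point_b, cutoff):
    """Return True if two (x, y, z) points are at most `cutoff` apart."""
    return math.dist(point_a, point_b) <= cutoff
```

A distance-based filter would then parse atom coordinates from each yielded block and keep only the molecules for which `within_distance` holds for the atom pairs of interest.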


# References

**McKinney, Wes** (2010). Data structures for statistical computing in Python. *Proceedings of the 9th Python in Science Conference. Vol. 445.*

**Bernstein, Frances C., Thomas F. Koetzle, Graheme JB Williams, Edgar F. Meyer, Michael D. Brice, John R. Rodgers, Olga Kennard, Takehiko Shimanouchi, and Mitsuo Tasumi** (1977). The Protein Data Bank. *European Journal of Biochemistry 80, no. 2: 319-324.*

**Tripos, L.** (2007). Tripos Mol2 File Format. *St. Louis, MO: Tripos.*
