2 changes: 1 addition & 1 deletion biopandas/__init__.py
@@ -4,5 +4,5 @@
# Project Website: http://rasbt.github.io/biopandas/
# Code Repository: https://github.com/rasbt/biopandas

__version__ = '0.2.1'
__version__ = '0.2.2dev'
__author__ = "Sebastian Raschka <mail@sebastianraschka.com>"
20 changes: 19 additions & 1 deletion docs/sources/CHANGELOG.md
@@ -3,6 +3,24 @@
The CHANGELOG for the current development version is available at
[https://github.com/rasbt/biopandas/blob/master/docs/sources/CHANGELOG.md](https://github.com/rasbt/biopandas/blob/master/docs/sources/CHANGELOG.md).

### 0.2.2dev (TBD)

##### Downloads

- [Source code (zip)](https://github.com/rasbt/biopandas/archive/v0.2.2.zip)
- [Source code (tar.gz)](https://github.com/rasbt/biopandas/archive/v0.2.2.tar.gz)

##### New Features

- -

##### Changes

- Add meaningful error message if attempting to overwrite the `df` attributes of `PandasMol2` and `PandasPdb` directly.
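
  A minimal sketch of how such a guard could look, assuming a property-based design. The class name `StructureFrames` and the message text are hypothetical stand-ins for illustration, not the actual `PandasPdb` or `PandasMol2` code:

  ```python
  import pandas as pd


  class StructureFrames:
      """Hypothetical stand-in for a class such as PandasPdb or PandasMol2."""

      def __init__(self):
          # The data lives in a private attribute.
          self._df = pd.DataFrame()

      @property
      def df(self):
          # Read access goes through the public name.
          return self._df

      @df.setter
      def df(self, value):
          # Direct assignment is rejected with an explanatory message
          # instead of Python's generic "can't set attribute" error.
          raise AttributeError(
              "Direct assignment to `df` is blocked; "
              "set `_df` instead if you really want to overwrite it."
          )


  frames = StructureFrames()
  try:
      frames.df = pd.DataFrame({"x_coord": [0.0]})
  except AttributeError as err:
      message = str(err)
  ```

  In this sketch, reading `frames.df` keeps working, while assigning to it raises an `AttributeError` that points the user to the private attribute.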

##### Bug Fixes


### 0.2.1 (2017-05-11)

##### Downloads
@@ -18,7 +36,7 @@ The CHANGELOG for the current development version is available at

- The `amino3to1` method of `biopandas.pdb.PandasPdb` objects now returns a pandas `DataFrame` instead of a pandas `Series` object. The returned data frame has two columns, `'chain_id'` and `'residue_name'`, where the former contains the chain ID of the amino acid and the latter contains its 1-letter amino acid code.
- Significant speed improvements of the `distance` method of both `PandasPdb` and `PandasMol2` (now about 300 percent faster than previously).
- Add meaningful error message if attempting to overwrite the `df` attributes of `PandasMol2` and `PandasPdb` directly.


##### Bug Fixes

1 change: 0 additions & 1 deletion files.txt

This file was deleted.

37 changes: 37 additions & 0 deletions paper.md
@@ -0,0 +1,37 @@
---
title: 'BioPandas: Working with molecular structures in pandas DataFrames'
tags:
- bioinformatics
- computational biology
- protein structure analysis
- protein-ligand docking
- virtual screening
authors:
- name: Sebastian Raschka
orcid: 0000-0001-6989-4493
affiliation: 1
affiliations:
- name: Michigan State University, East Lansing, USA
index: 1
date: 31 May 2017
---



# Summary

BioPandas is a Python library that reads molecular structures from 3D-coordinate files, such as PDB (Bernstein *et al.* 1977) and MOL2 (Tripos 2007), into pandas DataFrames (McKinney 2010) for convenient data analysis and data-mining tasks.

Beyond parsing protein and small-molecule data into data frames, BioPandas provides utility functions for structure analysis. These cover common computations such as the root-mean-square deviation (RMSD) between structures and the conversion of protein structures into primary amino acid sequences.
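
As an illustration of the kind of computation meant here, the following is a minimal sketch of an RMSD calculation on two coordinate data frames using only pandas and NumPy. The helper `rmsd` and the column names `x_coord`, `y_coord`, and `z_coord` are assumptions made for this example. The sketch does not call the BioPandas API, and it assumes that both frames list the same atoms in the same order.

```python
import numpy as np
import pandas as pd


def rmsd(df1, df2, coord_cols=("x_coord", "y_coord", "z_coord")):
    """Root-mean-square deviation between two equally ordered atom tables."""
    cols = list(coord_cols)
    if len(df1) != len(df2):
        raise ValueError("both frames must contain the same number of atoms")
    # Per-atom coordinate differences, shape (n_atoms, 3).
    diff = df1[cols].to_numpy(dtype=float) - df2[cols].to_numpy(dtype=float)
    # Mean of the squared per-atom distances, then the square root.
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


# Toy structure with three atoms on the x axis.
ref = pd.DataFrame({
    "atom_name": ["N", "CA", "C"],
    "x_coord": [0.0, 1.0, 2.0],
    "y_coord": [0.0, 0.0, 0.0],
    "z_coord": [0.0, 0.0, 0.0],
})

# The same structure shifted by one unit along x.
moved = ref.copy()
moved["x_coord"] += 1.0

value = rmsd(ref, moved)
```

Because every atom in `moved` is displaced by exactly one unit, the RMSD of the toy example equals that displacement. No superposition step is performed, so the result depends on how the two structures are positioned relative to each other.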

Furthermore, BioPandas provides small-molecule functions for reading and parsing millions of small-molecule structures (from multi-MOL2 files) quickly and efficiently in virtual screening applications. Built-in functions for filtering molecules by the presence of functional groups and by the pairwise distances between those groups make BioPandas a particularly attractive utility library for virtual screening and protein-ligand docking.
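
The distance-based filtering described above can be sketched with plain pandas and NumPy. The helper `distance_to_point`, the column names `atom_type`, `x`, `y`, and `z`, and the 2.5 cutoff are assumptions chosen for illustration; this is not the BioPandas implementation.

```python
import numpy as np
import pandas as pd


def distance_to_point(df, xyz=(0.0, 0.0, 0.0)):
    """Euclidean distance of every atom in `df` to the reference point `xyz`."""
    coords = df[["x", "y", "z"]].to_numpy(dtype=float)
    sq_dist = ((coords - np.asarray(xyz, dtype=float)) ** 2).sum(axis=1)
    # Keep the original index so the result can be used as a boolean mask source.
    return pd.Series(np.sqrt(sq_dist), index=df.index)


# Toy small molecule with three atoms.
atoms = pd.DataFrame({
    "atom_type": ["O.3", "C.3", "N.am"],
    "x": [0.0, 3.0, 0.0],
    "y": [0.0, 4.0, 0.0],
    "z": [0.0, 0.0, 2.0],
})

dist = distance_to_point(atoms)

# Keep only atoms within the (hypothetical) cutoff of the reference point.
near = atoms[dist <= 2.5]
```

Returning a `Series` aligned with the input index lets the distances feed directly into ordinary pandas boolean indexing, which is the main convenience of holding structures in data frames.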


# References

**McKinney, Wes** (2010). Data structures for statistical computing in Python. *Proceedings of the 9th Python in Science Conference. Vol. 445.*

**Bernstein, Frances C., Thomas F. Koetzle, Graheme JB Williams, Edgar F. Meyer, Michael D. Brice, John R. Rodgers, Olga Kennard, Takehiko Shimanouchi, and Mitsuo Tasumi** (1977). The Protein Data Bank. *European Journal of Biochemistry 80, no. 2: 319-324.*

**Tripos, L.** (2007). Tripos Mol2 File Format. *St. Louis, MO: Tripos.*