The CHANGELOG for the current development version is available at https://github.com/rasbt/biopandas/blob/main/docs/sources/CHANGELOG.md.
- Remove walrus operator for Python 3.7 compatibility.
- Adds support for extracting structures from PDB files containing multiple models. See the documentation for details. (via Arian Jamasb, PR #101).
- Adds support for fetching mmCIF (`PandasMmcif().fetch_mmcif(uniprot_id='Q5VSL9', source='alphafold2-v2')`) and PDB structures (e.g., `PandasPdb().fetch_pdb(uniprot_id='Q5VSL9', source="alphafold2-v2")`) (via Arian Jamasb, PR #102).
- Instead of raising a warning when no ATOM entries are loaded, raise the warning only when neither ATOM nor HETATM entries are loaded.
- None
- Adds support for parsing mmCIF protein structure files (via Arian Jamasb, PR #94)
- Fixes a bug where coordinates with more than 4 digits before the decimal point caused a column shift when saving a PDB file. (via PR #90)
- Fixes a bug where the `invert` parameter in `get_carbon` was selecting the wrong case. (via Arian Jamasb, PR #96)
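
The column-shift fix above concerns the fixed-width coordinate fields of the PDB format, where each of x, y, and z occupies 8 characters with 3 decimals. The following is a minimal sketch of one way to keep a large value inside its field by trading decimals for width. The function name `format_pdb_coordinate` is hypothetical, and this is not necessarily how BioPandas implements the fix.

```python
def format_pdb_coordinate(value, width=8):
    """Format a coordinate so that it never exceeds `width` characters.

    Hypothetical helper for illustration; not part of the BioPandas API.
    """
    # Try the usual three decimals first, then give up precision
    # one digit at a time until the text fits into the field.
    for precision in (3, 2, 1, 0):
        text = f"{value:{width}.{precision}f}"
        if len(text) <= width:
            return text
    raise ValueError(f"{value!r} does not fit into {width} characters")


print(repr(format_pdb_coordinate(1.5)))        # fits with three decimals
print(repr(format_pdb_coordinate(12345.678)))  # needs reduced precision
```

A plain `"%8.3f"` format would emit nine characters for the second value, which pushes every later column one position to the right.
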
- Now also allow `.ent` and `.ent.gz` file endings for PDB files. (via PR #82)
- Added Python 3.8 and 3.9 to setup.py in order to support these versions via conda-forge. (via PR #87)
- A `PandasPdb.read_pdb_from_list` method was added analogous to the existing `PandasMol2.read_mol2_from_list` (via PR #72 by dominiquesydow)
- `ValueError` raising and improved file format error messages for `read_pdb` and `read_mol2` functionality. (via PR #73 by dominiquesydow)
- Fix Manifest file to include license file in the PyPI tar.gz file so that BioPandas can be packaged by conda-forge.
- Uses more modern `https` queries for the RCSB server via the `fetch_pdb` function.
- Updates the documentation (incl. a code of conduct)
- The `PandasPdb.amino3to1` method now also considers insertion codes when converting the amino acid codes; before, inserted amino acids were skipped.
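
To illustrate why insertion codes matter for the conversion described above, here is a minimal sketch in plain Python. It assumes atom rows given as dictionaries with the keys `chain_id`, `residue_number`, `insertion`, and `residue_name`; the function `residues_to_one_letter` is hypothetical and is not the BioPandas implementation.

```python
# Standard 3-letter to 1-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def residues_to_one_letter(atom_rows, unknown="?"):
    """Return (chain_id, 1-letter code) pairs, one per residue.

    Hypothetical helper: a residue is identified by chain, residue
    number, AND insertion code, so inserted residues are not skipped.
    """
    seen = set()
    result = []
    for row in atom_rows:
        key = (row["chain_id"], row["residue_number"], row["insertion"])
        if key in seen:
            continue  # another atom of a residue we already recorded
        seen.add(key)
        code = THREE_TO_ONE.get(row["residue_name"], unknown)
        result.append((row["chain_id"], code))
    return result


rows = [
    {"chain_id": "A", "residue_number": 10, "insertion": "", "residue_name": "GLY"},
    {"chain_id": "A", "residue_number": 10, "insertion": "", "residue_name": "GLY"},
    {"chain_id": "A", "residue_number": 10, "insertion": "A", "residue_name": "SER"},
    {"chain_id": "B", "residue_number": 1, "insertion": "", "residue_name": "ALA"},
]
print(residues_to_one_letter(rows))
```

If the key omitted the insertion code, residue `10A` would be treated as another atom of residue `10` and dropped from the sequence.
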
- Minor adjustments to address deprecation warnings in pandas >= 23.0
- `PandasMol2.distance_df` was added as a static method that allows distance computations based on external data frames, with its behavior otherwise similar to `PandasMol2.distance`.
- `PandasPdb.distance_df` was added as a static method that allows distance computations based on external data frames, with its behavior otherwise similar to `PandasPdb.distance`.
- `PandasPdb.distance` now supports multiple record sections to be considered (e.g., `records=('ATOM', 'HETATM')` to include both protein and ligand in a query). Now also defaults to `records=('ATOM', 'HETATM')` for consistency with the impute method.
- `PandasPdb.get(...)` now supports external data frames and lets the user specify the record section to be considered (e.g., `records=('ATOM', 'HETATM')` to include both protein and ligand in a query). Now also defaults to `records=('ATOM', 'HETATM')` for consistency with the impute method.
- The `section` parameter of `PandasPdb.impute_element(...)` was renamed to `records` for API consistency.
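
The idea behind a `records` argument for distance computations can be sketched in plain Python as follows. The sketch assumes rows stored as dictionaries with the keys `record_name`, `x_coord`, `y_coord`, and `z_coord`; the function `distances_to_point` is hypothetical and only mirrors the behavior described above, not the actual BioPandas code.

```python
import math


def distances_to_point(rows, xyz=(0.0, 0.0, 0.0), records=("ATOM", "HETATM")):
    """Euclidean distance of every selected row to the reference point `xyz`.

    Hypothetical helper: only rows whose record name is listed in
    `records` are considered, so protein and ligand atoms can be
    queried together or separately.
    """
    return [
        math.dist((row["x_coord"], row["y_coord"], row["z_coord"]), xyz)
        for row in rows
        if row["record_name"] in records
    ]


atoms = [
    {"record_name": "ATOM", "x_coord": 3.0, "y_coord": 4.0, "z_coord": 0.0},
    {"record_name": "HETATM", "x_coord": 0.0, "y_coord": 0.0, "z_coord": 2.0},
]
print(distances_to_point(atoms))                     # both sections
print(distances_to_point(atoms, records=("ATOM",)))  # protein atoms only
```
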
- Raises a meaningful error message if attempting to overwrite the `df` attributes of `PandasMol2` and `PandasPdb` directly.
- Added `PandasPdb.pdb_path` and `PandasMol2.mol2_path` attributes that store the location of the data file last read.
- The `rmsd` methods of `PandasMol2` and `PandasPdb` don't return a NaN anymore if the array indices of the two structures are different.
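
For reference, the root-mean-square deviation between two structures can be sketched in plain Python as below. The sketch pairs atoms purely by position, which is one way to make the result independent of any index labels; whether BioPandas does exactly this internally is an assumption, and the function `rmsd_by_position` is hypothetical.

```python
import math


def rmsd_by_position(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Hypothetical helper: the i-th atom of one structure is compared
    with the i-th atom of the other, regardless of any index labels.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both structures need the same non-zero number of atoms")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


first = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
second = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(rmsd_by_position(first, second))
```
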
- The `amino3to1` method of `biopandas.pdb.PandasPDB` objects now returns a pandas `DataFrame` instead of a pandas `Series` object. The returned data frame has two columns, `'chain_id'` and `'residue_name'`, where the former contains the chain ID of the amino acid and the latter contains the 1-letter amino acid code, respectively.
- Significant speed improvements of the `distance` method of both `PandasPdb` and `PandasMol2` (now about 300 percent faster than previously).
- The `amino3to1` method of `biopandas.pdb.PandasPDB` objects now handles multi-chain proteins correctly.
- The `amino3to1` method of `biopandas.pdb.PandasPDB` objects now also works as expected if the `'ATOM'` entry DataFrame contains disordered DataFrame indices or duplicate DataFrame index values.
- Added an `amino3to1` method to `PandasPdb` data frames to convert 3-letter amino acid codes to 1-letter codes.
- Added a `distance` method to `PandasPdb` data frames to compute the Euclidean distance between atoms and a reference point.
- Added the `PandasMol2` class for working with Tripos MOL2 files in pandas DataFrames.
- `PandasPDB` was renamed to `PandasPdb`.
- Raises a warning if `PandasPdb` is written to PDB and the ATOM or HETATM section contains unexpected columns; these columns will now be skipped.
- Added an `impute_element` method to `PandasPDB` objects to infer the Element Symbol from the Atom Name column.
- Added two new selection types for `PandasPDB` ATOM and HETATM coordinate sections: `'heavy'` and `'carbon'`.
- Include test data in the PyPI package; add install_requires for pandas.
- The `'hydrogen'` atom selection in `PandasPDB` methods is now based on the element type instead of the atom name.
- By default, the RMSD is now computed on all atoms unless a specific selection is defined.
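
The difference between selecting by element type and selecting by atom name can be sketched in plain Python. The sketch assumes rows with the keys `atom_name` and `element_symbol`, and it assumes that `'heavy'` means every non-hydrogen atom and `'carbon'` means element `C`. The function `select_atoms` is hypothetical and not part of the BioPandas API.

```python
def select_atoms(rows, selection):
    """Filter atom rows by element type.

    Hypothetical helper. Assumed meanings: 'hydrogen' keeps element H,
    'heavy' keeps everything except H, and 'carbon' keeps element C.
    """
    rules = {
        "hydrogen": lambda element: element == "H",
        "heavy": lambda element: element != "H",
        "carbon": lambda element: element == "C",
    }
    if selection not in rules:
        raise ValueError(f"unknown selection: {selection!r}")
    keep = rules[selection]
    return [row for row in rows if keep(row["element_symbol"])]


atoms = [
    {"atom_name": "CA", "element_symbol": "C"},
    {"atom_name": "1HB", "element_symbol": "H"},   # name does not start with H
    {"atom_name": "HG", "element_symbol": "HG"},   # mercury, not hydrogen
]
print([row["atom_name"] for row in select_atoms(atoms, "hydrogen")])
print([row["atom_name"] for row in select_atoms(atoms, "heavy")])
```

A check based on the first letter of the atom name would miss `1HB` and would wrongly count the mercury atom `HG` as hydrogen, which is why the element column is the more reliable source.
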
- Needed to bump the version number due to a bug in the PyPI setup.py script.
- Support for the old pandas sorting syntax (`DataFrame.sort` vs `DataFrame.sort_values`) incl. DeprecationWarning.
- Exception handling in tests if PDB goes down (which just happened).
- Added a separate ANISOU engine to handle those records correctly.
- First Release.