Object for working with Tripos Mol2 structure files.


  • df : pandas.DataFrame

    DataFrame of a Mol2's ATOM section

  • mol2_text : str

    Mol2 file contents in string format

  • code : str

    ID, code, or name of the molecule stored

  • mol2_path : str

    Location of the Mol2 file that was read in via read_mol2


distance(xyz=(0.0, 0.0, 0.0))

Computes Euclidean distance between atoms in self.df and a 3D point.


  • xyz : tuple (default: (0.00, 0.00, 0.00))

    X, Y, and Z coordinates of the reference center for the distance computation


  • pandas.Series

    Returned Series containing the Euclidean distance between each atom in the ATOM section and xyz.
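As a rough illustration of the arithmetic only, the plain-Python sketch below computes the distance from each atom to a reference point. It is not the library's implementation: the `atoms` list and the `distances_to` helper are hypothetical stand-ins for the `x`, `y`, and `z` columns of self.df.

```python
import math

# Hypothetical stand-in for the x, y, z columns of an ATOM section.
atoms = [
    {"atom_name": "C1", "x": 3.0, "y": 4.0, "z": 0.0},
    {"atom_name": "C2", "x": 1.0, "y": 2.0, "z": 2.0},
]


def distances_to(atoms, xyz=(0.0, 0.0, 0.0)):
    """Return the Euclidean distance of every atom to the point xyz."""
    return [math.dist((a["x"], a["y"], a["z"]), xyz) for a in atoms]


dists = distances_to(atoms)
print(dists)  # [5.0, 3.0]
```

With a loaded PandasMol2 object (called `pmol` here for illustration), the documented call would be `pmol.distance(xyz=(0.0, 0.0, 0.0))`.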

distance_df(df, xyz=(0.0, 0.0, 0.0))

Computes Euclidean distance between atoms and a 3D point.


  • df : DataFrame

    DataFrame containing entries similar to the PandasMol2.df format for the distance computation to the xyz reference coordinates.

  • xyz : tuple (default: (0.00, 0.00, 0.00))

    X, Y, and Z coordinates of the reference center for the distance computation


  • pandas.Series

    Returned Series containing the Euclidean distance between each atom in df and xyz.
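One typical use of per-atom distances is selecting the atoms that lie within a cutoff of a point. The sketch below shows that filtering step in plain Python. The `rows` data and the `within_cutoff` helper are hypothetical and only mimic a table with `x`, `y`, and `z` columns.

```python
import math

# Hypothetical rows with the x, y, z keys that a distance computation needs.
rows = [
    {"atom_id": 1, "x": 0.0, "y": 0.0, "z": 1.0},
    {"atom_id": 2, "x": 0.0, "y": 6.0, "z": 8.0},
    {"atom_id": 3, "x": 2.0, "y": 0.0, "z": 0.0},
]


def within_cutoff(rows, xyz, cutoff):
    """Keep the rows whose distance to xyz is at most cutoff."""
    return [
        r for r in rows
        if math.dist((r["x"], r["y"], r["z"]), xyz) <= cutoff
    ]


near = within_cutoff(rows, xyz=(0.0, 0.0, 0.0), cutoff=2.5)
near_ids = [r["atom_id"] for r in near]
print(near_ids)  # [1, 3]
```

With pandas, the same idea is usually written as a boolean mask over the Series returned by distance_df.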

read_mol2(path, columns=None)

Reads Mol2 files (unzipped or gzipped) from a local drive.

Note that if your Mol2 file contains more than one molecule,
only the first molecule is loaded into the DataFrame.


  • path : str

    Path to the Mol2 file in .mol2 format or gzipped format (.mol2.gz)

  • columns : dict or None (default: None)

    If None, this method expects a 9-column ATOM section that contains the following columns:

    {0:('atom_id', int), 1:('atom_name', str), 2:('x', float), 3:('y', float), 4:('z', float), 5:('atom_type', str), 6:('subst_id', int), 7:('subst_name', str), 8:('charge', float)}

    If your Mol2 files are formatted differently, you can provide your own column-mapping dictionary in a format similar to the one above. Note, however, that not all PandasMol2 methods may be supported in that case.
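To make the shape of the columns dictionary concrete, the sketch below applies the default mapping to a single whitespace-delimited ATOM line. `parse_atom_line` is a hypothetical helper written for this illustration and is not part of the library. It only shows how each index maps to a column name and a type.

```python
# Default mapping as documented above: column index -> (name, type).
COLUMNS = {
    0: ("atom_id", int), 1: ("atom_name", str),
    2: ("x", float), 3: ("y", float), 4: ("z", float),
    5: ("atom_type", str), 6: ("subst_id", int),
    7: ("subst_name", str), 8: ("charge", float),
}


def parse_atom_line(line, columns=COLUMNS):
    """Split one ATOM line on whitespace and cast each field by position."""
    fields = line.split()
    return {name: cast(fields[i]) for i, (name, cast) in columns.items()}


record = parse_atom_line(" 1 C1 -1.1786 2.7011 -4.0323 C.3 1 <0> -0.1537\n")
print(record["atom_name"], record["x"], record["charge"])
```

A custom mapping for differently formatted files would follow the same index -> (name, type) pattern, with entries added, removed, or retyped as needed.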



read_mol2_from_list(mol2_lines, mol2_code, columns=None)

Reads Mol2 file contents from a list of lines into a DataFrame.


  • mol2_lines : list

    A list of lines containing the Mol2 file contents. For example, ['@<TRIPOS>MOLECULE\n', 'ZINC38611810\n', ' 65 68 0 0 0\n', 'SMALL\n', 'NO_CHARGES\n', '\n', '@<TRIPOS>ATOM\n', ' 1 C1 -1.1786 2.7011 -4.0323 C.3 1 <0> -0.1537\n', ' 2 C2 -1.2950 1.2442 -3.5798 C.3 1 <0> -0.1156\n', ...]

  • mol2_code : str or None

    Name or ID of the molecule.

  • columns : dict or None (default: None)

    If None, this method expects a 9-column ATOM section that contains the following columns:

    {0:('atom_id', int), 1:('atom_name', str), 2:('x', float), 3:('y', float), 4:('z', float), 5:('atom_type', str), 6:('subst_id', int), 7:('subst_name', str), 8:('charge', float)}

    If your Mol2 files are formatted differently, you can provide your own column-mapping dictionary in a format similar to the one above. Note, however, that not all PandasMol2 methods may be supported in that case.
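The sketch below shows how such a list of lines is laid out, and how the ATOM records and the molecule name can be located in it. Tripos Mol2 files mark each section with a record type indicator such as `@<TRIPOS>MOLECULE` or `@<TRIPOS>ATOM`. The `atom_section` helper is hypothetical, the list is shortened to two atoms, and the bond line is invented for illustration.

```python
# Shortened, illustrative Mol2 contents as a list of lines.
mol2_lines = [
    "@<TRIPOS>MOLECULE\n", "ZINC38611810\n", " 65 68 0 0 0\n",
    "SMALL\n", "NO_CHARGES\n", "\n",
    "@<TRIPOS>ATOM\n",
    " 1 C1 -1.1786 2.7011 -4.0323 C.3 1 <0> -0.1537\n",
    " 2 C2 -1.2950 1.2442 -3.5798 C.3 1 <0> -0.1156\n",
    "@<TRIPOS>BOND\n",
    " 1 1 2 1\n",  # made-up bond record, only here to end the ATOM section
]


def atom_section(lines):
    """Return the non-empty lines between @<TRIPOS>ATOM and the next marker."""
    section, inside = [], False
    for line in lines:
        if line.startswith("@<TRIPOS>"):
            inside = line.strip() == "@<TRIPOS>ATOM"
            continue
        if inside and line.strip():
            section.append(line)
    return section


atom_lines = atom_section(mol2_lines)
mol2_code = mol2_lines[1].strip()  # the name follows the MOLECULE marker
print(mol2_code, len(atom_lines))  # ZINC38611810 2
```

A list like `mol2_lines`, together with `mol2_code`, is what read_mol2_from_list takes as its first two arguments.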



rmsd(df1, df2, heavy_only=True)

Computes the Root Mean Square Deviation between molecules.


  • df1 : pandas.DataFrame

    DataFrame of a Mol2 ATOM section, in the PandasMol2.df format

  • df2 : pandas.DataFrame

    Second DataFrame for RMSD computation against df1. Must have the same number of entries as df1.

  • heavy_only : bool (default: True)

    Which atoms to compare to compute the RMSD. If True (default), computes the RMSD between non-hydrogen atoms only.


  • rmsd : float

    Root Mean Square Deviation between df1 and df2
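For orientation, the sketch below spells out the textbook RMSD over paired atoms, with an optional heavy-atom filter. It rests on stated assumptions and is not the library's code: atoms are paired by position, hydrogens are taken to have the atom type 'H', and the sum is divided by the number of compared atoms. `rmsd_sketch` and the two toy molecules are hypothetical.

```python
import math


def rmsd_sketch(atoms1, atoms2, heavy_only=True):
    """RMSD over paired atoms; assumes both lists share the same order."""
    if len(atoms1) != len(atoms2):
        raise ValueError("both structures need the same number of atoms")
    pairs = list(zip(atoms1, atoms2))
    if heavy_only:
        # Assumption: hydrogens carry the atom type 'H'.
        pairs = [(a, b) for a, b in pairs if a["atom_type"] != "H"]
    sq = sum((a[k] - b[k]) ** 2 for a, b in pairs for k in ("x", "y", "z"))
    return math.sqrt(sq / len(pairs))


mol_a = [
    {"atom_type": "C.3", "x": 0.0, "y": 0.0, "z": 0.0},
    {"atom_type": "C.3", "x": 1.0, "y": 0.0, "z": 0.0},
    {"atom_type": "H", "x": 2.0, "y": 0.0, "z": 0.0},
]
mol_b = [
    {"atom_type": "C.3", "x": 0.0, "y": 0.0, "z": 2.0},
    {"atom_type": "C.3", "x": 1.0, "y": 0.0, "z": 2.0},
    {"atom_type": "H", "x": 2.0, "y": 0.0, "z": 5.0},
]

heavy = rmsd_sketch(mol_a, mol_b)  # heavy atoms only: 2.0
all_atoms = rmsd_sketch(mol_a, mol_b, heavy_only=False)  # sqrt(11)
print(heavy, round(all_atoms, 4))
```

Note that neither structure is superimposed onto the other here. The value reflects the coordinates exactly as given.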



Accesses the pandas DataFrame.