Read the full contents of CTAB .rdf files in Python. Captures RXN and MOL records using RDKit and reads additional data fields (including solvents/catalysts/agents).

deepmatterltd/rdfreader

RDF READER

User Guide

Installation

pip install rdfreader

Basic Usage

from rdfreader import RDFParser

rdf_file_name = "reactions.rdf"

with open(rdf_file_name, "r") as rdf_file:

    # create a RDFParser object, this is a generator that yields Reaction objects
    rdfreader = RDFParser(
        rdf_file,
        except_on_invalid_molecule=False,  # will return None instead of raising an exception if a molecule is invalid
        except_on_invalid_reaction=False,  # will return None instead of raising an exception if a reaction is invalid 
    )

    for rxn in rdfreader:
        if rxn is None:
            continue # the parser failed to read the reaction, go to the next one
  
        # rxn is a Reaction object; it has several attributes, including:
        print(rxn.smiles) # reaction SMILES string
        print(rxn.properties) # a dictionary of properties extracted from the RXN record
        
        reactants = rxn.reactants # a list of Molecule objects
        products = rxn.products
        solvents = rxn.solvents 
        catalysts = rxn.catalysts 
 
        # Molecule objects have several attributes, including:
        print(reactants[0].smiles)
        print(reactants[0].properties) # a dictionary of properties extracted from the MOL record (often empty)
        reactants[0].rd_mol # an RDKit molecule object
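
As a possible next step, the loop above can be wrapped in a small helper that gathers one summary per reaction and counts the records the parser skipped. This is a minimal sketch, not part of rdfreader: `summarise_reactions` and the stand-in objects are hypothetical, and the helper relies only on the attributes shown in the usage example (`smiles`, `reactants`, `products`, `solvents`, `catalysts`). It also assumes every molecule in those lists was read successfully.

```python
from types import SimpleNamespace


def summarise_reactions(reactions):
    """Collect a small summary dict for each successfully parsed reaction.

    `reactions` is any iterable yielding None or objects exposing the
    attributes used below (hypothetical helper, not part of rdfreader).
    """
    summaries = []
    skipped = 0
    for rxn in reactions:
        if rxn is None:  # the parser could not read this record
            skipped += 1
            continue
        summaries.append(
            {
                "smiles": rxn.smiles,
                "n_reactants": len(rxn.reactants),
                "n_products": len(rxn.products),
                "solvents": [mol.smiles for mol in rxn.solvents],
                "catalysts": [mol.smiles for mol in rxn.catalysts],
            }
        )
    return summaries, skipped


def _mol(smiles):
    """Stand-in for a Molecule object, used only to make the sketch runnable."""
    return SimpleNamespace(smiles=smiles, properties={})


# Stand-in reaction so the sketch runs without an .rdf file.
fake_rxn = SimpleNamespace(
    smiles="CCO.CC(=O)O>>CC(=O)OCC",
    reactants=[_mol("CCO"), _mol("CC(=O)O")],
    products=[_mol("CC(=O)OCC")],
    solvents=[_mol("C1CCOC1")],
    catalysts=[],
)

summaries, skipped = summarise_reactions([fake_rxn, None])
print(summaries[0]["n_reactants"], skipped)
```

With real data, pass the `RDFParser` instance to `summarise_reactions` inside the `with open(...)` block, since the parser reads from the open file as it is iterated.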

Developer Guide

The project is managed and packaged using poetry.

Installation

git clone https://github.com/deepmatterltd/rdfreader
cd rdfreader
poetry install  # create a virtual environment and install the project dependencies
pre-commit install  # install pre-commit hooks, which mostly manage code style

Contributions

Contributions are welcome via the fork and pull request model.

Before you commit changes, ensure they pass the hooks installed by pre-commit. The hooks run automatically on each commit if you have run pre-commit install, and can also be run manually from the terminal with pre-commit run.

Releases

Releases are managed through GitHub releases and a GitHub workflow. The version number in the pyproject.toml file should ideally match the current release, but the release workflow ignores it.

To release a new version:

  • Update the pyproject.toml version number.
  • Push the changes to GitHub and merge to main via a pull request.
  • Use the GitHub website to create a release. Tag the commit to be released with a version number, e.g. v1.2.3. The tag should have the form v*.*.* and match the version number in the pyproject.toml file.
  • When the release is published, a GitHub workflow will run, build a wheel and publish it to PyPI.
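
Because the workflow does not check the pyproject.toml version, it can help to compare the tag and the version by hand before publishing. The following is a minimal sketch of such a check, not part of the project: `tag_matches_version` and `read_pyproject_version` are hypothetical helpers, the reader assumes Python 3.11+ (for `tomllib`), and it assumes the version is stored under `[tool.poetry]` or `[project]`.

```python
import re

# A release tag is expected to look like v1.2.3.
TAG_PATTERN = re.compile(r"v(\d+\.\d+\.\d+)")


def tag_matches_version(tag, version):
    """Return True if `tag` has the form vX.Y.Z and X.Y.Z equals `version`."""
    match = TAG_PATTERN.fullmatch(tag)
    return match is not None and match.group(1) == version


def read_pyproject_version(path="pyproject.toml"):
    """Read the package version from pyproject.toml.

    Assumptions: Python 3.11+ (tomllib) and a version stored under
    [tool.poetry] or [project].
    """
    import tomllib  # imported here so the tag check works on older Pythons

    with open(path, "rb") as fh:
        data = tomllib.load(fh)
    poetry = data.get("tool", {}).get("poetry", {})
    return poetry.get("version") or data.get("project", {}).get("version")


print(tag_matches_version("v1.2.3", "1.2.3"))  # True
print(tag_matches_version("1.2.3", "1.2.3"))   # False: missing the leading "v"
print(tag_matches_version("v1.2.3", "1.2.4"))  # False: version mismatch
```

From the repository root, `tag_matches_version("v1.2.3", read_pyproject_version())` would then compare a planned tag against the file.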

Example Data

You can find example data in the test/resources directory. spresi-100.rdf contains 100 example records from SPRESI.
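
One way to sanity-check a parse of the example data is to compare the number of reactions the parser yields with the number of reaction records in the file. The sketch below counts records without rdfreader, relying on the RD file convention that each reaction record begins with a line starting with `$RFMT`. `count_reaction_records` is a hypothetical helper, and the sample lines are a bare skeleton rather than valid records.

```python
def count_reaction_records(lines):
    """Count lines that open a reaction record (those starting with $RFMT)."""
    return sum(1 for line in lines if line.startswith("$RFMT"))


# Skeleton of an RD file: header lines followed by two record openers.
# The record bodies are omitted, so this is not a parseable file.
sample_lines = [
    "$RDFILE 1",
    "$DATM 01/01/2024 00:00",
    "$RFMT $RIREG 1",
    "$RXN",
    "$RFMT $RIREG 2",
    "$RXN",
]

print(count_reaction_records(sample_lines))
```

An open file can be passed directly, since iterating over a text file yields its lines. For spresi-100.rdf the count should be 100 if the file follows this convention.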
