# RDKit and Explicit Hydrogens in Mapped Reaction SMILES
This notebook demonstrates one issue with explicit hydrogens in RDKit. When the input is a mapped reaction SMILES, `Chem.AddHs` adds only some of the hydrogens. When the atom maps are stripped from the SMILES, it adds all of them, but the association between map numbers and atoms is then lost.

In [11]:
!pip install rdkit
from rdkit import Chem
from rdkit.Chem import AllChem
from collections import Counter



In [12]:
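def count_atoms(mol):
    """Count the atoms in `mol` by element symbol."""
    return Counter(atom.GetSymbol() for atom in mol.GetAtoms())

def show_atom_maps(mol, label):
    """Print the symbol, atom map number, and index of every atom in `mol`."""
    print(f"{label}:")
    for atom in mol.GetAtoms():
        print(f"  {atom.GetSymbol():>2} - map {atom.GetAtomMapNum()} - idx {atom.GetIdx()}")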

## Step 1: Load Mapped Reaction SMILES
We start from a mapped reaction SMILES that does **not** include explicit hydrogens: every mapped atom is written as a bracket atom with no hydrogen count.

In [13]:
rxn_smiles = "[C:1][C:2](=[O:3])O.[N:4][C:5]>>[C:1][C:2](=[O:3])[N:4][C:5]"
reaction = AllChem.ReactionFromSmarts(rxn_smiles, useSmiles=True)
reactants = [reaction.GetReactantTemplate(i) for i in range(reaction.GetNumReactantTemplates())]
products = [reaction.GetProductTemplate(i) for i in range(reaction.GetNumProductTemplates())]

### Atom counts before adding hydrogens

In [14]:
for i, mol in enumerate(reactants):
    print(f"Reactant {i}: {dict(count_atoms(mol))}")
for i, mol in enumerate(products):
    print(f"Product {i}: {dict(count_atoms(mol))}")

Reactant 0: {'C': 2, 'O': 2}
Reactant 1: {'N': 1, 'C': 1}
Product 0: {'C': 3, 'O': 1, 'N': 1}
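The counts above contain heavy atoms only. One way to see what the SMILES parser recorded about hydrogens is to look at each atom's no-implicit flag and total hydrogen count. The sketch below parses the same reactant SMILES as plain molecules with `Chem.MolFromSmiles` rather than as reaction templates, which is an assumption made to keep the example self-contained; `hydrogen_state` is a hypothetical helper, not part of RDKit.

```python
from rdkit import Chem

# Same reactant SMILES as in the reaction above, parsed as plain molecules.
reactant_smiles = ["[C:1][C:2](=[O:3])O", "[N:4][C:5]"]

def hydrogen_state(smiles):
    """Hypothetical helper: per-atom (symbol, map number, no-implicit flag, total H count)."""
    mol = Chem.MolFromSmiles(smiles)
    return [
        (a.GetSymbol(), a.GetAtomMapNum(), a.GetNoImplicit(), a.GetTotalNumHs())
        for a in mol.GetAtoms()
    ]

for smi in reactant_smiles:
    print(smi)
    for row in hydrogen_state(smi):
        print("  ", row)
```

Atoms written in brackets should report the no-implicit flag as set and a hydrogen count of zero, while the unbracketed oxygen keeps its implicit hydrogen.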


## Step 2: Rebuild molecules and add explicit hydrogens
We copy each template into a new molecule, sanitize it, and then call `Chem.AddHs` to add explicit hydrogens.

In [15]:
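def rebuild_and_add_hs(mol):
    mol = Chem.Mol(mol)    # copy, so the reaction template itself is left unchanged
    Chem.SanitizeMol(mol)  # compute valences and other properties that AddHs relies on
    mol = Chem.AddHs(mol)  # returns a new molecule with hydrogens as explicit atoms
    return mol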

reactants_h = [rebuild_and_add_hs(mol) for mol in reactants]
products_h = [rebuild_and_add_hs(mol) for mol in products]

### Atom counts after adding hydrogens
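Notice that only one hydrogen is added: the one on the hydroxyl oxygen of the carboxyl group, which is the only atom written without brackets. Reactant 1 and the product gain no hydrogens at all. In SMILES, a bracket atom such as `[C:1]` carries only the hydrogens written inside its brackets, so the mapped atoms are treated as having none to add.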

In [16]:
for i, mol in enumerate(reactants_h):
    print(f"Reactant {i}: {dict(count_atoms(mol))}")
for i, mol in enumerate(products_h):
    print(f"Product {i}: {dict(count_atoms(mol))}")

Reactant 0: {'C': 2, 'O': 2, 'H': 1}
Reactant 1: {'N': 1, 'C': 1}
Product 0: {'C': 3, 'O': 1, 'N': 1}
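To put the result above next to the claim in the introduction, the sketch below runs `Chem.AddHs` on three spellings of the first reactant: the mapped form used in this notebook, the same molecule with the maps stripped, and a mapped form with hydrogen counts written inside the brackets. The third spelling is an illustration added here, not something taken from the original reaction SMILES, and `summarize_after_addhs` is a hypothetical helper name.

```python
from rdkit import Chem

def summarize_after_addhs(smiles):
    """Hypothetical helper: (number of H atoms, heavy-atom map numbers) after AddHs."""
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    n_h = sum(1 for a in mol.GetAtoms() if a.GetAtomicNum() == 1)
    maps = [a.GetAtomMapNum() for a in mol.GetAtoms() if a.GetAtomicNum() > 1]
    return n_h, maps

variants = {
    "mapped, no H counts": "[C:1][C:2](=[O:3])O",
    "unmapped": "CC(=O)O",
    # Assumption for illustration: H counts written by hand inside the brackets.
    "mapped, with H counts": "[CH3:1][C:2](=[O:3])[OH]",
}

for label, smi in variants.items():
    n_h, maps = summarize_after_addhs(smi)
    print(f"{label:>22}: {n_h} H, heavy-atom maps {maps}")
```

Under these assumptions, the unmapped form gains all four hydrogens of acetic acid but every map number is 0, the mapped form without hydrogen counts keeps its maps but gains a single hydrogen, and the form with explicit hydrogen counts keeps both.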
