## DFT Screening - Construction of the chemical space

**Author:** Quentin Duez

This notebook generates Gaussian input files for DFT calculations and parses chemical descriptors from the resulting DFT *.log files.

**Outputs:** DFT input files, plus 'global_data.xlsx' and 'atom_data.xlsx', which are written after the descriptors have been parsed from the DFT *.log files.

**Disclaimer:**
Parts of this notebook correspond to code available in the auto-qchem GitHub repository.

Please refer to the [original repository](https://github.com/doyle-lab-ucla/auto-qchem/) and the [related publication](https://pubs.rsc.org/en/content/articlelanding/2022/re/d2re00030j#!divCitation).

In [3]:
import glob
import logging
import os

import numpy as np
import pandas as pd
from rdkit import Chem

from autoqchem.gaussian_input_generator import gaussian_input_generator
from autoqchem.gaussian_log_extractor import *  # star import; the `logger` used below comes from here
from autoqchem.molecule import molecule
from autoqchem.rdkit_utils import *

logging.basicConfig(level=logging.INFO)

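RMSD_threshold = 0.35  # conformers closer than this RMSD are treated as duplicates
k_in_kcal_per_mol_K = 0.0019872041  # gas constant in kcal/(mol K)
Hartree_in_kcal_per_mol = 627.5  # rounded conversion factor (more precisely 627.5095)
T = 298  # temperature in K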

### Read diamine list and prepare files for DFT calculations

In [2]:
batch_df = pd.read_excel("Diamines_batch_all.xlsx")
smiles_str_list = batch_df["SMILES"].values

In [4]:
# Check that all SMILES strings are unique


def allUnique(x):
    # A set drops repeated entries, so equal lengths mean there are no duplicates
    return len(set(x)) == len(x)


print(allUnique(smiles_str_list))

True
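
The check above compares raw SMILES strings, so it tells you whether a string is repeated but not which one. Two different strings can also describe the same molecule, which a string comparison cannot detect. The following is a minimal sketch, assuming a plain list of SMILES strings, that lists the repeated entries; the `find_duplicates` helper and the example list are hypothetical and not part of the original notebook.

```python
from collections import Counter


def find_duplicates(smiles_list):
    """Return the SMILES strings that occur more than once, with their counts."""
    counts = Counter(smiles_list)
    return {smiles: n for smiles, n in counts.items() if n > 1}


# Hypothetical example input, not the notebook's diamine list
example = ["NCCN", "NCCCN", "NCCN", "NCCCCN"]
print(find_duplicates(example))
```

An empty dictionary means every string is unique, which corresponds to `allUnique` returning `True`.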


In [5]:
# Generate up to 5 conformers per compound
mols = [molecule(s, num_conf=5) for s in smiles_str_list]




### Preparing files for DFT calculations

In [7]:
def gaussprep(
    molecule,
    workflow_type="equilibrium",
    theory="APFD",
    solvent="None",
    light_basis_set="6-31G*",
    heavy_basis_set="LANL2DZ",
    generic_basis_set="genecp",
    max_light_atomic_number=36,
    wall_time="23:59:00",
):
    # One output folder per molecule, named after its InChIKey
    molecule_workdir = molecule.inchikey
    gig = gaussian_input_generator(
        molecule,
        workflow_type,
        molecule_workdir,
        theory,
        solvent,
        light_basis_set,
        heavy_basis_set,
        generic_basis_set,
        max_light_atomic_number,
    )

    gig.create_gaussian_files()


for mol in mols:
    gaussprep(mol, theory="B3LYP", light_basis_set="6-31G**", solvent="water")

INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 5 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 2 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 2 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 2 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 1 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 2 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 4 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 2 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 3 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input files for 4 conformations.
INFO:autoqchem.gaussian_input_generator:Generating Gaussian input file

### Store all the DFT folders in a folder called DFT_Data and run the DFT calculations
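
The input generation step writes one folder per molecule, named after its InChIKey, and the parsing cells later look for the logs under `DFT_Data/`. The sketch below shows one possible way to gather those folders, assuming they sit in the current working directory; the `collect_dft_folders` helper and the regular expression are illustrative assumptions, not part of the notebook or of auto-qchem.

```python
import re
import shutil
from pathlib import Path

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, separated by hyphens
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


def collect_dft_folders(source_dir=".", target_name="DFT_Data"):
    """Move every InChIKey-named folder in source_dir into source_dir/target_name."""
    source = Path(source_dir)
    target = source / target_name
    target.mkdir(exist_ok=True)
    moved = []
    # sorted() materializes the listing before any folder is moved
    for entry in sorted(source.iterdir()):
        if entry.is_dir() and INCHIKEY_PATTERN.match(entry.name):
            shutil.move(str(entry), str(target / entry.name))
            moved.append(entry.name)
    return moved


# Usage (hypothetical): collect_dft_folders(".") returns the names of the moved folders
```

Folders whose names do not look like an InChIKey are left where they are, so unrelated directories in the working directory are not touched.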

### Parsing the DFT *.log files to collect descriptors
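
Before descriptors are collected, near-duplicate conformers are removed with `prune_rmsds` and `RMSD_threshold`. The idea can be illustrated with a greedy filter over a precomputed pairwise RMSD matrix. This is only a sketch of the general technique under that assumption; it is not auto-qchem's implementation of `prune_rmsds`, and the `prune_by_rmsd` name and the matrix values are hypothetical.

```python
def prune_by_rmsd(rmsd_matrix, threshold):
    """Greedy pruning: keep a conformer only if it is at least `threshold`
    away from every conformer that has already been kept."""
    keep = []
    for i, row in enumerate(rmsd_matrix):
        if all(row[j] >= threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical pairwise RMSD values for three conformers
rmsd_matrix = [
    [0.00, 0.10, 1.20],
    [0.10, 0.00, 1.15],
    [1.20, 1.15, 0.00],
]
print(prune_by_rmsd(rmsd_matrix, threshold=0.35))  # → [0, 2]
```

Here conformer 1 is dropped because it lies within 0.35 of conformer 0, which is the kind of case the parsing output reports as a duplicate conformer.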

In [9]:
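def rdmol_from_mol(mol, postDFT=True):
    """Rebuild an RDKit molecule from the geometries in the Gaussian log files of `mol`.

    Returns the RDKit molecule, the conformer free energies (kcal/mol) and the conformer keys.
    """
    # Use only the first block of the inchikey because the second block might
    # change (rdkit rearranges the charges)
    mol_name = mol.inchikey[:14]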
    elements, connectivity_matrix, charges = (
        mol.elements,
        mol.connectivity_matrix,
        mol.charges,
    )
    conformer_coordinates = []
    energies = []
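    # glob returns files in arbitrary order; sort so that list positions match the
    # `_conf_{index}` numbering in the file names (valid for fewer than 10 conformers)
    loglist = sorted(glob.glob(f"DFT_Data/{mol_name}*/*.log"))
    keys = list(range(len(loglist)))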
    print(keys)
    for logfile in loglist:
        # print(logfile)
        le = gaussian_log_extractor(logfile)
        le.check_for_exceptions()
        le.get_atom_labels()
        # verify that the labels are in the same order in gaussian after running it
        if tuple(le.labels) != tuple(elements):
            logger.warning(f"Atom labels in {logfile} do not match the input order.")
        le.get_geometry()
        conformer_coordinates.append(le.geom[list("XYZ")].values)
        le.get_descriptors()
        energies.append(le.descriptors["G"] * Hartree_in_kcal_per_mol)

    rdmol = get_rdkit_mol(elements, conformer_coordinates, connectivity_matrix, charges)
    return rdmol, energies, keys
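
Because `rdmol_from_mol` matches folders by the first 14 characters of the InChIKey, two molecules that share this block, such as stereoisomers, would match the same glob pattern. A small sketch for spotting such cases before parsing follows; the `first_block_collisions` helper is hypothetical, and the example keys are copied from the parsing output of this notebook.

```python
from collections import defaultdict


def first_block_collisions(inchikeys):
    """Group InChIKeys by their first 14-character block and return shared blocks."""
    groups = defaultdict(list)
    for key in inchikeys:
        groups[key[:14]].append(key)
    return {block: keys for block, keys in groups.items() if len(keys) > 1}


# InChIKeys taken from the parsing output of this notebook
example_keys = [
    "PIICEJLVQHRZGT-UHFFFAOYSA-N",
    "IGSBHTZEJMPDSZ-WCBJTDJXSA-N",
    "IGSBHTZEJMPDSZ-JRULLXGZSA-N",
]
print(first_block_collisions(example_keys))
```

Any block listed in the result deserves a manual check that each molecule's logs sit in a folder the glob pattern can tell apart.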

In [10]:
global_df = pd.DataFrame()
atomic_df = pd.DataFrame()
for mol in mols:
    mol_data = {
        "can": mol.can,
        "inchi": mol.inchi,
        "inchikey": mol.inchikey,
        "elements": mol.elements,
        "charges": mol.charges.tolist(),
        "connectivity_matrix": mol.connectivity_matrix.flatten().tolist(),
    }
    print(mol.can, mol.inchikey)
    rdmol, energies, keys = rdmol_from_mol(mol)
    keep = prune_rmsds(rdmol, RMSD_threshold)
    logger.info(
        f"Molecule {mol.inchikey} has {len(keys) - len(keep)} / {len(keys)} duplicate conformers."
    )
    mol_name = mol.inchikey
    mol_name = mol_name[:14]

    # loop over the conformers and extract info
    conformations = []
    loglist = glob.glob(f"DFT_Data/{mol_name}*/*.log")
    for index in range(len(loglist)):
        # Skip conformers that were pruned as duplicates
        if index not in keep:
            continue
        # extract descriptors for this conformer from log file
        log = glob.glob(f"DFT_Data/{mol_name}*/{mol_name}*_conf_{index}.log")
        le = gaussian_log_extractor(log[0])
        # add descriptors to conformations list
        conformations.append(le.get_descriptors())

    # compute weights
    free_energies = np.array(
        [Hartree_in_kcal_per_mol * c["descriptors"]["G"] for c in conformations]
    )  # in kcal_mol
    free_energies -= free_energies.min()  # to avoid huge exponentials
    weights = np.exp(-free_energies / (k_in_kcal_per_mol_K * T))
    weights /= weights.sum()

    mol_global_df = pd.DataFrame()
    mol_atomic_df = pd.DataFrame()

    # enumerate avoids comparing descriptor dictionaries, which list.index would do
    for index, c in enumerate(conformations):
        global_data = {
            "can": mol_data["can"],
            "inchi": mol_data["inchi"],
            "inchi_key": mol_data["inchikey"],
            "index": keys[index],
            "weight": weights[index],
        }
        global_data.update(c["descriptors"])
        temp_df = pd.DataFrame.from_dict(global_data, orient="index").T
        mol_global_df = pd.concat([mol_global_df, temp_df], ignore_index=True)
        atomic_data = {
            "can": mol_data["can"],
            "inchi": mol_data["inchi"],
            "inchi_key": mol_data["inchikey"],
            "elements": mol_data["elements"],
            "connectivity_matrix": mol_data["connectivity_matrix"],
            "index": keys[index],
            "weight": weights[index],
        }
        atomic_data.update(c["atom_descriptors"])
        temp_df = pd.DataFrame.from_dict(atomic_data, orient="index").T
        mol_atomic_df = pd.concat([mol_atomic_df, temp_df], ignore_index=True)

    global_df = pd.concat([global_df, mol_global_df], ignore_index=True)
    atomic_df = pd.concat([atomic_df, mol_atomic_df], ignore_index=True)

global_df.to_excel("global_data.xlsx")
atomic_df.to_excel("atom_data.xlsx")

NCCCCCCCCCCCCN QFTYSVGGYOXFRQ-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule QFTYSVGGYOXFRQ-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1C(N)CCCC1 SSJXIUAHEKJCMH-WDSKDSINSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule SSJXIUAHEKJCMH-WDSKDSINSA-N has 0 / 2 duplicate conformers.


NCCCN XFNJVJPLKCPIBV-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule XFNJVJPLKCPIBV-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


CC(N)CN AOHJOMMDDJHIJH-GSVOUGTGSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule AOHJOMMDDJHIJH-GSVOUGTGSA-N has 0 / 2 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule GEYOCULIXLDCMW-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NC1=CC=CC=C1N GEYOCULIXLDCMW-UHFFFAOYSA-N
[0]
NCCCCN KIDHWZJUCRJVML-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule KIDHWZJUCRJVML-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


CNCCOCCNC VXPJBVRYAHYMNY-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule VXPJBVRYAHYMNY-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1=CC=CC2=C(N)C=CC=C12 KQSABULTKYLFEV-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule KQSABULTKYLFEV-UHFFFAOYSA-N has 1 / 2 duplicate conformers.


NCCCCCN VHRGRCVQAFMJIZ-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule VHRGRCVQAFMJIZ-UHFFFAOYSA-N has 0 / 4 duplicate conformers.


NCCCCCCN NAQMVNRVTILPCV-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule NAQMVNRVTILPCV-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


NCCCCCCCN PWSKHLMYTZNYKO-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule PWSKHLMYTZNYKO-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1=CC=CC2=CC=CC(N)=C12 YFOOEYJGMMJJLS-UHFFFAOYSA-N
[0]


INFO:autoqchem.gaussian_log_extractor:Molecule YFOOEYJGMMJJLS-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NCCCCCCCCN PWGJDPKCLMLPJW-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule PWGJDPKCLMLPJW-UHFFFAOYSA-N has 0 / 5 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule PIICEJLVQHRZGT-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NCCN PIICEJLVQHRZGT-UHFFFAOYSA-N
[0]
NC1=C(N)C=C2C=CC=CC2=C1 XTBLDMQMUSHDEN-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule XTBLDMQMUSHDEN-UHFFFAOYSA-N has 0 / 2 duplicate conformers.


CC(N)CCCN(CC)CC CAPCBAYULRXQAN-VIFPVBQESA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule CAPCBAYULRXQAN-VIFPVBQESA-N has 0 / 5 duplicate conformers.


NCC1=CC=CC=C1N GVOYKJPMUUJXBS-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule GVOYKJPMUUJXBS-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


NCCCN(CC)CC QOHMWDJIBGVPIF-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule QOHMWDJIBGVPIF-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


CC1=CC=C(N)C(N)=C1 DGRGLKZMKWPMOH-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule DGRGLKZMKWPMOH-UHFFFAOYSA-N has 0 / 2 duplicate conformers.


NCCCN(C)C IUNMPGNGSSIWFP-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule IUNMPGNGSSIWFP-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


NC1=CC(C)=C(C)C=C1N XSZYBMMYQCYIPC-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule XSZYBMMYQCYIPC-UHFFFAOYSA-N has 0 / 2 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule BXIXXXYDDJVHDL-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NC1=CC=C(Cl)C=C1N BXIXXXYDDJVHDL-UHFFFAOYSA-N
[0]


INFO:autoqchem.gaussian_log_extractor:Molecule KWEWNOOZQVJONF-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NC1=CC=C(F)C=C1N KWEWNOOZQVJONF-UHFFFAOYSA-N
[0]
NC1=CC=C(OC)C=C1N AGAHETWGCFCMDK-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule AGAHETWGCFCMDK-UHFFFAOYSA-N has 2 / 3 duplicate conformers.


NC1=CC=C([N+]([O-])=O)C=C1N RAUWPNXIALNKQM-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule RAUWPNXIALNKQM-UHFFFAOYSA-N has 0 / 2 duplicate conformers.


NC1=CC=C(C2=CC=C(N)C=C2)C=C1 HFACYLZERDEVSX-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule HFACYLZERDEVSX-UHFFFAOYSA-N has 1 / 3 duplicate conformers.


NCCSSCCN APQPRKLAWCIJEK-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule APQPRKLAWCIJEK-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NCCNC1=C2C=CC=CC2=CC=C1 NULAJYZBOLVQPQ-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule NULAJYZBOLVQPQ-UHFFFAOYSA-N has 1 / 5 duplicate conformers.


CN(C)CCN(C)C KWYHDKDOAIKMQN-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule KWYHDKDOAIKMQN-UHFFFAOYSA-N has 1 / 4 duplicate conformers.


NCCN(CC)CC UDGSVBYJWHOHNN-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule UDGSVBYJWHOHNN-UHFFFAOYSA-N has 0 / 4 duplicate conformers.


CNCCCNC UQUPIHHYKUEXQD-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule UQUPIHHYKUEXQD-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


CNCCNC KVKFRMCSXWQSNT-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule KVKFRMCSXWQSNT-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


CCNCCCNCC BEPGHZIEOVULBU-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule BEPGHZIEOVULBU-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1=CC=C(N(CC)CC)C=C1 QNGVNLMMEQUVQK-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule QNGVNLMMEQUVQK-UHFFFAOYSA-N has 1 / 5 duplicate conformers.


CCNCCN SCZVXVGZMZRGRU-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule SCZVXVGZMZRGRU-UHFFFAOYSA-N has 0 / 2 duplicate conformers.


NCCCNC QHJABUZHRJTCAR-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule QHJABUZHRJTCAR-UHFFFAOYSA-N has 0 / 2 duplicate conformers.


CNCCN KFIGICHILYTCJF-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule KFIGICHILYTCJF-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


NCCNC1=CC=CC=C1 OCIDXARMXNJACB-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule OCIDXARMXNJACB-UHFFFAOYSA-N has 0 / 3 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule CBCKQZAAMUWICA-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NC1=CC=C(N)C=C1 CBCKQZAAMUWICA-UHFFFAOYSA-N
[0]
NCC1=CC=C(CN)C=C1 ISKQADXMHQSTHK-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule ISKQADXMHQSTHK-UHFFFAOYSA-N has 0 / 4 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule WZCQRUWWHSTZEM-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NC1=CC(N)=CC=C1 WZCQRUWWHSTZEM-UHFFFAOYSA-N
[0]
NCC1=C(CN)C=CC=C1 GKXVJHDEWHKBFH-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule GKXVJHDEWHKBFH-UHFFFAOYSA-N has 0 / 3 duplicate conformers.


NCC1=CC=CC(CN)=C1 FDLQZKYLHJJBHD-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule FDLQZKYLHJJBHD-UHFFFAOYSA-N has 0 / 2 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule IMNIMPAHZVJRPE-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


N12CCN(CC2)CC1 IMNIMPAHZVJRPE-UHFFFAOYSA-N
[0]
NC1CC(N)CCC1 GEQHKFFSPGPGLN-PHDIDXHHSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule GEQHKFFSPGPGLN-PHDIDXHHSA-N has 0 / 2 duplicate conformers.


CCNCCNCC CJKRXEBLWJVYJD-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule CJKRXEBLWJVYJD-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


C1(CCCC2CCNCC2)CCNCC1 OXEZLYIDQPBCBB-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule OXEZLYIDQPBCBB-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1=C(C)C=C(CC2=CC(C)=C(C(CC)=C2)N)C=C1CC QJENIOQDYXRGLF-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule QJENIOQDYXRGLF-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1=CC=C(C(C2=CC=CC=C2)=C(C3=CC=C(N)C=C3)C4=CC=CC=C4)C=C1 FZIHVNZJLMYTFH-QPLCGJKRSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule FZIHVNZJLMYTFH-QPLCGJKRSA-N has 2 / 5 duplicate conformers.


C1(NCC2=CC=C(CNC3=CC=CC=C3)C=C2)=CC=CC=C1 DXWQPWMYKQYRDS-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule DXWQPWMYKQYRDS-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


CC1(C)CC(N)CC(C)(C)N1 FTVFPPFZRRKJIH-UHFFFAOYSA-N
[0]


INFO:autoqchem.gaussian_log_extractor:Molecule FTVFPPFZRRKJIH-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


NCC1CC(CN)CCC1 QLBRROYTTDFLDX-HTQZYQBOSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule QLBRROYTTDFLDX-HTQZYQBOSA-N has 0 / 3 duplicate conformers.


NC1=CC=C(CC2=CC=C(N)C=C2)C=C1 YBRVSVVVWCFQMG-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule YBRVSVVVWCFQMG-UHFFFAOYSA-N has 1 / 3 duplicate conformers.


O=C(C1=CC=C(N)C=C1)C2=CC=C(N)C=C2 ZLSMCQSGRWNEGX-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule ZLSMCQSGRWNEGX-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


NC1=CC=C(C2=CC=C(N)C=C2C(F)(F)F)C(C(F)(F)F)=C1 NVKGJHAQGWCWDI-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule NVKGJHAQGWCWDI-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


NC1=CC=C(C2=CC=C(N)C=C2C)C(C)=C1 QYIMZXITLDTULQ-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule QYIMZXITLDTULQ-UHFFFAOYSA-N has 2 / 4 duplicate conformers.


NC1=CC=C(SC2=CC=C(N)C=C2)C=C1 ICNFHJVPAJKPHW-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule ICNFHJVPAJKPHW-UHFFFAOYSA-N has 2 / 4 duplicate conformers.


NC1=CC=C(/N=N/C2=CC=C(N)C=C2)C=C1 KQIKKETXZQDHGE-FOCLMDBBSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule KQIKKETXZQDHGE-FOCLMDBBSA-N has 3 / 4 duplicate conformers.


NC1=CC=C(OC2=CC=C(N)C=C2)C=C1 HLBLWEWZXPIGSM-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule HLBLWEWZXPIGSM-UHFFFAOYSA-N has 1 / 3 duplicate conformers.


NCC(C)(C)CN DDHUNHGZUHZNKB-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule DDHUNHGZUHZNKB-UHFFFAOYSA-N has 0 / 2 duplicate conformers.


NCCOCCOCCOCCN NIQFAJBKEHPUAM-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule NIQFAJBKEHPUAM-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NCCOCCOCCN IWBOPFCKHIJFMS-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule IWBOPFCKHIJFMS-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


CC1=CC=C(NCCOCCOCCNC2=CC=C(C)C=C2)C=C1 RRZROWKVNHGBPP-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule RRZROWKVNHGBPP-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


O=S(C1=CC=C(N)C=C1)(C2=CC=C(N)C=C2)=O MQJKPEGWNLWLTK-UHFFFAOYSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule MQJKPEGWNLWLTK-UHFFFAOYSA-N has 2 / 3 duplicate conformers.


CC1CC(CC2CC(C)C(N)CC2)CCC1N IGSBHTZEJMPDSZ-WCBJTDJXSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule IGSBHTZEJMPDSZ-WCBJTDJXSA-N has 0 / 5 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule GLUUGHFHXGJENI-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


N1CCNCC1 GLUUGHFHXGJENI-UHFFFAOYSA-N
[0]
CC1=CC=C(N)C=C1N VOZKAJLKRJDJLL-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule VOZKAJLKRJDJLL-UHFFFAOYSA-N has 1 / 2 duplicate conformers.


CC1=CC=C(NCCCCNC2=CC=C(C)C=C2)C=C1 IXVDJDOFHBHDMT-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule IXVDJDOFHBHDMT-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


N[C@@H](CCCN)C(O)=O AHLPHDHHMVZTML-BYPYZUCNSA-N
[0, 1, 2]


INFO:autoqchem.gaussian_log_extractor:Molecule AHLPHDHHMVZTML-BYPYZUCNSA-N has 0 / 3 duplicate conformers.


NC1=CC=C2C=CC=CC2=C1C3=C4C=CC=CC4=CC=C3N DDAPSNKEOHDLKB-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule DDAPSNKEOHDLKB-UHFFFAOYSA-N has 2 / 4 duplicate conformers.


NCCCOCCOCCOCCCN JCEZOHLWDIONSP-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule JCEZOHLWDIONSP-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NCCCCCCCCCN SXJVFQLYZSNZBT-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule SXJVFQLYZSNZBT-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NCCCCCCCCCCN YQLZOAVZWJBZSY-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule YQLZOAVZWJBZSY-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NCCCCCCCCCCCN KLNPWTHGTVSSEU-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule KLNPWTHGTVSSEU-UHFFFAOYSA-N has 0 / 4 duplicate conformers.


NC1=CC=C(CC2=CC=C(N)C(Cl)=C2)C=C1Cl IBOFVQJTBBUKMU-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule IBOFVQJTBBUKMU-UHFFFAOYSA-N has 1 / 5 duplicate conformers.


CC1CC(CC2CCC(N)C(C)C2)CCC1N IGSBHTZEJMPDSZ-JRULLXGZSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule IGSBHTZEJMPDSZ-JRULLXGZSA-N has 0 / 5 duplicate conformers.


NC1CCC(CC2CCC(N)CC2)CC1 DZIHTWJGPDVSGE-ROTMJFINSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule DZIHTWJGPDVSGE-ROTMJFINSA-N has 0 / 4 duplicate conformers.


NC1=CC=C(C2(C3)CC4(C5=CC=C(N)C=C5)CC3CC(C4)C2)C=C1 LALHUWOVOZGIAW-WGKKAQPQSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule LALHUWOVOZGIAW-WGKKAQPQSA-N has 2 / 5 duplicate conformers.


FC(C(C1=CC=C(OC2=CC=C(N)C=C2)C=C1)(C3=CC=C(OC4=CC=C(N)C=C4)C=C3)C(F)(F)F)(F)F HHLMWQDRYZAENA-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule HHLMWQDRYZAENA-UHFFFAOYSA-N has 1 / 5 duplicate conformers.


C[Si](C)(CCCN)O[Si](C)(C)O[Si](C)(CCCN)C ZWRBLCDTKAWRHT-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule ZWRBLCDTKAWRHT-UHFFFAOYSA-N has 0 / 5 duplicate conformers.


NC1=CC(OC2=CC(OC3=CC=CC(N)=C3)=CC=C2)=CC=C1 DKKYOQYISDAQER-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule DKKYOQYISDAQER-UHFFFAOYSA-N has 2 / 5 duplicate conformers.


NC1=CC=C(SSC2=CC=C(N)C=C2)C=C1 MERLDGDYUMSLAY-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule MERLDGDYUMSLAY-UHFFFAOYSA-N has 3 / 5 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule OXIKYYJDTWKERT-OCAPTIKFSA-N has 0 / 1 duplicate conformers.


NCC1CCC(CN)CC1 OXIKYYJDTWKERT-OCAPTIKFSA-N
[0]
NC1=CC=C(C2=CC=C(N)C=C2S(=O)(O)=O)C(S(=O)(O)=O)=C1 MBJAPGAZEWPEFB-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule MBJAPGAZEWPEFB-UHFFFAOYSA-N has 1 / 5 duplicate conformers.


NC1=CC=C(OC2=CC=C(C3=CC=C(OC4=CC=C(N)C=C4)C=C3)C=C2)C=C1 HYDATEKARGDBKU-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule HYDATEKARGDBKU-UHFFFAOYSA-N has 1 / 5 duplicate conformers.


CC(C1=CC=C(OC2=CC=C(N)C=C2)C=C1)(C3=CC=C(OC4=CC=C(N)C=C4)C=C3)C KMKWGXGSGPYISJ-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule KMKWGXGSGPYISJ-UHFFFAOYSA-N has 2 / 5 duplicate conformers.


NC1=CC=C(OC2=CC=C(N)C=C2C(F)(F)F)C(C(F)(F)F)=C1 NKYXYJFTTIPZDE-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule NKYXYJFTTIPZDE-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


NC(C=C1)=CC=C1C2(C3=CC=C(N)C=C3)C4=C(C=CC=C4)C5=C2C=CC=C5 KIFDSGGWDIVQGN-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule KIFDSGGWDIVQGN-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


O=C(NC1=CC=C(N)C=C1)C2=CC=C(N)C=C2 XPAQFJJCWGSXGJ-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule XPAQFJJCWGSXGJ-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


O=C(C1=CC(N)=C(C(O)=O)C=C1N)O WIOZZYWDYUOMAY-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule WIOZZYWDYUOMAY-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


CCC(NC1CCC(CC2CCC(NC(C)CC)CC2)CC1)C PICLBAFTFYTURY-RLLAOZKESA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule PICLBAFTFYTURY-RLLAOZKESA-N has 0 / 5 duplicate conformers.


NCC(C)CCCN JZUHIOJYCPIVLQ-LURJTMIESA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule JZUHIOJYCPIVLQ-LURJTMIESA-N has 0 / 4 duplicate conformers.


NC1=CC=C(OC2=CC=CC(N)=C2)C=C1 ZBMISJGHVWNWTE-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule ZBMISJGHVWNWTE-UHFFFAOYSA-N has 3 / 4 duplicate conformers.


OC1=CC(N)=CC=C1OC2=CC=C(N)C=C2 JTUXBPXZECVPKS-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule JTUXBPXZECVPKS-UHFFFAOYSA-N has 0 / 4 duplicate conformers.


NC1=CC2=C(C(C)(C)CC2(C3=CC=C(N)C=C3)C)C=C1 GDGWSSXWLLHGGV-GOSISDBHSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule GDGWSSXWLLHGGV-GOSISDBHSA-N has 2 / 4 duplicate conformers.
INFO:autoqchem.gaussian_log_extractor:Molecule RLYCRLGLCUXUPO-UHFFFAOYSA-N has 0 / 1 duplicate conformers.


CC1=C(N)C=CC=C1N RLYCRLGLCUXUPO-UHFFFAOYSA-N
[0]
N[C@@H](CCCCN)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule KDXKERNSBIXSRK-YFKPBYRVSA-N has 0 / 5 duplicate conformers.


CC1=CC=C(NCCCNC2=CC=C(C)C=C2)C=C1 JSKXMAPMKXVJKT-UHFFFAOYSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule JSKXMAPMKXVJKT-UHFFFAOYSA-N has 1 / 4 duplicate conformers.


O[C@@H]1[C@H](NC2=CC=CC=C2N[C@H]3[C@H]([C@H]([C@@H]([C@@H](CO)O3)O)O)O)O[C@H](CO)[C@@H](O)[C@@H]1O LIBPKBFUVQJULU-KPHUVODXSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule LIBPKBFUVQJULU-KPHUVODXSA-N has 0 / 5 duplicate conformers.


NC1CC(CN)(CC(C)(C1)C)C RNLHGQLZWXBQNY-PSASIEDQSA-N
[0, 1, 2, 3]


INFO:autoqchem.gaussian_log_extractor:Molecule RNLHGQLZWXBQNY-PSASIEDQSA-N has 0 / 4 duplicate conformers.


NC1=CC=C(OC2=CC(OC3=CC=C(N)C=C3)=CC=C2)C=C1 WUPRYUDHUFLKFL-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule WUPRYUDHUFLKFL-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


NC1=CC=C(OC2=CC=C(OC3=CC=C(N)C=C3C(F)(F)F)C=C2)C(C(F)(F)F)=C1 LACZRKUWKHQVKS-UHFFFAOYSA-N
[0, 1, 2, 3, 4]


INFO:autoqchem.gaussian_log_extractor:Molecule LACZRKUWKHQVKS-UHFFFAOYSA-N has 3 / 5 duplicate conformers.


NC1=CC=CC=C1NC2=CC=CC=C2 NFCPRRWCTNLGSN-UHFFFAOYSA-N
[0, 1]


INFO:autoqchem.gaussian_log_extractor:Molecule NFCPRRWCTNLGSN-UHFFFAOYSA-N has 0 / 2 duplicate conformers.
