Description

This script performs the following tasks:

    Reads a CSV file containing a column named "smiles".
    Defines a function to draw the 2D structure of a molecule from its SMILES string.
    Iterates over the first 1000 rows of the DataFrame with a progress bar, drawing and displaying each structure.
    Measures and prints the runtime of the drawing loop.


Instructions for Use

    Ensure you have the required libraries installed:

```
pip install pandas rdkit tqdm
```

    Update the input_file_path variable with the path to your input CSV file.

    Run the script in a Jupyter Notebook. It reads the input CSV file, draws and displays the 2D structure for each of the first 1000 SMILES strings, and prints the runtime of the drawing loop.
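
Before running the notebook, it can help to confirm that the input file really has the expected column. The following is a minimal sketch using only the standard library `csv` module. The helper name `has_smiles_column` is hypothetical and not part of the original script, and it assumes a comma-delimited file whose first row is the header.

```python
import csv


def has_smiles_column(csv_path, column="smiles"):
    """Return True if the header row of the CSV contains `column`.

    Hypothetical helper, not part of the original script. Assumes a
    comma-delimited file whose first row is the header.
    """
    # "utf-8-sig" reads plain UTF-8 and also drops a leading byte-order mark
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    # Exact, case-sensitive match, like the row['smiles'] lookup in the script
    return column in header
```

Calling `has_smiles_column('input.csv')` should return `True` for a file the script can process. A header such as `SMILES` or ` smiles` is reported as missing, since the script's lookup would fail on it as well.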


In [None]:
import pandas as pd
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem.Draw import rdMolDraw2D
from IPython.display import SVG, display
from tqdm import tqdm
import time

# Draw the 2D structure of a molecule from its SMILES string and display it as SVG
def draw_2d_structure(smiles, size=(300, 300)):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        # MolFromSmiles returns None for SMILES it cannot parse; skip those
        return
    # Render to an SVG canvas of the requested width and height
    drawer = rdMolDraw2D.MolDraw2DSVG(size[0], size[1])
    drawer.DrawMolecule(mol)
    drawer.FinishDrawing()
    display(SVG(drawer.GetDrawingText()))

# Read CSV file with a column named "smiles"
input_file_path = 'input.csv'  # Update with your input file path
df = pd.read_csv(input_file_path)

# Check the structure of your DataFrame
print(df.head())

# Measure the runtime
start_time = time.time()

# Draw structures for up to the first 1000 SMILES strings, with a progress bar
subset = df.head(1000)
# Use the actual row count so the progress bar is correct for files under 1000 rows
for smiles in tqdm(subset['smiles'], total=len(subset)):
    draw_2d_structure(smiles)

end_time = time.time()

# Print the runtime
print(f"Runtime: {end_time - start_time:.2f} seconds")
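
Because `draw_2d_structure` silently skips any SMILES string that RDKit cannot parse, the loop gives no indication of how many rows produced no drawing. The sketch below is one possible way to list those rows beforehand. The function name `find_invalid_smiles` and its `parse` parameter are hypothetical and not part of the original script. By default it relies on `Chem.MolFromSmiles` returning `None` for unparsable input, and it also reports non-string values such as missing cells.

```python
def find_invalid_smiles(smiles_values, parse=None):
    """Return (position, value) pairs for entries that cannot be parsed.

    Hypothetical helper, not part of the original script. `parse` is any
    callable that returns None on failure; it defaults to RDKit's
    Chem.MolFromSmiles and can be swapped out, which lets the logic be
    checked without RDKit installed.
    """
    if parse is None:
        from rdkit import Chem  # imported lazily, only when no parser is given
        parse = Chem.MolFromSmiles

    invalid = []
    for position, value in enumerate(smiles_values):
        # Missing cells read by pandas arrive as float NaN, not as str
        if not isinstance(value, str) or parse(value) is None:
            invalid.append((position, value))
    return invalid
```

In the notebook, `find_invalid_smiles(df.head(1000)['smiles'])` would return the positions and values of the rows that the drawing loop skips or cannot handle.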