<h1 style="color:#001f3f">1. Introduction</h1>

<h2 style="color:#0050a0">1.1 Research Topic</h2>
<div style="text-align: justify">
Stereochemistry is an important concept in organic chemistry for understanding molecules' behavior in biological systems. Indeed, the spatial arrangement of atoms in a molecule can significantly influence how that molecule interacts with enzymes and other biomolecules. Stereoisomers—compounds that have the same molecular formula but differ in the 3D orientation of their atoms—often exhibit different properties. A thorough understanding of stereochemistry is therefore essential in fields such as pharmaceutical chemistry, where the difference between isomers can mean the difference between a therapeutic and a toxic effect.
</div>

<h2 style="color:#0050a0">1.2 Settings: Niche</h2>
<div style="text-align: justify">
While there are tools for drawing molecules and naming stereoisomers, few are designed as interactive learning tools. Most existing tools and materials present stereochemistry in a formal, technical way and are neither engaging nor easy to use, so students often struggle to build a working understanding of the topic. Current tools also rarely help with identifying stereocenters or visualizing every possible stereoisomer of a given molecule, especially when starting from a linear notation such as SMILES.
</div>

<h2 style="color:#0050a0">1.3 Problem</h2>
<div style="text-align: justify">
Students often struggle to understand and visualize stereoisomers, particularly when translating 2D representations into 3D structures. As a result, they may fail to see how chirality and molecular orientation influence chemical behavior. Having struggled with these concepts ourselves, the idea for this project seemed obvious to us: we wanted to create an interface for getting acquainted with stereoisomers, nomenclature, and chirality. We therefore gathered as many features as possible into a single Streamlit interface, which we hope will serve as a helpful and intuitive tool for users.
</div>

<h2 style="color:#0050a0">1.4 Solution</h2>
<div style="text-align: justify">
This project presents a Python-based tool that automatically detects chiral centers and calculates the number of possible stereoisomers of a given molecule. Using cheminformatics libraries such as RDKit, the tool processes SMILES input and outputs an intuitive stereochemical analysis. The aim is to create an accessible, educational resource that not only aids students in mastering stereochemistry but also serves as a basis for further computational chemistry applications.
</div>

<h1 style="color:#001f3f">2. Materials and methods</h1>

<div style="text-align: justify">
This application was developed using Python and Streamlit to create an interactive tool for exploring and studying stereochemistry, with a particular focus on stereoisomer identification and naming. The tool integrates molecule drawing, stereoisomer generation, and chiral center identification into a user-friendly interface.
</div>

<h2 style="color:#0050a0">2.1 Technologies and libraries</h2>
<div style="text-align: justify">
The primary technologies used in the development of this project include:
</div>
<ul>
<li><strong>Python 3.8</strong>: The main programming language.</li>
<li><strong>Streamlit</strong>: Used for building the web interface and handling user interactions.</li>
<li><strong>RDKit</strong>: Employed for molecule parsing, stereoisomer generation, chiral center identification, and image rendering.</li>
<li><strong>PubChemPy</strong>: Utilized for retrieving IUPAC names from the PubChem database.</li>
<li><strong>streamlit_ketcher</strong>: Integrated to allow users to draw chemical structures directly in the application interface.</li>
</ul>
<div style="text-align: justify">
All dependencies were installed via pip and the application was run locally through the Streamlit CLI.
</div>

<h2 style="color:#0050a0">2.2 Streamlit application structure</h2>
<div style="text-align: justify">
The application is organized into three main tabs:
</div>
<ol>
<li><strong>Input a Molecule</strong>:<br>
<div style="text-align: justify">
Users can either draw a molecule using the Ketcher editor or enter a molecule name, which is then resolved through PubChemPy. For both input methods, the application converts the structure into a canonical SMILES representation and removes any stereochemistry, ensuring that isomer enumeration starts from a consistent base.
</div>
</li>
<li><strong>Draw and Guess Stereoisomers</strong>:<br>
<div style="text-align: justify">
The core functionality of this tab is the generation of all possible stereoisomers using a custom function, <code>generate_isomers</code>, which relies on RDKit’s stereoisomer enumeration capabilities. Users attempt to draw valid stereoisomers of the input molecule, which are then checked for correctness by comparing their canonical isomeric SMILES to the generated set. The application provides feedback, tracks user guesses, and includes hints and a timer. Users can also validate their drawn isomers by entering the corresponding IUPAC names, which are compared against PubChem-derived names.
A scoring system awards one point for each correct answer.
</div>
</li>
<li><strong>Chirality</strong>:<br>
<div style="text-align: justify">
In this section, users can try to identify the chiral centers of the input molecule. The RDKit library is used to determine the atoms with possible chirality, which are then visually highlighted in molecular images. Users select atoms they believe to be chiral using checkboxes.
</div>
</li>
</ol>
<div style="text-align: justify">
A sidebar is also present on the left side of the application. Here, it is possible to input a molecule by name, and the currently selected molecule (without stereochemical information) is displayed using RDKit for easier visualization while guessing stereoisomers.
</div>
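
<div style="text-align: justify">
The stereochemistry-stripping step described above can be sketched as follows. This is a minimal illustration and not the application's actual code: the function name <code>to_flat_smiles</code> is hypothetical, and the sketch assumes that RDKit's <code>Chem.RemoveStereochemistry</code> is an acceptable way to discard the stereo labels before canonicalization.
</div>

```python
from typing import Optional

from rdkit import Chem


def to_flat_smiles(smiles: str) -> Optional[str]:
    """Return a canonical SMILES without stereo labels, or None if parsing fails.

    Hypothetical helper used for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # Clear tetrahedral (@ / @@) and double-bond (/ and \) stereo information in place
    Chem.RemoveStereochemistry(mol)
    # Canonical SMILES of the now stereo-free molecule
    return Chem.MolToSmiles(mol)
```

<div style="text-align: justify">
With such a helper, both enantiomers of alanine would reduce to the same string, which is the property the isomer enumeration relies on.
</div>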

<h2 style="color:#0050a0">2.3 Variables management</h2>
<div style="text-align: justify">
The application relies on Streamlit's session state system to store variables during its usage. This mechanism allows the app to remember variables between user interactions. Without this method, information would be lost because Streamlit re-runs the entire script after any change. Therefore, session state is used to keep track of the user's progress and decisions throughout the application; the session state variables are only updated or deleted when explicitly instructed to do so.
</div>

<div style="text-align: justify">
Variables stored in session state include, among others, the input molecule selected by the user, the stereoisomers they have guessed so far, their current score, any names they have submitted for validation, and whether they have requested a hint or chosen to reveal the correct answers.
</div>

<div style="text-align: justify">
By storing this data in session state, the application can provide a seamless and responsive experience where progress is preserved and the interface remains interactive.
</div>
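
<div style="text-align: justify">
A common way to set this up is to initialize each key only when it is missing, so that existing values survive every re-run of the script. The sketch below illustrates the pattern under stated assumptions: <code>init_state</code> and <code>DEFAULTS</code> are hypothetical names, the default values are guesses, and the function accepts any dictionary-like object so that it can be tried without Streamlit. The key names themselves are those reset by <code>update_input_molecule</code>.
</div>

```python
from typing import Any, Dict, MutableMapping

# Assumed default values; only the key names come from the application
DEFAULTS: Dict[str, Any] = {
    "main_smiles": "",
    "score": 0,
    "show_answers": False,
    "hint": False,
}


def init_state(state: MutableMapping[str, Any]) -> None:
    """Fill in missing keys only, leaving existing values untouched."""
    for key, value in DEFAULTS.items():
        if key not in state:
            state[key] = value
    # Mutable defaults are created per session rather than shared
    if "guessed_molecules" not in state:
        state["guessed_molecules"] = set()


# In a Streamlit script this would be called near the top:
#     init_state(st.session_state)
```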

<h2 style="color:#0050a0">2.4 User experience</h2>
<div style="text-align: justify">
The interface includes a custom background image for visual appeal and uses Streamlit’s layout capabilities (such as columns and placeholders) to dynamically update feedback messages, images, and scores. Correct answers are rewarded with balloons as a playful visual effect.
</div>

<h2 style="color:#0050a0">2.5 Implementation details</h2>
<div style="text-align: justify">
This section describes the main ideas used to implement the functionality in the "Draw and Guess Stereoisomers" and "Chirality" tabs, followed by an overview of two central functions that support the application's internal logic while keeping the overall code cleaner.
</div>

<h3 style="color:#3399ff">Guessing the isomers (Tab 2)</h3>
<div style="text-align: justify">
In this tab, users are asked to guess all the possible stereoisomers of the molecule that was previously input. The programming logic begins by generating the full set of stereoisomers from the SMILES string of the input molecule. These stereoisomers are stored as a set of canonical SMILES strings that retain stereochemical detail (i.e., using @ and @@ for tetrahedral R/S centers, and / and \ for E/Z double bonds). This set is treated as the reference for comparison with the isomers drawn by the user.
</div>

<div style="text-align: justify">
When users enter their guesses through the Ketcher interface, the retrieved SMILES string is canonicalized and compared with the reference set. If the guess matches a stereoisomer that is in the set of "solutions" and has not been guessed yet, it is recorded as correct and added to the set of correct isomers found so far. This structure enables real-time feedback without storing redundant data, and avoids counting the same isomer twice when it is drawn in a different but equivalent way.
</div>
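
<div style="text-align: justify">
The comparison logic can be sketched as follows. This is an illustrative reconstruction rather than the application's code: <code>reference_isomers</code> and <code>check_guess</code> are hypothetical names, and the four status strings are an assumption made for the example. The sketch relies on the fact that RDKit writes the same canonical isomeric SMILES for equivalent drawings of one stereoisomer.
</div>

```python
from typing import Set

from rdkit import Chem
from rdkit.Chem.EnumerateStereoisomers import (
    EnumerateStereoisomers,
    StereoEnumerationOptions,
)


def reference_isomers(flat_smiles: str) -> Set[str]:
    """Enumerate stereoisomers of a stereo-free SMILES as canonical isomeric SMILES."""
    mol = Chem.MolFromSmiles(flat_smiles)
    if mol is None:
        return set()
    opts = StereoEnumerationOptions(onlyUnassigned=True, unique=True)
    return {Chem.MolToSmiles(iso) for iso in EnumerateStereoisomers(mol, options=opts)}


def check_guess(drawn_smiles: str, reference: Set[str], found: Set[str]) -> str:
    """Classify a drawn structure and record it in `found` when it is new and correct."""
    mol = Chem.MolFromSmiles(drawn_smiles)
    if mol is None:
        return "invalid"
    # Canonicalize so that equivalent drawings map to the same string
    canonical = Chem.MolToSmiles(mol)
    if canonical not in reference:
        return "wrong"
    if canonical in found:
        return "duplicate"
    found.add(canonical)
    return "correct"
```

<div style="text-align: justify">
For lactic acid (<code>CC(O)C(=O)O</code>), for example, the reference set would contain two entries, and a second drawing of an enantiomer that was already found would be reported as a duplicate instead of earning another point.
</div>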

<div style="text-align: justify">
The second part of the tab challenges users to name the specific stereoisomers they have drawn using correct IUPAC nomenclature. Each answer must correctly describe the configuration of the isomer (R/S or E/Z). The program validates the answer by comparing the submitted name with the name retrieved from PubChem for the molecule's SMILES.
</div>
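
<div style="text-align: justify">
How strictly the two names are compared is not detailed here, so the sketch below only illustrates one possible approach. It assumes that differences in letter case and whitespace should be ignored, and that a missing PubChem name should be reported separately instead of being counted as a wrong answer. The names <code>normalize_name</code> and <code>validate_name</code> are hypothetical, and the PubChem lookup itself is left out.
</div>

```python
from typing import Optional


def normalize_name(name: str) -> str:
    """Lower-case the name and drop all whitespace (assumed tolerance rules)."""
    return "".join(name.lower().split())


def validate_name(user_name: str, reference_name: Optional[str]) -> str:
    """Compare a submitted name with a reference name, if one is available."""
    if not reference_name:
        # No name could be retrieved, so the answer cannot be judged
        return "unavailable"
    if normalize_name(user_name) == normalize_name(reference_name):
        return "correct"
    return "incorrect"
```

<div style="text-align: justify">
Because the stereodescriptor is part of the string, an answer with the wrong descriptor, such as (2R) instead of (2S), is still rejected under this scheme.
</div>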

<h3 style="color:#3399ff">Chirality (Tab 3)</h3>
<div style="text-align: justify">
This tab asks the user to identify all the chiral centers of the input molecule. When a molecule (as a SMILES string) is available in the session, the application converts the SMILES into an RDKit molecule object. It then iterates through each atom, checking the RDKit <code>_ChiralityPossible</code> property to detect potential chiral centers, and stores their indices in a list.
</div>
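
<div style="text-align: justify">
A minimal sketch of this detection step is given below. The function name <code>find_possible_chiral_atoms</code> is hypothetical, and the sketch assumes that the property is populated by calling <code>Chem.AssignStereochemistry</code> with <code>flagPossibleStereoCenters=True</code>; the application may set it up differently.
</div>

```python
from typing import List

from rdkit import Chem


def find_possible_chiral_atoms(smiles: str) -> List[int]:
    """Return the indices of atoms that RDKit flags as possible stereocenters."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return []
    # Ask RDKit to mark possible stereocenters, whether or not they are assigned
    Chem.AssignStereochemistry(
        mol, cleanIt=True, force=True, flagPossibleStereoCenters=True
    )
    return [
        atom.GetIdx()
        for atom in mol.GetAtoms()
        if atom.HasProp("_ChiralityPossible")
    ]
```

<div style="text-align: justify">
Atom indices follow the order of atoms in the SMILES string, so for lactic acid written as <code>CC(O)C(=O)O</code> the only flagged atom is the second carbon, at index 1.
</div>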

<div style="text-align: justify">
The program then compares the user’s checkbox selection (also a list) with the list of actual chiral atoms (<code>chiral_atoms</code>). If they match, a success message and animation are triggered, controlled via a session state flag (<code>balloons_shown</code>). Finally, optional buttons allow users to toggle the display of the correct chiral atoms, storing the visibility preference in <code>st.session_state.show_chiral_atoms</code>.
</div>
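
<div style="text-align: justify">
The comparison and the one-time animation can be sketched as follows. Both function names are hypothetical. The sketch compares the selections as sets, which is an assumption made so that the order in which boxes are ticked does not matter, and it accepts any dictionary-like object in place of <code>st.session_state</code>.
</div>

```python
from typing import Any, Iterable, MutableMapping


def selection_is_correct(selected: Iterable[int], chiral_atoms: Iterable[int]) -> bool:
    """True when exactly the chiral atoms are selected, regardless of order."""
    return set(selected) == set(chiral_atoms)


def should_celebrate(state: MutableMapping[str, Any], correct: bool) -> bool:
    """Return True only the first time a correct answer is seen."""
    if correct and not state.get("balloons_shown", False):
        # Remember that the animation has been triggered for this molecule
        state["balloons_shown"] = True
        return True
    return False


# In a Streamlit script, the result could gate the animation:
#     if should_celebrate(st.session_state, selection_is_correct(picked, chiral_atoms)):
#         st.balloons()
```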

<h3 style="color:#3399ff">Helper functions</h3>
<div style="text-align: justify">
Two helper functions support much of the core functionality described above: <code>generate_isomers</code> and <code>update_input_molecule</code>.
</div>

<div style="text-align: justify">
The <code>generate_isomers</code> function takes a SMILES string and returns a set of all unique stereoisomers corresponding to that molecule. It relies on RDKit’s built-in enumeration tools to identify all undefined stereocenters and systematically generate the missing stereochemical variants. Importantly, it uses enumeration options that ignore already-defined centers, so only ambiguous or incomplete stereochemistry is expanded. The output is returned as a set of canonical SMILES strings with full stereochemical specification, which are then used for validation, comparison, and selection throughout the app.
</div>


In [None]:
from rdkit import Chem
from rdkit.Chem.EnumerateStereoisomers import (
    EnumerateStereoisomers,
    StereoEnumerationOptions,
)


def generate_isomers(smiles: str) -> set:
    # Convert the SMILES string into an RDKit molecule object
    mol_gen_iso = Chem.MolFromSmiles(smiles)

    # RDKit returns None for an invalid SMILES; signal this with a sentinel set
    if mol_gen_iso is None:
        return {"molecule not found"}

    # Set up options for stereoisomer enumeration:
    # - onlyUnassigned=True: only generate isomers for undefined stereocenters
    # - unique=True: ensures only unique stereoisomers are returned (avoids duplicates)
    opts = StereoEnumerationOptions(onlyUnassigned=True, unique=True)

    # Generate a list of all possible stereoisomers using the specified options
    isomers = list(EnumerateStereoisomers(mol_gen_iso, options=opts))

    # Convert each stereoisomer molecule back into a canonical SMILES string with stereochemistry
    # Store them in a set to automatically remove duplicates
    return {Chem.MolToSmiles(iso, isomericSmiles=True) for iso in isomers}

The `update_input_molecule` function is used to reset the application state each time a new molecule is selected.
It updates the main SMILES string and clears all relevant session variables, such as previously guessed isomers, score, answer visibility, validated names, and timers. This ensures the user starts fresh with every new challenge.

In [None]:

import streamlit as st


def update_input_molecule(new_smiles):
    """Store the new molecule and reset all per-molecule session state."""
    st.session_state.main_smiles = new_smiles
    st.session_state.guessed_molecules = set()
    st.session_state.score = 0
    st.session_state.show_answers = False
    st.session_state.hint = False
    st.session_state.show_chiral_atoms = False
    st.session_state.name_validation_status = {}
    st.session_state.chrono_text = ""

    # Remove the timer and validation keys entirely. The original version
    # assigned placeholder values to these keys and then deleted them,
    # so the assignments had no effect.
    for key in [
        "start_time",
        "end_time_structures",
        "validated_names",
        "all_iupac_validated",
        "balloons_shown",
    ]:
        if key in st.session_state:
            del st.session_state[key]

    # Reset atom selection checkboxes
    for key in list(st.session_state.keys()):
        if key.startswith("Atom"):
            st.session_state[key] = False

<h1 style="color:#001f3f">3. Results</h1>

<h2 style="color:#0050a0">3.1 Challenges encountered</h2>
<div style="text-align: justify">
Each of the following paragraphs focuses on a specific technical or practical challenge that emerged during development, as well as the strategy adopted to address it.
</div>

<h3 style="color:#3399ff">Installation and Environment Management</h3>
<div style="text-align: justify">
Installing RDKit proved to be challenging, particularly in ensuring compatibility with other required packages. Managing different environments (e.g., conda vs. pip) created inconsistencies that complicated deployment. In some cases, the application could not be exported or reproduced easily, which limited sharing and usability.
</div>

<h3 style="color:#3399ff">Compatibility issues across operating systems</h3>
<div style="text-align: justify">
Some functionalities, such as molecule rendering or specific Streamlit behaviors, performed differently on macOS compared to Windows and Linux. This discrepancy created confusion during development and required additional debugging efforts. These differences also impacted how team members collaborated and tested features across platforms.
</div>

<h3 style="color:#3399ff">Incomplete PubChem data</h3>
<div style="text-align: justify">
Not all molecules returned complete stereochemistry information or IUPAC names when queried through PubChem. This made it difficult to validate user-submitted names and sometimes resulted in false negatives or mismatches. As a result, comparison and validation features lost accuracy in these cases.
</div>

<h3 style="color:#3399ff">Use of Streamlit Balloons</h3>
<div style="text-align: justify">
The balloon animation, while visually engaging, occasionally caused issues when triggered multiple times or at unexpected moments. Despite being a source of minor bugs, it also contributed to a more playful and rewarding user experience, particularly when completing a game or correctly identifying a structure.
</div>

<h2 style="color:#0050a0">3.2 Limitations and future improvements</h2>
<div style="text-align: justify">
This section addresses the known limitations of the current implementation and suggests what could be improved, added, or refined if more time were available.
</div>

<h3 style="color:#3399ff">Nomenclature Accuracy and Edge Cases</h3>
<div style="text-align: justify">
The current IUPAC name validation does not always work reliably for complex molecules or for those with non-standard names. In particular, coordination complexes are not supported at all. A fallback validation mechanism or a broader nomenclature library would improve coverage and accuracy in these edge cases.
</div>

<h3 style="color:#3399ff">Application Deployment</h3>
<div style="text-align: justify">
At this stage, the application is not downloadable or deployable as a standalone tool. This is mainly due to external dependencies, local environment requirements, and time constraints. A packaged version or hosted instance would significantly increase accessibility.
</div>

<h3 style="color:#3399ff">Limited Chirality Feedback</h3>
<div style="text-align: justify">
While the app flags atoms as chiral, it does not currently determine or communicate their absolute configuration (R or S). Providing direct feedback on R/S configuration would greatly enhance the educational value and deepen user understanding of stereochemical principles.
</div>
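
<div style="text-align: justify">
One possible starting point for this improvement is RDKit's <code>Chem.FindMolChiralCenters</code>, which returns the CIP label of each stereocenter. The sketch below is not part of the current application; the function name <code>chiral_center_labels</code> is hypothetical, and the input is assumed to be a SMILES string whose stereochemistry has already been specified, for example one of the user's drawn isomers.
</div>

```python
from typing import List, Tuple

from rdkit import Chem


def chiral_center_labels(smiles: str) -> List[Tuple[int, str]]:
    """Return (atom index, label) pairs, where the label is 'R', 'S', or '?'."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return []
    # includeUnassigned=True also reports centers without a defined
    # configuration, which RDKit labels with '?'
    return Chem.FindMolChiralCenters(mol, includeUnassigned=True)
```

<div style="text-align: justify">
For L-alanine written as <code>N[C@@H](C)C(=O)O</code>, this returns the label S for the atom at index 1, while the same molecule without stereo labels yields '?' for that atom.
</div>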
