# Welcome to GreenChemPanion 🍃
----
Welcome to this interactive Jupyter Notebook! 🎉

**About This Notebook:**
The principles of Green Chemistry seek to transform the way chemical processes are designed, with the goal of reducing or eliminating the use and generation of hazardous substances. Sustainable chemistry emphasizes improving reaction efficiency, minimizing waste, and ensuring the long-term safety of both products and processes. Key metrics such as the E-Factor, Process Mass Intensity (PMI), and Atom Economy have been developed to quantify the environmental impact of chemical reactions. Solvent choice and product properties are also key to assessing the sustainability of chemical transformations.

In this context, we have developed GreenChemPanion, an interactive Python-based platform designed to help chemists assess and optimize the sustainability of their reactions. GreenChemPanion integrates the core green chemistry metrics (E-Factor, PMI, and Atom Economy) together with evaluations of solvent sustainability, molecular greenness (based on atomic composition), and reaction conditions.

The platform allows users to input key reaction parameters such as reactants, products, solvents, and catalysts. It then automatically computes sustainability scores, classifies solvents according to established green chemistry guidelines, analyzes molecular structures for environmentally concerning elements, and provides a comprehensive "green grade" for the reaction. In addition to offering quantitative feedback, GreenChemPanion highlights areas where improvements can be made to align more closely with sustainable practices.

Through this project, we aim to provide chemists with a centralized, intuitive, and practical tool that supports greener decision-making. GreenChemPanion bridges the gap between synthetic chemistry and green chemistry principles, helping users evaluate their current reactions and design more sustainable processes following the 12 Principles of Green Chemistry.

**🚀 Getting started:**
To get started, simply follow the structure outlined in the README. Each part of the project includes clear explanations, illustrative code examples, and visual outputs to help you understand and apply the concepts efficiently.

**How to use:**
Feel free to modify the inputs, test different reactions, and explore how different molecular structures impact sustainability metrics. The code is modular and documented to support experimentation and learning.

**Questions?**
For any questions, please contact marc.alhachem@epfl.ch, ralph.gebran@epfl.ch, tais.thomas@epfl.ch or valentine.wien@epfl.ch

Let’s dive into chemistry with purpose, making it cleaner, smarter, and greener! 🚀

### 🔧 First, let’s import everything!

To begin, run the following cell to import all necessary libraries and custom functions. This includes standard tools such as pandas and math, along with the cheminformatics toolkit RDKit, which allows us to represent, manipulate, and analyze molecular structures.

In [None]:
import streamlit as st
import pandas as pd
import math
from rdkit import Chem
from rdkit.Chem import Draw
from rdkit.Chem import Descriptors
from streamlit_ketcher import st_ketcher
from functions import (
    Atom_Count_With_H, Reaction, compute_PMI, canonicalize_smiles, compute_E,
    get_solvent_info, waste_efficiency, PMI_assesment, Atom_ec_assesment,
    logP_assessment_molecule, atoms_assessment, structural_assessment,
)

### 🔥 Let’s Get Started!

To begin evaluating the sustainability of a chemical reaction, users must define the reaction they wish to assess. With the Reaction class, you can specify not only the reactants and products, but also optional information such as solvents, catalysts, and reaction yield. 

Our tool allows you to input SMILES strings for each molecule (a standardized way to represent molecular structures) and then automatically computes a wide range of green chemistry factors including Atom Economy, E-factor, and PMI.

### 🧪 Example Usage 📚 – GreenChemPanion

**✅ Step 1: Define your chemical reaction**

Before you can assess the sustainability of a reaction, you must first describe it chemically. GreenChemPanion uses SMILES (Simplified Molecular Input Line Entry System) strings to represent molecules. The stoichiometry is also specified, and an internal function automatically checks that the reaction is atom-balanced, so that the material balance is consistent before any indicator is calculated.

👉 If you don't know a molecule's SMILES string, you can look it up on PubChem or generate it with RDKit.

The reaction is then instantiated via a dedicated class, which serves as the basis for subsequent calculations (atom economy, PMI, E-factor, ...). 

In the following example, we consider the oxidation of ethanol to acetic acid:

- **Reactants**: ethanol (`CCO`), oxygen (`O=O`)
- **Products**: acetic acid (`CC(=O)O`), water (`O`)

In [None]:
reactants = ['CCO', 'O=O'] 
products = ['CC(=O)O', 'O'] 

reaction = Reaction(reactants, products)

This step creates a Reaction object that encodes the chemical transformation. SMILES are automatically converted into internal molecular objects (using RDKit), and the reaction is stored with the corresponding stoichiometry.
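The atom-balance check mentioned above can be sketched as follows. This is only an illustration, not the notebook's internal function: the helpers `count_atoms` and `is_balanced` are hypothetical names, and the sketch assumes each side of the reaction is given as `(SMILES, coefficient)` pairs.

```python
from collections import Counter

from rdkit import Chem


def count_atoms(smiles):
    """Count atoms per element, including implicit hydrogens."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    mol = Chem.AddHs(mol)  # make implicit hydrogens explicit so they are counted
    return Counter(atom.GetSymbol() for atom in mol.GetAtoms())


def is_balanced(reactants, products):
    """Return True if both sides contain the same atoms (hypothetical helper)."""
    def total(side):
        counts = Counter()
        for smiles, coeff in side:
            for element, n in count_atoms(smiles).items():
                counts[element] += coeff * n
        return counts

    return total(reactants) == total(products)


# Ethanol + O2 -> acetic acid + water, all coefficients equal to 1
balanced = is_balanced([("CCO", 1), ("O=O", 1)], [("CC(=O)O", 1), ("O", 1)])
print(balanced)  # True
```

Dropping water from the product side makes the same check return `False`, which is the kind of inconsistency the balance test is meant to catch before any metric is computed.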

**🔬 Step 2: Calculating the atom economy**

Atom economy is a fundamental indicator in green chemistry: it measures how efficiently the atoms of the reactants are incorporated into the main product. A reaction with good atom economy limits waste production and maximizes the value of raw materials. It is expressed as a percentage that can never exceed 100%, and the higher it is, the better.

GreenChemPanion allows you to calculate this indicator in two ways:
 - By number of atoms: evaluates the proportion of atoms present in the main product compared to all the reactants. Implicit hydrogens are included via an internal atom counting function on RDKit objects.
 - By molar mass: weights atoms by their mass to give a more accurate measurement in an industrial context. The mass of each molecule is obtained automatically from its structure, using the exact atomic masses provided by RDKit.


In [None]:
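# Call the methods on the reaction instance created in Step 1, not on the class
ae_atoms = reaction.Atom_Economy_A()  # atom economy by number of atoms
ae_mass = reaction.Atom_Economy_M()   # atom economy by molar mass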
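As a worked illustration of the two definitions above, the sketch below applies them to the ethanol oxidation example. It is not the implementation behind `Atom_Economy_A` and `Atom_Economy_M`: the helper `atom_economy` is hypothetical, and the atom counts and approximate average molar masses are entered by hand instead of being derived with RDKit, so the values returned by the notebook's methods may differ slightly.

```python
def atom_economy(main_product, reactants):
    """Return the atom economy in percent (hypothetical helper).

    main_product: (amount, coefficient) for the main product.
    reactants: list of (amount, coefficient) pairs.
    'amount' is either an atom count or a molar mass in g/mol.
    """
    product_amount, product_coeff = main_product
    total_in = sum(amount * coeff for amount, coeff in reactants)
    return 100.0 * product_amount * product_coeff / total_in


# By number of atoms: acetic acid C2H4O2 has 8 atoms; ethanol C2H6O has 9, O2 has 2
ae_by_atoms = atom_economy((8, 1), [(9, 1), (2, 1)])

# By molar mass (approximate values in g/mol, entered by hand)
ae_by_mass = atom_economy((60.05, 1), [(46.07, 1), (32.00, 1)])

print(f"Atom economy (atoms): {ae_by_atoms:.1f} %")  # 72.7 %
print(f"Atom economy (mass):  {ae_by_mass:.1f} %")   # 76.9 %
```

The two numbers differ because the atoms lost to the water by-product (two light hydrogens and one heavier oxygen) do not weigh the same, which is why the mass-based variant is usually the one quoted for process work.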

**🧮 Step 3: Calculating the PMI and E-factor**

These two indicators quantify the material efficiency of a chemical process by integrating the waste generated. Unlike the atom economy, they take into account all material flows, including by-products and solvents, when specified.

 - **PMI (Process Mass Intensity)** is the ratio between the total mass of materials used (reagents, solvents, catalysts, etc.) and the mass of the main product.
 - **E-factor** measures the mass of waste generated per unit mass of product.

Both indicators are directly calculated from the defined chemical reaction, taking into account stoichiometry. 

Molar masses are automatically calculated from SMILES structures via RDKit. Every product that is not designated as the main product is treated as waste when calculating the E-factor and PMI.
An internal function determines the main product: by default the first product given, or the one at a manually specified index.

In [None]:
pmi = reaction.PMI()
efactor = reaction.E_factor()

print(f"PMI : {pmi:.2f}")
print(f"E-factor : {efactor:.2f}")
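To make the two definitions concrete, here is a minimal sketch on the same ethanol oxidation, assuming ideal stoichiometric amounts, complete conversion, and no solvent or catalyst. The functions `pmi` and `e_factor` are hypothetical stand-ins, not the `reaction.PMI()` and `reaction.E_factor()` methods, and the masses are approximate molar masses entered by hand.

```python
def pmi(input_masses, product_mass):
    """Process Mass Intensity: total mass of inputs per mass of main product."""
    return sum(input_masses) / product_mass


def e_factor(waste_masses, product_mass):
    """E-factor: total mass of waste per mass of main product."""
    return sum(waste_masses) / product_mass


# Masses in grams per mole of reaction (approximate molar masses)
input_masses = [46.07, 32.00]  # ethanol, O2
product_mass = 60.05           # acetic acid (main product)
waste_masses = [18.02]         # water, treated as waste

pmi_example = pmi(input_masses, product_mass)
e_example = e_factor(waste_masses, product_mass)

print(f"PMI : {pmi_example:.2f}")      # 1.30
print(f"E-factor : {e_example:.2f}")   # 0.30
```

When every input ends up either in the main product or in the waste, the two metrics are linked by E-factor = PMI - 1, which the example values reflect.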

**🌍 Step 4: Evaluation of the solvents used**

Solvents can be responsible for a large part of the environmental impact of a chemical process. In this project, we integrated an automatic method for classifying solvents used in a reaction, based on their SMILES structure.

*Principle* - each solvent is assigned to one of three predefined categories:
 - ✅ Green: solvents considered environmentally friendly according to green chemistry guides (water, ethanol, ...).
 - 🟨 Acceptable: default category for unclassified solvents, assumed to be acceptable but not optimal.
 - ❌ Bad: solvents that are problematic for health or the environment (benzene, n-hexane, halogenated solvents, ...).

*Implementation* - a counter records how many solvents fall into each of the three categories:

In [None]:
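# Reference SMILES are canonicalized so that they compare equal to the
# canonical SMILES produced by Chem.MolToSmiles below
Green = {Chem.CanonSmiles(s) for s in ("O", "CCO", "CC(=O)OCC", "CC1COCC1", "O=C=O", "CC(O)C", "CO")}
Bad = {Chem.CanonSmiles(s) for s in ("ClCCl", "ClC(Cl)Cl", "c1ccccc1", "ClC(Cl)(Cl)Cl", "CCCCCC", "CCCCC")}

# Solvents and other extras attached to the reaction instance from Step 1
extras = reaction.get_extras()

# Count how many extras fall into each category
seen = {"Green": 0, "Acceptable": 0, "Bad": 0}
for mol in extras:
    smi = Chem.MolToSmiles(mol)  # canonical SMILES of the solvent
    if smi in Green:
        seen["Green"] += 1
    elif smi in Bad:
        seen["Bad"] += 1
    else:
        seen["Acceptable"] += 1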


Based on the count, an overall verdict is issued:

 - If a "Bad" solvent is detected: a red warning is displayed.
 - If only "Acceptable" solvents are detected: a yellow warning.
 - If all solvents are "Green": a green message validates the selection.
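The verdict rules above can be sketched as a small function over the `seen` counter. The name `solvent_verdict`, the message texts, and the grey fallback for a reaction without solvents are assumptions made for illustration; the sketch also assumes that a mix of "Green" and "Acceptable" solvents yields the yellow warning.

```python
def solvent_verdict(seen):
    """Map category counts to a (color, message) pair (hypothetical helper)."""
    if seen.get("Bad", 0) > 0:
        return "red", "At least one problematic solvent is used."
    if seen.get("Acceptable", 0) > 0:
        return "yellow", "Solvents are acceptable but not optimal."
    if seen.get("Green", 0) > 0:
        return "green", "All solvents are green."
    return "grey", "No solvent specified."  # assumed fallback, not in the rules above


example_counts = {"Green": 2, "Acceptable": 1, "Bad": 0}
color, message = solvent_verdict(example_counts)
print(color, "-", message)  # yellow - Solvents are acceptable but not optimal.
```

Checking the "Bad" count first means a single problematic solvent dominates the verdict, whatever else is present.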

**🧪 Step 5: Product Evaluation**

One of the sustainability criteria for a chemical reaction concerns the nature of the resulting products. Some compounds may be toxic, non-biodegradable, or derived from rare elements, which affects the overall sustainability of the process.

In this project, product evaluation is based on two aspects:

1. 🧑‍🔬 Presence of problematic atomic elements

   An initial filter checks whether the molecules produced contain elements that pose a risk from a green chemistry perspective. The list includes elements such as chlorine, bromine, lead, and certain heavy metals. If any of these elements are present, a warning is displayed.

In [None]:
# Example usage, called on the reaction instance created in Step 1
product_evaluation = reaction.evaluate_products()
print(product_evaluation)

This feedback indicates whether problematic atomic elements are present in the main product. The goal is to encourage the avoidance of products containing unsustainable elements.
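A minimal sketch of such an element filter is shown below. It is not the code behind `evaluate_products()`: the function `find_concerning_elements` is a hypothetical name, and the element set is only illustrative (chlorine, bromine, and lead come from the description above, while mercury and cadmium are assumed examples of heavy metals).

```python
from rdkit import Chem

# Illustrative list only; the notebook's actual list may differ
CONCERNING_ELEMENTS = {"Cl", "Br", "Pb", "Hg", "Cd"}


def find_concerning_elements(smiles):
    """Return the sorted concerning elements found in a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    present = {atom.GetSymbol() for atom in mol.GetAtoms()}
    return sorted(present & CONCERNING_ELEMENTS)


print(find_concerning_elements("CC(=O)O"))  # acetic acid -> []
print(find_concerning_elements("ClCCl"))    # dichloromethane -> ['Cl']
```

An empty list means no warning is needed; any other result names the elements that triggered it.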

2. 💧 Evaluation of logP (hydrophobicity)
 
 A second indicator is the logP value, the logarithm of the octanol/water partition coefficient, which provides an indication of the compound's solubility and environmental behavior.

 A low logP value indicates a hydrophilic molecule, which is generally more favorable for biodegradability. Conversely, a high logP may suggest a molecule that is likely to accumulate in living organisms and the environment.

 The function evaluates the hydrophobicity of the main product by calculating its logP value via RDKit and assigning it to one of several categories.

In [None]:
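from rdkit.Chem import Crippen

# Main product of the example reaction (acetic acid) as an RDKit molecule
product = Chem.MolFromSmiles('CC(=O)O')
logP = Crippen.MolLogP(product)  # Crippen estimate of the octanol/water logP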

In [None]:
# Called on the reaction instance created in Step 1
logp_evaluation = reaction.logP_assessment_molecule()
print(logp_evaluation)

The assessment is based on a simple classification:

 - ✅ 1.5 ≤ logP ≤ 2.5 → Hydrophilic, favorable product
 - 🚸 0 ≤ logP < 1.5 or 2.5 < logP ≤ 4 → Moderately hydrophobic product
 - 🚫 logP > 4 → Potentially problematic product
 
These two assessments are intended to provide additional clues about the environmental quality of the main reaction product.
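The classification can be sketched as a plain threshold function. The name `classify_logp` and the returned labels are hypothetical, the boundary values 1.5 and 2.5 are assumed to belong to the favorable range, and negative logP values, which the classification above does not mention, are reported as not covered.

```python
def classify_logp(logp):
    """Classify a logP value using the thresholds listed above (sketch)."""
    if logp > 4:
        return "potentially problematic"
    if 1.5 <= logp <= 2.5:
        return "favorable"
    if 0 <= logp <= 4:
        return "moderately hydrophobic"
    return "not covered"  # logP < 0 is not addressed by the classification


for value in (2.0, 0.5, 3.0, 5.2, -0.3):
    print(value, "->", classify_logp(value))
```

In the notebook, the input value would come from `Crippen.MolLogP` applied to the main product.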

**⚗️ Step 6: Structural Assessment**

Beyond atomic composition and physicochemical properties, a product's molecular structure can significantly influence its environmental impact. Certain substructures or functional units are known to be problematic: they can affect biodegradability, cause toxic effects, or make the molecule persistent in the environment.

In GreenChemPanion, we implemented an evaluation based on the detection of risky motifs in the main product molecule.

*🔍 Analysis method:*

The structural_assessment() function allows you to assess whether products contain risky structural motifs, based on:

1. SMARTS substructures known to be problematic, such as:
 - carbon oxides (`CO`, `CO2`)
 - nitro groups (`NO2`)
 - azo groups (`N=N`)
 - highly halogenated aromatic rings (`dichlorobenzenes`) 

2. The presence of long chains of heavy atoms (more than 10 non-hydrogen atoms), often associated with biodegradability problems.

*🧪 How it works*

For each product:

- The molecule is examined for substructures corresponding to the risky motifs using mol.HasSubstructMatch(...)

- A structural similarity analysis (molecular fingerprinting + Tanimoto) is used as a complementary method: if a structure is sufficiently close to a known motif, a warning is issued.

- If a product exceeds 10 heavy atoms (excluding hydrogens), a flag is raised.
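The substructure and size checks can be sketched as follows. This is a simplified illustration, not the `structural_assessment()` function: `structural_flags` is a hypothetical name, only two of the motifs are included, the SMARTS strings are assumptions, and the fingerprint/Tanimoto similarity step is left out.

```python
from rdkit import Chem

# Illustrative SMARTS patterns (assumed, not the notebook's actual list)
RISKY_PATTERNS = {
    "nitro group": Chem.MolFromSmarts("[N+](=O)[O-]"),
    "azo group": Chem.MolFromSmarts("N=N"),
}


def structural_flags(smiles, max_heavy_atoms=10):
    """Return the list of structural warnings raised for a molecule."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Invalid SMILES: {smiles}")
    flags = [name for name, pattern in RISKY_PATTERNS.items()
             if mol.HasSubstructMatch(pattern)]
    if mol.GetNumHeavyAtoms() > max_heavy_atoms:  # hydrogens are not counted
        flags.append(f"more than {max_heavy_atoms} heavy atoms")
    return flags


print(structural_flags("CC(=O)O"))                # acetic acid -> []
print(structural_flags("[O-][N+](=O)c1ccccc1"))   # nitrobenzene -> ['nitro group']
print(structural_flags("c1ccccc1N=Nc1ccccc1"))    # azobenzene -> ['azo group', 'more than 10 heavy atoms']
```

Compiling the SMARTS patterns once at module level avoids re-parsing them for every product.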

*🖥️ Example result*

In [None]:
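from IPython.display import Markdown, display

# Pass the reaction instance created in Step 1, not the Reaction class
msg, color = structural_assessment(reaction)
display(Markdown(msg))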

### 📊 User interface with Streamlit

To make sustainability assessment accessible to chemists without programming skills, a simple and interactive user interface was developed using Streamlit.

**🎯 Objective**

The objective is to allow the user to:

 - Define a chemical reaction using SMILES,
 - Automatically launch the sustainability analysis based on the defined indicators (atom economy, E-factor, solvents, logP, structure, etc.),
 - Visualize the results in a clear and colorful interface.

 
**💻 Interface Features**

The Streamlit interface includes:

1. Reactant and Product Entry:
 - Text field for entering SMILES separated by commas.
 - Choice of the main product via a drop-down menu.
2. Automated Analysis:
 - Creation of the Reaction object from user data.
 - Calculation of all sustainability indicators using the previously defined functions.
3. Result Display:
 - Results presented as text accompanied by emoticons ✅ 🚸 🚨.
 - Use of colors (green, yellow, red) to reflect the environmental assessment.
 - Option to have an overall summary at the end.

## 🌿 GreenChemPanion: Challenges, Features and Limitations

### **🧪 Introduction**

GreenChemPanion is an interactive notebook designed to assess the environmental sustainability of chemical reactions based on the principles of green chemistry. It is based on the analysis of indicators such as atom economy, E-factor, solvent use, and the nature of the products generated.

### **🎯 Motivations**

This project is aimed at anyone (students, teachers, or researchers) who wishes to integrate sustainability concepts from the design phase of a chemical synthesis. By offering a rapid, visual, and multi-criteria assessment of a reaction based on its SMILES representation, GreenChemPanion makes it possible to identify the main weak points of a transformation from a green chemistry perspective in just a few seconds. Using various indicators (atom economy, waste generated, nature of solvents, atomic elements present, molecular structure of the product), the project encourages a more critical and responsible approach to chemistry.

Beyond the technical aspect, GreenChemPanion is also intended as an educational and awareness-raising tool. It reminds us that every choice made during a synthesis, from the solvent to the product formed, can have a measurable environmental impact. By facilitating access to criteria often considered secondary in laboratory planning, this project helps make sustainable chemistry a concrete, accessible, and applicable priority starting in university education.

### **🌟 Main Features**

- **Atom Economy Calculation**: Measures the efficiency with which reactants are converted into the main product.

- **E-Factor & PMI**: Evaluate the mass of waste generated and the mass of material consumed.

- **Solvent Assessment**: Classifies solvents into three categories ("Green", "Acceptable", "Bad") according to their environmental impact.

- **LogP Analysis**: Provides an indication of the product's biodegradability and persistence in a biological environment.

- **Elemental Risk Scan**: Detects the presence of problematic elements (heavy metals, halogens, etc.).

- **Structural Assessment**: Searches for structural motifs associated with environmental risks or toxicity.

- **Streamlit Interface**: Interactive interface for easily testing different reactions.

### **📉 Challenges Encountered**

- **Stoichiometry and balancing**: The code had to ensure that the reactions were well balanced before running the calculations.

- **Reliable atom counting**: Implicit hydrogens had to be taken into account to obtain a realistic count.

- **Choice of reference solvents**: The classification of solvents was based on an arbitrary selection of representative SMILES, which sometimes required debatable judgment calls.

- **Product evaluation**: Certain criteria (such as logP or the presence of "at risk" atoms) may vary depending on the application context, which introduces a degree of subjectivity.

- **Using Streamlit**: Integrating it into a dynamic interface while maintaining explicit and colorful user feedback required technical adjustments.

### **🚧 Limitations**

- **Subjectivity in some criteria**: The evaluation of problematic solvents or structures is based on lists chosen by the developers, which may not cover all cases and may rely on debatable sources.

- **SMILES only**: Some complex reactions cannot be properly analyzed if their SMILES are malformed or ambiguous.

- **Simplification of waste**: All products that are not the main product are considered waste, which can be reductive in the case of recoverable co-products.

- **Lack of energy factors**: The software does not take into account experimental conditions (such as temperature and pressure), which also influence sustainability.

- **Limited to one reaction at a time**: The system evaluates only one transformation at a time and does not yet allow a global evaluation of a multi-step synthetic route.

### **✅ Conclusion**

GreenChemPanion represents a first step toward an interactive and accessible green chemistry assistant. While still imperfect, it allows for an initial assessment of chemical transformations from a sustainability perspective. By combining various indicators in a simple interface, it serves as an educational and awareness-raising tool. The project remains open to future expansions: consideration of energy, a multi-step approach, a reaction database, or even the integration of artificial intelligence for more refined recommendations.

