# <a id='toc1_'></a>[PCE: Coupling RDKit and CREST-xTB with the Scharber Algorithm](#toc0_)

1. **MVOTO KONGO Patrick Sorrel**, sorrel.mvoto@facsciences-uy1.cm
    * Department of Physics, Faculty of Science, University of Yaounde I
    * Master's student at the Laboratory of Atomic and Molecular Physics and Biophysics

12 June 2024

####  <a id='toc1_3_'></a>[DESCRIPTION](#toc0_)

* Our thesis work is based on the Tartarus workflow. The algorithm is as follows:
<center><img src="./Graphics/tartarusoverview.png" width="1500"></center>

 An illustration of the Tartarus framework, highlighting the real-world design tasks that are defined and associated with simulation workflows and datasets in Tartarus.




*  Our research goal is to design small molecules with specific electronic properties, in particular molecules capable of charge separation, inspired by organic photovoltaic (OPV) design. We have two individual tasks:

- Finding an organic donor molecule to be used with [6,6]-phenyl-C61-butyric acid methyl ester (PCBM).
- Finding an acceptor molecule to be used in devices based on poly[N-9'-heptadecanyl-2,7-carbazole-alt-5,5-(4',7'-di-2-thienyl-2',1',3'-benzothiadiazole)] (PCDTBT).

*  Our current work focuses on the model below:
<center><img src="./Graphics/opv.png" width="2000"></center>

 Schematic of the property simulation workflow for the organic photovoltaics design benchmarks.
  

### <a id='toc1_'></a>[Using Pandas to extract information from the hce.csv file and create a DataFrame](#toc0_)
<!-- ![MolecularDimension.png](attachment:MolecularDimension.png) -->
![MolecularDimension.png](./Graphics/Pandas.jpg)

In [1]:
import pandas as pd

# Read the CSV file
df1 = pd.read_csv("hce.csv")

columns = ["smiles", "pce_1", "pce_2", "pce_pcbm_sas", "pce_pcdtbt_sas", "sas"]

# Keep rows with pce_1 > 10.8
df_acc = df1[df1["pce_1"] > 10.8][columns]

# Keep rows with pce_2 > 33.9
df_don = df1[df1["pce_2"] > 33.9][columns]

# Concatenate DataFrames and reset index
my1_df = pd.concat([df_acc, df_don], ignore_index=True)

# Short label for each molecule, later used as a directory name.
# The list must contain exactly one entry per selected row.
smiles1 = ['smiles0', 'smiles1', 'smile3']
my1_df['smiles_key'] = smiles1

# Reorder columns
my1_df = my1_df[['smiles_key', 'smiles', 'pce_1', 'pce_2',"pce_pcbm_sas", "pce_pcdtbt_sas","sas"]]


print(my1_df.loc[0,'smiles'])

c1ncc(s1)-c1sc(-c2cnc(s2)-c2scc3cc[se]c23)c2nccnc12


In [2]:
my1_df

| | smiles_key | smiles | pce_1 | pce_2 | pce_pcbm_sas | pce_pcdtbt_sas | sas |
|---|---|---|---|---|---|---|---|
| 0 | smiles0 | c1ncc(s1)-c1sc(-c2cnc(s2)-c2scc3cc[se]c23)c2nc... | 10.802524 | 15.454414 | 6.7958 | 11.447689 | 4.006724 |
| 1 | smiles1 | c1c-c2cc3cnc4ccc5ccccc5c4c3cc2-nc1 | 0.0 | 33.912133 | -2.203358 | 31.708776 | 2.203358 |
| 2 | smile3 | c1sc(-c2cc3cc4sc5ccc6c[nH]cc6c5c4cc3c3ccccc23)... | 0.0 | 33.961095 | -3.268084 | 30.693011 | 3.268084 |


####  <a id='toc1_3_'></a>[2D representation of the molecules with Draw, according to the PCE of each molecule](#toc0_)

####  <a id='toc1_3_'></a>[Using CREST and xTB for the conformer search](#toc0_)
 <center> <img src = "./Graphics/crest.png" width = "600">
 <img src = "./Graphics/xtb.jpeg" width = "600"> </center> 
 


####  <a id='toc1_3_'></a>[CREST and xTB workflow for the conformer search](#toc0_)
 <center> <img src = "./Graphics/crestfeatures2.png" width = "600">
 <img src = "./Graphics/workflow.jpeg" width = "600"> </center> 


In [3]:
import xtb

In [4]:
!xtb --version

      -----------------------------------------------------------      
     |                           x T B                           |     
     |                         S. Grimme                         |     
     |          Mulliken Center for Theoretical Chemistry        |     
     |                    University of Bonn                     |     
      -----------------------------------------------------------      

   * xtb version 6.3.3 (71d3805) compiled by 'conda@b85dec0bf610' on 2021-01-07

normal termination of xtb


In [5]:
!crest --version


       |                                            |
       |                 C R E S T                  |
       |                                            |
       |  Conformer-Rotamer Ensemble Sampling Tool  |
       |          based on the GFN methods          |
       |             P.Pracht, S.Grimme             |
       |          Universitaet Bonn, MCTC           |
       Version 2.12,   Thu 19. Mai 16:32:32 CEST 2022
  Using the xTB program. Compatible with xTB version 6.4.0

   Cite work conducted with this code as

   • P.Pracht, F.Bohle, S.Grimme, PCCP, 2020, 22, 7169-7192.
   • S.Grimme, JCTC, 2019, 15, 2847-2862.

   and for works involving QCG as

   • S.Spicher, C.Plett, P.Pracht, A.Hansen, S.Grimme,
     JCTC, 2022, 18 (5), 3174-3189.

   with help from:
   C.Bannwarth, F.Bohle, S.Ehlert, S.Grimme,
   C.Plett, P.Pracht, S.Spicher

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   M

In [6]:
import os, sys
import inspect
from pathlib import Path
import tempfile
from utils import run_command

import rdkit
from rdkit.Chem import AllChem, DataStructs
from rdkit.Chem import AllChem as Chem
from rdkit.Chem import RDConfig
sys.path.append(os.path.join(RDConfig.RDContribDir, 'SA_Score'))
import sascorer

import numpy as np
import torch
import torch.nn as nn


In [7]:
MY_DATA = os.path.join(os.getcwd(), 'MY_DATA')
os.makedirs(MY_DATA, exist_ok=True)


In [None]:
import os
import pandas as pd
from rdkit import Chem


def calculate_properties(df, working_dir):
    """
    Calculate xTB properties for the SMILES strings in a DataFrame.

    Each molecule goes through: 3D generation with Open Babel, a preliminary
    GFN2-xTB optimization, a CREST conformer search, and a final xTB
    optimization of the first conformer in crest_conformers.xyz. Molecules for
    which xtb fails or whose output cannot be parsed are skipped.

    Args:
        df (pandas.DataFrame): A DataFrame containing "smiles" and "smiles_key" columns.
        working_dir (str): The directory to use for calculations and temporary files.

    Returns:
        pandas.DataFrame: Calculated properties for the successful molecules.
    """
    # The second value parsed from xtb is the total molecular dipole, not an energy
    columns = ["smiles_key", "HOMO-LUMO GAP (eV)", "DIPOLE (Debye)",
               "HOMO Energy (eV)", "LUMO Energy (eV)"]
    rows = []
    start_dir = os.getcwd()  # restored after each molecule

    for i in df.index:
        smiles = df.loc[i, "smiles"]
        smiles_key = df.loc[i, "smiles_key"]

        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            print(f"RDKit could not parse '{smiles}'. Skipping.")
            continue
        mol = Chem.AddHs(mol)
        charge = Chem.rdmolops.GetFormalCharge(mol)
        atom_number = mol.GetNumAtoms()

        # One sub-directory per molecule
        directory = os.path.join(working_dir, smiles_key)
        os.makedirs(directory, exist_ok=True)
        os.chdir(directory)

        try:
            # Write the SMILES string and generate a starting 3D geometry
            with open('test.smi', 'w') as f:
                f.write(smiles)
            os.system('obabel test.smi --gen3D -O test.xyz')

            # Preliminary xtb optimization
            command_pre = 'xtb test.xyz --gfn 2 --opt normal -c {} --iterations 4000'.format(charge)
            if os.system(command_pre) != 0:
                print(f"xtb preliminary optimization failed for '{smiles}'. Skipping.")
                continue

            # Optional cleanup; -f avoids an error when the files do not exist
            os.system('rm -f ./gfnff_charges ./gfnff_topo')

            # CREST conformer ensemble
            command_crest = 'crest xtbopt.xyz -gff -mquick -chrg {} --noreftopo'.format(charge)
            os.system(command_crest)
            os.system('rm -f ./gfnff_charges ./gfnff_topo')

            # Keep the first structure: atom-count line, comment line, then the atoms
            os.system('head -n {} crest_conformers.xyz > crest_best.xyz'.format(atom_number + 2))

            # Final xtb optimization; the output is captured in out_dump
            command = 'xtb crest_best.xyz --opt normal -c {} --iterations 4000 > out_dump'.format(charge)
            if os.system(command) != 0:
                print(f"xtb final optimization failed for '{smiles}'. Skipping.")
                continue

            # Parse the "Property Printout" section of the xtb output
            with open('./out_dump', 'r') as f:
                text_content = f.readlines()
            output_index = [n for n, line in enumerate(text_content) if 'Property Printout' in line]
            if not output_index:
                print(f"No property printout found for '{smiles}'. Skipping.")
                continue
            text_content = text_content[output_index[0]:]
            homo_data = [x for x in text_content if '(HOMO)' in x]
            lumo_data = [x for x in text_content if '(LUMO)' in x]
            homo_lumo_gap = [x for x in text_content if 'HOMO-LUMO GAP' in x]
            mol_dipole = [text_content[n:n+4] for n, x in enumerate(text_content) if 'molecular dipole:' in x]
            lumo_val = float(lumo_data[0].split(' ')[-2])
            homo_val = float(homo_data[0].split(' ')[-2])
            homo_lumo_val = float(homo_lumo_gap[0].split(' ')[-5])
            mol_dipole_val = float(mol_dipole[0][-1].split(' ')[-1])

            # Write the properties to a per-molecule file (modify as needed)
            with open(os.path.join(directory, 'properties.txt'), 'a') as f:
                f.write(f'SMILES: {smiles}\n')
                f.write(f'LUMO: {lumo_val}\n')
                f.write(f'HOMO: {homo_val}\n')
                f.write(f'HOMO-LUMO GAP: {homo_lumo_val}\n')
            rows.append([smiles_key, homo_lumo_val, mol_dipole_val, homo_val, lumo_val])
        finally:
            # Always return to the starting directory, even when a molecule is skipped
            os.chdir(start_dir)

    return pd.DataFrame(rows, columns=columns)

df_xtb = calculate_properties(my1_df, MY_DATA)

1 molecule converted


      -----------------------------------------------------------      
     |                           x T B                           |     
     |                         S. Grimme                         |     
     |          Mulliken Center for Theoretical Chemistry        |     
     |                    University of Bonn                     |     
      -----------------------------------------------------------      

   * xtb version 6.3.3 (71d3805) compiled by 'conda@b85dec0bf610' on 2021-01-07

   xtb is free software: you can redistribute it and/or modify it under
   the terms of the GNU Lesser General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.
   
   xtb is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU Lesser General Public License for mo

normal termination of xtb
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
rm: impossible de supprimer './gfnff_charges': Aucun fichier ou dossier de ce type
rm: impossible de supprimer './gfnff_topo': Aucun fichier ou dossier de ce type



       |                                            |
       |                 C R E S T                  |
       |                                            |
       |  Conformer-Rotamer Ensemble Sampling Tool  |
       |          based on the GFN methods          |
       |             P.Pracht, S.Grimme             |
       |          Universitaet Bonn, MCTC           |
       Version 2.12,   Thu 19. Mai 16:32:32 CEST 2022
  Using the xTB program. Compatible with xTB version 6.4.0

   Cite work conducted with this code as

   • P.Pracht, F.Bohle, S.Grimme, PCCP, 2020, 22, 7169-7192.
   • S.Grimme, JCTC, 2019, 15, 2847-2862.

   and for works involving QCG as

   • S.Spicher, C.Plett, P.Pracht, A.Hansen, S.Grimme,
     JCTC, 2022, 18 (5), 3174-3189.

   with help from:
   C.Bannwarth, F.Bohle, S.Ehlert, S.Grimme,
   C.Plett, P.Pracht, S.Spicher

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   M

rm: impossible de supprimer './gfnff_charges': Aucun fichier ou dossier de ce type
normal termination of xtb
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
1 molecule converted


      -----------------------------------------------------------      
     |                           x T B                           |     
     |                         S. Grimme                         |     
     |          Mulliken Center for Theoretical Chemistry        |     
     |                    University of Bonn                     |     
      -----------------------------------------------------------      

   * xtb version 6.3.3 (71d3805) compiled by 'conda@b85dec0bf610' on 2021-01-07

   xtb is free software: you can redistribute it and/or modify it under
   the terms of the GNU Lesser General Public License as published by
   the Free Software Foundation, either version 3 of the License, or
   (at your option) any later version.
   
   xtb is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU Lesser General Public License for mo

#### Extracting E_HOMO, E_LUMO and E_gap
### Molecular orbitals (MO)
<center><img src="./Graphics/Molecule_HOMO-LUMO_diagram.png" width="600"></center>

In [None]:
df_xtb

In [None]:
# Save the xtb results dataframe to a file


## Merging DataFrames

In [None]:
my_df=df_xtb

In [None]:
import pandas as pd

def calibrate_data(my_df):
    """Add calibrated HOMO, LUMO and gap columns (in eV) to the xTB results."""
    for i in my_df.index:
        # Linear calibration of the xTB HOMO and LUMO levels
        homo_cal = my_df.loc[i, "HOMO Energy (eV)"] * 0.8051030400316004 + 2.5376777453204133
        lumo_cal = my_df.loc[i, "LUMO Energy (eV)"] * 0.8787863933542347 + 3.7912767464357200
        # For negative orbital energies, |HOMO| - |LUMO| equals the LUMO - HOMO gap
        gap_cal = abs(homo_cal) - abs(lumo_cal)

        # Update the DataFrame with the calibrated values
        my_df.loc[i, "Gap_calibrated"] = gap_cal
        my_df.loc[i, "homo_calibrated"] = homo_cal
        # Stored as an absolute value; the Scharber cell relies on this convention
        my_df.loc[i, "lumo_calibrated"] = abs(lumo_cal)
    return my_df

# Calibrate the xTB results
my_df = calibrate_data(my_df)


In [None]:
my_df

 
## <a id='toc1_4_'></a>[SCHARBER MODEL](#toc0_)
<div  class="alert alert-info">
Current density-voltage (J/V) curves in the dark (black circles) and under illumination
(white circles*), and the main characteristics of solar cells.

</div>
<center><img src="./Graphics/Densite.png" width="600"></center>


 ###  <a id='toc1_4_'></a>[Short-circuit current density ($J_{sc}$)](#toc0_)


 <div  class="alert alert-info">
Depends on the number of photons absorbed by the material, and therefore on the thickness of the active layer and on its absorption spectrum. It represents the maximum electric current that an organic solar cell can generate under short-circuit conditions.

</div> 

\begin{equation}
J_{sc} = A \, e^{-E_{GAP}^2 / B}
\end{equation}
A and B are fitting parameters that are already fixed in Tartarus: A = 433.11633173034136, B = 2.3353220382662894.
*  $E_{GAP}$ is the gap energy of the donor, that is, the difference between the energies of the HOMO and LUMO molecular orbitals. Its maximum value in the CEP work is estimated at $3.8\ eV$, and the normal range of values is $[0.8856, 3.2627]\ eV$.
* For OPVs, the corresponding wavelength $\lambda$ (in Å) is computed as follows:


\begin{equation}
\lambda(Å) = \frac{12398\ (eV \cdot Å)}{E_{GAP}\ (eV)}
\end{equation}

* Converted to metres, $\lambda$ ranges over $[3.7999203114 \times 10^{-7}, 1.3999548329 \times 10^{-6}]\ m$ (about 380 nm to 1400 nm), which spans the visible and near-infrared domains.
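
The two formulas of this section can be checked numerically with a minimal sketch. It reuses the A and B values quoted above; the function names `jsc_from_gap` and `wavelength_nm` are illustrative and are not part of Tartarus.

```python
import math

# Fitting parameters quoted in this section (fixed in Tartarus)
A = 433.11633173034136
B = 2.3353220382662894
HC_EV_ANGSTROM = 12398.0  # numerator of the wavelength formula, in eV*Angstrom


def jsc_from_gap(e_gap_ev):
    """Short-circuit current density J_sc = A * exp(-E_gap^2 / B)."""
    return A * math.exp(-e_gap_ev ** 2 / B)


def wavelength_nm(e_gap_ev):
    """Wavelength in nm: 12398 / E_gap gives Angstrom, and 1 nm = 10 Angstrom."""
    return HC_EV_ANGSTROM / e_gap_ev / 10.0


# Evaluate both quantities at the two ends of the normal gap range
for gap in (0.8856, 3.2627):
    print(f"E_gap = {gap} eV: lambda = {wavelength_nm(gap):.1f} nm, "
          f"J_sc = {jsc_from_gap(gap):.2f}")
```

A smaller gap gives a longer wavelength and a larger $J_{sc}$, which is why the Gaussian form favours low-gap donors.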

###  <a id='toc1_4_'></a>[Open-circuit voltage ($V_{oc}$)](#toc0_)


 <div  class="alert alert-info">
Represents the maximum voltage that a solar cell can deliver to an external load.

* Set by the gap between the HOMO of the donor compound and the LUMO of the acceptor compound

* Influenced by charge recombination, which cannot be completely avoided

</div> 
        

\begin{equation}
V_{oc} = \frac{1}{e}\left(\left|E^{Do}_{HOMO}\right| - \left|E^{Ac}_{LUMO}\right|\right) - 0.3
\end{equation}

* $e$ is the electron charge
* $E^{Do}_{HOMO}$ is the HOMO energy of the donor. Its maximum value in the CEP work is estimated at $-3.62\ eV$, and the normal range of values is $[-5.7, -4.5]\ eV$
* $E^{Ac}_{LUMO}$ is the LUMO energy of the acceptor. Its maximum value in the CEP work is estimated at $-1.15\ eV$, and the normal range of values is $[-4, -3]\ eV$
* The $-0.3$ term is the threshold voltage below which excitons are not separated
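
As a rough illustration of this formula, the sketch below evaluates $V_{oc}$ for a donor paired with PCBM, whose LUMO is taken as -4.3 eV as in the code cells of this notebook. The function name `open_circuit_voltage` and the donor HOMO of -5.4 eV are hypothetical, chosen only for the example. Energies are in eV, so dividing by $e$ gives volts with the same numerical value.

```python
def open_circuit_voltage(e_homo_donor_ev, e_lumo_acceptor_ev, loss_v=0.3):
    """V_oc = (|E_HOMO(donor)| - |E_LUMO(acceptor)|) - 0.3, floored at zero."""
    voc = abs(e_homo_donor_ev) - abs(e_lumo_acceptor_ev) - loss_v
    # A negative result means no usable voltage, so it is clipped to zero
    return max(voc, 0.0)


# Hypothetical donor HOMO of -5.4 eV with the PCBM LUMO of -4.3 eV
print(open_circuit_voltage(-5.4, -4.3))

# A donor HOMO too close to the acceptor LUMO gives no voltage at all
print(open_circuit_voltage(-4.4, -4.3))
```

The clipping at zero mirrors the `if voc < 0.0` test used later in the notebook.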

###  <a id='toc1_4_'></a>[Fill factor (FF)](#toc0_)


 <div  class="alert alert-info">
 
* Reflects the charge-transport capability of the device and the quality of the interface between the donor and the acceptor; estimated at 65%

* Is proportional to the external quantum efficiency when the absorbed photon energies exceed $E_{GAP}$
</div> 
        

\begin{equation}
FF = \frac{V_{oc}}{V_{oc} + a \, K_{b} \, T}
\end{equation}

* $a$ is the absorption coefficient
* $K_{b}$ is the Boltzmann constant
* $T$ is the external temperature
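
To get a feel for the orders of magnitude, here is a small sketch of the fill-factor formula. It assumes that $a$ is treated as a dimensionless parameter and that $K_{b} T$ is expressed in eV, so that it can be added to $V_{oc}$. Both functions and the values $V_{oc} = 0.8$ V and $T = 300$ K are illustrative assumptions, not values from Tartarus.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def fill_factor(voc, a, temperature_k=300.0):
    """FF = V_oc / (V_oc + a * k_B * T), with k_B * T expressed in eV."""
    return voc / (voc + a * K_B_EV_PER_K * temperature_k)


def a_for_target_ff(voc, target_ff, temperature_k=300.0):
    """Invert the formula: the value of a that gives target_ff at this V_oc."""
    return voc * (1.0 - target_ff) / (target_ff * K_B_EV_PER_K * temperature_k)


# Which value of a reproduces the 65% fill factor at V_oc = 0.8 V and 300 K?
a = a_for_target_ff(0.8, 0.65)
print(a, fill_factor(0.8, a))
```

The code cells of this notebook do not evaluate this expression; they use the constant value 0.65 for FF.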



###  <a id='toc1_4_'></a>[Power conversion efficiency (PCE)](#toc0_)


 <div  class="alert alert-info">
Ratio of the electrical power produced by the cell to the incident light power. Its current value is estimated at 12%, and the target is 20%.

</div> 
        

\begin{equation}
PCE = 100 \, \frac{V_{oc} \, FF \, J_{sc}}{P_{in}}
\end{equation}
* $P_{in}$ is the incident light power
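
Putting the pieces together, the following sketch evaluates the PCE from a given $V_{oc}$ and gap. The constants are the ones used in the code cells of this notebook (A, B, $P_{in}$, the fixed fill factor 0.65 and the cap on $J_{sc}$). The function name `scharber_pce` and the input values 0.8 V and 1.5 eV are illustrative assumptions.

```python
import math

A = 433.11633173034136        # J_sc prefactor
B = 2.3353220382662894        # J_sc width parameter
P_IN = 900.1393292842149      # incident light power
FF = 0.65                     # fixed fill factor
JSC_MAX = 415.22529811760637  # upper bound applied to J_sc


def scharber_pce(voc, e_gap_ev):
    """PCE = 100 * V_oc * FF * J_sc / P_in, with J_sc = min(A * exp(-E_gap^2 / B), JSC_MAX)."""
    jsc = min(A * math.exp(-e_gap_ev ** 2 / B), JSC_MAX)
    return 100.0 * voc * FF * jsc / P_IN


# Hypothetical molecule with V_oc = 0.8 V and a calibrated gap of 1.5 eV
print(f"PCE = {scharber_pce(0.8, 1.5):.2f} %")
```

Because $J_{sc}$ is capped and FF is fixed, the PCE grows linearly with $V_{oc}$ at a given gap.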



In [None]:
def gaussian(x, A, B):
    return A * np.exp(-x** 2 / B)


In [None]:

import numpy as np
# Define parameters for Scharber model
A = 433.11633173034136
B = 2.3353220382662894
Pin = 900.1393292842149

In [None]:
import pandas as pd

def calculate_voc_pce_jsc(my_df, Pin, A, B):
    """
    Calculate Voc, Jsc and PCE for each row of the DataFrame and add them as new columns.

    Args:
        my_df (pandas.DataFrame): Molecule data with 'homo_calibrated',
            'lumo_calibrated' (stored as |E_LUMO|) and 'Gap_calibrated' columns.
        Pin (float): The incident light power density.
        A (float): Gaussian function parameter A.
        B (float): Gaussian function parameter B.

    Returns:
        pandas.DataFrame: The modified DataFrame with Voc, Jsc and PCE columns.
    """
    ff = 0.65                     # fixed fill factor
    jsc_max = 415.22529811760637  # upper bound applied to Jsc

    for i in my_df.index:
        homo_cal = my_df.loc[i, "homo_calibrated"]
        lumo_abs = my_df.loc[i, "lumo_calibrated"]  # absolute value of the LUMO energy
        gap = my_df.loc[i, "Gap_calibrated"]

        # Jsc depends only on the calibrated gap, so it is the same for both objectives
        jsc = min(gaussian(gap, A, B), jsc_max)

        # Scharber model objective 1: Optimization of donor for phenyl-C61-butyric acid methyl ester (PCBM) acceptors
        voc_1 = max((abs(homo_cal) - abs(-4.3)) - 0.3, 0.0)
        # The donor LUMO must lie at least 0.3 eV above the PCBM LUMO (-4.3 eV).
        # 'lumo_calibrated' holds |E_LUMO|, hence the subtraction.
        lumo_offset_1 = 4.3 - lumo_abs
        jsc_1 = jsc
        pce_1 = 100 * voc_1 * ff * jsc_1 / Pin if lumo_offset_1 >= 0.3 else 0.0

        # Scharber model objective 2: Optimization of acceptor for poly[N-9'-heptadecanyl-2,7-carbazole-alt-5,5-(4',7'-di-2-thienyl-2',1',3'-benzothiadiazole)] (PCDTBT) donor
        voc_2 = max((abs(-5.5) - abs(lumo_abs)) - 0.3, 0.0)
        # The acceptor LUMO must lie at least 0.3 eV below the PCDTBT LUMO (-3.6 eV)
        lumo_offset_2 = -3.6 + lumo_abs
        jsc_2 = jsc
        pce_2 = 100 * voc_2 * ff * jsc_2 / Pin if lumo_offset_2 >= 0.3 else 0.0

        # Add separate Voc, Jsc and PCE columns for each objective
        my_df.loc[i, "voc_pcbm"] = voc_1
        my_df.loc[i, "jsc_pcbm"] = jsc_1
        my_df.loc[i, "pce_pcbm"] = pce_1

        my_df.loc[i, "voc_pcdtbt"] = voc_2
        my_df.loc[i, "jsc_pcdtbt"] = jsc_2
        my_df.loc[i, "pce_pcdtbt"] = pce_2

    return my_df

# Pin, A and B are defined in the previous cell
my_df = calculate_voc_pce_jsc(my_df, Pin, A, B)

In [None]:
my_df1=my_df 
my_df 

In [None]:
my1_df

In [None]:
# Absolute difference between the computed PCE and the reference value in my1_df
for i in my_df1.index:
    my_df1.at[i, 'dif_pce1'] = abs(my_df.at[i, 'pce_pcbm'] - my1_df.at[i, 'pce_1'])
    my_df1.at[i, 'dif_pce2'] = abs(my1_df.at[i, 'pce_2'] - my_df.at[i, 'pce_pcdtbt'])


In [None]:
my_df1



###  <a id='toc1_4_'></a>[The PCE values differ by about 2% from those in the database, and the SA score differs by about 5% from that of the Tartarus model](#toc0_)

In [None]:
### SA score prediction model

In [None]:
from pathlib import Path
from rdkit.Chem import RDConfig
import os, sys
sys.path.append(os.path.join(RDConfig.RDContribDir, 'SA_Score'))
import sascorer

In [None]:
for i in range(len(my1_df)):
    mol_rdkit = Chem.MolFromSmiles(my1_df.loc[i, 'smiles'])

    if mol_rdkit is not None:
        # Add explicit hydrogens
        mol = Chem.AddHs(mol_rdkit)
        sas = sascorer.calculateScore(mol)
        my_df1.at[i, 'sas1'] = sas
        my_df1.at[i, 'pce_pcbm_sas'] = my_df1.at[i, 'pce_pcbm'] - sas
        my_df1.at[i, 'pce_pcdtbt_sas'] = my_df1.at[i, 'pce_pcdtbt'] - sas
        # Compare like with like: SA-penalized PCE against the SA-penalized reference
        my_df1.at[i, 'dif_pce1_sas'] = abs(my_df1.at[i, 'pce_pcbm_sas'] - my1_df.at[i, 'pce_pcbm_sas'])
        my_df1.at[i, 'dif_pce2_sas'] = abs(my1_df.at[i, 'pce_pcdtbt_sas'] - my_df1.at[i, 'pce_pcdtbt_sas'])
        # Difference between the two SA scores, stored in its own column
        my_df1.at[i, 'dif_sas'] = abs(my1_df.at[i, 'sas'] - my_df1.at[i, 'sas1'])

In [None]:

my_df1

In [None]:

my1_df

In [None]:
# Save both DataFrames to CSV, keeping the index column
my_df1.to_csv('my_crest_xtb.csv', index=True)
my1_df.to_csv('tartarus_crest_xtb.csv', index=True)
