### Step 1: Install Necessary Libraries
First, install the required Python libraries, RDKit and PySoftK. The cell below clones the PySoftK repository and runs `pip install .`, which installs the package together with the dependencies declared in its `setup.py` file.

In [None]:
# Clone the pysoftk repository from GitHub
!git clone https://github.com/alejandrosantanabonilla/pysoftk

# Navigate to the pysoftk directory
%cd pysoftk

# Install the package from the current directory, which reads from setup.py
!pip install .
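
After the installation finishes, a quick sanity check can confirm that the modules imported later in this notebook (`rdkit` and `pysoftk`) can be found. The helper name `missing_modules` is hypothetical and only illustrates the idea. Because the working directory is now the cloned repository, a local `pysoftk` folder may satisfy the lookup by itself, so treat this as a quick check rather than proof of a successful install.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the import statements used later in this notebook.
missing = missing_modules(["rdkit", "pysoftk"])
if missing:
    print(f"Not importable: {', '.join(missing)}. Re-run the installation cell.")
else:
    print("rdkit and pysoftk can both be located.")
```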

### Conformer Generation with a Genetic Algorithm Approach

This notebook provides a tutorial on generating molecular conformers using a genetic algorithm (GA) as implemented by **Open Babel** and wrapped in the **PySoftK** library. We will explain how the GA works, run the provided script, and visualize the results using **RDKit**.

#### 1. The Genetic Algorithm Method Explained

Genetic algorithms are a class of optimization algorithms inspired by the process of natural selection. In the context of conformer generation, they are a powerful heuristic for searching a vast conformational space for low-energy structures. 

The process works as follows:
1.  **Initial Population**: The algorithm starts with a diverse set of randomly generated conformers, known as the "population".
2.  **Fitness Evaluation**: Each conformer in the population is evaluated based on its "fitness," which in this case is its potential energy calculated using a molecular mechanics force field (e.g., **MMFF94**). Lower energy means a higher fitness.
3.  **Selection**: Parents for the next generation are chosen from the current population, with lower-energy (fitter) conformers more likely to be selected for reproduction.
4.  **Crossover and Mutation**: New offspring are created from the selected parents through two primary operations:
    * **Crossover**: The algorithm combines parts of two parent conformers to create a new offspring. For example, it might take the first few torsion angles from one parent and the remaining ones from the other.
    * **Mutation**: Random changes are introduced to the torsion angles of some conformers. This introduces new variations into the population, helping the search escape local energy minima.

This cycle of evaluation, selection, crossover, and mutation repeats for many generations until a specified convergence criterion is met, or the maximum number of generations is reached. The final result is a diverse set of low-energy conformers.
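
The cycle described above can be sketched in a few lines of plain Python. This is a toy illustration, not the Open Babel implementation: each conformer is reduced to a list of backbone torsion angles (seven for a ten-carbon chain), and `toy_torsion_energy` is a made-up stand-in for a real force field such as MMFF94. The function names, the truncation-style selection, and parameters such as `mutation_rate` are assumptions chosen for clarity.

```python
import math
import random


def toy_torsion_energy(torsions):
    """Toy stand-in for a force-field energy (arbitrary units, not MMFF94)."""
    energy = 0.0
    for angle in torsions:
        phi = math.radians(angle)
        energy += 1.0 + math.cos(3 * phi)      # minima at 60, 180 and 300 degrees
        energy += 0.5 * (1.0 + math.cos(phi))  # makes anti (180 degrees) the lowest
    return energy


def run_ga(n_torsions=7, pop_size=30, generations=60, mutation_rate=0.1, seed=0):
    """Minimal genetic algorithm over torsion angles; returns (population, history)."""
    rng = random.Random(seed)

    # 1. Initial population: random torsion angles for every individual.
    population = [
        [rng.uniform(0.0, 360.0) for _ in range(n_torsions)]
        for _ in range(pop_size)
    ]
    best_per_generation = []

    for _ in range(generations):
        # 2. Fitness evaluation: lower energy means higher fitness.
        population.sort(key=toy_torsion_energy)
        best_per_generation.append(toy_torsion_energy(population[0]))

        # 3. Selection: keep the fitter half as parents (truncation selection).
        parents = population[: pop_size // 2]

        # 4. Crossover and mutation produce the children that refill the population.
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_torsions)   # single-point crossover
            child = mother[:cut] + father[cut:]
            child = [
                rng.uniform(0.0, 360.0) if rng.random() < mutation_rate else angle
                for angle in child
            ]
            children.append(child)
        population = parents + children

    population.sort(key=toy_torsion_energy)
    return population, best_per_generation


population, history = run_ga()
print(f"Best energy in first generation: {history[0]:.3f}")
print(f"Best energy after the final generation: {toy_torsion_energy(population[0]):.3f}")
```

Because the parents are carried over unchanged, the best energy can never get worse from one generation to the next. A real implementation would also apply a convergence test and filter near-duplicate conformers, which this sketch omits.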

#### 2. Python Script for Conformer Generation

The following script uses the **PySoftK** library's `ga_generate_conformers` method, which leverages a genetic algorithm to find stable conformers for a flexible molecule like decane.

In [None]:
from rdkit import Chem
from rdkit.Chem import AllChem
from pysoftk.torsional.mol_conformer import ConformerGenerator

# 1. Instantiate the ConformerGenerator class.
ga_generator = ConformerGenerator()

# 2. Define the molecule input (SMILES string).
smiles_string = 'CCCCCCCCCC'  # Decane

# 3. Define the output directory.
output_directory = 'decane_ga_conformers'

# 4. Call the ga_generate_conformers method with output arguments.
num_conformers_generated = ga_generator.ga_generate_conformers(
    smiles=smiles_string,
    output_dir=output_directory,
    base_filename='decane_ga',
    num_conformers=20,
    forcefield='mmff94',
    mutability=10,
    convergence=100
)

# 5. Check the return value to confirm success.
if num_conformers_generated > 0:
    print(f"\n✅ All {num_conformers_generated} conformer files are located in the '{output_directory}' directory.")
else:
    print("\n❌ Conformer generation failed. Please check the console output for error messages.")

#### 3. Visualizing the Generated Conformers

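The cell below loads the generated `.xyz` files, attaches the bonding information from the SMILES string (XYZ files store only atomic coordinates), aligns the conformers to a common reference, and draws them in a grid with **RDKit** so the different conformations can be compared.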

In [None]:
from rdkit import Chem
from rdkit.Chem import AllChem
from rdkit.Chem.Draw import MolsToGridImage
import os
from IPython.display import display

# 1. Load all conformer files from the output directory.
conformers = []
output_directory = 'decane_ga_conformers'
files = sorted(f for f in os.listdir(output_directory) if f.endswith('.xyz'))

# The XYZ file format does not contain bond information, so we build a bonded
# template (with explicit hydrogens) from the SMILES string once and transfer
# the 3D coordinates of each loaded conformer onto a copy of it.
smiles = 'CCCCCCCCCC'  # Decane
template = Chem.AddHs(Chem.MolFromSmiles(smiles))
template_elements = [atom.GetAtomicNum() for atom in template.GetAtoms()]

for file in files:
    mol = Chem.MolFromXYZFile(os.path.join(output_directory, file))
    if mol is None:
        continue
    # Coordinates are transferred by atom index, which assumes the XYZ atom order
    # matches the template. This element-by-element check catches gross mismatches.
    if [atom.GetAtomicNum() for atom in mol.GetAtoms()] != template_elements:
        print(f"Skipping {file}: atom order does not match the SMILES template.")
        continue
    mol_with_bonds = Chem.Mol(template)  # independent copy for each conformer
    mol_with_bonds.AddConformer(Chem.Conformer(mol.GetConformer()), assignId=True)
    conformers.append(mol_with_bonds)

print(f"Loaded {len(conformers)} conformers from the directory.")

# 2. Align the conformers for better visual comparison.
if conformers:
    ref_mol = conformers[0]
    for i in range(1, len(conformers)):
        AllChem.AlignMol(conformers[i], ref_mol)

# 3. Display the conformers in a grid.
#    We will display a maximum of 10 conformers for clarity.
conformers_to_display = conformers[:10]
legends = [f'Conf_{i}' for i in range(len(conformers_to_display))]

if conformers_to_display:
    img = MolsToGridImage(conformers_to_display, molsPerRow=5, subImgSize=(200, 200), legends=legends)
    display(img)
else:
    print("No conformers to display.")
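
As a numerical complement to the grid image, the following sketch reads each `.xyz` file with plain Python and reports the distance between the first and last carbon atom, a rough indicator of how extended or folded a conformer is. The helper names `parse_xyz` and `end_to_end_distance` are hypothetical, and the sketch assumes single-frame XYZ files in which the first and last carbon in file order are the two chain ends.

```python
import math
import os


def parse_xyz(text):
    """Parse single-frame XYZ text into a list of (element, x, y, z) tuples."""
    lines = text.splitlines()
    n_atoms = int(lines[0].split()[0])   # line 1: atom count, line 2: comment
    atoms = []
    for line in lines[2 : 2 + n_atoms]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return atoms


def end_to_end_distance(atoms, element="C"):
    """Distance between the first and last atom of `element` in file order."""
    selected = [atom for atom in atoms if atom[0] == element]
    return math.dist(selected[0][1:], selected[-1][1:])


output_directory = 'decane_ga_conformers'
if os.path.isdir(output_directory):
    for name in sorted(os.listdir(output_directory)):
        if name.endswith('.xyz'):
            with open(os.path.join(output_directory, name)) as handle:
                atoms = parse_xyz(handle.read())
            print(f"{name}: C1...C10 distance = {end_to_end_distance(atoms):.2f} Å")
else:
    print(f"Directory '{output_directory}' not found; run the generation cell first.")
```

Conformers with a larger end-to-end distance are closer to the fully extended all-anti chain, while smaller values indicate folded structures containing gauche torsions.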