# Chemistry Exploration with ggen

This notebook demonstrates how to systematically explore a chemical space by:
1. Specifying a chemical system (e.g., "Li-Co-O")
2. Generating candidate structures across different stoichiometries
3. Storing all data in SQLite for persistence
4. Building a phase diagram to identify thermodynamically stable candidates


In [1]:
%pip install -e ..

Obtaining file:///Users/mmoderwell/ouro/ggen
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: ggen
  Building editable for ggen (pyproject.toml) ... done
  Created wheel for ggen: filename=ggen-0.1.0-0.editable-py3-none-any.whl size=7756 sha256=ca39e4e362bf559abe7cd95395ba89e60f1672b1023bda8ef884651bc428dbf3
  Stored in directory: /private/var/folders/zw/zcpqh2ss43v0d8_8mdqcds440000gn/T/pip-ephem-wheel-cache-1eqxooi6/wheels/ac/6f/ca/e5e77e53988e12825fc0b1323c5ae56703f9c5eab46d137fdd
Successfully built ggen
Installing collected packages: ggen
  Attempting uninstall: ggen
    Found existing installation: ggen 0.1.0
    Uninstalling ggen-0.1.0:
      Successfully uninstalled ggen-0.1.0
Successfully installed ggen-0.1.0
Note: you may need to

In [2]:
import logging
logging.basicConfig(level=logging.INFO)


In [3]:
from ggen import ChemistryExplorer


## Initialize the Explorer

Create a `ChemistryExplorer` instance. You can optionally specify:
- `calculator`: Custom ASE calculator (defaults to ORB)
- `random_seed`: For reproducibility
- `output_dir`: Where to store results


In [4]:
explorer = ChemistryExplorer(
    # random_seed=144,
    output_dir="./exploration_runs"
)

## Preview Stoichiometries

Before running the full exploration, preview which stoichiometries will be generated:


In [5]:
# Parse the chemical system
elements = explorer.parse_chemical_system("Fe-Mn-Bi")
print(f"Elements: {elements}")

# Enumerate stoichiometries
stoichiometries = explorer.enumerate_stoichiometries(
    elements=elements,
    max_atoms=8,
    min_atoms=2,
    include_binaries=True,
    include_ternaries=True,
)

print(f"\nTotal stoichiometries: {len(stoichiometries)}")
print("\nFirst 10:")
for s in stoichiometries[:10]:
    formula = "".join(f"{el}{c if c > 1 else ''}" for el, c in sorted(s.items()))
    print(f"  {formula}: {s}")


Elements: ['Bi', 'Fe', 'Mn']

Total stoichiometries: 115

First 10:
  BiFeMn: {'Bi': 1, 'Fe': 1, 'Mn': 1}
  BiFeMn2: {'Bi': 1, 'Fe': 1, 'Mn': 2}
  BiFe2Mn: {'Bi': 1, 'Fe': 2, 'Mn': 1}
  Bi2FeMn: {'Bi': 2, 'Fe': 1, 'Mn': 1}
  BiFeMn3: {'Bi': 1, 'Fe': 1, 'Mn': 3}
  BiFe2Mn2: {'Bi': 1, 'Fe': 2, 'Mn': 2}
  BiFe3Mn: {'Bi': 1, 'Fe': 3, 'Mn': 1}
  Bi2FeMn2: {'Bi': 2, 'Fe': 1, 'Mn': 2}
  Bi2Fe2Mn: {'Bi': 2, 'Fe': 2, 'Mn': 1}
  Bi3FeMn: {'Bi': 3, 'Fe': 1, 'Mn': 1}
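
The preview above is consistent with a simple rule: take every binary and ternary subset of the elements, give each element a positive integer count, keep totals between `min_atoms` and `max_atoms`, and drop formulas whose counts share a common factor (so 2-2-2 is folded into 1-1-1). The sketch below implements that rule in plain Python. It is an illustration under those assumptions, not ggen's actual implementation, and the names `parse_system` and `enumerate_reduced` are hypothetical. The order of the returned list is not meant to match ggen's.

```python
from functools import reduce
from itertools import combinations, product
from math import gcd


def parse_system(chemical_system):
    """Split a string such as 'Fe-Mn-Bi' into a sorted list of element symbols."""
    return sorted(part.strip() for part in chemical_system.split("-") if part.strip())


def enumerate_reduced(elements, max_atoms, min_atoms=2, subset_sizes=(3, 2)):
    """Hypothetical enumeration of reduced formulas over element subsets."""
    found = []
    for size in subset_sizes:
        for subset in combinations(elements, size):
            # Try every assignment of positive counts to the chosen elements.
            for counts in product(range(1, max_atoms + 1), repeat=size):
                if not min_atoms <= sum(counts) <= max_atoms:
                    continue
                # Skip multiples of a smaller formula (counts with a common factor).
                if reduce(gcd, counts) != 1:
                    continue
                found.append(dict(zip(subset, counts)))
    return found


elements = parse_system("Fe-Mn-Bi")
stoichs = enumerate_reduced(elements, max_atoms=8)
print(elements, len(stoichs))
```

For Bi-Fe-Mn with `max_atoms=8`, this rule gives 52 ternary and 63 binary formulas, 115 in total, which agrees with the count printed above.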


## Run the Exploration

Now let's run the full exploration. This will:
1. Generate structures for each stoichiometry
2. Optimize them using the ORB calculator
3. Store results in SQLite + CIF files
4. Build a phase diagram


In [6]:
result = explorer.explore(
    chemical_system="Fe-Sn-B",
    # Fe-Sn-Co
    # Fe-Mn-Ni
    # Fe-Mn-
    max_atoms=20,          
    min_atoms=2,
    num_trials=15,          # Trials per stoichiometry
    optimize=True,
    include_binaries=True,
    include_ternaries=True,
    max_stoichiometries=100,
    crystal_systems=["hexagonal", "tetragonal"],
    load_previous_runs=True,      # Load from all previous runs
    skip_existing_formulas=False,  # False regenerates formulas we already have; True would skip them
    preserve_symmetry=True,
)

INFO:ggen.explorer:Starting exploration of B-Fe-Sn
INFO:ggen.explorer:Found 2 previous runs for B-Fe-Sn
INFO:ggen.explorer:Loaded 0 candidates from exploration_B-Fe-Sn_20260105_150839
INFO:ggen.explorer:Loaded 7 candidates from exploration_B-Fe-Sn_20260105_144834
INFO:ggen.explorer:Loaded 7 unique structures from previous runs
INFO:ggen.explorer:Exploring 100 stoichiometries
INFO:ggen.explorer:[1/100] Generating B4Fe4Sn7
INFO:ggen.explorer:Generating structure for B4Fe4Sn7
INFO:cached_path:cache of https://orbitalmaterials-public-models.s3.us-west-1.amazonaws.com/forcefields/orb-v3/orb-v3-conservative-inf-mpa-20250404.ckpt is up-to-date
INFO:ggen.ggen:Starting crystal generation for B4Fe4Sn7 (elements=['B', 'Fe', 'Sn'], counts=[4, 4, 7])
INFO:ggen.ggen:Found 32 compatible space groups
INFO:ggen.ggen:Filtered to 16 space groups in systems: ['hexagonal', 'tetragonal']
INFO:ggen.ggen:Multi-spacegroup mode (symmetry_bias=0.00): trying 5 space groups: [111, 168, 177, 174, 83]
INFO:ggen.ggen

## Explore the Results


In [7]:
print(f"Chemical System: {result.chemical_system}")
print(f"Elements: {result.elements}") 
print(f"Total candidates attempted: {result.num_candidates}")
print(f"Successful generations: {result.num_successful}")
print(f"Failed generations: {result.num_failed}")
print(f"Phases on convex hull: {len(result.hull_entries)}")
print(f"Total time: {result.total_time_seconds:.1f}s")
print(f"\nResults saved to: {result.run_directory}")
print(f"Database: {result.database_path}")


Chemical System: B-Fe-Sn
Elements: ['B', 'Fe', 'Sn']
Total candidates attempted: 104
Successful generations: 104
Failed generations: 0
Phases on convex hull: 3
Total time: 3604.9s

Results saved to: exploration_runs/exploration_B-Fe-Sn_20260105_150839
Database: exploration_runs/exploration_B-Fe-Sn_20260105_150839/exploration.db


## View Stable Candidates

Get the phases that are on or near the convex hull:


In [8]:
# Get candidates within 150 meV/atom of the hull
stable = explorer.get_stable_candidates(result, e_above_hull_cutoff=0.15)

print(f"Found {len(stable)} stable/near-stable phases:\n")
for c in stable:
    e_above = c.generation_metadata.get('e_above_hull', 0)
    # Show the run the candidate came from (e.g., "exploration_Co-Fe-Mn_20260102_121508"), or 'current'
    source_run = c.generation_metadata.get('source_run', '')
    run_time = source_run if source_run else 'current'
    print(f"  {c.formula:10s}  E={c.energy_per_atom:.4f} eV/atom  "
          f"SG={c.space_group_symbol:10s}  E_hull={e_above*1000:.1f} meV  run={run_time}")

Found 10 stable/near-stable phases:

  B9Fe3Sn     E=-6.7929 eV/atom  SG=C2          E_hull=0.0 meV  run=current
  B8Fe9       E=-7.7849 eV/atom  SG=C2/m        E_hull=0.0 meV  run=current
  B15Sn       E=-6.1392 eV/atom  SG=P1          E_hull=0.0 meV  run=current
  B4Fe12Sn    E=-7.8091 eV/atom  SG=P1          E_hull=37.1 meV  run=current
  Fe11Sn6     E=-6.7637 eV/atom  SG=P1          E_hull=87.7 meV  run=current
  B13Sn2      E=-5.8719 eV/atom  SG=P1          E_hull=101.7 meV  run=current
  FeSn13      E=-4.1632 eV/atom  SG=P-62m       E_hull=104.1 meV  run=current
  BSn17       E=-3.9631 eV/atom  SG=P1          E_hull=113.5 meV  run=current
  B11Fe4Sn2   E=-6.5772 eV/atom  SG=P1          E_hull=129.9 meV  run=current
  B4Fe8Sn     E=-7.5332 eV/atom  SG=P1          E_hull=131.7 meV  run=current
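
Energy above the hull measures how far a phase sits above the lowest-energy combination of competing phases at the same composition. The sketch below computes it for a two-element system using only the standard library: it builds the lower convex hull of (composition, formation energy) points and interpolates along it. The A-B system and its energies are hypothetical values chosen for illustration, not results from this run, and `lower_hull` and `energy_above_hull` are not ggen functions. A ternary system such as B-Fe-Sn needs a general convex hull routine instead.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) points (monotone chain, left to right)."""
    # Keep only the lowest energy at each composition.
    lowest = {}
    for x, energy in points:
        lowest[x] = min(energy, lowest.get(x, energy))

    hull = []
    for p in sorted(lowest.items()):
        # Drop the previous vertex while it lies on or above the line to p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross > 0:
                break
            hull.pop()
        hull.append(p)
    return hull


def energy_above_hull(x, energy, hull):
    """Vertical distance from (x, energy) to the hull, by linear interpolation."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            hull_energy = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
            return energy - hull_energy
    raise ValueError("x lies outside the composition range covered by the hull")


# Hypothetical A-B system: (fraction of B, formation energy in eV/atom).
phases = {
    "A": (0.0, 0.0),
    "A3B": (0.25, 0.05),
    "AB": (0.5, -0.40),
    "AB3": (0.75, -0.15),
    "B": (1.0, 0.0),
}
hull = lower_hull(phases.values())
above = {name: energy_above_hull(x, e, hull) for name, (x, e) in phases.items()}

# Same 0.15 eV/atom cutoff as in the cell above.
near_stable = [name for name, e in above.items() if e <= 0.15]
for name, e in above.items():
    print(f"{name:4s} {e * 1000:6.1f} meV/atom above hull")
```

In this toy system A, AB, and B define the hull, AB3 sits about 50 meV/atom above it and passes the cutoff, and A3B sits about 250 meV/atom above it and does not.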


## View the Phase Diagram


In [9]:
# Plot the phase diagram (for ternary systems)
if result.phase_diagram is not None:
    try:
        fig = explorer.plot_phase_diagram(result, show_unstable=0.15)
        fig.show()
    except Exception as e:
        print(f"Phase diagram plotting not available: {e}")
else:
    print("No phase diagram available (need at least 2 valid candidates)")


## Export Summary


In [10]:
# Export a JSON summary of the exploration
summary = explorer.export_summary(
    result,
    output_path=result.run_directory / "summary.json"
)

print("Summary exported!")
print(f"Hull entries: {summary['hull_entries']}")


Summary exported!
Hull entries: [{'formula': 'B9Fe3Sn', 'energy_per_atom': -6.79293705866887, 'space_group': 'C2 (#5)', 'space_group_number': 5, 'space_group_symbol': 'C2'}, {'formula': 'B8Fe9', 'energy_per_atom': -7.784901338465073, 'space_group': 'C2/m (#12)', 'space_group_number': 12, 'space_group_symbol': 'C2/m'}, {'formula': 'B15Sn', 'energy_per_atom': -6.139235973358154, 'space_group': 'P1 (#1)', 'space_group_number': 1, 'space_group_symbol': 'P1'}]


## Inspect Individual Structures


In [11]:
# Inspect the first phase on the convex hull (every hull entry has zero energy above hull)
if result.hull_entries:
    best = result.hull_entries[0]
    print(f"Most stable: {best.formula}")
    print(f"Energy: {best.energy_per_atom:.4f} eV/atom")
    print(f"Space group: {best.space_group_symbol} (#{best.space_group_number})")
    print(f"CIF file: {best.cif_path}")

    # View with pymatviz if available
    try:
        from IPython.display import display
        from pymatviz import StructureWidget
        display(StructureWidget(best.structure))
    except ImportError:
        print("pymatviz is not installed; skipping the structure viewer")

Most stable: B9Fe3Sn
Energy: -6.7929 eV/atom
Space group: C2 (#5)
CIF file: exploration_runs/exploration_B-Fe-Sn_20260105_150839/structures/B9Fe3Sn_C2.cif


<pymatviz.widgets.structure.StructureWidget object at 0xdbf71e690>

## Load a Previous Run

You can reload a previous exploration from its directory:


In [12]:
# Load a previous run
# loaded = ChemistryExplorer.load_run("./exploration_runs/your_run_name")
# print(f"Loaded {loaded.num_candidates} candidates")
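
To find a run name to pass in, it can help to sort the run directories by timestamp. The sketch below assumes directory names follow the `exploration_<system>_<YYYYMMDD>_<HHMMSS>` pattern seen in the log output earlier in this notebook. `parse_run_name` and `latest_run` are hypothetical helpers, not part of ggen.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed naming pattern, e.g. "exploration_B-Fe-Sn_20260105_150839".
RUN_NAME = re.compile(r"^exploration_(?P<system>.+)_(?P<date>\d{8})_(?P<time>\d{6})$")


def parse_run_name(name):
    """Return (chemical_system, datetime) for a run directory name, or None."""
    match = RUN_NAME.match(name)
    if match is None:
        return None
    stamp = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return match["system"], stamp


def latest_run(output_dir, chemical_system):
    """Most recent run directory for a system, or None if there is none."""
    runs = []
    for path in Path(output_dir).iterdir():
        parsed = parse_run_name(path.name)
        if path.is_dir() and parsed and parsed[0] == chemical_system:
            runs.append((parsed[1], path))
    # Compare on the timestamp only.
    return max(runs, key=lambda run: run[0])[1] if runs else None


print(parse_run_name("exploration_B-Fe-Sn_20260105_150839"))
```

With these helpers, `latest_run("./exploration_runs", "B-Fe-Sn")` would return the newest matching directory, which could then be passed to `ChemistryExplorer.load_run`. Note that `latest_run` raises `FileNotFoundError` if the output directory does not exist.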


## Query the SQLite Database Directly

The SQLite database allows flexible querying:


In [13]:
import sqlite3
import pandas as pd

# Connect to the database
conn = sqlite3.connect(str(result.database_path))

# Query all candidates
df = pd.read_sql_query("""
    SELECT formula, energy_per_atom, space_group_symbol, 
           e_above_hull, is_on_hull, is_valid
    FROM candidates
    WHERE is_valid = 1
    ORDER BY e_above_hull ASC
""", conn)

print("All valid candidates:")
df


All valid candidates:


| | formula | energy_per_atom | space_group_symbol | e_above_hull | is_on_hull | is_valid |
|---|---|---|---|---|---|---|
| 0 | B | -6.034542 | Cmmm | | 0 | 1 |
| 1 | Fe | -8.435699 | Im-3m | | 0 | 1 |
| 2 | Sn | -3.946685 | Fm-3m | | 0 | 1 |
| 3 | B9Fe3Sn | -6.792937 | C2 | 0.000000 | 1 | 1 |
| 4 | B8Fe9 | -7.784901 | C2/m | 0.000000 | 1 | 1 |
| ... | ... | ... | ... | ... | ... | ... |
| 102 | Fe17Sn | -6.327442 | P6/mmm | 1.858867 | 0 | 1 |
| 103 | Fe17Sn2 | -6.094091 | P6/mmm | 1.869080 | 0 | 1 |
| 104 | B6Fe5Sn5 | -4.148744 | P1 | 2.297677 | 0 | 1 |
| 105 | B5Fe5Sn | -4.749363 | P1 | 2.614382 | 0 | 1 |
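
The elemental reference rows (B, Fe, Sn) have no `e_above_hull` value in the output above, and SQLite sorts NULLs first under `ORDER BY ... ASC`, which is why they appear at the top. The sketch below shows a query that skips those rows and applies the same 0.15 eV/atom cutoff used earlier. It runs against an in-memory table holding a few rows copied from the output above, so it does not touch the real database. The table layout is an assumption limited to the columns the query above selects; the real `candidates` table may have more.

```python
import sqlite3

# A few rows copied from the output above; None becomes SQL NULL.
rows = [
    ("B", -6.034542, "Cmmm", None, 0, 1),
    ("Fe", -8.435699, "Im-3m", None, 0, 1),
    ("B9Fe3Sn", -6.792937, "C2", 0.0, 1, 1),
    ("B8Fe9", -7.784901, "C2/m", 0.0, 1, 1),
    ("Fe17Sn", -6.327442, "P6/mmm", 1.858867, 0, 1),
]

demo = sqlite3.connect(":memory:")
demo.execute(
    """
    CREATE TABLE candidates (
        formula TEXT, energy_per_atom REAL, space_group_symbol TEXT,
        e_above_hull REAL, is_on_hull INTEGER, is_valid INTEGER
    )
    """
)
demo.executemany("INSERT INTO candidates VALUES (?, ?, ?, ?, ?, ?)", rows)

# The comparison alone already excludes NULLs; IS NOT NULL makes that explicit.
near_hull = demo.execute(
    """
    SELECT formula, e_above_hull
    FROM candidates
    WHERE is_valid = 1
      AND e_above_hull IS NOT NULL
      AND e_above_hull <= ?
    ORDER BY e_above_hull ASC, formula ASC
    """,
    (0.15,),
).fetchall()
demo.close()

print(near_hull)
```

Passing the cutoff as a bound parameter (`?`) rather than formatting it into the SQL string keeps the query reusable for other thresholds.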
