# Using the Builder Pattern with Complex Crystal Structures

## Introduction

The builder pattern in the site_analysis package supports the analysis of complex crystal structures that contain multiple crystallographically distinct sites for mobile ions.
This tutorial demonstrates how to:

1. Define multiple crystallographically distinct site types in the same structure
2. Handle structure alignment between reference and target structures 
3. Map sites between structures with different chemical compositions
4. Analyze how site occupations change with chemical disorder

## Case Study: Argyrodite Solid Electrolytes

We will apply these techniques to analyze the Li-ion distribution over sites in the argyrodite Li<sub>6</sub>PS<sub>5</sub>Cl.

The argyrodite structure is an example of a tetrahedrally close-packed structure, where the anions (S<sup>2&minus;</sup>, Cl<sup>&minus;</sup>) form the vertices of tetrahedra whose centers constitute interstitial sites. In the argyrodite structure, these tetrahedral sites can be classified into six crystallographically distinct types, numbered 0 to 5.

In Li<sub>6</sub>PS<sub>5</sub>Cl:
- Type 0 sites are occupied by phosphorus atoms (forming PS<sub>4</sub><sup>3&minus;</sup> tetrahedra)
- Type 1-5 sites are, in principle, all available for lithium occupation as lithium diffuses through the anion host framework.

The structure derives from an MgCu<sub>2</sub>-type Laves phase arrangement. However, in argyrodites, the anionic framework adopts a modified configuration where the Fd3̄m 8a sites split into symmetry-inequivalent F4̄3m 4a and 4c sites. In Li<sub>6</sub>PS<sub>5</sub>Cl, these sites are occupied by a mix of sulfur and chlorine atoms.

This anion disorder is critically important to the functional properties of argyrodites. The disorder modifies the local coordination environments around the Li sites, affecting the distribution of lithium over the available tetrahedral sites and creating new diffusion pathways through the structure. Increased disorder has been shown to activate previously unfavorable site types, particularly type 4 sites, leading to enhanced ionic conductivity.

In this tutorial, we will analyze data from three MD simulations with different degrees of S/Cl site exchange (0%, 50%, and 100%), and calculate the time-averaged distribution of Li ions over the five interstitial site types. For more details on the mechanism of superionic conduction in these materials, see [this paper](https://doi.org/10.1021/acs.chemmater.0c03738).

In [1]:
# Import necessary libraries
from pymatgen.io.vasp import Poscar, Xdatcar
from pymatgen.core import Structure, Lattice
import numpy as np
from collections import Counter
from site_analysis import TrajectoryBuilder

## Creating a Reference Structure

To define multiple site types, we need a reference structure that places a different species at the centre of each site type. Because `with_polyhedral_sites` selects site centres by species, this lets us define each tetrahedral site type separately with the TrajectoryBuilder's polyhedral sites functionality.

In [2]:
# Create a reference structure with the argyrodite topology
# The key approach: use different atom types to differentiate each site type
# - P occupies the t0 tetrahedra (phosphorus in PS4 units)
# - Different dummy atoms (Li, Mg, Na, Be, K) occupy the t1-t5 tetrahedra
#   to allow us to define each tetrahedral site type separately
# - S occupies all the anion sites

lattice = Lattice.cubic(a=10.155)  # Use the experimental lattice parameter

coords = np.array(
    [[0.5,     0.5,     0.5],     # P (t0) - PS4 tetrahedra positions
     [0.9,     0.9,     0.6],     # t1 - first type of Li site (represented by Li atoms)
     [0.23,    0.92,    0.09],    # t2 - second type of Li site (represented by Mg atoms)
     [0.25,    0.25,    0.25],    # t3 - third type of Li site (represented by Na atoms)
     [0.15,    0.15,    0.15],    # t4 - fourth type of Li site (represented by Be atoms)
     [0.0,     0.183,   0.183],   # t5 - fifth type of Li site (represented by K atoms)
     [0.0,     0.0,     0.0],     # S - anion position (4a site)
     [0.75,    0.25,    0.25],    # S - anion position (4c site)
     [0.11824, 0.11824, 0.38176]] # S - anion position (16e site)
) 

# Create the reference structure with F-43m space group symmetry
# and replicate it as a 2x2x2 supercell to match the MD simulations
reference_structure = Structure.from_spacegroup(
    sg="F-43m",
    lattice=lattice,
    species=['P', 'Li', 'Mg', 'Na', 'Be', 'K', 'S', 'S', 'S'],
    coords=coords) * [2, 2, 2]

print(f"Reference structure contains {len(reference_structure)} atoms")
print(f"Composition: {reference_structure.composition.formula}")

Reference structure contains 1664 atoms
Composition: K384 Na32 Li128 Mg768 Be128 P32 S192
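
The printed atom count can be cross-checked by hand. The sketch below assumes per-cell counts obtained by dividing the printed supercell composition by the 8 cells of the 2x2x2 supercell; the variable names are illustrative and not part of site_analysis or pymatgen.

```python
# Atoms per conventional cell for each species, inferred by dividing the
# printed supercell composition by 8 (an assumption of this sketch).
atoms_per_cell = {
    'P': 4,    # t0 centres
    'Li': 16,  # t1 dummy atoms
    'Mg': 96,  # t2 dummy atoms
    'Na': 4,   # t3 dummy atoms
    'Be': 16,  # t4 dummy atoms
    'K': 48,   # t5 dummy atoms
    'S': 24,   # anions: 4a + 4c + 16e = 4 + 4 + 16
}

n_cells = 2 * 2 * 2  # the 2x2x2 supercell

# Scale each per-cell count up to the supercell and sum over species.
supercell_counts = {species: n * n_cells for species, n in atoms_per_cell.items()}
total_atoms = sum(supercell_counts.values())

print(supercell_counts)
print(f"Total atoms: {total_atoms}")
```

If the total disagrees with `len(reference_structure)`, one of the fractional coordinates has probably landed on a different Wyckoff position than intended.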


## Implementing the Trajectory Builder

The key part of this tutorial is the implementation of the `build_trajectory` function, which demonstrates how to use the builder pattern to create a Trajectory object with multiple site types and proper species mapping.

In [3]:
def build_trajectory(structure):
    """
    Build a Trajectory object for analyzing Li ion diffusion in argyrodite structures.
    
    This function demonstrates advanced usage of the builder pattern:
    1. Defining multiple site types (5 different tetrahedral sites)
    2. Using structure alignment with specific alignment species
    3. Mapping between different species in reference and target structures
    
    Args:
        structure: A pymatgen Structure object from the MD trajectory
        
    Returns:
        A site_analysis Trajectory object configured for argyrodite analysis
    """
    builder = TrajectoryBuilder()
    
    # 1. Set the reference and target structures
    builder.with_reference_structure(reference_structure) 
    builder.with_structure(structure)
    
    # 2. Specify that Li is the mobile species we want to track
    builder.with_mobile_species('Li')
    
    # 3. Define 5 different types of tetrahedral sites that Li can occupy
    # Note how we call with_polyhedral_sites multiple times with different parameters
    
    # Type 1 sites (represented by Li in the reference)
    builder.with_polyhedral_sites( 
        centre_species='Li',  # The type 1 sites have Li occupying them in our reference
        vertex_species='S',   # The vertices of these tetrahedra are S atoms
        cutoff=3.0,           # Distance cutoff for finding vertices
        n_vertices=4,         # Each site has 4 vertices (tetrahedral)
        label='type 1')       # Label for these sites
    
    # Type 2 sites (represented by Mg in the reference)
    builder.with_polyhedral_sites(
        centre_species='Mg',  # Type 2 sites have Mg in the reference
        vertex_species='S',
        cutoff=3.0,
        n_vertices=4,
        label='type 2')
    
    # Type 3 sites (represented by Na in the reference)
    builder.with_polyhedral_sites(
        centre_species='Na',  # Type 3 sites have Na in the reference
        vertex_species='S',
        cutoff=3.0,
        n_vertices=4,
        label='type 3')
    
    # Type 4 sites (represented by Be in the reference)
    builder.with_polyhedral_sites(
        centre_species='Be',  # Type 4 sites have Be in the reference
        vertex_species='S',
        cutoff=3.0,
        n_vertices=4,
        label='type 4')
    
    # Type 5 sites (represented by K in the reference)
    builder.with_polyhedral_sites(
        centre_species='K',   # Type 5 sites have K in the reference
        vertex_species='S',
        cutoff=3.0,
        n_vertices=4,
        label='type 5')
    
    # 4. Configure the alignment between reference and target structures
    # This is critical when structures might have different origins or when
    # we need to ensure correct mapping between site definitions
    builder.with_structure_alignment(align_species='P') 
    
    # 5. Configure mapping between different species
    # This is crucial for handling anion disorder (S/Cl site exchange)
    # The reference structure has all-S anions, but the real structures
    # have a mix of S and Cl at these positions
    builder.with_site_mapping(mapping_species=['S', 'Cl']) 
    
    # 6. Build and return the Trajectory object
    trajectory = builder.build()
    return trajectory
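
The five `with_polyhedral_sites` calls differ only in the centre species and the label, so they could equally be driven from a small lookup table. The following is a minimal sketch of that idea: `SITE_CENTRES` and `add_tetrahedral_sites` are hypothetical names introduced here, and the helper only assumes that the builder object exposes `with_polyhedral_sites` with the keyword arguments used above.

```python
# Hypothetical lookup table: dummy centre species in the reference
# structure -> label of the tetrahedral site type it marks.
SITE_CENTRES = {
    'Li': 'type 1',
    'Mg': 'type 2',
    'Na': 'type 3',
    'Be': 'type 4',
    'K': 'type 5',
}


def add_tetrahedral_sites(builder, centres=SITE_CENTRES,
                          vertex_species='S', cutoff=3.0):
    """Register one set of tetrahedral sites per centre species.

    `builder` is any object with a `with_polyhedral_sites` method that
    accepts the keyword arguments used in `build_trajectory` above.
    """
    for centre_species, label in centres.items():
        builder.with_polyhedral_sites(
            centre_species=centre_species,
            vertex_species=vertex_species,
            cutoff=cutoff,
            n_vertices=4,  # tetrahedral sites have four vertices
            label=label)
    return builder
```

Inside `build_trajectory`, step 3 would then reduce to `add_tetrahedral_sites(builder)`. The explicit version above is easier to read in a tutorial; the table-driven version is easier to extend if the cutoff or the set of site types changes.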

## Analysis Function

This function analyzes the site occupations from the trajectory data, calculating the percentage of time that Li ions spend in each site type.

In [4]:
def print_site_occupations(trajectory, title=None):
    """
    Print the percentage occupation for each site type in the trajectory.
    
    Args:
        trajectory: A Trajectory object containing atoms and sites information
        title: Optional title to include in the output
    """
    site_types = ['type 5', 'type 4', 'type 3', 'type 2', 'type 1']
    
    # Collect all site labels from the atom trajectories
    site_labels = []
    for atom in trajectory.atoms:
        for site_idx in atom.trajectory:
            if site_idx is not None:  # Check that the atom is in a site
                site_labels.append(trajectory.sites[site_idx].label)
    
    # Count occurrences of each site type
    c = Counter(site_labels)
    
    # Calculate and print percentages
    total_sites = sum(c.values())
    
    if title:
        print(f"\nSite occupation analysis - {title}:")
        print("-" * 40)
    
    for t in site_types:
        percentage = (c.get(t, 0) / total_sites * 100) if total_sites > 0 else 0
        print(f'{t}: {percentage:.2f}%')
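
To make the counting logic easier to check, the same calculation can be written as a pure function that returns the percentages instead of printing them. This is a sketch on toy data: `occupation_percentages`, `toy_trajectories`, and `toy_labels` are hypothetical names, and plain lists and dictionaries stand in for the `atom.trajectory` and `site.label` attributes used above.

```python
from collections import Counter


def occupation_percentages(atom_trajectories, site_labels):
    """Return {site label: percentage of all recorded site assignments}.

    atom_trajectories: one list per atom of site indices, one entry per
        frame, with None where the atom was not assigned to any site.
    site_labels: mapping from site index to site label.
    """
    counts = Counter(
        site_labels[site_idx]
        for trajectory in atom_trajectories
        for site_idx in trajectory
        if site_idx is not None  # skip frames with no site assignment
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy data: two atoms, four frames each, three sites.
toy_trajectories = [[0, 0, 1, None], [2, 2, 2, 0]]
toy_labels = {0: 'type 5', 1: 'type 2', 2: 'type 5'}

percentages = occupation_percentages(toy_trajectories, toy_labels)
for label, value in sorted(percentages.items()):
    print(f"{label}: {value:.2f}%")
```

Note that unassigned frames (`None`) are excluded from the denominator, so the percentages describe how assigned Li ions are distributed over site types rather than the fraction of total simulation time.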

## Analysis of Argyrodite with Different Levels of Anion Disorder

Now we'll analyze three different Li<sub>6</sub>PS<sub>5</sub>Cl systems with varying degrees of S/Cl anion disorder.

### 1. Fully Ordered Structure (0% Anion Site Exchange)

In [5]:
# Analyze Li6PS5Cl with fully ordered anion sites (0% site exchange)
print("=" * 50)
print("Analyzing Li6PS5Cl with 0% anion site exchange (fully ordered)")
print("=" * 50)

md_structures = Xdatcar('data/Li6PS5Cl_0p_XDATCAR.gz').structures
print(f"Loaded trajectory with {len(md_structures)} frames")

# Build trajectory and analyze structures
trajectory_0p = build_trajectory(md_structures[0])
trajectory_0p.trajectory_from_structures(md_structures, progress=True)

# Analyze site occupations
print_site_occupations(trajectory_0p, "0% anion disorder")

Analyzing Li6PS5Cl with 0% anion site exchange (fully ordered)
Loaded trajectory with 140 frames


100%|█████████████████████████████████████████████████████████████████████| 140/140 [01:25<00:00,  1.64 steps/s]


Site occupation analysis - 0% anion disorder:
----------------------------------------
type 5: 80.20%
type 4: 0.02%
type 3: 0.00%
type 2: 19.78%
type 1: 0.01%





### 2. Partially Disordered Structure (50% Anion Site Exchange)

In [6]:
# Analyze Li6PS5Cl with 50% anion site exchange (maximally disordered)
print("\n" + "=" * 50)
print("Analyzing Li6PS5Cl with 50% anion site exchange (maximally disordered)")
print("=" * 50)

md_structures = Xdatcar('data/Li6PS5Cl_50p_XDATCAR.gz').structures
print(f"Loaded trajectory with {len(md_structures)} frames")

# Build trajectory and analyze structures
trajectory_50p = build_trajectory(md_structures[0])
trajectory_50p.trajectory_from_structures(md_structures, progress=True)

# Analyze site occupations
print_site_occupations(trajectory_50p, "50% anion disorder")


Analyzing Li6PS5Cl with 50% anion site exchange (maximally disordered)
Loaded trajectory with 140 frames


100%|█████████████████████████████████████████████████████████████████████| 140/140 [01:16<00:00,  1.84 steps/s]


Site occupation analysis - 50% anion disorder:
----------------------------------------
type 5: 65.92%
type 4: 2.63%
type 3: 0.00%
type 2: 31.43%
type 1: 0.02%
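
The shift in site populations between the 0% and 50% runs is easiest to see as a difference in percentage points. The sketch below simply copies the values printed above into two dictionaries (the names `occ_0p`, `occ_50p`, and `changes` are illustrative) and subtracts them.

```python
# Site occupations (%) copied from the printed output of the two runs above.
occ_0p = {'type 5': 80.20, 'type 4': 0.02, 'type 3': 0.00,
          'type 2': 19.78, 'type 1': 0.01}
occ_50p = {'type 5': 65.92, 'type 4': 2.63, 'type 3': 0.00,
           'type 2': 31.43, 'type 1': 0.02}

# Change in occupation on going from 0% to 50% site exchange.
changes = {label: occ_50p[label] - occ_0p[label] for label in occ_0p}

for label, change in changes.items():
    print(f"{label}: {change:+.2f} percentage points")
```

Most of the population lost from type 5 sites reappears on type 2 sites, with the remainder moving to type 4 sites.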





### 3. Completely Inverted Structure (100% Anion Site Exchange)

In [None]:
# Analyze Li6PS5Cl with 100% anion site exchange (complete anion antisites)
print("\n" + "=" * 50)
print("Analyzing Li6PS5Cl with 100% anion site exchange (complete antisites)")
print("=" * 50)

md_structures = Xdatcar('data/Li6PS5Cl_100p_XDATCAR.gz').structures
print(f"Loaded trajectory with {len(md_structures)} frames")

# Build trajectory and analyze structures
trajectory_100p = build_trajectory(md_structures[0])
trajectory_100p.trajectory_from_structures(md_structures, progress=True)

# Analyze site occupations
print_site_occupations(trajectory_100p, "100% anion disorder")


Analyzing Li6PS5Cl with 100% anion site exchange (complete antisites)
Loaded trajectory with 140 frames


 63%|████████████████████████████████████████████                          | 88/140 [00:51<00:33,  1.54 steps/s]

## Summary

This tutorial demonstrated advanced usage of the site_analysis builder pattern to analyze a complex structure with multiple site types and chemical disorder.

Key findings from our site occupation analysis:

1. In the ordered structure (0% exchange):
   - Li ions predominantly occupy type 5 sites (~80%)
   - Type 2 sites are the secondary preference (~20%)
   - Almost no occupation of other site types

2. With increasing site exchange (50% and 100%):
   - Type 5 site occupation decreases (to ~66% at 50% exchange)
   - Type 2 site occupation increases (to ~31% at 50% exchange)
   - Type 4 site occupation emerges and grows (from ~0.02% to ~2.6% at 50% exchange)
   
The builder pattern allowed us to:
1. Define and analyze 5 distinct tetrahedral site types simultaneously
2. Handle alignment between reference and target structures
3. Map between different species (S/Cl) in disordered structures

This approach is applicable to many complex materials where multiple site types must be tracked and where chemical disorder is present.