A high-performance tool for analyzing atomic stacking configurations in bilayer materials. This repository contains the stacking analysis code used in:
Aditya, A. et al., “Emerging Ferroelectric Domains: Stacking and Rotational Landscape of MoS₂ Moiré Bilayers,” ACS Nano (2026). Accepted; in press.
Stacking Analysis is a computational tool designed for identifying and classifying atomic stacking configurations in bilayer transition metal dichalcogenides (TMDs) and similar layered materials. It uses spatial partitioning and parallel processing to efficiently analyze large-scale atomic structures from LAMMPS molecular dynamics simulations.
The tool was developed for analyzing moiré TMD bilayers simulated using classical MD with hybrid Stillinger-Weber and Kolmogorov-Crespi potentials, following the methodology established in the published papers.
- 🚀 High Performance: Parallel processing with automatic CPU core detection
- 🎯 Accurate Classification: Identifies 6 distinct stacking types (AA, AA', A'B, AB, AB', BA)
- 📊 Comprehensive Output: Detailed statistics and labeled atomic structures
- 🔧 Flexible Configuration: Customizable parameters for different material systems
- 💻 Easy to Use: Simple command-line interface and Python API
- 📁 LAMMPS Integration: Direct support for LAMMPS dump files
If you use this software in your research, please cite our paper:
@article{doi:10.1021/acsnano.5c14092,
author = {Aditya, Anikeya and Irie, Ayu and Dasgupta, Nabankur and Kalia, Rajiv K. and Nakano, Aiichiro and Vashishta, Priya},
title = {Emerging Ferroelectric Domains: Stacking and Rotational Landscape of MoS2 Moiré Bilayers},
journal = {ACS Nano},
volume = {0},
number = {0},
pages = {null},
year = {0},
doi = {10.1021/acsnano.5c14092},
note ={PMID: 41648955},
URL = {
https://doi.org/10.1021/acsnano.5c14092
},
eprint = {
https://doi.org/10.1021/acsnano.5c14092
}
}Paper: [Link to published paper]
DOI: 10.1021/acsnano.5c14092
The classification algorithm identifies stacking configurations based on local atomic coordination patterns. The accuracy of the classification has been validated against both visual inspection and potential energy landscape calculations.
Figure: (a) Potential energy landscape showing energy variation with stacking configuration. High-energy regions (red/yellow, +4.5 meV) correspond to unfavorable stackings (AA, X), while low-energy regions (blue, -4.5 meV) correspond to stable AB stacking. (b) Classified stacking domains showing the correlation between energy landscape and identified stacking types. The classification accurately captures domain boundaries at energy transition regions.
This tool is designed for bilayer TMD systems simulated with hybrid Stillinger-Weber (SW) and Kolmogorov-Crespi (KC) potentials, following the standard approach for moiré TMD bilayers in classical MD:
- Stillinger-Weber: Governs intra-layer interactions within each MoS₂ monolayer
- Kolmogorov-Crespi: Governs inter-layer interactions between the two layers
Atom Type Convention:
In LAMMPS, atom types are numbered 1-6 even though the system contains only Mo and S atoms. This is required by the hybrid SW/KC force field implementation:
Layer 1 (Bottom): Type 1 (Mo), Type 2 (S-bottom), Type 3 (S-top)
Layer 2 (Top): Type 4 (Mo), Type 5 (S-bottom), Type 6 (S-bop)
Each layer's atoms must have distinct type numbers for the force field to correctly apply intra-layer (SW) and inter-layer (KC) interactions. This convention follows the TWISTER paper and is standard practice for investigating moiré TMD bilayers using these force fields.
Analysis Target: By default, the code analyzes Type 4 atoms (top layer Mo), as these best represent the stacking configuration between layers. More detailed description is in the paper mentioned above.
The distance tolerance r_tol = 0.614 Å was carefully chosen through validation against both:
- Visual inspection of stacking domain boundaries
- Potential energy landscape calculations
This value provides optimal discrimination between stacking types while accurately capturing domain boundaries at energy transition regions (see figure above). The correlation between classified domains and energy landscape confirms that this tolerance correctly identifies the local coordination environment.
The spatial partition size of 150 Å was determined through systematic grid search optimization, balancing:
- Computational efficiency: Large enough for effective NumPy vectorization within each voxel
- Parallel scaling: Small enough for good load balancing across CPU cores with multiprocessing
- Memory management: Prevents out-of-memory errors even for very large systems (>1.33 B atoms)
For typical moiré bilayer systems, 150 Å voxels provide the best performance across system sizes ranging from 1M to 1.33 B atoms.
- Python 3.7 or higher
- NumPy >= 1.20.0
- Pandas >= 1.3.0
- Numba >= 0.54.0
- Matplotlib >= 3.4.0 (optional, for visualization)
# Clone the repository
git clone https://github.com/anikeya9/stacking-analysis.git
cd stacking-analysis
# Install dependencies
pip install -r requirements.txtpip install stacking-analysis# Basic usage
python stacking_cli.py dump.lammpstrj
# Specify output file
python stacking_cli.py dump.lammpstrj -o results.stack
# Custom parameters
python stacking_cli.py dump.lammpstrj --r-tol 0.7 --voxel-size 200
# Use specific number of CPU cores
python stacking_cli.py dump.lammpstrj --processes 8
# Quiet mode (no progress output)
python stacking_cli.py dump.lammpstrj --quietfrom stacking_analysis import StackingAnalyzer
# Create analyzer
analyzer = StackingAnalyzer()
# Load, analyze, and save (method chaining)
analyzer.load_structure('dump.lammpstrj').analyze().save_results('output.stack')
# Get statistics
stats = analyzer.get_statistics()
print(stats)The input should be a single-frame LAMMPS dump file from a bilayer simulation. The code automatically reads the column headers from line 9 of the dump file.
Required columns (must be present in order):
id- Unique atom identifiertype- Atom type (1-6 for SW/KC bilayer simulations)x- X coordinatey- Y coordinatez- Z coordinate
Additional columns are automatically ignored - The code only uses the first 5 required columns. You can have additional columns like fx, fy, fz, c_myPE, velocities, stresses, etc., and they will be read but not used in analysis.
Example LAMMPS dump file format:
ITEM: TIMESTEP
1000
ITEM: NUMBER OF ATOMS
12345
ITEM: BOX BOUNDS pp pp pp
0.0 100.0
0.0 100.0
0.0 50.0
ITEM: ATOMS id type x y z fx fy fz c_myPE
1 1 0.000 0.000 0.000 0.0 0.0 0.0 -3.2
2 4 1.584 0.915 3.172 0.0 0.0 0.0 -3.1
3 6 0.000 1.830 1.586 0.0 0.0 0.0 -2.8
...
Or with additional columns (stresses, etc.):
ITEM: ATOMS id type x y z fx fy fz c_myPE v_sxx v_syy v_szz v_sxy v_sxz v_syz
1 1 0.000 0.000 0.000 0.0 0.0 0.0 -3.2 1.0 2.0 3.0 0.0 0.0 0.0
...
If you have a multi-frame LAMMPS trajectory file, extract a single frame first:
# Method 1: Using tail (get last frame)
# Replace N with your actual number of atoms
tail -n $((N + 9)) dump.lammpstrj > single_frame.dump
# Method 2: Using split (split into individual frames)
# This creates frame_00000, frame_00001, etc.
split -l $((N + 9)) -d -a 5 dump.lammpstrj frame_
# Then analyze a single frame
python stacking_cli.py frame_00000Note: Replace N with your actual number of atoms. The 9 additional lines are the LAMMPS dump header.
The output .stack file contains all atoms of the specified type (default: type 4, top-layer Mo) with additional columns:
S_TYPE: Stacking type classification (AA, AA', A'B, AB, AB', BA, or X)S_CODE: Numeric code for the stacking type (0-6)
| Type | Code | Description | Energy |
|---|---|---|---|
| BA | 0 | BA stacking | Low |
| AB | 1 | AB stacking | Low |
| AA' | 2 | AA' stacking | Low |
| A'B | 3 | A'B stacking | Medium |
| AB' | 4 | AB' stacking | High |
| AA | 5 | AA stacking | High |
| X | 6 | Unclassified/transition | High |
from stacking_analysis import StackingAnalyzer
analyzer = StackingAnalyzer(
r_tol=0.614, # Distance tolerance (Å) - validated against energy landscape
voxel_size=150.0, # Spatial partition size (Å) - optimized for performance
s_neighbor_distance=3.0, # S-neighbor distance threshold (Å)
n_processes=8, # Number of CPU cores
verbose=True # Print progress
)
analyzer.load_structure('dump.lammpstrj', skiprows=9)
analyzer.analyze()
analyzer.save_results('output.stack', atom_type=4)# Get the results DataFrame
results_df = analyzer.results_df
# Filter by stacking type
aa_atoms = results_df[results_df['S_TYPE'] == 'AA']
ab_atoms = results_df[results_df['S_TYPE'] == 'AB']
# Get full dataset with coordinates
full_data = analyzer.df
# Export to CSV
results_df.to_csv('results.csv', index=False)Check if your LAMMPS dump file is correctly formatted before running analysis:
python stacking_cli.py dump.lammpstrj --validateOutput:
✓ Valid LAMMPS dump file
File: dump.lammpstrj
View file contents without running analysis:
python stacking_cli.py dump.lammpstrj --show-metadataOutput:
LAMMPS Dump File Metadata
==================================================
File: dump.lammpstrj
Timestep: 1000
Number of atoms: 12345
Box bounds:
x: 0.000 to 100.000
y: 0.000 to 100.000
z: 0.000 to 50.000
Columns: ['id', 'type', 'x', 'y', 'z', 'fx', 'fy', 'fz', 'c_myPE']
See the examples/ directory for more usage examples:
usage_example.py: Comprehensive Python API examples- Additional example data files (coming soon)
The classification algorithm:
- Spatial Partitioning: Divides the structure into overlapping 150 Å voxels
- Local Environment Analysis: For each target atom (type 4 Mo), identifies neighbors within distance thresholds
- Pattern Recognition: Classifies stacking based on local coordination vector patterns
- Parallel Processing: Processes voxels independently across multiple CPU cores
- Validation: Results validated against potential energy landscape
-
r_tol (0.614 Å): Distance tolerance for identifying nearest neighbors in the stacking direction. This value was validated by:
- Visual inspection of domain boundaries
- Correlation with potential energy landscape transitions
- Accurate identification of stacking types matching energy minima/maxima
-
voxel_size (150 Å): Size of spatial partitions, optimized through grid search for:
- Best balance between NumPy vectorization and multiprocessing
- Memory efficiency for large systems
- Minimal overhead while maintaining parallel scalability
-
s_neighbor_distance (3.0 Å): Maximum distance for identifying sulfur neighbors, based on MoS₂ lattice structure
These defaults are optimized for MoS₂ bilayers simulated with SW/KC potentials but can be adjusted for other materials.
Problem: "FileNotFoundError"
# Solution: Check your input file path
ls dump.lammpstrj # Verify file existsProblem: "Column not found" or parsing errors
# Solution: Verify your dump file has the required columns
head -n 9 dump.lammpstrj # Check the ITEM: ATOMS line
python stacking_cli.py dump.lammpstrj --validateProblem: Slow performance
# Solution: Adjust voxel size or number of processes
python stacking_cli.py dump.lammpstrj --voxel-size 200 --processes 16Problem: All atoms classified as 'X'
# Solution: Adjust distance tolerance
# Try values between 0.5-0.8 Å
python stacking_cli.py dump.lammpstrj --r-tol 0.7Problem: High memory usage
# Solution: Increase voxel size to reduce memory per core
python stacking_cli.py dump.lammpstrj --voxel-size 200 --processes 8Problem: Multi-frame dump file
# Solution: Extract single frame first
tail -n $((N_atoms + 9)) dump.lammpstrj > single_frame.dump
python stacking_cli.py single_frame.dumpContributions are welcome! Please feel free to:
- Report bugs by opening an issue
- Suggest features or improvements
- Submit pull requests
pytest tests/This project follows PEP 8 style guidelines.
This project is licensed under the MIT License - see the LICENSE file for details.
- Anikeya Aditya - GitHub
- This research was supported by the U.S. Department of Energy, Office of Basic Energy Sciences, Division of Materials Sciences and Engineering, Neutron Scattering and Instrumentation Sciences program under award DE-SC0023146.
- Computational resources from USC CARC
- Based on research published in ACS Nano
For questions or support:
- GitHub Issues: https://github.com/anikeya9/stacking-analysis/issues
