# Silicon Diffusion Simulation Tutorial
This notebook provides a step-by-step guide for simulating and analyzing the diffusion process of silicon using GPUMD. It covers input preparation, NEP training, model generation, simulation setup, and data analysis.

## Required Dependencies
- Python 3.x
- numpy
- matplotlib
- ase (optional, used to build the structure in Step 1)
- GPUMD (for MD simulation)
- NEP (for potential training, if needed)

Install Python packages with:
```python
!pip install numpy matplotlib ase
```
- GPUMD: [https://gpumd.org/](https://gpumd.org/)
- NEP: [https://github.com/zhyan0603/NEP](https://github.com/zhyan0603/NEP)

## Step 1: Generate Atomic Structure
Use the provided MATLAB script (`create_xyz.m`) to generate a silicon supercell in `model.xyz`, or build the same structure in Python with ASE, as in the cell below.

In [None]:
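# Example: generate a silicon supercell with ASE
from ase.build import bulk
from ase.io import write

# bulk() returns the 2-atom primitive diamond cell (lattice constant in Å)
si = bulk('Si', 'diamond', a=5.43)
# 10 x 10 x 10 repetitions of the primitive cell give 2000 atoms
supercell = si.repeat((10, 10, 10))
# Write extended XYZ so the lattice vectors are stored with the positions
write('model.xyz', supercell, format='extxyz')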
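
If ASE is not available, a comparable structure can be built by hand. The following is a minimal sketch, assuming the conventional 8-atom cubic diamond cell and an orthogonal box. The helper names `diamond_supercell` and `to_extxyz` are hypothetical, and the header line follows the general extended XYZ convention, so check it against what your GPUMD version expects in `model.xyz`.

```python
from itertools import product

# Fractional coordinates of the 8 atoms in the conventional diamond-cubic cell.
DIAMOND_BASIS = [
    (0.00, 0.00, 0.00), (0.00, 0.50, 0.50),
    (0.50, 0.00, 0.50), (0.50, 0.50, 0.00),
    (0.25, 0.25, 0.25), (0.25, 0.75, 0.75),
    (0.75, 0.25, 0.75), (0.75, 0.75, 0.25),
]


def diamond_supercell(a=5.43, reps=(2, 2, 2)):
    """Return (box_lengths, positions) in Å for a cubic diamond supercell."""
    nx, ny, nz = reps
    positions = [
        ((i + fx) * a, (j + fy) * a, (k + fz) * a)
        for i, j, k in product(range(nx), range(ny), range(nz))
        for fx, fy, fz in DIAMOND_BASIS
    ]
    return (nx * a, ny * a, nz * a), positions


def to_extxyz(box, positions, symbol="Si"):
    """Format the structure as extended XYZ text (header layout is an assumption)."""
    lx, ly, lz = box
    header = (
        f'Lattice="{lx:.6f} 0 0 0 {ly:.6f} 0 0 0 {lz:.6f}" '
        'Properties=species:S:1:pos:R:3 pbc="T T T"'
    )
    lines = [str(len(positions)), header]
    lines += [f"{symbol} {x:.6f} {y:.6f} {z:.6f}" for x, y, z in positions]
    return "\n".join(lines) + "\n"


box, positions = diamond_supercell(a=5.43, reps=(2, 2, 2))
text = to_extxyz(box, positions)
# To save the structure: open("model.xyz", "w").write(text)
```

A small `reps` value is used here only to keep the example quick; a production run would use a larger supercell, such as the 10 x 10 x 10 repetition above.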

## Step 2: NEP Potential Training (if needed)
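If you need to train a NEP model, use the `nep` executable with your own training dataset. According to the GPUMD documentation, `nep` takes no command-line arguments: it reads the hyperparameters from `nep.in` and the training set from `train.xyz` (extended XYZ format) in the working directory, and writes the trained potential to `nep.txt`, which you can then rename:
```bash
nep
mv nep.txt Si_2022_NEP3_3body.txt
```
For this tutorial, a pre-trained NEP file is provided.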

In [None]:
# Check input files
import os
for fname in ['model.xyz', 'run.in', 'Si_2022_NEP3_3body.txt']:
    print(f'{fname}:', 'Found' if os.path.exists(fname) else 'Missing')
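
Beyond checking that the files exist, it can help to confirm that the potential file describes the right element. The sketch below assumes that the first line of a NEP file has the form `nep3 1 Si` (a version tag, the number of species, then the species symbols). `parse_nep_header` is a hypothetical helper, not part of GPUMD.

```python
def parse_nep_header(first_line):
    """Split a header such as 'nep3 1 Si' into (version_tag, species_list).

    The header layout is an assumption; verify it against your NEP file.
    """
    fields = first_line.split()
    if len(fields) < 3 or not fields[0].startswith("nep"):
        raise ValueError(f"unexpected NEP header: {first_line!r}")
    n_types = int(fields[1])
    species = fields[2:]
    if len(species) != n_types:
        raise ValueError(
            f"header declares {n_types} species but lists {len(species)}"
        )
    return fields[0], species


# Example with a literal header string; for a real file, pass f.readline()
# from open('Si_2022_NEP3_3body.txt').
tag, species = parse_nep_header("nep3 1 Si")
```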

## Step 3: GPUMD Simulation Setup
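The `run.in` file configures the MD simulation: the potential, the thermostat, the time step, and the `compute_msd` and `compute_sdc` keywords that request the diffusion output. With `model.xyz`, `run.in`, and the NEP file in the working directory, start the run:
```bash
gpumd
```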
This generates `msd.out`, `sdc.out`, and other output files.
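
The contents of `run.in` are not reproduced in this notebook. As a rough sketch of what such a file might contain, the hypothetical helper `build_run_in` below assembles an equilibration stage followed by a production stage that samples MSD and SDC. The keywords are standard GPUMD keywords, but every numerical value (temperature, step counts, sampling interval, number of correlation steps) is a placeholder assumption, not the setting used for this tutorial.

```python
def build_run_in(potential="Si_2022_NEP3_3body.txt", temperature=1000,
                 time_step_fs=1, equil_steps=10000, prod_steps=100000,
                 sample_interval=5, n_corr=200):
    """Return the text of an illustrative run.in (all values are placeholders)."""
    lines = [
        f"potential {potential}",
        f"velocity {temperature}",
        "",
        # Equilibration with a Berendsen thermostat: T_start, T_end, coupling
        f"ensemble nvt_ber {temperature} {temperature} 100",
        f"time_step {time_step_fs}",
        "dump_thermo 1000",
        f"run {equil_steps}",
        "",
        # Production run that writes msd.out and sdc.out
        "ensemble nve",
        f"compute_msd {sample_interval} {n_corr}",
        f"compute_sdc {sample_interval} {n_corr}",
        f"run {prod_steps}",
    ]
    return "\n".join(lines) + "\n"


run_in_text = build_run_in()
# To save it: open("run.in", "w").write(run_in_text)
```

With these placeholder values, the longest correlation time would be `sample_interval * n_corr * time_step`, i.e. 1000 fs; choose the values so that this window is long enough for the MSD to become linear.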

## Step 4: Analyze Diffusion Data
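The `msd.out` file contains the mean squared displacement (MSD): column 1 is the correlation time (ps), columns 2-4 are the MSD in the x, y, and z directions (Å$^2$), and columns 5-7 are the self-diffusion coefficient (SDC) derived from the MSD (Å$^2$/ps). The `sdc.out` file comes from the velocity autocorrelation (VAC) route: column 1 is the correlation time (ps), columns 2-4 are the VAC (Å$^2$/ps$^2$), and columns 5-7 are the SDC (Å$^2$/ps). You can use Python to read and plot these quantities.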
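
A single diffusion coefficient can be extracted from the MSD through the Einstein relation: in three dimensions the total MSD grows as 6·D·t at long times, so D is one sixth of the slope. The sketch below fits that slope by ordinary least squares after discarding the early, non-linear part of the curve. The helper names and the `skip_fraction` default are assumptions for illustration, and the data are synthetic.

```python
A2_PER_PS_TO_CM2_PER_S = 1e-4  # 1 Å^2/ps = 1e-8 m^2/s = 1e-4 cm^2/s


def fit_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return sxy / sxx


def diffusion_from_msd(time_ps, msd_total, skip_fraction=0.2):
    """Estimate D (Å^2/ps) from the total MSD via MSD ≈ 6 D t.

    The first `skip_fraction` of the points is ignored, on the assumption
    that it covers the ballistic regime.
    """
    start = int(len(time_ps) * skip_fraction)
    return fit_slope(time_ps[start:], msd_total[start:]) / 6.0


# Synthetic check: a perfectly linear MSD generated with D = 0.5 Å^2/ps.
time_ps = [0.1 * i for i in range(100)]
msd_total = [6.0 * 0.5 * t + 0.3 for t in time_ps]
d_fit = diffusion_from_msd(time_ps, msd_total)
d_fit_cm2_s = d_fit * A2_PER_PS_TO_CM2_PER_S
```

For real data, pass the time column and the summed x, y, z MSD columns from `msd.out` instead of the synthetic lists.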

In [None]:
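# Example: plot MSD and SDC from the GPUMD output files
import os
import numpy as np
import matplotlib.pyplot as plt

# msd.out columns: time (ps), MSD x/y/z (Å^2), SDC x/y/z (Å^2/ps).
# Adjust the path if the file lives elsewhere (e.g. the GPUMD examples directory).
msd_data = np.loadtxt('msd.out')
time = msd_data[:, 0]
msd = msd_data[:, 1:4].sum(axis=1)  # total MSD = sum of x, y, z components
plt.plot(time, msd)
plt.xlabel('Correlation time (ps)')
plt.ylabel('MSD (Å$^2$)')
plt.title('Mean Squared Displacement of Si')
plt.show()

# sdc.out columns: time (ps), VAC x/y/z (Å^2/ps^2), SDC x/y/z (Å^2/ps).
if os.path.exists('sdc.out'):
    sdc_data = np.loadtxt('sdc.out')
    time_sdc = sdc_data[:, 0]
    # Columns 1:4 hold the VAC; the SDC is in columns 4:7.
    # The isotropic coefficient is the mean, not the sum, of the components.
    sdc = sdc_data[:, 4:7].mean(axis=1)
    plt.plot(time_sdc, sdc)
    plt.xlabel('Correlation time (ps)')
    plt.ylabel('SDC (Å$^2$/ps)')
    plt.title('Self-Diffusion Coefficient of Si')
    plt.show()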
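
The running SDC curve should level off once the velocity autocorrelation has decayed, and the plateau value is the diffusion coefficient. One simple way to read it off is to average the tail of the curve. The sketch below does that with the hypothetical helper `plateau_estimate`; the `tail_fraction` default is an arbitrary assumption, and the input is synthetic.

```python
from statistics import mean, stdev


def plateau_estimate(values, tail_fraction=0.25):
    """Return (mean, sample standard deviation) of the last part of a curve."""
    n_tail = max(2, int(len(values) * tail_fraction))
    tail = values[-n_tail:]
    return mean(tail), stdev(tail)


# Synthetic running SDC that saturates at 0.5 Å^2/ps.
sdc_curve = [0.5 * (1.0 - 0.5 ** i) for i in range(40)]
d_plateau, d_spread = plateau_estimate(sdc_curve)
```

A large spread relative to the mean suggests that the curve has not converged and that a longer correlation window or more sampling is needed.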

## Reference
For details on the method and further reading, see the GPUMD documentation: [https://gpumd.org/](https://gpumd.org/)