# Tutorial 01: Quickstart (Silicon)

Welcome to the **MLIP AutoPipeC** quickstart tutorial. In this notebook, we will demonstrate the "Zero-Configuration" workflow for generating a Machine Learning Interatomic Potential (MLIP) for Silicon.

We will run the pipeline in **Mock Mode**, which simulates the physics engines (DFT, MD) to allow this tutorial to run instantly on any machine without external dependencies.

In [None]:
import os
import sys
import yaml
from pathlib import Path

# Add src to path if running from repo root
sys.path.append(str(Path.cwd() / "src"))

from mlip_autopipec.main import main

# ENABLE MOCK MODE
# This tells the system to use simulated physics engines
os.environ["PYACEMAKER_MOCK_MODE"] = "1"
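
For readers curious how such a switch is typically consumed, here is a minimal sketch of reading an environment flag. The helper `mock_mode_enabled` is hypothetical and is not part of `mlip_autopipec`. This tutorial does not show how the package interprets `PYACEMAKER_MOCK_MODE`, so the set of accepted values below is an assumption.

```python
import os


def mock_mode_enabled(environ=None):
    """Return True when the mock-mode flag holds a truthy string.

    Hypothetical helper for illustration only; the real package may
    parse the variable differently.
    """
    env = os.environ if environ is None else environ
    value = env.get("PYACEMAKER_MOCK_MODE", "")
    # Assumed truthy spellings; only "1" is confirmed by this tutorial.
    return value.strip().lower() in {"1", "true", "yes"}


print(mock_mode_enabled({"PYACEMAKER_MOCK_MODE": "1"}))
print(mock_mode_enabled({}))
```

Passing a plain dictionary instead of `os.environ` keeps the check easy to test without touching the real environment.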

## 1. Prepare Data

We need a small initial dataset to start the active learning loop. We will create a simple `.xyz` file containing a single Silicon structure with two atoms.

In [None]:
data_content = """2
Lattice="5.43 0.0 0.0 0.0 5.43 0.0 0.0 0.0 5.43" Properties=species:S:1:pos:R:3:forces:R:3 energy=-10.0
Si 0.0 0.0 0.0 0.0 0.0 0.0
Si 1.35 1.35 1.35 0.0 0.0 0.0
"""

data_path = Path.cwd() / "tutorials" / "data.xyz"
data_path.parent.mkdir(exist_ok=True)

with open(data_path, "w") as f:
    f.write(data_content)

print(f"Created dataset at {data_path}")
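
To make the layout of the file concrete, the sketch below parses the same text using only the standard library. The function `parse_single_frame` is a hypothetical helper written for this tutorial. It handles only a single frame with the exact column order used above (species, position, forces), so it is not a general extended XYZ reader.

```python
import shlex

data_content = """2
Lattice="5.43 0.0 0.0 0.0 5.43 0.0 0.0 0.0 5.43" Properties=species:S:1:pos:R:3:forces:R:3 energy=-10.0
Si 0.0 0.0 0.0 0.0 0.0 0.0
Si 1.35 1.35 1.35 0.0 0.0 0.0
"""


def parse_single_frame(text):
    """Parse one frame: atom count, key=value header, then per-atom rows."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    # shlex.split keeps the quoted Lattice value together as one token.
    header = dict(token.split("=", 1) for token in shlex.split(lines[1]))
    atoms = []
    for line in lines[2:2 + n_atoms]:
        fields = line.split()
        atoms.append({
            "species": fields[0],
            "position": tuple(float(x) for x in fields[1:4]),
            "force": tuple(float(x) for x in fields[4:7]),
        })
    return n_atoms, header, atoms


n_atoms, header, atoms = parse_single_frame(data_content)
print(n_atoms, float(header["energy"]), [a["species"] for a in atoms])
```

The first line gives the atom count, the second carries the cell and the frame energy, and each remaining row holds one atom.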

## 2. Configure the Pipeline

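We define the project settings as a Python dictionary and write them to a YAML file. Notice that we specify only high-level goals, such as the exploration strategy and the number of iterations.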

In [None]:
config_data = {
    "project": {"name": "QuickstartSi"},
    "training": {"dataset_path": str(data_path), "max_epochs": 1},
    "orchestrator": {"max_iterations": 1},
    "exploration": {"strategy": "random"},
    "selection": {"method": "random"},
    "validation": {"run_validation": True},
    "dft": {"pseudopotentials": {"Si": "Si.upf"}},
    "oracle": {"method": "dft"}
}

config_path = Path.cwd() / "tutorials" / "quickstart_config.yaml"

with open(config_path, "w") as f:
    yaml.dump(config_data, f)

print(f"Created config at {config_path}")
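
Before launching a run, it can help to confirm that the configuration contains every section you expect. The sketch below is a plain-Python check. The list of section names is simply copied from the dictionary above. Whether the pipeline treats all of them as mandatory is an assumption, and `missing_sections` is a hypothetical helper, not part of `mlip_autopipec`.

```python
# Section names copied from the tutorial config; treating all of them
# as required is an assumption made for illustration.
EXPECTED_SECTIONS = (
    "project", "training", "orchestrator", "exploration",
    "selection", "validation", "dft", "oracle",
)


def missing_sections(config, expected=EXPECTED_SECTIONS):
    """Return the expected top-level sections absent from the config."""
    return [name for name in expected if name not in config]


example_config = {
    "project": {"name": "QuickstartSi"},
    "training": {"dataset_path": "tutorials/data.xyz", "max_epochs": 1},
    "orchestrator": {"max_iterations": 1},
    "exploration": {"strategy": "random"},
    "selection": {"method": "random"},
    "validation": {"run_validation": True},
    "dft": {"pseudopotentials": {"Si": "Si.upf"}},
    "oracle": {"method": "dft"},
}

print(missing_sections(example_config))
print(missing_sections({"project": {"name": "QuickstartSi"}}))
```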

## 3. Run the Orchestrator

We invoke the main entry point, passing our configuration file. The system will:
1. Train an initial potential.
2. Explore new structures (Mocked).
3. Select candidates.
4. Label them with DFT (Mocked).
5. Retrain and Validate.
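
The five steps above form one iteration of an active learning loop. The toy sketch below mirrors that control flow with stand-in functions. Every function name is hypothetical, the "potential" is just the mean energy of the dataset, the oracle is a made-up quadratic, and none of it reflects how `mlip_autopipec` is implemented.

```python
import random


def train(dataset):
    """Stand-in 'potential': the mean energy of the labeled data."""
    return sum(energy for _, energy in dataset) / len(dataset)


def explore(rng, n_candidates):
    """Stand-in exploration: random lattice constants near 5.43 angstrom."""
    return [5.43 + rng.uniform(-0.2, 0.2) for _ in range(n_candidates)]


def select(rng, candidates, n_select):
    """Random selection, mirroring the 'random' method in the config."""
    return rng.sample(candidates, n_select)


def label(structure):
    """Stand-in oracle: a made-up quadratic energy, not real DFT."""
    return -10.0 + (structure - 5.43) ** 2


def validate(potential, dataset):
    """Mean absolute error of the constant model on the dataset."""
    return sum(abs(potential - e) for _, e in dataset) / len(dataset)


def run_loop(dataset, max_iterations=1, seed=0):
    rng = random.Random(seed)
    dataset = list(dataset)
    potential = train(dataset)                        # 1. initial potential
    for _ in range(max_iterations):
        candidates = explore(rng, n_candidates=10)    # 2. explore
        chosen = select(rng, candidates, n_select=3)  # 3. select
        dataset += [(s, label(s)) for s in chosen]    # 4. label
        potential = train(dataset)                    # 5. retrain
    return potential, validate(potential, dataset), len(dataset)


potential, error, n_structures = run_loop([(5.43, -10.0)])
print(n_structures)
```

With one starting structure and three selected candidates per iteration, a single iteration leaves four labeled structures in the dataset.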

In [None]:
# Simulate Command Line Arguments
sys.argv = ["mlip-pipeline", str(config_path)]

# Run the pipeline
try:
    main()
    print("\nWorkflow Finished Successfully!")
except SystemExit as e:
    # sys.exit() and sys.exit(0) both mean success; e.code is None in the first case
    if e.code in (0, None):
        print("\nWorkflow Finished Successfully!")
    else:
        print(f"Workflow failed with code {e.code}")
except Exception as e:
    print(f"An error occurred: {e}")

## 4. Check Results

The pipeline should have created a `release_v1.0.0.zip` archive in the current working directory, containing the final potential.

In [None]:
release_file = Path("release_v1.0.0.zip")
if release_file.exists():
    print("Release file found!")
else:
    print("Release file missing.")
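
If the archive exists, you may want to see what it holds. The sketch below lists the member names using the standard-library `zipfile` module. The helper `list_release_contents` is hypothetical, and the tutorial does not state which files the release contains, so no particular listing is assumed.

```python
import zipfile
from pathlib import Path


def list_release_contents(path):
    """Return the archive's member names, or an empty list if it is absent."""
    path = Path(path)
    if not path.exists():
        return []
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


for name in list_release_contents("release_v1.0.0.zip"):
    print(name)
```

Returning an empty list for a missing file lets the cell run cleanly even when the pipeline did not produce a release.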