# DPP-PDC Quick Start Guide

This notebook demonstrates how to use DPP-PDC for phase diagram construction with active learning.

## 1. Installation

If the package is not already installed, install it in editable mode from the repository root:

```bash
pip install -e .
```

In [None]:
# Import required modules
from dpp_pdc import ExperimentConfig
from dpp_pdc.run_experiments import run_single_temperature_experiment
from dpp_pdc.data_preprocessing import standardize_data
import matplotlib.pyplot as plt
import pandas as pd

## 2. Load and Explore Data

In [None]:
# Load a dataset
data_path = "../datasets/Cu-Mg_Zn_850K.csv"
data = pd.read_csv(data_path)

print(f"Dataset shape: {data.shape}")
print(f"Number of phases: {data['phase_name'].nunique()}")
print("\nPhase distribution:")
print(data['phase_name'].value_counts())
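
Raw counts can hide how unbalanced the phases are. The sketch below shows one way to turn counts into fractions and flag rare phases, which are typically the hardest to find by sampling. It runs on a small hypothetical table rather than the real dataset: the phase labels and the 15% threshold are illustrative assumptions, and only the `phase_name` column name comes from the cell above.

```python
import pandas as pd

# Hypothetical toy data standing in for a real dataset; only the
# 'phase_name' column name is taken from the notebook above.
toy = pd.DataFrame({
    "phase_name": ["LIQUID"] * 6 + ["FCC_A1"] * 3 + ["LAVES_C15"],
})

# Fraction of data points per phase, sorted from most to least common.
fractions = toy["phase_name"].value_counts(normalize=True)

# Flag phases below an arbitrary 15% threshold (an illustrative choice).
rare_phases = fractions[fractions < 0.15].index.tolist()

print(fractions)
print(f"Rare phases: {rare_phases}")
```

To apply the same check to the loaded data, replace `toy` with `data`.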

## 3. Configure and Run Experiment

In [None]:
# Create experiment configuration
config = ExperimentConfig()

# Configure for a quick test
config.set_temperatures(["850K"])
config.set_algorithms(["RS", "PDC", "DPP-PDC"])
config.set_batch_sizes([10])
config.n_runs = 3  # Use more runs (e.g., 10) for publication-quality results

print(config.summary())
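
Before launching, it can help to estimate how much work a configuration implies. The sketch below counts the runs under the assumption that the experiment executes one independent run per combination of algorithm, batch size, and repetition. That scheduling is an assumption, not confirmed behavior of `dpp_pdc`.

```python
from itertools import product

# Values mirror the configuration cell above.
algorithms = ["RS", "PDC", "DPP-PDC"]
batch_sizes = [10]
n_runs = 3

# Assumption: one independent run per (algorithm, batch size, repetition).
runs = list(product(algorithms, batch_sizes, range(n_runs)))
print(f"Total runs: {len(runs)}")  # 3 algorithms x 1 batch size x 3 runs = 9
```

Under this assumption, raising `n_runs` to 10 with the same algorithms and batch size would give 30 runs.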

In [None]:
# Run the experiment
# Note: This may take several minutes depending on configuration
result = run_single_temperature_experiment(
    temperature="850K",
    file_path_template="../datasets/Cu-Mg_Zn_{temperature}.csv",
    algorithms=config.algorithms,
    batch_sizes=config.batch_sizes,
    n_runs=config.n_runs
)

print("\nExperiment completed!")
print(f"Results saved to: {result['result_dir']}")

## 4. Visualize Results

In [None]:
# Load and display the phase discovery plot
from IPython.display import Image
Image(filename=f"{result['result_dir']}/phase_discovery_batch10_850K.png")
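
The file name above is hard-coded for batch size 10 at 850K, so it will not exist if the experiment used other settings. The helper below is a minimal sketch for checking which plots are present before displaying them. `find_plots` is a hypothetical name, not part of `dpp_pdc`, and the file name pattern is inferred from the single file name used above.

```python
from pathlib import Path


def find_plots(result_dir, batch_sizes, temperature):
    """Return (existing, missing) plot paths for the given batch sizes.

    The file name pattern is inferred from one example file name and is
    an assumption for any other batch size or temperature.
    """
    existing, missing = [], []
    for batch_size in batch_sizes:
        name = f"phase_discovery_batch{batch_size}_{temperature}.png"
        path = Path(result_dir) / name
        # Sort each expected path by whether the file is actually on disk.
        (existing if path.is_file() else missing).append(path)
    return existing, missing
```

In this notebook it could be called as `find_plots(result['result_dir'], config.batch_sizes, "850K")`, passing each path in `existing` to `Image(filename=str(path))`.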

## 5. Using Configuration Files

For reproducible experiments, you can use TOML configuration files:

In [None]:
# Load configuration from file
config_from_file = ExperimentConfig.from_toml("../configs/default.toml")
print(config_from_file.summary())

## 6. Command Line Usage

You can also run experiments from the command line. The paths below assume the commands are run from the repository root:

```bash
# Simple run
dpp-pdc run --data datasets/Cu-Mg_Zn_850K.csv --algorithm DPP-PDC

# With config file
dpp-pdc run --config configs/default.toml

# Generate a new config file
dpp-pdc init-config --output my_experiment.toml
```