# DriftWatch CLI Tutorial

This notebook shows how to install the DriftWatch command-line interface (CLI), run a drift check on two datasets, and list the available options.

In [None]:
!pip install "driftwatch[cli]"

### 1. Generate Dummy Data
First, let's create two Parquet files that simulate a reference (training) dataset and a current (production) dataset. Writing Parquet from pandas requires a Parquet engine such as `pyarrow` or `fastparquet`.

In [None]:
import numpy as np
import pandas as pd

# Generate reference data (baseline distributions)
np.random.seed(42)  # Fix the seed so the generated data is reproducible
ref_data = pd.DataFrame(
    {
        "age": np.random.normal(30, 5, 1000),
        "income": np.random.exponential(50000, 1000),
        "category": np.random.choice(["A", "B", "C"], 1000, p=[0.5, 0.3, 0.2]),
    }
)
ref_data.to_parquet("reference.parquet")

# Generate current data (drifted)
curr_data = pd.DataFrame(
    {
        "age": np.random.normal(35, 5, 1000),  # Mean shifted from 30 to 35
        "income": np.random.exponential(60000, 1000),  # Scale shifted from 50000 to 60000
        "category": np.random.choice(
            ["A", "B", "C"], 1000, p=[0.3, 0.4, 0.3]
        ),  # Class probabilities shifted
    }
)
curr_data.to_parquet("current.parquet")
print("Data generated: reference.parquet, current.parquet")
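
Before running the CLI, it can help to confirm that the synthetic drift is actually present in the data. The sketch below is independent of DriftWatch and uses only NumPy and pandas. The helpers `summarize_shift` and `category_shares` are hypothetical names introduced for illustration, and the data is regenerated in memory with the same seed and parameters as above so the block runs on its own.

```python
import numpy as np
import pandas as pd


def summarize_shift(ref: pd.DataFrame, cur: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not part of DriftWatch): compare numeric column means."""
    numeric_cols = ref.select_dtypes(include="number").columns
    summary = pd.DataFrame(
        {
            "ref_mean": ref[numeric_cols].mean(),
            "cur_mean": cur[numeric_cols].mean(),
        }
    )
    # Express the mean shift in units of the reference standard deviation
    summary["shift_in_ref_std"] = (
        summary["cur_mean"] - summary["ref_mean"]
    ) / ref[numeric_cols].std()
    return summary


def category_shares(ref: pd.Series, cur: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: relative frequency of each category in both samples."""
    return pd.DataFrame(
        {
            "ref_share": ref.value_counts(normalize=True),
            "cur_share": cur.value_counts(normalize=True),
        }
    ).fillna(0.0)


# Rebuild the same synthetic data as above (same seed, same call order)
np.random.seed(42)
ref_data = pd.DataFrame(
    {
        "age": np.random.normal(30, 5, 1000),
        "income": np.random.exponential(50000, 1000),
        "category": np.random.choice(["A", "B", "C"], 1000, p=[0.5, 0.3, 0.2]),
    }
)
curr_data = pd.DataFrame(
    {
        "age": np.random.normal(35, 5, 1000),
        "income": np.random.exponential(60000, 1000),
        "category": np.random.choice(["A", "B", "C"], 1000, p=[0.3, 0.4, 0.3]),
    }
)

summary = summarize_shift(ref_data, curr_data)
shares = category_shares(ref_data["category"], curr_data["category"])
print(summary)
print(shares)
```

The `age` row should show a shift of roughly one reference standard deviation, since the mean moves by 5 with a standard deviation of 5. This is only a sanity check on summary statistics, not a substitute for the statistical tests a drift tool applies.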

### 2. Check for Drift
Use the `driftwatch check` command to compare the two datasets.

In [None]:
!driftwatch check reference.parquet current.parquet
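
The `!` prefix only works inside IPython or Jupyter. Outside a notebook, the same command can be launched from a plain Python script with the standard library, as in the minimal sketch below. `run_command` is a hypothetical helper, and the only DriftWatch invocation used is the one shown above. How DriftWatch uses its exit code is not covered here, so the sketch prints it without interpreting it.

```python
import shutil
import subprocess


def run_command(args: list[str]) -> tuple[int, str]:
    """Hypothetical helper: run a command, return (exit code, stdout + stderr)."""
    result = subprocess.run(args, capture_output=True, text=True, check=False)
    return result.returncode, result.stdout + result.stderr


# Same invocation as the notebook cell above
cmd = ["driftwatch", "check", "reference.parquet", "current.parquet"]

if shutil.which(cmd[0]) is None:
    # Avoid a FileNotFoundError when the CLI is not installed
    print("driftwatch was not found on PATH; install it first.")
else:
    code, output = run_command(cmd)
    print(output)
    print(f"exit code: {code}")
```

Passing the arguments as a list avoids shell quoting issues with file names that contain spaces.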

### 3. Help Command
Run `driftwatch --help` to list all available commands and options.

In [None]:
!driftwatch --help