# CoTLab Tutorial

**CoTLab** is a research toolkit for studying Chain-of-Thought (CoT) reasoning in LLMs.

In this tutorial you will learn to:
1. Load a model with CoTLab's backend system
2. Run experiments using CoTLab's experiment API
3. Log and save results with ExperimentLogger
4. Analyze results with the analysis module

> **Note**: We use GPT-2 here to keep the demo fast. For real experiments, use a larger model such as MedGemma.

## 1. Imports

In [None]:
import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"  # Suppress the Hugging Face tokenizers parallelism warning

from cotlab.backends import TransformersBackend
from cotlab.datasets.loaders import TutorialDataset
from cotlab.experiments import CoTFaithfulnessExperiment
from cotlab.logging import ExperimentLogger
from cotlab.prompts.strategies import (
    ChainOfThoughtStrategy,
    ContrarianStrategy,
    DirectAnswerStrategy,
)

In [None]:
NUM_SAMPLES = 10  # Number of dataset samples used per experiment run
MAX_TOKENS = 256  # Passed to experiment.run() as max_new_tokens

## 2. Load Backend and Model

In [None]:
# Create backend and load model
backend = TransformersBackend(device="auto", dtype="bfloat16")
backend.load_model("openai-community/gpt2")

# Load dataset
dataset = TutorialDataset(path="../data/tutorial.json")
print(f"Dataset: {len(dataset)} samples (using {NUM_SAMPLES} for this run)")

## 3. Define Experiment and Strategies

In [None]:
# Create CoTLab experiment
experiment = CoTFaithfulnessExperiment(
    name="tutorial_comparison",
    num_samples=NUM_SAMPLES,
)

# Define prompting strategies
strategies = {
    "contrarian": ContrarianStrategy(),
    "chain_of_thought": ChainOfThoughtStrategy(),
    "direct_answer": DirectAnswerStrategy(),
}

print(f"Experiment: {experiment.name}")
print(f"Strategies: {list(strategies.keys())}")
print(f"Samples: {NUM_SAMPLES}, Max tokens: {MAX_TOKENS}")
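To make the difference between the three strategies concrete, here is a minimal, self-contained sketch of what such prompt builders could look like. The function names and prompt wording are hypothetical and only illustrate the idea; they are not CoTLab's actual templates, and `ContrarianStrategy` in particular may behave differently from what its name suggests here.

```python
# Hypothetical prompt templates for illustration only.
# CoTLab's real strategy classes may format prompts differently.


def direct_answer_prompt(question: str) -> str:
    """Ask for the answer with no intermediate reasoning."""
    return f"Question: {question}\nAnswer:"


def chain_of_thought_prompt(question: str) -> str:
    """Ask the model to reason step by step before answering."""
    return f"Question: {question}\nLet's think step by step."


def contrarian_prompt(question: str) -> str:
    """Ask the model to argue against the obvious answer first (assumed behavior)."""
    return (
        f"Question: {question}\n"
        "First argue against the most obvious answer, then give your final answer."
    )


# Same keys as the `strategies` dict above, mapped to plain functions.
toy_strategies = {
    "contrarian": contrarian_prompt,
    "chain_of_thought": chain_of_thought_prompt,
    "direct_answer": direct_answer_prompt,
}

question = "What is 2 + 2?"
for name, build_prompt in toy_strategies.items():
    print(f"--- {name} ---")
    print(build_prompt(question))
```

Keeping the strategies in a dict keyed by name, as the tutorial does, means the same loop can run and label every variant without special cases.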

## 4. Run Experiments with Logging

Use CoTLab's `ExperimentLogger` to save results to JSON files.

In [None]:
from pathlib import Path

results = {}

for name, strategy in strategies.items():
    print(f"\n{'=' * 60}")
    print(f"Running: {name}")
    print(f"{'=' * 60}")

    # Create logger for this run
    logger = ExperimentLogger(f"../outputs/tutorial_{name}")

    # Log configuration
    logger.log_config(
        {
            "experiment": experiment.name,
            "strategy": name,
            "model": "openai-community/gpt2",  # Same model ID passed to load_model()
        }
    )

    # Run CoTLab experiment with logger
    result = experiment.run(
        backend=backend,
        dataset=dataset,
        prompt_strategy=strategy,
        logger=logger,
        max_new_tokens=MAX_TOKENS,
    )

    # Save results using logger
    output_path = logger.save_results(result)
    print(f"Results saved to: {output_path}")

    results[name] = result
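Once the runs finish, it can be useful to check what was actually written to disk. The sketch below uses only the standard library and assumes nothing about the logger's JSON schema: it finds every `.json` file under a directory and prints each file's top-level keys. The helper name `load_json_files` is ours, not part of CoTLab.

```python
import json
from pathlib import Path


def load_json_files(root):
    """Return {path: parsed JSON} for every .json file found under root."""
    loaded = {}
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            loaded[path] = json.load(f)
    return loaded


output_root = Path("../outputs")  # Same parent directory the loggers above write into
if output_root.is_dir():
    for path, payload in load_json_files(output_root).items():
        # Print only the top-level shape, since the schema is not assumed here.
        summary = list(payload) if isinstance(payload, dict) else type(payload).__name__
        print(path, summary)
```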

## 5. Analyze Results with CoTLab Analysis Module

Use the `cotlab.analyse_experiments` module to extract answers from the saved results and compare the strategies.

In [None]:
from cotlab.analyse_experiments import (
    analyse_experiments_dir,
    export_to_csv,
    print_analysis_report,
)

# Analyze all saved results
results_dir = Path("../outputs")
all_results = analyse_experiments_dir(results_dir)

# Print comprehensive analysis report
print_analysis_report(all_results, "Tutorial Experiment Analysis")
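For intuition about what answer extraction involves, here is a minimal regex-based sketch. It is an assumption about the general technique, not a description of how `analyse_experiments` works internally: it takes the text after the last "answer:" or "answer is" marker in a model output and compares it to an expected string, ignoring case.

```python
import re

# Matches "answer: X", "final answer: X", "answer is X" (case-insensitive).
ANSWER_PATTERN = re.compile(
    r"(?:final answer|answer)\s*(?:is|:)\s*(.+)", re.IGNORECASE
)


def extract_answer(text):
    """Return the text after the last answer marker, or None if there is none."""
    matches = ANSWER_PATTERN.findall(text)
    if not matches:
        return None
    # Use the last match so that later corrections override earlier guesses.
    return matches[-1].strip().rstrip(".")


def is_correct(output, expected):
    """Case-insensitive exact match between the extracted and expected answer."""
    answer = extract_answer(output)
    return answer is not None and answer.lower() == expected.lower()


sample = "Let's think step by step. 2 + 2 = 4.\nFinal answer: 4."
print(extract_answer(sample))
print(is_correct(sample, "4"))
```

A heuristic like this is brittle with small models such as GPT-2, whose outputs often contain no answer marker at all, which is one reason to rely on the library's own analysis for reported numbers.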

## 6. Export Results to CSV

In [None]:
# Export analysis to CSV for further processing
csv_path = results_dir / "tutorial_analysis.csv"
export_to_csv(all_results, csv_path)

print(f"\nCSV exported to: {csv_path}")
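The exported file can then be read back with the standard library for a quick look. This sketch makes no assumption about which columns `export_to_csv` writes; it prints whatever header it finds. The helper name `read_csv_rows` is ours.

```python
import csv
from pathlib import Path


def read_csv_rows(csv_file):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with Path(csv_file).open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


exported = Path("../outputs/tutorial_analysis.csv")  # Path used in the cell above
if exported.is_file():
    rows = read_csv_rows(exported)
    print(f"{len(rows)} rows")
    if rows:
        print("Columns:", list(rows[0].keys()))
```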

## 7. Next Steps

**Other CoTLab experiments**:
- `LogitLensExperiment` - See what the model "thinks" at each layer
- `AttentionAnalysisExperiment` - Analyze attention patterns
- `ProbingClassifierExperiment` - Train probes on hidden states

**For future use**:
- Use the `python -m cotlab.runner` CLI with Hydra configs
- See the `conf/` folder for configuration options
- Use a larger model, such as `google/medgemma-27b-text-it`