# ClauseMate Demo: German Pronoun Analysis

This notebook demonstrates ClauseMate, a tool for analyzing the clause mate relationships of German pronouns.

## Overview

ClauseMate identifies critical pronouns (third-person personal pronouns, d-pronouns, and demonstratives) and extracts their relationships with other referential expressions in the same sentence.
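
As a rough illustration of the idea, and not ClauseMate's actual implementation, the sketch below pairs each critical pronoun in a sentence with every other referential expression in that sentence. The `Mention` class, the `find_clause_mates` helper, and the small `DEMO_PRONOUNS` set are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass

# Hypothetical, deliberately small pronoun set used only in this sketch.
DEMO_PRONOUNS = {"er", "sie", "es", "der", "die", "das", "dieser", "diese"}


@dataclass(frozen=True)
class Mention:
    """A referential expression: its sentence number and surface text."""

    sentence_id: int
    text: str


def find_clause_mates(mentions):
    """Pair each pronoun with the other mentions in the same sentence."""
    pairs = []
    for pronoun in mentions:
        if pronoun.text.lower() not in DEMO_PRONOUNS:
            continue
        for other in mentions:
            if other is not pronoun and other.sentence_id == pronoun.sentence_id:
                pairs.append((pronoun, other))
    return pairs


# Made-up mentions: three in sentence 1, one in sentence 2.
mentions = [
    Mention(1, "Karl"),
    Mention(1, "er"),
    Mention(1, "den Brief"),
    Mention(2, "Anna"),
]
for pronoun, mate in find_clause_mates(mentions):
    print(f"{pronoun.text} -> {mate.text}")
```

Only the pronoun in sentence 1 produces pairs here; the mention in sentence 2 has no pronoun to pair with.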

In [None]:
# Import the single-file analyzer and the multi-file batch processor
import os

import pandas as pd
from IPython.display import display

from src.main import ClauseMateAnalyzer
from src.multi_file.multi_file_batch_processor import MultiFileBatchProcessor

print("✅ ClauseMate modules imported successfully!")

## Single File Analysis

Let's analyze a single TSV file to see how the tool works:

In [None]:
# Initialize the analyzer with adaptive parsing
analyzer = ClauseMateAnalyzer(enable_adaptive_parsing=True)

# Check that the sample file exists
sample_file = "data/input/gotofiles/2.tsv"
if os.path.isfile(sample_file):
    print(f"📁 Analyzing file: {sample_file}")

    # Perform analysis
    relationships = analyzer.analyze_file(sample_file)

    print(f"🔍 Found {len(relationships)} clause mate relationships")

    # Display the first five relationships
    if relationships:
        df = pd.DataFrame([rel.to_dict() for rel in relationships[:5]])
        print("\n📊 Sample relationships:")
        display(df)
else:
    print("⚠️ Sample data not available in this environment")
    print("💡 In a full setup, you would see detailed pronoun analysis results here")

## Multi-File Processing

ClauseMate can process multiple files and resolve cross-chapter coreference chains:

In [None]:
# Initialize multi-file processor
processor = MultiFileBatchProcessor()

# Check that the data directory exists
data_dir = "data/input/gotofiles"
if os.path.isdir(data_dir):
    print(f"📁 Processing directory: {data_dir}")

    # Process all files
    result = processor.process_files(data_dir)

    print(f"📊 Total relationships: {len(result.unified_relationships)}")
    print(f"🔗 Cross-chapter chains: {len(result.cross_chapter_chains)}")
    print(f"📚 Chapters processed: {len(result.chapter_info)}")

    # Show the relationship count for each chapter
    for chapter, info in result.chapter_info.items():
        print(f"  📖 {chapter}: {info['relationships']} relationships")
else:
    print("⚠️ Data directory not available in this environment")
    print("💡 In a full setup, you would see unified multi-file analysis results here")

## Critical Pronouns Analyzed

ClauseMate focuses on these German pronoun types:

- **Third-person personal**: er, sie, es, ihm, ihr, ihn, ihnen
- **D-pronouns (pronominal)**: der, die, das, dem, den, deren, dessen, derer
- **Demonstrative**: dieser, diese, dieses, diesem, diesen
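
A surface-form lookup over these lists can be sketched as follows. The dictionary is copied from the bullet points above rather than from `src.config`, its key names are assumptions, and `classify_pronoun` is a hypothetical helper, so treat this as an illustration only. Forms such as "der" or "die" also occur as articles, so a lookup like this cannot decide by itself whether a token is used pronominally.

```python
# Forms copied from the list above; the key names are assumptions and may
# differ from the real CRITICAL_PRONOUNS in src.config.
PRONOUN_FORMS = {
    "third_person_personal": ["er", "sie", "es", "ihm", "ihr", "ihn", "ihnen"],
    "d_pronouns": ["der", "die", "das", "dem", "den", "deren", "dessen", "derer"],
    "demonstrative": ["dieser", "diese", "dieses", "diesem", "diesen"],
}


def classify_pronoun(token):
    """Return the category whose list contains the token, or None."""
    form = token.lower()
    for category, forms in PRONOUN_FORMS.items():
        if form in forms:
            return category
    return None


for token in ["Er", "dessen", "Diesem", "Haus"]:
    print(token, "->", classify_pronoun(token))
```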

In [None]:
# Show the pronoun configuration
from src.config import CRITICAL_PRONOUNS

print("🎯 Critical pronouns analyzed by ClauseMate:")
for category, pronouns in CRITICAL_PRONOUNS.items():
    print(f"\n📝 {category.replace('_', ' ').title()}:")
    print(f"   {', '.join(pronouns)}")

## Try It Yourself!

This notebook provides a starting point for exploring ClauseMate's capabilities. You can:

1. **Upload your own TSV files** in WebAnno TSV 3.3 format
2. **Modify analysis parameters** in the code above
3. **Explore the visualization scripts** in the `scripts/` directory
4. **Run the full test suite** to understand the codebase
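
Before uploading your own data, it can help to check that a file at least looks like WebAnno TSV. The sketch below is a minimal reader, not ClauseMate's parser. It assumes that header and sentence-text lines start with `#`, and that each token line is tab-separated with a `sentence-token` id, a character span, and the token text in its first three columns. `read_tokens` is a hypothetical helper, and the columns after the token are kept as raw strings because their meaning depends on the layers declared in the file header.

```python
import io


def read_tokens(lines):
    """Yield (sentence_no, token_id, token, annotations) from TSV lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank separators and '#'-prefixed header or '#Text=' lines.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) < 3:
            continue  # not a token line under the assumed layout
        # The id looks like "1-2": sentence number, then token number.
        sentence_no = int(columns[0].split("-", 1)[0])
        yield sentence_no, columns[0], columns[2], columns[3:]


# Small made-up sample, not taken from the project's data.
sample = io.StringIO(
    "#FORMAT=WebAnno TSV 3.3\n"
    "\n"
    "#Text=Er schläft.\n"
    "1-1\t0-2\tEr\t_\n"
    "1-2\t3-10\tschläft\t_\n"
    "1-3\t10-11\t.\t_\n"
)
tokens = list(read_tokens(sample))
for sentence_no, token_id, token, _ in tokens:
    print(sentence_no, token_id, token)
```

With a real file, pass `open(path, encoding="utf-8")` in place of the `StringIO` sample.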

For more information, see the [project documentation](../README.md).