# Agile Education Analyzer - Quick Start Guide

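This notebook demonstrates basic usage of the Agile Education Analysis Framework for analyzing Ukrainian educational transcripts: discourse pattern detection, segment analysis, statistical testing, visualization, and research outputs.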

## Prerequisites

Install the package in editable mode from the project's root directory:
```bash
pip install -e .
```

In [None]:
# Import necessary modules
from agile_education_analyzer import (
    UkrainianDiscourseDetector,
    StatisticalAnalyzer,
    ResearchVisualizer,
    ResearchOutputGenerator,
    setup_logger
)
from agile_education_analyzer.data_structures import TranscriptSegment
from datetime import timedelta
import pandas as pd

# Set up logging
logger = setup_logger(level='INFO')

## 1. Ukrainian Discourse Pattern Detection

Detect questions, confusion, understanding confirmations, and code-switching in Ukrainian text.

In [None]:
# Initialize detector
detector = UkrainianDiscourseDetector()

# Sample Ukrainian texts
texts = [
    "Що таке спринт?",
    "Не розумію як це працює",
    "Зрозуміло, дякую",
    "Використайте function для створення component"
]

# Detect patterns
for text in texts:
    print(f"\nText: {text}")
    print(f"Question: {detector.detect_questions(text)}")
    print(f"Confusion: {detector.detect_confusion(text)}")
    print(f"Understanding: {detector.detect_understanding(text)}")
    print(f"Code-switching: {detector.detect_code_switching(text)}")
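
The detector's internals are not shown in this notebook. As a rough illustration of how this kind of pattern detection can work, the following standalone sketch uses regular expressions and keyword lists. The marker lists and function names (`looks_like_question`, `shows_confusion`, `shows_understanding`, `latin_tokens`, `has_code_switching`) are hypothetical and are not the package's actual rules.

```python
import re

# Hypothetical marker lists for illustration only; the package's rules may differ.
QUESTION_WORDS = ("що", "як", "чому", "коли", "де", "хто", "навіщо", "скільки")
CONFUSION_MARKERS = ("не розумію", "незрозуміло", "не зрозуміло")
UNDERSTANDING_MARKERS = ("зрозуміло", "ясно", "дякую")


def looks_like_question(text: str) -> bool:
    """True if the text ends with '?' or starts with a question word."""
    stripped = text.strip().lower()
    if stripped.endswith("?"):
        return True
    first_word = re.split(r"\W+", stripped, maxsplit=1)[0]
    return first_word in QUESTION_WORDS


def shows_confusion(text: str) -> bool:
    """True if any confusion marker occurs in the text."""
    lowered = text.lower()
    return any(marker in lowered for marker in CONFUSION_MARKERS)


def shows_understanding(text: str) -> bool:
    """True if an understanding marker occurs and no confusion marker does."""
    lowered = text.lower()
    return not shows_confusion(text) and any(
        marker in lowered for marker in UNDERSTANDING_MARKERS
    )


def latin_tokens(text: str) -> list:
    """Return all Latin-script words found in the text."""
    return re.findall(r"[A-Za-z]+", text)


def has_code_switching(text: str) -> bool:
    """True if the text mixes Cyrillic letters with Latin-script words."""
    has_cyrillic = re.search(r"[А-Яа-яІіЇїЄєҐґ]", text) is not None
    return has_cyrillic and bool(latin_tokens(text))


sample = "Використайте function для створення component"
print(has_code_switching(sample), latin_tokens(sample))
```

Checking confusion before understanding matters here because "незрозуміло" contains "зрозуміло" as a substring; a production detector would more likely rely on tokenization than on substring matching.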

## 2. Analyze Transcript Segments

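`analyze_segment` runs the pattern detectors on a `TranscriptSegment` and returns a segment annotated with attributes such as `is_question`, `is_confusion`, and `technical_terms`.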

In [None]:
# Create sample segments
segments = [
    TranscriptSegment(
        index=0,
        start_time=timedelta(seconds=0),
        end_time=timedelta(seconds=5),
        text="Що таке гнучка розробка?",
        speaker="Student_1",
        speaker_role="student"
    ),
    TranscriptSegment(
        index=1,
        start_time=timedelta(seconds=6),
        end_time=timedelta(seconds=12),
        text="Пояснюю: agile це методологія розробки",
        speaker="Teacher",
        speaker_role="teacher"
    )
]

# Analyze each segment
for segment in segments:
    analyzed = detector.analyze_segment(segment)
    print(f"\n{analyzed.speaker}: {analyzed.text}")
    print(f"  Is Question: {analyzed.is_question}")
    print(f"  Is Confusion: {analyzed.is_confusion}")
    print(f"  Technical Terms: {analyzed.technical_terms}")
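
The attributes used above suggest roughly the following shape for a segment. `SegmentSketch` is a hypothetical stand-in reconstructed only from the fields that appear in this notebook; the real `TranscriptSegment` may define more fields, different defaults, or no `duration` helper at all.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import List


@dataclass
class SegmentSketch:
    """Hypothetical stand-in for TranscriptSegment (fields taken from this notebook)."""
    index: int
    start_time: timedelta
    end_time: timedelta
    text: str
    speaker: str
    speaker_role: str
    # Annotation fields that the analysis step is assumed to fill in
    is_question: bool = False
    is_confusion: bool = False
    technical_terms: List[str] = field(default_factory=list)

    @property
    def duration(self) -> timedelta:
        """Length of the segment (an illustrative helper, not a confirmed API)."""
        return self.end_time - self.start_time


seg = SegmentSketch(
    index=0,
    start_time=timedelta(seconds=0),
    end_time=timedelta(seconds=5),
    text="Що таке гнучка розробка?",
    speaker="Student_1",
    speaker_role="student",
)
print(seg.duration.total_seconds())
```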

## 3. Statistical Analysis

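Compare a metric across sprints with a Kruskal-Wallis test, followed by pairwise comparisons that report p-values and Cohen's d effect sizes. Bonferroni correction for multiple comparisons is enabled with `use_bonferroni=True`.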

In [None]:
# Initialize statistical analyzer
stats = StatisticalAnalyzer(significance_level=0.05, use_bonferroni=True)

# Sample data: participation rates across sprints
data = pd.DataFrame({
    'sprint_number': [1, 1, 1, 2, 2, 2, 3, 3, 3],
    'participation_rate': [0.45, 0.52, 0.48, 0.61, 0.58, 0.63, 0.72, 0.68, 0.75]
})

# Compare across sprints
results = stats.compare_sprints(data, 'participation_rate')
print("\nKruskal-Wallis Test Results:")
print(results['kruskal_wallis'])

if 'pairwise_comparisons' in results:
    print("\nPairwise Comparisons:")
    for comparison in results['pairwise_comparisons']:
        print(f"  Sprint {comparison['sprint_pair'][0]} vs {comparison['sprint_pair'][1]}: "
              f"p={comparison['p_value']:.4f}, d={comparison['cohens_d']:.3f} "
              f"({comparison['effect_interpretation']})")
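
What `compare_sprints` computes internally is not shown in this notebook. For orientation, the sketch below works through the textbook versions of the same quantities on the sample data using only the standard library: the Kruskal-Wallis H statistic (without tie correction), Cohen's d with a pooled standard deviation, and a Bonferroni-adjusted threshold. All function names are hypothetical; in practice a library routine such as `scipy.stats.kruskal` would normally be used.

```python
import math
from itertools import combinations
from statistics import mean, variance


def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def cohens_d(a, b):
    """Cohen's d (b minus a) using the pooled sample standard deviation."""
    pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
        len(a) + len(b) - 2
    )
    return (mean(b) - mean(a)) / math.sqrt(pooled_var)


sprints = {1: [0.45, 0.52, 0.48], 2: [0.61, 0.58, 0.63], 3: [0.72, 0.68, 0.75]}
h_stat = kruskal_wallis_h(list(sprints.values()))
# With 3 groups, H is referred to a chi-square distribution with 2 degrees of
# freedom, whose survival function has the closed form exp(-H / 2).
p_value = math.exp(-h_stat / 2)

pairs = list(combinations(sprints, 2))
bonferroni_alpha = 0.05 / len(pairs)  # 3 pairwise tests share the 0.05 level
print(f"H={h_stat:.2f}, p={p_value:.4f}, per-pair alpha={bonferroni_alpha:.4f}")
for a, b in pairs:
    print(f"Sprint {a} vs {b}: d={cohens_d(sprints[a], sprints[b]):.2f}")
```

The chi-square p-value is an asymptotic approximation, so with only three observations per sprint it should be read as indicative rather than exact.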

## 4. Visualization

Create publication-ready visualizations with Ukrainian text support.

In [None]:
# Initialize visualizer
viz = ResearchVisualizer(dpi=150)  # Lower DPI for notebook display

# Create sample engagement data
engagement_data = pd.DataFrame({
    'sprint_number': [1, 2, 3],
    'avg_questions_per_student': [2.1, 3.4, 4.2],
    'participation_rate': [0.48, 0.61, 0.72],
    'confusion_rate': [0.15, 0.10, 0.05],
    'understanding_rate': [0.65, 0.75, 0.85]
})

# Plot engagement evolution
viz.plot_engagement_evolution(engagement_data, 'engagement_evolution.png')
print("\nEngagement evolution plot saved!")

## 5. Research Outputs

Generate LaTeX tables and extract quotations for academic papers.

In [None]:
# Initialize output generator
output_gen = ResearchOutputGenerator()

# Generate LaTeX table from statistical results
latex_table = output_gen.generate_statistical_test_table(results)
print("\nLaTeX Table for Statistical Results:")
print(latex_table)

# Extract quotations
quotations = output_gen.extract_quotations(segments, criteria='questions', max_quotes=5)
print("\nExtracted Quotations:")
for i, quote in enumerate(quotations, 1):
    print(f"\n{i}. {output_gen.format_quotation_latex(quote)}")
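
When transcript text is placed into LaTeX, characters such as `_`, `%`, and `&` need escaping (speaker labels like `Student_1` are a common case). The sketch below shows one way to do that in a single pass and wrap the result in a `quote` environment. `escape_latex` and `format_quote_sketch` are hypothetical helpers, and the output layout is an assumption; `format_quotation_latex` may produce something different.

```python
import re

# LaTeX special characters and their escaped forms
_LATEX_ESCAPES = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}
_ESCAPE_PATTERN = re.compile("|".join(re.escape(ch) for ch in _LATEX_ESCAPES))


def escape_latex(text: str) -> str:
    """Escape LaTeX special characters in one pass over the text."""
    return _ESCAPE_PATTERN.sub(lambda m: _LATEX_ESCAPES[m.group()], text)


def format_quote_sketch(text: str, speaker: str) -> str:
    """Hypothetical formatter: wrap a quotation in a LaTeX quote environment."""
    return (
        "\\begin{quote}\n"
        f"``{escape_latex(text)}'' ({escape_latex(speaker)})\n"
        "\\end{quote}"
    )


print(format_quote_sketch("Що таке гнучка розробка?", "Student_1"))
```

Escaping in a single regex pass avoids re-escaping the braces that the backslash replacement itself introduces, which is what happens with a chain of `str.replace` calls.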

## 6. Complete Analysis Workflow

To analyze actual VTT files, a typical workflow looks like this:

```python
from agile_education_analyzer.vtt_processor import VTTProcessor
from agile_education_analyzer.speaker_diarization import SpeakerDiarization

# Initialize processors
vtt_processor = VTTProcessor()
speaker_diarizer = SpeakerDiarization()

# Parse VTT file
segments = vtt_processor.parse_vtt_file('transcripts/Web2.П01.Вступ до гнучкої розробки.vtt')

# Identify speakers
segments = speaker_diarizer.identify_speakers(segments)

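# Analyze discourse patterns, keeping the annotated segments
# (reuses the `detector` created in section 1)
analyzed_segments = [detector.analyze_segment(segment) for segment in segments]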

# Generate complete analysis and visualizations
# ... (continue with statistical analysis, visualization, etc.)
```
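
For readers unfamiliar with the format, a WebVTT file is a `WEBVTT` header followed by cues separated by blank lines, each with a timing line and one or more text lines. The sketch below parses that structure with the standard library. `parse_vtt_sketch` is a hypothetical helper, not `VTTProcessor`, and the `Speaker: text` prefix convention is an assumption about how the transcripts label speakers.

```python
import re
from datetime import timedelta

# Matches cue timing lines such as "00:00:01.000 --> 00:00:05.500" (hours optional)
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _to_timedelta(hours, minutes, seconds, millis):
    """Build a timedelta from the string groups of a timestamp match."""
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes),
        seconds=int(seconds),
        milliseconds=int(millis),
    )


def parse_vtt_sketch(content: str):
    """Parse WebVTT text into dicts with start, end, speaker, and text."""
    cues = []
    for block in re.split(r"\n\s*\n", content.strip()):  # cues are blank-line separated
        lines = block.strip().splitlines()
        timing_idx = next(
            (i for i, line in enumerate(lines) if _TIMING.search(line)), None
        )
        if timing_idx is None:
            continue  # skip the WEBVTT header and any block without a timing line
        groups = _TIMING.search(lines[timing_idx]).groups()
        text = " ".join(lines[timing_idx + 1:]).strip()
        # Assumed convention: "Speaker Name: utterance"
        speaker, sep, rest = text.partition(": ")
        if sep:
            text = rest
        else:
            speaker = None
        cues.append({
            "start": _to_timedelta(*groups[:4]),
            "end": _to_timedelta(*groups[4:]),
            "speaker": speaker,
            "text": text,
        })
    return cues


SAMPLE = """WEBVTT

1
00:00:00.000 --> 00:00:05.000
Student_1: Що таке гнучка розробка?

2
00:00:06.000 --> 00:00:12.000
Teacher: Agile це методологія розробки
"""

for cue in parse_vtt_sketch(SAMPLE):
    print(cue["start"], cue["speaker"], cue["text"])
```

Splitting on the first `": "` is a deliberate simplification: an unlabeled utterance that happens to contain a colon would be misread as having a speaker, which is one reason a dedicated speaker-identification step is useful.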

## Next Steps

1. **Explore the full API** in `USAGE_GUIDE.md`
2. **Read CLAUDE.md** for detailed guidance on working with this framework
3. **Check the tests** in the `tests/` directory for more usage examples
4. **Examine `run_analysis_example.py`** for complete analysis pipelines

For questions or issues, refer to the documentation or create an issue on GitHub.