# Getting Started with DataDojo

This notebook introduces the basics of DataDojo, an AI-powered framework for learning data preparation.

## What is DataDojo?

DataDojo is an educational framework that helps you learn data preprocessing through hands-on practice with real datasets. It provides:

- **Guided Learning**: Step-by-step guidance for data preparation tasks
- **Educational Content**: Explanations, analogies, and examples for concepts
- **Progress Tracking**: Track your learning journey across projects
- **Domain-Specific Projects**: Learn with datasets from e-commerce, healthcare, finance, and more

## Installation

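First, make sure DataDojo is installed. Run this from the directory that contains the project's packaging files, since `-e .` installs the current directory in editable mode: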

```bash
pip install -e .
```
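
To confirm the install worked before running the rest of the notebook, a quick check like the one below can help. It is a minimal sketch using only the standard library; the helper name `is_installed` is illustrative and not part of DataDojo.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    # find_spec returns None when a top-level module cannot be located.
    return importlib.util.find_spec(package_name) is not None


if is_installed("datadojo"):
    print("datadojo is importable")
else:
    print("datadojo not found; run `pip install -e .` in the project directory")
```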

## Creating Your First Dojo

In [None]:
from datadojo import create_dojo

# Create a DataDojo instance
dojo = create_dojo()

print("DataDojo initialized successfully!")

## Exploring Available Projects

DataDojo comes with pre-configured projects across different domains and difficulty levels.

In [None]:
# List all available projects
projects = dojo.list_projects()

print(f"Found {len(projects)} available projects:\n")

for project in projects:
    print(f"📊 {project.name}")
    print(f"   Domain: {project.domain.value}")
    print(f"   Difficulty: {project.difficulty.value}")
    print(f"   {project.description}")
    print()

## Filtering Projects

You can filter projects by domain, by difficulty, or by both at once.

In [None]:
from datadojo.contracts.dojo_interface import Domain, Difficulty

# Get beginner-level projects
beginner_projects = dojo.list_projects(difficulty=Difficulty.BEGINNER)

print(f"Beginner Projects ({len(beginner_projects)}):")
for proj in beginner_projects:
    print(f"  - {proj.name}")
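
A project listing can also be grouped on the client side once it has been fetched. The sketch below uses stand-in `Project`, `Domain`, and `Difficulty` types so that it runs without DataDojo installed. They mirror only the attributes used in this notebook (`name`, `domain.value`, `difficulty.value`); the enum values and the sample project names are assumptions made up for illustration, not the library's definitions.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


# Stand-ins for illustration only; the real classes live in datadojo.
class Domain(Enum):
    ECOMMERCE = "ecommerce"
    HEALTHCARE = "healthcare"


class Difficulty(Enum):
    BEGINNER = "beginner"
    INTERMEDIATE = "intermediate"


@dataclass
class Project:
    name: str
    domain: Domain
    difficulty: Difficulty


def group_by_domain(projects):
    """Map each domain value to the names of its projects, in input order."""
    grouped = defaultdict(list)
    for proj in projects:
        grouped[proj.domain.value].append(proj.name)
    return dict(grouped)


# Hypothetical sample data, not real DataDojo projects.
sample = [
    Project("Sales Cleanup", Domain.ECOMMERCE, Difficulty.BEGINNER),
    Project("Patient Records", Domain.HEALTHCARE, Difficulty.INTERMEDIATE),
    Project("Cart Abandonment", Domain.ECOMMERCE, Difficulty.INTERMEDIATE),
]

for domain, names in group_by_domain(sample).items():
    print(f"{domain}: {', '.join(names)}")
```

With a real dojo, the same function could presumably be applied to the result of `dojo.list_projects()`, provided the returned objects expose `name` and `domain.value` as they do in the cells above.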

## Starting a Project

Let's start with a beginner project to learn the basics.

In [None]:
# Get a specific project
project_list = dojo.list_projects(domain=Domain.ECOMMERCE, difficulty=Difficulty.BEGINNER)

if project_list:
    project = dojo.start_project(project_list[0].id)
    
    print(f"Started Project: {project.name}")
    print(f"\nDescription: {project.description}")
    print("\nExpected Outcomes:")
    for outcome in project.expected_outcomes:
        print(f"  ✓ {outcome}")
else:
    print("No beginner e-commerce projects found")

## Understanding Educational Concepts

DataDojo provides detailed explanations for data science concepts.

In [None]:
# Get educational interface
educational = dojo.get_educational_interface()

# Learn about missing values
concept = educational.get_concept_explanation("missing_values")

print(f"📚 {concept.title}\n")
print(f"Explanation:\n{concept.explanation}\n")

if concept.analogies:
    print("Analogies:")
    for analogy in concept.analogies:
        print(f"  💡 {analogy}")

if concept.examples:
    print(f"\nExample Code:\n{concept.examples[0]}")

## Getting Help and Hints

When you're stuck, DataDojo can provide context-aware hints.

In [None]:
from datadojo.contracts.dojo_interface import DifficultyLevel

# Ask for help with missing values
hint = educational.get_help(
    "I have missing values in my age column",
    difficulty=DifficultyLevel.BEGINNER
)

print(f"💡 Hint: {hint}")
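
For context on the question asked in this cell, one common beginner approach to missing values in a numeric column is median imputation. The sketch below is plain Python and independent of DataDojo; the function name `fill_missing_with_median` and the sample ages are made up for illustration, and the hint DataDojo returns may well recommend something different.

```python
from statistics import median


def fill_missing_with_median(values):
    """Replace None entries with the median of the non-missing values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: every value is missing")
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical age column with two missing entries.
ages = [34, None, 29, 41, None, 38]
print(fill_missing_with_median(ages))
```

The median is often preferred over the mean for columns like age because it is less sensitive to outliers, though the right choice depends on why the values are missing.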

## Listing Available Concepts

Browse all educational concepts by difficulty level.

In [None]:
# List all beginner concepts
concepts = educational.list_concepts(difficulty=DifficultyLevel.BEGINNER)

print(f"Beginner Concepts ({len(concepts)}):")
for concept in concepts:
    summary = concept.get_summary(max_length=80)
    print(f"\n📖 {concept.title}")
    print(f"   {summary}")

## Next Steps

Now that you understand the basics, explore these notebooks:

1. **02_data_cleaning_workflow.ipynb** - Learn data cleaning with pipelines
2. **03_progress_tracking.ipynb** - Track your learning progress
3. **04_custom_pipelines.ipynb** - Build custom data processing pipelines

Happy learning! 🎓