# Sudoku AI Solver - CSP with MRV Heuristic

This notebook demonstrates an AI-powered Sudoku solver using:
- **Constraint Satisfaction Problem (CSP)** formulation
- **Backtracking** algorithm
- **Minimum Remaining Values (MRV)** heuristic
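
The three ideas listed above can be sketched in a few dozen lines. This is a minimal illustration, not the code inside `sudoku_solver.SudokuSolver`: the function names, the list-of-lists grid, and the use of `0` for an empty cell are all assumptions made for the example.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # 9x9 grid; 0 marks an empty cell (assumed convention)


def candidates(grid: Grid, r: int, c: int) -> List[int]:
    """Return the digits that satisfy the row, column and box constraints at (r, c)."""
    used = set(grid[r])                                   # digits in the same row
    used |= {grid[i][c] for i in range(9)}                # digits in the same column
    br, bc = 3 * (r // 3), 3 * (c // 3)                   # top-left corner of the 3x3 box
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def select_mrv_cell(grid: Grid) -> Optional[Tuple[int, int, List[int]]]:
    """MRV: pick the empty cell with the fewest candidates; None if the grid is full."""
    best = None
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                opts = candidates(grid, r, c)
                if best is None or len(opts) < len(best[2]):
                    best = (r, c, opts)
    return best


def solve(grid: Grid) -> bool:
    """Fill `grid` in place by backtracking; return True if a solution was found."""
    choice = select_mrv_cell(grid)
    if choice is None:
        return True                    # no empty cells left, so the grid is solved
    r, c, opts = choice
    for d in opts:                     # an empty `opts` list means a dead end
        grid[r][c] = d
        if solve(grid):
            return True
        grid[r][c] = 0                 # undo the assignment and try the next digit
    return False
```

Calling `solve(grid)` on a 9x9 list of lists fills the grid in place and returns whether a solution was found. Recomputing candidates for every empty cell at each step keeps the sketch short; a production solver would more likely maintain the candidate sets incrementally.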

---

## 1. Importing Required Modules

In [15]:
# Import custom modules
from sudoku_dataset import SudokuDataset
from sudoku_solver import SudokuSolver
from sudoku_validator import SudokuValidator
from sudoku_metrics import MetricsTracker
from sudoku_utils import print_grid, compare_grids, get_difficulty_label

# Standard libraries
import time
import numpy as np
import pandas as pd

print("All modules imported successfully!")

All modules imported successfully!


In [16]:
# Initialize dataset with sudoku_filtered.csv
dataset = SudokuDataset(csv_path='sudoku_filtered.csv')

# Keep 5000 puzzles with no difficulty filter; random_state fixes the selection
df_filtered = dataset.filter_dataset(n_samples=5000, difficulty_range=None, random_state=42)

print("\n Ready to solve puzzles!")

Loading dataset from sudoku_filtered.csv...
Loaded first 50000 puzzles (subset)
Total puzzles loaded: 5000
Columns: ['source', 'question', 'answer', 'rating']
Filtered to 5000 puzzles

Difficulty distribution:
  Min rating: 0
  Max rating: 313
  Average rating: 22.06

 Ready to solve puzzles!
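
The `question` and `answer` columns presumably store each grid as a flat string. The sketch below assumes an 81-character encoding with `.` or `0` for blanks, which is common in public Sudoku datasets but is not confirmed by this notebook. `parse_puzzle` is a hypothetical helper, not part of `SudokuDataset`.

```python
from typing import List


def parse_puzzle(text: str) -> List[List[int]]:
    """Convert an 81-character puzzle string into a 9x9 list of ints (0 = empty)."""
    if len(text) != 81:
        raise ValueError(f"expected 81 characters, got {len(text)}")
    # '.' and '0' both mark a blank cell; int() rejects any other non-digit character
    cells = [0 if ch in ".0" else int(ch) for ch in text]
    # Slice the flat list into nine rows of nine cells
    return [cells[i:i + 9] for i in range(0, 81, 9)]


# Made-up sample: one partially filled row followed by eight empty rows
sample = "53..7...." + "." * 72
grid = parse_puzzle(sample)
print(grid[0])
```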


## 2. Initialize AI Solver Components

In [4]:
# Initialize solver and validator
solver = SudokuSolver()
validator = SudokuValidator()
metrics_tracker = MetricsTracker()

print("AI Solver initialized with MRV heuristic")
print("Constraint validator ready")
print("Metrics tracker initialized")

AI Solver initialized with MRV heuristic
Constraint validator ready
Metrics tracker initialized


## 3. Solve a Single Puzzle


In [17]:
# Get a random puzzle from dataset
print("Selecting random puzzle...\n")
puzzle_data = dataset.get_random_puzzle()

# Extract puzzle components
puzzle_grid = puzzle_data['puzzle']
expected_solution = puzzle_data['solution']
rating = puzzle_data['rating']
source = puzzle_data['source']

# Display puzzle info
print(f"Puzzle Source: {source}")
print(f"Difficulty Rating: {rating} ({get_difficulty_label(rating)})")

# Display original puzzle with fancy style
print_grid(puzzle_grid, "ORIGINAL PUZZLE")

Selecting random puzzle...

Puzzle Source: 01_file1
Difficulty Rating: 27 (Hard)

ORIGINAL PUZZLE
┌─────────┬─────────┬─────────┐
│ .  .  . │ .  3  . │ .  .  . │
│ .  .  1 │ .  .  . │ 9  .  4 │
│ .  4  8 │ .  .  . │ .  3  . │
├─────────┼─────────┼─────────┤
│ .  3  5 │ 7  2  . │ .  .  . │
│ 4  .  . │ 9  .  . │ 7  .  . │
│ .  .  9 │ .  .  . │ .  .  2 │
├─────────┼─────────┼─────────┤
│ .  .  . │ 4  5  . │ .  .  3 │
│ 5  .  3 │ .  .  . │ .  7  . │
│ .  9  . │ .  .  7 │ 2  .  5 │
└─────────┴─────────┴─────────┘



In [18]:
# Solve the puzzle using the MRV heuristic with backtracking
print("AI Solver starting...")
print("Using: MRV Heuristic + Backtracking Algorithm\n")

# Record start time (perf_counter is a monotonic, high-resolution clock)
start_time = time.perf_counter()

# Solve!
solved_grid, success = solver.solve(puzzle_grid)

# Record end time
end_time = time.perf_counter()
execution_time = end_time - start_time

# Get solver metrics
solver_metrics = solver.get_metrics()

print(f"Solving completed in {execution_time:.4f} seconds")

AI Solver starting...
Using: MRV Heuristic + Backtracking Algorithm

Solving completed in 0.7026 seconds


In [19]:
# Display AI solution
if success:
    print_grid(solved_grid, "AI SOLUTION")
    
    # Verify solution
    is_valid, errors = validator.verify_solution(solved_grid)
    
    if is_valid:
        print("Solution verified: Valid Sudoku!")
    else:
        print("Solution contains errors:")
        for error in errors:
            print(f"  - {error}")
    
    # Check against expected solution
    solution_correct = (solved_grid == expected_solution)
    if solution_correct:
        print("Solution matches expected answer from dataset!")
    else:
        print("Solution differs from expected answer")
else:
    print("Failed to solve puzzle")
    solution_correct = False


AI SOLUTION
┌─────────┬─────────┬─────────┐
│ 9  7  2 │ 5  3  4 │ 8  1  6 │
│ 3  5  1 │ 6  7  8 │ 9  2  4 │
│ 6  4  8 │ 1  9  2 │ 5  3  7 │
├─────────┼─────────┼─────────┤
│ 8  3  5 │ 7  2  6 │ 1  4  9 │
│ 4  2  6 │ 9  1  3 │ 7  5  8 │
│ 7  1  9 │ 8  4  5 │ 3  6  2 │
├─────────┼─────────┼─────────┤
│ 2  8  7 │ 4  5  1 │ 6  9  3 │
│ 5  6  3 │ 2  8  9 │ 4  7  1 │
│ 1  9  4 │ 3  6  7 │ 2  8  5 │
└─────────┴─────────┴─────────┘

Solution verified: Valid Sudoku!
Solution matches expected answer from dataset!
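
The verification step above checks the Sudoku constraints independently of the dataset answer. A minimal sketch of such a check follows; `check_solution` is a hypothetical stand-in that mirrors the `(is_valid, errors)` return shape used above, not the actual `SudokuValidator.verify_solution` code.

```python
from typing import List, Tuple


def check_solution(grid: List[List[int]]) -> Tuple[bool, List[str]]:
    """Check that every row, column and 3x3 box contains the digits 1-9 exactly once."""
    expected = set(range(1, 10))
    errors: List[str] = []
    for i in range(9):
        # Nine cells whose set equals {1..9} must hold each digit exactly once
        if set(grid[i]) != expected:
            errors.append(f"Row {i + 1} does not contain 1-9 exactly once")
        if {grid[r][i] for r in range(9)} != expected:
            errors.append(f"Column {i + 1} does not contain 1-9 exactly once")
    for br in range(0, 9, 3):
        for bc in range(0, 9, 3):
            box = {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
            if box != expected:
                errors.append(
                    f"Box at row {br + 1}, column {bc + 1} does not contain 1-9 exactly once"
                )
    return (not errors, errors)
```

Collecting every violation rather than stopping at the first one makes a failed solve easier to debug.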


In [8]:
# Display comparison
if success:
    compare_grids(puzzle_grid, solved_grid, expected_solution)


PUZZLE COMPARISON

           ORIGINAL PUZZLE                             AI SOLUTION             
┌─────────┬─────────┬─────────┐     ┌─────────┬─────────┬─────────┐
│ .  9  . │ .  .  8 │ 2  .  5 │     │ 1  9  4 │ 6  3  8 │ 2  7  5 │
│ 3  .  . │ 2  .  . │ .  9  . │     │ 3  6  7 │ 2  5  4 │ 1  9  8 │
│ .  .  . │ .  .  . │ .  .  . │     │ 2  8  5 │ 7  9  1 │ 6  4  3 │
├─────────┼─────────┼─────────┤     ├─────────┼─────────┼─────────┤
│ .  4  . │ .  .  5 │ 8  .  6 │     │ 9  4  2 │ 3  7  5 │ 8  1  6 │
│ .  .  . │ 8  6  . │ .  .  . │     │ 7  1  3 │ 8  6  2 │ 4  5  9 │
│ .  .  6 │ .  1  . │ .  2  . │     │ 8  5  6 │ 4  1  9 │ 3  2  7 │
├─────────┼─────────┼─────────┤     ├─────────┼─────────┼─────────┤
│ 5  .  9 │ .  8  . │ 7  .  4 │     │ 5  2  9 │ 1  8  6 │ 7  3  4 │
│ .  7  . │ .  .  . │ .  .  . │     │ 6  7  1 │ 9  4  3 │ 5  8  2 │
│ 4  .  8 │ .  .  7 │ 9  .  . │     │ 4  3  8 │ 5  2  7 │ 9  6  1 │
└─────────┴─────────┴─────────┘     └─────────┴─────────┴─────────┘

EXPECTED SOLUTI

In [9]:
# Track metrics
result = metrics_tracker.track_solve(
    puzzle_data, 
    solver_metrics, 
    execution_time, 
    solution_correct, 
    solved_grid if success else None
)

# Print detailed report
metrics_tracker.print_solve_report(result)


SOLVE REPORT
Puzzle Source: puzzles4_forum_hardest_1905
Difficulty Rating: 12

Performance Metrics:
  • Execution Time: 0.1654 seconds
  • Max Recursion Depth: 56
  • Backtrack Count: 2122

Result:
  • Solved: ✓ Yes
  • Solution Correct: ✓ Yes



## 4. Batch Testing (Multiple Puzzles)

Test AI solver on multiple puzzles to analyze performance across different difficulties.

In [10]:
# Configure batch test
NUM_PUZZLES_TO_TEST = 100  # Change this number as needed

print(f"Starting batch test on {NUM_PUZZLES_TO_TEST} puzzles...\n")
print("This may take a few minutes...\n")

# Reset metrics tracker
metrics_tracker = MetricsTracker()

# Test loop
for i in range(NUM_PUZZLES_TO_TEST):
    # Get puzzle
    puzzle_data = dataset.get_puzzle_by_index(i)
    puzzle_grid = puzzle_data['puzzle']
    expected_solution = puzzle_data['solution']
    
    # Solve (perf_counter is a monotonic, high-resolution clock)
    start_time = time.perf_counter()
    solved_grid, success = solver.solve(puzzle_grid)
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    
    # Get metrics
    solver_metrics = solver.get_metrics()
    
    # Verify solution
    solution_correct = False
    if success:
        solution_correct = (solved_grid == expected_solution)
    
    # Track
    metrics_tracker.track_solve(
        puzzle_data, 
        solver_metrics, 
        execution_time, 
        solution_correct, 
        solved_grid if success else None
    )
    
    # Progress indicator
    if (i + 1) % 10 == 0:
        print(f"Progress: {i + 1}/{NUM_PUZZLES_TO_TEST} puzzles completed")

print("\n Batch testing completed!")

Starting batch test on 100 puzzles...

This may take a few minutes...

Progress: 10/100 puzzles completed
Progress: 20/100 puzzles completed
Progress: 30/100 puzzles completed
Progress: 40/100 puzzles completed
Progress: 50/100 puzzles completed
Progress: 60/100 puzzles completed
Progress: 70/100 puzzles completed
Progress: 80/100 puzzles completed
Progress: 90/100 puzzles completed
Progress: 100/100 puzzles completed

 Batch testing completed!


In [11]:
# Display summary statistics
metrics_tracker.print_summary()


SUMMARY STATISTICS
Total Puzzles Attempted: 100
Puzzles Solved: 100
Solve Rate: 100.00%

Timing Statistics:
  • Average Time: 0.5302 seconds
  • Fastest Solve: 0.0015 seconds
  • Slowest Solve: 4.1108 seconds

Algorithm Statistics:
  • Avg Recursion Depth: 57.03
  • Avg Backtracks: 7149.41
  • Correct Solutions: 100



In [12]:
# Export results to CSV
metrics_tracker.export_results('solver_results.csv')
print("Detailed results saved to 'solver_results.csv'")

Results exported to solver_results.csv
Detailed results saved to 'solver_results.csv'


## 5. Performance Analysis

Analyze solver performance across different difficulty levels.

In [13]:
# Load results for analysis
results_df = pd.DataFrame(metrics_tracker.results)

# Display sample results
print("Sample Results:")
print(results_df.head(10))

Sample Results:
   puzzle_rating                puzzle_source  execution_time  \
0             42  puzzles4_forum_hardest_1905        0.567933   
1             53  puzzles4_forum_hardest_1905        1.665484   
2             26  puzzles4_forum_hardest_1905        0.126303   
3              3            puzzles1_unbiased        0.022599   
4             23  puzzles4_forum_hardest_1905        0.125375   
5             43  puzzles4_forum_hardest_1905        0.486522   
6             37  puzzles4_forum_hardest_1905        0.034509   
7              0            puzzles1_unbiased        0.095904   
8             33                     01_file1        0.076239   
9             48  puzzles4_forum_hardest_1905        1.495202   

   max_recursion_depth  backtrack_count  solution_correct  solved  
0                   59             8021              True    True  
1                   57            23424              True    True  
2                   59             1816              True    Tru

In [14]:
# Analysis by difficulty
print("\nPERFORMANCE BY DIFFICULTY RATING\n")
print("="*60)

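# Group by difficulty ranges.
# pd.cut builds right-closed intervals here: (0, 5], (5, 20], (20, 40], (40, 100].
# Ratings of exactly 0 and ratings above 100 fall outside every bin and become NaN,
# so those puzzles are left out of the grouped table.
results_df['difficulty_category'] = pd.cut(
    results_df['puzzle_rating'], 
    bins=[0, 5, 20, 40, 100],
    labels=['Easy (1-5)', 'Medium (6-20)', 'Hard (21-40)', 'Expert (40+)']
)

# observed=False is passed explicitly; newer pandas versions warn when it is
# omitted for categorical group keys
difficulty_analysis = results_df.groupby('difficulty_category', observed=False).agg({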
    'execution_time': ['mean', 'min', 'max'],
    'max_recursion_depth': 'mean',
    'backtrack_count': 'mean',
    'solved': 'sum'
}).round(4)

print(difficulty_analysis)


PERFORMANCE BY DIFFICULTY RATING

                    execution_time                 max_recursion_depth  \
                              mean     min     max                mean   
difficulty_category                                                      
Easy (1-5)                  0.0227  0.0100  0.0501             55.9091   
Medium (6-20)               0.7519  0.0092  4.1108             57.3214   
Hard (21-40)                0.3615  0.0140  1.1837             57.1818   
Expert (40+)                0.9179  0.0395  2.7471             57.3043   

                    backtrack_count solved  
                               mean    sum  
difficulty_category                         
Easy (1-5)                 214.5455     11  
Medium (6-20)             9947.7500     28  
Hard (21-40)              5156.2273     22  
Expert (40+)             12441.6087     23  
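
The `solved` column in the table above sums to 84, not 100. With its default settings `pd.cut` excludes the lowest bin edge and anything beyond the highest one, so puzzles rated exactly 0 or above 100 are assigned no category. One possible way to cover the full rating range is sketched below, using made-up ratings; the adjusted labels are a suggestion, not the notebook's original categories.

```python
import numpy as np
import pandas as pd

# Made-up ratings that include the edge cases: 0, bin boundaries, and a value above 100
ratings = pd.Series([0, 3, 5, 6, 20, 27, 40, 41, 313], name="puzzle_rating")

categories = pd.cut(
    ratings,
    bins=[0, 5, 20, 40, np.inf],     # open-ended top bin instead of a cap at 100
    labels=["Easy (0-5)", "Medium (6-20)", "Hard (21-40)", "Expert (41+)"],
    include_lowest=True,             # place a rating of exactly 0 in the first bin
)

# Every rating now lands in a category, so no rows are dropped by a later groupby
print(categories.isna().sum())
print(categories.value_counts(sort=False))
```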




## 6. Conclusion

### Key Findings:
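1. **MRV heuristic** picks the cell with the fewest legal digits first, which prunes the search early. This notebook does not run a baseline without MRV, so the size of the reduction is not measured here.
2. **Backtracking** recovered from every dead end: all 100 batch puzzles were solved correctly, with an average of 7149.41 backtracks and 0.5302 seconds per puzzle.
3. **Performance varies** with puzzle difficulty, but not monotonically: Easy puzzles averaged 0.0227 seconds and Expert puzzles 0.9179 seconds, yet Medium puzzles (0.7519 seconds) were slower on average than Hard ones (0.3615 seconds).

### Algorithm Benefits:
- ✅ Complete: Always finds a solution if one exists
- ✅ Informed: Uses MRV for variable selection instead of a fixed cell order
- ✅ Efficient: MRV tends to expose dead ends sooner, so fewer branches are explored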

**Student:** Aditya Karki  
**Course:** Artificial Intelligence  
**Implementation:** CSP + Backtracking + MRV Heuristic