# Database Rebuild - Eliminate Fragmentation

## Purpose
This notebook creates a fresh copy of the database by copying all data row-by-row into a new database file. This eliminates:
- **Fragmentation**: Empty space left behind by deleted records
- **Inefficiencies**: Overhead from multiple deletions and updates
- **Storage bloat**: Unused space that isn't reclaimed by VACUUM

## Why This Approach?
DuckDB's `VACUUM` command is provided mainly for PostgreSQL compatibility and, per the DuckDB documentation, does not reclaim space from deleted rows. In practice we've seen cases where:
1. The database file size increases after deletions instead of decreasing
2. VACUUM doesn't fully reclaim fragmented space
3. Multiple deletion operations leave behind structural inefficiencies

**This brute-force approach** creates a completely fresh database with:
- Clean table structures with no fragmentation
- Optimal page layout and storage efficiency
- Contiguous internal row storage with no gaps (the `id` columns keep their original values)
- Minimal file size for the actual data

## Process Overview
1. **Backup Check**: Verify the original database exists and log its statistics
2. **Schema Creation**: Create a new database with the proper schema using `setup_database()`
3. **Copy Players**: Copy all player records with their original IDs
4. **Copy Openings**: Copy all opening records with their original IDs
5. **Copy Stats**: Copy all player_opening_stats records to appropriate partitions
6. **Verification**: Confirm all data was copied correctly
7. **Comparison**: Show before/after file sizes and statistics

## Important Notes
- This creates a NEW database file; it does NOT modify the original
- The original database remains untouched as a backup
- You'll need to manually swap the files when you're satisfied with the rebuild
- This can be run whenever you need to defragment the database
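
The manual swap mentioned above could be scripted roughly as follows. This is a minimal sketch using only the standard library; the helper name `swap_in_rebuilt` and the `.backup` suffix are assumptions for illustration and are not part of the project.

```python
import os
from pathlib import Path


def swap_in_rebuilt(original: Path, rebuilt: Path, suffix: str = ".backup") -> Path:
    """Move `original` aside as a backup, then move `rebuilt` into its place.

    Hypothetical helper: returns the backup path. Both files should live on
    the same filesystem so that each rename is a single step.
    """
    if not rebuilt.exists():
        raise FileNotFoundError(f"Rebuilt database not found: {rebuilt}")
    backup = original.with_name(original.name + suffix)
    if backup.exists():
        raise FileExistsError(f"Refusing to overwrite existing backup: {backup}")
    if original.exists():
        os.replace(original, backup)  # keep the old file as a backup
    os.replace(rebuilt, original)     # promote the rebuilt file
    return backup
```

Close every connection to both files before swapping, and only swap after the verification step has passed.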

## Better Way?
There *should* be a better way using DuckDB's built-in maintenance commands, but after trying various approaches (VACUUM, CHECKPOINT, ANALYZE), this brute-force copy is the only reliable method we've found to truly reclaim space and eliminate fragmentation.
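
One built-in route that may be worth testing is DuckDB's `COPY FROM DATABASE` statement, which copies one attached database into another in a single step. The sketch below only assembles and runs the SQL. It assumes a DuckDB version that supports `COPY FROM DATABASE`; the aliases `src_db` and `dst_db` and both helper names are made up for illustration. Whether this yields a smaller file than the row-by-row rebuild for this dataset has not been verified here.

```python
from pathlib import Path


def copy_database_statements(source: Path, dest: Path) -> list[str]:
    """Build the SQL for a whole-database copy (aliases are illustrative)."""

    def quote(path: Path) -> str:
        # Escape single quotes so the path is a valid SQL string literal.
        return "'" + str(path).replace("'", "''") + "'"

    return [
        f"ATTACH {quote(source)} AS src_db (READ_ONLY)",
        f"ATTACH {quote(dest)} AS dst_db",
        "COPY FROM DATABASE src_db TO dst_db",
        "DETACH src_db",
        "DETACH dst_db",
    ]


def run_statements(con, statements: list[str]) -> None:
    """Execute each statement on any connection object exposing .execute()."""
    for sql in statements:
        con.execute(sql)
```

With the `duckdb` package installed, `run_statements(duckdb.connect(), copy_database_statements(original_db_path, new_db_path))` would run the copy from an in-memory session. The destination file should be new or empty so the copied tables do not collide with existing ones.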

In [1]:
# Configuration and setup
import os
from pathlib import Path
from datetime import datetime
from utils.database.db_utils import get_db_connection, setup_database

# Define paths
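# Step up one level only when running from the notebooks/ directory itself;
# a substring test would also match unrelated paths that contain "notebooks"
project_root = Path.cwd().parent if Path.cwd().name == "notebooks" else Path.cwd()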
original_db_path = project_root / "data" / "processed" / "chess_games.db"
new_db_path = project_root / "data" / "processed" / "chess_games_rebuilt.db"

print("=" * 80)
print("DATABASE REBUILD CONFIGURATION")
print("=" * 80)
print(f"\nOriginal database: {original_db_path}")
print(f"New database: {new_db_path}")
print(f"\nOriginal database exists: {original_db_path.exists()}")
print(f"New database exists: {new_db_path.exists()}")

if new_db_path.exists():
    print(f"\n⚠️  WARNING: New database file already exists!")
    print(f"   If you continue, it will be DELETED and recreated from scratch.")
    print(f"   To keep it, stop here and do not run the schema-creation cell below.")
else:
    print(f"\n✓ New database path is clear - ready to create fresh database")

DATABASE REBUILD CONFIGURATION

Original database: /Users/a/Documents/personalprojects/chess-opening-recommender/data/processed/chess_games.db
New database: /Users/a/Documents/personalprojects/chess-opening-recommender/data/processed/chess_games_rebuilt.db

Original database exists: True
New database exists: False

✓ New database path is clear - ready to create fresh database


In [2]:
# Log original database statistics
if not original_db_path.exists():
    raise FileNotFoundError(f"Original database not found at {original_db_path}")

original_size_bytes = os.path.getsize(original_db_path)
original_size_mb = original_size_bytes / (1024 * 1024)
original_size_gb = original_size_mb / 1024

print("=" * 80)
print("ORIGINAL DATABASE STATISTICS")
print("=" * 80)

print(f"\n--- File Size ---")
print(f"Size: {original_size_mb:,.1f} MB ({original_size_gb:.2f} GB)")
print(f"Raw bytes: {original_size_bytes:,}")

with get_db_connection(original_db_path) as con:
    # Core table counts
    print(f"\n--- Record Counts ---")
    player_count = con.execute('SELECT COUNT(*) FROM player').fetchone()[0]
    opening_count = con.execute('SELECT COUNT(*) FROM opening').fetchone()[0]
    total_stats = con.execute('SELECT COUNT(*) FROM player_opening_stats').fetchone()[0]
    
    print(f"Players: {player_count:,}")
    print(f"Openings: {opening_count:,}")
    print(f"Player-Opening-Stats Records: {total_stats:,}")
    
    # Partition distribution
    print(f"\n--- Partition Distribution ---")
    partition_counts = {}
    for letter in ['A', 'B', 'C', 'D', 'E', 'other']:
        count = con.execute(f'SELECT COUNT(*) FROM player_opening_stats_{letter}').fetchone()[0]
        partition_counts[letter] = count
        percentage = (count / total_stats * 100) if total_stats > 0 else 0
        print(f"  Partition {letter}: {count:,} ({percentage:.1f}%)")
    
    # Game statistics
    print(f"\n--- Game Statistics ---")
    # SUM() returns NULL (None) on an empty table, so fall back to 0
    total_games = con.execute("""
        SELECT SUM(num_wins + num_draws + num_losses) as total_games
        FROM player_opening_stats
    """).fetchone()[0] or 0
    
    print(f"Total Games: {total_games:,}")
    # Skip the ratios when there is nothing to divide by
    if total_stats and total_games:
        print(f"Average Games per Stats Record: {total_games/total_stats:.1f}")
        print(f"Bytes per Stats Record: {original_size_bytes/total_stats:.1f}")
        print(f"Bytes per Game: {original_size_bytes/total_games:.2f}")

# Store these for later comparison
original_stats = {
    'size_bytes': original_size_bytes,
    'players': player_count,
    'openings': opening_count,
    'stats_records': total_stats,
    'total_games': total_games,
    'partition_counts': partition_counts
}

print(f"\n✓ Original database statistics logged")

ORIGINAL DATABASE STATISTICS

--- File Size ---
Size: 2,781.5 MB (2.72 GB)
Raw bytes: 2,916,626,432

--- Record Counts ---
Players: 50,000
Openings: 3,568
Player-Opening-Stats Records: 25,452,253

--- Partition Distribution ---
  Partition A: 5,854,225 (23.0%)
  Partition B: 6,670,613 (26.2%)
  Partition C: 8,472,867 (33.3%)
  Partition D: 3,474,963 (13.7%)
  Partition E: 979,585 (3.8%)
  Partition other: 0 (0.0%)

--- Game Statistics ---
Total Games: 475,054,388
Average Games per Stats Record: 18.7
Bytes per Stats Record: 114.6
Bytes per Game: 6.14

✓ Original database statistics logged


In [3]:
# Delete existing new database if it exists, then create fresh schema
print("=" * 80)
print("CREATING NEW DATABASE")
print("=" * 80)

if new_db_path.exists():
    print(f"\n🗑️  Deleting existing new database file...")
    os.remove(new_db_path)
    print(f"   ✓ Deleted: {new_db_path}")

print(f"\n📋 Creating fresh database with proper schema...")
# Use a context manager, as in the other cells, so the connection is closed
# even if schema setup raises
with get_db_connection(new_db_path) as new_con:
    setup_database(new_con)

print(f"\n✅ New database created with proper schema at: {new_db_path}")
print(f"   File size: {os.path.getsize(new_db_path):,} bytes (empty database overhead)")

CREATING NEW DATABASE

📋 Creating fresh database with proper schema...
Initializing database schema...
Database tables and partitioned stats tables are ready.

✅ New database created with proper schema at: /Users/a/Documents/personalprojects/chess-opening-recommender/data/processed/chess_games_rebuilt.db
   File size: 274,432 bytes (empty database overhead)


In [4]:
# Copy player table
print("=" * 80)
print("COPYING PLAYER TABLE")
print("=" * 80)

with get_db_connection(original_db_path) as source_con:
    with get_db_connection(new_db_path) as dest_con:
        print(f"\n📊 Reading all players from original database...")
        players = source_con.execute("""
            SELECT id, name, title
            FROM player
            ORDER BY id
        """).fetchall()
        
        print(f"   Found {len(players):,} players to copy")
        
        print(f"\n📝 Inserting players into new database...")
        print(f"   This preserves original IDs to maintain referential integrity")
        print(f"   Using batched transactions for better performance...")
        
        # Insert in batches with explicit transaction control
        # The key issue: we need to wrap batches in transactions to avoid slowdowns
        batch_size = 10_000
        total_inserted = 0
        
        for i in range(0, len(players), batch_size):
            batch = players[i:i+batch_size]
            
            # Use explicit transaction for this batch
            dest_con.execute("BEGIN TRANSACTION")
            
            # Insert the batch
            dest_con.executemany("""
                INSERT INTO player (id, name, title)
                VALUES (?, ?, ?)
            """, batch)
            
            dest_con.execute("COMMIT")
            
            total_inserted += len(batch)
            print(f"   Progress: {total_inserted:,} / {len(players):,} players ({total_inserted/len(players)*100:.1f}%)")
        
        # Verify count
        new_player_count = dest_con.execute('SELECT COUNT(*) FROM player').fetchone()[0]
        
        print(f"\n✅ Player table copied successfully")
        print(f"   Original: {len(players):,} players")
        print(f"   New: {new_player_count:,} players")
        print(f"   Match: {'✓ YES' if new_player_count == len(players) else '✗ NO - ERROR!'}")


COPYING PLAYER TABLE

📊 Reading all players from original database...
   Found 50,000 players to copy

📝 Inserting players into new database...
   This preserves original IDs to maintain referential integrity
   Using batched transactions for better performance...
   Progress: 10,000 / 50,000 players (20.0%)
   Progress: 20,000 / 50,000 players (40.0%)
   Progress: 30,000 / 50,000 players (60.0%)
   Progress: 40,000 / 50,000 players (80.0%)
   Progress: 50,000 / 50,000 players (100.0%)

✅ Player table copied successfully
   Original: 50,000 players
   New: 50,000 players
   Match: ✓ YES
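
The explicit `BEGIN TRANSACTION` / `COMMIT` pairs in the copy cell leave a transaction open if an insert fails or the kernel is interrupted partway through a batch. A small context manager can roll back in that case. This is a sketch, not the project's code: `transaction` and `batched` are hypothetical helpers, and the only assumption about the connection is that it exposes an `execute()` method, as the connections returned by `get_db_connection` appear to.

```python
from contextlib import contextmanager
from typing import Iterator, Sequence


@contextmanager
def transaction(con):
    """Commit when the body succeeds, roll back if it raises."""
    con.execute("BEGIN TRANSACTION")
    try:
        yield con
    except BaseException:
        # BaseException also covers KeyboardInterrupt from a kernel interrupt.
        con.execute("ROLLBACK")
        raise
    else:
        con.execute("COMMIT")


def batched(rows: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

The insert loop would then read `for batch in batched(players, 10_000):` followed by `with transaction(dest_con):` around the `executemany` call.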


In [5]:
# Copy opening table
print("=" * 80)
print("COPYING OPENING TABLE")
print("=" * 80)

with get_db_connection(original_db_path) as source_con:
    with get_db_connection(new_db_path) as dest_con:
        print(f"\n📊 Reading all openings from original database...")
        openings = source_con.execute("""
            SELECT id, eco, name
            FROM opening
            ORDER BY id
        """).fetchall()
        
        print(f"   Found {len(openings):,} openings to copy")
        
        print(f"\n📝 Inserting openings into new database...")
        print(f"   This preserves original IDs to maintain referential integrity")
        print(f"   Using batched transactions for better performance...")
        
        # Insert in batches with explicit transaction control
        batch_size = 1000
        total_inserted = 0
        
        for i in range(0, len(openings), batch_size):
            batch = openings[i:i+batch_size]
            
            # Use explicit transaction for this batch
            dest_con.execute("BEGIN TRANSACTION")
            
            # Insert batch
            dest_con.executemany("""
                INSERT INTO opening (id, eco, name)
                VALUES (?, ?, ?)
            """, batch)
            
            dest_con.execute("COMMIT")
            
            total_inserted += len(batch)
            
            if total_inserted % 5000 == 0 or total_inserted >= len(openings):
                print(f"   Progress: {total_inserted:,} / {len(openings):,} openings ({total_inserted/len(openings)*100:.1f}%)")
        
        # Verify count
        new_opening_count = dest_con.execute('SELECT COUNT(*) FROM opening').fetchone()[0]
        
        print(f"\n✅ Opening table copied successfully")
        print(f"   Original: {len(openings):,} openings")
        print(f"   New: {new_opening_count:,} openings")
        print(f"   Match: {'✓ YES' if new_opening_count == len(openings) else '✗ NO - ERROR!'}")


COPYING OPENING TABLE

📊 Reading all openings from original database...
   Found 3,568 openings to copy

📝 Inserting openings into new database...
   This preserves original IDs to maintain referential integrity
   Using batched transactions for better performance...
   Progress: 3,568 / 3,568 openings (100.0%)

✅ Opening table copied successfully
   Original: 3,568 openings
   New: 3,568 openings
   Match: ✓ YES


In [None]:
# Copy player_opening_stats to appropriate partitions
print("=" * 80)
print("COPYING PLAYER_OPENING_STATS (PARTITIONED)")
print("=" * 80)

# Define partition mapping function
def get_partition_letter(eco_code: str) -> str:
    """
    Determines which partition a record belongs to based on ECO code.
    ECO codes starting with A-E go to their respective partitions.
    All others go to the 'other' partition.
    """
    # Covers both None and the empty string
    if not eco_code:
        return 'other'
    
    first_letter = eco_code[0].upper()
    return first_letter if first_letter in 'ABCDE' else 'other'

partition_letters = ['A', 'B', 'C', 'D', 'E', 'other']

with get_db_connection(original_db_path) as source_con:
    with get_db_connection(new_db_path) as dest_con:
        print(f"\n📊 Copying stats records partition by partition...")
        print(f"   This ensures proper distribution across partitioned tables")
        print(f"   Stats are copied WITH their ECO codes to enable partition routing")
        print(f"   Using batched transactions for better performance\n")
        
        total_copied = 0
        partition_results = {}
        
        for letter in partition_letters:
            print(f"\n{'='*60}")
            print(f"PARTITION {letter}")
            print(f"{'='*60}")
            
            source_table = f"player_opening_stats_{letter}"
            dest_table = f"player_opening_stats_{letter}"
            
            print(f"\n📊 Reading from {source_table}...")
            
            # Read all stats from this partition.
            # LEFT JOIN to opening so rows with an unknown opening_id are still copied
            # (an inner join would silently drop them); the ECO code is only used
            # to verify partition placement and is not inserted.
            stats = source_con.execute(f"""
                SELECT 
                    pos.player_id,
                    pos.opening_id,
                    pos.color,
                    pos.num_wins,
                    pos.num_draws,
                    pos.num_losses,
                    o.eco
                FROM {source_table} pos
                LEFT JOIN opening o ON pos.opening_id = o.id
                ORDER BY pos.player_id, pos.opening_id, pos.color
            """).fetchall()
            
            print(f"   Found {len(stats):,} records in source partition")
            
            if len(stats) == 0:
                print(f"   ⚠️  Empty partition - skipping")
                partition_results[letter] = {'source': 0, 'copied': 0, 'verified': 0}
                continue
            
            # Verify all records belong in this partition
            misplaced = 0
            for record in stats:
                eco = record[6]
                expected_partition = get_partition_letter(eco)
                if expected_partition != letter:
                    misplaced += 1
            
            if misplaced > 0:
                print(f"   ⚠️  WARNING: {misplaced:,} records appear to be in wrong partition!")
            
            print(f"\n📝 Inserting into {dest_table}...")
            
            # Insert in batches with explicit transaction control (without ECO code, as it's not in the stats table)
            batch_size = 1_000_000
            records_inserted = 0
            
            for i in range(0, len(stats), batch_size):
                batch = stats[i:i+batch_size]
                
                # Extract just the stats columns (exclude ECO)
                batch_data = [(r[0], r[1], r[2], r[3], r[4], r[5]) for r in batch]
                
                # Use explicit transaction for this batch
                dest_con.execute("BEGIN TRANSACTION")
                
                dest_con.executemany(f"""
                    INSERT INTO {dest_table} (player_id, opening_id, color, num_wins, num_draws, num_losses)
                    VALUES (?, ?, ?, ?, ?, ?)
                """, batch_data)
                
                dest_con.execute("COMMIT")
                
                records_inserted += len(batch)
                
                if records_inserted % 100000 == 0 or records_inserted >= len(stats):
                    print(f"   Progress: {records_inserted:,} / {len(stats):,} ({records_inserted/len(stats)*100:.1f}%)")
            
            # Verify count in new partition
            new_count = dest_con.execute(f'SELECT COUNT(*) FROM {dest_table}').fetchone()[0]
            
            print(f"\n✅ Partition {letter} copied")
            print(f"   Source: {len(stats):,} records")
            print(f"   Destination: {new_count:,} records")
            print(f"   Match: {'✓ YES' if new_count == len(stats) else '✗ NO - ERROR!'}")
            
            partition_results[letter] = {
                'source': len(stats),
                'copied': records_inserted,
                'verified': new_count
            }
            
            total_copied += new_count
        
        # Overall summary
        print(f"\n{'='*80}")
        print(f"OVERALL STATS COPY SUMMARY")
        print(f"{'='*80}")
        
        print(f"\nPartition-by-partition results:")
        for letter in partition_letters:
            result = partition_results[letter]
            status = '✓' if result['source'] == result['verified'] else '✗ ERROR'
            print(f"  {letter}: {result['source']:>10,} → {result['verified']:>10,}  {status}")
        
        print(f"\nTotal stats records copied: {total_copied:,}")
        print(f"Original total: {original_stats['stats_records']:,}")
        print(f"Match: {'✓ YES' if total_copied == original_stats['stats_records'] else '✗ NO - ERROR!'}")

COPYING PLAYER_OPENING_STATS (PARTITIONED)

📊 Copying stats records partition by partition...
   This ensures proper distribution across partitioned tables
   Stats are copied WITH their ECO codes to enable partition routing
   Using batched transactions for better performance


PARTITION A

📊 Reading from player_opening_stats_A...
   Found 5,854,225 records in source partition

📝 Inserting into player_opening_stats_A...
   Progress: 100,000 / 5,854,225 (1.7%)
   Progress: 200,000 / 5,854,225 (3.4%)


RuntimeError: Query interrupted
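
The run above was interrupted partway through partition A. If the cause was the time spent pushing millions of rows through Python with `fetchall()` and `executemany()`, one alternative is to let DuckDB move the rows itself: attach the original file to the destination connection and run one `INSERT INTO ... SELECT` per partition. The sketch below only builds those statements. The alias `src` and the function name are assumptions, and the column list and partition names are taken from the copy cell above.

```python
STATS_COLUMNS = ("player_id", "opening_id", "color",
                 "num_wins", "num_draws", "num_losses")
PARTITIONS = ("A", "B", "C", "D", "E", "other")


def partition_copy_statements(source_path: str, alias: str = "src") -> list[str]:
    """Return SQL that copies every stats partition from an attached source file."""
    cols = ", ".join(STATS_COLUMNS)
    escaped = source_path.replace("'", "''")  # keep the path a valid SQL literal
    statements = [f"ATTACH '{escaped}' AS {alias} (READ_ONLY)"]
    for letter in PARTITIONS:
        table = f"player_opening_stats_{letter}"
        statements.append(
            f"INSERT INTO {table} ({cols}) "
            f"SELECT {cols} FROM {alias}.{table}"
        )
    statements.append(f"DETACH {alias}")
    return statements
```

Each statement could then be passed to `dest_con.execute(...)` on a connection opened against the new database file, so the unqualified table names resolve to the destination. The per-partition count checks from the cell above still apply afterwards.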

In [None]:
# Comprehensive verification of the new database
print("=" * 80)
print("COMPREHENSIVE VERIFICATION")
print("=" * 80)

with get_db_connection(new_db_path) as con:
    print(f"\n--- Record Counts Verification ---")
    
    new_player_count = con.execute('SELECT COUNT(*) FROM player').fetchone()[0]
    new_opening_count = con.execute('SELECT COUNT(*) FROM opening').fetchone()[0]
    new_stats_count = con.execute('SELECT COUNT(*) FROM player_opening_stats').fetchone()[0]
    
    print(f"Players:")
    print(f"  Original: {original_stats['players']:,}")
    print(f"  New:      {new_player_count:,}")
    print(f"  Match:    {'✓ YES' if new_player_count == original_stats['players'] else '✗ NO'}")
    
    print(f"\nOpenings:")
    print(f"  Original: {original_stats['openings']:,}")
    print(f"  New:      {new_opening_count:,}")
    print(f"  Match:    {'✓ YES' if new_opening_count == original_stats['openings'] else '✗ NO'}")
    
    print(f"\nPlayer-Opening-Stats:")
    print(f"  Original: {original_stats['stats_records']:,}")
    print(f"  New:      {new_stats_count:,}")
    print(f"  Match:    {'✓ YES' if new_stats_count == original_stats['stats_records'] else '✗ NO'}")
    
    # Verify partition distribution matches
    print(f"\n--- Partition Distribution Verification ---")
    for letter in ['A', 'B', 'C', 'D', 'E', 'other']:
        orig_count = original_stats['partition_counts'][letter]
        new_count = con.execute(f'SELECT COUNT(*) FROM player_opening_stats_{letter}').fetchone()[0]
        match = '✓' if orig_count == new_count else '✗'
        print(f"  Partition {letter}: {orig_count:>10,} → {new_count:>10,}  {match}")
    
    # Verify game totals match
    print(f"\n--- Game Statistics Verification ---")
    new_total_games = con.execute("""
        SELECT SUM(num_wins + num_draws + num_losses)
        FROM player_opening_stats
    """).fetchone()[0]
    
    print(f"Total Games:")
    print(f"  Original: {original_stats['total_games']:,}")
    print(f"  New:      {new_total_games:,}")
    print(f"  Match:    {'✓ YES' if new_total_games == original_stats['total_games'] else '✗ NO'}")
    
    # Check for referential integrity
    print(f"\n--- Referential Integrity Checks ---")
    
    # Check for orphaned stats (player_id not in player table)
    orphaned_players = con.execute("""
        SELECT COUNT(DISTINCT pos.player_id)
        FROM player_opening_stats pos
        LEFT JOIN player p ON pos.player_id = p.id
        WHERE p.id IS NULL
    """).fetchone()[0]
    
    print(f"Orphaned player_ids in stats: {orphaned_players:,}")
    print(f"  Status: {'✓ GOOD (none)' if orphaned_players == 0 else '✗ ERROR - orphaned records exist!'}")
    
    # Check for orphaned stats (opening_id not in opening table)
    orphaned_openings = con.execute("""
        SELECT COUNT(DISTINCT pos.opening_id)
        FROM player_opening_stats pos
        LEFT JOIN opening o ON pos.opening_id = o.id
        WHERE o.id IS NULL
    """).fetchone()[0]
    
    print(f"\nOrphaned opening_ids in stats: {orphaned_openings:,}")
    print(f"  Status: {'✓ GOOD (none)' if orphaned_openings == 0 else '✗ ERROR - orphaned records exist!'}")
    
    # Final verdict
    all_checks_passed = (
        new_player_count == original_stats['players'] and
        new_opening_count == original_stats['openings'] and
        new_stats_count == original_stats['stats_records'] and
        new_total_games == original_stats['total_games'] and
        orphaned_players == 0 and
        orphaned_openings == 0
    )
    
    print(f"\n{'='*80}")
    if all_checks_passed:
        print(f"✅ ALL VERIFICATION CHECKS PASSED")
        print(f"   Record counts, game totals, and referential integrity match the original")
    else:
        print(f"❌ VERIFICATION FAILED")
        print(f"   The new database does NOT match the original")
        print(f"   DO NOT use the new database - investigate errors above")
    print(f"{'='*80}")
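
The final verdict in the cell above does not include the per-partition counts, which are printed but not checked. A pure-Python comparison that covers them as well might look like this. `find_mismatches` is a hypothetical helper, and it assumes both databases have been summarised in dictionaries shaped like `original_stats`.

```python
def find_mismatches(original: dict, rebuilt: dict) -> list[str]:
    """Describe every statistic that differs between two database summaries."""
    problems = []
    for key in ("players", "openings", "stats_records", "total_games"):
        if original.get(key) != rebuilt.get(key):
            problems.append(f"{key}: {original.get(key)} != {rebuilt.get(key)}")

    orig_parts = original.get("partition_counts", {})
    new_parts = rebuilt.get("partition_counts", {})
    # Compare the union of partition names so a missing partition is reported too.
    for letter in sorted(set(orig_parts) | set(new_parts)):
        if orig_parts.get(letter) != new_parts.get(letter):
            problems.append(
                f"partition {letter}: {orig_parts.get(letter)} != {new_parts.get(letter)}"
            )
    return problems
```

An empty list means every compared figure matches. It does not prove the rows themselves are identical, only that the counts and totals agree.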

In [None]:
# File size comparison and efficiency gains
print("=" * 80)
print("FILE SIZE COMPARISON")
print("=" * 80)

new_size_bytes = os.path.getsize(new_db_path)
new_size_mb = new_size_bytes / (1024 * 1024)
new_size_gb = new_size_mb / 1024

size_difference_bytes = original_stats['size_bytes'] - new_size_bytes
size_difference_mb = size_difference_bytes / (1024 * 1024)
percentage_reduction = (size_difference_bytes / original_stats['size_bytes']) * 100

print(f"\n--- Original Database ---")
print(f"Size: {original_stats['size_bytes'] / (1024*1024):,.1f} MB ({original_stats['size_bytes'] / (1024*1024*1024):.2f} GB)")
print(f"Raw bytes: {original_stats['size_bytes']:,}")

print(f"\n--- Rebuilt Database ---")
print(f"Size: {new_size_mb:,.1f} MB ({new_size_gb:.2f} GB)")
print(f"Raw bytes: {new_size_bytes:,}")

print(f"\n--- Comparison ---")
if size_difference_bytes > 0:
    print(f"✅ SIZE REDUCED by {size_difference_mb:,.1f} MB ({percentage_reduction:.1f}%)")
    print(f"   Space reclaimed: {size_difference_bytes:,} bytes")
elif size_difference_bytes < 0:
    print(f"⚠️  Size INCREASED by {abs(size_difference_mb):,.1f} MB ({abs(percentage_reduction):.1f}%)")
    print(f"   This is unexpected - the rebuild should not increase size")
else:
    print(f"Size is EXACTLY THE SAME")
    print(f"   The original database was already fully optimized")

# Efficiency metrics
print(f"\n--- Storage Efficiency ---")
bytes_per_record_old = original_stats['size_bytes'] / original_stats['stats_records']
bytes_per_record_new = new_size_bytes / original_stats['stats_records']
bytes_per_game_old = original_stats['size_bytes'] / original_stats['total_games']
bytes_per_game_new = new_size_bytes / original_stats['total_games']

print(f"Bytes per stats record:")
print(f"  Original: {bytes_per_record_old:.1f}")
print(f"  New:      {bytes_per_record_new:.1f}")
print(f"  Change:   {((bytes_per_record_new - bytes_per_record_old) / bytes_per_record_old * 100):+.1f}%")

print(f"\nBytes per game:")
print(f"  Original: {bytes_per_game_old:.2f}")
print(f"  New:      {bytes_per_game_new:.2f}")
print(f"  Change:   {((bytes_per_game_new - bytes_per_game_old) / bytes_per_game_old * 100):+.1f}%")

print(f"\n{'='*80}")
print(f"DATABASE REBUILD COMPLETE")
print(f"{'='*80}")
print(f"\nOriginal database: {original_db_path}")
print(f"New database:      {new_db_path}")
print(f"\nThe original database has NOT been modified.")
print(f"If you're satisfied with the rebuild, you can:")
print(f"  1. Rename the original as a backup")
print(f"  2. Rename the rebuilt database to replace it")
print(f"\nExample commands:")
# Quote the paths so the commands still work if a directory name contains spaces
print(f'  mv "{original_db_path}" "{original_db_path}.backup"')
print(f'  mv "{new_db_path}" "{original_db_path}"')