## Load to AWS S3

Upload the completed `americas.db` database to AWS S3 for cloud storage and sharing.

**Prerequisites:**
- AWS credentials configured via one of:
  - AWS CLI: `aws configure`
  - Environment variables: `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`
  - IAM role (if running on EC2)
- Write access to the `renan-peres-datasets` S3 bucket (the `s3:PutObject` permission)
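
Before uploading, it can help to check which of these credential sources are visible locally. The following is a minimal sketch using only the standard library. `detect_credential_sources` is a hypothetical helper (not part of boto3), and it only inspects the environment variables and the shared credentials file written by `aws configure`. It cannot detect an IAM role, which is resolved at request time on the instance.

```python
import os
from pathlib import Path


def detect_credential_sources(env=None, home=None):
    """Return the names of locally visible AWS credential sources.

    Hypothetical helper for illustration: it checks environment variables
    and the shared credentials file only, not IAM roles or SSO profiles.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    sources = []

    # Both variables must be set for environment-based credentials to work.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        sources.append("environment")

    # `aws configure` writes ~/.aws/credentials unless
    # AWS_SHARED_CREDENTIALS_FILE points somewhere else.
    cred_file = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials"))
    if cred_file.is_file():
        sources.append("shared-credentials-file")

    return sources


if __name__ == "__main__":
    print(detect_credential_sources() or "no local credentials found")
```

An empty result does not prove the upload will fail (an EC2 instance role may still apply), but it is a quick hint that `aws configure` has not been run.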

**Install boto3 if needed:**
```bash
pip install boto3
```

The database will be uploaded to: `s3://renan-peres-datasets/finance/americas.db`
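
The destination URI maps onto the two arguments that boto3's `upload_file` expects (bucket name and object key), and the same parts determine the HTTPS address of the object. A small sketch of that mapping follows. `split_s3_uri` and `object_url` are hypothetical helper names, and the `us-east-2` region is an assumption taken from the URL printed by this notebook.

```python
from urllib.parse import quote, urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


def object_url(bucket, key, region):
    """Build the virtual-hosted-style HTTPS URL for an object."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


bucket, key = split_s3_uri("s3://renan-peres-datasets/finance/americas.db")
print(bucket)  # renan-peres-datasets
print(key)     # finance/americas.db
print(object_url(bucket, key, region="us-east-2"))
# https://renan-peres-datasets.s3.us-east-2.amazonaws.com/finance/americas.db
```

The HTTPS URL only resolves for anonymous readers if the object is publicly readable.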

In [None]:
# Upload americas.db to AWS S3
import boto3
import os
from botocore.exceptions import NoCredentialsError, ClientError

def upload_to_s3(local_file_path: str, bucket_name: str, s3_key: str) -> bool:
    """
    Upload a file to AWS S3 bucket
    
    Args:
        local_file_path: Path to the local file to upload
        bucket_name: Name of the S3 bucket
        s3_key: S3 object key (path within bucket)
    
    Returns:
        bool: True if upload was successful, False otherwise
    """
    try:
        # Initialize S3 client (uses AWS credentials from environment or AWS config)
        s3_client = boto3.client('s3')
        
        # Check if file exists locally
        if not os.path.exists(local_file_path):
            print(f"Error: Local file {local_file_path} does not exist")
            return False
        
        # Get file size for progress tracking
        file_size = os.path.getsize(local_file_path)
        print(f"Uploading {local_file_path} ({file_size:,} bytes) to s3://{bucket_name}/{s3_key}")
        
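        # Upload file with progress callback.
        # boto3 passes the number of bytes sent since the previous call
        # (not a running total), so accumulate them to report overall progress.
        bytes_done = 0

        def progress_callback(bytes_transferred):
            nonlocal bytes_done
            bytes_done += bytes_transferred
            percentage = (bytes_done / file_size) * 100 if file_size else 100.0
            print(f"\rProgress: {percentage:.1f}% ({bytes_done:,}/{file_size:,} bytes)", end='')
        
        s3_client.upload_file(
            local_file_path, 
            bucket_name, 
            s3_key,
            Callback=progress_callback
        )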
        
        print(f"\n✅ Successfully uploaded to s3://{bucket_name}/{s3_key}")
        
        # Build the object URL. It is reachable only if the object is publicly
        # readable, and the region is hard-coded to us-east-2.
        s3_url = f"https://{bucket_name}.s3.us-east-2.amazonaws.com/{s3_key}"
        print(f"📍 Public URL: {s3_url}")
        
        return True
        
    except NoCredentialsError:
        print("❌ Error: AWS credentials not found. Please configure AWS CLI or set environment variables:")
        print("   - AWS_ACCESS_KEY_ID")
        print("   - AWS_SECRET_ACCESS_KEY")
        print("   - AWS_DEFAULT_REGION (optional)")
        return False
        
    except ClientError as e:
        error_code = e.response['Error']['Code']
        if error_code == 'NoSuchBucket':
            print(f"❌ Error: Bucket '{bucket_name}' does not exist")
        elif error_code == 'AccessDenied':
            print(f"❌ Error: Access denied to bucket '{bucket_name}'. Check your permissions.")
        else:
            print(f"❌ AWS Error: {e}")
        return False
        
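    except Exception as e:
        # upload_file wraps S3 client errors in boto3.exceptions.S3UploadFailedError,
        # so failures such as a missing bucket may be reported here.
        print(f"❌ Unexpected error: {e}")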
        return False

# Configuration
LOCAL_DB_PATH = '../americas.db'
S3_BUCKET = 'renan-peres-datasets'
S3_KEY = 'finance/americas.db'

# Verify database exists before upload
if os.path.exists(LOCAL_DB_PATH):
    print(f"📊 Database file found: {LOCAL_DB_PATH}")
    
    # Get database stats before upload
    import duckdb
    try:
        con = duckdb.connect(LOCAL_DB_PATH)
        tables = con.sql("SHOW TABLES").fetchall()
        # Quote table names so identifiers with special characters still work
        total_rows = sum(con.sql(f'SELECT COUNT(*) FROM "{table[0]}"').fetchone()[0] for table in tables)
        con.close()
        
        print(f"📈 Database contains {len(tables)} tables with {total_rows:,} total rows")
    except Exception as e:
        print(f"⚠️  Could not read database stats: {e}")
    
    # Upload to S3
    success = upload_to_s3(LOCAL_DB_PATH, S3_BUCKET, S3_KEY)
    
    if success:
        print("\n🎉 Americas financial database successfully uploaded to AWS S3!")
        print("🔗 Access your data at: https://renan-peres-datasets.s3.us-east-2.amazonaws.com/finance/americas.db")
    else:
        print("\n❌ Failed to upload database to S3")
        
else:
    print(f"❌ Database file not found at {LOCAL_DB_PATH}")
    print("   Please run the data collection cells first to create the database.")

📊 Database file found: ../americas.db
📈 Database contains 8 tables with 34,763,155 total rows
Uploading ../americas.db (1,751,134,208 bytes) to s3://renan-peres-datasets/finance/americas.db
Progress: 0.1% (1,048,576/1,751,134,208 bytes)
✅ Successfully uploaded to s3://renan-peres-datasets/finance/americas.db
📍 Public URL: https://renan-peres-datasets.s3.us-east-2.amazonaws.com/finance/americas.db

🎉 Americas financial database successfully uploaded to AWS S3!
🔗 Access your data at: https://renan-peres-datasets.s3.us-east-2.amazonaws.com/finance/americas.db
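
boto3 invokes the `Callback` of `upload_file` with the number of bytes sent since the previous call, and a multipart upload may call it from several worker threads. A lock-protected counter is one way to keep a running total. The sketch below simulates three chunk notifications instead of performing a real upload, and `ProgressTracker` is a hypothetical name.

```python
import threading


class ProgressTracker:
    """Accumulate per-chunk byte counts into a running total."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.bytes_done = 0
        self._lock = threading.Lock()

    def __call__(self, bytes_transferred):
        # The lock keeps the counter consistent when several threads report.
        with self._lock:
            self.bytes_done += bytes_transferred
            pct = 100.0 * self.bytes_done / self.total_bytes if self.total_bytes else 100.0
            print(f"\rProgress: {pct:.1f}% ({self.bytes_done:,}/{self.total_bytes:,} bytes)", end="")


# Simulate three chunk notifications for a 4 MiB file.
tracker = ProgressTracker(total_bytes=4 * 1024 * 1024)
for chunk in (1_048_576, 1_048_576, 2_097_152):
    tracker(chunk)
print()
```

An instance can be passed directly as `Callback=ProgressTracker(file_size)`, since any callable that accepts a byte count is accepted.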

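After a large upload, one inexpensive sanity check is to compare the local file size with the size S3 reports for the object. This is a sketch, not part of the notebook's workflow. `upload_size_matches` is a hypothetical helper that takes any client exposing boto3's `head_object` call, which returns object metadata, including `ContentLength`, without downloading the body.

```python
import os


def upload_size_matches(s3_client, local_path, bucket, key):
    """Return True if the object's size in S3 equals the local file size."""
    # head_object fetches metadata only, so this stays cheap for large files.
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return response["ContentLength"] == os.path.getsize(local_path)
```

With a real client this could be called as `upload_size_matches(boto3.client('s3'), LOCAL_DB_PATH, S3_BUCKET, S3_KEY)`. A size match does not guarantee identical contents, but it catches truncated uploads.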