# BigQuery Connection Setup for Belgian Brewery Project

This notebook establishes and tests the connection to Google BigQuery for the Belgian Brewery "Glide Template" Strategy Project.

## Prerequisites
- Google Cloud Project with BigQuery API enabled
- Authentication configured: a service account key or Application Default Credentials
- Required Python packages installed
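
Before running the cells below, it can help to confirm that the packages are importable in the active kernel. The following is a minimal sketch using only the standard library. The module list is an assumption (the contents of `requirements.txt` are not shown in this notebook), and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Assumed module list; adjust it to match your requirements.txt
REQUIRED_MODULES = ["pandas", "google.cloud.bigquery", "google.oauth2", "pyarrow"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when a parent package of a dotted name is absent
            found = False
        if not found:
            missing.append(name)
    return missing


not_installed = missing_modules(REQUIRED_MODULES)
if not_installed:
    print("Missing modules:", ", ".join(not_installed))
else:
    print("All required modules are importable.")
```

If anything is reported as missing, install the packages in the same environment the kernel runs in and re-run the check.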

In [None]:
# Create a conda environment for this project
# conda create -n be-brew-py311 python=3.11
# conda activate be-brew-py311

# Install required packages from requirements.txt
# (run this cell first if the packages are not installed yet;
#  %pip installs into the environment of the running kernel)
# %pip install -r requirements.txt

## Import Required Libraries

In [None]:
# Import required libraries for BigQuery connection
import pandas as pd
from google.cloud import bigquery
import os
from google.oauth2 import service_account
import json

## Authentication Setup

Choose one of the following authentication methods:
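1. Service account key file (referenced through `GOOGLE_APPLICATION_CREDENTIALS`)
2. Application Default Credentials (available automatically on Google Cloud, or locally after running `gcloud auth application-default login`)
3. Manual credentials setup (credentials built in code from the service account key contents)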
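
To see which of these sources is likely to be picked up on the current machine, a small check of the local environment can be run first. This is a rough sketch, not part of the Google client libraries: `describe_auth_source` is a hypothetical helper, it only inspects the environment variable and the default credentials file location, and it does not validate the credentials themselves.

```python
import os
from pathlib import Path


def describe_auth_source(env=None):
    """Return a short description of the credential source likely to be used.

    Hypothetical helper: it inspects the local environment only and never
    contacts Google Cloud.
    """
    env = os.environ if env is None else env

    # Method 1: a key file referenced by GOOGLE_APPLICATION_CREDENTIALS
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        if Path(key_path).is_file():
            return f"service account key file: {key_path}"
        return f"GOOGLE_APPLICATION_CREDENTIALS points to a missing file: {key_path}"

    # Method 2: user credentials written by `gcloud auth application-default login`
    # (default location on Linux/macOS; Windows stores the file under %APPDATA%)
    adc_file = Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
    if adc_file.is_file():
        return f"application default credentials file: {adc_file}"

    return "no local credentials found (expected on Google Cloud, where the attached service account is used)"


print(describe_auth_source())
```

A "missing file" result usually means the path in `GOOGLE_APPLICATION_CREDENTIALS` is relative to a different working directory than the notebook's.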

In [None]:
# Option 1: Using Service Account Key File
# Uncomment and modify the path to your service account key
# os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/your/service-account-key.json'

# Option 3: Manual setup with explicit credentials built from the key contents
# (keep real key material out of version control)
# service_account_info = {
#     "type": "service_account",
#     "project_id": "your-project-id",
#     "private_key_id": "your-private-key-id",
#     # ... other fields from your service account key
# }
# credentials = service_account.Credentials.from_service_account_info(service_account_info)

# Option 2 (used in this notebook): Application Default Credentials
# Make sure you've run: gcloud auth application-default login

## BigQuery Client Setup

In [None]:
# Set your Google Cloud Project ID
PROJECT_ID = "your-project-id"  # Replace with your actual project ID
DATASET_ID = "belgian_brewery"   # Dataset for our brewery project

# Initialize BigQuery client
try:
    # Default: Application Default Credentials
    client = bigquery.Client(project=PROJECT_ID)
    
    # Alternative: explicit credentials (uncomment if you built `credentials` above)
    # client = bigquery.Client(project=PROJECT_ID, credentials=credentials)
    
    # Creating the client does not call the BigQuery API; the test query below verifies access
    print(f"BigQuery client initialized for project: {PROJECT_ID}")
except Exception as e:
    print(f"Error creating BigQuery client: {e}")

## Test Connection with Simple Query

In [None]:
# Test the connection with a simple query
test_query = """
SELECT 
    'BigQuery connection successful!' as status,
    CURRENT_DATETIME() as timestamp,
    @@project_id as project_id
"""

try:
    # Execute the test query
    query_job = client.query(test_query)
    results = query_job.result()
    
    # Convert to DataFrame for better display
    df_test = results.to_dataframe()
    print("Connection Test Results:")
    print(df_test)
    
except Exception as e:
    print(f"Error executing test query: {e}")

## Create Dataset for Belgian Brewery Project

In [None]:
# Create dataset for the Belgian brewery project
from google.api_core.exceptions import NotFound

dataset_id = f"{PROJECT_ID}.{DATASET_ID}"

try:
    # Check if dataset already exists
    dataset = client.get_dataset(dataset_id)
    print(f"Dataset {dataset_id} already exists.")
except NotFound:
    # Create the dataset only when the lookup reports it as missing;
    # other errors (permissions, network) are no longer silently swallowed
    dataset = bigquery.Dataset(dataset_id)
    dataset.location = "US"  # or "EU"; the location cannot be changed after creation
    dataset.description = "Dataset for Belgian Brewery Glide Template Strategy Project"
    
    dataset = client.create_dataset(dataset, timeout=30)
    print(f"Created dataset {dataset_id}")

## Prepare for Data Upload

Next steps will involve:
1. Loading the Belgian beers and breweries data from Google Sheets
2. Uploading raw data to BigQuery tables
3. Setting up dbt for data transformation
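
Column headers in a spreadsheet often contain spaces or punctuation, while BigQuery column names are safest when limited to letters, digits, and underscores and when they do not start with a digit. As a hedged sketch of a cleanup step before upload (the helper name `to_bigquery_column_name` and the sample headers are assumptions, not taken from the project data):

```python
import re


def to_bigquery_column_name(raw_name):
    """Convert a spreadsheet header into a conservative BigQuery column name."""
    # Replace every run of characters outside [0-9A-Za-z_] with one underscore
    name = re.sub(r"[^0-9A-Za-z_]+", "_", raw_name.strip())
    # Drop leading/trailing underscores and normalize to lower case
    name = name.strip("_").lower()
    # Prefix an underscore when the result is empty or starts with a digit
    if not name or name[0].isdigit():
        name = f"_{name}"
    return name


# Hypothetical headers, for illustration only
for header in ["Brewery Name", "ABV (%)", "2024 Rating"]:
    print(header, "->", to_bigquery_column_name(header))
```

This sketch does not handle collisions: two different headers can map to the same cleaned name, so a uniqueness check is still needed before assigning the result to `df.columns`.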

In [None]:
# Function to upload DataFrame to BigQuery
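def upload_dataframe_to_bigquery(df, table_name, dataset_id=DATASET_ID, if_exists='replace'):
    """
    Upload a pandas DataFrame to BigQuery
    
    Args:
        df: pandas DataFrame to upload
        table_name: name of the BigQuery table
        dataset_id: BigQuery dataset ID
        if_exists: what to do if table exists ('replace', 'append', 'fail')
    
    Returns:
        True if the load job succeeded, False otherwise
    """
    table_id = f"{PROJECT_ID}.{dataset_id}.{table_name}"
    
    # Map each if_exists option to a BigQuery write disposition.
    # WRITE_EMPTY makes the job fail if the table already contains data.
    write_dispositions = {
        'replace': "WRITE_TRUNCATE",
        'append': "WRITE_APPEND",
        'fail': "WRITE_EMPTY",
    }
    if if_exists not in write_dispositions:
        raise ValueError(f"if_exists must be one of {sorted(write_dispositions)}, got {if_exists!r}")
    
    # Configure the job
    job_config = bigquery.LoadJobConfig(
        write_disposition=write_dispositions[if_exists],
        autodetect=True  # Automatically detect schema
    )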
    
    try:
        # Upload the DataFrame
        job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
        job.result()  # Wait for the job to complete
        
        print(f"Successfully uploaded {len(df)} rows to {table_id}")
        return True
        
    except Exception as e:
        print(f"Error uploading data to {table_id}: {e}")
        return False

# Test function with sample data
sample_data = pd.DataFrame({
    'test_column': ['value1', 'value2', 'value3'],
    'timestamp': pd.Timestamp.now()
})

print("Testing upload function with sample data:")
upload_dataframe_to_bigquery(sample_data, 'connection_test', if_exists='replace')

## Verify Sample Upload

In [None]:
# Query the test table to verify upload worked
verify_query = f"""
SELECT * FROM `{PROJECT_ID}.{DATASET_ID}.connection_test`
LIMIT 10
"""

try:
    query_job = client.query(verify_query)
    results = query_job.result()
    df_verify = results.to_dataframe()
    
    print("Sample data successfully uploaded and retrieved:")
    print(df_verify)
    
except Exception as e:
    print(f"Error querying test table: {e}")

## Next Steps

✅ BigQuery connection established  
✅ Dataset created  
✅ Upload function tested  

**Ready for the main project workflow:**

1. **Data Ingestion**: Load Belgian brewery data from Google Sheets
2. **Data Enrichment**: Use Python + geocoding API to get brewery locations  
3. **dbt Setup**: Create transformation pipeline
4. **Hex Dashboard**: Build analytics dashboard

Update the `PROJECT_ID` variable above with your actual Google Cloud Project ID to proceed.
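
A small guard can catch a forgotten placeholder before any BigQuery call is made. The sketch below is an assumption-laden illustration: `validate_project_id` is a hypothetical helper, the sample ID is made up, and the pattern encodes the usual project ID rules (6 to 30 characters; lowercase letters, digits, and hyphens; starts with a letter; does not end with a hyphen). Older domain-scoped IDs such as `example.com:project` would be rejected by it.

```python
import re

PLACEHOLDER_PROJECT_ID = "your-project-id"
# First character a letter, 4 to 28 middle characters, last character not a hyphen
PROJECT_ID_PATTERN = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def validate_project_id(project_id):
    """Return project_id unchanged, or raise ValueError if it looks unusable."""
    if project_id == PLACEHOLDER_PROJECT_ID:
        raise ValueError("PROJECT_ID is still the placeholder; set it to your real project ID.")
    if not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError(f"{project_id!r} does not look like a valid project ID.")
    return project_id


# Hypothetical project ID, for illustration only
print(validate_project_id("be-brew-demo-123"))
```

Calling `validate_project_id(PROJECT_ID)` right after the variable is set would stop the notebook early with a clear message instead of a later authentication or "not found" error.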