# AWS Infrastructure Setup
This notebook provisions foundational AWS resources for Riley Inc. demos: an Amazon S3 bucket that hosts knowledge base documents, and a second bucket that captures evaluation results and other artifacts.

## Prerequisites
- AWS credentials configured in your environment (e.g., `~/.aws/credentials`, environment variables, or an instance role).
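- IAM permissions for `s3:CreateBucket`, `s3:ListBucket` (required by the `head_bucket` existence check), `s3:PutBucketVersioning`, `s3:GetBucketVersioning`, and `s3:PutObject`. The optional cleanup cell additionally needs `s3:ListBucketVersions`, `s3:DeleteObject`, `s3:DeleteObjectVersion`, and `s3:DeleteBucket`.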
- Python environment with `boto3` installed.

In [None]:
import boto3
from botocore.exceptions import ClientError
from pathlib import Path
import json
import time

## Configure Bucket Parameters
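Set a globally unique bucket name. Including the date or a random suffix helps avoid collisions. Note that the timestamp suffix is regenerated every time this cell runs, so re-running it targets a new bucket name.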

In [None]:
aws_region = "us-west-2"
bucket_base_name = "riley-inc-rag-knowledge-base"
timestamp_suffix = time.strftime("%Y%m%d-%H%M%S")
bucket_name = f"{bucket_base_name}-{timestamp_suffix}"
print(f"Target bucket: {bucket_name}")
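
As an alternative to the timestamp, the random-suffix approach mentioned above could look roughly like the sketch below. `make_bucket_name` and `is_valid_bucket_name` are hypothetical helpers, not part of this notebook, and the validation covers only a simplified subset of the S3 naming rules (it deliberately rejects dots, which S3 itself allows).

```python
import re
import secrets

# Simplified subset of the S3 general-purpose bucket naming rules (an assumption
# for this sketch): 3-63 characters; lowercase letters, digits, and hyphens;
# must start and end with a letter or digit.
_BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if the name passes the simplified checks above."""
    return bool(_BUCKET_NAME_RE.fullmatch(name))


def make_bucket_name(base: str, suffix_bytes: int = 4) -> str:
    """Append a random hex suffix to the base name (hypothetical helper)."""
    name = f"{base}-{secrets.token_hex(suffix_bytes)}"
    if not is_valid_bucket_name(name):
        raise ValueError(f"Invalid S3 bucket name: {name!r}")
    return name


candidate = make_bucket_name("riley-inc-rag-knowledge-base")
print(candidate)
```

Unlike the timestamp, a random suffix gives no hint of when the bucket was created, so you may want to record the generated name somewhere durable.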

## Create Knowledge Base Bucket
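The cell below checks for the bucket with `head_bucket` and creates it only if that check reports that the bucket does not exist. If the bucket already exists and is accessible with your credentials, creation is skipped. Any other error (for example, a 403 when the name is taken by another account or access is denied) is re-raised.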

In [None]:
s3_client = boto3.client("s3", region_name=aws_region)

def ensure_bucket(client, name, region):
    try:
        client.head_bucket(Bucket=name)
        print(f"Bucket '{name}' already exists and is accessible.")
    except ClientError as err:
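        # The error code is a string and is not always numeric (e.g. 'NoSuchBucket'),
        # so compare strings instead of casting to int.
        error_code = err.response['Error']['Code']
        if error_code in ('404', 'NoSuchBucket'):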
            create_args = {
                'Bucket': name
            }
            if region != 'us-east-1':
                create_args['CreateBucketConfiguration'] = {'LocationConstraint': region}
            client.create_bucket(**create_args)
            print(f"Bucket '{name}' created in {region}.")
        else:
            raise

ensure_bucket(s3_client, bucket_name, aws_region)
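
To make the branching on the `head_bucket` error easier to reason about, the decision could be factored into a small pure function like the sketch below. `classify_head_bucket_error` is a hypothetical helper, and the dictionaries passed to it only mimic the shape of `err.response` on a `ClientError`. No AWS call is made.

```python
def classify_head_bucket_error(error_response: dict) -> str:
    """Map an error response dict to 'missing', 'forbidden', or 'other'."""
    error = error_response.get("Error", {})
    code = str(error.get("Code", ""))
    if code in ("404", "NoSuchBucket"):
        return "missing"    # the bucket does not exist, so creation can be attempted
    if code in ("403", "AccessDenied"):
        return "forbidden"  # the name may belong to another account, or access is denied
    return "other"          # anything else should be surfaced to the caller


# These dicts imitate err.response from botocore.exceptions.ClientError.
print(classify_head_bucket_error({"Error": {"Code": "404", "Message": "Not Found"}}))
print(classify_head_bucket_error({"Error": {"Code": "403", "Message": "Forbidden"}}))
```

Treating "forbidden" separately lets the notebook fail with a clear message (pick another name or fix permissions) instead of attempting a creation that is likely to fail.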

## Enable Versioning (Optional but Recommended)

In [None]:
versioning = boto3.resource("s3", region_name=aws_region).BucketVersioning(bucket_name)
versioning.enable()
print(f"Knowledge base bucket versioning status: {versioning.status}")

## Create Evaluation Results Bucket
Provision a separate bucket to capture evaluation outputs and artifacts.

In [None]:
eval_bucket_base_name = "riley-inc-rag-eval-results"
eval_bucket_name = f"{eval_bucket_base_name}-{timestamp_suffix}"
ensure_bucket(s3_client, eval_bucket_name, aws_region)
eval_versioning = boto3.resource("s3", region_name=aws_region).BucketVersioning(eval_bucket_name)
eval_versioning.enable()
print(f"Evaluation results bucket ready: s3://{eval_bucket_name} (versioning={eval_versioning.status})")

## Upload Knowledge Base Documents
This helper uploads all markdown files from `knowledge_base_docs` into the knowledge base bucket under the `knowledge-base/` prefix.

In [None]:
knowledge_base_dir = Path('../knowledge_base_docs')
prefix = 'knowledge-base/'
upload_manifest = []

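# Fail loudly if the source directory is missing; glob() would otherwise yield nothing.
if not knowledge_base_dir.is_dir():
    raise FileNotFoundError(f"Knowledge base directory not found: {knowledge_base_dir.resolve()}")

# Sort for a deterministic upload order and manifest.
for path in sorted(knowledge_base_dir.glob('*.md')):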
    key = prefix + path.name
    s3_client.upload_file(str(path), bucket_name, key)
    upload_manifest.append({'file': path.name, 's3_key': key})
    print(f"Uploaded {path.name} -> s3://{bucket_name}/{key}")

print(json.dumps(upload_manifest, indent=2))
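
The upload cell above only picks up markdown files at the top level of the directory. If the documents were organized in subfolders, the file-to-key mapping could be planned first, as in the sketch below. `plan_uploads` is a hypothetical helper, and the example runs against a temporary directory rather than the real `knowledge_base_docs` folder, so nothing is uploaded.

```python
from pathlib import Path
import tempfile


def plan_uploads(root: Path, prefix: str, pattern: str = "*.md") -> list[dict]:
    """Return a manifest of local files and the S3 keys they would map to."""
    manifest = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            # as_posix() keeps forward slashes in keys, even on Windows.
            relative = path.relative_to(root).as_posix()
            manifest.append({"file": str(path), "s3_key": prefix + relative})
    return manifest


# Demonstrate on a throwaway directory instead of real documents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "guides").mkdir()
    (root / "faq.md").write_text("# FAQ\n")
    (root / "guides" / "setup.md").write_text("# Setup\n")
    (root / "notes.txt").write_text("ignored\n")
    planned_keys = [entry["s3_key"] for entry in plan_uploads(root, "knowledge-base/")]

print(planned_keys)
```

Each planned entry could then be passed to `s3_client.upload_file(entry["file"], bucket_name, entry["s3_key"])`, which is the same call the cell above uses.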

## Cleanup (Optional)
Use this cell to remove all objects and delete the buckets when you no longer need them.

In [None]:
perform_cleanup = False  # Set to True to run cleanup

if perform_cleanup:
    s3_resource = boto3.resource("s3", region_name=aws_region)

    def empty_and_delete(bucket_name_to_remove):
        bucket = s3_resource.Bucket(bucket_name_to_remove)
        print(f"Clearing bucket: {bucket_name_to_remove}")
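        # Deleting every version and delete marker also removes all current objects,
        # so a separate bucket.objects.delete() call is not needed.
        bucket.object_versions.delete()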
        bucket.delete()
        print(f"Bucket {bucket_name_to_remove} deleted.")

    for target in (bucket_name, eval_bucket_name):
        try:
            empty_and_delete(target)
        except ClientError as err:
            print(f"Failed to delete {target}: {err}")
else:
    print("Cleanup skipped. Set perform_cleanup=True to remove buckets.")

## Next Steps
- Store evaluation datasets and prompt libraries in the same bucket.
- Grant access to downstream AWS services (e.g., Bedrock, SageMaker) via bucket policies.
- Enable server-side encryption and lifecycle rules if required by compliance policies.
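
As a rough illustration of the encryption and lifecycle step, the sketch below builds the request payloads and applies them through the S3 client's `put_bucket_encryption` and `put_bucket_lifecycle_configuration` calls. The helper names, the rule ID, and the 30-day retention for noncurrent versions are assumptions chosen for illustration, not requirements from this notebook. Adjust them to your compliance policy.

```python
def build_hardening_config(noncurrent_days: int = 30) -> dict:
    """Build payloads for default encryption and one lifecycle rule (illustrative values)."""
    return {
        "encryption": {
            "Rules": [
                # SSE-S3; a KMS-based setup would use "aws:kms" plus a key ID instead.
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
        "lifecycle": {
            "Rules": [
                {
                    "ID": "expire-noncurrent-versions",  # hypothetical rule name
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},  # empty prefix applies to every object
                    "NoncurrentVersionExpiration": {"NoncurrentDays": noncurrent_days},
                }
            ]
        },
    }


def apply_hardening(client, bucket: str, noncurrent_days: int = 30) -> None:
    """Apply both settings using an S3 client such as boto3.client('s3')."""
    config = build_hardening_config(noncurrent_days)
    client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=config["encryption"],
    )
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=config["lifecycle"],
    )
```

With the client and bucket names defined earlier in this notebook, a call would look like `apply_hardening(s3_client, bucket_name)`. Because versioning is enabled on both buckets, expiring noncurrent versions is one way to keep storage from growing indefinitely.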