# SageMaker Demo
This notebook is intended to be used with a [SageMaker notebook instance](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html) launched using the following [CloudFormation](https://docs.aws.amazon.com/cloudformation/) template:

- [sagemaker-notebook-cloudformation.yml](https://github.com/managedkaos/jupyter-environment-details/blob/main/sagemaker-notebook-cloudformation.yml)

Together, the CloudFormation template and this notebook demonstrate:

- Attaching an IAM role to a SageMaker instance with policies that allow the instance to use other AWS services
- Using the [Boto3 Python library](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) to create clients for accessing AWS services
- Using boto3 clients to read from [Parameter Store](https://docs.aws.amazon.com/systems-manager/latest/userguide/systems-manager-parameter-store.html) and write to an [S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html)



## UPDATE THE FOLLOWING VALUES FOR THE NOTEBOOK TO WORK CORRECTLY

In [None]:
# Replace with your value for NotebookInstanceName
NotebookInstanceName = "sagemaker-demo-202403112200"

# Replace with your region
region = "us-west-2"

## Install the Boto3 library and initialize clients for S3 and SSM

In [None]:
%pip install --quiet boto3

In [None]:
import boto3
s3_client = boto3.client('s3')
ssm_client = boto3.client('ssm')
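
Before reading from Parameter Store, it can help to confirm that the notebook is running under the IAM role attached by the CloudFormation template. The following is a minimal sketch and not part of the original notebook: `current_identity_arn` and `role_name_from_arn` are hypothetical helper names, and the sketch assumes an STS client created with `boto3.client("sts")`. The STS `get_caller_identity` call itself requires no additional IAM permissions.

```python
def current_identity_arn(sts_client):
    """Return the ARN of the identity behind the client's credentials."""
    return sts_client.get_caller_identity()["Arn"]


def role_name_from_arn(arn):
    """Extract the role name from an assumed-role ARN, or None for other ARNs.

    Assumed-role ARNs look like:
    arn:aws:sts::<account-id>:assumed-role/<role-name>/<session-name>
    """
    # The resource part is everything after the fifth colon
    resource = arn.split(":", 5)[-1]
    parts = resource.split("/")
    if len(parts) >= 2 and parts[0] == "assumed-role":
        return parts[1]
    return None


# Usage in the notebook (assumes boto3 is installed as in the cell above):
# sts_client = boto3.client("sts")
# arn = current_identity_arn(sts_client)
# print(arn, role_name_from_arn(arn))
```

If the printed role name is not the one created by the template, the later Parameter Store and S3 calls are likely to fail with access errors.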

## Create helper functions
- `read_from_parameter_store(name)`
- `write_to_s3(bucket, key, body)`

In [None]:
# Read value from Parameter Store
def read_from_parameter_store(name):
    response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    return response['Parameter']['Value']

# Write data to the S3 bucket
def write_to_s3(bucket, key, body):
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    print(f"\tSuccessfully wrote data to s3://{bucket}/{key}")
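
The `read_from_parameter_store` helper raises an exception when the parameter does not exist, for example when `NotebookInstanceName` has not been updated. As a hedged sketch of one way to handle that case, the hypothetical helper below (`read_parameter_or_default` is not part of the original notebook) takes the SSM client as an argument and returns a default value when the service reports `ParameterNotFound`.

```python
def read_parameter_or_default(ssm_client, name, default=None):
    """Read a Parameter Store value, returning `default` if it does not exist."""
    try:
        response = ssm_client.get_parameter(Name=name, WithDecryption=True)
    except ssm_client.exceptions.ParameterNotFound:
        # The parameter name is wrong or the stack did not create it
        return default
    return response["Parameter"]["Value"]


# Usage in the notebook (assumes the ssm_client created earlier):
# bucket = read_parameter_or_default(ssm_client, "/my-instance/S3Bucket")
# if bucket is None:
#     print("Parameter not found; check NotebookInstanceName")
```

Other failures, such as missing IAM permissions, still raise, so they are not silently hidden behind the default.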


## Read from SSM Parameter Store

In [None]:
# Constructed Parameter Names
s3_bucket_name_parameter = f"/{NotebookInstanceName}/S3Bucket"

# Get the S3 bucket name from Parameter Store
bucket_name = read_from_parameter_store(s3_bucket_name_parameter)
print(f"\tS3 Bucket Name from Parameter Store: {bucket_name}")

## Do something really cool in the following cell...

In [None]:
print("Hello, World!")

## Generate data...

In [None]:
import os

data_directory = "./data"

os.makedirs(data_directory, exist_ok=True)

for i in range(0, 10):
    file_name = os.path.join(data_directory, f"data-{i}.html")
    with open(file_name, 'w') as file:
        file.write(f"sample data for file {i}")
        print(f"\tWrote data {i} to {file_name}")

## Upload data to S3 and create `index.html`

In [None]:
import fnmatch
import subprocess

website = f"http://{bucket_name}.s3-website-{region}.amazonaws.com"

# Walk the current directory tree and use fnmatch to collect every file that ends in ".html"
file_list = []
for root, dirnames, filenames in os.walk("."):
    for filename in fnmatch.filter(filenames, "*.html"):
        file_list.append(os.path.join(root, filename))

# Sort the file list alphabetically
file_list.sort()

# Create the HTML file and write the header
with open(os.path.join(".", "index.html"), "w") as f:
    f.write(
        """<html>
        <head>
            <title>Praxis 2023 HTML Output</title>
            <style>
                table {
                    border-collapse: collapse;
                    width: 100%;
                }
                th, td {
                    text-align: left;
                    padding: 8px;
                }
                th {
                    background-color: #007bff;
                    color: #fff;
                    font-weight: bold;
                }
                tr:nth-child(even) {
                    background-color: #f2f2f2;
                }
                tr:hover {
                    background-color: #ddd;
                }
            </style>
        </head>
        <body>
            <table>
                <tr><th>Name</th><th>Size</th></tr>\n
    """
    )

    # Loop through each file and add a row to the table
    for file_name in file_list:
        if file_name in ["./index.html"]:
            continue

        file_size = os.path.getsize(file_name)
        f.write(
            f'<tr><td><a href="{website}/{file_name}" target="_blank" rel="noopener noreferrer">{file_name}</a></td><td>{file_size:,} bytes</td></tr>\n'
        )

    # Write the footer; the `with` block closes the file automatically
    f.write("</table></body></html>")

command = [
    "aws",
    "s3",
    "sync",
    ".",
    f"s3://{bucket_name}",
    "--exclude",
    "*",
    "--include",
    "*.html",
    "--no-progress",
]

# Run the command and wait for it to complete
output = subprocess.run(command, capture_output=True, text=True)

# Print the output, and surface any error reported by the AWS CLI
print(output.stdout)
if output.returncode != 0:
    print(output.stderr)
print("fin")
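
The cell above shells out to the AWS CLI. As an alternative, the upload can be done with the boto3 S3 client directly. The following is a minimal sketch under stated assumptions: `upload_html_files` is a hypothetical helper that is not part of the original notebook, it takes the S3 client as an argument, and it uploads every `.html` file on each call rather than skipping unchanged files as `aws s3 sync` does.

```python
import mimetypes
import os


def upload_html_files(s3_client, bucket, root="."):
    """Upload every .html file under `root` and return the sorted list of keys."""
    uploaded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(".html"):
                continue
            local_path = os.path.join(dirpath, filename)
            # Use the path relative to `root` as the object key, with "/" separators
            key = os.path.relpath(local_path, root).replace(os.sep, "/")
            # Set the content type so the website endpoint serves the file as HTML
            content_type = mimetypes.guess_type(local_path)[0] or "text/html"
            s3_client.upload_file(
                local_path, bucket, key, ExtraArgs={"ContentType": content_type}
            )
            uploaded.append(key)
    return sorted(uploaded)


# Usage in the notebook (assumes s3_client and bucket_name from earlier cells):
# for key in upload_html_files(s3_client, bucket_name):
#     print(f"\tUploaded {key}")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stand-in object before pointing it at a real bucket.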

## Read the bucket contents

In [None]:
objects = s3_client.list_objects_v2(Bucket=bucket_name)

print(f"Contents of bucket {bucket_name}:")
for obj in objects.get('Contents', []):
    print(f"\t{obj['Key']}")
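
A single `list_objects_v2` call returns at most 1,000 keys, which is enough for this demo but not for larger buckets. The sketch below shows one way to list every key with the client's built-in paginator. `list_all_keys` is a hypothetical helper name, not part of the original notebook, and it takes the S3 client as an argument.

```python
def list_all_keys(s3_client, bucket, prefix=""):
    """Return every object key in the bucket, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # An empty bucket (or empty page) has no "Contents" entry
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Usage in the notebook (assumes s3_client and bucket_name from earlier cells):
# for key in list_all_keys(s3_client, bucket_name):
#     print(f"\t{key}")
```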

In [None]:
from IPython.display import display, Markdown

markdown_text = f"""
## Conclusion
Use the following link to view the data in the S3 bucket's website:

- {website}
"""

display(Markdown(markdown_text))
