# Using an Amazon SageMaker Studio Notebook

This notebook walks you through setting up the environment. It then shows you how to download data from, and upload data to, an Amazon Simple Storage Service (Amazon S3) bucket using the S3Downloader and S3Uploader helper classes from the SageMaker Python SDK.

Start by initializing the environment. To do this, import the required libraries, retrieve your execution role, and create the SageMaker session and clients.

In [None]:
# Import dependencies
import pandas as pd
import boto3

# SageMaker dependencies
import sagemaker
from sagemaker import session
from sagemaker.s3 import S3Downloader, S3Uploader

# AWS Region of the current session
region = boto3.Session().region_name

# The AWS Identity and Access Management (IAM) role that this notebook runs as
role = sagemaker.get_execution_role()
print("Role:", role)

# SageMaker session, plus low-level clients for the SageMaker and SageMaker Runtime APIs
sm_session = session.Session(boto3.Session())
sm = boto3.Session().client("sagemaker")
sm_runtime = boto3.Session().client("sagemaker-runtime")
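
The session created above can also report the default Amazon S3 bucket that SageMaker uses for the current account and Region. The following is a minimal sketch, assuming `sm_session` behaves like a `sagemaker.session.Session`. The helper name `describe_environment` is hypothetical and is not part of the SDK. Be aware that `default_bucket()` may create the bucket if it does not already exist.

```python
def describe_environment(sm_session):
    """Return the Region and default bucket reported by a SageMaker session.

    `describe_environment` is a hypothetical helper, not an SDK function.
    It expects an object that behaves like sagemaker.session.Session.
    """
    return {
        "region": sm_session.boto_region_name,
        # default_bucket() returns the bucket name; the SDK may create
        # the bucket if it does not exist yet.
        "default_bucket": sm_session.default_bucket(),
    }


# In the notebook, after the cell above has run:
# print(describe_environment(sm_session))
```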

Download the dataset from the Amazon S3 bucket using the S3Downloader.download method from the SageMaker Python SDK. For more information about this method, refer to [SageMaker Python SDK S3Downloader](https://sagemaker.readthedocs.io/en/stable/api/utility/s3.html#sagemaker.s3.S3Downloader).

In [None]:
# Find the lab bucket by matching part of its name
bucket = ''
s3 = boto3.resource('s3')
for candidate in s3.buckets.all():
    if 'labdatabucket' in candidate.name:
        bucket = candidate.name
print(bucket)
prefix = 'scripts/data'

# Download iris.csv from the bucket into the local data/ directory
S3Downloader.download(s3_uri=f"s3://{bucket}/{prefix}/iris.csv", local_path='data/')
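
The loop above leaves `bucket` as an empty string when no bucket name matches, which would likely surface later as a less obvious download error. One possible alternative is a small lookup that fails immediately. This is a sketch only: `find_bucket` is a hypothetical helper, and the bucket names in the demonstration are made up so that the example runs without any AWS call.

```python
def find_bucket(bucket_names, marker="labdatabucket"):
    """Return the first bucket name containing `marker`, or raise if none match."""
    for name in bucket_names:
        if marker in name:
            return name
    raise LookupError(f"No bucket name contains {marker!r}")


# Demonstration with made-up bucket names (no AWS call is made here).
example_names = ["example-logs", "lab-labdatabucket-abc123", "example-archive"]
print(find_bucket(example_names))

# In the notebook, the names would come from boto3, for example:
# bucket = find_bucket(b.name for b in s3.buckets.all())
```

Unlike the loop, which keeps the last match, this helper returns the first match. The two behave the same when exactly one bucket name contains the marker.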

The following code loads the dataset into a pandas DataFrame and displays three randomly sampled rows so that you can confirm the download succeeded.

In [None]:
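import pandas as pd

# header=None tells pandas that the file has no header row, so columns get integer names
shape = pd.read_csv("data/iris.csv", header=None)
# Display three randomly sampled rows
shape.sample(3)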
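
Displaying a few sampled rows is a visual check only. If you want explicit checks as well, something like the following sketch could be used. The helper `check_dataset` is hypothetical, and the expected column count of 5 (four measurements plus a label) is an assumption about the lab's iris.csv file. The three rows below are an in-memory stand-in so that the example runs without the downloaded file.

```python
import io

import pandas as pd


def check_dataset(df, expected_columns):
    """Return a list of problems found in the DataFrame (empty if none)."""
    problems = []
    if df.empty:
        problems.append("no rows")
    if df.shape[1] != expected_columns:
        problems.append(f"expected {expected_columns} columns, found {df.shape[1]}")
    missing = int(df.isna().sum().sum())
    if missing:
        problems.append(f"{missing} missing values")
    return problems


# In-memory stand-in for data/iris.csv (assumed layout: 4 numbers + 1 label).
sample_csv = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
demo = pd.read_csv(sample_csv, header=None)
print(check_dataset(demo, expected_columns=5))

# In the notebook: check_dataset(shape, expected_columns=5)
```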

Partition the dataset into training (80 percent) and test (20 percent) splits. Then upload both splits to the Amazon S3 bucket using the S3Uploader.upload method from the SageMaker Python SDK. For more information about this method, refer to [SageMaker Python SDK S3Uploader](https://sagemaker.readthedocs.io/en/stable/api/utility/s3.html#sagemaker.s3.S3Uploader).

In [None]:
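# Randomly select 80 percent of the rows for training; a fixed random_state makes the split reproducible
train_data = shape.sample(frac=0.8, random_state=200)
# The remaining 20 percent of the rows form the test set
test_data = shape.drop(train_data.index)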
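
The sample-then-drop pattern above relies on the DataFrame index being unique, which holds for the default index that `read_csv` assigns. The following sketch checks that the two splits are disjoint and together cover every row. It uses a 150-row stand-in DataFrame (an assumption, chosen to match the usual size of the iris dataset) rather than the lab's file.

```python
import pandas as pd

# Stand-in for the notebook's DataFrame: 150 rows with a default RangeIndex.
df = pd.DataFrame({"value": range(150)})

# Same split pattern as the notebook cell above.
train = df.sample(frac=0.8, random_state=200)
test = df.drop(train.index)

# The two splits should share no rows and should cover the whole dataset.
overlap = train.index.intersection(test.index)
print(len(train), len(test), len(overlap))  # prints: 120 30 0
```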

In [None]:
# Write each split to a local CSV file
train_file = 'train_data.csv'
train_data.to_csv(train_file, index=False, header=True)
test_file = 'test_data.csv'
test_data.to_csv(test_file, index=False, header=True)

# Upload each file and print the returned S3 URI so that it can be reviewed or used elsewhere
s3url = S3Uploader.upload(train_file, f"s3://{bucket}/{prefix}/train")
print(s3url)
s3url = S3Uploader.upload(test_file, f"s3://{bucket}/{prefix}/test")
print(s3url)
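
The destination strings passed to `S3Uploader.upload` above are assembled by hand, so an empty bucket name or a stray slash is easy to miss. A small helper can build them in one place. This is a sketch only: `s3_uri` is a hypothetical function, not part of the SageMaker SDK, and `example-bucket` is a placeholder name.

```python
def s3_uri(bucket, *parts):
    """Join a bucket name and key parts into an s3:// URI without doubled slashes."""
    if not bucket:
        raise ValueError("bucket name is empty")
    # Drop empty parts and trim surrounding slashes before joining.
    cleaned = [p.strip("/") for p in parts if p and p.strip("/")]
    key = "/".join(cleaned)
    return f"s3://{bucket}/{key}" if key else f"s3://{bucket}"


print(s3_uri("example-bucket", "scripts/data", "train"))
# prints: s3://example-bucket/scripts/data/train

# In the notebook, for example:
# S3Uploader.upload(train_file, s3_uri(bucket, prefix, "train"))
```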

### Conclusion

Congratulations! You have successfully initialized your environment, downloaded a file from an Amazon S3 bucket, and uploaded files to it.

### Cleanup

You have completed this notebook. To move to the next part of the lab, do the following:

- Close this notebook file
- Return to the lab instructions