# A simple script to move Kaggle files to an Amazon Web Services (AWS) S3 bucket

1. The first code cell reads three values from Kaggle Secrets: "aws_access_key_id", "aws_secret_access_key", and "aws_region". The steps below show how to obtain each of them.

2. **If you do not have an AWS account, sign up here:**
https://aws.amazon.com/ The script below creates the S3 bucket itself, so you do not need to create one by hand.
AWS S3 storage service: https://aws.amazon.com/s3/

3. If (or once) you have an AWS account, [follow these instructions in order](https://docs.aws.amazon.com/powershell/latest/userguide/pstools-appendix-sign-up.html) to obtain your access key ID and secret access key. Save them as a CSV so you can copy the values into the Kaggle Secrets.

4. Then select a region, ideally the one closest to your location:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html#concepts-available-regions The closest region to me is us-east-2; feel free to use it.

5. In the notebook menu above, click Add-ons > Secrets and add the values you gathered in the previous steps.

See the example below. Create one secret per label ("aws_access_key_id", "aws_secret_access_key", "aws_region"), each with the matching value obtained above.
[![Screen-Shot-2022-02-19-at-1-19-18-PM.png](https://i.postimg.cc/Z5BWsY88/Screen-Shot-2022-02-19-at-1-19-18-PM.png)](https://postimg.cc/68K6yX1q)

**If you are a startup you can get free credits here:**
https://aws.amazon.com/startups/
https://aws.amazon.com/activate/

**Depending on the size of your files, the upload can take an hour or two. Enjoy!**
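
To get a rough idea of how much data is about to be uploaded, the input folder can be measured first. This is only a sketch: `total_size_bytes` and `format_gib` are hypothetical helper names, and the `/kaggle/input/` path is taken from the upload loop further down. It reports a size, not an upload time, since the time depends on the connection.

```python
import os


def total_size_bytes(root):
    """Sum the sizes of all regular files under root (0 if root is missing)."""
    total = 0
    for dirname, _, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirname, filename)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def format_gib(num_bytes):
    """Format a byte count as gibibytes with two decimals."""
    return f'{num_bytes / 1024**3:.2f} GiB'


# Same folder the upload loop walks
print(format_gib(total_size_bytes('/kaggle/input/')))
```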

In [None]:
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
aws_id = user_secrets.get_secret("aws_access_key_id")
aws_key = user_secrets.get_secret("aws_secret_access_key")
aws_region = user_secrets.get_secret("aws_region")

In [None]:
# If your secrets were set up correctly, the cell output
# below shows your current region

aws_region
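
Seeing the region only confirms that the secret was read, not that the keys are valid. One way to check the keys themselves before uploading is to ask AWS who they belong to. The sketch below assumes the STS `get_caller_identity` call in boto3; `describe_caller` is a hypothetical helper name, and the commented usage assumes the `aws_id`, `aws_key`, and `aws_region` variables from the cell above.

```python
def describe_caller(sts_client):
    """Return (account_id, arn) for the credentials behind sts_client.

    With invalid keys the underlying call raises an error instead of returning.
    """
    identity = sts_client.get_caller_identity()
    return identity['Account'], identity['Arn']


# Usage in the notebook (assumes the variables from the Secrets cell):
# import boto3
# sts = boto3.client(
#     'sts',
#     aws_access_key_id=aws_id,
#     aws_secret_access_key=aws_key,
#     region_name=aws_region,
# )
# print(describe_caller(sts))
```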

Below you will see:

bucket_name = ''.join(['happywhales', str(uuid.uuid4())])

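Replace happywhales with any prefix you like. S3 bucket names may contain only lowercase letters, digits, dots, and hyphens, so avoid slashes, uppercase letters, and underscores. The UUID suffix keeps the name unique across all of S3.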
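
Because S3 rejects an invalid name only when the bucket is created, it can help to check the name locally first. A minimal sketch, assuming a simplified subset of the general-purpose bucket naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit; no adjacent dots). `is_valid_bucket_name` and `make_bucket_name` are hypothetical helpers, not part of boto3.

```python
import re
import uuid

# Simplified subset of the S3 bucket naming rules (an assumption of this sketch)
_BUCKET_RE = re.compile(r'[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]')


def is_valid_bucket_name(name):
    """Return True if name passes the simplified naming checks."""
    if _BUCKET_RE.fullmatch(name) is None:
        return False
    # Two dots in a row are not allowed
    return '..' not in name


def make_bucket_name(prefix):
    """Append a random UUID to prefix and validate the result."""
    name = f'{prefix}{uuid.uuid4()}'
    if not is_valid_bucket_name(name):
        raise ValueError(f'invalid bucket name: {name!r}')
    return name


print(make_bucket_name('happywhales'))
```

The UUID adds 36 characters, so the prefix has to stay at 27 characters or fewer to fit the 63-character limit.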

In [None]:
import boto3
import uuid
import os

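# Pass the region from the Secrets cell so the bucket is created there
s3 = boto3.resource(
    's3',
    aws_access_key_id=aws_id,
    aws_secret_access_key=aws_key,
    region_name=aws_region,
)

s3_client = boto3.client(
    's3',
    aws_access_key_id=aws_id,
    aws_secret_access_key=aws_key,
    region_name=aws_region,
)
bucket_name = ''.join(['happywhales', str(uuid.uuid4())])
# us-east-1 is the S3 default and must not be passed as a LocationConstraint;
# every other region has to be named explicitly
if aws_region == 'us-east-1':
    bucket_response = s3_client.create_bucket(Bucket=bucket_name)
else:
    bucket_response = s3_client.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={'LocationConstraint': aws_region},
    )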

response = s3_client.list_buckets()

# Output the bucket names
#print('Existing buckets:')
#for bucket in response['Buckets']:
#    print(f'  {bucket["Name"]}')


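input_root = '/kaggle/input/'
for dirname, _, filenames in os.walk(input_root):
    for filename in filenames:
        file_name = os.path.join(dirname, filename)
        # Use the path relative to the input root as the object key, so files
        # with the same name in different folders do not overwrite each other
        object_key = os.path.relpath(file_name, input_root)
        s3_client.upload_file(file_name, bucket_name, object_key)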

# Check the result by listing the objects in the bucket. The response is
# verbose, and a single call returns at most 1,000 objects.

s3_client.list_objects_v2(Bucket=bucket_name)
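
A single `list_objects_v2` call returns at most 1,000 objects, so for larger datasets the listing has to be paged. The sketch below assumes boto3's built-in paginator for `list_objects_v2`; `list_all_keys` is a hypothetical helper name, and the commented usage assumes the `s3_client` and `bucket_name` variables from the cell above.

```python
def list_all_keys(s3_client, bucket_name):
    """Collect every object key in the bucket, following pagination."""
    paginator = s3_client.get_paginator('list_objects_v2')
    keys = []
    for page in paginator.paginate(Bucket=bucket_name):
        # 'Contents' is absent from a page when there are no objects
        for obj in page.get('Contents', []):
            keys.append(obj['Key'])
    return keys


# Usage in the notebook:
# keys = list_all_keys(s3_client, bucket_name)
# print(len(keys), 'objects uploaded')
```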

And that is it! If you have questions, ask in the comments below, and feel free to upvote if you found this useful!