### Part 1: AWS S3 & Sourcing Datasets
Republish [this open dataset](https://download.bls.gov/pub/time.series/pr/) in Amazon S3 and share with us a link.
You may run into 403 Forbidden errors as you test accessing this data. There is a way to comply with the BLS data access policies and regain access to fetch this data programmatically. We have included some hints on how to do this in the Q/A section at the bottom of this README.
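
A common way to avoid the 403 is to send a descriptive `User-Agent` header with contact details on every request. The sketch below assumes that this is what the BLS policy asks for (the README's Q/A section remains the authority). The helper names, the agent string, and the contact address are hypothetical.

```python
import urllib.request

SOURCE_URL = "https://download.bls.gov/pub/time.series/pr/"


def build_headers(contact_email):
    """Return headers that identify the script and a contact address."""
    # The agent string format is an illustrative assumption, not a BLS requirement.
    return {"User-Agent": f"bls-s3-sync/0.1 (contact: {contact_email})"}


def build_request(url, contact_email):
    """Build (but do not send) a GET request carrying the identifying headers."""
    return urllib.request.Request(url, headers=build_headers(contact_email))


def fetch(url, contact_email, timeout=30):
    """Send the request and return the response body as bytes."""
    request = build_request(url, contact_email)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

The same dictionary can be passed to `requests.get(url, headers=...)` if the `requests` library is preferred.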

#### Required sub-steps to accomplish this:
1. Set up the S3 bucket environment.
2. Be able to read from and publish to the S3 environment.
3. Troubleshoot errors.

- Key notes: This follows a data lake design. Create a landing zone where data is uploaded and organized under its upload date. The goal is simply to have a starting point for bringing data into S3.
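
One way to organize uploads under their upload date is to put the date into the object key. This is a minimal sketch; the `upload_date=YYYY-MM-DD/` layout and the `landing_key` helper are assumptions for illustration, not a layout prescribed by the task.

```python
from datetime import date


def landing_key(file_name, upload_date, prefix="landing-zone/"):
    """Build an S3 key that places a file under its upload date.

    The 'upload_date=' partition naming is an assumed convention.
    """
    return f"{prefix}upload_date={upload_date.isoformat()}/{file_name}"


# Example: key for a file uploaded today
example_key = landing_key("pr.data.0.Current", date.today())
```

Keys built this way group every file from one run under the same prefix, which makes it easy to list or reprocess a single day's upload.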

In [1]:
import boto3
import os
from dotenv import load_dotenv

In [2]:
load_dotenv('.env')

True

In [7]:
access_key = os.getenv('AWS_ACCESS_KEY')
secret_key = os.getenv('AWS_SECRET_ACCESS_KEY')

In [8]:
s3_client = boto3.client(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)

In [9]:
response = s3_client.list_buckets()

if 'Buckets' in response:
    buckets = response['Buckets']
    for bucket in buckets:
        print(bucket['Name'])

2021-04-02-ep-website
aws-emr-resources-386175835981-us-east-2
aws-logs-386175835981-us-east-2
bigbatbucket
canvas-bucket-iris-4123
dbdatalocation
ed-exp-cost-and-usage
eddysfistbuck
edwardplatagschoolcap
eplatacapstonedata
eplatacapstoneipynb
qep-sports-betting-bucket
rearc-datalake-bucket
redditdatacollectionwemeta
sagemaker-soln-ddf-js-2ruwg4-386175835981-us-east-1
sagemaker-soln-ddf-js-2seloa-386175835981-us-east-1
sagemaker-soln-ddf-js-2sf3s6-386175835981-us-east-1
sagemaker-soln-ddf-js-44xdya-386175835981-us-east-1
sagemaker-soln-documents-js-4htc2a-us-east-1-386175835981
sagemaker-studio-386175835981-l4ayz3cscdq
sagemaker-studio-386175835981-l9tzph12na
sagemaker-studio-386175835981-zzqac2052o
sagemaker-us-east-2-386175835981
someonesbucket
ss-discord-group-minecraft-bucket


The listing confirms that we have access to S3, including `rearc-datalake-bucket`.

#### Part 1: AWS S3 & Sourcing Datasets
1. Republish [this open dataset](https://download.bls.gov/pub/time.series/pr/) in Amazon S3 and share with us a link.
    - You may run into 403 Forbidden errors as you test accessing this data. There is a way to comply with the BLS data access policies and regain access to fetch this data programmatically. We have included some hints on how to do this in the Q/A section at the bottom of this README.
2. Script this process so the files in the S3 bucket are kept in sync with the source when data on the website is updated, added, or deleted.
    - Don't rely on hard-coded names. The script should be able to handle added or removed files.
    - Ensure the script doesn't upload the same file more than once.
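
The sync requirement above can be sketched as a comparison of two listings: what the source currently offers and what the bucket already holds. The `plan_sync` helper and the idea of a per-file "fingerprint" (for example a size, timestamp, or hash) are assumptions for illustration; the file names in the usage example are sample values.

```python
def plan_sync(source_files, stored_files):
    """Decide which files to upload and which to delete.

    Both arguments map file name -> fingerprint (any comparable value,
    such as a content hash). Fingerprints are assumed to be non-None.
    """
    # Upload files that are new or whose fingerprint changed at the source.
    to_upload = sorted(
        name for name, fingerprint in source_files.items()
        if stored_files.get(name) != fingerprint
    )
    # Delete files that the bucket holds but the source no longer lists.
    to_delete = sorted(name for name in stored_files if name not in source_files)
    return to_upload, to_delete


source = {"pr.class": "a1", "pr.data.0.Current": "b2", "pr.txt": "c3"}
stored = {"pr.class": "a1", "pr.data.0.Current": "old", "pr.gone": "z9"}
uploads, deletions = plan_sync(source, stored)
```

Because unchanged files never appear in `uploads`, running the script twice in a row does not upload the same file again.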

In [12]:
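import os
import re
from urllib.parse import urljoin

import boto3
import requests

# AWS credentials (loaded from .env by load_dotenv in an earlier cell)
access_key = os.getenv('AWS_ACCESS_KEY')
secret_key = os.getenv('AWS_SECRET_ACCESS_KEY')

# S3 bucket and landing zone details
bucket_name = 'rearc-datalake-bucket'
landing_zone_prefix = 'landing-zone/'

# Create an S3 client
s3_client = boto3.client(
    's3',
    aws_access_key_id=access_key,
    aws_secret_access_key=secret_key
)

# Check if the landing zone exists, create it if it doesn't
response = s3_client.list_objects_v2(
    Bucket=bucket_name,
    Prefix=landing_zone_prefix
)

if 'Contents' not in response:
    s3_client.put_object(
        Bucket=bucket_name,
        Key=landing_zone_prefix
    )
    print(f"Landing zone '{landing_zone_prefix}' created in bucket '{bucket_name}'.")

# Fetch the source listing. Requests that do not identify themselves may be
# rejected with 403 Forbidden, so send a User-Agent with contact details.
# The contact address below is a placeholder; replace it with your own.
url = 'https://download.bls.gov/pub/time.series/pr/'
headers = {'User-Agent': 'rearc-datalake-sync (contact: your-name@example.com)'}
response = requests.get(url, headers=headers, timeout=30)

if response.status_code == 200:
    # The listing is an HTML index page, so collect link targets rather than
    # splitting on newlines or filtering on a hard-coded extension.
    hrefs = re.findall(r'href="([^"]+)"', response.text, flags=re.IGNORECASE)
    for href in hrefs:
        file_url = urljoin(url, href)
        # Keep only files inside the source directory (skips the parent link)
        if not file_url.startswith(url) or file_url.endswith('/'):
            continue
        file_name = file_url.split('/')[-1]
        file_response = requests.get(file_url, headers=headers, timeout=60)
        file_response.raise_for_status()
        s3_client.put_object(
            Bucket=bucket_name,
            Key=f"{landing_zone_prefix}{file_name}",
            Body=file_response.content
        )
        print(f"Uploaded '{file_name}' to landing zone.")
else:
    print(f"Listing request failed with status {response.status_code}.")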
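
The upload loop above sends every file on each run. One way to skip unchanged files is to compare an MD5 of the downloaded body with the object's ETag. This sketch assumes the objects were written with a single `put_object` call and without SSE-KMS encryption, in which case S3 reports the ETag as the MD5 hex digest wrapped in double quotes; multipart uploads use a different ETag format and would need another check. The helper names are hypothetical.

```python
import hashlib


def md5_hex(body):
    """Return the MD5 hex digest of a bytes payload."""
    return hashlib.md5(body).hexdigest()


def needs_upload(body, stored_etag):
    """Return True when the payload differs from what S3 already stores.

    stored_etag is the ETag string of the existing object, or None when
    the object does not exist yet.
    """
    if stored_etag is None:
        return True
    # S3 returns the ETag surrounded by double quotes; strip them first.
    return stored_etag.strip('"') != md5_hex(body)
```

In the notebook, `stored_etag` could come from `s3_client.head_object(Bucket=bucket_name, Key=key)['ETag']`, treating a 404 `ClientError` as "not stored yet".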
