## Redshift Setup with Python SDK (boto3)
This notebook shows how to set up AWS resources (an IAM role, a Redshift cluster, and S3 access) using boto3, the AWS SDK for Python.

Boto3 Documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift.html

---

#### Package Import

---

In [24]:
import boto3
import re
import configparser


In [31]:
#Custom code for copying the AWS credentials file from where VS Code stores it on the local machine to a local Python file,
#which is easier to import as credentials in Python (https://stackoverflow.com/questions/25501403/storing-the-secrets-passwords-in-a-separate-file)

'''
Note: Deprecated. This could be a useful way to store credentials in the future, but it is not worth creating a whole new file
just for these credentials (and it is less secure).
'''
#import os  #needed for os.getcwd() below
#import shutil

# #Find and copy file
# orig_path = "/home/rambino/.aws/credentials"
# #Must be full file path to ensure .gitignore understands it
# new_file_name = "cred_*.py"
# new_file_name = os.getcwd() + "/" + new_file_name

# shutil.copyfile(orig_path,new_file_name)

# #add to local .gitignore (credentials):
# with open("../.gitignore","a") as file:
#     file.write("\n")
#     file.write(new_file_name)


---

#### Loading Credentials from file

---

In [28]:
config = configparser.ConfigParser()

#AWS access keys (profile "udacity_course" in the shared credentials file)
with open("/home/rambino/.aws/credentials") as cred_file:
    config.read_file(cred_file)
aws_key             = config.get('udacity_course','aws_access_key_id')
aws_secret          = config.get('udacity_course','aws_secret_access_key')

#Redshift master user name and password
with open("./redshift_credentials.cfg") as cred_file:
    config.read_file(cred_file)
redshift_user       = config.get('redshift_credentials','UN')
redshift_password   = config.get('redshift_credentials','PW')

config.sections()

['udacity_course', 'redshift_credentials']
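
If a section or key is missing from either file, `config.get` raises a `configparser` error that does not say which file was involved. Below is a minimal sketch of a loader that reports the file path, assuming the same section and key names as above. The helper name `load_option` is hypothetical, and the demo writes a throwaway file with placeholder values rather than touching real credentials.

```python
import configparser
import os
import tempfile


def load_option(path, section, key):
    """Return one option from an INI-style file, with a readable error if absent."""
    parser = configparser.ConfigParser()
    # The context manager closes the file handle even if parsing fails.
    with open(path) as handle:
        parser.read_file(handle)
    if not parser.has_option(section, key):
        raise KeyError(f"[{section}] {key} not found in {path}")
    return parser.get(section, key)


# Demo with a throwaway file (placeholder values, not real credentials).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "redshift_credentials.cfg")
    with open(demo_path, "w") as handle:
        handle.write("[redshift_credentials]\nUN = demo_user\nPW = demo_password\n")
    demo_user = load_option(demo_path, "redshift_credentials", "UN")
    demo_password = load_option(demo_path, "redshift_credentials", "PW")
```

Option names are matched case-insensitively by `configparser`, so `UN` and `un` refer to the same key.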

---

#### Creating IAM role for Redshift

---

In [29]:
iam = boto3.client('iam',
    region_name             = "us-west-2",
    aws_access_key_id       = aws_key,
    aws_secret_access_key   = aws_secret
)

In [12]:
#Create IAM role:

#This trust policy lets the Redshift service (redshift.amazonaws.com) assume the role via "sts:AssumeRole",
#so Redshift can act with whatever permissions are later attached to the role.

import json

dwhRole = iam.create_role(
    Path = "/",
    RoleName =  "RedShift_Impersonation",
    Description = "Allows redshift to access S3",
    AssumeRolePolicyDocument=json.dumps(
        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Action": 'sts:AssumeRole',
                    "Principal":{"Service": "redshift.amazonaws.com"}
                }
            ]
        }
    )
)

dwhRole

EntityAlreadyExistsException: An error occurred (EntityAlreadyExists) when calling the CreateRole operation: Role with name RedShift_Impersonation already exists.
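
The `EntityAlreadyExists` error above appears whenever the cell is re-run after the role has been created. One way to make the step repeatable is to fall back to `get_role` when creation fails. This is a sketch under assumptions: the helper names `build_trust_policy` and `ensure_role` are hypothetical, and the client is passed in so the function works with the `iam` client created earlier.

```python
import json


def build_trust_policy(service="redshift.amazonaws.com"):
    """Trust policy letting an AWS service principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Principal": {"Service": service},
            }
        ],
    }


def ensure_role(iam_client, role_name, description):
    """Create the role if needed and return its ARN either way."""
    try:
        response = iam_client.create_role(
            Path="/",
            RoleName=role_name,
            Description=description,
            AssumeRolePolicyDocument=json.dumps(build_trust_policy()),
        )
    except iam_client.exceptions.EntityAlreadyExistsException:
        # The role is already there, so look it up instead of failing.
        response = iam_client.get_role(RoleName=role_name)
    return response["Role"]["Arn"]
```

In this notebook it could be called as `role_arn = ensure_role(iam, "RedShift_Impersonation", "Allows redshift to access S3")`. Note that an existing role keeps its original trust policy; the sketch does not update it.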

In [21]:
role = iam.get_role(RoleName = "RedShift_Impersonation")
role_arn = role['Role']['Arn']

role_arn

arn:aws:iam::380710778029:role/RedShift_Impersonation


In [73]:
#Attaching IAM policy to the role (which actually gives permissions):

attach_response = iam.attach_role_policy(
    RoleName = "RedShift_Impersonation",
    PolicyArn="arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
)

attach_response

{'ResponseMetadata': {'RequestId': 'adff6259-83f2-4371-8a63-2c18104d452d',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'adff6259-83f2-4371-8a63-2c18104d452d',
   'content-type': 'text/xml',
   'content-length': '212',
   'date': 'Sun, 14 Aug 2022 18:56:57 GMT'},
  'RetryAttempts': 0}}
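
The response above only confirms that the request succeeded (HTTP 200); it does not list what is attached. A small sketch for checking the result, using the IAM client's `list_attached_role_policies` call. The helper name `attached_policy_arns` is hypothetical, and the sketch assumes the role has few enough policies to fit in a single response page.

```python
def attached_policy_arns(iam_client, role_name):
    """Return the ARNs of the managed policies attached to a role."""
    response = iam_client.list_attached_role_policies(RoleName=role_name)
    # Each entry carries both 'PolicyName' and 'PolicyArn'; only the ARN is kept.
    return [policy["PolicyArn"] for policy in response["AttachedPolicies"]]
```

With the client from this notebook, `attached_policy_arns(iam, "RedShift_Impersonation")` should include `arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess` once the attach call has gone through.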

---

#### Creating Redshift cluster

---

In [9]:
redshift = boto3.client('redshift',
    region_name             = "us-west-2",
    aws_access_key_id       = aws_key,
    aws_secret_access_key   = aws_secret
)

In [23]:
#Documentation: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift.html#Redshift.Client.create_cluster
redshift_response = redshift.create_cluster(
    ClusterType = "multi-node",
    NodeType = 'dc2.large',
    NumberOfNodes = 4,
    DBName = "my_redshift_db",
    ClusterIdentifier = 'redshift-cluster-1',
    MasterUsername = redshift_user,
    MasterUserPassword = redshift_password,
    IamRoles = [role_arn]
)

'''
WARNING: Running this cell WILL create a Redshift cluster. Be sure to delete it afterwards to avoid incurring costs!
'''

redshift_response

{'Cluster': {'ClusterIdentifier': 'redshift-cluster-1',
  'NodeType': 'dc2.large',
  'ClusterStatus': 'creating',
  'ClusterAvailabilityStatus': 'Modifying',
  'MasterUsername': 'dev',
  'DBName': 'my_redshift_db',
  'AutomatedSnapshotRetentionPeriod': 1,
  'ManualSnapshotRetentionPeriod': -1,
  'ClusterSecurityGroups': [],
  'VpcSecurityGroups': [{'VpcSecurityGroupId': 'sg-082d4747234261501',
    'Status': 'active'}],
  'ClusterParameterGroups': [{'ParameterGroupName': 'default.redshift-1.0',
    'ParameterApplyStatus': 'in-sync'}],
  'ClusterSubnetGroupName': 'default',
  'VpcId': 'vpc-0055627b0d43048a7',
  'PreferredMaintenanceWindow': 'wed:09:00-wed:09:30',
  'PendingModifiedValues': {'MasterUserPassword': '****'},
  'ClusterVersion': '1.0',
  'AllowVersionUpgrade': True,
  'NumberOfNodes': 4,
  'PubliclyAccessible': True,
  'Encrypted': False,
  'Tags': [],
  'EnhancedVpcRouting': False,
  'IamRoles': [{'IamRoleArn': 'arn:aws:iam::380710778029:role/RedShift_Impersonation',
    'Ap

In [36]:
redshift.describe_clusters()

{'Clusters': [],
 'ResponseMetadata': {'RequestId': '3c3fe77c-7911-4154-ba0b-56172c545bf1',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': '3c3fe77c-7911-4154-ba0b-56172c545bf1',
   'content-type': 'text/xml',
   'content-length': '287',
   'date': 'Mon, 15 Aug 2022 17:33:51 GMT'},
  'RetryAttempts': 0}}
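
The raw `describe_clusters` response is verbose, and an empty `'Clusters'` list (as above) simply means no cluster exists at that moment. The sketch below reduces the response to the two fields usually needed next: the status and the endpoint address. The helper name `cluster_summary` is hypothetical, and the sketch assumes the `Endpoint` entry may be missing while the cluster is still being created.

```python
def cluster_summary(redshift_client, cluster_id):
    """Return (status, endpoint_address) for a cluster, or None if it is not listed."""
    response = redshift_client.describe_clusters()
    for cluster in response["Clusters"]:
        if cluster["ClusterIdentifier"] == cluster_id:
            # Fall back to None when no endpoint has been reported yet.
            endpoint = cluster.get("Endpoint", {}).get("Address")
            return cluster["ClusterStatus"], endpoint
    return None
```

For example, `cluster_summary(redshift, "redshift-cluster-1")` returns `None` for the response shown above. To block until the cluster is usable, boto3 also provides a waiter: `redshift.get_waiter("cluster_available").wait(ClusterIdentifier="redshift-cluster-1")`.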

In [31]:
response = redshift.delete_cluster(
    ClusterIdentifier = 'redshift-cluster-1',
    SkipFinalClusterSnapshot=True
)
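
`delete_cluster` returns as soon as the deletion has started, and it raises an error if the cluster is already gone. Since a forgotten cluster keeps incurring costs, it can be useful to wait for confirmation. This is a sketch, assuming the same `SkipFinalClusterSnapshot=True` choice as above; the helper name `delete_cluster_and_wait` is hypothetical.

```python
def delete_cluster_and_wait(redshift_client, cluster_id):
    """Delete a cluster without a final snapshot; return False if it was already gone."""
    try:
        redshift_client.delete_cluster(
            ClusterIdentifier=cluster_id,
            SkipFinalClusterSnapshot=True,
        )
    except redshift_client.exceptions.ClusterNotFoundFault:
        # Nothing to delete, so there is nothing to wait for either.
        return False
    # Block until AWS reports that the cluster no longer exists.
    waiter = redshift_client.get_waiter("cluster_deleted")
    waiter.wait(ClusterIdentifier=cluster_id)
    return True
```

With the client from this notebook, the call would be `delete_cluster_and_wait(redshift, "redshift-cluster-1")`. The waiter polls `describe_clusters` in the background, so the call can take several minutes.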

---

#### Creating S3 Bucket

---

In [41]:
s3 = boto3.client('s3',
    region_name             = "us-west-2",
    aws_access_key_id       = aws_key,
    aws_secret_access_key   = aws_secret
)

In [52]:
'''
#The original bucket name, "whyWontBucketWork-udacitycourse", was rejected as invalid because
#S3 bucket names may not contain uppercase letters. An all-lowercase name avoids that error:

s3_response = s3.create_bucket(
    Bucket = "whywontbucketwork-udacitycourse",
    CreateBucketConfiguration = {
        'LocationConstraint':'us-west-2'
    }
)
'''

s3_resource = boto3.resource('s3',
    aws_access_key_id       = aws_key,
    aws_secret_access_key   = aws_secret
)
bucket = s3_resource.Bucket("udacitybucket17") #Bucket I made manually previously

#Iterate over files in a bucket:
bucket_data = bucket.objects.all()
for file in bucket_data:
    print(file)

s3.ObjectSummary(bucket_name='udacitybucket17', key='AWS_Code.md')
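
The name `whyWontBucketWork-udacitycourse` from the commented-out `create_bucket` call contains uppercase letters, which S3 bucket names do not allow. A quick local check can catch this before calling AWS. The sketch below covers only part of the S3 naming rules (length, allowed characters, adjacent dots, IP-style names); the function name `is_plausible_bucket_name` is hypothetical, and a name that passes can still be rejected by S3, for example because it is already taken.

```python
import re

# 3 to 63 characters: lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IPV4_LIKE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_plausible_bucket_name(name):
    """Partial check of the S3 bucket naming rules (not exhaustive)."""
    if not _BUCKET_NAME.fullmatch(name):
        return False
    # Adjacent dots are not allowed.
    if ".." in name:
        return False
    # Names formatted like an IPv4 address are not allowed.
    if _IPV4_LIKE.fullmatch(name):
        return False
    return True


rejected = is_plausible_bucket_name("whyWontBucketWork-udacitycourse")
accepted = is_plausible_bucket_name("udacitybucket17")
```

Here `rejected` is `False` because of the uppercase letters, while `accepted` is `True`.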
