## Setting up the Lambda function

<ol>
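  <li>Log in to AWS and open the IAM Management Console. Create a new role with the <i>Lambda</i> use case and attach the following managed policies:
    <ul>
      <li><i>AmazonS3FullAccess</i></li>
      <li><i>CloudWatchFullAccess</i></li>
    </ul>
    Because the function also starts a Glue job, the role needs permission for <i>glue:StartJobRun</i> as well, if that is not already granted elsewhere.
  </li>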
  <li>Go to the Lambda Console and create a new function, selecting the option <i>Author from scratch</i>.
  <br>
  In the <i>Change default execution role</i> section, select <i>Use an existing role</i> and choose the IAM role you created in the previous step.
  </li>
  <li>Once the function has been created, add an S3 trigger. Select the bucket created earlier as the DMS target, which is where the data will be loaded.</li>
  <li>In the <i>Code</i> tab, replace the default code with the <i>lambda_handler</i> code from this notebook, then click <i>Deploy</i>.</li>
</ol>
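
For orientation, the S3 trigger invokes the function with an event whose `Records` list describes the uploaded object. The sketch below is an illustration only, not the code to deploy: it uses a trimmed, hypothetical event (the bucket name and key are made up, and real events carry more fields) to show where the bucket name and object key live. S3 URL-encodes object keys in event notifications, so decoding with `urllib.parse.unquote_plus` is a common precaution; `extract_location` is a hypothetical helper name.

```python
from urllib.parse import unquote_plus

# Trimmed, hypothetical S3 "ObjectCreated" event; real events carry more fields.
sample_event = {
    "Records": [
        {
            "eventSource": "aws:s3",
            "eventName": "ObjectCreated:Put",
            "s3": {
                # Hypothetical bucket name
                "bucket": {"name": "my-dms-bucket"},
                # Hypothetical key; the "+" stands for an encoded space
                "object": {"key": "public/my+table/LOAD00000001.csv"},
            },
        }
    ]
}


def extract_location(event):
    """Return (bucket, key) for the first record of an S3 event."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = unquote_plus(record["s3"]["object"]["key"])
    return bucket, key


bucket, key = extract_location(sample_event)
print(bucket, key)
```

Only the first record is read here, which matches how the handler in this notebook treats the event.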

In [0]:
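# Import libraries
import json
from urllib.parse import unquote_plus

import boto3

# Create the Glue client once, outside the handler, so warm invocations reuse it
glue = boto3.client('glue')

# Extract the file name (object key) and the name of its S3 bucket from the
# event, then start the Glue job with both as arguments
def lambda_handler(event, context):

    # Only the first record of the S3 event is processed
    record = event["Records"][0]
    bucketName = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications, so decode them
    fileName = unquote_plus(record["s3"]["object"]["key"])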
    
    # Use this same JobName when creating the Glue job in the next step of
    # this project
    response = glue.start_job_run(
        JobName='glueCDC-pyspark',
        Arguments={
            '--s3_target_path_key': fileName,
            '--s3_target_path_bucket': bucketName
        }
    )
    
    # Return the Glue job run ID so each invocation can be traced
    return {
        'statusCode': 200,
        'body': json.dumps({'jobRunId': response['JobRunId']})
    }
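
Before deploying, the handler's logic can be exercised locally without AWS credentials. The sketch below is an illustration under stated assumptions, not part of the original project: `start_glue_job` is a hypothetical helper that mirrors the handler body but takes the Glue client as a parameter, and `FakeGlue` is a hypothetical stand-in that records the call instead of contacting AWS. The event contents and the returned run ID are made up.

```python
class FakeGlue:
    """Hypothetical stand-in for boto3.client('glue'); it only records calls."""

    def __init__(self):
        self.calls = []

    def start_job_run(self, JobName, Arguments):
        self.calls.append({"JobName": JobName, "Arguments": Arguments})
        return {"JobRunId": "jr_fake"}  # made-up run ID


def start_glue_job(glue_client, event, job_name="glueCDC-pyspark"):
    """Mirror of the handler body, with the client injected for testing."""
    record = event["Records"][0]
    return glue_client.start_job_run(
        JobName=job_name,
        Arguments={
            "--s3_target_path_key": record["s3"]["object"]["key"],
            "--s3_target_path_bucket": record["s3"]["bucket"]["name"],
        },
    )


# Hypothetical event with only the fields the handler reads
event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-dms-bucket"},
                "object": {"key": "public/orders/file.csv"},
            }
        }
    ]
}

fake = FakeGlue()
result = start_glue_job(fake, event)
print(fake.calls[0]["Arguments"])
```

Passing a real `boto3.client('glue')` in place of `FakeGlue()` would issue the actual `start_job_run` call, so the same helper could back the deployed handler if that separation is wanted.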