![https://pieriantraining.com/](../PTCenteredPurple.png)


## Capstone Project: AWS Media Library Management System

In this capstone project, we integrate AWS services with Python through Boto3. The goal is to build a system that lets users upload media files (images, videos, and audio), which are then processed, cataloged, and stored on AWS.

This project offers the opportunity to apply your knowledge of AWS and Boto3 in a practical scenario.


### Components:

1. S3 Bucket - To store the uploaded media files.
2. DynamoDB - To store the metadata of each media file.
3. Lambda Functions - To process media files after upload.
4. Searching - To query the stored metadata and download the matching files.


## Tasks

### 1. Infrastructure
1. Create an S3 bucket for storing media files.
2. Set up a DynamoDB table to store the metadata of the media files. Optionally, add a secondary key (for example, a secondary index) on the file type.
3. Configure or verify the IAM roles and permissions needed for Lambda, S3, and DynamoDB.

**Create S3 Bucket**

In [None]:
### Create S3 Bucket ###
import boto3
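As a starting point for the bucket, the sketch below shows one possible way to call `create_bucket`. It is not the reference solution: the helper name `create_media_bucket`, passing the client in as an argument, and the bucket name in the usage comment are assumptions made for illustration. It handles the fact that `us-east-1` does not accept an explicit `LocationConstraint`.

```python
def create_media_bucket(s3_client, bucket_name, region="us-east-1"):
    """Create an S3 bucket in the given region (hypothetical helper)."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and rejects an explicit LocationConstraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_client.create_bucket(**kwargs)


# Usage with a real client (requires AWS credentials; names are placeholders):
# import boto3
# s3_client = boto3.client("s3", region_name="eu-west-1")
# create_media_bucket(s3_client, "my-media-library-bucket", region="eu-west-1")
```

Passing the client in keeps the helper easy to try with a stub object before touching a real account.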

**Create Table Definition**

In [None]:
### Basic Table Definition ###
table_name = # TODO
attributes = [

    # TODO. Feel free to use as many attributes as you want


]

key_schema = [
    
    # TODO
]

provisioned_throughput = {
    'ReadCapacityUnits': 5,
    'WriteCapacityUnits': 5
}
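One possible table layout is sketched below, assuming a string attribute `id` as the partition key and an optional global secondary index on `file_type`. The function name `build_table_definition`, the attribute names, and the index name `file_type-index` are illustrative assumptions, not part of the exercise. Note that DynamoDB only expects attribute definitions for attributes that appear in a key schema.

```python
def build_table_definition(table_name):
    """Return keyword arguments for create_table (hypothetical layout)."""
    throughput = {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
    return {
        "TableName": table_name,
        # Only attributes used in the table key or an index key are declared.
        "AttributeDefinitions": [
            {"AttributeName": "id", "AttributeType": "S"},
            {"AttributeName": "file_type", "AttributeType": "S"},
        ],
        # 'id' is assumed to be the partition (HASH) key.
        "KeySchema": [{"AttributeName": "id", "KeyType": "HASH"}],
        "ProvisionedThroughput": dict(throughput),
        # Optional secondary index that allows querying by file type.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "file_type-index",
                "KeySchema": [{"AttributeName": "file_type", "KeyType": "HASH"}],
                "Projection": {"ProjectionType": "ALL"},
                "ProvisionedThroughput": dict(throughput),
            }
        ],
    }
```

The returned dictionary can be unpacked into the call, for example `dynamo_client.create_table(**build_table_definition("media"))`, where `"media"` is a placeholder table name.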


**Create the table**

In [None]:
# Create the table
# Assumes dynamo_client = boto3.client('dynamodb') has been created beforehand

response = dynamo_client.create_table(
    #TODO
)
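Table creation is asynchronous, so it can help to wait until the table exists before writing to it. The sketch below is one way to do that with the `table_exists` waiter; the helper name `create_table_and_wait` and the injected client are assumptions made for illustration.

```python
def create_table_and_wait(dynamo_client, **table_definition):
    """Create a table and block until it exists (hypothetical helper)."""
    response = dynamo_client.create_table(**table_definition)
    # The waiter polls DescribeTable until the table has been created.
    waiter = dynamo_client.get_waiter("table_exists")
    waiter.wait(TableName=table_definition["TableName"])
    return response
```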

### 2. Media Upload

1. Use Boto3 to create a Python function to upload media files to the S3 bucket.


In [None]:
from pathlib import Path
def upload_to_s3(local_file_path, bucket_name):
    """
    Uploads a file to an S3 bucket.

    Args:
    - local_file_path (str): Path to the local file to be uploaded.
    - bucket_name (str): Name of the S3 bucket where the file should be uploaded.

    Returns:
    - str: Path to the uploaded file in the S3 bucket in the format "bucket_name/filename".
    """
    # TODO
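A minimal sketch of the upload step is shown below. It is only one possible approach: it adds an `s3_client` parameter that the template signature does not have, and it assumes the object key is simply the file name. `upload_file` is the Boto3 S3 client method that takes a local path, a bucket name, and a key.

```python
from pathlib import Path


def upload_to_s3(local_file_path, bucket_name, s3_client):
    """Upload a local file and return "bucket_name/filename" (sketch)."""
    path = Path(local_file_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    # Assumption: the object key is the bare file name, without directories.
    key = path.name
    s3_client.upload_file(str(path), bucket_name, key)
    return f"{bucket_name}/{key}"
```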


### 3. Processing Media
1. Create a Lambda function that extracts the metadata from the media (file type, size) and saves it to the DynamoDB table.
2. The function should be triggered once a file is uploaded to the S3 bucket.
3. Save your code to a .py file.


In [None]:
### Lambda code. Similar to lambda.py. Do not run this code here, as it will raise an exception ###

### Required lambda imports and clients
from pathlib import Path
import boto3

dynamodb = boto3.client('dynamodb')
###

### Extract metadata from the file ###
def extract_metadata(event):
    """
    Extracts metadata from an S3 event.

    Args:
    - event (dict): The S3 event from which metadata is to be extracted.

    Returns:
    - tuple: A tuple containing:
        - bucket (str): The name of the S3 bucket.
        - key (str): The key (path) of the object in the S3 bucket.
        - file_type (str): The file type (extension) of the object, or "None" if it doesn't have an extension.
        - size (int): The size of the object in bytes.

    Note:
    Assumes the event is an S3 PutObject event and has the appropriate structure.
    """

    # TODO

### Add metadata to database. Use file identifier as id ###
def add_to_database(bucket, key, file_type, size):
    """
    Adds metadata information to a DynamoDB table.

    Args:
    - bucket (str): The name of the S3 bucket where the file is located.
    - key (str): The key (path) of the object in the S3 bucket.
    - file_type (str): The file type (extension) of the object.
    - size (int): The size of the object in bytes.

    Outputs:
    - Prints the response from the DynamoDB put_item operation.
    """

    # TODO

### Lambda handler routine ###
def lambda_handler(event, context):
    """
    AWS Lambda function handler that processes an S3 event, extracts metadata 
    from the event, and then adds the metadata to a DynamoDB table.

    Args:
    - event (dict): The S3 event triggered when a new object is added to the bucket.
    - context (obj): AWS Lambda context object (not used, but included as it's a standard parameter).

    Outputs:
    - Prints the file type and size (in kilobytes) of the uploaded object.
    - Calls the `add_to_database` function to store the metadata in DynamoDB.
    - Prints the response from the DynamoDB operation within the `add_to_database` function.
    """

    # TODO
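To make the event structure concrete, the sketch below parses an S3 notification record and builds an item in the low-level DynamoDB client format. It is an illustration under assumptions, not the reference solution: the helper `build_item`, the attribute names `id`, `file_type`, and `size`, and the lower-casing of the extension are choices made here. Object keys in S3 event notifications are URL-encoded, so the key is decoded first.

```python
from pathlib import Path
from urllib.parse import unquote_plus


def extract_metadata(event):
    """Return (bucket, key, file_type, size) from the first S3 record."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Keys arrive URL-encoded (for example, spaces become '+').
    key = unquote_plus(record["object"]["key"])
    suffix = Path(key).suffix
    # Assumption: store the extension in lower case without the dot.
    file_type = suffix[1:].lower() if suffix else "None"
    size = record["object"]["size"]
    return bucket, key, file_type, size


def build_item(bucket, key, file_type, size):
    """Build a typed item for the low-level client (hypothetical helper)."""
    return {
        "id": {"S": f"{bucket}/{key}"},
        "file_type": {"S": file_type},
        # Numbers are sent as strings in the low-level format.
        "size": {"N": str(size)},
    }


# A trimmed-down example event with only the fields used above.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "photos/my+photo.JPG", "size": 2048}}}
    ]
}
```

With the low-level client, the item would then be written with a call such as `dynamodb.put_item(TableName=..., Item=build_item(...))`.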


**Save your lambda code as a .py file**

In [2]:
# TODO

**Create an IAM role for the Lambda function. Note that it needs S3 read access as well as DynamoDB PutItem access.**

In [None]:
# Create an IAM role for the Lambda function. It needs S3 read access as well as DynamoDB PutItem access

lambda_execution_policy = {
    # TODO
}

role_response = iam_client.create_role(
    # TODO
)

iam_client.put_role_policy(
    # TODO
)

# Get the ARN of the created role
role_arn = # TODO
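The two policy documents involved can be sketched as plain dictionaries. The version below is an assumption-laden illustration: the function names, the role and policy names in the usage comment, and the choice to also grant CloudWatch Logs permissions (so that `print` output from the function is captured) are not prescribed by the exercise.

```python
import json


def build_trust_policy():
    """Trust policy that lets the Lambda service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lambda.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_execution_policy(bucket_name, table_arn):
    """Inline policy: read objects, put items, write logs (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject"],
             "Resource": f"arn:aws:s3:::{bucket_name}/*"},
            {"Effect": "Allow", "Action": ["dynamodb:PutItem"],
             "Resource": table_arn},
            # Assumption: logging permissions are wanted for debugging.
            {"Effect": "Allow",
             "Action": ["logs:CreateLogGroup", "logs:CreateLogStream",
                        "logs:PutLogEvents"],
             "Resource": "arn:aws:logs:*:*:*"},
        ],
    }


# Usage with a real IAM client (names are placeholders):
# role_response = iam_client.create_role(
#     RoleName="media-lambda-role",
#     AssumeRolePolicyDocument=json.dumps(build_trust_policy()),
# )
# iam_client.put_role_policy(
#     RoleName="media-lambda-role",
#     PolicyName="media-lambda-policy",
#     PolicyDocument=json.dumps(build_execution_policy(bucket_name, table_arn)),
# )
# role_arn = role_response["Role"]["Arn"]
```

Both IAM calls expect the policy as a JSON string, which is why the dictionaries go through `json.dumps`.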

**Create the Lambda function**<br />
Note that you first need to read your function code from the saved .py file into `function_code`.

In [3]:
# Read in function_code
with open(# TODO)
    function_code # TODO

In [None]:
### Create Lambda function ###
function_name = # TODO

import io
import zipfile


with io.BytesIO() as deployment_package:
    with zipfile.ZipFile(deployment_package, 'w') as zipf:
        zipf.writestr(# TODO)

    create_function_response = lambda_client.create_function(
       # TODO
    )
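The in-memory packaging step can be sketched on its own, independent of AWS. The helper name `build_deployment_package` is hypothetical, and the default file name follows the `lambda.py` mentioned earlier. One detail worth noting is that the bytes are read only after the `ZipFile` has been closed, because the archive's central directory is written on close.

```python
import io
import zipfile


def build_deployment_package(function_code, filename="lambda.py"):
    """Return the bytes of a zip archive containing one source file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zipf:
        zipf.writestr(filename, function_code)
    # Read the buffer after the archive is closed so it is complete.
    return buffer.getvalue()


# Usage with a real Lambda client (values are placeholders):
# lambda_client.create_function(
#     FunctionName=function_name,
#     Runtime="python3.12",
#     Role=role_arn,
#     Handler="<module name>.lambda_handler",
#     Code={"ZipFile": build_deployment_package(function_code)},
# )
```

The `Handler` value is the module name (the file name without `.py`) followed by the handler function name.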


**Create Inline Permission**

In [None]:
### Inline Permission ###

bucket_arn = # TODO
lambda_client.add_permission(
     # TODO
 )
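This permission is a statement on the function's resource-based policy that allows S3 to invoke it. A hedged sketch of the arguments follows; the helper name `build_invoke_permission` and the default statement ID are assumptions, while the parameter names are those accepted by the Lambda client's `add_permission`.

```python
def build_invoke_permission(function_name, bucket_name,
                            statement_id="s3-invoke-permission"):
    """Return add_permission arguments letting S3 invoke the function."""
    return {
        "FunctionName": function_name,
        "StatementId": statement_id,
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        # Restrict the permission to events coming from this bucket.
        "SourceArn": f"arn:aws:s3:::{bucket_name}",
    }


# Usage with a real client:
# lambda_client.add_permission(**build_invoke_permission(function_name, bucket_name))
```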


**Define Event Configuration**

In [None]:
### Define the event configuration ###
event_configuration = {
    # TODO
}

# Configure the S3 event trigger
s3_client.put_bucket_notification_configuration(
    # TODO
)
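For orientation, a notification configuration that invokes a Lambda function on every object creation might look like the sketch below. The helper names and the injected client are assumptions. Keep in mind that `put_bucket_notification_configuration` replaces the bucket's existing notification configuration rather than appending to it.

```python
def build_event_configuration(lambda_function_arn):
    """Notification configuration for object-created events (sketch)."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_function_arn,
                # Fires for uploads made via put, post, copy, or multipart.
                "Events": ["s3:ObjectCreated:*"],
            }
        ]
    }


def configure_trigger(s3_client, bucket_name, lambda_function_arn):
    """Attach the configuration to the bucket (hypothetical helper)."""
    return s3_client.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_event_configuration(lambda_function_arn),
    )
```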



### 4. Upload Data
Let's upload some data. You can use the data provided in the **data** directory.

In [None]:
for data in Path("data/").glob("*"):
    # TODO
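Since `glob("*")` also yields subdirectories, it can be worth filtering for regular files before uploading. The sketch below is one way to do it; `list_media_files` and `upload_directory` are hypothetical helpers, and `upload` stands for any callable that takes a path, such as a wrapper around your upload function.

```python
from pathlib import Path


def list_media_files(directory="data"):
    """Return the regular files directly inside a directory, sorted."""
    return sorted(p for p in Path(directory).glob("*") if p.is_file())


def upload_directory(directory, upload):
    """Call upload(path) for every file and collect the results."""
    return [upload(path) for path in list_media_files(directory)]
```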

### 5. Search and Retrieve Media

1. Create a Python function using Boto3 that allows users to search the DynamoDB table based on various parameters (file type, size, etc.).
2. The function should return a list of files that match the search criteria.
3. Allow users to download the file from the S3 bucket using a pre-signed URL.

In [None]:
def search_dynamodb(file_type=None, size=0):
    """
    Searches a DynamoDB table for media files based on their file type and/or size.

    Args:
    - file_type (str, optional): The file type (extension) to filter by. Defaults to None.
    - size (int, optional): The minimum size (in kilobytes) to filter by. Defaults to 0.

    Returns:
    - list: A list of items (dicts) from the DynamoDB table that match the search criteria.
    """

    # TODO
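One way to approach the search is a filtered `scan` with the low-level client, sketched below. Several things here are assumptions: the helper names, the attribute names `file_type` and `size`, and that sizes are stored in bytes (so the kilobyte threshold is multiplied by 1024). Expression attribute name placeholders are used to avoid clashes with DynamoDB reserved words, and the loop follows `LastEvaluatedKey` because a single scan call may return only part of the table.

```python
def build_scan_kwargs(table_name, file_type=None, size=0):
    """Build scan arguments filtering by type and minimum size in KB."""
    conditions, names, values = [], {}, {}
    if file_type is not None:
        conditions.append("#ft = :ft")
        names["#ft"] = "file_type"
        values[":ft"] = {"S": file_type}
    if size > 0:
        conditions.append("#sz >= :sz")
        names["#sz"] = "size"
        # Assumption: sizes are stored in bytes, the argument is in KB.
        values[":sz"] = {"N": str(size * 1024)}
    kwargs = {"TableName": table_name}
    if conditions:
        kwargs["FilterExpression"] = " AND ".join(conditions)
        kwargs["ExpressionAttributeNames"] = names
        kwargs["ExpressionAttributeValues"] = values
    return kwargs


def scan_all(dynamo_client, **scan_kwargs):
    """Run a scan and follow pagination until all items are collected."""
    items = []
    while True:
        page = dynamo_client.scan(**scan_kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return items
        scan_kwargs["ExclusiveStartKey"] = last_key
```

A scan reads the whole table and filters afterwards, which is fine for a small exercise table; with a secondary index on the file type, a `query` against that index would be the more efficient option.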

In [None]:
def generate_presigned_urls(query_response_list, expiration=300):
    """
    Generates presigned URLs for items in a given DynamoDB query response list.

    Args:
    - query_response_list (list): A list of items (dicts) from a DynamoDB query response.
    - expiration (int, optional): The number of seconds the presigned URL is valid for. Defaults to 300 seconds.

    Returns:
    - dict: A dictionary where:
        - key: The 'id' (in the format "bucket/key") of the item from the query response.
        - value: The generated presigned URL for the corresponding item.
    """

    # TODO
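The presigned URL step could look like the sketch below. It assumes that items come from the low-level client (so the identifier is found under `item["id"]["S"]`), that the identifier has the form "bucket/key", and it adds an `s3_client` parameter that the template signature does not have. `generate_presigned_url` is the Boto3 S3 client method used for this purpose.

```python
def generate_presigned_urls(query_response_list, s3_client, expiration=300):
    """Map each item id ("bucket/key") to a presigned download URL."""
    urls = {}
    for item in query_response_list:
        item_id = item["id"]["S"]
        # Split only on the first slash: keys may contain slashes themselves.
        bucket, _, key = item_id.partition("/")
        urls[item_id] = s3_client.generate_presigned_url(
            "get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=expiration,
        )
    return urls
```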