# AWS DynamoDB

- DynamoDB is a **fully managed NoSQL** database provided by AWS, offering automatic scalability without manual intervention.

- Designed for **high availability and low-latency** performance, DynamoDB can handle millions of requests per second and scales horizontally to accommodate varying workloads.

- Supports **flexible data models**: key-value and document. Items in the same table can carry different attributes, and reads can be eventually consistent or strongly consistent depending on application requirements. Features like secondary indexes, streams, and global tables extend the available access patterns.

- In DynamoDB, a single table can hold multiple entity types. Each item is identified by its primary key, which is either a partition key alone or a composite of a partition key and a sort key.

https://boto3.amazonaws.com/v1/documentation/api/1.20.5/reference/services/dynamodb.html


In [None]:
import boto3
import pandas as pd
import json
import tempfile
import os

## Connections

In [None]:
# Create a session using a specific profile
session = boto3.Session(profile_name='default')

# Access the credentials
credentials = session.get_credentials()

# Access the access key ID and secret access key
aws_access_key_id = credentials.access_key
aws_secret_access_key = credentials.secret_key
region_name = 'us-east-1'

#print(aws_access_key_id)

In [None]:
# Initialize the low-level DynamoDB client from the session, so the full
# credential set (including a session token, if the profile uses one) is applied
dynamodb = session.client('dynamodb', region_name=region_name)

## Create Table

In [None]:
# Table name
table_name = 'QuizData'

# Create the DynamoDB table (create_table returns a response dict, not a Table object)
response = dynamodb.create_table(
    TableName=table_name,
    KeySchema=[
        {
            'AttributeName': 'PK',
            'KeyType': 'HASH'  # Partition key
        },
        {
            'AttributeName': 'SK',
            'KeyType': 'RANGE'  # Sort key
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'PK',
            'AttributeType': 'S'
        },
        {
            'AttributeName': 'SK',
            'AttributeType': 'S'
        }
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

# Wait until the table exists and is ready to use
waiter = dynamodb.get_waiter('table_exists')
waiter.wait(TableName=table_name)

print(f"DynamoDB table '{table_name}' created successfully!")
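Re-running the cell above fails once the table exists. A minimal sketch of an idempotent variant follows; `ensure_table` and `table_spec` are hypothetical names introduced here, and the sketch assumes the client exposes `ResourceInUseException`, which `CreateTable` raises for an already existing table name.

```python
def ensure_table(client, table_spec):
    """Create the table unless it already exists; return True if it was created."""
    try:
        client.create_table(**table_spec)
        created = True
    except client.exceptions.ResourceInUseException:
        # CreateTable raises this when a table with the same name already exists
        created = False

    # Wait until the table is available in either case
    client.get_waiter('table_exists').wait(TableName=table_spec['TableName'])
    return created
```

It could be called as `ensure_table(dynamodb, {...})`, passing the same keyword arguments used in the `create_table` call above as a dictionary.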

## Insert Data

In [None]:
# Initialize the DynamoDB resource (higher-level API that accepts native Python types)
dynamodb_resource = session.resource('dynamodb', region_name=region_name)

In [None]:
# Get a handle to the existing table
table_name = 'QuizData'
table = dynamodb_resource.Table(table_name)

# Sample dummy data
dummy_data = [
    # Sample questions
    {'PK': 'Question', 'SK': 'Question#1', 'Data': {'question_text': 'What is the capital of France?', 'options': ['Paris', 'London', 'Berlin', 'Madrid'], 'correct_answer': 'Paris'}},
    {'PK': 'Question', 'SK': 'Question#2', 'Data': {'question_text': 'What is the capital of Germany?', 'options': ['Paris', 'London', 'Berlin', 'Madrid'], 'correct_answer': 'Berlin'}},
    
    # Sample answers
    {'PK': 'Answer', 'SK': 'Answer#1', 'Data': {'question_id': '1', 'student_id': '1', 'selected_option': 'Paris'}},
    {'PK': 'Answer', 'SK': 'Answer#2', 'Data': {'question_id': '2', 'student_id': '2', 'selected_option': 'Berlin'}},
    
    # Sample students
    {'PK': 'Student', 'SK': 'Student#1', 'Data': {'name': 'John Doe', 'email': 'john@example.com', 'grade': '10'}},
    {'PK': 'Student', 'SK': 'Student#2', 'Data': {'name': 'Jane Smith', 'email': 'jane@example.com', 'grade': '11'}},
    
    # Sample teachers
    {'PK': 'Teacher', 'SK': 'Teacher#1', 'Data': {'name': 'Mr. Brown', 'email': 'brown@example.com', 'department': 'Mathematics'}},
    {'PK': 'Teacher', 'SK': 'Teacher#2', 'Data': {'name': 'Ms. White', 'email': 'white@example.com', 'department': 'Science'}}
]

# Populate the table with dummy data
for item in dummy_data:
    table.put_item(Item=item)

print("Dummy data inserted into DynamoDB table successfully!")
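The loop above issues one request per item. For larger loads, the resource API's `batch_writer` can group the writes. The following is a minimal sketch; `bulk_insert` is a hypothetical helper name, and `table` is assumed to be a boto3 `Table` resource like the one created above.

```python
def bulk_insert(table, items):
    """Write items through the table's batch writer and return how many were queued."""
    count = 0
    # batch_writer buffers the puts, sends them as BatchWriteItem requests,
    # and resends any unprocessed items automatically
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
            count += 1
    return count
```

With the data above, this would be called as `bulk_insert(table, dummy_data)`.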

## Select Data

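The partition key (PK) determines which partition stores an item, which is what lets DynamoDB distribute data and scale. The sort key (SK) orders items within a partition and enables efficient range queries. Together, they form the composite primary key that uniquely identifies each item in the table. The `Scan` used in this section reads every item in the table regardless of key, which is fine for a small demo table but becomes slow and costly as the table grows.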
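To read by key rather than reading the whole table, a `Query` can target one partition and optionally narrow by sort key prefix. The sketch below assumes the `PK`/`SK` string attributes defined earlier and the low-level client; `build_query` and `query_items` are hypothetical helper names.

```python
def build_query(table_name, partition_value, sort_prefix=None):
    """Build keyword arguments for the low-level client's Query operation."""
    params = {
        'TableName': table_name,
        'KeyConditionExpression': 'PK = :pk',
        'ExpressionAttributeValues': {':pk': {'S': partition_value}},
    }
    if sort_prefix is not None:
        # begins_with restricts the sort key within the chosen partition
        params['KeyConditionExpression'] += ' AND begins_with(SK, :prefix)'
        params['ExpressionAttributeValues'][':prefix'] = {'S': sort_prefix}
    return params


def query_items(client, **params):
    """Yield every matching item, following pagination across result pages."""
    paginator = client.get_paginator('query')
    for page in paginator.paginate(**params):
        yield from page['Items']
```

For example, `list(query_items(dynamodb, **build_query('QuizData', 'Student', 'Student#')))` would fetch only the student items from the sample data.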

In [None]:
# Helper function to paginate through DynamoDB results
def paginate_items():
    paginator = dynamodb.get_paginator('scan')
    for page in paginator.paginate(TableName=table_name):
        yield from page['Items']

from boto3.dynamodb.types import TypeDeserializer

try:
    # Convert DynamoDB-typed values (e.g. {'S': 'Paris'}) to native Python
    # types, including nested maps and lists such as the 'Data' attribute
    deserializer = TypeDeserializer()
    items = [
        {name: deserializer.deserialize(value) for name, value in item.items()}
        for item in paginate_items()
    ]

    # Convert items to DataFrame
    df = pd.DataFrame(items)

    # Print DataFrame
    print(df)

except Exception as e:
    print("Error:", e)
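Because each item keeps its entity fields inside a nested `Data` map, the resulting DataFrame has a single column of dictionaries. One way to spread those fields into their own columns is sketched below. It assumes the items are already plain Python dictionaries (not DynamoDB-typed values); `flatten_item` is a hypothetical helper name.

```python
def flatten_item(item, nested_key='Data'):
    """Return a copy of item with the nested map's fields promoted to the top level."""
    # Keep every top-level attribute except the nested map itself
    flat = {key: value for key, value in item.items() if key != nested_key}
    # Prefix promoted fields so they cannot collide with top-level attributes
    for field, value in item.get(nested_key, {}).items():
        flat[f'{nested_key}.{field}'] = value
    return flat
```

Passing a list of flattened items to `pd.DataFrame` gives one column per field, with missing values for fields that an entity type does not have.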