## DynamoDB
First, let's create a DynamoDB table to store streaming Twitter data.
The Twitter `username` attribute will serve as the table's primary key (its partition, or `HASH`, key).

In [1]:
import boto3

dynamodb = boto3.resource('dynamodb')

table = dynamodb.create_table(
    TableName='twitter',
    KeySchema=[
        {
            'AttributeName': 'username',
            'KeyType': 'HASH'
        }
    ],
    AttributeDefinitions=[
        {
            'AttributeName': 'username',
            'AttributeType': 'S'
        }
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 1,
        'WriteCapacityUnits': 1
    }
)

# Wait until AWS confirms that table exists before moving on
table.meta.client.get_waiter('table_exists').wait(TableName='twitter')

# Get data about the table (it should not contain any items yet)
print(table.item_count)
print(table.creation_date_time)

0
2023-04-18 15:59:28.294000-05:00


Let's actually put some items into our table:

In [2]:
table.put_item(
    Item={
        'username': 'macss',
        'num_followers': 100,
        'num_tweets': 5
    }
)

table.put_item(
    Item={
        'username': 'jon_c',
        'num_followers': 10,
        'num_tweets': 0
    }
)

{'ResponseMetadata': {'RequestId': 'SF8BJ9S35VNKI9SQKL7DVB6UJBVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Tue, 18 Apr 2023 20:59:48 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'SF8BJ9S35VNKI9SQKL7DVB6UJBVV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}
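
When many records arrive at once, as they would from a stream, calling `put_item` once per record means one request per item. The sketch below shows one possible alternative using the Table resource's `batch_writer`, which buffers puts and sends them in batches. The helper name `put_users` and the `sample_users` list are hypothetical and only serve as illustration; `table` is assumed to be the Table resource created above.

```python
def put_users(table, users):
    """Write each user dict to the table through a single batch writer.

    `table` is assumed to be a boto3 DynamoDB Table resource, such as the
    `table` object created earlier in this notebook.
    """
    # The batch writer buffers puts, sends them in batches, and
    # resends any unprocessed items automatically.
    with table.batch_writer() as batch:
        for user in users:
            batch.put_item(Item=user)
    return len(users)


# Hypothetical sample records, mirroring the items written above
sample_users = [
    {'username': 'macss', 'num_followers': 100, 'num_tweets': 5},
    {'username': 'jon_c', 'num_followers': 10, 'num_tweets': 0},
]
# put_users(table, sample_users)  # uncomment to run against the real table
```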

We can then easily get items from our table using the `get_item` method and providing our key:

In [3]:
response = table.get_item(
    Key={
        'username': 'macss'
    }
)
item = response['Item']
print(item)

{'num_followers': Decimal('100'), 'username': 'macss', 'num_tweets': Decimal('5')}
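Notice that the numeric attributes come back as `Decimal` objects rather than plain Python integers. If downstream code (for example, `json.dumps`) expects native numbers, one option is a small conversion helper like the sketch below. The name `to_native` is hypothetical, and converting non-integer values to `float` can lose precision, so treat this as a convenience rather than a general solution.

```python
from decimal import Decimal


def to_native(value):
    """Recursively convert Decimal values into int or float."""
    if isinstance(value, Decimal):
        # Whole numbers become int; anything else becomes float
        if value == value.to_integral_value():
            return int(value)
        return float(value)
    if isinstance(value, dict):
        return {key: to_native(val) for key, val in value.items()}
    if isinstance(value, list):
        return [to_native(val) for val in value]
    return value


# Same shape as the item returned by get_item above
item = {'num_followers': Decimal('100'), 'username': 'macss', 'num_tweets': Decimal('5')}
print(to_native(item))
# {'num_followers': 100, 'username': 'macss', 'num_tweets': 5}
```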


We can also update existing items using the `update_item` method:

In [4]:
table.update_item(
    Key={
        'username': 'macss'
    },
    UpdateExpression='SET num_tweets = :val1',
    ExpressionAttributeValues={
        ':val1': 6
    }
)

{'ResponseMetadata': {'RequestId': 'OG3HVBUA0I77UPITF2Q340JHOVVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Tue, 18 Apr 2023 20:59:56 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'OG3HVBUA0I77UPITF2Q340JHOVVV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}

Then, if we take a look again at this item, we'll see that it's been updated:

In [5]:
response = table.get_item(
    Key={
        'username': 'macss'
    }
)
item = response['Item']
print(item)

{'num_followers': Decimal('100'), 'username': 'macss', 'num_tweets': Decimal('6')}
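Setting `num_tweets` to a fixed value works when we already know the new count. For a counter that several writers may bump at the same time, an `ADD` update expression lets DynamoDB do the increment itself, so no read is needed first. The following is a minimal sketch of that idea; the helper name `increment_tweets` is hypothetical, and `table` is assumed to be the Table resource used above.

```python
def increment_tweets(table, username, amount=1):
    """Add `amount` to an item's num_tweets and return the updated attributes."""
    response = table.update_item(
        Key={'username': username},
        # ADD increments a numeric attribute in place
        # (treating it as 0 if the attribute does not exist yet)
        UpdateExpression='ADD num_tweets :inc',
        ExpressionAttributeValues={':inc': amount},
        # Ask DynamoDB to return the attributes changed by this update
        ReturnValues='UPDATED_NEW',
    )
    return response['Attributes']


# increment_tweets(table, 'macss')  # uncomment to run against the real table
```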


Note as well that, even though DynamoDB is not designed for complicated queries, we can write SQL-like (PartiQL) queries and run them against our DynamoDB tables if we want to:

In [6]:
response = table.meta.client.execute_statement(
    Statement='''
              SELECT *
              FROM twitter
              WHERE num_followers > 20
              '''
)
items = response['Items']
print(items)

[{'num_followers': Decimal('100'), 'username': 'macss', 'num_tweets': Decimal('6')}]
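If the threshold comes from user input or another variable, it is safer to pass it as a statement parameter than to format it into the query string. The sketch below assumes a low-level client created with `boto3.client('dynamodb')`, which uses DynamoDB's typed value format (for example `{'N': '20'}`) for both parameters and results. The function name is hypothetical. Keep in mind that filtering on a non-key attribute such as `num_followers` still reads the whole table.

```python
def users_with_followers_above(client, threshold):
    """Run a parameterized PartiQL query against the `twitter` table.

    `client` is assumed to be a low-level client from
    boto3.client('dynamodb'), so values are passed and returned in
    DynamoDB's typed format.
    """
    response = client.execute_statement(
        # The ? placeholder is filled from Parameters, in order
        Statement='SELECT * FROM twitter WHERE num_followers > ?',
        Parameters=[{'N': str(threshold)}],
    )
    # Each attribute is wrapped in its type tag, e.g. {'S': 'macss'}
    return [row['username']['S'] for row in response['Items']]


# import boto3
# print(users_with_followers_above(boto3.client('dynamodb'), 20))
```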


We can use `boto3` to access our DynamoDB database from within other AWS resources (such as Lambda or EC2). For instance, let's create a Lambda function that will process some data (a username, as well as raw follower and tweet data) and write the results of this processing into our database without ever leaving the AWS cloud (see the zipped Lambda deployment package in this directory):

In [7]:
# create Lambda client
aws_lambda = boto3.client('lambda')

# Access our class IAM role, which allows Lambda
# to interact with other AWS resources
iam_client = boto3.client('iam')
role = iam_client.get_role(RoleName='LabRole')

# Read in our zipped deployment package
with open('write_to_dynamodb.zip', 'rb') as f:
    lambda_zip = f.read()

try:
    # If function hasn't yet been created, create it
    response = aws_lambda.create_function(
        FunctionName='write_to_dynamodb',
        Runtime='python3.9',
        Role=role['Role']['Arn'],
        Handler='lambda_function.lambda_handler',
        Code=dict(ZipFile=lambda_zip),
        Timeout=3
    )
except aws_lambda.exceptions.ResourceConflictException:
    # If function already exists, update it based on zip
    # file contents
    response = aws_lambda.update_function_code(
        FunctionName='write_to_dynamodb',
        ZipFile=lambda_zip
    )
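
The contents of the deployment package are not shown in this notebook. As a rough illustration only, a `lambda_function.py` inside the zip might look something like the sketch below. The event layout (a `username` string plus `followers` and `tweets` lists) and the `build_item` helper are assumptions made for this example, not the actual package contents.

```python
def build_item(event):
    """Turn a raw event into the item stored in DynamoDB.

    The event field names used here are hypothetical.
    """
    return {
        'username': event['username'],
        'num_followers': len(event['followers']),
        'num_tweets': len(event['tweets']),
    }


def lambda_handler(event, context):
    # Imported here so build_item can be tried out without AWS access
    import boto3

    table = boto3.resource('dynamodb').Table('twitter')
    item = build_item(event)
    table.put_item(Item=item)
    return {'statusCode': 200, 'body': item}
```

Keeping the processing step in its own function makes it easy to check locally before the package is zipped and uploaded.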

Finally, we make sure to delete the table to avoid extra AWS charges:

In [8]:
table.delete()

{'TableDescription': {'TableName': 'twitter',
  'TableStatus': 'DELETING',
  'ProvisionedThroughput': {'NumberOfDecreasesToday': 0,
   'ReadCapacityUnits': 1,
   'WriteCapacityUnits': 1},
  'TableSizeBytes': 0,
  'ItemCount': 0,
  'TableArn': 'arn:aws:dynamodb:us-east-1:230488219088:table/twitter',
  'TableId': '9ab90478-4e03-46ce-a1f2-0432d63b530f'},
 'ResponseMetadata': {'RequestId': 'UHBJLHN1DB10IFNK61ECQAC103VV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Tue, 18 Apr 2023 21:00:21 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '350',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'UHBJLHN1DB10IFNK61ECQAC103VV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '1548591828'},
  'RetryAttempts': 0}}
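
The response above shows the table in the `DELETING` state, so the deletion has started but not finished. If later cells need the table to be gone, or if the Lambda function created earlier should also be removed, a cleanup step along the following lines could be used. This is a sketch under assumptions: the helper name `clean_up` is hypothetical, `table` is the Table resource on which `delete()` was just called, and `lambda_client` is a client such as the `aws_lambda` object created earlier.

```python
def clean_up(table, lambda_client, function_name='write_to_dynamodb'):
    """Wait for the table deletion to finish, then delete the Lambda function."""
    # Block until DynamoDB reports that the table no longer exists
    waiter = table.meta.client.get_waiter('table_not_exists')
    waiter.wait(TableName=table.name)

    # Remove the Lambda function so it does not linger in the account
    lambda_client.delete_function(FunctionName=function_name)


# clean_up(table, aws_lambda)  # uncomment to run after table.delete()
```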