# DynamoDB

DynamoDB is a managed NoSQL database offered by AWS. While [it's not as popular as MongoDB or Apache Cassandra](https://db-engines.com/en/ranking), it has the advantage of being [free-tier-eligible in the Amazon Cloud](https://aws.amazon.com/de/free/?nc2=h_ql_pr_ft&all-free-tier.sort-by=item.additionalFields.SortRank&all-free-tier.sort-order=asc&awsf.Free%20Tier%20Types=*all&awsf.Free%20Tier%20Categories=categories%23databases).

*Note: Amazon Keyspaces is also free-tier-eligible, but only for three months. Also, at the time of writing, boto3's support for DynamoDB seems better than its support for Keyspaces (Cassandra).*

Since it's a NoSQL database, we can store unstructured data without enforcing a fixed schema beyond the key attributes.

# Creating the DynamoDB Table

After logging in to the AWS account and opening the DynamoDB console, we just have to click on "Create table". We will then be prompted to enter a table name, a partition key and, optionally, a sort key.

![dynamo_table](DynamoDB/dynamo_create_table.png)


**What do these fields mean?**

1) **Table Name:** the table is the top-level object in DynamoDB. It will contain all the items we need to store. For now, we will call it test.

2) **Partition Key:** put simply, a partition is a slice of a table that holds related data. This concept is not exclusive to DynamoDB. DynamoDB hashes the value of this key to decide which partition stores an item, which lets it group items and retrieve them faster.

3) **Sort Key:** this is optional at this point. Depending on the entity we store in the table, we might need a sort key to keep all items that share a partition key together in a defined order. A [picture](https://aws.amazon.com/de/blogs/database/choosing-the-right-dynamodb-partition-key/) explains this better than words:

![db_partition](DynamoDB/dynamo_partition.png)
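To make the picture more concrete, here is a toy model in plain Python of what the two keys do. It is only an illustration: the MD5 hash, the number of partitions and the order data are assumptions made for this sketch, and DynamoDB's internal hash function and partition management are not exposed to us.

```python
import hashlib
from collections import defaultdict


def toy_partition(partition_key, num_partitions=3):
    """Map a partition key value to a partition number (toy model, not DynamoDB's hash)."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def store(items, pk_name, sk_name, num_partitions=3):
    """Group items by partition and key value, then sort each group by the sort key."""
    partitions = defaultdict(lambda: defaultdict(list))
    for item in items:
        slot = toy_partition(item[pk_name], num_partitions)
        partitions[slot][item[pk_name]].append(item)
    for groups in partitions.values():
        for group in groups.values():
            group.sort(key=lambda item: item[sk_name])
    return partitions


# Hypothetical data: customer_id plays the partition key, order_date the sort key.
orders = [
    {"customer_id": "c1", "order_date": "2022-03-01"},
    {"customer_id": "c2", "order_date": "2022-01-15"},
    {"customer_id": "c1", "order_date": "2022-01-20"},
]
layout = store(orders, "customer_id", "order_date")

# All orders of customer c1 live in the same partition, ordered by date.
c1_orders = layout[toy_partition("c1")]["c1"]
```

The point of the sketch is that every item with the same partition key value ends up in the same place, and the sort key fixes the order inside that group.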

We will leave the sort key empty for now.
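The same table could also be created from Python instead of the console. The sketch below only builds the arguments for boto3's `create_table` call. The helper name `build_create_table_params` is hypothetical, the numeric key `id` is taken from the examples further down, and the capacity values of 1 are placeholders rather than a recommendation.

```python
def build_create_table_params(table_name, partition_key, partition_type="N",
                              sort_key=None, sort_type="S"):
    """Build keyword arguments for a DynamoDB create_table call (hypothetical helper)."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    attribute_definitions = [
        {"AttributeName": partition_key, "AttributeType": partition_type}
    ]
    if sort_key is not None:
        # The sort key is optional; it is declared as the RANGE part of the key.
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key, "AttributeType": sort_type}
        )
    return {
        "TableName": table_name,
        "KeySchema": key_schema,
        "AttributeDefinitions": attribute_definitions,
        # Placeholder capacity values, chosen only for this sketch.
        "ProvisionedThroughput": {"ReadCapacityUnits": 1, "WriteCapacityUnits": 1},
    }


params = build_create_table_params("test", "id")

# With a boto3 resource (created as in the Boto 3 section), the call would be:
# table = dynamodb.create_table(**params)
# table.wait_until_exists()
```

Only the key attributes are declared here. All other attributes of an item stay undeclared, which is what lets items differ in structure later on.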

# Database Created

And that was it. As mentioned, the beauty of working with NoSQL is that we don't have to define a schema for the data up front. We only need to know enough about the data to tell the database what makes an item in the table unique.

# Communication with the Database

There are several ways to interact with the database. The ones I'd like to show here are:

1) Manually via the AWS Management Console

2) Via the PartiQL Editor in AWS

3) From Python via the boto3 library

# Management Console

We can now select the table we just created. This will show a detailed menu.

![dynamo_table](DynamoDB/dynamo_table_small.png)

By clicking on "Explore table" ("Tabelle erkunden" in the German console shown here), we finally see the option to insert items manually, "Create item" ("Element erstellen").

![dynamo_item](DynamoDB/dynamo_item.png)

Now we can manually input the data. Notice that we have some "main" attributes (first name, last name, email and additional info) and some secondary information stored inside "additional info". This is useful, for example, if a client must enter first name, last name and email in order to subscribe to a given service, but is free to decide whether to provide any further information.

![dynamo_manual](DynamoDB/dynamo_manual.png)

# PartiQL-Editor

Now that we have an item in our test table, we can use the PartiQL editor to check that everything is alright (of course, we can also click our way through this with the tools in the management console).

![dynamo_partiql](DynamoDB/dynamo_partiql.png)

In many ways, PartiQL is similar to SQL, so the statement "SELECT * FROM test" works fine here. For more information on PartiQL, check [this page](https://partiql.org/).

The result, displayed as a table, looks like this:

![dynamo_partiql_result](DynamoDB/dynamo_partiql_result.png)

The only element we have in the table so far was returned, so everything worked out fine. 
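The same statement can also be sent from Python through the `execute_statement` call of boto3's low-level DynamoDB client (boto3 itself is introduced in the next section). This is a minimal sketch: the helper name `run_partiql` is hypothetical, and it reads only the first page of results, although a response may carry a `NextToken` when there is more data.

```python
def run_partiql(client, statement, parameters=None):
    """Send one PartiQL statement through a DynamoDB client and return its items."""
    kwargs = {"Statement": statement}
    if parameters:
        # Parameters replace the "?" placeholders, in DynamoDB's typed JSON format.
        kwargs["Parameters"] = parameters
    response = client.execute_statement(**kwargs)
    return response.get("Items", [])


# Usage, assuming client = boto3.client("dynamodb", region_name="eu-central-1"):
# run_partiql(client, 'SELECT * FROM "test"')
# run_partiql(client, 'SELECT * FROM "test" WHERE id = ?', [{"N": "1"}])
```

Note that the low-level client returns attributes in DynamoDB's typed format, for example `{"id": {"N": "1"}}`, rather than the plain Python values returned by the resource interface used below.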

# Boto 3

While inserting data manually and using PartiQL are good ways to get familiar with the database, a great deal of the interaction with the database will happen through an application. To demonstrate how a script can interact with the database, we can use the boto3 library.

However, in order to access DynamoDB, we first need to generate AWS credentials. For this, we have to create a user group in AWS IAM.

![dynamo_iam](DynamoDB/dynamo_iam.png)

Creating a new user in this group generates the credentials (an access key ID and a secret access key). We have to use them to connect to the database.
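One way to keep the credentials out of the script is to read them from environment variables. The sketch below assumes the standard variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`; the helper name `connection_kwargs` is hypothetical, and the region is the one used in the example that follows.

```python
import os


def connection_kwargs(env=None, region_name="eu-central-1"):
    """Collect the arguments for boto3.resource("dynamodb", ...) from the environment."""
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "region_name": region_name,
    }


# Usage, once both variables are set in the shell:
# dynamodb = boto3.resource("dynamodb", **connection_kwargs())
```

boto3 also picks up these two variables on its own when no keys are passed explicitly, so the helper mainly makes a missing variable fail early with a readable message.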

**Getting Items**

In [69]:
import boto3
import pandas as pd

# The credentials are read from a local CSV file here for simplicity; in practice,
# prefer a safer place such as environment variables or the shared AWS credentials file.
# pandas is only a quick way to load the first row of the file into a dict,
# which makes it easy to pass the values needed to access DynamoDB.
credentials = pd.read_csv(r"C:\Users\celio.picano\AWS\credentials_dynamo.csv").to_dict("records")[0]

# Creating the Connection
dynamodb = boto3.resource("dynamodb", aws_access_key_id = credentials["Access key ID"],
                                      aws_secret_access_key = credentials["Secret access key"],
                                      region_name = "eu-central-1")

# Fetching Results
dynamodb.Table("test").get_item(Key={"id":1})["Item"]

{'additional_info': {'zip': Decimal('123456'),
  'favorite color': 'blue',
  'address': 'Auf dem Holzweg 1',
  'phone': '+49 1234 5678900'},
 'last_name': 'User',
 'id': Decimal('1'),
 'email': 'test.user@web.com',
 'first_name': 'Test'}
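One detail worth knowing: when no item matches the key, the response of `get_item` simply has no `"Item"` entry, so indexing with `["Item"]` raises a `KeyError`. A small wrapper can return `None` instead. The helper name `get_item_or_none` is hypothetical, and `table` stands for `dynamodb.Table("test")`.

```python
def get_item_or_none(table, key):
    """Return the stored item for `key`, or None when no such item exists."""
    response = table.get_item(Key=key)
    # A response for a missing item contains metadata but no "Item" entry.
    return response.get("Item")


# Usage with the table from this section:
# get_item_or_none(dynamodb.Table("test"), {"id": 1})    # the stored item
# get_item_or_none(dynamodb.Table("test"), {"id": 12345})  # None if id 12345 is unused
```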

**Putting Items**

In [72]:
dynamodb.Table("test").put_item(Item = {"id":2,
                                "first_name":"Another",
                                "last_name":"Test",
                                "additional_info": {"favorite color":"green"}})

{'ResponseMetadata': {'RequestId': 'RL9LV20BDDU5UK94ASGSGK8IORVV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Fri, 11 Mar 2022 11:49:31 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'RL9LV20BDDU5UK94ASGSGK8IORVV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}
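By default, `put_item` replaces an existing item that has the same key without any warning. If that is not wanted, a condition expression can make the write fail instead. The sketch below is one way to do it; the helper name `put_if_absent` is hypothetical, and `#pk` is just a placeholder name that is mapped to the key attribute.

```python
def put_if_absent(table, item, key_name="id"):
    """Insert `item` only if no item with the same partition key exists yet."""
    return table.put_item(
        Item=item,
        # The write is rejected when an item with this key is already stored.
        ConditionExpression="attribute_not_exists(#pk)",
        ExpressionAttributeNames={"#pk": key_name},
    )


# Usage with the table from this section:
# try:
#     put_if_absent(dynamodb.Table("test"), {"id": 2, "first_name": "Another"})
# except dynamodb.meta.client.exceptions.ConditionalCheckFailedException:
#     print("An item with id 2 already exists")
```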

**Checking Results**

To see if we could successfully insert a new item, we can scan all the elements in the table:

In [77]:
dynamodb.Table("test").scan()["Items"]

[{'additional_info': {'favorite color': 'green'},
  'last_name': 'Test',
  'id': Decimal('2'),
  'first_name': 'Another'},
 {'additional_info': {'zip': Decimal('123456'),
   'favorite color': 'blue',
   'address': 'Auf dem Holzweg 1',
   'phone': '+49 1234 5678900'},
  'last_name': 'User',
  'id': Decimal('1'),
  'email': 'test.user@web.com',
  'first_name': 'Test'}]

It worked!
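For a table this small, a single `scan` call returns everything. On larger tables, one call returns at most 1 MB of data and signals that more is available through `LastEvaluatedKey`. A minimal sketch of following those pages, with the hypothetical helper name `scan_all`:

```python
def scan_all(table):
    """Collect every item of a table, following LastEvaluatedKey across pages."""
    items = []
    kwargs = {}
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            # No further pages: the scan is complete.
            return items
        # Continue the scan right after the last key of the previous page.
        kwargs["ExclusiveStartKey"] = last_key


# Usage with the table from this section:
# all_items = scan_all(dynamodb.Table("test"))
```

Keep in mind that a scan reads the whole table, so it is best reserved for small tables or occasional checks like this one.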

# Unstructured Data

The cool thing about working with NoSQL, as mentioned, is that we don't have to worry about data structures. DynamoDB allows us to insert items "on the fly" in a completely different structure, as long as each item carries the key attribute id. The id has to be unique, because writing an item with an existing id overwrites the stored one.

In [78]:
dynamodb.Table("test").put_item(Item = {"id":99,
                                       "age":27,
                                       "profession":"Data Engineer"}) # Item with a completely different structure

{'ResponseMetadata': {'RequestId': 'QNUH6MBOIHQ351TBEJMCI3MAA7VV4KQNSO5AEMVJF66Q9ASUAAJG',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'server': 'Server',
   'date': 'Fri, 11 Mar 2022 11:54:41 GMT',
   'content-type': 'application/x-amz-json-1.0',
   'content-length': '2',
   'connection': 'keep-alive',
   'x-amzn-requestid': 'QNUH6MBOIHQ351TBEJMCI3MAA7VV4KQNSO5AEMVJF66Q9ASUAAJG',
   'x-amz-crc32': '2745614147'},
  'RetryAttempts': 0}}

**Checking Results**

In [82]:
for i in dynamodb.Table("test").scan()["Items"]:
    print(i)

{'additional_info': {'favorite color': 'green'}, 'last_name': 'Test', 'id': Decimal('2'), 'first_name': 'Another'}
{'additional_info': {'zip': Decimal('123456'), 'favorite color': 'blue', 'address': 'Auf dem Holzweg 1', 'phone': '+49 1234 5678900'}, 'last_name': 'User', 'id': Decimal('1'), 'email': 'test.user@web.com', 'first_name': 'Test'}
{'profession': 'Data Engineer', 'id': Decimal('99'), 'age': Decimal('27')}
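As the output shows, boto3 returns every number as a `Decimal`, which `json.dumps` cannot serialize directly. If the items are to be passed on as JSON, a small recursive conversion helps. This is a sketch with the hypothetical helper name `to_plain`; it assumes that turning non-integer values into `float` is precise enough for the use case.

```python
import json
from decimal import Decimal


def to_plain(value):
    """Recursively replace Decimal values by int or float so the data is JSON-friendly."""
    if isinstance(value, Decimal):
        # Whole numbers become int, everything else becomes float.
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, dict):
        return {key: to_plain(inner) for key, inner in value.items()}
    if isinstance(value, (list, set)):
        return [to_plain(inner) for inner in value]
    return value


item = {"profession": "Data Engineer", "id": Decimal("99"), "age": Decimal("27")}
as_json = json.dumps(to_plain(item))
```

Since the items do not share a structure, code that reads them should also use `item.get("email")` rather than `item["email"]` for any attribute other than the key.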


# Conclusion

So, this was a very small project to present some features of Amazon DynamoDB and NoSQL. Working with unstructured data can be tricky, because it can quickly become a mess if there is no metadata management or active usage of the data. However, data keeps growing in size and variety, so it's nice to have tools like DynamoDB that help us accommodate this "ever-changing" data and respond to business needs.