Data modelling demo with DynamoDB using terraform for infrastructure management.
This demostration goes from the raw data to designing and planning a DyanmoDB store for the given access patterns provided.
Design an efficient dynamodb with the data and the provided access patterns.
Please note this is work in process.
Access Patterns:
- Given ID, return all attributes for store (ID: 1957).
- Return all stores that do not observe daylight savings time; return just state and store ID.
- Given a phone area code, return all stores with formatted addresses and phone numbers (Area Code: 206).
- Return all attributes for all stores with Starbucks and CVS.
- Given state, return all attributes for all stores (State: WA).
- Given state and city, return all attributes for all stores (State: WA; City: Seattle).
- Given state city and zip, return all attributes for all stores (State: WA; City: Seattle; Zip: 98125).
Core Concepts:
- Leverage AWS DynamoDB GSI / LSI
- Leverage AWS Querying
- DynamoDB Design with GSI Sparse indexes
- DynamoDB Design with GSI Composite keys
Technologies:
- AWS DynamoDB
- Terraform (>= v0.12.24)
- Python / Typescript
- Start local DynamoDB & Create Table(for local testing)
# Start local dynamodb
docker compose up
# Create table using terraform (Use Option 1 or 2)
# Option 1, Env: local (Mock dynamoDB)
terraform plan -auto-approve -var-file=./env/local.tfvars
terraform apply -auto-approve -var-file=./env/local.tfvars
# Option 2, Env: Dev (Use a real service from AWS)
terraform plan -var-file=./env/dev.tfvars
terraform apply -auto-approve -var-file=./env/dev.tfvars
# IMPORTANT: Remember to destroy resources after finishing with testing it otherwise you will incur charges
terraform destroy -auto-approve -var-file=./env/dev.tfvars
Note: -var-file=./env/local.tfvars
points the dynamodb to the local endpoint if you want to use local dyamodb (local dev)
- Create the insert json (Using python)
python ./scripts/generate.py
- Insert Into Table (Using Node.js)
# Install packages
yarn
# Insert data into db
yarn run build && yarn run insert-db
- Identify 'Entities' based on the access patterns
Query Entities:
- Store
- Identify 'Attributes' based on the access patterns
Query Attributes:
- ID (number)
- IsDaylightSavingsTimeRecognized (string)
- Address.FormattedAddress (string)
- PhoneNumber (string)
- Address.Subdivision, "state" (string)
- Address.City (string)
- AllCapability (string[])
- PostalCode (string)
- Identify 'Attribute' restrictions
- ID must be unique (Store ID)
- Given state, return all attributes for all stores (State: WA).
This be the PK/SK of the base table
Index Type | PK | SK | Projections |
---|---|---|---|
BASE | Address.Subdivision | ID | None (Base Table) |
AWS CLI query:
# Change the --endpoint-url if using a different environment
# Note: Remove the --select "Count" if you like to see all projections from DB
aws dynamodb query \
--endpoint-url "http://localhost:8000/" \
--table-name Store \
--select "COUNT" \
--index-name "Query-By-Store-State" \
--key-condition-express "#state = :v" \
--expression-attribute-names "{\"#state\":\"Address.Subdivision\"}" \
--expression-attribute-values "{\":v\": {\"S\": \"WA\"}}"
- Given ID, return all attributes for store (ID: 1957)
Index Type | PK | SK | Projections |
---|---|---|---|
GSI | ID | Address.Subdivision | ALL |
Implementation: ID = '<Store-ID>'
- Return all stores that do not observe daylight savings time; return just state and store ID.
Index Type | PK | SK | Projections |
---|---|---|---|
GSI | IsDaylightSavingsTimeRecognized | None | KEYS (Base PK/SK) |
Implementation: IsDaylightSavingsTimeRecognized = 'TRUE' | 'FALSE'
- Given a phone area code, return all stores with formatted addresses and phone numbers (Area Code: 206).
Index Type | PK | SK | Projections |
---|---|---|---|
LSI | X.locale | PhoneNumber | All |
Implementation: X.locale = '<locale>' and begins_with('(<area-code>)')
- Return all attributes for all stores with Starbucks and CVS.
Index Type | PK | SK | Projections |
---|---|---|---|
LSI | Starbucks | CVS | KEYS (Base PK/SK) |
- Add new Fields
Starbucks
("EXISTS" or blank) - Add new Fields
CVS
("EXISTS" or blank)
Note: For any item in a table, DynamoDB writes a corresponding index entry only if the index sort key value is present in the item. We will leverage the sparse indexing.
AWS CLI query:
# Change the --endpoint-url if using a different environment
# Note: Remove the --select "Count" if you like to see all projections from DB
aws dynamodb query \
--endpoint-url "http://localhost:8000/" \
--select "COUNT" \
--table-name Store \
--projection-expression "ID" \
--index-name "Query-By-Starbucks-And-CSV" \
--key-condition-express "#sb = :v and #cvs = :v" \
--expression-attribute-names "{\"#sb\":\"Starbucks\", \"#cvs\":\"CVS\"}" \
--expression-attribute-values "{\":v\": {\"S\": \"True\"}}"
Index Type | PK | SK | Projections |
---|---|---|---|
GSI | IsDaylightSavingsTimeRecognized | None | KEYS (Base PK/SK) |
Implementation: IsDaylightSavingsTimeRecognized = 'FALSE'
- Given state and city, return all attributes for all stores (State: WA; City: Seattle).
Index Type | PK | SK | Projections |
---|---|---|---|
LSI | Address.Subdivision | CityZip | KEYS (Base PK/SK) |
Pre-computed key (composite key): Address.Subdivision
and Address.PostalCode
named CityZip
separated by #
Implementation: state = '<State>' and CityZip begins_with('<city>')
- Given state city and zip, return all attributes for all stores (State: WA; City: Seattle; Zip: 98125).
Same as #7.
Pre-computed key (composite key): Address.Subdivision
and Address.PostalCode
named CityZip
separated by #
Implementation: state = '<State>' and CityZip begins_with('<city>#<zip>')
Below are the final list of data we need to extract from the raw data to insert into dyanmoDB to support the query patterns above.
Query Attributes:
- ID (number)
- IsDaylightSavingsTimeRecognized (string)
- Address.FormattedAddress (string)
- PhoneNumber, FaxNumber (string)
- Address.Subdivision, "state" (string)
- Address.City (string)
- AllCapability (string[])
- Address.PostalCode (string)
- X.locale (string)
Custom Attributes:
CitZip
- Address.Subdivision#Address.PostalCode(string)Starbucks
(string)CVS
(string)
GSI:
- Query-By-Store-State - PK: Address.Subdivision, SK: ID
- Query-By-DayLightSavings - PK: IsDaylightSavingsTimeRecognized, SK: None
- Query-By-Phone-Area-Code - PK: X.Locale, SK: PhoneNumber
- Query-By-Starbucks-And-CSV - PK: Starbucks, SK: CVS (Sparse indexing)
LSI:
- Query-By-State-CityZip - PK: ADdress.Subdivision (by default), SK: CityZip