AWS Glue ETL Sample CDK

This project deploys a minimal ETL workload using AWS Glue. It loads data from an Aurora cluster and stores the ETL results in an S3 bucket in Parquet format. The Glue job is simple: it replaces every character in the "content" column of the table with "*". (content: Hello => *****)
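The masking rule can be illustrated with a small pure-Python sketch. This is not the Glue script from the repository; the function names `mask_content` and `mask_rows` are hypothetical and only show the transformation implied by the Hello => ***** example, assuming one "*" per original character.

```python
def mask_content(value):
    """Replace every character of value with '*', keeping the length."""
    if value is None:
        return None  # leave missing values untouched
    return "*" * len(value)


def mask_rows(rows, column="content"):
    """Return copies of the row dicts with the given column masked."""
    return [{**row, column: mask_content(row.get(column))} for row in rows]


if __name__ == "__main__":
    print(mask_rows([{"id": 1, "content": "Hello"}]))
```

In a real Glue job the same rule would be applied to a Spark DataFrame column rather than to Python dicts.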

Deployment

You need to set up your CDK environment first. See Getting started with the AWS CDK.

cdk deploy

Testing

First, you need to create demo data in the Aurora cluster. The stack deploys a Lambda function that inserts 1000 records into the database. Invoke it with the command below.

aws lambda invoke --function-name create-demo-data /dev/null

Next, run the Glue job to perform the ETL. Go to the AWS Glue Console (Jobs) and select AwsGlueEtlSampleCdk. Then click Action and Run job.
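If you prefer a script to the console, the job can also be started through the Glue API. The sketch below assumes boto3 is installed and credentials are configured; the helper name `run_glue_job` and the polling interval are illustrative choices, not part of this project.

```python
import time

# Terminal states reported by the Glue GetJobRun API.
FINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_glue_job(glue, job_name="AwsGlueEtlSampleCdk", poll_seconds=15):
    """Start the job and block until it reaches a terminal state."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state in FINAL_STATES:
            return state
        time.sleep(poll_seconds)


def main():
    # boto3 is third-party; it is imported here so the helper above
    # can be exercised with a stub client when boto3 is unavailable.
    import boto3

    print(run_glue_job(boto3.client("glue")))
```

Calling `main()` prints the final state of the run, which should be SUCCEEDED before you move on to the crawler.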

After the job succeeds, go to the AWS Glue Console (Crawlers) and select AwsGlueEtlSampleCdk. Then click Run crawler.
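The crawler step can be scripted in the same way. This is a minimal sketch, assuming a boto3 Glue client is passed in; `run_crawler` is a hypothetical helper name, and the crawler is treated as finished once its state returns to READY.

```python
import time


def run_crawler(glue, crawler_name="AwsGlueEtlSampleCdk", poll_seconds=15):
    """Start the crawler, wait until it is READY again, and return
    the status of the last crawl (for example "SUCCEEDED")."""
    glue.start_crawler(Name=crawler_name)
    while True:
        crawler = glue.get_crawler(Name=crawler_name)["Crawler"]
        if crawler["State"] == "READY":
            return crawler.get("LastCrawl", {}).get("Status")
        time.sleep(poll_seconds)
```

With boto3 installed, `run_crawler(boto3.client("glue"))` would run the crawler created by this stack.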

After the crawler succeeds, go to Athena (Query) and select AwsDataCatalog as the Data source and mydatabase as the Database. Then enter the following query in the box and click Run query.

SELECT * FROM mytable;

As you can see, the "content" column is masked with "*".

To get the number of records, run the query below.

SELECT COUNT(*) FROM mytable;

You will get 1000 as the result if you invoked the Lambda function once.
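The record count can also be checked from a script through the Athena API. The sketch below assumes a boto3 Athena client and an S3 location for query results that you supply yourself; `count_rows` and its `output_location` parameter are hypothetical and not defined by this project.

```python
import time


def count_rows(athena, output_location, database="mydatabase",
               table="mytable", poll_seconds=2):
    """Run SELECT COUNT(*) on the table and return the count as an int."""
    query_id = athena.start_query_execution(
        QueryString=f"SELECT COUNT(*) FROM {table}",
        QueryExecutionContext={"Catalog": "AwsDataCatalog", "Database": database},
        # Assumption: an S3 URI you own where Athena may write results.
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] == "SUCCEEDED":
            break
        if status["State"] in ("FAILED", "CANCELLED"):
            raise RuntimeError(status.get("StateChangeReason", status["State"]))
        time.sleep(poll_seconds)

    rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    # Row 0 is the column header; row 1 holds the count.
    return int(rows[1]["Data"][0]["VarCharValue"])
```

After a single invocation of the Lambda function, the returned value should match the 1000 shown in the Athena console.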

Cleaning

cdk destroy

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.
