AWS CDK-TypeScript project to showcase an Athena-based solution for S3 data analysis.

san99tiago/aws-cdk-athena-s3-workflow

💫 AWS CDK ATHENA S3 WORKFLOW 💫


This is a fun Athena-based project deployed on AWS with Infrastructure as Code on top of AWS CDK (TypeScript). The project deploys and automatically configures the AWS Glue and Athena resources (Workgroup, Database, Table and Named Queries), so that some "Raw Data" (found at cdk/sample_data), stored in a Raw Data S3 bucket, can be queried with Athena Named Queries in an SQL-like approach, with the results automatically stored in a separate "Results" bucket. It also deploys a sample role that the Glue service can use in Crawlers.
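
To make the wiring between the workgroup, the named query and the "Results" bucket more concrete, here is a minimal, dependency-free sketch of the CloudFormation-shaped resources that a stack like this one might synthesize. It is not the repository's actual CDK code: the function name, bucket name, database, table and query below are all hypothetical placeholders chosen for illustration, and only the resource types and property names follow the CloudFormation Athena resources.

```typescript
// NOTE: every name below is hypothetical; none of them come from the repository.
interface AthenaSketchProps {
  resultsBucketName: string; // bucket that receives the query results
  databaseName: string; // Glue database that holds the table
  tableName: string; // Glue table that points at the raw data
  workGroupName: string; // Athena workgroup that runs the queries
}

interface CfnResource {
  Type: string;
  Properties: Record<string, unknown>;
}

// Builds CloudFormation-shaped definitions for an Athena workgroup and a
// named query that runs inside that workgroup.
function buildAthenaResources(props: AthenaSketchProps): {
  workGroup: CfnResource;
  namedQuery: CfnResource;
} {
  return {
    workGroup: {
      Type: "AWS::Athena::WorkGroup",
      Properties: {
        Name: props.workGroupName,
        WorkGroupConfiguration: {
          // Make every query in the workgroup use the output location below.
          EnforceWorkGroupConfiguration: true,
          ResultConfiguration: {
            OutputLocation: `s3://${props.resultsBucketName}/results/`,
          },
        },
      },
    },
    namedQuery: {
      Type: "AWS::Athena::NamedQuery",
      Properties: {
        Name: "sample-select-all",
        Database: props.databaseName,
        WorkGroup: props.workGroupName,
        QueryString: `SELECT * FROM "${props.databaseName}"."${props.tableName}" LIMIT 10;`,
      },
    },
  };
}

const resources = buildAthenaResources({
  resultsBucketName: "example-results-bucket",
  databaseName: "example_db",
  tableName: "example_raw_data",
  workGroupName: "example-workgroup",
});

// Print the sketch so it can be compared with a synthesized template.
console.log(JSON.stringify(resources, null, 2));
```

Enforcing the workgroup configuration is one way to guarantee that results always land in the "Results" bucket, regardless of client-side settings; whether the project does exactly this is an assumption.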

This repository draws on many online resources, so feel free to use it as a guide for your future projects!

AWS CDK ☁️

AWS Cloud Development Kit is an amazing open-source software development framework to programmatically define cloud-based applications with familiar languages.

My personal opinion is that you should learn CDK once you feel comfortable with cloud-based solutions that use IaC on top of AWS CloudFormation. At that point, if you need to enhance your architectures, it's a good moment to adopt CDK-based solutions.

The best way to start is from the Official AWS Cloud Development Kit (AWS CDK) v2 Documentation.

Dependencies 🚦

Software dependencies (based on project)

  • Visual Studio Code
    Visual Studio Code is my main code editor for high-level programming. It is not strictly necessary, but in my experience it performs well and integrates easily with Git and GitHub.

  • Node.js
    Node.js is a JavaScript runtime built on Chrome's V8 JavaScript engine. The community is amazing, and it lets us handle async functionality in elegant ways.

Libraries and Package dependencies (based on project)

Usage 💫

Project deployment commands are explained in detail at important_commands.sh, including the necessary steps to configure CDK and do the deployments.

Special thanks 🎁

  • Thanks to all contributors of the great open-source projects that I am using.

Author 🎹

Santiago Garcia Arango

Senior DevOps Engineer passionate about advanced cloud-based solutions and deployments in AWS. I am convinced that today's greatest challenges must be solved by people who love what they do.