This project aims to build a complete cloud-based DevOps ETL process. It includes the following parts:
- Cloud Infrastructure
- Jenkins on ECS
- Airflow on EKS
- Airflow framework (wrapper)
- Jenkins DevOps Pipeline
- Glue ETL Common Solution
- Multi-account architecture
- Front end development & design
- Backend development & design
- DB development & design
- User/Role Management Architecture
- Network/Security Architecture
- DevOps Architecture
- Infrastructure Level DevOps
- Project Level DevOps
- Project Architecture
- ETL framework/solution
- Data Visualization (Power BI)
CEDC stands for Cloud-based ETL DevOps process of Community.
- DevOps Account: the DevOps account, which mainly hosts Jenkins and Airflow
- Data Account: the data lake account, which mainly hosts S3
- Serverless Account: the ETL account, which mainly hosts Glue, Lambda, etc.
- IDP Account: the identity account, which can assume into the A/B/C accounts through a User role or an Admin role
Note: for the first draft, all services can be deployed centrally into a single account for demo purposes.
- Parameter driven framework
- Check dependencies
- Kickoff
- Monitor
- Job Retry
- Notify
- Metadata backend
- Deploy Airflow DAGs and Glue jobs in a project
- Onboarding/Offboarding
- Data validation
- Convert SQL to Glue PySpark
- Glue
- Lambda
- S3
- CloudWatch Events
- CloudWatch Logs
- Secrets Manager
- wip ...
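
The kickoff, monitor, job retry, and notify steps listed above could be wired together roughly as follows. This is only a minimal sketch, not the project's actual wrapper: the function name `run_with_retry` and its three callables (`kickoff`, `get_state`, `notify`) are hypothetical, and the state names assume the job is a Glue job run.

```python
import time
from typing import Callable

# Terminal Glue job run states (assumption: the wrapped job is a Glue job run).
SUCCESS_STATES = {"SUCCEEDED"}
FAILURE_STATES = {"FAILED", "TIMEOUT", "STOPPED", "ERROR"}


def run_with_retry(
    kickoff: Callable[[], str],          # hypothetical: starts the job, returns a run id
    get_state: Callable[[str], str],     # hypothetical: returns the current state of a run
    notify: Callable[[str], None],       # hypothetical: sends a message (mail, chat, ...)
    max_attempts: int = 3,
    poll_seconds: float = 30.0,
) -> bool:
    """Kick off a job, monitor it until it ends, retry on failure, and notify."""
    for attempt in range(1, max_attempts + 1):
        run_id = kickoff()
        state = get_state(run_id)
        # Monitor: poll until the run reaches a terminal state.
        while state not in SUCCESS_STATES | FAILURE_STATES:
            time.sleep(poll_seconds)
            state = get_state(run_id)
        if state in SUCCESS_STATES:
            notify(f"run {run_id} succeeded on attempt {attempt}")
            return True
        # Retry: report the failure and start a new run if attempts remain.
        notify(f"run {run_id} ended in {state} (attempt {attempt}/{max_attempts})")
    return False
```

Passing the three steps in as callables keeps the loop independent of any particular AWS client, so the same sketch can be driven by parameters from a metadata backend or exercised with stubs.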
Glue job naming standard:
- <project_name>_<table_name or process_name>_prelanding
- <project_name>_<table_name or process_name>_landing
- <project_name>_<table_name or process_name>_landing_merge
- <project_name>_<table_name or process_name>_refinement
- <project_name>_<table_name or process_name>_publish
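
The naming standard above can be enforced with a small helper. The sketch below is illustrative only: the function names `glue_job_name` and `stage_of` are hypothetical, and the sample project and table names in the usage note are made up.

```python
# Stage suffixes taken from the naming standard above.
STAGES = ("prelanding", "landing", "landing_merge", "refinement", "publish")


def glue_job_name(project_name: str, entity_name: str, stage: str) -> str:
    """Build <project_name>_<table_name or process_name>_<stage>."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage {stage!r}, expected one of {STAGES}")
    return f"{project_name}_{entity_name}_{stage}"


def stage_of(job_name: str):
    """Return the stage suffix of a job name, or None if it breaks the standard."""
    for stage in STAGES:
        suffix = "_" + stage
        # Require a non-empty prefix in front of the stage suffix.
        if job_name.endswith(suffix) and len(job_name) > len(suffix):
            return stage
    return None
```

For example, `glue_job_name("sales", "orders", "landing_merge")` returns `"sales_orders_landing_merge"`, and `stage_of("sales_orders_adhoc")` returns `None`, which a deployment pipeline could treat as a naming violation.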
- Serverless Account: Glue job execution role -> DEVOPS_GLUE_CEDC_EXECUTION (a cross-account role that lets Airflow trigger Glue jobs in Account C)
- DevOps Account: DEVOPS_GLUE_CEDC_READ / DEVOPS_GLUE_CEDC_ADMIN (read-only or admin)
- IDP Account: CI/CD role -> DEVOPS_CICD_CEDC (assumes admin access to all accounts for now)
- Data Account: DEVOPS_S3_CEDC_READ / DEVOPS_S3_CEDC_ADMIN
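
As an illustration of the cross-account role above, Airflow could assume DEVOPS_GLUE_CEDC_EXECUTION through STS and then start a Glue job in the Serverless Account. This is a sketch under assumptions, not the project's confirmed mechanism: the helper names, the session name `airflow-glue-trigger`, and the `client_factory` parameter are hypothetical. The boto3 calls used are `assume_role` on an STS client and `start_job_run` on a Glue client.

```python
from typing import Any, Callable, Optional

EXECUTION_ROLE = "DEVOPS_GLUE_CEDC_EXECUTION"  # role name from the list above


def role_arn(account_id: str, role_name: str) -> str:
    """Build the IAM role ARN for a role in the given account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def start_glue_job(
    sts_client: Any,                       # e.g. boto3.client("sts") in the DevOps account
    account_id: str,                       # id of the Serverless Account
    job_name: str,
    region: str,
    client_factory: Optional[Callable[..., Any]] = None,  # hypothetical hook for testing
) -> str:
    """Assume the execution role, start the Glue job, and return its run id."""
    if client_factory is None:
        import boto3  # imported lazily so the helper can be stubbed without boto3
        client_factory = boto3.client
    creds = sts_client.assume_role(
        RoleArn=role_arn(account_id, EXECUTION_ROLE),
        RoleSessionName="airflow-glue-trigger",  # assumption: any descriptive name works
    )["Credentials"]
    # Create a Glue client that acts with the temporary cross-account credentials.
    glue = client_factory(
        "glue",
        region_name=region,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    return glue.start_job_run(JobName=job_name)["JobRunId"]
```

The role's trust policy would need to allow the caller's identity to assume it; that configuration is outside the scope of this sketch.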