This repository contains a set of automation scripts that help deploy a cost tracking solution for EMR on EKS. The implementation details are covered in this blog post.
First, execute deploy-emr-eks-cost-tracking.sh.
The script expects the following arguments, in this order: region, Kubecost version, EKS cluster name, and account ID.
sh deploy-emr-eks-cost-tracking.sh REGION KUBECOST-VERSION EKS-CLUSTER-NAME ACCOUNT-ID
Example
sh deploy-emr-eks-cost-tracking.sh eu-west-1 1.102.0 emreks 111111111
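Because the arguments are positional, it is easy to pass them in the wrong order. The following is a minimal sketch of a pre-flight check and is not part of this repository: the function name `check_deploy_args` and the validation patterns (a region shaped like eu-west-1, an all-digit account ID) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository: sanity-check the four
# positional arguments before handing them to deploy-emr-eks-cost-tracking.sh.
check_deploy_args() {
    if [ "$#" -ne 4 ]; then
        echo "usage: REGION KUBECOST-VERSION EKS-CLUSTER-NAME ACCOUNT-ID" >&2
        return 1
    fi
    # Assumed shape of a region name, e.g. eu-west-1
    case "$1" in
        [a-z][a-z]-[a-z]*-[0-9]) ;;
        *) echo "unexpected region: $1" >&2; return 1 ;;
    esac
    # The account ID must contain digits only
    case "$4" in
        ''|*[!0-9]*) echo "account id must be numeric: $4" >&2; return 1 ;;
    esac
    return 0
}
```

One way to use it is to source the function in a wrapper script and run `check_deploy_args "$@" && sh deploy-emr-eks-cost-tracking.sh "$@"`, so the deploy script only starts when the arguments look plausible.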
Used by Kubecost to store metrics.
A Glue database used to encompass all tables that store cost data.
Two Glue tables:
* One used to store information about CUR data
* One used to store the compute cost related to each job
A Glue crawler:
* Used to crawl CUR data and update the table partitions
A Lambda function that triggers the Glue crawler every time the Cost and Usage Report delivers new data.
Two S3 buckets:
* cost-data-REGION-ACCOUNT_ID: used to store cost data
* aws-athena-query-results-cur-REGION-ACCOUNT_ID: used to store Athena query results
An Amazon Athena workgroup named emreks-compute-cost-exporter. This workgroup is used by Kubecost to query CUR data.
Used by Kubecost and to get the cost per job in EMR on EKS.
Used by Kubecost to get Spot price data.
To delete all the resources created, first empty the two S3 buckets cost-data-REGION-ACCOUNT_ID and aws-athena-query-results-cur-REGION-ACCOUNT_ID, then the Athena workgroup emr-eks-cost-analysis, and last the ECR repository named emreks-compute-cost-exporter. Then execute destroy-emr-eks-cost-tracking.sh:
sh destroy-emr-eks-cost-tracking.sh REGION EKS-CLUSTER-NAME
Example
sh destroy-emr-eks-cost-tracking.sh eu-west-1 emreks
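The bucket-emptying step that precedes the destroy script can be sketched as follows. This helper is hypothetical and not part of this repository: the function name `print_empty_bucket_cmds` is an assumption, and it only prints the AWS CLI commands rather than running them, so they can be reviewed first. Note that `aws s3 rm --recursive` removes current objects only; if versioning is enabled on a bucket, older object versions need separate handling.

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository: print (do not run) the
# AWS CLI commands that empty the two S3 buckets before the destroy script.
print_empty_bucket_cmds() {
    region="$1"
    account_id="$2"
    for prefix in cost-data aws-athena-query-results-cur; do
        # Bucket names follow the PREFIX-REGION-ACCOUNT_ID pattern listed above
        echo "aws s3 rm s3://${prefix}-${region}-${account_id} --recursive"
    done
}
```

For example, `print_empty_bucket_cmds eu-west-1 111111111` prints one `aws s3 rm` command per bucket; the output can be piped to `sh` once it has been checked.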
See CONTRIBUTING for more information.
This library is licensed under the MIT-0 License. See the LICENSE file.