Skip to content

Set of Airflow DAGs to create and destroy a Cloud Data Fusion instance

Notifications You must be signed in to change notification settings

sscosta/datafusion-on-demand

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Cloud Data Fusion is a managed ETL pipeline service provided on the GCP Big Data stack.

Google bills customers per hour the service is active, not providing a way to turn off the service in periods of inactivity to reduce costs.

This repository contains two airflow dags that I use to create a Data Fusion instance in the morning, then perform all the batch jobs and development i need to perform and destroy the instance in the end of the day.

It can be used to lower costs. Your use case may only require that you have Data Fusion active for 2 hours, therefore reducing costs by 91%.

Example:

Basic Data Fusion Instance (per hour) = 1,8$ (April 2022)

Cost per month 24/7 = 1,8 * 24 * 30 = 1296$

Cost per month 2h per day = 1,8 * 2 * 30 = 108$

About

Set of Airflow DAGs to create and destroy a Cloud Data Fusion instance

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Languages