Integration

Azure: Microsoft Azure

Airflow has limited support for Microsoft Azure.

Logging

Airflow can be configured to read and write task logs in Azure Blob Storage. See write-logs-azure.

Operators and Hooks

Service operators and hooks

These integrations allow you to perform various operations within Microsoft Azure.

.. list-table::
   :header-rows: 1

   * - Service name
     - Hook
     - Operators
     - Sensors
   * - Azure Blob Storage
     -
     - airflow.contrib.operators.wasb_delete_blob_operator
     - airflow.contrib.sensors.wasb_sensor
   * - Azure Container Instances
     - airflow.contrib.hooks.azure_container_instance_hook,
       airflow.contrib.hooks.azure_container_registry_hook,
       airflow.contrib.hooks.azure_container_volume_hook
     - airflow.contrib.operators.azure_container_instances_operator
     -
   * - Azure Cosmos DB
     - airflow.contrib.hooks.azure_cosmos_hook
     - airflow.contrib.operators.azure_cosmos_operator
     - airflow.contrib.sensors.azure_cosmos_sensor
   * - Azure Data Lake Storage
     - airflow.contrib.hooks.azure_data_lake_hook
     - airflow.contrib.operators.adls_list_operator
     -
   * - Azure Files
     - airflow.contrib.hooks.azure_fileshare_hook
     -
     -

Transfer operators and hooks

These integrations allow you to copy data from/to Microsoft Azure.

.. list-table::
   :header-rows: 1

   * - Source
     - Destination
     - Guide
     - Operators
   * - Azure Data Lake Storage
     - Google Cloud Storage (GCS)
     -
     - airflow.contrib.operators.adls_to_gcs
   * - Local
     - Azure Blob Storage
     -
     - airflow.contrib.operators.file_to_wasb
   * - Oracle
     - Azure Data Lake Storage
     -
     - airflow.contrib.operators.oracle_to_azure_data_lake_transfer
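
To illustrate how a transfer module from the table above is typically wired into a DAG, the following minimal sketch uploads a local file to Azure Blob Storage. It assumes that ``airflow.contrib.operators.file_to_wasb`` exposes a ``FileToWasbOperator`` class accepting the arguments shown, and that a connection named ``wasb_default`` exists. The DAG id, file path, container name, and blob name are hypothetical placeholders. The Airflow imports are deferred into the function so the file can be loaded without Airflow installed; check the operator's signature against your installed version before relying on it.

```python
from datetime import datetime


def build_upload_dag(dag_id="example_file_to_wasb"):
    """Return a DAG with one task that copies a local file to Azure Blob Storage.

    All names below (DAG id, path, container, blob) are illustrative placeholders.
    """
    # Deferred imports: Airflow is only required when the DAG is actually built.
    from airflow import DAG
    from airflow.contrib.operators.file_to_wasb import FileToWasbOperator

    dag = DAG(
        dag_id=dag_id,
        start_date=datetime(2019, 1, 1),
        schedule_interval=None,  # no schedule: run only when triggered
    )

    FileToWasbOperator(
        task_id="upload_report",
        file_path="/tmp/report.csv",    # hypothetical local file
        container_name="reports",       # hypothetical container
        blob_name="daily/report.csv",   # hypothetical blob name
        wasb_conn_id="wasb_default",    # assumed connection id
        dag=dag,
    )
    return dag
```

In a DAG file, assigning ``dag = build_upload_dag()`` at module level lets the scheduler discover the DAG.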

AWS: Amazon Web Services

Airflow has support for Amazon Web Services.

Logging

Airflow can be configured to read and write task logs in Amazon Simple Storage Service (Amazon S3). See write-logs-amazon.
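
As a rough illustration of what such a configuration involves, the sketch below renders an ``airflow.cfg`` fragment with Python's ``configparser``. The option names (``remote_logging``, ``remote_base_log_folder``, ``remote_log_conn_id``) and the ``[core]`` section are assumptions based on Airflow 1.10-era configuration; later releases moved these options to a ``[logging]`` section. The bucket path and connection id are hypothetical. Treat the linked logging guide as the authority.

```python
import configparser
import io


def remote_logging_fragment(base_log_folder, conn_id, section="core"):
    """Render an airflow.cfg fragment that enables remote task logging.

    The option names are assumptions taken from Airflow 1.10-era configs;
    verify them against the logging guide for your version.
    """
    parser = configparser.ConfigParser()
    parser[section] = {
        "remote_logging": "True",
        "remote_base_log_folder": base_log_folder,
        "remote_log_conn_id": conn_id,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical bucket and connection id, for illustration only.
fragment = remote_logging_fragment("s3://my-bucket/airflow/logs", "my_s3_conn")
print(fragment)
```

The same helper can render fragments for the other storage backends described on this page by passing a different base folder and connection id.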

Operators and Hooks

All hooks are based on airflow.contrib.hooks.aws_hook.

Service operators and hooks

These integrations allow you to perform various operations within Amazon Web Services.

.. list-table::
   :header-rows: 1

   * - Service name
     - Hook
     - Operators
     - Sensors
   * - Amazon Athena
     - airflow.contrib.hooks.aws_athena_hook
     - airflow.contrib.operators.aws_athena_operator
     - airflow.contrib.sensors.aws_athena_sensor
   * - AWS Batch
     -
     - airflow.contrib.operators.awsbatch_operator
     -
   * - Amazon CloudWatch Logs
     - airflow.contrib.hooks.aws_logs_hook
     -
     -
   * - Amazon DynamoDB
     - airflow.contrib.hooks.aws_dynamodb_hook
     -
     -
   * - Amazon ECS
     -
     - airflow.contrib.operators.ecs_operator
     -
   * - Amazon EMR
     - airflow.contrib.hooks.emr_hook
     - airflow.contrib.operators.emr_add_steps_operator,
       airflow.contrib.operators.emr_create_job_flow_operator,
       airflow.contrib.operators.emr_terminate_job_flow_operator
     - airflow.contrib.sensors.emr_base_sensor,
       airflow.contrib.sensors.emr_job_flow_sensor,
       airflow.contrib.sensors.emr_step_sensor
   * - AWS Glue Catalog
     - airflow.contrib.hooks.aws_glue_catalog_hook
     -
     - airflow.contrib.sensors.aws_glue_catalog_partition_sensor
   * - Amazon Kinesis Data Firehose
     - airflow.contrib.hooks.aws_firehose_hook
     -
     -
   * - AWS Lambda
     - airflow.contrib.hooks.aws_lambda_hook
     -
     -
   * - Amazon Redshift
     - airflow.contrib.hooks.redshift_hook
     -
     - airflow.contrib.sensors.aws_redshift_cluster_sensor
   * - Amazon Simple Storage Service (S3)
     - airflow.hooks.S3_hook
     - airflow.operators.s3_file_transform_operator,
       airflow.contrib.operators.s3_copy_object_operator,
       airflow.contrib.operators.s3_delete_objects_operator,
       airflow.contrib.operators.s3_list_operator
     - airflow.sensors.s3_key_sensor,
       airflow.sensors.s3_prefix_sensor
   * - Amazon SageMaker
     - airflow.contrib.hooks.sagemaker_hook
     - airflow.contrib.operators.sagemaker_base_operator,
       airflow.contrib.operators.sagemaker_endpoint_config_operator,
       airflow.contrib.operators.sagemaker_endpoint_operator,
       airflow.contrib.operators.sagemaker_model_operator,
       airflow.contrib.operators.sagemaker_training_operator,
       airflow.contrib.operators.sagemaker_transform_operator,
       airflow.contrib.operators.sagemaker_tuning_operator
     - airflow.contrib.sensors.sagemaker_base_sensor,
       airflow.contrib.sensors.sagemaker_endpoint_sensor,
       airflow.contrib.sensors.sagemaker_training_sensor,
       airflow.contrib.sensors.sagemaker_transform_sensor,
       airflow.contrib.sensors.sagemaker_tuning_sensor
   * - Amazon Simple Notification Service (SNS)
     - airflow.contrib.hooks.aws_sns_hook
     - airflow.contrib.operators.sns_publish_operator
     -
   * - Amazon Simple Queue Service (SQS)
     - airflow.contrib.hooks.aws_sqs_hook
     - airflow.contrib.operators.aws_sqs_publish_operator
     - airflow.contrib.sensors.aws_sqs_sensor

Transfer operators and hooks

These integrations allow you to copy data from/to Amazon Web Services.

.. list-table::
   :header-rows: 1

   * - Source
     - Destination
     - Guide
     - Operators
   * - Apache Hive
     - Amazon DynamoDB
     -
     - airflow.contrib.operators.hive_to_dynamodb
   * - MongoDB
     - Amazon DynamoDB
     -
     - airflow.contrib.operators.hive_to_dynamodb
   * - Amazon Simple Storage Service (S3)
     - Google Cloud Storage (GCS)
     - How to use <howto/operator/gcp/cloud_storage_transfer_service>
     - airflow.contrib.operators.s3_to_gcs_operator,
       airflow.gcp.operators.cloud_storage_transfer_service
   * - Amazon Redshift
     - Amazon Simple Storage Service (S3)
     -
     - airflow.operators.redshift_to_s3_operator
   * - Amazon Simple Storage Service (S3)
     - Apache Hive
     -
     - airflow.operators.s3_to_hive_operator
   * - Amazon Simple Storage Service (S3)
     - Amazon Redshift
     -
     - airflow.operators.s3_to_redshift_operator

GCP: Google Cloud Platform

Airflow has extensive support for the Google Cloud Platform.

See the GCP connection type <howto/connection/gcp> documentation to configure connections to GCP.

Logging

Airflow can be configured to read and write task logs in Google Cloud Storage. See write-logs-gcp.

Operators and Hooks

All hooks are based on airflow.contrib.hooks.gcp_api_base_hook.GoogleCloudBaseHook.

Service operators and hooks

These integrations allow you to perform various operations within Google Cloud Platform.

.. list-table::
   :header-rows: 1

   * - Service name
     - Guide
     - Hook
     - Operators
     - Sensors
   * - AutoML
     - How to use <howto/operator/gcp/automl>
     - airflow.gcp.hooks.automl
     -
     -
   * - BigQuery
     -
     - airflow.gcp.hooks.bigquery
     - airflow.gcp.operators.bigquery
     - airflow.gcp.sensors.bigquery
   * - BigQuery Data Transfer Service
     - How to use <howto/operator/gcp/bigquery_dts>
     - airflow.gcp.hooks.bigquery_dts
     - airflow.gcp.operators.bigquery_dts
     - airflow.gcp.sensors.bigquery_dts
   * - Bigtable
     - How to use <howto/operator/gcp/bigtable>
     - airflow.gcp.hooks.bigtable
     - airflow.gcp.operators.bigtable
     - airflow.gcp.sensors.bigtable
   * - Cloud Build
     - How to use <howto/operator/gcp/cloud_build>
     - airflow.gcp.hooks.cloud_build
     - airflow.gcp.operators.cloud_build
     -
   * - Compute Engine
     - How to use <howto/operator/gcp/compute>
     - airflow.gcp.hooks.compute
     - airflow.gcp.operators.compute
     -
   * - Dataflow
     -
     - airflow.gcp.hooks.dataflow
     - airflow.gcp.operators.dataflow
     -
   * - Dataproc
     -
     - airflow.gcp.hooks.dataproc
     - airflow.gcp.operators.dataproc
     -
   * - Datastore
     -
     - airflow.gcp.hooks.datastore
     - airflow.gcp.operators.datastore
     -
   * - Cloud Data Loss Prevention (DLP)
     -
     - airflow.gcp.hooks.dlp
     - airflow.gcp.operators.dlp
     -
   * - Cloud Functions
     - How to use <howto/operator/gcp/functions>
     - airflow.gcp.hooks.functions
     - airflow.gcp.operators.functions
     -
   * - Cloud Storage (GCS)
     - How to use <howto/operator/gcp/gcs>
     - airflow.gcp.hooks.gcs
     - airflow.gcp.operators.gcs
     - airflow.gcp.sensors.gcs
   * - Cloud Key Management Service (KMS)
     -
     - airflow.gcp.hooks.kms
     -
     -
   * - Kubernetes Engine
     -
     - airflow.gcp.hooks.kubernetes_engine
     - airflow.gcp.operators.kubernetes_engine
     -
   * - Cloud Memorystore
     - How to use <howto/operator/gcp/cloud_memorystore>
     - airflow.gcp.hooks.cloud_memorystore
     - airflow.gcp.operators.cloud_memorystore
     -
   * - Machine Learning Engine
     -
     - airflow.gcp.hooks.mlengine
     - airflow.gcp.operators.mlengine
     -
   * - Natural Language
     - How to use <howto/operator/gcp/natural_language>
     - airflow.gcp.hooks.natural_language
     - airflow.gcp.operators.natural_language
     -
   * - Cloud Pub/Sub
     -
     - airflow.gcp.hooks.pubsub
     - airflow.gcp.operators.pubsub
     - airflow.gcp.sensors.pubsub
   * - Cloud Spanner
     - How to use <howto/operator/gcp/spanner>
     - airflow.gcp.hooks.spanner
     - airflow.gcp.operators.spanner
     -
   * - Cloud Speech-to-Text
     - How to use <howto/operator/gcp/speech>
     - airflow.gcp.hooks.speech_to_text
     - airflow.gcp.operators.speech_to_text
     -
   * - Cloud SQL
     - How to use <howto/operator/gcp/sql>
     - airflow.gcp.hooks.cloud_sql
     - airflow.gcp.operators.cloud_sql
     -
   * - Storage Transfer Service
     - How to use <howto/operator/gcp/cloud_storage_transfer_service>
     - airflow.gcp.hooks.cloud_storage_transfer_service
     - airflow.gcp.operators.cloud_storage_transfer_service
     - airflow.gcp.sensors.cloud_storage_transfer_service
   * - Cloud Tasks
     -
     - airflow.gcp.hooks.tasks
     - airflow.gcp.operators.tasks
     -
   * - Cloud Text-to-Speech
     - How to use <howto/operator/gcp/speech>
     - airflow.gcp.hooks.text_to_speech
     - airflow.gcp.operators.text_to_speech
     -
   * - Cloud Translation
     - How to use <howto/operator/gcp/translate>
     - airflow.gcp.hooks.translate
     - airflow.gcp.operators.translate
     -
   * - Cloud Video Intelligence
     - How to use <howto/operator/gcp/video_intelligence>
     - airflow.gcp.hooks.video_intelligence
     - airflow.gcp.operators.video_intelligence
     -
   * - Cloud Vision
     - How to use <howto/operator/gcp/vision>
     - airflow.gcp.hooks.vision
     - airflow.gcp.operators.vision
     -

Transfer operators and hooks

These integrations allow you to copy data from/to Google Cloud Platform.

.. list-table::
   :header-rows: 1

   * - Source
     - Destination
     - Guide
     - Operators
   * - .. _integration:GCP-Discovery-ref:

       All services [1] <integration:GCP-Discovery>
     - Amazon Simple Storage Service (S3)
     -
     - airflow.operators.google_api_to_s3_transfer
   * - Azure Data Lake Storage
     - Google Cloud Storage (GCS)
     -
     - airflow.operators.adls_to_gcs
   * - Amazon Simple Storage Service (S3)
     - Google Cloud Storage (GCS)
     - How to use <howto/operator/gcp/cloud_storage_transfer_service>
     - airflow.operators.s3_to_gcs,
       airflow.gcp.operators.cloud_storage_transfer_service
   * - Google BigQuery
     - Google BigQuery
     -
     - airflow.operators.bigquery_to_bigquery
   * - Google BigQuery
     - Cloud Storage (GCS)
     -
     - airflow.operators.bigquery_to_gcs
   * - BigQuery
     - MySQL
     -
     - airflow.operators.bigquery_to_mysql
   * - Apache Cassandra
     - Google Cloud Storage (GCS)
     -
     - airflow.operators.cassandra_to_gcs
   * - Google Cloud Storage (GCS)
     - Google BigQuery
     -
     - airflow.operators.gcs_to_bq
   * - Google Cloud Storage (GCS)
     - Google Cloud Storage (GCS)
     - How to use <howto/operator/gcp/gcs_to_gcs>,
       How to use <howto/operator/gcp/cloud_storage_transfer_service>
     - airflow.operators.gcs_to_gcs,
       airflow.gcp.operators.cloud_storage_transfer_service
   * - Google Cloud Storage (GCS)
     - Amazon Simple Storage Service (S3)
     -
     - airflow.operators.gcs_to_s3
   * - Local
     - Google Cloud Storage (GCS)
     -
     - airflow.operators.local_to_gcs
   * - Microsoft SQL Server (MSSQL)
     - Google Cloud Storage (GCS)
     -
     - airflow.operators.mssql_to_gcs
   * - MySQL
     - Google Cloud Storage (GCS)
     -
     - airflow.operators.mysql_to_gcs
   * - PostgreSQL
     - Google Cloud Storage (GCS)
     -
     - airflow.operators.postgres_to_gcs
   * - SQL
     - Cloud Storage (GCS)
     -
     - airflow.operators.sql_to_gcs

[1] <integration:GCP-Discovery-ref> These discovery-based operators use airflow.gcp.hooks.discovery_api.GoogleDiscoveryApiHook to communicate with Google services via the Google API Python Client. That library is in maintenance mode, so it will not fully support GCP in the future. Prefer the custom GCP service operators when working with Google Cloud Platform.

Other integrations

Operators and Hooks

Service operators and hooks

These integrations allow you to perform various operations within other third-party services.

.. list-table::
   :header-rows: 1

   * - Service name
     - Guide
     - Hook
     - Operators
     - Sensors
   * - Qubole
     -
     - airflow.contrib.hooks.qubole_hook
     - airflow.contrib.operators.qubole_operator,
       airflow.contrib.operators.qubole_check_operator
     - airflow.contrib.sensors.qubole_sensor
   * - Databricks
     -
     - airflow.contrib.hooks.databricks_hook
     - airflow.contrib.operators.databricks_operator
     -
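
Module paths such as those listed on this page can differ between Airflow releases (for example, Airflow 2.0 moved the ``contrib`` modules into provider packages), so it can help to check which ones your installation actually provides. The helper below is a minimal sketch using only the standard library; ``is_module_available`` is a hypothetical name, not part of Airflow.

```python
import importlib.util


def is_module_available(module_name):
    """Return True if module_name can be located in the current environment.

    find_spec imports parent packages of a dotted name, but not the
    target module itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ImportError (incl. ModuleNotFoundError): a parent package is missing.
        # ValueError: the module is already loaded but has no usable __spec__.
        return False


# Module names taken from the table above.
candidates = [
    "airflow.contrib.hooks.qubole_hook",
    "airflow.contrib.hooks.databricks_hook",
]
for name in candidates:
    status = "available" if is_module_available(name) else "missing"
    print(name, status)
```

A module reported as missing may live under a different path in your release, so consult the reference for the version you have installed.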