Apache Airflow's --> EmrAddStepsOperator throws up error as "ERROR - You must specify a region" #13785

@sha12br

Hi Team,

I am trying to create a data pipeline in which an EMR cluster is provisioned, runs a Spark job, and terminates upon completion, so the flow would be as follows:

EmrCreateJobFlowOperator --> EmrAddStepsOperator --> EmrTerminateJobFlowOperator

The class "airflow.providers.amazon.aws.operators.emr_create_job_flow.EmrCreateJobFlowOperator" takes "region_name" as an argument, so it is easy to define the region in the DAG script. This step works well and a cluster is provisioned.

The next step, "EmrAddStepsOperator", is where it fails: the underlying class [airflow.providers.amazon.aws.operators.emr_add_steps.EmrAddStepsOperator] does not take region_name as an argument, and the task throws the error "ERROR - You must specify a region". Since there is no way to pass a region name to the class, I decided to create a new connection with conn_id aws_shabr and connection URI aws://?region=eu-west-1 (so that it takes the default login and password, and the region is specified). But on running, it still throws the same error "ERROR - You must specify a region".
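One possible explanation, offered as an assumption rather than a confirmed diagnosis: the query parameters of an Airflow connection URI are stored as the connection's extra fields, and the AWS hook is understood to look for a key named region_name there, not region. The sketch below uses only the Python standard library (no Airflow calls) to show which key each URI carries. The helper uri_extras is hypothetical and exists only for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def uri_extras(uri: str) -> dict:
    """Return the query parameters of a connection URI as a plain dict.

    Hypothetical helper: it only mimics how query parameters become
    key/value extras; it is not part of Airflow.
    """
    return dict(parse_qsl(urlsplit(uri).query))


# The URI from the report carries the key "region".
original = "aws://?region=eu-west-1"
print(uri_extras(original))

# Assumption: the AWS hook expects the key "region_name" instead.
candidate = "aws://?" + urlencode({"region_name": "eu-west-1"})
print(candidate)
print(uri_extras(candidate))
```

If that assumption holds, recreating the aws_shabr connection with the second URI and passing aws_conn_id="aws_shabr" to EmrAddStepsOperator would be worth trying.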

link to Doc --> http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow-providers-amazon/latest/_api/airflow/providers/amazon/aws/operators/emr_add_steps/index.html

Is there any other way to pass region_name to the "EmrAddStepsOperator" step? If so, please help me out here.

Thanks in Advance
Shabr