Description
Hi Team,
I am trying to create a data pipeline in which an EMR cluster is provisioned, runs a Spark job, and terminates on completion. The flow would look like this:
EmrCreateJobFlowOperator --> EmrAddStepsOperator --> EmrTerminateJobFlowOperator
The class "airflow.providers.amazon.aws.operators.emr_create_job_flow.EmrCreateJobFlowOperator" takes "region_name" as an argument, so it is easy to define the region in the DAG script. This step works fine and a cluster is provisioned.
The next step, "EmrAddStepsOperator" (underlying class: airflow.providers.amazon.aws.operators.emr_add_steps.EmrAddStepsOperator), does not have a region_name argument. At first it fails with the error "ERROR - You must specify a region". Since there is no way to pass a region name to the class, I created a new connection with conn_id aws_shabr and connection URI aws://?region=eu-west-1 (so that it uses the default login and password, with the region specified). But the run still fails with the same error, "ERROR - You must specify a region".
Is there any other way to pass region_name to the "EmrAddStepsOperator" step? If so, please help me out here.
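For reference, here is a minimal sketch of how the connection could be defined with the extra key spelled `region_name` instead of `region`. It rests on assumptions that are not confirmed in this issue: that the AWS hook reads the region from a connection extra named `region_name`, that Airflow picks up connections from `AIRFLOW_CONN_<CONN_ID>` environment variables, and that boto3 falls back to `AWS_DEFAULT_REGION` when no region is passed. The helper name `build_aws_conn_uri` is made up for illustration.

```python
import os
from urllib.parse import parse_qs, urlencode, urlsplit


def build_aws_conn_uri(region_name: str) -> str:
    """Build an AWS connection URI with no login/password and one extra.

    Hypothetical helper: the extra key `region_name` is an assumption
    about what the AWS hook looks for in the connection extras.
    """
    return "aws://?" + urlencode({"region_name": region_name})


uri = build_aws_conn_uri("eu-west-1")

# Parse the query string back to check which extras the URI carries.
extras = {key: values[0] for key, values in parse_qs(urlsplit(uri).query).items()}

# Assumption: Airflow resolves conn_id "aws_shabr" from this variable.
os.environ["AIRFLOW_CONN_AWS_SHABR"] = uri

# Assumption: boto3 uses this as a fallback region if none is set elsewhere.
os.environ.setdefault("AWS_DEFAULT_REGION", "eu-west-1")

print(uri)
print(extras)
```

If these assumptions hold, passing `aws_conn_id="aws_shabr"` to the EmrAddStepsOperator task would let it pick up the region from the connection rather than from an operator argument.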
Thanks in advance,
Shabr
