Home
Daniel Ribeiro Bueno edited this page Oct 6, 2023 · 1 revision
Welcome to the airflow_aws_justwatch_pipeline wiki!
- Create your AWS account.
- Create an S3 bucket.
- Create a new IAM user with authorization to read and write to S3 and run Glue Jobs.
- Create an access key pair for the newly created user.
- Install Docker Desktop on your local machine.
- Run Docker.
- Go to your project root directory and add the subdirectories for the Airflow settings: `config`, `dags`, `logs`, and `plugins`.
- Open `\dockerfile\airflow\Dockerfile` and, in the first line, change the Airflow version to the one you want to use: `FROM apache/airflow:2.7.1`
- Open a terminal in your project root directory and run the following command to start Airflow: `docker compose up`
- After Airflow starts, go to the Airflow web interface (http://localhost:8080/) and log in (default user and password: `airflow`).
- Open the `Admin > Connections` tab to set up the AWS connection in Airflow:
  - Connection Id: `AWSConnection` (or any name you prefer; remember to change it in the DAG script)
  - Connection Type: Amazon Web Services
  - Extra: `{"aws_access_key_id": "<YOUR_AWS_ACCESS_KEY_ID>", "aws_secret_access_key": "<YOUR_AWS_SECRET_ACCESS_KEY>"}`
- Save the settings.
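
Two of the local steps above (adding the Airflow subdirectories and filling in the connection's Extra field) can be sketched as a small Python helper. This is only an illustration, not part of the pipeline itself: the function names `ensure_airflow_dirs` and `build_aws_extra` are hypothetical, and the sketch assumes Python 3.9 or newer.

```python
import json
from pathlib import Path

# Subdirectories listed in the setup steps above.
AIRFLOW_SUBDIRS = ("config", "dags", "logs", "plugins")


def ensure_airflow_dirs(project_root: str) -> list[Path]:
    """Create the Airflow subdirectories under project_root if they are missing.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = Path(project_root)
    paths = []
    for name in AIRFLOW_SUBDIRS:
        path = root / name
        # exist_ok=True makes the call safe to repeat on an existing layout.
        path.mkdir(parents=True, exist_ok=True)
        paths.append(path)
    return paths


def build_aws_extra(access_key_id: str, secret_access_key: str) -> str:
    """Return the JSON text to paste into the connection's Extra field.

    Hypothetical helper; the two keys match the Extra example above.
    """
    return json.dumps(
        {
            "aws_access_key_id": access_key_id,
            "aws_secret_access_key": secret_access_key,
        }
    )
```

Calling `ensure_airflow_dirs(".")` from the project root creates the four folders, and `print(build_aws_extra("<YOUR_AWS_ACCESS_KEY_ID>", "<YOUR_AWS_SECRET_ACCESS_KEY>"))` prints the JSON for the Extra field. Using `json.dumps` rather than typing the JSON by hand avoids quoting mistakes if a secret key contains special characters.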