Create a multiple node DAG for GCP #30
Progress notes:
- Before everything, try to specify remote file locations for CSV and JSON.
- Remote CSV fetching works.
- Problem with encoding when reading remote file on Airflow DAG.
- Resolved with #53.
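The encoding problem above is typically avoided by decoding the fetched bytes explicitly before handing them to the CSV parser, instead of relying on the platform default. A minimal sketch, assuming UTF-8 remote files; the function name is hypothetical, not from the aircan codebase:

```python
import csv
import io


def parse_remote_csv(raw_bytes: bytes, encoding: str = "utf-8") -> list:
    """Decode fetched bytes with an explicit encoding, then parse as CSV.

    Decoding up front (with errors="replace" as a safety net) avoids the
    mojibake seen when the Airflow worker's default locale differs from
    the file's actual encoding.
    """
    text = raw_bytes.decode(encoding, errors="replace")
    return list(csv.DictReader(io.StringIO(text)))


# Example: bytes as fetched from a remote CSV (inlined here for illustration).
raw = "name,value\ncafé,1\n".encode("utf-8")
rows = parse_remote_csv(raw)
```

In a DAG task, `raw_bytes` would come from the HTTP response body or bucket object download rather than an inline literal.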
Using the same structure as the multiple-node DAG that assumes local files, create a DAG that handles resources on the cloud (CSV and JSON).
Acceptance
Tasks
- [ ] Refactor code to reuse existing nodes (moved to [Refactor] DAGs #60)

DAG tasks:
- Upload CSV from CKAN instance to bucket
- Read remote CSV

ckannext-aircan (connector) tasks:
- [x] 1. Create endpoint to receive Airflow response after processing
- [x] 2. Handle Airflow response
- [ ] 3. If success, download processed JSON file from bucket

NOTE: This is not the strategy. We will send the processed JSON via API.
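The DAG tasks above can be sketched as plain Python callables wired in sequence; in the real DAG each would be wrapped in an Airflow operator and the bucket would be GCS. All names below are hypothetical, and a dict stands in for the bucket:

```python
import csv
import io
import json


def fetch_csv(text: str) -> list:
    """Parse CSV text (e.g. fetched from a CKAN resource URL) into records."""
    return list(csv.DictReader(io.StringIO(text)))


def convert_to_json(records: list) -> str:
    """Serialize parsed records as JSON, the format sent back via the API."""
    return json.dumps(records)


def upload_to_bucket(bucket: dict, key: str, payload: str) -> None:
    """Stub for the bucket upload step; a dict stands in for GCS here."""
    bucket[key] = payload


# Wiring the tasks in DAG order: fetch -> convert -> upload.
bucket = {}
records = fetch_csv("id,name\n1,ckan\n")
upload_to_bucket(bucket, "resource.json", convert_to_json(records))
```

Splitting the pipeline into small single-purpose callables like this is also what makes the planned "reuse existing nodes" refactor straightforward.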
Analysis
After this long task is complete, we still need to:
- [ ] Handle errors (this will be in the next milestone)
- [ ] Handle absence of a response from Airflow (this will be in the next milestone)
- [ ] Delete remote file, via a separate DAG (this will be in the next milestone)
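For the error-handling item planned for the next milestone, one common pattern is a retry wrapper around a flaky task (remote fetches in particular). A minimal sketch; the wrapper is illustrative, not aircan code, and in Airflow itself this would more likely be expressed via task `retries`:

```python
import time


def run_with_retries(task, retries: int = 3, delay: float = 0.0):
    """Call `task` up to `retries` times, re-raising the last error on failure.

    `delay` is the pause between attempts (kept at 0 here for illustration).
    """
    last_exc = None
    for _ in range(retries):
        try:
            return task()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc


# Example: a task that fails twice before succeeding.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("remote file temporarily unavailable")
    return "ok"


result = run_with_retries(flaky_fetch, retries=3)
```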