Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(s3): Split bucket name and key before uploading #48

Merged
merged 5 commits into from Feb 23, 2022

Conversation

tomasfarias
Copy link
Owner

S3 files were uploaded to incorrect keys when running Airflow 2. This
was caused by differences between Airflow 1.10 and Airflow 2: the
latter assumes that when passing both bucket_name and key that the key
is to be taken as it is, where as the former seemed to work even when
the key contained the full url.

Now, we do the parsing and splitting ourselves. We would just set
bucket_name to None, however with Airflow 2.0 the upload function
checks for the presence of the bucket_name argument, to decide whether
to parse the url, not if it's set to None (I think this may be a bug).

S3 files were uploaded to incorrect keys when running Airflow 2. This
was caused by differences between Airflow 1.10 and Airflow 2: the
latter assumes that when passing both bucket_name and key that the key
is to be taken as it is, where as the former seemed to work even when
the key contained the full url.

Now, we do the parsing and splitting ourselves. We would just set
bucket_name to None, however with Airflow 2.0 the upload function
checks for the presence of the bucket_name argument, to decide whether
to parse the url, not if it's set to None (I think this may be a bug).
@tomasfarias
Copy link
Owner Author

Closes #46

@tomasfarias
Copy link
Owner Author

Will be interesting to see if tests pass as we are now running multiple Airflow versions. If everything's green, then we are good to go.

@codecov
Copy link

codecov bot commented Feb 23, 2022

Codecov Report

Merging #48 (0c9b22c) into master (6c20997) will increase coverage by 0.12%.
The diff coverage is 100.00%.

❗ Current head 0c9b22c differs from pull request most recent head 8ee4bde. Consider uploading reports for the commit 8ee4bde to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #48      +/-   ##
==========================================
+ Coverage   97.70%   97.82%   +0.12%     
==========================================
  Files           8        8              
  Lines         828      829       +1     
==========================================
+ Hits          809      811       +2     
+ Misses         19       18       -1     
Flag Coverage Δ
unittests 97.82% <100.00%> (+0.12%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
airflow_dbt_python/hooks/backends/s3.py 94.31% <100.00%> (+1.21%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6c20997...8ee4bde. Read the comment docs.

@tomasfarias
Copy link
Owner Author

All green. I admit these patches are getting scarier to release, but at some point, I need to start trusting my tests and keep continuously improving them. Bumping patch and shipping :shipit:

@tomasfarias tomasfarias merged commit e97e1c2 into master Feb 23, 2022
@tomasfarias tomasfarias deleted the s3-url-upload branch February 23, 2022 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant