Fix Hadoop Downloader Range not correct #21778

Merged: 2 commits merged into apache:master from the hadoopdownloaderrange branch on Jun 14, 2022

Conversation

Contributor

@Abacn Abacn commented Jun 9, 2022

This fixes #20110

  • This fixes the ValueError raised when reading an HDFS file whose size is larger than the buffer size

  • Added related unit test
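The diff itself is not shown in this thread, so the following is only a hypothetical sketch of how an incorrect range can stay hidden until a file exceeds the buffer size. It assumes a downloader whose `get_range(start, end)` treats `end` as exclusive and maps it onto an offset/length read. `FakeClient`, `get_range_buggy`, `get_range_fixed`, and `read_all` are illustrative names, not Beam or HDFS client APIs.

```python
class FakeClient:
    """Hypothetical stand-in for an HDFS client; not a real library API."""

    def __init__(self, data: bytes):
        self._data = data

    def read(self, offset: int, length: int) -> bytes:
        # Reads are clamped at end of file, as with most file APIs.
        return self._data[offset:offset + length]


def get_range_buggy(client, start, end):
    # Assumed bug: treats `end` as inclusive and requests one byte too many.
    return client.read(offset=start, length=end - start + 1)


def get_range_fixed(client, start, end):
    # Treats `end` as exclusive and requests exactly end - start bytes.
    return client.read(offset=start, length=end - start)


def read_all(client, size, buffer_size, get_range):
    """Reads `size` bytes in buffer-sized chunks, validating each chunk."""
    out = bytearray()
    pos = 0
    while pos < size:
        end = min(pos + buffer_size, size)
        chunk = get_range(client, pos, end)
        if len(chunk) != end - pos:
            raise ValueError(f"expected {end - pos} bytes, got {len(chunk)}")
        out += chunk
        pos = end
    return bytes(out)
```

In this sketch, a 10-byte file read with a 16-byte buffer succeeds with either variant, because the extra byte requested by the buggy variant is clamped at end of file. With a 4-byte buffer, the buggy variant returns 5 bytes for the first chunk and `read_all` raises `ValueError`, which matches the symptom described above only under these assumptions.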

@asf-ci

asf-ci commented Jun 9, 2022

Can one of the admins verify this patch?

@codecov

codecov bot commented Jun 9, 2022

Codecov Report

Merging #21778 (912e5bc) into master (a1c3d0c) will increase coverage by 0.40%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #21778      +/-   ##
==========================================
+ Coverage   74.03%   74.43%   +0.40%     
==========================================
  Files         698      699       +1     
  Lines       92192    93717    +1525     
==========================================
+ Hits        68252    69756    +1504     
- Misses      22689    22710      +21     
  Partials     1251     1251              
Flag Coverage Δ
python 83.95% <100.00%> (+0.34%) ⬆️

Impacted Files Coverage Δ
sdks/python/apache_beam/io/hadoopfilesystem.py 97.28% <100.00%> (ø)
sdks/python/apache_beam/dataframe/io.py 88.78% <0.00%> (-3.26%) ⬇️
...eam/transforms/py_dataflow_distribution_counter.py 93.75% <0.00%> (-2.55%) ⬇️
sdks/python/apache_beam/utils/counters.py 86.36% <0.00%> (-0.39%) ⬇️
...thon/apache_beam/ml/inference/pytorch_inference.py 0.00% <0.00%> (ø)
...beam/transforms/fully_qualified_named_transform.py 100.00% <0.00%> (ø)
...examples/inference/pytorch_image_classification.py 0.00% <0.00%> (ø)
sdks/python/apache_beam/ml/inference/__init__.py 100.00% <0.00%> (ø)
...hon/apache_beam/runners/worker/bundle_processor.py 93.79% <0.00%> (+0.12%) ⬆️
...ks/python/apache_beam/runners/worker/operations.py 74.32% <0.00%> (+0.29%) ⬆️
... and 24 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@Abacn
Contributor Author

Abacn commented Jun 9, 2022

retest this please

@Abacn Abacn marked this pull request as ready for review June 9, 2022 21:46
@Abacn
Contributor Author

Abacn commented Jun 9, 2022

R: @chamikaramj

@Abacn
Contributor Author

Abacn commented Jun 10, 2022

retest this please

Contributor

@johnjcasey johnjcasey left a comment

LGTM

@chamikaramj
Contributor

Thanks for the quick fix.

I can merge when tests pass.

@chamikaramj
Contributor

Run Python PreCommit

@Abacn
Contributor Author

Abacn commented Jun 13, 2022

Run RAT PreCommit
Run Python PreCommit

@Abacn
Contributor Author

Abacn commented Jun 13, 2022

Run Python PreCommit

@Abacn
Contributor Author

Abacn commented Jun 14, 2022

Run Python PreCommit

@chamikaramj chamikaramj merged commit ab8977f into apache:master Jun 14, 2022
@Abacn Abacn deleted the hadoopdownloaderrange branch June 14, 2022 16:48
bullet03 pushed a commit to akvelon/beam that referenced this pull request Jun 20, 2022
* Fix Hadoop Downloader Range not correct

* This fixes the ValueError raised when reading an HDFS file whose size is larger than the buffer size

* Added related unit test

* fix typo in comment
Development

Successfully merging this pull request may close these issues.

Hadoop Downloader Range not correct
4 participants