[DOP-8157] Add SparkS3 troubleshooting guide #124

Merged: 1 commit, Aug 16, 2023

Conversation

@dolfinus (Member) commented Aug 16, 2023

Change Summary

  • Added a troubleshooting guide for SparkS3. It collects the useful links I found while implementing and debugging this class.
  • While working with the magic committer, I found that there is a spark-hadoop-cloud library which should be used for Spark on S3. It already includes hadoop-aws, and its version number is the same as the Spark version. This means we no longer need to pass the Hadoop library version explicitly, because the package is already built against the same Hadoop version Spark is compiled with. So I replaced hadoop-aws with spark-hadoop-cloud, and updated the SparkS3.get_options signature and tests.
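
To illustrate the versioning point above, here is a minimal sketch of how the package coordinate can be derived from the Spark version alone. The helper names (`spark_hadoop_cloud_package`, `build_spark_conf`) are hypothetical and are not the signature changed in this PR; the default Scala version is an assumption. The committer settings are the ones I understand the Spark cloud integration docs to recommend for the S3A magic committer, so please verify them against your Spark and Hadoop versions before relying on them.

```python
def spark_hadoop_cloud_package(spark_version: str, scala_version: str = "2.12") -> str:
    """Build the Maven coordinate for spark-hadoop-cloud (hypothetical helper).

    No Hadoop version is needed: the artifact is published per Spark release
    and brings in a matching hadoop-aws transitively.
    """
    parts = spark_version.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"Expected a Spark version like '3.4.1', got {spark_version!r}")
    return f"org.apache.spark:spark-hadoop-cloud_{scala_version}:{spark_version}"


# Assumed magic committer settings; the two Spark classes ship in spark-hadoop-cloud.
MAGIC_COMMITTER_CONF = {
    "spark.hadoop.fs.s3a.committer.name": "magic",
    "spark.hadoop.fs.s3a.committer.magic.enabled": "true",
    "spark.sql.sources.commitProtocolClass":
        "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol",
    "spark.sql.parquet.output.committer.class":
        "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter",
}


def build_spark_conf(spark_version: str, scala_version: str = "2.12") -> dict:
    """Combine the package coordinate with the committer settings (hypothetical helper)."""
    conf = {"spark.jars.packages": spark_hadoop_cloud_package(spark_version, scala_version)}
    conf.update(MAGIC_COMMITTER_CONF)
    return conf


if __name__ == "__main__":
    # Print the options as key=value pairs, e.g. for a spark-defaults.conf file.
    for key, value in build_spark_conf("3.4.1").items():
        print(f"{key}={value}")
```

Each key/value pair in the resulting dict can be passed to `SparkSession.builder.config(key, value)` before the session is created.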

Related issue number

Checklist

  • Commit message and PR title are comprehensive
  • Keep the change as small as possible
  • Unit and integration tests for the changes exist
  • Tests pass on CI and coverage does not decrease
  • Documentation reflects the changes where applicable
  • docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing change
    (see CONTRIBUTING.rst for details.)
  • My PR is ready to review.

@dolfinus added the ci:skip-changelog label ("Add this label to skip changelog file check") Aug 16, 2023
@dolfinus dolfinus temporarily deployed to test-pypi August 16, 2023 13:18 — with GitHub Actions Inactive

codecov bot commented Aug 16, 2023

Codecov Report

Merging #124 (1e56bbe) into develop (7dd2faa) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #124      +/-   ##
===========================================
+ Coverage    94.04%   94.07%   +0.02%     
===========================================
  Files          172      172              
  Lines         7491     7495       +4     
  Branches      1442     1442              
===========================================
+ Hits          7045     7051       +6     
+ Misses         325      324       -1     
+ Partials       121      120       -1     
Files Changed                                            Coverage Δ
...nnection/file_df_connection/spark_s3/connection.py    92.56% <100.00%> (+0.20%) ⬆️

... and 1 file with indirect coverage changes

@dolfinus dolfinus temporarily deployed to test-pypi August 16, 2023 13:24 — with GitHub Actions Inactive
@dolfinus dolfinus marked this pull request as ready for review August 16, 2023 13:36
@dolfinus dolfinus temporarily deployed to test-pypi August 16, 2023 14:10 — with GitHub Actions Inactive
@dolfinus dolfinus merged commit e4c739d into develop Aug 16, 2023
33 checks passed
@dolfinus dolfinus deleted the feature/DOP-8157 branch August 16, 2023 15:59
Labels
ci:skip-changelog Add this label to skip changelog file check
3 participants