[DOP-8157] Add SparkS3 troubleshooting guide #124

dolfinus · 2023-08-16T13:04:31Z

Change Summary

Added a troubleshooting guide for SparkS3. It collects the useful links I found while implementing and debugging this class.
While working with the magic committer, I found that the spark-hadoop-cloud library should be used for Spark on S3. It already includes hadoop-aws, and its version matches the Spark version. This means we no longer need to pass the Hadoop library version explicitly, because the package is built against the same Hadoop version that Spark is compiled with. So I replaced hadoop-aws with spark-hadoop-cloud, and updated the SparkS3.get_options signature and tests.
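
To illustrate the versioning point, here is a minimal sketch of how the Maven coordinate for spark-hadoop-cloud can be derived from the Spark and Scala versions alone, with no Hadoop version involved. The helper name `spark_hadoop_cloud_package` and its parameters are hypothetical and are not the actual SparkS3 method signature; the versions used are only examples.

```python
def spark_hadoop_cloud_package(spark_version: str, scala_version: str = "2.12") -> str:
    """Build a Maven coordinate for spark-hadoop-cloud (hypothetical helper).

    The artifact version equals the Spark version, so no separate
    Hadoop version has to be supplied by the caller.
    """
    return f"org.apache.spark:spark-hadoop-cloud_{scala_version}:{spark_version}"


# Example versions, chosen only for illustration.
package = spark_hadoop_cloud_package("3.4.1", scala_version="2.12")
print(package)
```

The resulting string is the kind of value that would typically be passed to Spark through the `spark.jars.packages` option when creating a session.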

Related issue number

Checklist

Commit message and PR title are comprehensive
Keep the change as small as possible
Unit and integration tests for the changes exist
Tests pass on CI and coverage does not decrease
Documentation reflects the changes where applicable
docs/changelog/next_release/<pull request or issue id>.<change type>.rst file added describing change (see CONTRIBUTING.rst for details)
My PR is ready to review.

codecov · 2023-08-16T13:19:28Z

Codecov Report

Merging #124 (1e56bbe) into develop (7dd2faa) will increase coverage by 0.02%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           develop     #124      +/-   ##
===========================================
+ Coverage    94.04%   94.07%   +0.02%     
===========================================
  Files          172      172              
  Lines         7491     7495       +4     
  Branches      1442     1442              
===========================================
+ Hits          7045     7051       +6     
+ Misses         325      324       -1     
+ Partials       121      120       -1

Files Changed	Coverage Δ
...nnection/file_df_connection/spark_s3/connection.py	`92.56% <100.00%> (+0.20%)`	⬆️

... and 1 file with indirect coverage changes

docs/connection/file_df_connection/spark_s3/troubleshooting.rst

dolfinus added the ci:skip-changelog label Aug 16, 2023

dolfinus force-pushed the feature/DOP-8157 branch from ebaa042 to 476e6a2 August 16, 2023 13:18

dolfinus temporarily deployed to test-pypi August 16, 2023 13:18 — with GitHub Actions Inactive

dolfinus force-pushed the feature/DOP-8157 branch from 476e6a2 to fe76be0 August 16, 2023 13:24

dolfinus temporarily deployed to test-pypi August 16, 2023 13:24 — with GitHub Actions Inactive

dolfinus marked this pull request as ready for review August 16, 2023 13:36

dolfinus requested review from andy-takker and maxim-lixakov August 16, 2023 13:36

andy-takker reviewed Aug 16, 2023


docs/connection/file_df_connection/spark_s3/troubleshooting.rst (outdated, resolved)

[DOP-8157] Add SparkS3 troubleshooting guide

1e56bbe

dolfinus force-pushed the feature/DOP-8157 branch from fe76be0 to 1e56bbe August 16, 2023 14:10

dolfinus temporarily deployed to test-pypi August 16, 2023 14:10 — with GitHub Actions Inactive

maxim-lixakov approved these changes Aug 16, 2023


andy-takker approved these changes Aug 16, 2023


dolfinus merged commit e4c739d into develop Aug 16, 2023
33 checks passed

dolfinus deleted the feature/DOP-8157 branch August 16, 2023 15:59
