Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-33425] Fix https credentials when doing spark-submit #30337

Closed
wants to merge 2 commits into from

Conversation

pprzetacznik
Copy link

When using:

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master k8s://xx.yy.zz.ww:443 \
  --deploy-mode cluster \
  --executor-memory 20G \
  --num-executors 50 \
  https://user:password@path/to/examples.jar \
  1000

I receive an error described in https://issues.apache.org/jira/browse/SPARK-33425

What changes were proposed in this pull request?

Passing credentials to the request properties so there's no need for a server to challenge the URLConnection for credentials.

Why are the changes needed?

Documentations says:

(Note that credentials for password-protected repositories can be supplied in some cases in the repository URI, such as in https://user:password@host/.... Be careful when supplying credentials this way.)
However, that feature doesn't work.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Not tested yet. Looking for feedback.

@github-actions github-actions bot added the CORE label Nov 11, 2020
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution, @pprzetacznik . Please run dev/lint-scala and fix the Scala style issue.

@pprzetacznik
Copy link
Author

Thank you @dongjoon-hyun for feedback. Could you check now if that looks better?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun dongjoon-hyun marked this pull request as draft November 12, 2020 23:28
@dongjoon-hyun
Copy link
Member

One more tip, @pprzetacznik . In general, people don't review PRs with [WIP] in the title.

@pprzetacznik pprzetacznik changed the title [WIP][SPARK-33425] Fix https credentials when doing spark-submit [SPARK-33425] Fix https credentials when doing spark-submit Nov 12, 2020
@@ -743,7 +743,15 @@ private[spark] object Utils extends Logging {
val is = Channels.newInputStream(source)
downloadFile(url, is, targetFile, fileOverwrite)
case "http" | "https" | "ftp" =>
val uc = new URL(url).openConnection()
val url_object = new URL(url)
val uc = url_object.openConnection().asInstanceOf[HttpURLConnection]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work in the case of an FTP URI?
Maybe needs some pattern matching to only set this if this is an HttpURLConnection and user info is not null.

val uc = new URL(url).openConnection()
val url_object = new URL(url)
val uc = url_object.openConnection().asInstanceOf[HttpURLConnection]
uc.setDoInput(true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually needed?

@github-actions
Copy link

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Feb 24, 2021
@github-actions github-actions bot closed this Feb 25, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants