Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-30812][yarn] Fix uploading local files when using YARN with S3 #21788

Merged
merged 3 commits into from Sep 2, 2023

Conversation

mateczagany
Copy link
Contributor

What is the purpose of the change

Hadoop versions starting from 3.3.2 have reworked copying local files to remote using S3AFileSystem. Now it requires the passed local Hadoop Path to have a scheme specified.

Brief change log

YarnApplicationFileUploader will append file:// scheme for local path if none specified

Verifying this change

I manually tested on a local single-node Hadoop 3.3.4 cluster with fs.defaultFS pointing to a local Minio S3 server. Creating a Flink YARN session failed before the fix, but worked after.

Also added new unit test in YarnApplicationFileUploaderTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no) no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no) no
  • The serializers: (yes / no / don't know) no
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know) no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) yes
  • The S3 file system connector: (yes / no / don't know) no

Documentation

  • Does this pull request introduce a new feature? (yes / no) no
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) not applicable

@flinkbot
Copy link
Collaborator

flinkbot commented Jan 30, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@MartijnVisser MartijnVisser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please re-check to avoid Mockito per the Flink code contribution guidelines

@MartijnVisser
Copy link
Contributor

@gaborgsomogyi Can you take a look?

@gaborgsomogyi
Copy link
Contributor

I'm having vacation on the whole week w/o computer. Next week I can start w/ this...

Copy link
Contributor

@gaborgsomogyi gaborgsomogyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a minor found.

@@ -388,13 +388,20 @@ private Path copyToRemoteApplicationDir(
(relativeDstPath.isEmpty() ? "" : relativeDstPath + "/") + localSrcPath.getName();
final Path dst = new Path(applicationDir, suffix);

final Path localSrcPathWithScheme;
if (localSrcPath.toUri().getScheme() == null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the internal details of the Uri parsing but isNullOrWhitespaceOnly sounds more safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I agree, I've changed it

Copy link
Contributor

@gaborgsomogyi gaborgsomogyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, pending tests.

@mateczagany
Copy link
Contributor Author

Rebased to fix failing CI tests

@gaborgsomogyi
Copy link
Contributor

gaborgsomogyi commented Aug 1, 2023

@MartijnVisser are you fine w/ the PR? I'm merging if you say so...

@gaborgsomogyi
Copy link
Contributor

Waiting to end the feature freeze....

@gaborgsomogyi
Copy link
Contributor

@flinkbot run azure

@gaborgsomogyi
Copy link
Contributor

Unless there are objections + azure is passing I'm intended to merge this tomorrow.

@gaborgsomogyi gaborgsomogyi merged commit 9507dd6 into apache:master Sep 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants