Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-12143] Pass configuration parameters to plugins manager (FS components) #8660

Merged
merged 10 commits into from Jun 13, 2019

Conversation

1u0
Copy link
Contributor

@1u0 1u0 commented Jun 7, 2019

What is the purpose of the change

This PR changes Flink distribution's starter scripts: defines new configuration environment parameter for plugins dir and passes the configuration to the plugins manager.

Currently, only file system components support the plugin manager, so related (to file system) end-to-end tests also have been modified to be loaded via the plugin manager.

Brief change log

  • the scripts in flink-dist have been modified, a new FLINK_HOME environment variable replaces FLINK_ROOT_DIR;
  • the various Flink startup entry points are extended to configure and pass plugins dir to plugins manager;
  • the plugin manager is configured by system wide parents first patterns for class loading;
  • some end-to-end tests involving different file system components are modified to use fs components via plugins mechanism.

Verifying this change

This change is already covered by existing tests, such as:

  • e2e batch wordcount tests for hadoop, presto, azure and dummy fs are modified to use the corresponding fs component via plugin mechanism;
  • test_streaming_file_sink.sh is modified to use hadoop fs via plugin mechanism;
  • test_docker_embedded_job.sh is modified to use dummy fs via plugin mechanism;
  • test_yarn_kerberos_docker.sh is modified to use dummy fs via plugin mechanism.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (yes / no) (the distribution Flink scripts)
  • The serializers: (yes / no / don't know)
  • The runtime per-record code paths (performance sensitive): (yes / no / don't know)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes / no / don't know)
  • The S3 file system connector: (yes / no / don't know)

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Jun 7, 2019

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into to Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@1u0 1u0 mentioned this pull request Jun 7, 2019
@1u0 1u0 changed the title [FLINK-12143] [FLINK-12143] Pass configuration parameters to plugins manager (FS components) Jun 7, 2019
@pnowojski pnowojski self-requested a review June 11, 2019 09:13
Copy link
Contributor

@pnowojski pnowojski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@1u0 commits LGTM (and thanks @1u0 for driving this effort home :) )

@zentol could you take a look at my commits? They touch scripts/deployments so it would be nice if someone with an expertise in that area would take a look.

@zentol
Copy link
Contributor

zentol commented Jun 11, 2019

@pnowojski I'm not really knowledgeable about the start scripts nor deployment unfortuantely. I still skimmed over them and didn't spot anything that looks wildly suspicious.
I wouldn't know who to ask for as an "expert" though.

I'm not entirely sold on migrating from FLINK_ROOT_DIR to FLINK_HOME; it looks like only the docker/mesos and new pyflink code (bar some super-old scripts) were using FLINK_HOME.
So basically my question is, why did you pick one over the other?

@pnowojski
Copy link
Contributor

Thanks @zentol. I just hoped that someone with more experience here might catch something that's not covered by our tests (like the Windows bat file bug that @1u0 has fixed).

I'm not entirely sold on migrating from FLINK_ROOT_DIR to FLINK_HOME; it looks like only the docker/mesos and new pyflink code (bar some super-old scripts) were using FLINK_HOME. So basically my question is, why did you pick one over the other?

I just wanted to have them unified. I personally preferred FLINK_ROOT_DIR a bit better, but @tillrohrmann was using FLINK_HOME during our offline discussions about it (even when I was referring to it as FLINK_ROOT_DIR ;) ), so I presume he preferred the FLINK_HOME more. I didn't mind one way or another too much so just I sticked to nomenclature that @tillrohrmann was using.

If you have some stronger preference the other way, we could change it the other direction. If not, I would vote for leaving it unified as it is to avoid an extra work.

@tillrohrmann
Copy link
Contributor

For me X_HOME is a bit more standard than other names (looking at my local env variables definitions). But I'm not adamant on the actual naming.

@zentol
Copy link
Contributor

zentol commented Jun 13, 2019

Oh we should definitely unify things.

My gut feeling is that we should stick to FLINK_ROOT_DIR, purely because it was used before the dockerfiles were added (and thus introduced the inconsistency). Had we bothered about this consistency back then they probably would have used FLINK_ROOT_DIR as well; so we're fixing the timeline here.

But this is ultimately a minor thing that I wouldn't mind being decided by a coin-flip.

@pnowojski
Copy link
Contributor

Yes, you are right @zentol. I will just pick the lazy solution - keep it as it is proposed now.

@1u0 could you rebase and resolve the conflict so that I can merge the PR?

pnowojski and others added 10 commits June 13, 2019 12:00
The same concept of Flink's root/home directory was named in different places differently.
This commit unifies the naming convention to FLINK_HOME by renaming FLINK_ROOT_DIR and FLINK_HOME_DIR to FLINK_HOME.
Previously exporting of variables like FLINK_CONF_DIR, FLINK_BIN_DIR, FLINK_LIB_DIR, ...
was scattered or even dupliacted in multiple places. Now it happens only in config.sh.
@1u0
Copy link
Contributor Author

1u0 commented Jun 13, 2019

@1u0 could you rebase and resolve the conflict so that I can merge the PR?

Done.

@pnowojski
Copy link
Contributor

Thanks, merging :)

@pnowojski pnowojski merged commit 9e6ff81 into apache:master Jun 13, 2019
@1u0 1u0 deleted the flink-12143 branch June 13, 2019 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants