[FLINK-4127] Check API compatbility for 1.1 in flink-core #2177

rmetzger · 2016-06-28T14:00:56Z

I checked all the newly introduced methods in public APIs by going through the reports generated from japicmp.
I've also put the reports (before my PR) into the JIRA: https://issues.apache.org/jira/browse/FLINK-4127

I added the new configuration parameters to the documentation, and renamed some new configuration keys.

@uce @tillrohrmann @mxm: What do you think about the renaming of the configuration keys?

greghogan · 2016-06-28T18:53:03Z

docs/setup/config.md


 - `taskmanager.memory.preallocate`: Can be either of `true` or `false`. Specifies whether task managers should allocate all managed memory when starting up. (DEFAULT: false)

+- `taskmanager.runtime.large-record-handler`: Whether to use the LargeRecordHandler when spilling. This feature is experimental. (DEFAULT: false)


Have we documented when it would be useful to enable LargeRecordHandler?

That's a very good point, Greg. I think we didn't. As it stands, the large record handler seems to have known issues. Personally, I would therefore discourage people from using it, but it totally makes sense to add a paragraph about what it actually does.

@aljoscha, can you explain what the large record handler is doing?

Not really, I'm afraid. I just recently disabled it by default because of the known issues.

What I can gather from the code is that it is a separate sort buffer that is intended for very large records. It is not obvious from the code what "very large records" are, however.

The known problem is that key serialization does not work correctly if the user specifies a custom type or if Scala types are used, because the LargeRecordHandler uses the TypeAnalyzer to get a TypeInformation on the fly.

Maybe we should not mention this in the documentation then?

I would say so, yes.

mxm · 2016-06-29T08:40:21Z

Thanks for updating the documentation! I've made some suggestions regarding the names of the new configuration keys.

tillrohrmann · 2016-06-29T08:50:19Z

docs/setup/config.md


+### Resource Manager
+
+- `resourcemanager.rpc.port`: The config parameter defining the network port to connect to for communication with the resource manager.


Is it possible to specify a port range here?

No, it isn't. The port is currently not in use (except in testing code). In Yarn, the port of the application master is always the port of the job manager because the two run in the same actor system.

tillrohrmann · 2016-06-29T08:55:05Z

Good changes @rmetzger.

rmetzger · 2016-07-01T09:06:46Z

I addressed all comments. It seems to me that the configuration parameters for YARN are now overly complicated because they are split between the resource manager and YARN.

mxm · 2016-07-01T13:40:18Z

flink-core/src/main/java/org/apache/flink/configuration/ConfigConstants.java

 	 * in the flink-conf.yaml.
 	 */
-	public static final String CONTAINERED_MASTER_ENV_PREFIX = "containered.application-master.env.";
+	public static final String CONTAINERIZED_MASTER_ENV_PREFIX = "containerized.master.env.";


I thought this should be CONTAINER_MASTER_ENV_PREFIX = container.master.env.?

I renamed it.
I think we misunderstood each other. I thought our discussion regarding container / containered / containerized was about the prefix in general, not for the heap cutoff settings alone.
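For readers unfamiliar with prefix-style keys, here is a minimal sketch of how a constant like `CONTAINERIZED_MASTER_ENV_PREFIX` could be consumed. It assumes the configuration is available as a plain `Map<String, String>`; the class `EnvPrefixSketch` and the method `extractEnv` are hypothetical names for illustration and are not taken from flink-core.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

final class EnvPrefixSketch {

    // Prefix string taken from the diff above; everything else is illustrative.
    static final String CONTAINERIZED_MASTER_ENV_PREFIX = "containerized.master.env.";

    /**
     * Collects every entry whose key starts with the prefix and returns it
     * with the prefix stripped, so the remainder can serve as a variable name.
     */
    static Map<String, String> extractEnv(Map<String, String> conf, String prefix) {
        Map<String, String> env = new TreeMap<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            String key = entry.getKey();
            // Skip keys that are exactly the prefix: they carry no variable name.
            if (key.startsWith(prefix) && key.length() > prefix.length()) {
                env.put(key.substring(prefix.length()), entry.getValue());
            }
        }
        return env;
    }

    public static void main(String[] args) {
        // Hypothetical configuration entries, as they might appear in flink-conf.yaml.
        Map<String, String> conf = new HashMap<>();
        conf.put("containerized.master.env.LD_LIBRARY_PATH", "/usr/lib/native");
        conf.put("taskmanager.memory.preallocate", "false");
        System.out.println(extractEnv(conf, CONTAINERIZED_MASTER_ENV_PREFIX));
    }
}
```

With this shape, renaming the prefix only changes which keys match; the variable names after the prefix stay the same.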

rmetzger · 2016-07-04T13:27:34Z

The YARN settings are now spread across three prefixes:

yarn.
containerized.
container.

I don't think this is helpful for making the system easy to configure. I suggest removing the containerized. variant and renaming it to container.
What's your opinion on this @tillrohrmann @uce @StephanEwen ?

StephanEwen · 2016-07-04T13:33:43Z

That would be good to address. Can we get away with only yarn. and taskmanager.?

StephanEwen · 2016-07-04T17:12:53Z

There are a lot of config options that seem to exist in two variants: One for standalone setup, one for containered setup.

Why are we making this distinction?

tillrohrmann · 2016-07-05T12:41:17Z

I think it's a good idea to merge container and containerized.

I think the idea is to introduce some general container parameters, such as the heap-cutoff-ratio, which are valid not only for yarn but also for mesos. Therefore, the container/containered/containerized prefix was introduced. Given that we don't have mesos support yet it might be confusing. However, changing the parameter names too often, given that they are declared public evolving, will cause pain for users.

Which duplicate configuration parameters are you referring to @StephanEwen?

rmetzger · 2016-07-07T13:23:42Z

I'll rename containerized to container once I get a third confirmation from a committer on this issue.

aljoscha · 2016-07-07T13:32:00Z

Sorry for chiming in late, but can't heap-cutoff-ratio and heap-cutoff-min be general parameters that apply to all VMs we start?

Same for master.env.* and taskmanager.env.*. Shouldn't this also apply to all VMs that we start?

rmetzger · 2016-07-07T13:43:19Z

So you are suggesting to use the heap cutoff also for the JVM started in standalone mode?
The cutoff was introduced for YARN because there is a difference in the amount of used memory for a container we specify to YARN vs the memory the JVM is actually using.
So if we specify a container of 1000 MB, it will be killed by YARN as soon as we use 1001 MB. For operating systems, this doesn't apply, and if users run into issues, they can just set a lower heap size.
For Mesos, I don't know how it behaves, but I hope that there is a similar requirement, otherwise, we would have prematurely optimized the configuration keys.
I'm against applying this mechanism to all JVMs we start.

For the environment variables of the standalone mode, it's quite easy to put them into the config.sh file, or just set the env variables.
But true, we could do a follow up JIRA to extract those env variables and set them in the config.sh script.
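The heap cutoff described in this comment can be illustrated with a small calculation. This is only a sketch under assumptions: that the cutoff is the larger of a ratio of the container size and a minimum in MB, and that the remainder becomes the JVM heap. The class and method names and the example numbers are hypothetical and are not taken from the Flink code base.

```java
final class HeapCutoffSketch {

    /**
     * Returns the JVM heap size in MB for a container of the given size,
     * leaving max(cutoffMinMb, containerMb * cutoffRatio) MB unused so the
     * process stays below the limit enforced by the container manager.
     */
    static int jvmHeapMb(int containerMb, double cutoffRatio, int cutoffMinMb) {
        if (cutoffRatio < 0.0 || cutoffRatio >= 1.0) {
            throw new IllegalArgumentException("cutoff ratio must be in [0, 1)");
        }
        int cutoffMb = Math.max(cutoffMinMb, (int) (containerMb * cutoffRatio));
        if (cutoffMb >= containerMb) {
            throw new IllegalArgumentException("cutoff leaves no memory for the heap");
        }
        return containerMb - cutoffMb;
    }

    public static void main(String[] args) {
        // Illustrative values only: a 1000 MB container, a 25% ratio, a 100 MB minimum.
        System.out.println(jvmHeapMb(1000, 0.25, 100)); // prints 750
    }
}
```

In standalone mode nothing enforces such a limit, which is why the comment argues that lowering the heap size manually is sufficient there.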

aljoscha · 2016-07-07T13:52:23Z

My thinking was that Mesos maybe didn't require this, yes. Having it for Yarn only makes for a better user experience right now. So we could leave it under the "container" namespace.

rmetzger · 2016-07-07T14:47:51Z

But with this argument, it would be even better to leave the configuration parameters as they are in 1.0 until we are 100% certain that mesos needs a similar mechanism.

aljoscha · 2016-07-07T14:50:27Z

Yeah, then let's leave them as they were since we don't know yet what mesos will require there.

rmetzger · 2016-07-11T11:58:59Z

All the container* config keys are now renamed to containerized., so we have yarn. and containerized. as prefixes. But these config keys will be undocumented for now.
In the documentation, we still only list the yarn. configuration keys (and they are also picked up by the system).

I guess with the 1.2 release (when we are adding Mesos support) we can reconsider this decision and see how we are naming / grouping the configuration keys.

I'm going to merge this pull request after 24 hours.
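Having both the `yarn.` and the `containerized.` keys picked up amounts to a lookup with a fallback. The sketch below shows one way this could work, assuming the configuration is a plain `Map<String, String>`. The key names are formed by combining the prefixes and the parameter name mentioned in this thread, and the lookup order (new key first, older key second) is an assumption. `KeyFallbackSketch` and `lookup` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class KeyFallbackSketch {

    /**
     * Returns the value of the first key that is present in the configuration,
     * trying the keys in the order given, or the default if none is set.
     */
    static String lookup(Map<String, String> conf, String defaultValue, String... keysInPriorityOrder) {
        for (String key : keysInPriorityOrder) {
            String value = conf.get(key);
            if (value != null) {
                return value;
            }
        }
        return defaultValue;
    }

    public static void main(String[] args) {
        // Assumed key names and values, for illustration only.
        Map<String, String> conf = new HashMap<>();
        conf.put("yarn.heap-cutoff-ratio", "0.3");

        String ratio = lookup(conf, "0.25",
                "containerized.heap-cutoff-ratio", // undocumented new key, tried first (assumption)
                "yarn.heap-cutoff-ratio");         // documented key, used as fallback
        System.out.println(ratio); // prints 0.3
    }
}
```

A lookup of this kind lets existing `yarn.` configurations keep working while the grouping of the keys is reconsidered later.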

greghogan reviewed Jun 28, 2016

tillrohrmann reviewed Jun 29, 2016

mxm reviewed Jul 1, 2016

rmetzger added 5 commits July 11, 2016 14:04

[FLINK-4127] Check API compatbility for 1.1 in flink-core

793ebaf

Address some of the comments

10a3a6f

group config keys

077fb79

Adress Max comment

e2190df

undo

d85ce53

rmetzger force-pushed the flink4127 branch from 6d5ad83 to d85ce53 Compare July 11, 2016 12:04

asfgit closed this in 6b7bb76 Jul 13, 2016

rmetzger deleted the flink4127 branch July 14, 2016 09:51

rmetzger added the component=BuildSystem label Mar 14, 2019

