[FLINK-4127] Check API compatibility for 1.1 in flink-core #2177
Conversation
docs/setup/config.md (Outdated)

- `taskmanager.memory.preallocate`: Can be either of `true` or `false`. Specifies whether task managers should allocate all managed memory when starting up. (DEFAULT: false)
- `taskmanager.runtime.large-record-handler`: Whether to use the LargeRecordHandler when spilling. This feature is experimental. (DEFAULT: false)
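For reference, the two keys quoted in the diff above would look like this in `flink-conf.yaml` (the values shown are the documented defaults):

```yaml
# Allocate all managed memory eagerly when the task manager starts (default: false)
taskmanager.memory.preallocate: false

# Experimental spilling path for very large records (default: false)
taskmanager.runtime.large-record-handler: false
```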
Have we documented when it would be useful to enable `LargeRecordHandler`?
That's a very good point, Greg. I think we didn't. As it stands, the large record handler seems to have known issues. Personally, I would therefore discourage people from using it, but it totally makes sense to add a paragraph about what it actually does.
@aljoscha, can you explain what the large record handler is doing?
Not really, I'm afraid. I just recently disabled it by default because of the known issues.
What I can gather from the code is that it is a separate sort buffer that is intended for very large records. It is not obvious from the code what "very large records" are, however.
The known problem is that key serialization does not work correctly if the user specified a custom type or if Scala types are used, because the `TypeAnalyzer` is used in the `LargeRecordHandler` to get a `TypeInformation` on the fly.
Maybe we should not mention this in the documentation then?
I would say so, yes.
Thanks for updating the documentation! I've made some suggestions regarding the names of the new configuration keys.
docs/setup/config.md (Outdated)

### Resource Manager

- `resourcemanager.rpc.port`: The config parameter defining the network port to connect to for communication with the resource manager.
Is it possible to specify a port range here?
No, it is not possible. The parameter is currently not in use (except in testing code). In YARN, the port of the application master is always the port of the job manager, because the two run in the same actor system.
Good changes @rmetzger.
I addressed all comments. It seems to me that the configuration parameters for YARN are now overly complicated, because they are split between the resource manager and YARN sections.
 * in the flink-conf.yaml.
 */
- public static final String CONTAINERED_MASTER_ENV_PREFIX = "containered.application-master.env.";
+ public static final String CONTAINERIZED_MASTER_ENV_PREFIX = "containerized.master.env.";
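To illustrate how such a prefix key is typically consumed, here is a minimal, hypothetical sketch (the `extractEnv` helper and the sample config map are assumptions for illustration, not Flink's actual implementation): every configuration entry under the prefix is collected, the prefix is stripped, and the remainder becomes an environment variable for the master container.

```java
import java.util.HashMap;
import java.util.Map;

public class EnvPrefixDemo {
    // Prefix as discussed in the diff above.
    static final String MASTER_ENV_PREFIX = "containerized.master.env.";

    /** Collect all config entries under the prefix, stripping it to yield env var names. */
    static Map<String, String> extractEnv(Map<String, String> config, String prefix) {
        Map<String, String> env = new HashMap<>();
        for (Map.Entry<String, String> e : config.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                env.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        // A user-supplied entry in flink-conf.yaml (hypothetical value):
        config.put("containerized.master.env.JAVA_HOME", "/opt/java");
        // An unrelated key that must not leak into the environment:
        config.put("taskmanager.memory.preallocate", "false");

        Map<String, String> env = extractEnv(config, MASTER_ENV_PREFIX);
        System.out.println(env.get("JAVA_HOME")); // /opt/java
        System.out.println(env.size());           // 1
    }
}
```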
I thought this should be `CONTAINER_MASTER_ENV_PREFIX = "container.master.env."`?
I renamed it.
I think we misunderstood each other. I thought our discussion regarding `container` / `containered` / `containerized` was about the prefix in general, not about the heap cutoff settings alone.
The YARN settings are now spread across three prefixes:
I don't think this is helpful for making the system easy to configure. I suggest to remove the
That would be good to address. Can we get away with only
There are a lot of config options that seem to exist in two variants: one for the standalone setup, one for the containered setup. Why are we making this distinction?
I think it's a good idea to merge
I think the idea is to introduce some general container parameters, such as the
Which duplicate configuration parameters are you referring to @StephanEwen?
I'll rename
Sorry for chiming in late, but can't
- Same for
So you are suggesting to use the heap cutoff also for the JVM started in standalone mode? For the environment variables of the standalone mode, it's quite easy to put them into the config.sh file, or just set the env variables directly.
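The heap cutoff being discussed here reserves part of the container's memory for off-heap usage before sizing the JVM heap. A minimal sketch of the idea, assuming an illustrative ratio of 0.25 and a minimum cutoff of 384 MB (the class, method, and constants are hypothetical, not Flink's actual code):

```java
public class HeapCutoffDemo {
    // Illustrative values; the real ones come from configuration keys.
    static final double CUTOFF_RATIO = 0.25;
    static final long MIN_CUTOFF_MB = 384;

    /** JVM heap = container memory minus a safety cutoff for off-heap memory. */
    static long heapSizeMb(long containerMemoryMb) {
        long cutoff = Math.max((long) (containerMemoryMb * CUTOFF_RATIO), MIN_CUTOFF_MB);
        return containerMemoryMb - cutoff;
    }

    public static void main(String[] args) {
        // Ratio dominates: 4096 * 0.25 = 1024 MB cutoff -> 3072 MB heap.
        System.out.println(heapSizeMb(4096)); // 3072
        // Minimum dominates: 1024 * 0.25 = 256 < 384 -> 384 MB cutoff -> 640 MB heap.
        System.out.println(heapSizeMb(1024)); // 640
    }
}
```

The point of the `Math.max` is that small containers still keep a fixed floor of non-heap headroom, which is why the setting matters mainly for containerized deployments.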
My thinking was that Mesos maybe didn't require this, yes. Having it for YARN only makes for a better user experience right now. So we could leave it under the "container" namespace.
But with this argument, it would be even better to leave the configuration parameters as they are in 1.0 until we are 100% certain that Mesos needs a similar mechanism.
Yeah, then let's leave them as they were, since we don't know yet what Mesos will require there.
All the container* config keys are now renamed to
I guess with the 1.2 release (when we are adding Mesos support) we can reconsider this decision and see how we are naming / grouping the configuration keys.
I'm going to merge this pull request after 24 hours.
I checked all the newly introduced methods in public APIs by going through the reports generated by japicmp.
I've also put the reports (before my PR) into the JIRA: https://issues.apache.org/jira/browse/FLINK-4127
I added the new configuration parameters to the documentation, and renamed some new configuration keys.
@uce @tillrohrmann @mxm: What do you think about the renaming of the configuration keys?