[FLINK-11653][DataStream] Add configuration to enforce custom UID's o… #7747

sjwiesman · 2019-02-18T19:28:24Z

…n datastream

What is the purpose of the change

Current best practice when deploying Flink applications to production is to set a custom UID, using DataStream#uid, so jobs can resume from savepoints even if the job graph has been modified. Flink should contain a configuration that allows users to fail submission if their program contains an operator without a custom UID, enforcing best practices similarly to #disableGenericTypes.
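To illustrate why a custom UID matters here, the following is a minimal sketch, not Flink's actual hashing scheme. It assumes a hypothetical `Op` type and derives the auto-generated ID purely from the operator's position in the graph, which is a deliberate simplification. Inserting a new operator shifts the auto-generated IDs of later operators, while the explicit UIDs stay the same.

```java
import java.util.ArrayList;
import java.util.List;

public class OperatorIdSketch {

    /** Hypothetical operator: a name plus an optional user-set UID (null when unset). */
    public static final class Op {
        final String name;
        final String uid;

        public Op(String name, String uid) {
            this.name = name;
            this.uid = uid;
        }
    }

    /**
     * Assigns one ID per operator. An explicit UID is used directly;
     * otherwise the ID is derived from the operator's position in the graph.
     * The position-based fallback is a stand-in for illustration only.
     */
    public static List<String> assignIds(List<Op> graph) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < graph.size(); i++) {
            Op op = graph.get(i);
            ids.add(op.name + "=" + (op.uid != null ? "uid:" + op.uid : "auto:" + i));
        }
        return ids;
    }
}
```

In this model, adding a `filter` between `source` and `map` changes the ID of an unlabelled `map`, so state saved under its old ID could no longer be matched.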

Brief change log

Adds a new configuration #disableAutoGeneratedUIDs which will fail job submission if there are any operators without a custom uid.

Verifying this change

This change added tests and can be verified as follows:

(example:)

Added test to StreamingJobGraphGeneratorTest

Does this pull request potentially affect one of the following parts:

Dependencies (does it add or upgrade a dependency): no
The public API, i.e., is any changed class annotated with @Public(Evolving): yes
The serializers: no
The runtime per-record code paths (performance sensitive): no
Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: no
The S3 file system connector: no

Documentation

Does this pull request introduce a new feature? yes
If yes, how is the feature documented? docs

flinkbot · 2019-02-18T19:28:38Z

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Review Progress

✅ 1. The [description] looks good.
- Approved by @dawidwys [committer]
✅ 2. There is [consensus] that the contribution should go into Flink.
- Approved by @aljoscha [PMC], @dawidwys [committer]
❗ 3. Needs [attention] from.
- Needs attention by @aljoscha [PMC]
✅ 4. The change fits into the overall [architecture].
- Approved by @dawidwys [committer]
✅ 5. Overall code [quality] is good.
- Approved by @dawidwys [committer]

Please see the Pull Request Review Guide for a full explanation of the review process.

The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands

The @flinkbot bot supports the following commands:

@flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
@flinkbot approve all to approve all aspects
@flinkbot approve-until architecture to approve everything until architecture
@flinkbot attention @username1 [@username2 ..] to require somebody's attention
@flinkbot disapprove architecture to remove an approval you gave earlier

klion26 · 2019-02-19T01:51:19Z

flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java

+	 *
+	 * <p>Auto generated UID's are enabled by default.
+	 *
+	 * @see #enableAutoGeneratedUIDs()


do we need to change @see #enableAutoGeneratedUUIDs() -> @see {@link #enableAutoGeneratedUUIDs()}

A little confused by what you're asking. @see automatically creates a link to the method, if that's what you mean.

klion26 · 2019-02-19T01:53:55Z

...-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java

@@ -305,4 +308,15 @@ public void invoke(Integer value) throws Exception {
 			}
 		}
 	}
+
+	@Test(expected = IllegalStateException.class)
+	public void testDisabledAutoUIDGenerator() {


How about adding a test with auto-generated UIDs enabled?

dawidwys · 2019-02-20T10:27:04Z

@flinkbot approve description
@flinkbot attention @aljoscha

aljoscha · 2019-02-21T09:21:07Z

@flinkbot approve consensus

dawidwys

I think we should get back to how we want to approach this issue.

There are two mechanisms for setting hashes:

setUid - current, recommended approach
setUidHash - legacy mode that works as a sort of fallback for cases where setUid cannot work

In the current version of this PR, we would force users to always use setUidHash, which is probably not what we want. I think ensuring that at least one of the two is set will be extremely hard: there is no single obvious place where we can introduce that check.

My suggestion would be to enforce only setUid calls, as this is the recommended approach, whereas setUidHash, as far as I understand it, is just a last-resort method. What do you think? I would appreciate @aljoscha's opinion on that as well.

dawidwys · 2019-02-21T15:14:48Z

flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java

+
+	/**
+	 * Disable's auto-generated UID's forces users to manually specify UID's
+	 * on their datastreams using {@code DataStream#uid} or {@code DataStream#setUidHash}.


Maybe let's not link DataStream#setUidHash, as it is a last-resort, hackish approach. What do you think?

dawidwys · 2019-02-21T15:15:29Z

flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java

+	 *
+	 * <p>It is highly recommended that user's specify UID's before deploying to
+	 * production since they are used to match state in savepoints to operators
+	 * in a job. Because auto-generated ID's are likely to change when modifying


Suggested change

* in a job. Because auto-generated ID's are likely to change when modifying

* in a job. Because auto-generated IDs are likely to change when modifying

dawidwys · 2019-02-21T15:18:48Z

flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java

+	 * @see #enableAutoGeneratedUIDs()
+	 * @see #disableAutoGeneratedUIDs()
+	 */
+	public boolean hasAutoGeneratedUIDsDisabled() {


Suggested change

public boolean hasAutoGeneratedUIDsDisabled() {

public boolean isAutoGeneratedUIDsDisabled() {

What do you think about inverting the check? isAutoGeneratedUIDsEnabled?

I was going for consistency with hasGenericTypesDisabled but I don't have a strong preference here.

sjwiesman · 2019-02-21T16:39:08Z

Thanks for the review, @dawidwys. That is the correct understanding of setUidHash. So I think the question is: will we allow users to migrate existing jobs to use this configuration, or say it is only allowed for new deployments? If we want to allow migration, then a manually set hash should count, but maybe we make the documentation clearer that it is not recommended?

sjwiesman · 2019-02-21T17:58:04Z

I've updated the PR to properly handle DataStream#setUid.
I've left setUidHash as allowed but updated the Javadoc to reflect that it is discouraged. Without a way to specify both the existing hash and a new uid, I think it should be allowed.
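The rule described above (either a uid or a manually set hash satisfies the requirement) can be sketched as follows. This is a simplified model under stated assumptions, not the code in this PR: the `Transformation` class, its fields, and `validate` are hypothetical names, and only the `IllegalStateException` failure type is taken from the test in this PR.

```java
import java.util.List;

public class UidEnforcementSketch {

    /** Hypothetical stand-in for a stream transformation; null means "not set". */
    public static final class Transformation {
        final String name;
        final String uid;
        final String userHash;

        public Transformation(String name, String uid, String userHash) {
            this.name = name;
            this.uid = uid;
            this.userHash = userHash;
        }
    }

    /**
     * Fails if auto-generated UIDs are disabled and any transformation
     * has neither a uid nor a user-provided hash.
     */
    public static void validate(List<Transformation> transformations, boolean autoGeneratedUidsEnabled) {
        if (autoGeneratedUidsEnabled) {
            return; // nothing to enforce
        }
        for (Transformation t : transformations) {
            if (t.uid == null && t.userHash == null) {
                throw new IllegalStateException(
                    "Auto-generated UIDs are disabled, but no UID or hash was set for operator " + t.name);
            }
        }
    }
}
```

Accepting the hash as well keeps a migration path open for existing jobs, at the cost of permitting the discouraged mechanism.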

dawidwys

Thanks @sjwiesman for the update. I've put some additional comments.

dawidwys · 2019-02-22T11:22:30Z

...-java/src/test/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGeneratorTest.java

+	}
+
+	@Test
+	public void testDisabledAutoUIDWithManuallySetId() {


Shouldn't all DisabledAutoUID* tests fail currently? Each of the tests has operators without uids assigned (e.g. .map(new NoOpIntMap()), .addSink(new DiscardingSink<>())).

No, again because of operator chaining. That’s why the third test explicitly disables chaining in one place to check that condition.

dawidwys · 2019-02-22T11:33:17Z

...ming-java/src/main/java/org/apache/flink/streaming/api/graph/StreamingJobGraphGenerator.java

@@ -348,6 +348,13 @@ private StreamConfig createJobVertex(
 					"Did you generate them before calling this method?");
 		}

+		if (!streamGraph.getExecutionConfig().hasAutoGeneratedUIDsEnabled()) {


I don't think this is the best place to check it.

Why don't we move the check to org.apache.flink.streaming.api.graph.StreamGraphGenerator#transform? I think what we want to effectively enforce is that a StreamTransformation has either uid or userHash set, don't we?

Unfortunately we can’t do that. Flink only uses the uid for the first operator in each operator chain. Most users seem to understand that and only set UIDs in those places. This is the only place we can check that the operator id assigned to each job vertex comes from a manually set uid, which is what we’re really after.

I tend to disagree. AFAIK for state mapping purposes all OperatorIds are preserved, even those that are chained into a single operator chain (a single chained JobVertex), otherwise it would be impossible to e.g. restore job with changed chaining, right?

Only the uid for the head operator in a chain is used to generate the OperatorId for a JobVertex, and I am not familiar with uids being used anywhere else to partition state within a single vertex. I believe it is impossible to restore a job with changed chaining, but I will double check.

I am pretty sure you can, and all operators are used to generate OperatorId. See org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator#createJobVertex:368-394.

Also see the ctor of JobVertex, it's true that the id of head operator is used as primaryId, but all chained operator ids are passed as operatorIds, which is later used in e.g. org.apache.flink.runtime.checkpoint.StateAssignmentOperation#assignTaskStateToExecutionJobVertices to assign proper state(s).

Thanks for pointing this out, I had no idea you could restore with different chaining. I'll make the necessary updates.
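The point settled in this thread can be modelled with a small sketch. It is a simplification, not Flink's StateAssignmentOperation: the `restore` method and the plain string IDs are hypothetical. State is looked up per operator ID for every operator in every chain, so regrouping operators into different chains does not change which state each one receives.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChainedRestoreSketch {

    /**
     * Looks up saved state for every operator in every chain (job vertex),
     * using only the operator ID as the key. The chain an operator belongs
     * to plays no part in the lookup.
     */
    public static Map<String, String> restore(Map<String, String> savepoint, List<List<String>> chains) {
        Map<String, String> restored = new LinkedHashMap<>();
        for (List<String> chain : chains) {
            for (String operatorId : chain) {
                String state = savepoint.get(operatorId);
                if (state != null) {
                    restored.put(operatorId, state);
                }
            }
        }
        return restored;
    }
}
```

Under this model, a check that only looks at the head operator of each chain would miss chained operators whose IDs still matter for state matching, which is why the check belongs at the level of individual transformations.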

dawidwys · 2019-02-22T11:37:21Z

flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java

+
+	/**
+	 * Disables auto-generated UIDs. Forces users to manually specify UIDs
+	 * on their datastreams using {@code DataStream#uid} or {@code DataStream#setUidHash}


Actually, DataStream has neither uid nor setUidHash methods. I think it is enough to say that it forces the use of manually specified UIDs.

dawidwys · 2019-02-22T16:44:43Z

@flinkbot approve architecture

aljoscha · 2019-02-25T13:33:23Z

@dawidwys That was a very good point about setUidHash()! If we can only easily check for one, we should go for setUid(). If both work we can make either of the two fulfil the requirement of having hashes/uid specified.

dawidwys

I think it looks good now. Thank you @sjwiesman for your contribution.

Merging.

@flinkbot approve all

klion26 reviewed Feb 19, 2019

dawidwys requested changes Feb 21, 2019

sjwiesman force-pushed the FLINK-11653 branch from ca3ff21 to 5f0bd6a Compare February 21, 2019 17:53

sjwiesman force-pushed the FLINK-11653 branch from 5f0bd6a to d36ec88 Compare February 21, 2019 20:53

sjwiesman and others added 2 commits February 21, 2019 14:56

[FLINK-11653][DataStream] Add configuration to enforce custom UID's o…

5076b94

…n datastream

[FLINK-11653][DataStream] Add configuration to enforce custom UID's o…

3b3e225

…n datastream

sjwiesman force-pushed the FLINK-11653 branch from d36ec88 to 3b3e225 Compare February 21, 2019 20:57

dawidwys requested changes Feb 22, 2019

dawidwys self-assigned this Feb 22, 2019

rmetzger added the review=architecture? label Feb 22, 2019

[FLINK-11653][DataStream] Add configuration to enforce custom UID's o…

0c62043

…n datastream

rmetzger requested a review from aljoscha February 22, 2019 16:45

rmetzger added review=quality? and removed review=architecture? labels Feb 22, 2019

dawidwys approved these changes Mar 4, 2019

rmetzger added review=approved ✅ and removed review=quality? labels Mar 4, 2019

dawidwys merged commit 9c32948 into apache:master Mar 4, 2019

rmetzger added the component=API/DataStream label Mar 18, 2019
