Skip to content
This repository has been archived by the owner on Mar 3, 2023. It is now read-only.

New API for topology reliability modes #2089

Merged
merged 11 commits into from Jul 21, 2017
Merged

New API for topology reliability modes #2089

merged 11 commits into from Jul 21, 2017

Conversation

srkukarni
Copy link
Contributor

Currently Heron topologies can run in either atmost_once, atleast_once(and in the upcoming change exactly_once or stateful) modes. The way you specify the mode is not very obvious. For instance, in order to run in atmost once, you need to setEnableAcking(false), and for atleast once, setEnableAcking(true). This way of specifying actually attaches acking(which is the Heron;'s way of tracking tuples) with the semantics of atleast/atmost once. Further more, newer mode like exactly once cause more confusion with more flags that are added and one now needs to worry about which flags are set and which are unset to be really sure of whats going on.
This change introduces TopologyReliabilityMode which is an enum of the various different modes that Heron supports. This is a one stop and intuitive way of specifying the reliability mode that you want to run the topology in.
Legacy setEnableAcking is now deprecated and will be removed in future releases.

@srkukarni
Copy link
Contributor Author

Fixes #1904

* upon their death, it does not guard against tuple losses that can happen because
* of network partitions and other reasons.
*/
STATEFUL,
Copy link
Contributor

@billonahill billonahill Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems orthogonal to the reliability mode IMO. If exactly once leverages the stateful impl that's an implementation detail. From the users perspective these are different things. The reliability mode should specify guarantees with tuple reliability, while the stateful mode should define whether herons state management is being leveraged and enabled by the user logic in the topology.

IIRC the reason for requiring the stateful enabled flag was so that users could turn stateful processing of their topology on or off (otherwise we could derive this implicitly by inspecting the topology). If that's the case they might choose to do either while still keeping exactly once enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thats an interesting thought! It seems in general there might be lack of consensus/clarity on this one. So I've removed STATEFUL from the list of the modes. We'll add it back later.

@@ -302,8 +319,13 @@ public static void setSerializationClassName(Map<String, Object> conf, String cl
conf.put(Config.TOPOLOGY_SERIALIZER_CLASSNAME, className);
}

// This is deprecated. Please use setReliabilityMode instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this and others we should annotate as deprecated:

/**
 * Acking is now internally set based on the reliability mode being used.
 *
 * @deprecated use {@link #setReliabilityMode()} instead.  
 */
@Deprecated
public static void setEnableAcking...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -37,6 +37,7 @@
private long totalTuplesEmitted;
private PhysicalPlanHelper helper;

@SuppressWarnings("deprecation")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where we add this, we should add a comment that it's only need until TOPOLOGY_ENABLE_ACKING is removed. That way it doesn't linger after TOPOLOGY_ENABLE_ACKING is removed. Same for other occurrences in this pr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comments

*/
@Deprecated
public static final String TOPOLOGY_ENABLE_ACKING = "topology.acking";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I found no occurrences of TOPOLOGY_ENABLE_ACKING or setEnableAcking in twitters codebase. @nlu90 @maosongfu do you know how topologies set acking without using those?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember @maosongfu @kramasamy mentioning that there are some native heron topologies inside Twitter that might be using these features.

Copy link
Contributor

@billonahill billonahill Jul 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought there were too, but I don't see it being enabled by using either of these. Is there any other way? I see this in one place:

.put("topology.acking", String.valueOf(true))

@srkukarni srkukarni merged commit eefcd08 into apache:master Jul 21, 2017
@srkukarni srkukarni deleted the sanjeevk/ext1_reliabilitymode branch July 21, 2017 01:51
@billonahill
Copy link
Contributor

👍

nicknezis pushed a commit that referenced this pull request Sep 14, 2020
* Added topology.reliability.mode to specify what reliability the user
wants to run the topology in. This will deprecate the topology.acking,
topology.stateful.enabled and topology.stateful.exactonce settings, thereby
providing user with a single switch to control the topology behaviour

* Fixed build

* Made the value of topology.reliability.mode to be mirroring the names

* Fixed integration build

* Fixed integration test

* Brought back setEnableAcking interface back to keep compat with
Heron native topologies inside Twitter

* Fixed old values of reliability mode

* Marked deprecated fields/methods with annotations

* Removed STATEFUL mode since there is still lack of clarity on what that means

* Added comments that the suppresswarnings will go away once
TOPOLOGY_ENABLE_ACKING goes away
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants