mwws commented Feb 18, 2016

Currently, accumulators are not recoverable from a checkpoint, so if a streaming application is restarted or recovers from a failure, the accumulator values are lost. I would like to add a new accumulator interface to StreamingContext that makes it possible to create recoverable accumulators for streaming applications.

I have also added an example that demonstrates how to use it:
examples/src/main/scala/org/apache/spark/examples/streaming/RecoverableAccumulator.scala

SparkQA commented Feb 18, 2016

Test build #51473 has finished for PR 11249 at commit 0c7e5d7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* checkpoint
* @param name name is required as identity to find corresponding accumulator.
*/
def accumulator[T](initialValue: T, name: String)(implicit param: AccumulatorParam[T])
Contributor

How about getOrCreateRecoverableAccumulator[T](createFunc: () => T, name: String)(...)? That name would also tell developers explicitly that only accumulators created via this API are recoverable.

Author

Hmm... changing the function name might be a good idea, to explicitly emphasize the recoverable feature. I will change it.

As for the input parameter, I don't think createFunc is necessary here; initialValue should be enough.
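
To make the trade-off in this exchange concrete, here is a minimal sketch; it is not the PR's code. `restoreOrCreate` and `restoreOrInit` are hypothetical helpers, and a plain `Map[String, Any]` stands in for the accumulator values saved in a checkpoint. Under those assumptions, the one behavioral difference between the two signatures is that a `createFunc` is evaluated only when no saved value exists, while an `initialValue` is always evaluated by the caller.

```scala
object CreateFuncVsInitialValue {
  // Hypothetical stand-in for accumulator values saved in a checkpoint, keyed by name.
  type Saved = Map[String, Any]

  // Variant suggested in review: the initial value is built lazily via createFunc.
  def restoreOrCreate[T](saved: Saved, name: String)(createFunc: () => T): T =
    saved.get(name).map(_.asInstanceOf[T]).getOrElse(createFunc())

  // Variant preferred by the author: the caller passes an already-built initial value.
  def restoreOrInit[T](saved: Saved, name: String, initialValue: T): T =
    saved.get(name).map(_.asInstanceOf[T]).getOrElse(initialValue)

  def main(args: Array[String]): Unit = {
    val saved: Saved = Map("records" -> 42L)

    // Counts how often the (pretend) expensive initial value is computed.
    var evaluations = 0
    def expensiveZero(): Long = { evaluations += 1; 0L }

    // The saved value wins, and createFunc is never invoked.
    val a = restoreOrCreate[Long](saved, "records")(() => expensiveZero())
    println(s"createFunc variant: value=$a, evaluations=$evaluations")

    // The saved value wins again, but the argument was already evaluated.
    val b = restoreOrInit[Long](saved, "records", expensiveZero())
    println(s"initialValue variant: value=$b, evaluations=$evaluations")
  }
}
```

For cheap initial values such as `0L`, the two variants are effectively equivalent, which is consistent with the author's point that `initialValue` should be enough.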

SparkQA commented Feb 19, 2016

Test build #51511 has finished for PR 11249 at commit 368dc0f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Feb 23, 2016

Test build #51746 has finished for PR 11249 at commit cca11ba.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


val newInitialValue: T = if (isCheckpointPresent) {
  _cp.trackedAccs.find(_.name == name).map(_.value).getOrElse(initialValue).asInstanceOf[T]
} else initialValue
Contributor

Nit:

if (...) {
   ...
} else {
   ...
}
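
Applied to the snippet under review, the braced style might look like the self-contained sketch below. `TrackedAcc` and `Checkpoint` are hypothetical stand-ins that model only the fields the snippet touches, and `startingValue` is an illustrative name; none of these are confirmed to match the PR's actual classes.

```scala
object RestoreSketch {
  // Hypothetical stand-in for an accumulator entry tracked in a checkpoint.
  final case class TrackedAcc(name: String, value: Any)

  // Hypothetical stand-in for the checkpoint; only the field used here is modeled.
  final case class Checkpoint(trackedAccs: Seq[TrackedAcc])

  // Pick the starting value: the checkpointed one if it exists, else initialValue.
  def startingValue[T](cp: Option[Checkpoint], name: String, initialValue: T): T = {
    val isCheckpointPresent = cp.isDefined
    if (isCheckpointPresent) {
      // Look the accumulator up by name; fall back if this name was never tracked.
      cp.get.trackedAccs
        .find(_.name == name)
        .map(_.value.asInstanceOf[T])
        .getOrElse(initialValue)
    } else {
      initialValue
    }
  }

  def main(args: Array[String]): Unit = {
    val cp = Checkpoint(Seq(TrackedAcc("records", 10L)))
    println(startingValue(Some(cp), "records", 0L)) // restored from the checkpoint
    println(startingValue(Some(cp), "errors", 0L))  // name not tracked: falls back
    println(startingValue(None, "records", 0L))     // no checkpoint: fresh start
  }
}
```

Because the lookup is keyed by `name`, two accumulators sharing a name would restore the same value, which is why the doc comment in the diff describes the name as the accumulator's identity.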

SparkQA commented Feb 26, 2016

Test build #52033 has finished for PR 11249 at commit 0725def.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

mwws changed the title from "[Spark-13374][Streaming][wip] make it possible to create recoverable accumulator for streaming application" to "[Spark-13374][Streaming] make it possible to create recoverable accumulator for streaming application" on Mar 3, 2016
Author

mwws commented Mar 3, 2016

@tdas could you help me review it?

mwws closed this on May 6, 2016
mwws deleted the SPARK-Accumulator branch on May 23, 2016 07:08