This repository has been archived by the owner on May 3, 2022. It is now read-only.

Investigate consumer.id for better Spark Streaming + Kafka failure recovery #93

Closed

srowen opened this issue Nov 26, 2014 · 0 comments


srowen commented Nov 26, 2014

Spark 1.2.0 should no longer force consumers to always start from the beginning of a topic after recovering from failure. It should be possible to simply use a consistent consumer.id so that the job resumes reading where it left off. This would probably be better semantically for the Batch and Speed Layers.
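A minimal sketch of the idea, using the Kafka 0.8-era high-level consumer configuration that Spark 1.2's receiver-based `KafkaUtils.createStream` accepts as `kafkaParams`. The group name, ZooKeeper address, and helper class here are hypothetical, not from the Oryx codebase: the point is only that keeping `group.id` (and optionally `consumer.id`) stable across restarts lets the consumer resume from offsets committed to ZooKeeper under that group, instead of starting over.

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaResumeConfig {

    // Build Kafka consumer params with a fixed group.id so that, after a
    // restart, offsets previously committed to ZooKeeper under this group
    // are reused and consumption resumes where it left off.
    static Map<String, String> kafkaParams(String consumerGroupId) {
        Map<String, String> params = new HashMap<>();
        params.put("zookeeper.connect", "localhost:2181"); // assumption: local ZK
        params.put("group.id", consumerGroupId);           // keep consistent across runs
        // Only consulted when the group has no committed offset yet;
        // "largest" avoids replaying the whole topic on first start:
        params.put("auto.offset.reset", "largest");
        return params;
    }

    public static void main(String[] args) {
        // Hypothetical group name; any value works as long as it is stable.
        Map<String, String> params = kafkaParams("OryxGroup-SpeedLayer");
        System.out.println(params.get("group.id"));
    }
}
```

These params would then be passed to `KafkaUtils.createStream(jssc, kafkaParams, topics, storageLevel)`; the resume behavior comes from Kafka's offset tracking, not from Spark itself.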

@srowen srowen self-assigned this Nov 26, 2014
@srowen srowen added this to the 2.0.0 milestone Nov 26, 2014
srowen added a commit to srowen/oryx that referenced this issue Jan 25, 2015
…r.id, and use it when reading the input queue to read from where reads left off
@srowen srowen modified the milestones: 2.0.0-alpha-1, 2.0.0 Jan 25, 2015
@srowen srowen closed this as completed Jan 25, 2015