Epic: Realtime Ingestion Improvements #1642

Closed
himanshug opened this Issue Aug 19, 2015 · 7 comments

@himanshug
Contributor

himanshug commented Aug 19, 2015

This issue tracks all the related efforts aimed at making realtime ingestion better in one way or another.
Here is a wishlist of items we should try to solve. I understand that some of this is already solved by tranquility (though probably not in kafka-based ingestion), but we should ensure that those features are not broken as we make changes and that they are supported by kafka-based ingestion as well.

  1. Exactly-once semantics: No duplicates should be introduced and no events should be dropped (except when an input event is malformed/unparseable). This directly relates to both "No window period" and "FirehoseV2 proposal".
  2. We should be able to scale realtime ingestion by adding more nodes.
  3. Realtime ingestion and querying on that data should continue to work in the event of some process/node failures. Today this holds only for standalone realtime nodes. With ingestion done via "tasks", a node dying can lead to data loss (replicating ingestion tasks should potentially solve this).
  4. Query result consistency: A given query should get exactly the same result regardless of which realtime nodes/replicants served it.
  5. No-downtime upgrade: Realtime ingestion and queries should continue to make progress during an upgrade.
  6. Operational simplicity: For example, kafka ingestion should automatically handle kafka partition addition/removal.

(Realtime Delta Ingestion: the ability to ingest late events as they arrive would probably come as a side effect of item 1.)
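
To make item 1 concrete, here is a minimal sketch of one common way to get exactly-once behaviour from a replayable log: publish the ingested rows and the new read offset in a single atomic step, and reject any publish whose start offset does not match what is already committed. This is an illustration under stated assumptions, not the design from the linked docs. `MetadataStore`, `publish`, `ingest` and `run_demo` are hypothetical names, and the in-memory list stands in for a kafka partition.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataStore:
    """Hypothetical stand-in for a transactional metadata store."""
    committed_offset: int = 0
    segments: list = field(default_factory=list)

    def publish(self, start_offset, end_offset, rows):
        """Record rows and advance the offset together, or do nothing.

        The publish succeeds only if start_offset equals the offset
        committed so far, so a replayed batch cannot be added twice.
        """
        if start_offset != self.committed_offset:
            return False
        self.segments.append(list(rows))
        self.committed_offset = end_offset
        return True


def ingest(log, store, batch_size, fail_before_publish=False):
    """Read one batch starting at the committed offset and try to publish it."""
    start = store.committed_offset
    end = min(start + batch_size, len(log))
    rows = log[start:end]
    if fail_before_publish:
        return False  # simulated crash: nothing was committed
    return store.publish(start, end, rows)


def run_demo():
    """Ingest a five-event log with one simulated failure on the second attempt."""
    log = ["e0", "e1", "e2", "e3", "e4"]
    store = MetadataStore()
    attempt = 0
    while store.committed_offset < len(log):
        attempt += 1
        ingest(log, store, batch_size=2, fail_before_publish=(attempt == 2))
    return store
```

Because the reader always resumes from the committed offset, the failed attempt is simply retried, and a stale replay such as `store.publish(0, 2, ["e0", "e1"])` is rejected after the first batch has been committed. Every event ends up in the store once, with no window period needed to bound duplicates.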

Related Refs:
https://groups.google.com/forum/#!msg/druid-development/kHgHTgqKFlQ/fXvtsNxWzlMJ (No window period proposal)
https://groups.google.com/forum/#!msg/druid-development/9HB9hCcqvuI/L59RgsloZfoJ (FirehoseV2 proposal)
https://docs.google.com/document/d/1PUG3crI2jiPa_u926R0KrkZVM7t706rXp1IuUxVXB5E/edit?usp=sharing (doc covering design details for both above)
https://groups.google.com/forum/#!searchin/druid-development/tier/druid-development/1I3CmxlOipM/e3-SpWqG170J (Task Tiering proposal)

Related PRs:
#1609 (kafka simple consumer based firehose and initial FirehoseV2 updates)
#1639 (new plumber)

Related Issues:
#401 (log management for long-running tasks)
#1513 (preemption for indexing service locks)
#1514 (aggregatorFactories in segment metadata)
#1515 (AllocateSegmentAction)
#1516 (ElasticShardSpec)
#1517 (user-friendly Hadoop-based re-indexing/compaction)

@himanshug himanshug added the Discuss label Aug 19, 2015

@drcrallen

Contributor

drcrallen commented Aug 19, 2015

https://groups.google.com/forum/#!searchin/druid-development/tier/druid-development/1I3CmxlOipM/e3-SpWqG170J could fit here also?

That directly applies to 5, 2 (and maybe 6?) on the list.

@himanshug

Contributor

himanshug commented Aug 19, 2015

@drcrallen added

@himanshug

Contributor

himanshug commented Aug 31, 2015

I have created a document at https://docs.google.com/document/d/1PUG3crI2jiPa_u926R0KrkZVM7t706rXp1IuUxVXB5E/edit?usp=sharing to capture various design details of the kafka/tranquility ingestion work. It was created with input from @gianm and is still under active development. Feel free to discuss here.

@gianm

Contributor

gianm commented Sep 18, 2015

I updated the doc with some thoughts and preliminary code around push-based/tranquility ingestion.

@gianm

Contributor

gianm commented Oct 28, 2015

A couple of tangentially related things.

#1881 - Restorable indexing tasks (PR) - so middleManagers can be restarted similarly to realtime nodes
#1884 - Rack-aware availabilityGroup assignment (issue) - suggestion from @himanshug, to make rolling restarts batchable

@gianm

Contributor

gianm commented Jan 18, 2016

Updated the Google doc with the current state of the kafka ingestion work.

@gianm

Contributor

gianm commented Mar 15, 2017

Work based on this proposal was released a couple of releases ago. Circling back and closing this.

http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html

@gianm gianm closed this Mar 15, 2017
