
[WIP] Druid 0.12.0 release notes #5211

Closed
jon-wei opened this issue Jan 4, 2018 · 14 comments

jon-wei (Contributor) commented Jan 4, 2018

DRAFT

Druid 0.12.0 contains over a hundred performance improvements, stability improvements, and bug fixes from almost 40 contributors. This release adds major improvements to the Kafka indexing service.

Other major new features include:

  • Prioritized task locking
  • Improved automatic segment management
  • Test stats post-aggregators
  • Numeric quantiles sketch aggregator
  • Basic auth extension
  • Query request queuing improvements
  • Parse batch support
  • Various performance improvements
  • Various improvements to Druid SQL

The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.12.0

Documentation for this release is at: http://druid.io/docs/0.12.0/

Highlights

Kafka indexing incremental handoffs and decoupled partitioning

The Kafka indexing service now supports incremental handoffs, as well as decoupling the number of segments created by a Kafka indexing task from the number of Kafka partitions. Please see #4815 (comment) for more information.

Added by @pjain1 in #4815.

Prioritized task locking

Druid now supports priorities for indexing task locks. When an indexing task needs to acquire a lock on a datasource + interval, higher-priority tasks can now preempt lower-priority tasks. Please see http://druid.io/docs/0.12.0-rc1/ingestion/tasks.html#task-priority for more information.
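
For example, the lock priority can be overridden through a task's context (a minimal sketch following the task priority docs linked above; the surrounding task spec fields are omitted and the priority value is arbitrary):

```json
{
  "type": "index",
  "spec": { "...": "..." },
  "context": {
    "priority": 100
  }
}
```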

Added by @jihoonson in #4550.

Improved automatic segment management

Automatic pending segments cleanup

Indexing tasks create entries in the "pendingSegments" table in the metadata store; prior to 0.12.0, these temporary entries were not automatically cleaned up, leading to possible cluster performance degradation over time. Druid 0.12.0 allows the coordinator to automatically clean up unused entries in the pending segments table. This feature is enabled by setting druid.coordinator.kill.pendingSegments.on=true in coordinator properties.
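
For example, in the Coordinator's runtime.properties:

```properties
# Enable automatic cleanup of stale entries in the pendingSegments metadata table
druid.coordinator.kill.pendingSegments.on=true
```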

Added by @jihoonson in #5149.

Compaction task

Compacting segments (merging a set of segments within a given interval to create a set with larger but fewer segments) is a common Druid batch ingestion use case. Druid 0.12.0 now supports a Compaction Task that merges all segments within a given interval into a single segment. Please see http://druid.io/docs/0.12.0-rc1/ingestion/tasks.html#compaction-task for more details.
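
A minimal compaction task spec might look like the following sketch (the dataSource and interval values are hypothetical; see the linked docs for the full set of options):

```json
{
  "type": "compact",
  "dataSource": "wikipedia",
  "interval": "2017-01-01/2017-02-01"
}
```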

Added by @jihoonson in #4985.

Test stats post-aggregators

New z-score and p-value test statistics post-aggregators have been added to the druid-stats extension. Please see http://druid.io/docs/0.12.0-rc1/development/extensions-core/test-stats.html for more details.
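
As a rough sketch (the type and field names below follow the linked docs page and should be verified there; the referenced aggregator outputs are hypothetical), a two-sample z-score post-aggregator could be specified as:

```json
{
  "type": "zscore2sample",
  "name": "zscore",
  "successCount1": { "type": "fieldAccess", "fieldName": "successCountA" },
  "sample1Size": { "type": "fieldAccess", "fieldName": "sampleSizeA" },
  "successCount2": { "type": "fieldAccess", "fieldName": "successCountB" },
  "sample2Size": { "type": "fieldAccess", "fieldName": "sampleSizeB" }
}
```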

Added by @chunghochen in #4532.

Numeric quantiles sketch aggregator

A numeric quantiles sketch aggregator has been added to the druid-datasketches extension.
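
For example (a sketch assuming the aggregator type name quantilesDoublesSketch; the column name and k value are hypothetical, see the druid-datasketches docs for details):

```json
{
  "type": "quantilesDoublesSketch",
  "name": "latency_sketch",
  "fieldName": "latencyMs",
  "k": 128
}
```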

Added by @AlexanderSaydakov in #5002.

Basic auth extension

Druid 0.12.0 includes a new authentication/authorization extension that provides Basic HTTP authentication and simple role-based access control. Please see http://druid.io/docs/0.12.0-rc1/development/extensions-core/druid-basic-security.html for more information.
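
A minimal configuration sketch (the authenticator/authorizer names are examples, and the property names should be verified against the linked docs before use):

```properties
# Load the extension alongside your existing extensions
druid.extensions.loadList=["druid-basic-security"]

# Basic HTTP authenticator with bootstrap passwords (example names/values only)
druid.auth.authenticatorChain=["MyBasicAuthenticator"]
druid.auth.authenticator.MyBasicAuthenticator.type=basic
druid.auth.authenticator.MyBasicAuthenticator.initialAdminPassword=changeme1
druid.auth.authenticator.MyBasicAuthenticator.initialInternalClientPassword=changeme2
druid.auth.authenticator.MyBasicAuthenticator.authorizerName=MyBasicAuthorizer

# Simple role-based authorizer
druid.auth.authorizers=["MyBasicAuthorizer"]
druid.auth.authorizer.MyBasicAuthorizer.type=basic
```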

Added by @jon-wei in #5099.

Query request queuing improvements

Prior to 0.12.0, clients could inadvertently overwhelm a Broker by sending too many requests, which would pile up in an unbounded Jetty worker pool queue. Clients typically close the connection after a client-side timeout, but the Broker would continue to process the queued requests, giving the appearance of being unresponsive. Meanwhile, clients would continue to retry, adding even more requests to an already overloaded Broker.

The newly introduced properties druid.server.http.queueSize and druid.server.http.enableRequestLimit in the broker configuration and historical configuration allow users to configure request rejection to prevent clients from overwhelming brokers and historicals with queries.
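
For example, in Broker and Historical runtime.properties (the queue size value is illustrative):

```properties
# Reject excess requests instead of queueing them indefinitely
druid.server.http.enableRequestLimit=true
# Bound the Jetty worker pool queue
druid.server.http.queueSize=20
```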

Added by @himanshug in #4540.

Parse batch support

For developers of custom ingestion parsing extensions, it is now possible for InputRowParsers to return multiple InputRows from a single input row. This can simplify ingestion pipelines by reducing the need for input transformations outside of Druid.
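
As a rough sketch (this is not the exact interface; check InputRowParser in the Druid source for the real signatures), a custom parser can now return several InputRows for a single input record, for example by exploding a nested list of events. The class, field, and dimension names below are hypothetical:

```java
import io.druid.data.input.InputRow;
import io.druid.data.input.MapBasedInputRow;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class ExplodingParserSketch /* would implement InputRowParser<Map<String, Object>> */
{
  // parseBatch replaces the old single-row parse(): one input record may yield many InputRows.
  public List<InputRow> parseBatch(Map<String, Object> input)
  {
    final long timestamp = ((Number) input.get("timestamp")).longValue();
    @SuppressWarnings("unchecked")
    final List<Map<String, Object>> events = (List<Map<String, Object>>) input.get("events");

    final List<InputRow> rows = new ArrayList<>();
    for (Map<String, Object> event : events) {
      rows.add(new MapBasedInputRow(timestamp, Arrays.asList("page", "user"), event));
    }
    return rows;
  }
}
```
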
Added by @pjain1 in #5081.

Performance improvements

SegmentWriteOutMedium

When creating new segments, Druid stores some pre-processed data in temporary buffers. Prior to 0.12.0, these buffers were always kept in temporary files on disk. In 0.12.0, PR #4762 by @leventov allows these temporary buffers to be stored in off-heap memory, thus reducing the number of disk I/O operations during ingestion. To enable using off-heap memory for these buffers, the druid.peon.defaultSegmentWriteOutMediumFactory property needs to be configured accordingly. If using off-heap memory for the temporary buffers, please ensure that -XX:MaxDirectMemorySize is increased to accommodate the higher direct memory usage.

Please see http://druid.io/docs/0.12.0-rc1/configuration/indexing-service.html#SegmentWriteOutMediumFactory for configuration details.
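
For example, in the Peon configuration (a sketch; verify the exact property form against the linked docs):

```properties
# runtime.properties: keep temporary segment-creation buffers off-heap
# (the linked docs may express this as a factory type, e.g. via a ".type" sub-key)
druid.peon.defaultSegmentWriteOutMediumFactory=offHeapMemory
```

When this is enabled, raise -XX:MaxDirectMemorySize in the Peon JVM options accordingly; the appropriate value depends on your workload.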

Parallel merging of intermediate GroupBy results

PR #4704 by @jihoonson allows the user to configure a number of processing threads to be used for parallel merging of intermediate GroupBy results that have been spilled to disk. Prior to 0.12.0, this merging step would always take place within a single thread.

Please see http://druid.io/docs/0.12.0-rc1/configuration/querying/groupbyquery.html#parallel-combine for configuration details.
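
For example (a sketch; the property name should be verified against the linked docs, and the thread count is illustrative):

```properties
# Use some of the processing threads to combine spilled GroupBy results in parallel
druid.query.groupBy.numParallelCombineThreads=4
```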

Other performance improvements

SQL improvements

Various improvements and features have been added to Druid SQL, by @gianm in the following PRs:

And much more!

The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.12.0

Updating from 0.11.0 and earlier

Please see below for changes between 0.11.0 and 0.12.0 that you should be aware of before upgrading. If you're updating from an earlier version than 0.11.0, please see release notes of the relevant intermediate versions for additional notes.

Rollback restrictions

Please note that after upgrading to 0.12.0, it is no longer possible to downgrade to a version older than 0.11.0, due to changes made in #4762. It is still possible to roll back to version 0.11.0.

com.metamx.java-util library migration

The Metamarkets java-util library has been brought into Druid. As a result, the following package references have changed:

com.metamx.common -> io.druid.java.util.common
com.metamx.emitter -> io.druid.java.util.emitter
com.metamx.http -> io.druid.java.util.http
com.metamx.metrics -> io.druid.java.util.metrics

This will affect the druid.monitoring.monitors configuration. References to monitor classes under the old com.metamx.metrics.* package will need to be updated to reference io.druid.java.util.metrics.* instead, e.g. io.druid.java.util.metrics.JvmMonitor.
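
For example, a monitors entry in runtime.properties would change as follows:

```properties
# Before (0.11.0 and earlier)
druid.monitoring.monitors=["com.metamx.metrics.JvmMonitor"]

# After (0.12.0)
druid.monitoring.monitors=["io.druid.java.util.metrics.JvmMonitor"]
```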

If classes under the com.metamx packages shown above are referenced in other configurations such as log4j2.xml, those references will need to be updated as well.

Extension developers will need to update their code to use the new Druid packages as well.

Caffeine cache extension

The Caffeine cache has been moved from an extension into core Druid, and it is now the default cache implementation. Please remove druid-caffeine-cache from the extensions load list, if present, when upgrading to 0.12.0. More information can be found at #4810.
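
For example (extension names other than druid-caffeine-cache are hypothetical):

```properties
# Before
druid.extensions.loadList=["druid-caffeine-cache", "druid-kafka-indexing-service"]

# After upgrading to 0.12.0: Caffeine is built in and is the default cache
druid.extensions.loadList=["druid-kafka-indexing-service"]
```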

Kafka indexing service changes

earlyMessageRejectPeriod

The semantics of the earlyMessageRejectPeriod configuration have changed. The earlyMessageRejectPeriod will now be added to (task start time + task duration) instead of just (task start time) when determining the bounds of the message window. Please see #4990 for more information.
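
For example, with a hypothetical supervisor ioConfig such as:

```json
{
  "taskDuration": "PT1H",
  "earlyMessageRejectPeriod": "PT10M"
}
```

a task starting at 10:00 now rejects messages with timestamps after 11:10 (task start + taskDuration + earlyMessageRejectPeriod), whereas prior to 0.12.0 the cutoff would have been 10:10 (task start + earlyMessageRejectPeriod).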

Rolling upgrade

In 0.12.0, there are protocol changes between the Kafka supervisor and the Kafka indexing tasks, as well as changes to the metadata formats persisted on disk. Therefore, to support a rolling upgrade, all Middle Managers must be upgraded before the Overlord. Note that this is different from the standard upgrade order, and that it is only necessary when using the Kafka indexing service. If you are not using the Kafka indexing service, or can tolerate downtime for the Kafka supervisor, you can upgrade in any order.

Until the Overlord is upgraded, all Kafka indexing tasks (even upgraded ones) will behave as before, i.e., without decoupled partitioning and incremental handoffs. Once the Overlord is upgraded, new tasks started by the upgraded Overlord will support the new features.

Please see #4815 for more info.

Roll back

Once both the Overlord and the Middle Managers are rolled back, a new set of tasks should be started, which will work properly. However, the current set of tasks may fail during the rollback. Please see #4815 for more info.

Interface Changes for Extension Developers

The ColumnSelectorFactory API has changed. Aggregator extension authors and any others who use ColumnSelectorFactory will need to update their code accordingly. Please see #4886 for more details.

The Aggregator.reset() method has been removed because it was deprecated and unused. Please see #5177 for more info.

The DataSegmentPusher interface has changed, and the push() method now has an additional replaceExisting parameter. Please see #5187 for details.
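
As a rough before/after sketch of this change (the exact signature is an assumption; see #5187 and the DataSegmentPusher source for the authoritative definition):

```java
// Hypothetical illustration only, not copied from the Druid source.
// Before 0.12.0:
//   DataSegment push(File file, DataSegment segment) throws IOException;
// In 0.12.0, implementations must also handle a replaceExisting flag:
//   DataSegment push(File file, DataSegment segment, boolean replaceExisting) throws IOException;
```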

The Escalator interface has changed: the createEscalatedJettyClient method has been removed. Please see #5322 for more details.

Credits

Thanks to everyone who contributed to this release!

@a2l007
@akashdw
@AlexanderSaydakov
@b-slim
@ben-manes
@benvogan
@chuanlei
@chunghochen
@clintropolis
@daniel-tcell
@dclim
@dpenas
@drcrallen
@egor-ryashin
@elloooooo
@Fokko
@fuji-151a
@gianm
@gvsmirnov
@himanshug
@hzy001
@Igosuki
@jihoonson
@jon-wei
@KenjiTakahashi
@kevinconaway
@leventov
@mh2753
@niketh
@nishantmonu51
@pjain1
@QiuMM
@QubitPi
@quenlang
@Shimi
@skyler-tao
@xanec
@yuppie-flu
@zhangxinyu1

jon-wei added this to the 0.12.0 milestone on Jan 4, 2018
jon-wei changed the title from "Druid 0.12.0 release notes" to "[WIP] Druid 0.12.0 release notes" on Jan 5, 2018
QubitPi (Contributor) commented Jan 7, 2018

@jon-wei Do we have an ETA for version 12?

Fokko (Contributor) commented Jan 8, 2018

I see a lot of performance improvements, are there benchmarks available?

jon-wei (Contributor, Author) commented Jan 10, 2018

@QubitPi We don't have an ETA for 0.12.0 presently, that depends on how many issues surface during our release candidate phase(s). The first RC should be out soon though, possibly sometime next week, as the 0.12.0 milestone has almost been cleared of open issues.

@Fokko PR #4704 and #5048 have benchmarks in the PR threads.

@leventov Do you happen to have benchmarks for #4762, #4811, and #5094?

kgneng2 commented Jan 10, 2018

When will you release 0.12.0?

jon-wei (Contributor, Author) commented Jan 10, 2018

@kang-junyoung There's no ETA right now, as I mentioned in the comment above; it depends on how many issues come up during the RCs.

kgneng2 commented Jan 11, 2018

@jon-wei Thank you.

leventov (Member) commented:

@jon-wei

On #4762: I don't have benchmarks, but in practice we have seen a 50% CPU reduction on Middle Manager nodes (between two Druid versions that differ by more than just this change, but I largely attribute the improvement to the changes equivalent to #4762). I suggest emphasizing in the release notes that this improvement won't happen automatically, but only if users explicitly change the segmentWriteOutMediumFactory configuration, which increases memory usage during segment creation. This means each Peon process should have a larger -XX:MaxDirectMemorySize configured. But if there are many Peons on a Middle Manager node, and segment handoff is not coordinated among the Peon processes, peak total memory usage on the node shouldn't increase by more than 10-20% over the peak usage when segmentWriteOutMediumFactory=tmpFile is used. (This should probably be added to the docs, if it's not clear from them yet.)

It's also important to note that those observations are made based on the experience with just one Druid installation, so I encourage everybody to try segmentWriteOutMediumFactory=offHeapMemory and share the experience.

On #4811: no benchmarks.

On #5094: no benchmarks, but in practice we have seen the heap footprint of Brokers go down by 40%. Such an improvement will probably only be seen on Druid installations with millions of live segments, like ours.

jon-wei (Contributor, Author) commented Jan 12, 2018

@leventov Thanks for the details. I've updated the section on the SegmentWriteOutMedium to point out that the new off-heap buffers need to be explicitly configured.

josephglanville (Contributor) commented:

A big feature for us that isn't mentioned here yet is the new parseBatch API in InputRowParser.
This change allows an InputRowParser to emit multiple rows per message.

We are making use of this to eliminate a transformation that needed to happen prior to Druid ingestion. It also means that we can reingest from our raw input data without having to store transformed intermediates for Druid to consume.
It's a big simplification of our data architecture, and it could prove useful for others whose input data isn't sufficiently denormalised/flattened for direct ingestion.

Igosuki (Contributor) commented Feb 27, 2018 via email

pdeva (Contributor) commented Mar 13, 2018

any reason to keep this issue open now that 0.12 is released?

leventov removed the WIP label on Mar 13, 2018
quenlang commented Mar 14, 2018

@josephglanville @Igosuki
I did not find any description of this in the 0.12.0 docs. Any advice for me?

Fokko (Contributor) commented Mar 14, 2018

@quenlang What are you looking for? The description of the changes in 0.12.0 is here: https://github.com/druid-io/druid/releases/tag/druid-0.12.0

quenlang commented:

@Fokko Thanks. I had read these release notes, but I want to know how to use this feature.
Are there any detailed configurations or usage examples of the Kafka indexing service in the 0.12.0 docs?
