druid-0.11.0

@gianm released this 05 Dec 03:56

Druid 0.11.0 contains over a hundred performance improvements, stability improvements, and bug fixes from almost 40 contributors. This release adds two major security features, TLS support and extension points for authentication and authorization.

Major new features include:

  • TLS (a.k.a. SSL) support
  • Extension points for authentication and authorization
  • Double columns support
  • cachingCost Balancer Strategy
  • jq expression support in JSON parser
  • Redis cache extension
  • GroupBy performance improvements
  • Various improvements to Druid SQL

The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.11.0

Documentation for this release is at: http://druid.io/docs/0.11.0/

Highlights

TLS support

Druid now supports TLS, enabling encrypted client and inter-node communications. Please see http://druid.io/docs/0.11.0/operations/tls-support.html for details on configuration and related extensions.

Added by @pjain1 in #4270.

Authentication/authorization extension points

Extension points for authenticating and authorizing requests have been added to Druid. Please see http://druid.io/docs/0.11.0/configuration/auth.html for information on configuration and extension implementation.

The existing Kerberos authentication extension has been updated to implement the new Authenticator interface. If you are using the Kerberos extension, please see the "Kerberos configuration changes" section under "Updating from 0.10.1 and earlier" for more information.

Added by @jon-wei in #4271.

Double columns support

Druid now supports Double type aggregator columns. Please see http://druid.io/docs/0.11.0/querying/aggregations.html for documentation on the new Double aggregators.

Added by @b-slim in #4491.

cachingCost Balancer Strategy

Users upgrading to 0.11.0 are encouraged to try the new cachingCost segment balancing strategy on their coordinators. This strategy offers large performance improvements over the existing cost balancer strategy, and it is planned to become the default strategy in the release following 0.11.0.

This strategy can be selected by setting the following property on coordinators:

druid.coordinator.balancer.strategy=cachingCost

Added by @dgolitsyn in #4731.

jq expression support in JSON parser

Druid's JSON input parser now supports jq expressions using jackson-jq, enabling more input transforms before ingestion. Please see http://druid.io/docs/0.11.0/ingestion/flatten-json.html for more details.

Added by @knoguchi in #4171.

Redis cache extension

A new cache implementation using Redis has been added as an extension by @QiuMM in #4615. Please refer to that pull request for more details.

GroupBy performance improvements

Several new performance optimizations have been added to the GroupBy query by @jihoonson in the following PRs:

#4660 Parallel sort for ConcurrentGrouper
#4576 Array-based aggregation for groupBy query
#4668 Add IntGrouper to avoid unnecessary boxing/unboxing in array-based aggregation

PR #4660 offers a general improvement by parallelizing partial result sorting, while PRs #4576 and #4668 offer significant improvements when grouping on a single String column.

SQL improvements

Various improvements and features have been added to Druid SQL, by @gianm in the following PRs:

#4750 - TRIM support
#4720 - Rounding for count distinct
#4561 - Metrics for SQL queries
#4360 - SQL expressions support

And much more!

The full list of changes is here: https://github.com/druid-io/druid/pulls?utf8=%E2%9C%93&q=is%3Apr%20is%3Aclosed%20milestone%3A0.11.0

Updating from 0.10.1 and earlier

Please see below for changes between 0.10.1 and 0.11.0 that you should be aware of before upgrading. If you're updating from a version earlier than 0.10.1, please see the release notes of the relevant intermediate versions for additional notes.

Upgrading coordinators and overlords

The following patch changes the way coordinator->overlord redirects are handled:
#5037

The overlord leader election algorithm has changed in 0.11.0: #4699.

As a result of the two patches above, special care is needed when upgrading Coordinator or Overlord to 0.11.0. All coordinators and overlords must be shut down and upgraded together.

For example, to upgrade Coordinators, you would shut down all coordinators, upgrade them to 0.11.0, and then start them. Overlords should be upgraded in a similar way.

During the upgrade process, there must not be any time period where a non-0.11.0 coordinator or overlord is running simultaneously with a 0.11.0 coordinator or overlord.

Note that at least one overlord should be brought up as quickly as possible after shutting them all down so that peons, Tranquility, etc. continue to work after some retries.

Also note that the druid.zk.paths.indexer.leaderLatchPath property is no longer used.
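
The "no mixed versions" rule above can be expressed as a simple pre-flight check. The following is only a sketch: the node records (dicts with hypothetical "name", "version", and "running" keys) are an assumed inventory format that you would populate from your own deployment tooling, not something Druid provides.

```python
def mixed_versions_running(nodes, target="0.11.0"):
    """Return True if a target-version node is running alongside an older one.

    `nodes` is a hypothetical inventory of coordinators (or overlords):
    a list of dicts with "name", "version", and "running" keys.
    """
    running_versions = {n["version"] for n in nodes if n["running"]}
    # Violation: the target version and at least one other version are both up.
    return target in running_versions and len(running_versions) > 1


# Example: every old coordinator is stopped before the first 0.11.0 one starts.
coordinators = [
    {"name": "coord-1", "version": "0.10.1", "running": False},
    {"name": "coord-2", "version": "0.11.0", "running": True},
]
safe = not mixed_versions_running(coordinators)
```

Run the same check separately for coordinators and for overlords before starting any upgraded process.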

Service name changes

In earlier versions of Druid, / characters in service names defined by druid.service would be replaced by : characters because these service names were used in Zookeeper paths. Druid 0.11.0 no longer performs these character replacements.

Example 1: If the old configuration had a broker with service name test/broker:
druid.service=test/broker

and a Router was configured assuming that / will be replaced with : in the broker service name,
druid.router.tierToBrokerMap={"hot":"test:broker","_default_tier":"test:broker"}

the Router configuration should be updated to remove that assumption:
druid.router.tierToBrokerMap={"hot":"test/broker","_default_tier":"test/broker"}

Example 2: If the old configuration had an overlord with service name test/overlord, then the value of druid.coordinator.asOverlord.overlordService or druid.selectors.indexing.serviceName should be test/overlord and not test:overlord.

Example 3: If the old configuration had an overlord with service name test:overlord, then the value of druid.coordinator.asOverlord.overlordService or druid.selectors.indexing.serviceName should be test:overlord and not test/overlord.

The following service name-related configurations are also affected and should be updated to exactly match the value of the druid.service property on the node being discovered:

druid.coordinator.asOverlord.overlordService
druid.selectors.coordinator.serviceName
druid.selectors.indexing.serviceName
druid.router.defaultBrokerServiceName
druid.router.coordinatorServiceName
druid.router.tierToBrokerMap

Please see #4992 for more details.
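
One way to audit a configuration for stale names is to compare the affected properties against the exact druid.service values used in the cluster. The sketch below assumes the properties have already been loaded into a plain dict and that you supply the set of real service names yourself; the function name and data shapes are hypothetical helpers, not part of Druid.

```python
import json

# Properties whose values must exactly match druid.service on the
# node being discovered (taken from the list above).
SERVICE_NAME_PROPERTIES = (
    "druid.coordinator.asOverlord.overlordService",
    "druid.selectors.coordinator.serviceName",
    "druid.selectors.indexing.serviceName",
    "druid.router.defaultBrokerServiceName",
    "druid.router.coordinatorServiceName",
    "druid.router.tierToBrokerMap",
)


def find_stale_service_names(properties, known_services):
    """Return (property, value) pairs that match no known druid.service value."""
    stale = []
    for key in SERVICE_NAME_PROPERTIES:
        if key not in properties:
            continue
        raw = properties[key]
        if key == "druid.router.tierToBrokerMap":
            # This property holds a JSON map of tier -> broker service name.
            values = list(json.loads(raw).values())
        else:
            values = [raw]
        for value in values:
            if value not in known_services:
                stale.append((key, value))
    return stale


# Example 1 from above: the Router still assumes "/" was replaced with ":".
router_properties = {
    "druid.router.tierToBrokerMap":
        '{"hot":"test:broker","_default_tier":"test:broker"}',
}
stale = find_stale_service_names(router_properties, {"test/broker"})
```

Here `stale` lists both map entries that still say test:broker, which is the signal to update the Router configuration to test/broker.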

Kerberos configuration changes

The Kerberos authentication configuration format has changed as a result of the new interfaces introduced by #4271. Please refer to http://druid.io/docs/0.11.0/development/extensions-core/druid-kerberos.html for the new configuration properties.

Users can point the Kerberos authenticator's authorizerName to an instance of an "allowAll" authorizer to replicate the pre-0.11.0 behavior of a cluster using Kerberos authentication with no authorization.

Lookups API path changes

The paths for the lookups configuration API have changed due to #5058.

Configuration paths that had the form /druid/coordinator/v1/lookups now have the form /druid/coordinator/v1/lookups/config.

Please see http://druid.io/docs/0.11.0/querying/lookups.html for the current API.
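
Scripts that call the old configuration paths can be updated with a simple prefix rewrite. The following is a minimal sketch: the helper name is hypothetical, and it assumes that any sub-path following the old prefix (for example a tier and lookup name) carries over unchanged after /config. Check the linked documentation before relying on that assumption, and apply the helper only to paths known to be old-style configuration paths.

```python
OLD_PREFIX = "/druid/coordinator/v1/lookups"
NEW_PREFIX = "/druid/coordinator/v1/lookups/config"


def migrate_lookup_config_path(old_path):
    """Rewrite a pre-0.11.0 lookups configuration path to the 0.11.0 form."""
    if old_path != OLD_PREFIX and not old_path.startswith(OLD_PREFIX + "/"):
        raise ValueError("not a lookups configuration path: " + old_path)
    # Keep whatever followed the old prefix (assumed to be unchanged).
    return NEW_PREFIX + old_path[len(OLD_PREFIX):]


new_root = migrate_lookup_config_path("/druid/coordinator/v1/lookups")
```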

Migrating to Double columns

Prior to 0.11.0, the Double* aggregators would store column values on disk as Floats while performing aggregations using Double representations.

PR #4491 allows the Double aggregators to store column values on disk as Doubles. Due to concerns related to rolling updates and version downgrades, this behavior is disabled by default and Druid will continue to store Double aggregator columns on disk as Floats.

To enable Double column storage, set the following property in the common runtime properties:

druid.indexing.doubleStorage=double

Users should not set this property during an initial rolling upgrade to 0.11.0, as any nodes running pre-0.11.0 Druid will not be able to handle Double columns created during the upgrade period. Users will also need to reindex any segments with Double columns if downgrading from 0.11.0 to an older version. Please see #4944 and #4605 for more information.
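
To see why Double storage can matter, the sketch below narrows a 64-bit value to a 32-bit float and back. It illustrates only generic IEEE 754 narrowing using Python's struct module; it does not reproduce Druid's on-disk segment format, and the helper name is hypothetical.

```python
import struct


def round_trip_as_float32(value):
    """Narrow a 64-bit Python float to 32 bits and widen it again."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


exact = round_trip_as_float32(0.5)           # representable in 32 bits
lossy = round_trip_as_float32(0.1)           # picks up a small error
big = round_trip_as_float32(16777217.0)      # 2**24 + 1 does not fit in a float
```

Values such as 0.5 survive unchanged, while 0.1 and integers above 2**24 come back slightly different, which is the kind of loss Double storage avoids.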

Scan query changes

The Scan query has been moved from extensions-contrib to core Druid. As part of this migration (#4751), the scan query's handling of the time column has changed.

The time column is now returned as "__time" rather than "timestamp", it is no longer included unless you specifically ask for it in your "columns", and it is returned as a long rather than a string.

Users can revert the Scan query's time handling to the legacy extension behavior by setting "legacy" : true in their queries, or setting the property druid.query.scan.legacy = true. This is meant to provide a migration path for users that were formerly using the contrib extension.
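
For clients that build Scan queries programmatically, the sketch below shows both options: requesting "__time" explicitly in the new format, or setting "legacy" to true. The helper names, the example data source, and the example column are hypothetical. The row converter assumes the long value is milliseconds since the Unix epoch and renders an ISO 8601 UTC string, which may not match the legacy extension's output byte for byte.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def build_scan_query(data_source, intervals, columns, legacy=False):
    """Build a Scan query body; pass "__time" in columns to get the time column."""
    query = {
        "queryType": "scan",
        "dataSource": data_source,
        "intervals": list(intervals),
        "columns": list(columns),
    }
    if legacy:
        # Revert to the contrib extension's time handling for this query.
        query["legacy"] = True
    return query


def to_legacy_style_row(row):
    """Rename "__time" (long, assumed epoch millis) to a "timestamp" string."""
    converted = dict(row)
    if "__time" in converted:
        moment = EPOCH + timedelta(milliseconds=converted.pop("__time"))
        text = moment.isoformat(timespec="milliseconds")
        converted["timestamp"] = text.replace("+00:00", "Z")
    return converted


query = build_scan_query("wikipedia", ["2017-01-01/2017-01-02"], ["__time", "page"])
row = to_legacy_style_row({"__time": 0, "page": "Druid"})
```

Converting on the client side like this lets downstream code keep its old field name while queries move to the new format.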

Extension Interface Changes

Aggregator double column support

The Aggregator interface has gained a getDouble() method, which defaults to casting the result of getFloat(). The getDouble() method should be re-implemented for any custom aggregators that can support doubles.

See #4595 for more details.

QueryRunner interface change

The QueryRunner interface has changed and the old run() method has been removed, replaced by a new method that accepts a QueryPlus object.

Custom query extensions will need to implement the new interface.

Please see #4184 and #4482 for more details.

Filter interface change

The Filter.getBitmapResult() method no longer has a default implementation: #4481

Custom filter extensions will need to provide an implementation for getBitmapResult() now.

Other Notes

jvm/gc/time metric

The jvm/gc/time metric is no longer emitted, replaced by a new metric named jvm/gc/cpu for the reasons described here: #4480

Default worker select strategy

Please note that the default worker select strategy has changed from fillCapacity to equalDistribution. This change was introduced in 0.10.1, the previous release, but was not mentioned in the 0.10.1 release notes, so it is called out again here.

V8 segment creation removed

Druid will now always build V9 segments. Creating V8 segments is no longer supported, and the buildV9Directly property for ingestion tasks has been removed.

Please see #4420 for more details.

LogLevelAdjuster removed

Please note that the LogLevelAdjuster has been removed: #4236

Any user using MBeans to configure log levels should configure log4j2 using JMX instead.

Credits

Thanks to everyone who contributed to this release!

@a2l007
@akashdw
@Andy256
@asifmansoora
@b-slim
@benvogan
@blugowski
@chrisgavin
@dclim
@dgolitsyn
@drcrallen
@egor-ryashin
@erikdubbelboer
@Fokko
@fuji-151a
@gaodayue
@gianm
@ginoledesma
@himanshug
@hzy001
@jihoonson
@jon-wei
@kevinconaway
@knoguchi
@leiwangx
@leventov
@michalmisiewicz
@niketh
@pjain1
@praveev
@QiuMM
@scan-the-automator
@solimant
@SpotXPeterCunningham
@tkyaw
@wywlds
@xanec
@yuusaku-t
@zhangxinyu1