Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migration path for double columns #4605

Closed
gianm opened this issue Jul 26, 2017 · 5 comments
Closed

Migration path for double columns #4605

gianm opened this issue Jul 26, 2017 · 5 comments
Milestone

Comments

@gianm
Copy link
Contributor

gianm commented Jul 26, 2017

#4491 added support for 64-bit double columns in such a way that you get one when you specify a "doubleSum" aggregator at ingestion time. This is a behavior change from 0.10, where "doubleSum" would get you a float column. It's possible to get the old behavior back by switching the ingestion-time aggregator to "floatSum", which generates a 32-bit float column.

But that will cause some problems when migrating existing clusters to 0.11. Consider rolling out the code to new middleManagers. The new "floatSum" aggregator cannot be used until all middleManagers are updated, since before then, not all of them will recognize it. So "doubleSum" must be used until all middleManagers are updated and stable.

During this rollout period, middleManagers will start creating double columns instead of float columns. There's no good way around that right now, which has a couple of down-sides.

  1. Sites that want to opt-out of the new behavior will not be able to, since there is some period of time where they are forced to generate 64-bit double columns. There are legitimate reasons for wanting to opt-out, including controlling segment size (the new 64-bit columns will be larger than the previously generated 32-bit ones), caution regarding new code paths (32-bit is more tried and tested), and desire to be able to roll back to 0.10 at some point (which will be difficult if segments with double columns exist).
  2. During the rollout period, delta indexing or reindexing can fail for no good reason, if the task is trying to read a segment with double columns and is scheduled on a middleManager that doesn't support double columns yet.

I suggest we address this by adding a runtime property that makes doubleSum revert to the old behavior of generating 32-bit columns. It has to be a runtime property (not part of the aggregator JSON) to make it possible to do the upgrade without modifying existing task definitions.

I think it should be 32-bit by default, since otherwise people will see the issues above during an upgrade from 0.10 to 0.11. But I also think that in 0.12 we should default to 64-bit since that makes more sense in general (it's weird that _double_Sum gets you a float column).

In the meantime, people can set the property by hand in 0.11 if they want the 64-bit columns. And we may even want to include that property in the template configs in the distribution, so new users pick it up "automatically".

See https://groups.google.com/d/msg/druid-development/xCGedPwoBh0/3kfkVUSwAgAJ for more discussion.

@gianm gianm added this to the 0.11.0 milestone Jul 26, 2017
@himanshug
Copy link
Contributor

defaulting to DOUBLE means various UI clients that depend on segment metadata query would need to be upgraded before druid upgrades.
e.g. our ui sends segmentMetadata queries and suddenly started getting DOUBLE columns that it couldn't understand and started failing.

so, for 0.11.0, i would also agree for defaulting doubleXXX aggregators to old behavior of storing FLOAT columns and provide a way to opt-in to get them to store DOUBLE column.

@jon-wei jon-wei modified the milestones: 0.11.0, 0.11.1 Sep 20, 2017
@jon-wei jon-wei modified the milestones: 0.11.1, 0.11.0 Oct 11, 2017
@gianm
Copy link
Contributor Author

gianm commented Oct 11, 2017

@b-slim do you have any thoughts on this? Unfortunately it fell by the wayside but I'm hoping we can figure it out quickly now…

IMO it's a 0.11.0 release blocker for the reasons outlined in the original comment. So I -1'ed the 0.11.0-rc1 (sorry) in https://groups.google.com/d/msg/druid-development/PB-A04ZP-3U/LeXDCGxdBAAJ. But hopefully the solution is relatively straightforward.

@b-slim
Copy link
Contributor

b-slim commented Oct 11, 2017

@gianm looking at it. totally forgot about this issue.

@gianm
Copy link
Contributor Author

gianm commented Oct 17, 2017

Fixed by #4944.

@b-slim
Copy link
Contributor

b-slim commented Mar 2, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants