
add unusedMarkWaitDuration config to auto-kill feature #5315

Closed
himanshug wants to merge 5 commits

Conversation

@himanshug (Contributor) commented Jan 31, 2018

Fixes #5305

Also, see the update to Coordinator.md for documentation of the newly added config.

Release Notes Update:
For auto-kill users:
You must set an appropriate value for the newly added configuration druid.coordinator.kill.durationToRetainUnusedSegments. It would be possible to default it to 0, but I would rather force users to set it deliberately.

For all users:
Run the following SQL statement on your DB at any time before the upgrade. It can be run while you are still on an older version of Druid, and there is no need to stop the cluster while or after executing it; older Druid versions will continue to work fine even if the newly added column exists.
Change the table name in the following statement if necessary.
ALTER TABLE druid_segments ADD used_update_date VARCHAR(255);
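If you want to verify that the column was added, a check along these lines works on metadata stores that expose information_schema (MySQL and PostgreSQL do; Derby does not):

SELECT column_name
FROM information_schema.columns
WHERE table_name = 'druid_segments' AND column_name = 'used_update_date';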

StringUtils.format("UPDATE %s SET used=false WHERE dataSource = :dataSource", getSegmentsTable())
).bind("dataSource", ds).execute()
StringUtils.format("UPDATE %s SET used=false, used_update_date = :now WHERE dataSource = :dataSource", getSegmentsTable())
).bind("dataSource", ds).bind("now", DateTimes.nowUtc().toString()).execute()
Member:

Minor nit: can we also update used_update_date when someone enables the datasource?
This would not affect the kill task, but it would keep the value in the table consistent with the last time someone toggled the used column.
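For illustration, the enable path could then mirror the disable path shown in the diff above (a sketch only, assuming the same surrounding handle.createStatement(...) call, which is not shown in the diff; the actual enable code may be structured differently):

handle.createStatement(
    StringUtils.format("UPDATE %s SET used=true, used_update_date = :now WHERE dataSource = :dataSource", getSegmentsTable())
).bind("dataSource", ds).bind("now", DateTimes.nowUtc().toString()).execute()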

Contributor Author:

Made the necessary updates to keep the new column up to date on segment creation, enable, and disable.

@@ -38,14 +39,19 @@
@JsonIgnore
private final Interval interval;

@JsonIgnore
Member:

Please add a JSON serde test if not already present.

Contributor Author:

added

@@ -254,6 +254,7 @@ public void createSegmentTable(final String tableName)
+ " partitioned BOOLEAN NOT NULL,\n"
+ " version VARCHAR(255) NOT NULL,\n"
+ " used BOOLEAN NOT NULL,\n"
+ " used_update_date VARCHAR(255),\n"
Member:

It would be nice to add a hook in the metadata connector to alter existing tables when the column is not present. Maybe we can also default this column to the current time, so that users need not manually run ALTER TABLE statements.
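A rough sketch of the kind of hook being suggested, for illustration only; the method name and placement are assumptions rather than existing Druid code, and it uses plain JDBC metadata (java.sql.DatabaseMetaData, java.sql.ResultSet) through the JDBI handle:

// Hypothetical helper for the metadata connector; names and placement are assumptions, not Druid APIs.
private void addUsedUpdateDateColumnIfMissing(final String tableName)
{
  getDBI().withHandle(
      handle -> {
        // Look the column up via JDBC metadata; identifier case handling varies by database,
        // so a real implementation would need to normalize names.
        final DatabaseMetaData metaData = handle.getConnection().getMetaData();
        try (ResultSet resultSet = metaData.getColumns(null, null, tableName, "used_update_date")) {
          if (!resultSet.next()) {
            // Column is missing, so add it. Per the suggestion above, existing rows could then be
            // backfilled with the current time in a follow-up UPDATE; this sketch leaves them NULL.
            handle.execute(StringUtils.format("ALTER TABLE %s ADD used_update_date VARCHAR(255)", tableName));
          }
        }
        return null;
      }
  );
}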

Contributor Author:

will try and add.

Contributor Author:

I was thinking of adding a hook at https://github.com/druid-io/druid/blob/master/server/src/main/java/io/druid/metadata/SQLMetadataConnector.java#L530 . However, I can't find a standard SQL statement to check whether a particular column exists and create it only if it does not, which makes this non-trivial.
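There doesn't appear to be a single portable form; PostgreSQL 9.6+ and MariaDB accept ADD COLUMN IF NOT EXISTS, but MySQL and Derby do not, so those would need a metadata lookup first, along the lines of the connector hook sketched above. The PostgreSQL form, for example:

ALTER TABLE druid_segments ADD COLUMN IF NOT EXISTS used_update_date VARCHAR(255);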

Contributor:

I'm not sure that there is standard SQL for this. Also, ALTER TABLE commands to add columns can cause a lot of downtime on some databases, so it might be harsh to run anyway. So far we have never done this sort of automatic table migration. I think we haven't even required manual ones ever (except maybe in the distant past).

@@ -32,6 +32,7 @@ The coordinator node uses several of the global configs in [Configuration](../co
|`druid.coordinator.kill.on`|Boolean flag for whether or not the coordinator should submit kill task for unused segments, that is, hard delete them from metadata store and deep storage. If set to true, then for all whitelisted dataSources (or optionally all), coordinator will submit tasks periodically based on `period` specified. These kill tasks will delete all segments except for the last `durationToRetain` period. Whitelist or All can be set via dynamic configuration `killAllDataSources` and `killDataSourceWhitelist` described later.|false|
|`druid.coordinator.kill.period`|How often to send kill tasks to the indexing service. Value must be greater than `druid.coordinator.period.indexingPeriod`. Only applies if kill is turned on.|P1D (1 Day)|
|`druid.coordinator.kill.durationToRetain`| Do not kill segments in last `durationToRetain`, must be greater or equal to 0. Only applies and MUST be specified if kill is turned on. Note that default value is invalid.|PT-1S (-1 seconds)|
|`druid.coordinator.kill.unusedMarkWaitDuration`| Do not kill segments marked unused within `unusedMarkWaitDuration`, must be greater or equal to 0. Only applies and MUST be specified if kill is turned on. Note that default value is invalid.|PT-1S (-1 seconds)|
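For reference, an illustrative runtime.properties snippet tying these settings together (the retention values here are placeholders, and the last property is renamed to druid.coordinator.kill.durationToRetainUnusedSegments later in this review):

druid.coordinator.kill.on=true
druid.coordinator.kill.period=P1D
druid.coordinator.kill.durationToRetain=P90D
druid.coordinator.kill.unusedMarkWaitDuration=P7D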
Member:

wondering if druid.coordinator.kill.durationToRetainUnusedSegments would be any better?

Contributor Author:

changed

@himanshug (Contributor Author) commented Jan 31, 2018

@nishantmonu51 thanks for the review ... I'll reply/resolve comments shortly.

@himanshug himanshug added this to the 0.13.0 milestone Feb 1, 2018
@gianm (Contributor) commented Mar 13, 2018

This seems like a relatively minor feature to require a DB migration for.

And an automatic migration is potentially problematic for the reasons in #5315 (comment).

What about adding the column to new tables, like we do here, but for old tables, detecting whether the column is present and disabling this feature if it is not?
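A minimal sketch of that fallback, assuming a hypothetical hasUsedUpdateDateColumn() check rather than existing Druid code:

// Probe for the column once at startup and skip the new behavior when it is absent.
if (!hasUsedUpdateDateColumn()) {
  log.warn(
      "Column used_update_date not found in [%s]; ignoring druid.coordinator.kill.durationToRetainUnusedSegments",
      getSegmentsTable()
  );
  return;  // fall back to the existing durationToRetain-only behavior
}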

@jihoonson (Contributor) commented:

Hi guys, I don't think this should be a blocking issue for the 0.13.0 release. I'll remove the milestone.

@jihoonson jihoonson removed this from the 0.13.0 milestone Sep 17, 2018

stale bot commented Feb 28, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Feb 28, 2019

stale bot commented Mar 7, 2019

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@stale stale bot closed this Mar 7, 2019