
[Proposal] Consolidated segment metadata management #6849

Closed
jihoonson opened this issue Jan 13, 2019 · 8 comments · Fixed by #6911

Comments

@jihoonson (Contributor)

Motivation

Druid currently stores segment metadata in two places: the metadata store and deep storage. In the metadata store, segment metadata is kept in the segments table. In deep storage, it is kept in a descriptor.json file alongside each segment.

Druid core retrieves segment metadata only from the metadata store; only the insert-segment-to-db tool uses descriptor.json files to find segment files in deep storage.
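For reference, the duplicated metadata looks roughly like the following sketch. Field names follow the usual DataSegment JSON payload; all values here are made up for illustration only:

```python
# Illustrative sketch of the segment metadata that Druid keeps in both
# places: the "payload" column of the segments table and descriptor.json.
# Field names follow the typical DataSegment JSON; values are invented.
segment_metadata = {
    "dataSource": "wikipedia",
    "interval": "2019-01-01T00:00:00.000Z/2019-01-02T00:00:00.000Z",
    "version": "2019-01-13T00:00:00.000Z",
    "loadSpec": {
        "type": "s3_zip",
        "bucket": "my-bucket",
        "key": "druid/segments/wikipedia/index.zip",
    },
    "dimensions": "channel,page,user",
    "metrics": "count,added,deleted",
    "shardSpec": {"type": "none"},
    "binaryVersion": 9,
    "size": 12345678,
}
```

Keeping these two copies byte-for-byte identical is exactly the consistency burden described below.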

However, storing metadata in two different places has several drawbacks.

  1. Additional effort is required to keep the segment metadata in the metadata store and in deep storage consistent (see "S3DataSegmentPusher writes incomplete descriptor.json segment data to S3" #4170).
  2. Deep storage migration is complex because it needs to update metadata in both metadata store and deep storage.

The insert-segment-to-db tool seems to have been introduced for recovery when the metadata store is broken (#1861), but that kind of failure is better handled by other means, e.g., replicating the metadata store.

Public Interfaces

DataSegmentFinder will be removed.

Proposed Changes

Segment metadata will be stored only in the metadata store.

Compatibility, Deprecation, and Migration Plan

This is not a backward-compatible change. The descriptor.json file will no longer be stored in deep storage, and the insert-segment-to-db tool will be removed.

Deep storage migration would become simpler since segment metadata would need to be updated only in the metadata store.

Test Plan

N/A

Rejected Alternatives

N/A

@josephglanville (Contributor) commented Jan 13, 2019

I think this is a big simplicity win. Most people are already using replicated metadata databases with PITR (or managed options like Amazon RDS/Google Cloud SQL).

@gianm (Contributor) commented Jan 15, 2019

This is probably a good idea. The descriptors aren't used for anything besides insert-segment-to-db and the DataSegmentFinders. But those are flawed anyway: they assume that any segment in deep storage is valid. That's not the case, since segments can get pushed to deep storage but not published for a variety of reasons (mostly related to tasks running partially and then dying). And that's not the only way the descriptor.json in deep storage and the payload in the metadata store can get out of sync: it also happens if you do segment moves (the MoveTask) or if you manually edit the metadata store for some reason (like for a deep storage migration).

A question: how much of the descriptor.json could be re-created from the segment path & index.zip, if needed? I'm wondering about a disaster recovery scenario: let's say you did lose your metadata store and you wanted to try to recover whatever metadata you could from deep storage. How much could you get back?

@jihoonson (Contributor, Author)

It's quite limited. Only dataSource, interval, version, loadSpec, binaryVersion, and size are fully restorable. It looks hard to determine which columns are dimensions and which are metrics from meta.smoosh, and shardSpec cannot be restored since we don't know its type. This gets even worse after #6319: overshadowedSegments and atomicUpdateGroup cannot be restored from the segment path and index.zip, which means the overshadowing relation cannot be restored.
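For illustration, here is a rough sketch of how much could be parsed back out of a deep-storage path alone, assuming the common S3-style layout `<base>/<dataSource>/<start>_<end>/<version>/<partitionNum>/index.zip` (the exact layout varies by deep storage type, so treat this as a sketch, not a recovery tool):

```python
# Hypothetical sketch: recover the fully restorable metadata fields from
# a deep-storage segment path, assuming the layout
#   <base>/<dataSource>/<intervalStart>_<intervalEnd>/<version>/<partitionNum>/index.zip
# Fields such as dimensions, metrics, and the shardSpec type are NOT
# recoverable this way, which is the limitation described above.

def metadata_from_path(path):
    parts = path.rstrip("/").split("/")
    partition_num = int(parts[-2])
    version = parts[-3]
    start, end = parts[-4].split("_")      # ISO timestamps contain no "_"
    data_source = parts[-5]
    return {
        "dataSource": data_source,
        "interval": f"{start}/{end}",
        "version": version,
        "partitionNum": partition_num,
        # loadSpec just points back at the path itself
        "loadSpec": {"type": "s3_zip", "path": path},
    }
```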

As I noted in the proposal, I think replicating the metadata store is a better solution for disaster recovery. I would imagine that deep storage is likely also broken if the metadata store is broken by some sort of disaster, since they are often in the same data center.

@nishantmonu51 (Member)

I have seen people in the past use the insert-segment-to-db tool to recover segments in the metadata storage DB.
Having said that, I think it's OK to remove it, and the descriptor file from deep storage as well, but at the same time we need to update the docs to emphasize the increased importance of the metadata storage and to add recommendations for setting up backup and recovery policies for it.
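On the backup-policy point, a policy can be as simple as a scheduled dump of the metadata database. A hypothetical sketch, assuming a PostgreSQL metadata store with a database named `druid` (both assumptions; MySQL users would use mysqldump instead):

```python
# Hypothetical sketch: build a pg_dump invocation for backing up a
# PostgreSQL Druid metadata store. The database name "druid" and the
# output path are assumptions for illustration.

def backup_command(db="druid", out="/backups/druid-metadata.dump"):
    # Custom-format dump; restore later with pg_restore.
    return ["pg_dump", "--format=custom", f"--file={out}", db]

# A cron job or scheduler would run this, e.g. via subprocess.run(...).
print(" ".join(backup_command()))
```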

@jihoonson (Contributor, Author)

@nishantmonu51 thank you for the comment! That makes sense; I will update the docs.

@jihoonson jihoonson added this to the 0.15.0 milestone May 22, 2019
@capistrant (Contributor)

@jihoonson This was a cool change. I have a question regarding the existing descriptor.json files after the upgrade: do they need to be manually removed? We use HDFS as a deep store, so it would be nice to reduce the file count after upgrade validation. Thanks!

@gianm (Contributor) commented Jul 10, 2019

They don't need to be manually removed, but you can if you want to.

@jihoonson (Contributor, Author) commented Jul 10, 2019

@capistrant as @gianm said, you don't have to; KillTask will remove them automatically. But note that KillTask will fail in 0.14.0 or earlier if the descriptor.json file is missing. This means that once you remove those files, it might be hard to roll back to an earlier version.

This issue exists only for HDFS deep storage; rolling back should be fine with other types of deep storage.
