
[Proposal] Consolidated segment metadata management #6849

Closed
jihoonson opened this issue Jan 13, 2019 · 8 comments · Fixed by #6911

Comments

@jihoonson (Contributor)

Motivation

Druid currently stores segment metadata in two places: the metadata store and deep storage. In the metadata store, segment metadata is kept in the segments table. In deep storage, it is kept in a descriptor.json file alongside each segment.

Druid core retrieves segment metadata only from the metadata store; only the insert-segment-to-db tool uses descriptor.json files to find segment files in deep storage.
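For reference, the duplicated metadata looks roughly like the following sketch. Field names follow the usual DataSegment JSON payload; all values here are made up for illustration only:

```python
# Illustrative sketch of the segment metadata that Druid keeps in both
# places: the "payload" column of the segments table and descriptor.json.
# Field names follow the typical DataSegment JSON; values are invented.
segment_metadata = {
    "dataSource": "wikipedia",
    "interval": "2019-01-01T00:00:00.000Z/2019-01-02T00:00:00.000Z",
    "version": "2019-01-13T00:00:00.000Z",
    "loadSpec": {
        "type": "s3_zip",
        "bucket": "my-bucket",
        "key": "druid/segments/wikipedia/index.zip",
    },
    "dimensions": "channel,page,user",
    "metrics": "count,added,deleted",
    "shardSpec": {"type": "none"},
    "binaryVersion": 9,
    "size": 12345678,
}
```

Keeping these two copies byte-for-byte identical is exactly the consistency burden described below.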

However, storing metadata in two different places has several drawbacks.

  1. Additional effort is required to keep the segment metadata in the metadata store and in deep storage consistent (see "S3DataSegmentPusher writes incomplete descriptor.json segment data to S3" #4170).
  2. Deep storage migration is complex because it needs to update metadata in both metadata store and deep storage.

The insert-segment-to-db tool seems to have been introduced for recovery when the metadata store is broken (#1861), but that kind of failure is better handled by other means, e.g., replicating the metadata store.

Public Interfaces

DataSegmentFinder will be removed.

Proposed Changes

Segment metadata will be stored only in the metadata store.

Compatibility, Deprecation, and Migration Plan

This is not a backward-compatible change. The descriptor.json file will no longer be stored in deep storage, and the insert-segment-to-db tool will be removed.

Deep storage migration would become simpler since segment metadata would need to be updated only in the metadata store.

Test Plan

N/A

Rejected Alternatives

N/A

@josephglanville (Contributor) commented Jan 13, 2019

I think this is a big simplicity win. Most people are already using replicated metadata databases with PITR (or managed options like Amazon RDS/Google Cloud SQL).

@gianm (Contributor) commented Jan 15, 2019

This is probably a good idea. The descriptors aren't used for anything besides insert-segment-to-db and the DataSegmentFinders. But those are flawed anyway: they assume that any segment in deep storage is valid. That's not the case, since segments can get pushed to deep storage but not published for a variety of reasons (mostly related to tasks running partially and then dying). And that's not the only way the descriptor.json in deep storage and the payload in the metadata store can get out of sync: it also happens if you do segment moves (the MoveTask) or if you manually edit the metadata store for some reason (like for a deep storage migration).

A question: how much of the descriptor.json could be re-created from the segment path & index.zip, if needed? I'm wondering about a disaster recovery scenario: let's say you did lose your metadata store and you wanted to try to recover whatever metadata you could from deep storage. How much could you get back?

@jihoonson (Contributor, Author)

It's quite limited. Only dataSource, interval, version, loadSpec, binaryVersion, and size are fully restorable. It looks hard to determine which columns are dimensions and which are metrics from meta.smoosh, and shardSpec cannot be restored since we don't know its type. This gets even worse after #6319: overshadowedSegments and atomicUpdateGroup cannot be restored from the segment path and index.zip, which means the overshadowing relation cannot be restored.
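For illustration, here is a rough sketch of how much could be parsed back out of a deep-storage path alone, assuming the common S3-style layout `<base>/<dataSource>/<start>_<end>/<version>/<partitionNum>/index.zip` (the exact layout varies by deep storage type, so treat this as a sketch, not a recovery tool):

```python
# Hypothetical sketch: recover the fully restorable metadata fields from
# a deep-storage segment path, assuming the layout
#   <base>/<dataSource>/<intervalStart>_<intervalEnd>/<version>/<partitionNum>/index.zip
# Fields such as dimensions, metrics, and the shardSpec type are NOT
# recoverable this way, which is the limitation described above.

def metadata_from_path(path):
    parts = path.rstrip("/").split("/")
    partition_num = int(parts[-2])
    version = parts[-3]
    start, end = parts[-4].split("_")      # ISO timestamps contain no "_"
    data_source = parts[-5]
    return {
        "dataSource": data_source,
        "interval": f"{start}/{end}",
        "version": version,
        "partitionNum": partition_num,
        # loadSpec just points back at the path itself
        "loadSpec": {"type": "s3_zip", "path": path},
    }
```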

As I noted in the proposal, I think replicating the metadata store is a better solution for disaster recovery. I would imagine that deep storage is likely also broken if the metadata store is broken by some sort of disaster, since they are often in the same data center.

@nishantmonu51 (Member)

I have seen people in the past use the insert-segment-to-db tool to recover segments in the metadata storage DB.
Having said that, I think it's OK to remove it, and the descriptor file from deep storage as well, but at the same time we need to update the docs to emphasize the increased importance of the metadata storage and to add recommendations for setting up backup and recovery policies for it.
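On the backup-policy point, a policy can be as simple as a scheduled dump of the metadata database. A hypothetical sketch, assuming a PostgreSQL metadata store with a database named `druid` (both assumptions; MySQL users would use mysqldump instead):

```python
# Hypothetical sketch: build a pg_dump invocation for backing up a
# PostgreSQL Druid metadata store. The database name "druid" and the
# output path are assumptions for illustration.

def backup_command(db="druid", out="/backups/druid-metadata.dump"):
    # Custom-format dump; restore later with pg_restore.
    return ["pg_dump", "--format=custom", f"--file={out}", db]

# A cron job or scheduler would run this, e.g. via subprocess.run(...).
print(" ".join(backup_command()))
```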

@jihoonson (Contributor, Author)

@nishantmonu51 thank you for the comment! That makes sense; I will update the docs.

@jihoonson jihoonson added this to the 0.15.0 milestone May 22, 2019
@capistrant (Contributor)

@jihoonson This was a cool change. I have a question regarding the existing descriptor.json files after the upgrade: do they need to be manually removed? We use HDFS as a deep store, so it would be nice to reduce the file count after upgrade validation. Thanks!

@gianm (Contributor) commented Jul 10, 2019

They don't need to be manually removed, but you can if you want to.

@jihoonson (Contributor, Author) commented Jul 10, 2019

@capistrant as @gianm said, you don't have to; KillTask will remove them automatically. But note that KillTask will fail in 0.14.0 or earlier if the descriptor.json file is missing. This means that once you remove those files, it might be hard to roll back to an earlier version.

This issue exists only for HDFS deep storage; rolling back should be fine with other types of deep storage.
