Skip to content

[refactor](schema_hash) remove schema_hash since every tablet id in be is unique#8574

Merged
yiguolei merged 7 commits into
apache:masterfrom
caiconghui:schema_hash
Apr 7, 2022
Merged

[refactor](schema_hash) remove schema_hash since every tablet id in be is unique#8574
yiguolei merged 7 commits into
apache:masterfrom
caiconghui:schema_hash

Conversation

@caiconghui
Copy link
Copy Markdown
Contributor

@caiconghui caiconghui commented Mar 22, 2022

Proposed changes

Issue Number: close #xxx

Problem Summary:

1.Now, schema_hash is part of the data_dir path, but tablet id is unique, so we can read tablet info only by tablet_id,
so this PR only modify get_tablet and drop_tablet function in be, not refactor write schema_hash logic in be.
module in be who use get_tablet and drop_tablet function will be affected.
2. For users, this PR change two httpAction in be, one is metaAction, and the other is CompactionAction. so I also modify the doc and fe code.
3.The main Purpose of this PR is to simplify managing tablet, whether to remove all schema_hash still need more discussion, because we need to consider compatibility and correctness after refactor.

Checklist(Required)

  1. Does it affect the original behavior: (Yes)
  2. Has unit tests been added: (No)
  3. Has document been added or modified: (Yes)
  4. Does it need to update dependencies: (No)
  5. Are there any changes that cannot be rolled back: (No)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@yiguolei
Copy link
Copy Markdown
Contributor

Great.
LGTM

@yangzhg
Copy link
Copy Markdown
Member

yangzhg commented Mar 22, 2022

  1. Does the command line of meta_tool also need to remove schema_hash?
  2. Do all the http_action of be has schema_hash should remove schema_hash? If so, you should change the url of fe in show tablet
  3. if the above is yes, document is also need to change

@caiconghui
Copy link
Copy Markdown
Contributor Author

  1. Does the command line of meta_tool also need to remove schema_hash?
  2. Do all the http_action of be has schema_hash should remove schema_hash? If so, you should change the url of fe in show tablet
  3. if the above is yes, document is also need to change

@yangzhg now, for upgrade compatibility , it is not the time to modify meta_tool, and url of show tablet command,
it just optimize the logic in be to find tablet and drop tablet now

@yangzhg
Copy link
Copy Markdown
Member

yangzhg commented Mar 22, 2022

  1. Does the command line of meta_tool also need to remove schema_hash?
  2. Do all the http_action of be has schema_hash should remove schema_hash? If so, you should change the url of fe in show tablet
  3. if the above is yes, document is also need to change

@yangzhg now, for upgrade compatibility , it is not the time to modify meta_tool, and url of show tablet command, it just optimize the logic in be to find tablet and drop tablet now

I think it's understandable to bring some compatibility problems when a big version is released, and there are not many users of meta_tool and url, so it's better to modify them together

@yangzhg yangzhg added the kind/refactor Issues or PRs to refactor code label Mar 22, 2022
@caiconghui
Copy link
Copy Markdown
Contributor Author

  1. Does the command line of meta_tool also need to remove schema_hash?
  2. Do all the http_action of be has schema_hash should remove schema_hash? If so, you should change the url of fe in show tablet
  3. if the above is yes, document is also need to change

@yangzhg now, for upgrade compatibility , it is not the time to modify meta_tool, and url of show tablet command, it just optimize the logic in be to find tablet and drop tablet now

I think it's understandable to bring some compatibility problems when a big version is released, and there are not many users of meta_tool and url, so it's better to modify them together

I find that I cannot refactor all schema hash code because the schema_hash dir now already in data path,
so in order to be less error prone, I just simplify the get tablet and drop tablet logic in be, and not change interface outside
@yiguolei @yangzhg

@morningman
Copy link
Copy Markdown
Contributor

Schema hash is currently widely used in various modules of BE and is strongly related to the data itself.
Strict review is required for schema hash modifications.

It is recommended to add the following content to the PR:

  1. Which modules currently use the schema hash, and whether it will affect it after removing it.
  2. Which modules are modified by this PR
  3. Which modules have not been modified
  4. What is the next step.

caiconghui1 added 2 commits March 28, 2022 20:58
@yiguolei
Copy link
Copy Markdown
Contributor

yiguolei commented Apr 6, 2022

Schema hash is currently widely used in various modules of BE and is strongly related to the data itself. Strict review is required for schema hash modifications.

It is recommended to add the following content to the PR:

  1. Which modules currently use the schema hash, and whether it will affect it after removing it.
  2. Which modules are modified by this PR
  3. Which modules have not been modified
  4. What is the next step.

This PR mainly modify following modules:

  1. It removes schema hash from tablet manager's get_tablet method, and modified the related usage.
  2. It removes schema hash from http service's api. [this may cause some usage problems but I think it is ok]

Copy link
Copy Markdown
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
I've pull the pr to local and build successfully.

@yiguolei yiguolei added the approved Indicates a PR has been approved by one committer. label Apr 6, 2022
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 6, 2022

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 98cab78 into apache:master Apr 7, 2022
@morningman morningman added the dev/backlog waiting to be merged in future dev branch label Apr 7, 2022
@@ -1552,10 +1549,9 @@ void TaskWorkerPool::_move_dir_thread_callback() {
Status TaskWorkerPool::_move_dir(const TTabletId tablet_id, const TSchemaHash schema_hash,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove schema_hash in parameter

@morningman morningman added dev/1.0.1-deprecated should be merged into dev-1.0.1 branch dev/backlog waiting to be merged in future dev branch and removed dev/backlog waiting to be merged in future dev branch dev/1.0.1-deprecated should be merged into dev-1.0.1 branch labels Apr 7, 2022
morningman pushed a commit that referenced this pull request Apr 15, 2022
weizhengte pushed a commit to weizhengte/incubator-doris that referenced this pull request Apr 22, 2022
weizhengte pushed a commit to weizhengte/incubator-doris that referenced this pull request Apr 22, 2022
zhengshiJ pushed a commit to zhengshiJ/incubator-doris that referenced this pull request Apr 27, 2022
starocean999 pushed a commit to starocean999/incubator-doris that referenced this pull request May 19, 2022
englefly pushed a commit to englefly/incubator-doris that referenced this pull request May 23, 2022
@caiconghui caiconghui deleted the schema_hash branch August 3, 2022 04:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/backlog waiting to be merged in future dev branch kind/refactor Issues or PRs to refactor code reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants