Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: index refactor for table partition #7062

Merged
merged 6 commits into from Jul 17, 2018

Conversation

tiancaiamao
Copy link
Contributor

What have you changed? (mandatory)

  • The prefix key in index is changed to partition prefix, so AddRecord for index would work fine
  • executor build will replace table ID with partition ID
  • Move some code from PartitionedTable.AddRecord to newPartitionedTable
  • A tiny adjust for 'rule_partition_processor.go'

What is the type of the changes? (mandatory)

  • Improvement (non-breaking change which is an improvement to an existing feature)

How has this PR been tested? (mandatory)

unit test
The test case is still very naive, more testes will be add when I finish admin check table in the following PRs

PTAL @zimulala @winkyao @shenli

@coocood
Copy link
Member

coocood commented Jul 16, 2018

/run-all-tests

@coocood
Copy link
Member

coocood commented Jul 16, 2018

LGTM

@coocood coocood added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 16, 2018
@tiancaiamao
Copy link
Contributor Author

/run-integration-ddl-test tidb-test=pr/587

Copy link
Contributor

@crazycs520 crazycs520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@zhexuany zhexuany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

idxInfo *model.IndexInfo
prefix kv.Key
}

// NewIndex builds a new Index object.
func NewIndex(tableInfo *model.TableInfo, indexInfo *model.IndexInfo) table.Index {
func NewIndex(partitionID int64, indexInfo *model.IndexInfo) table.Index {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a problem with the naming of partitionID. Is it always being paritonID? Is it possible being table ID?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Please comment it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be partition ID or table ID, depends on whether the table is a partitioned table or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is the case, then partitionID is not a good choice. It would be nice if you could change partitionID to ID and add some comments like "ID could be partitionID or table ID".

zhexuany
zhexuany previously approved these changes Jul 17, 2018
Copy link
Contributor

@zhexuany zhexuany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reset LGTM

@zhexuany zhexuany dismissed their stale review July 17, 2018 02:34

reset LGTM

@@ -334,6 +338,11 @@ type PhysicalUnionScan struct {
Conditions []expression.Expression
}

// IsPartition returns true and partition ID if it works on a partition.
func (p *PhysicalIndexScan) IsPartition() (bool, int64) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about s/IsPartition/GetPartitionID/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think IsPartition maybe better.

succ, _ :=  IsPartition()
if succ {
    doSomething()
}

It's also flexible when we need the partition id:

if succ, pid := IsPartition(); succ {
    doSomthingWith(pid)
}

GetPartitionID looks unnatural if we define:

GetPartitionID(pid int64, bool) // wow, it maybe not a partition

If we use partition id == 0 to represent it's not a partition:

GetPartitionID(pid int64)

This code introduce an implicit assumption with the caller.

@@ -105,6 +105,12 @@ func (s *partitionProcessor) prune(ds *DataSource) (LogicalPlan, error) {
newDataSource.baseLogicalPlan = newBaseLogicalPlan(ds.context(), TypeTableScan, &newDataSource)
newDataSource.isPartition = true
newDataSource.partitionID = pi.Definitions[i].ID
// TODO: Find a better way to handle here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove this TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we haven't find a way to optimize the old logic, we don't need a TODO.

idxInfo *model.IndexInfo
prefix kv.Key
}

// NewIndex builds a new Index object.
func NewIndex(tableInfo *model.TableInfo, indexInfo *model.IndexInfo) table.Index {
func NewIndex(partitionID int64, indexInfo *model.IndexInfo) table.Index {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the first parameter always the partition id? Can it be the table id?

idxInfo *model.IndexInfo
prefix kv.Key
}

// NewIndex builds a new Index object.
func NewIndex(tableInfo *model.TableInfo, indexInfo *model.IndexInfo) table.Index {
func NewIndex(partitionID int64, indexInfo *model.IndexInfo) table.Index {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Please comment it.

ret.recordPrefix = tablecodec.GenTableRecordPrefix(pid)
ret.indexPrefix = tablecodec.GenTableIndexPrefix(pid)
return &ret
return t.partitions[pid]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If GetPartition function create a Partition and Index everytime it's called, it's bad for performance.
So I store those information when PartitionedTable is created, and just get it here.


idx := NewIndex(tblInfo, idxInfo)
t.indices = append(t.indices, idx)
if err := initTableCommon(&t.tableCommon, tblInfo, tblInfo.ID, columns, alloc); err != nil {
Copy link
Contributor

@zimulala zimulala Jul 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of partitions, it seems that this function repeatedly builds the index. The first time is here, the other time is when building the index(https://github.com/pingcap/tidb/pull/7062/files#diff-c709fb799ac0eac6d7ea40b2278c90d7R69).

@tiancaiamao
Copy link
Contributor Author

PTAL @zimulala @zz-jason

@coocood coocood added status/LGT3 The PR has already had 3 LGTM. status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. status/LGT3 The PR has already had 3 LGTM. labels Jul 17, 2018
zhexuany
zhexuany previously approved these changes Jul 17, 2018
Copy link
Contributor

@zhexuany zhexuany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -105,6 +105,12 @@ func (s *partitionProcessor) prune(ds *DataSource) (LogicalPlan, error) {
newDataSource.baseLogicalPlan = newBaseLogicalPlan(ds.context(), TypeTableScan, &newDataSource)
newDataSource.isPartition = true
newDataSource.partitionID = pi.Definitions[i].ID
// Find a better way to handle here.
// There are so many place that refer to the old DataSource including the schema
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is confusion, I think you should directly explain why the newDataSource.id is set to the ds.id. It's not because we hold the reference of the Datasource, it's because we use the id as the FromID of the Columns in the Expression of parent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"use the id as the FromID of the Columns in the Expression of parent"
This is a kind of reference!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I mean, we should statement the comment in a straightforward way.

@zz-jason zz-jason dismissed zhexuany’s stale review July 17, 2018 05:10

still need some discuss

@zz-jason
Copy link
Member

LGTM

@coocood coocood merged commit 44e34bd into pingcap:master Jul 17, 2018
@coocood coocood added status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Jul 17, 2018
@tiancaiamao tiancaiamao deleted the index-refactor branch July 17, 2018 06:09
@tiancaiamao tiancaiamao added this to In Progress in Table Partitioning via automation Aug 26, 2018
@tiancaiamao tiancaiamao moved this from In Progress to Done in Table Partitioning Aug 26, 2018
@you06 you06 added the sig/sql-infra SIG: SQL Infra label Mar 4, 2020
@zz-jason zz-jason moved this from Issue: Done to PR: Donw in Table Partitioning Apr 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra status/LGT3 The PR has already had 3 LGTM.
Projects
No open projects
Table Partitioning
  
PR: Done
Development

Successfully merging this pull request may close these issues.

None yet

8 participants