Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: support chunk for Sort #5221

Merged
merged 7 commits into from
Nov 27, 2017
Merged

executor: support chunk for Sort #5221

merged 7 commits into from
Nov 27, 2017

Conversation

coocood
Copy link
Member

@coocood coocood commented Nov 25, 2017

No description provided.

executor/sort.go Outdated
fields := e.schema.GetTypes()
var totalCount int
for {
chk := chunk.NewChunk(fields)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chk := e.children[0].newChunk()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there are many chunks to create, reuse one fields can save memory allocation.

)

// CompareFunc is a function to compare the two values in Row, the two columns must have the same type.
type CompareFunc func(a Row, aCol int, b Row, bCol int) int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. interface will be more efficient than function pointer
  2. s/a/lhs/, s/b/rhs/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a micro benchmark, the result shows that function pointer is faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preview micro benchmark didn't define the function as a type, I did it again with function type and interface is slightly faster

@@ -679,6 +679,7 @@ func (b *executorBuilder) buildSort(v *plan.Sort) Executor {
ByItems: v.ByItems,
schema: v.Schema(),
}
sortExec.supportChk = true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while building child:

childExec := b.build(v.Children()[0])
if b.err != nil {
    b.err = errors.Trace(b.err)
    return nil
}

executor/sort.go Outdated
if err != nil {
return errors.Trace(err)
}
sort.Slice(e.rowPointers, e.newLess)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

golang/go#22504
Seems sometimes sort.Slice is slower than sort.Sort, can we do some experiment on sort.Sort vs. sort.Slice ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a micro benchmark with the rowPointer type slice, and the result show sort.Slice is faster.

@coocood
Copy link
Member Author

coocood commented Nov 26, 2017

@zz-jason
I changed function type definition to type alias and the performance is slightly better than the interface.

lBit := types.BinaryLiteral(l.GetBytes(lCol))
rBit := types.BinaryLiteral(r.GetBytes(rCol))
lUint, err := lBit.ToInt()
terror.Log(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the trace for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We log the error to pass errcheck, we never get this error or there is data corruption.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my personal culture, why simply not replace this:

	lUint, err := lBit.ToInt()
	terror.Log(err)

by this:

	lUint, _ := lBit.ToInt()

Thank's in advance 😉

executor/sort.go Outdated
totalCount int
}

type rowPointer struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment for this.

executor/sort.go Outdated

// keyColumns is an optimization that when all by items are column, we don't need to evaluate them to keyChunks.
// just use the row chunks instead.
keyColumns []int
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment for the following attributes.

for i, colIdx := range e.keyColumns {
cmpFunc := e.keyCmpFuncs[i]
cmp := cmpFunc(rowI, colIdx, rowJ, colIdx)
if e.ByItems[i].Desc {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about passing bool desc as one of the parameter of the comparers ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes we only need to compare equal, Desc is not needed.


// rowPointer stores the address of a row in rowChunks by its chunk index and row index.
type rowPointer struct {
chkIdx uint32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using int, thus we can avoid many manual type conversions ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rowPointers slice need to allocate large contiguous memory, we should make the size as small as possible.

@zz-jason
Copy link
Member

LGTM

@zz-jason
Copy link
Member

/run-all-tests

@coocood
Copy link
Member Author

coocood commented Nov 27, 2017

/run-unit-test

@coocood coocood added the status/LGT1 Indicates that a PR has LGTM 1. label Nov 27, 2017
Copy link
Member

@hanfei1991 hanfei1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coocood coocood merged commit 20bd1b6 into pingcap:master Nov 27, 2017
@coocood coocood deleted the sort-chunk branch November 27, 2017 12:39
@zz-jason zz-jason mentioned this pull request Nov 29, 2017
41 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/LGT1 Indicates that a PR has LGTM 1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants