*: cut off duration.fsp in chunk #7043

Merged
coocood merged 13 commits into pingcap:master from lysu:dev/duration_cut_off_fsp_with_easy_way on Jul 17, 2018

Conversation

@lysu
Member

commented Jul 12, 2018

What have you changed? (mandatory)

  • Remove the fsp of types.Duration from Chunk to reduce the memory used by a chunk
  • Fix the decimal of the time functions' return type to fix #6968, via 1e12b0e

This PR is the "easy way" alternative to #7013. #7013 is more correct and cleaner in theory, but it is a big change that would be a nightmare for both reviewee and reviewer. I still think it must be done in the future.

  • Datum's collation, decimal, and length fields are a shortcut for some parts of the code, but they are wrong by design.
  • Duration's fsp needs to be removed together with those three Datum fields, which requires a big change around Decimal.
  • Separating the type inference flow from the data flow is the best approach.

For now, though, this small PR solves the chunk part first.

What is the type of the changes? (mandatory)

  • Improvement (non-breaking change which is an improvement to an existing feature)
  • Bug fix (non-breaking change which fixes an issue)

How has this PR been tested? (mandatory)

  • unit tests
  • integration tests
  • I will also port the duration mysql_test cases as tests, so this is WIP but ready for review


@lysu lysu force-pushed the lysu:dev/duration_cut_off_fsp_with_easy_way branch from 2c6a8e9 to aeb9f21 Jul 12, 2018
@lysu

Member Author

commented Jul 12, 2018

/run-all-tests

@lysu lysu added the status/WIP label Jul 12, 2018
@lysu lysu changed the title *: cut off Duration.Fsp in chunk *: cut off duration.fsp in chunk Jul 12, 2018
@@ -88,7 +89,8 @@ func (r Row) GetTime(colIdx int) types.Time {
// GetDuration returns the Duration value with the colIdx.
func (r Row) GetDuration(colIdx int) types.Duration {

@zz-jason

zz-jason Jul 12, 2018

Member

Does this change affect the benchmark of GetDuration?

@lysu

lysu Jul 13, 2018

Author Member

I will benchmark it later. It seems to allocate one more types.Duration than before, but that is just what GetTime does...

if err != nil {
return nil, err
}
bf.tp.Decimal = mathutil.Max(arg0Dec, arg1Dec)

@zz-jason

zz-jason Jul 12, 2018

Member

this is a bugfix?

@lysu

lysu Jul 13, 2018

Author Member

This decimal field was not used before this PR, so it held a wrong value without affecting the user. Now, if the duration comes from chunk.Row, the field is needed to compensate for the missing fsp info.
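
To make the inference rule above concrete, here is a minimal sketch of taking the larger of the two argument precisions, as the bf.tp.Decimal = mathutil.Max(arg0Dec, arg1Dec) line does. The names resultFsp and maxFsp are hypothetical and not part of TiDB; the clamp to 6 is my own addition, based on MySQL supporting at most 6 fractional digits.

```go
package main

// maxFsp is the largest fractional-seconds precision MySQL supports.
// The constant name is hypothetical, chosen for this sketch only.
const maxFsp = 6

// resultFsp returns the precision a two-argument time function would
// report: the larger of its argument precisions, clamped to maxFsp.
// This is an illustration of the rule, not TiDB's implementation.
func resultFsp(arg0Fsp, arg1Fsp int) int {
	fsp := arg0Fsp
	if arg1Fsp > fsp {
		fsp = arg1Fsp
	}
	if fsp > maxFsp {
		fsp = maxFsp
	}
	return fsp
}
```

With a TIME(0) column and a TIME(3) column as arguments, resultFsp(0, 3) yields 3, which matches the Decimals value MySQL reports for addtime(c_time_d, c_time) later in this thread.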

@@ -1307,13 +1307,13 @@ func (s *testIntegrationSuite) TestTimeBuiltin(c *C) {

// test time
result = tk.MustQuery("select time('2003-12-31 01:02:03')")
- result.Check(testkit.Rows("01:02:03"))
+ result.Check(testkit.Rows("01:02:03.000000"))

@zz-jason

zz-jason Jul 12, 2018

Member

this result is different from mysql:

MySQL(localhost:3306) > select time('2003-12-31 01:02:03');
+-----------------------------+
| time('2003-12-31 01:02:03') |
+-----------------------------+
| 01:02:03                    |
+-----------------------------+
1 row in set (0.00 sec)
result = tk.MustQuery("select time('2003-12-31 01:02:03.000123')")
result.Check(testkit.Rows("01:02:03.000123"))
result = tk.MustQuery("select time('01:02:03.000123')")
result.Check(testkit.Rows("01:02:03.000123"))
result = tk.MustQuery("select time('01:02:03')")
- result.Check(testkit.Rows("01:02:03"))
+ result.Check(testkit.Rows("01:02:03.000000"))

@zz-jason
@lysu lysu force-pushed the lysu:dev/duration_cut_off_fsp_with_easy_way branch from d6fc40c to 4291000 Jul 13, 2018
@lysu

Member Author

commented Jul 13, 2018

There are still some truncation test cases that give results different from MySQL (even on the master branch), see pingcap/tidb-test@21411ba

@zz-jason

Member

commented Jul 13, 2018

LGTM

But we still need some benchmarks to know the exact performance regression.

@lysu lysu added the status/DNM label Jul 14, 2018
@lysu lysu force-pushed the lysu:dev/duration_cut_off_fsp_with_easy_way branch from 51318ff to 0730598 Jul 16, 2018
return true
}
return false
}

// IsStr returns a boolean indicating
// whether the field type is a string type.
func IsStr(ft *FieldType) bool {

@XuHuaiyu

XuHuaiyu Jul 16, 2018

Contributor

s/ IsStr/ IsString

@@ -20,6 +20,7 @@ import (
"github.com/pingcap/tidb/types"
"github.com/pingcap/tidb/types/json"
"github.com/pingcap/tidb/util/hack"
"time"

@XuHuaiyu

XuHuaiyu Jul 16, 2018

Contributor

Move this line up to line 17.

@@ -5370,3 +5401,19 @@ func (b *builtinLastDaySig) evalTime(row types.Row) (types.Time, bool, error) {
}
return ret, false, nil
}

func timePrecision(ctx sessionctx.Context, expression Expression) (int, error) {

@XuHuaiyu

XuHuaiyu Jul 16, 2018

Contributor

add a comment for this function.

@lysu

lysu Jul 16, 2018

Author Member

The name was not very suitable; it has been renamed and commented in a new commit.

if isNil || err != nil {
return 0, errors.Trace(err)
}
if n := strings.LastIndexByte(str, '.'); n >= 0 {

@XuHuaiyu

XuHuaiyu Jul 16, 2018

Contributor

This check can be replaced by function GetFsp
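
For readers following along, the check under discussion derives the precision from a string by counting the digits after the last '.'. A minimal sketch of that idea follows; fspFromLiteral and maxFsp are hypothetical names, and this is not the body of TiDB's GetFsp, only an illustration of the same technique.

```go
package main

import "strings"

// maxFsp is the largest fractional-seconds precision MySQL supports.
// The constant name is hypothetical, chosen for this sketch only.
const maxFsp = 6

// fspFromLiteral counts the digits after the last '.' in s and caps the
// result at maxFsp. A string without '.' has precision 0.
func fspFromLiteral(s string) int {
	n := strings.LastIndexByte(s, '.')
	if n < 0 {
		return 0
	}
	fsp := len(s) - n - 1
	if fsp > maxFsp {
		fsp = maxFsp
	}
	return fsp
}
```

Under this sketch, "01:02:03" has precision 0 and "01:02:03.000123" has precision 6, which lines up with the time('...') test expectations above.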

@@ -2663,7 +2663,7 @@ func (s *testIntegrationSuite) TestCompareBuiltin(c *C) {
tk.MustExec(`insert into t2 values(1, 1.1, "2017-08-01 12:01:01", "12:01:01", "abcdef", 0b10101)`)

result = tk.MustQuery("select coalesce(NULL, a), coalesce(NULL, b, a), coalesce(c, NULL, a, b), coalesce(d, NULL), coalesce(d, c), coalesce(NULL, NULL, e, 1), coalesce(f), coalesce(1, a, b, c, d, e, f) from t2")
- result.Check(testkit.Rows(fmt.Sprintf("1 1.1 2017-08-01 12:01:01 12:01:01 %s 12:01:01 abcdef 21 1", time.Now().In(tk.Se.GetSessionVars().GetTimeZone()).Format("2006-01-02"))))
+ result.Check(testkit.Rows(fmt.Sprintf("1 1.1 2017-08-01 12:01:01 12:01:01.000000 %s 12:01:01 abcdef 21 1", time.Now().In(tk.Se.GetSessionVars().GetTimeZone()).Format("2006-01-02"))))

@XuHuaiyu

XuHuaiyu Jul 16, 2018

Contributor

Is this compatible with mysql?

@lysu

lysu Jul 16, 2018

Author Member

oh~ that's a bug... fixed in new commits

@@ -22,6 +22,8 @@ import (
)

const (
// WaitFillFsp is fsp need fill by caller if they want to use fsp.
WaitFillFsp = -2

@XuHuaiyu

XuHuaiyu Jul 16, 2018

Contributor

When will the caller want to use fsp?

@lysu

lysu Jul 16, 2018

Author Member

The old way required the caller of GetDuration to check whether result.fsp == WaitFillFsp and then either refill Fsp or do nothing... that's buggy.

GetDuration now takes a mandatory fillFsp parameter (with some added comments), so this flag is no longer needed. PTAL~ thx

@lysu lysu force-pushed the lysu:dev/duration_cut_off_fsp_with_easy_way branch from fe022f0 to 8fd9a50 Jul 16, 2018
@lysu

Member Author

commented Jul 16, 2018

@XuHuaiyu

Contributor

commented Jul 17, 2018

@lysu Can the DNM label be removed?

@@ -5370,3 +5401,16 @@ func (b *builtinLastDaySig) evalTime(row types.Row) (types.Time, bool, error) {
}
return ret, false, nil
}

// expressionFsp calculates the fsp from given expression.
func expressionFsp(ctx sessionctx.Context, expression Expression) (int, error) {

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

s/ expressionFsp/ getExpressionFsp

@@ -1249,7 +1249,7 @@ func (s *testInferTypeSuite) createTestCase4TimeFuncs() []typeInferTestCase {
{"addtime(c_timestamp, c_time_d)", mysql.TypeDatetime, charset.CharsetBin, mysql.BinaryFlag, 26, 4},
{"addtime(c_timestamp_d, c_time_d)", mysql.TypeDatetime, charset.CharsetBin, mysql.BinaryFlag, 26, 0},
{"addtime(c_time, c_time)", mysql.TypeDuration, charset.CharsetBin, mysql.BinaryFlag, 15, 3},
- {"addtime(c_time_d, c_time)", mysql.TypeDuration, charset.CharsetBin, mysql.BinaryFlag, 15, 0},
+ {"addtime(c_time_d, c_time)", mysql.TypeDuration, charset.CharsetBin, mysql.BinaryFlag, 15, 3},

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

Are the changes in this file more reasonable, or more compatible with MySQL?

@lysu

lysu Jul 17, 2018

Author Member

Yes. Previously we didn't infer the decimal, so it was always zero, but MySQL does infer it.

mysql> select version();
+---------------------------+
| version()                 |
+---------------------------+
| 5.7.22-0ubuntu18.04.1-log |
+---------------------------+
1 row in set (0.00 sec)

mysql> select addtime(c_time_d, c_time) from t;
Field   1:  `addtime(c_time_d, c_time)`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       TIME
Collation:  binary (63)
Length:     14
Max_length: 0
Decimals:   3
Flags:      BINARY 
0 rows in set (0.00 sec)
@@ -42,7 +42,9 @@ type Row interface {
GetTime(colIdx int) Time

// GetDuration returns the Duration value with the colIdx.
GetDuration(colIdx int) Duration
// fillFsp is needed for refill fsp info if duration came from chunk.Row which is no longer store fsp info.
// if caller make sure that data from Datum or only use Duration.Duration properties can pass 0 as fillFsp

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

s/ is/ If
add . at the end of this comment

@@ -812,7 +813,7 @@ func (d Duration) String() string {
}

fmt.Fprintf(&buf, "%02d:%02d:%02d", hours, minutes, seconds)
- if d.Fsp > 0 {
+ if d.Fsp > 0 && fraction >= 0 {

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

add a comment for this check

@lysu

lysu Jul 17, 2018

Author Member

The fraction check seems useless... I removed it.

if err != nil {
return ret, errors.Trace(err)
}
if timeNum > MaxDuration && timeNum < 10000000000 {

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

What if timeNum > 10000000000?

@lysu

lysu Jul 17, 2018

Author Member

A timeNum > MaxDuration and < 10000000000 must be a wrong value.
But a timeNum > 10000000000 will be tried as 'YYYYMMDDHHmmss'.

In particular, a timeNum < 0, including a timeNum < -10000000000, should NOT be treated as 'YYYYMMDDHHmmss'.

https://github.com/mysql/mysql-server/blob/e7586bb9c11953a81c816ca10708f612403edc24/sql-common/my_time.c#L846

I added some comments for this if~
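
The ranges described above can be summarized in a small sketch. All names here (classifyTimeNum, the kind constants, maxDurationNum, minDatetimeNum) are hypothetical. The upper TIME bound of 838:59:59 is MySQL's documented limit; treating exactly 10000000000 as a datetime candidate and rejecting values below -8385959 are my assumptions about the boundaries.

```go
package main

const (
	// maxDurationNum is 838:59:59 written as the number HHMMSS.
	maxDurationNum = 8385959
	// minDatetimeNum is '0001-00-00 00:00:00' written as YYYYMMDDHHmmss.
	minDatetimeNum = 10000000000
)

type numKind int

const (
	kindDuration numKind = iota // parse as HHMMSS
	kindInvalid                 // out of range for TIME, too small for DATETIME
	kindDatetime                // try parsing as YYYYMMDDHHmmss
)

// classifyTimeNum decides how a numeric TIME input would be interpreted.
// Negative inputs are never datetime candidates.
func classifyTimeNum(n int64) numKind {
	switch {
	case n >= minDatetimeNum:
		return kindDatetime
	case n > maxDurationNum || n < -maxDurationNum:
		return kindInvalid
	default:
		return kindDuration
	}
}
```

So 120101 is read as 12:01:01, 9999999999 is rejected, and 20170801120101 goes down the full DATETIME path.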


arg0Dec, err := expressionFsp(ctx, args[0])
if err != nil {
return nil, err

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

errors.Trace(err)

and line 411

@lysu

Member Author

commented Jul 17, 2018

/run-all-tests tidb-test=pr/574

@coocood

Member

commented Jul 17, 2018

LGTM

@lysu

Member Author

commented Jul 17, 2018

@zz-jason

func BenchmarkGetDurationFromChunk(b *testing.B) {
	chk := chunk.NewChunkWithCapacity([]*types.FieldType{{Tp: mysql.TypeDuration}}, 1)
	chk.AppendDuration(0, types.Duration{Duration: 11233 * time.Second, Fsp: 1})
	row := chk.GetRow(0)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		// Assign to the blank identifier so the result counts as used.
		x := row.GetDuration(0, 1)
		_ = x
	}
}

for master:

BenchmarkGetDurationFromChunk-8   	2000000000	         1.06 ns/op	       0 B/op	       0 allocs/op
PASS

for this pr:

BenchmarkGetDurationFromChunk-8   	2000000000	         1.03 ns/op	       0 B/op	       0 allocs/op
PASS
Contributor

left a comment

rest LGTM

if err != nil {
return ret, errors.Trace(err)
}
// For For huge numbers(>'0001-00-00 00-00-00') try full DATETIME in ParseDuration.

@XuHuaiyu

XuHuaiyu Jul 17, 2018

Contributor

remove one For

@lysu

lysu Jul 17, 2018

Author Member

Done

@lysu lysu force-pushed the lysu:dev/duration_cut_off_fsp_with_easy_way branch from 6ed90f7 to 0532bbb Jul 17, 2018
Contributor

left a comment

LGTM

@coocood coocood added status/LGT2 and removed status/LGT1 labels Jul 17, 2018
@coocood coocood merged commit 9cf670a into pingcap:master Jul 17, 2018
4 checks passed
ci/circleci: Your tests passed on CircleCI!
continuous-integration/travis-ci/pr: The Travis CI build passed
jenkins-ci-tidb/build: Jenkins job succeeded.
license/cla: Contributor License Agreement is signed.
@lysu lysu deleted the lysu:dev/duration_cut_off_fsp_with_easy_way branch Sep 27, 2018
4 participants