Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return u/int8, u/int16, u/int32 (sign-extend Int24), float32, float64 for interface{} as appropriate #28

Merged

Conversation

karmakaze
Copy link
Contributor

Related to Issue #26 the previously merged PR maintains the same return types always int64 for integral and float64 for floats.

This PR will return variable width and also the unsigned versions of types. This is somewhat of a breaking change, so consider if this seems better to you. It means that users will have to check for float32 as well as int8, uint8, ... int64, uint64 when interface{} is returned. The plus is that there's less chance of getting the wrong value. In particular for replication/row_event.go since there's no 'unsigned' meta info, signed is always returned. So an int32 for a value like 0x80000000 cast to uint32 will mean the right thing. Without this PR an int64 returned for a MYSQL 32-bit LONG with value 0x0000000080000000 will return 0xFFFFFFFF80000000 (with sign extension) and cast to uint64 will not be the intended value. Cast to uint32 is, but then user must remember to downcast width for unsigned.

Hope this is all clear. If very concerned with backward compatibility, perhaps a mode setting?

@siddontang
Copy link
Collaborator

Thanks @karmakaze , but I think it is not necessary here.
As you said above, handling multi int types is not easy, sometimes we should write

case n := v.(type) {
    case int8:
    case int16:
    case int32:
    ....
    case uin64:
}

It is a little redundant and not convenient than only using int64/uint64, maybe user will define its own ToInt64/Uint64 conversion function for outer use to make thing simple. :-)
You may see that for go database/sql lib, it only uses int64.

@karmakaze
Copy link
Contributor Author

I agree it's more convenient when it works. However it can lead to incorrect data when it doesn't. For the RowsEvent integral values are returned in interface{} as int64, say in v.

switch n := v.(type) {
    case int64:
        // for known unsigned value, cast to unsigned:
        u64 := uint64(v.(int64))
    // other cases are non-integral
}

Now if the original value was from a LONGLONG all is well. If it was from an unsigned LONG, say 0x80000000 the int64 will be 0xFFFFFFFF80000000 and the u64 will have a huge value, instead of 2147483648.

To handle this correctly the user code has to first have a type case for int64 then for each field name know what it's size is supposed to be when converting to unsigned. There's no automatic way of doing it, e.g.

func getUnsignedIntValue(v interface{}) uint64 {
    switch n := v.(type) {
    case int8:
        return uint64(uint8(n))
    case int16:
        return uint64(uint16(n))
    case int32:
        return uint64(uint32(n))
    case int64:
        return uint64(n)
    }
}

Such a function will continue to work without the user hardcoding field widths and even if a field in the db changes between unsigned int sizes. [The MySQL INT24 when unsigned requires masking out bits 25+ either way.]

@siddontang
Copy link
Collaborator

Hi @karmakaze

I think the problem is that we only return int64 for integer type now, maybe we will change it to returning int64/uint64 directly.

But I don't know whether this change will affect outer use like my go-mysql-elasticsearch or others.

@karmakaze
Copy link
Contributor Author

We don't know in advance whether to return int64 or uint64 in the case of RowsEvent--the column information doesn't include 'unsigned' meta info (in my testing).

Yes, this change can/will break some programs as behaviour is different (regardless of correctness of values).

@siddontang
Copy link
Collaborator

Hi @karmakaze

I dive into python mysql replication and find that it selects the column information(including unsigned flag) from INFORMATION_SCHEMA.COLUMNS table using schema name and table name when meets table map event, and I think we can use this way too.
But this may have a risk, the table with same schema name and table name may be altered, so when we meet a table map event and use its schema name and table name to fetch the column information, the table may have been altered. So we must check the table id validation in INFORMATION_SCHEMA.INNODB_SYS_TABLES, if the table id exits, the table is not altered, if not, we can't use the columns information.

Btw, I still think using int64/uint64/float64 is enough. :-)

@karmakaze
Copy link
Contributor Author

SELECTing column meta info with table id protection sounds like a good idea.
I'm currently using a fork/branch with the int8...int64 return types, and otherwise stay in sync with upstream for now.

siddontang added a commit that referenced this pull request Jan 19, 2016
Return u/int8, u/int16, u/int32 (sign-extend Int24), float32, float64 for interface{} as appropriate
@siddontang siddontang merged commit 9d1366c into go-mysql-org:master Jan 19, 2016
@siddontang
Copy link
Collaborator

Merged

@karmakaze karmakaze deleted the return-sized-types-for-interface branch January 20, 2016 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants