Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scan results are of wrong type when Query has no args #861

Closed
ComaVN opened this issue Sep 28, 2018 · 7 comments
Closed

Scan results are of wrong type when Query has no args #861

ComaVN opened this issue Sep 28, 2018 · 7 comments
Labels

Comments

@ComaVN
Copy link

ComaVN commented Sep 28, 2018

Issue description

I tried to use Rows.Scan to read fields of an sql result row into variables of type interface{}

I expected integer sql types to be read as go type int64, or at least, the go type to be the same whatever the query looks like.

However:

  • SELECT 123 WHERE ? = 1 returns int64: 123
  • SELECT 123 WHERE 1 = 1 returns []uint8: []byte{0x31, 0x32, 0x33}

Example code

See https://github.com/ComaVN/gosqltypespoc for a PoC, and golang/go#27828 (comment) for why it's probably a mysql driver issue.

Error log

n/a

Configuration

No idea how to determine the driver version, see the PoC: I just download the latest version from github, I think.

go version go1.11 linux/amd64

mysql:8.0.12, mariadb:10.1

Debian GNU/Linux 9 (stretch)

@methane
Copy link
Member

methane commented Sep 28, 2018

It's a not bug, but designed behavior.
If we change this behavior, it cause performance regression.

var s string
db.QueryRow("SELECT 123").Scan(&s)

If this driver parse "123" and convert it to int64, database/sql must convert it into string again.
This "string -> int -> string" conversion is totally unnecessary.

Why can't you use int or int64, instead of interface{}?
If your requirements is general enough, it may be considerable that adding option to control
textrow behavior.

@methane
Copy link
Member

methane commented Sep 28, 2018

One more comment.
Even if we convert values in textrow, we can't guarantee "same Go type is returned for same column type".

What MySQL returns for fieldType is not so stable. It will be changed by various
unexpected reasons.
So don't expect type in interface{} is stable for same type column. Think it is unstable by design.

@ComaVN
Copy link
Author

ComaVN commented Sep 28, 2018

It's a not bug, but designed behavior.

Inconsistent results are designed behaviour?

If we change this behavior, it cause performance regression.

var s string
db.QueryRow("SELECT 123").Scan(&s)

If this driver parse "123" and convert it to int64, database/sql must convert it into string again.
This "string -> int -> string" conversion is totally unnecessary.

This looks contrived: if the database field is of integer type, why would you store it in a string? And if you want to do so, shouldn't you accept that, yes, it must be converted, and yes, this has a performance cost?

Why can't you use int or int64, instead of interface{}?

Because I'm writing reusable code that tries to be flexible and doesn't have to know the type of each field in advance.

If your requirements is general enough, it may be considerable that adding option to control
textrow behavior.

For a statically typed language, I'd argue that type consistency is more important than performance in some edge case.

@methane
Copy link
Member

methane commented Sep 28, 2018

Inconsistent results are designed behaviour?

Yes. "Return most efficient type for returned data, and database/sql convert it to output type" is designed behavior.

This looks contrived: if the database field is of integer type, why would you store it in a string?

This is just an example why we don't parse text result into other types.
Only database/sql knows output type. So do nothing as possible is most efficient.
Same problem can be happend for many types: string -> []byte -> string, string -> tinyint -> bool, etc.

Because I'm writing reusable code that tries to be flexible and doesn't have to know the type of each field in advance.

You can use ColumnType to know actual column type.

For a statically typed language, I'd argue that type consistency is more important than performance in some edge case.

I can't agree. This topic is about consistent type for "dynamic type" (e.g. interface{}).
For Python (as a dynamic typed language), dynamic type is more important, because people
don't specify expected output type.
For Go, because most user use static type for scanning. That's why most people doesn't care this behavior and prefer more performance and less allocations.

@methane
Copy link
Member

methane commented Sep 28, 2018

One way to avoid text protocol is use db.Prepare() explicitly, for every query.

@ComaVN
Copy link
Author

ComaVN commented Sep 28, 2018

Thank you for your detailed responses.

I understand the reasoning, but the end result is that as a user of database/sql I need to be aware of implementation details of the driver, which is less than ideal.

Even when using ColumnType, I need to be aware of what datatypes the underlying database can return, and how to map them to go types. In a perfect world, I'd say this is the responsibility of the driver (like how eg. Doctrine DBAL does it), but I understand database/sql is not really intended to be a dbal.

Anyway, thanks.

@prekratko
Copy link

Hi guys.
I would like to add my experience.

Context: running db query that has, among db fields, one calculated field
That calculated field gives integer 0 or 1.
Driver estimated it is int64 and scan processes values like 1080864292821008385 or 1080864310000877568.
For illustration, those numbers converted to hex give F00005900000001 or F00005D00000000.
My guess is that my wanted 0/1 value is stored in least significant byte and everything to the left is "more than I needed". So, I can't really say where it came from.
I tried to change formula in query to give char. Char result was stored as array of unsignedint8[].
Anyway, from my point of view, as a user of this driver, it has problem when estimating data type of ad-hoc calculated fields.
My workaround was to put at the beginning of the calculated field subtraction of some other integer field from recordset, and add on top of that, my calculation.
Like: instead (my formula) I put (field1 - field1) + (my formula)
And then there is no more problem.

jonbodner added a commit to jonbodner/proteus that referenced this issue Sep 4, 2020
* fix problems with mysql.

* use a prepared statement explicitly to make mysql happy when there are no query parameters

* only attempt to use the prepared statement if there are no query parameters, as that's the bug in the mysql driver

* simplify code

The problem with MySQL's Go driver (https://github.com/go-sql-driver/mysql) is that it won't translate numbers in the response when the text protocol is used (which happens when a query is run without any parameters). Multiple issues have been filed against this for years, but the developers have responded with a wontfix due to the dubious claim that it's too much of a performance hit. See go-sql-driver/mysql#861 for an example.

This update checks to see if a query has no query args. If it does, it checks to see if the Querier/ContextQuerier also implements Prepare/PrepareContext and if so, it creates a statement, forcing MySQL to use the binary protocol. (Postgres will also be using a prepared statement unnecessarily, but it probably has minimal performance difference).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants