Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: expose database original column type #16652

Closed
kirubasankars opened this issue Aug 10, 2016 · 54 comments
Closed

database/sql: expose database original column type #16652

kirubasankars opened this issue Aug 10, 2016 · 54 comments
Assignees
Milestone

Comments

@kirubasankars
Copy link

@kirubasankars kirubasankars commented Aug 10, 2016

Please expose original database column type at least as string. this is currently so restricting on this package. this will help us to do some level dynamic column value parsing

@josharian josharian changed the title database/sql - database original column type database/sql: expose database original column type Aug 10, 2016
@quentinmit quentinmit added this to the Go1.8Maybe milestone Aug 11, 2016
@juicemia
Copy link

@juicemia juicemia commented Aug 11, 2016

I'm available to work on this. I would like to start contributing to Go.

More or less, what should the API look like for this?

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Aug 12, 2016

Much of the work is figuring out the API. Got a proposal?

@juicemia
Copy link

@juicemia juicemia commented Aug 12, 2016

What if we had a method on DB that was just responsible for returning metadata?

// Something more or less like this:
meta, err := db.QueryMeta("table_name", "column_name")
@quentinmit
Copy link
Contributor

@quentinmit quentinmit commented Aug 12, 2016

@juicemia That only works for table columns; the column in a query may be computed by the database engine, in which case there is no set of arguments to pass to your QueryMeta method that will return the correct type.

@kostya-sh
Copy link
Contributor

@kostya-sh kostya-sh commented Aug 12, 2016

How about a new method on Rows type:

func (rs *Rows) Types() ([]???, error)

I am not sure what type it should return though. It is certainly database specifict. E.g. there are ~150 different types in postgresql: https://github.com/lib/pq/blob/master/oid/types.go

JDBC API defines ~40 standard types: https://docs.oracle.com/javase/8/docs/api/java/sql/Types.html

@kostya-sh
Copy link
Contributor

@kostya-sh kostya-sh commented Aug 12, 2016

@juicemia, there is #7408 for table-level metadata

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 12, 2016

This introduces the concept of type information, useful for #12381 as well for input parameter types. The following type information would work: https://godoc.org/github.com/kardianos/rdb#Type

@juicemia
Copy link

@juicemia juicemia commented Aug 12, 2016

@kostya-sh @kardianos defines a custom Type type in rdb. I'm interested in reading what other people have to say about it, but I guess that Types could just return []Type.

func (rs *Rows) Types() ([]Type, error)

I am not sure what type it should return though. It is certainly database specifict. E.g. there are ~150 different types in postgresql: https://github.com/lib/pq/blob/master/oid/types.go JDBC API defines ~40 standard types: https://docs.oracle.com/javase/8/docs/api/java/sql/Types.html

Type lets driver implementors specify custom types.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 12, 2016

That's actually not complete enough in my opinion. Also see https://godoc.org/github.com/kardianos/rdb#Schema and the Column def. There are other parameters that can be important to know. Is it nullable? Is it a primary key? Field length, decimal precision.

I do think it is important to allow drivers to define custom types. In my implementation there is a generic type as well as a specific type. For example: varchar, char, nvarchar, and text are all genetically "Text".

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Aug 12, 2016

Maybe it should just be an interface that it's up to drivers to define?

SQLite shrinks the lowest common denominator set of types quite a bit. Postgres let's you define your own types. SQLite doesn't really care if you use the right type. Postgres does. Some of their types share some names but that's about it: they behave quite differently.

I'm not sure what methods it should have (other than String() string) though. Stuff to work with other types and values, presumably.

@juicemia
Copy link

@juicemia juicemia commented Aug 12, 2016

@kardianos good point. Then maybe it could just be

// ColumnMeta is analagous to rdb.Column
func (rs *Rows) ColumnMeta() ([]ColumnMeta, error)
@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 12, 2016

@jimmyfrasche Please don't use SQLite as an example. If you are using SQLite, today's API is perfect. However every other RDBMS out there has a strong notion of type internally and in interfaces.

As for variety, I've done the survey. What you have in the rdb Types is like 99% of types used. There really isn't that much variety. go type string is either varchar, char, nvarchar, varchar2, or text (and varchar2 is an oracle specific type and can be ignored for our use, can be provided by driver). That's it. Numeric types are generally all tinyint, smallint int, bigint. There are custom types: SQL server allow .Net types to be defined, including table types, postgres allows arrays and other types. But... I think we can let drivers define those types.

So, yes, sqlite doesn't care about types, so the current API is perfect for it. However other databases care (for at least performance) if your param is a varchar(10) or a nvarchar(-1). If it isn't meaningful for the database system in question, ignore it. But that is mostly just sqlite where you would.

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Aug 12, 2016

@kardianos Sqlite may not be the best example (don't get me wrong, I love it) but it's still an example of a database that needs to be supported.

MySql is another example of a database that's wishy-washy about types (this one I do not love).

However, my point was that the definition of type in SQL across databases isn't something that can be set in stone. Even standardized types can be different between databases, sometimes in subtle ways.

If you define a lot of types, even if most dbs support all of them, the drivers for the ones that don't are going to need to deal with handling the ones they don't support. If they have types not in that set, the driver and the user has to deal with it.

I don't see a lot of value in filling database/sql up with a lot of constants that may or may not apply and don't really mean a lot on their own without knowing what driver you're using.

If there's a Type interface that has nice semantics it can cover as many or as few types as a particular database provides and reflect the native semantics more accurately—even for databases that don't exist yet. This does push the implementation burden onto the provider of the driver, but it also affords them the ability to define their types to fit their underlying database better.

Ultimately, what matters is how users of the database/sql package are going to be using this information. Admittedly, I've only ever used it for debugging, so strings would suit me just fine.

Constants are cheaper but do not cover all the cases, so you still need a way to compare arbitrary types anyway since TypeCustom equaling TypeCustom isn't really the whole story. So there should probably be an Equal(t Type) method on Type and an AssignableTo(t Type). This would have to be provided by the driver even for common types.

Most users will only be dealing with one database at a time so it doesn't really matter if sqlite3.TypeText != pg.TypeText: you're probably going to have to handle that specially anyway if you're writing code that cares about such things.

I'm not convinced an interface is better than just strings: strings are much simpler and easier for a driver to implement. I'm pretty sure constants in database/sql aren't the way to go here, though. It makes the world seem more uniform than it is.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 12, 2016

@jimmyfrasche Can you please point me to documentation explaining how My Sql is wishy washy about types?

Are you disagreeing that there are only so many distinct types or that there are many others then listed https://github.com/kardianos/rdb/blob/master/types.go#L53 ?

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Aug 12, 2016

MySql does a lot of implicit conversions https://dev.mysql.com/doc/refman/5.7/en/type-conversion.html though knowing the type upfront could allow you to work around that, if you knew that might get you in trouble and you knew you were dealing with MySql.

Are you disagreeing that there are only so many distinct types or that there are many others then listed

Both. There are databases with fewer types. There are databases with more types. (Many with infinite types in the sense that you can define arbitrary composite types like arrays and records).

I'm also trying to point out that just because two types have the same name in two different databases it doesn't mean they behave the same or that all the same rules apply.

None of that is to say that the type information isn't useful: just that it's very dependent on what database you're using.

I'm thinking something like

//in driver package
type Type interface {
    fmt.Stringer
    ValueConverter

    Equal(Type) bool
    AssignableTo(Type) bool
   //and so on
}

is going to be more useful since it can handle a lot of those database-specific nuances for you. An sqlite driver could just always return true for AssignableTo and a postgres driver could return false if Equal returns false (after checking a handful of special cases surrounding text types, iirc).

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 12, 2016

MySql does a lot of implicit conversions https://dev.mysql.com/doc/refman/5.7/en/type-conversion.html though knowing the type upfront could allow you to work around that, if you knew that might get you in trouble and you knew you were dealing with MySql.

This is how SQL works, not just in My SQL. In MS SQL Server, if you send up a nvarchar param to a varchar column in a where clause, the performance goes down significantly due to increased conversion overhead. Silly? Yes. But I've gotta work with that without a bunch of hacks.

I'm also trying to point out that just because two types have the same name in two different databases it doesn't mean they behave the same or that all the same rules apply.

To a degree. A type Text in MS SQL server has a distinct meaning from varchar(max), whereas in postgresql type Text is just an unbounded varchar w/o significant differences. But an interface you are suggesting can't capture that well, as I'm not just taking about assignability to another database type. I also want to specify the input parameters #12381 where I need a database type.

There are databases with fewer types. There are databases with more types.

Fewer types, some types aren't used, more types are defined in the driver itself. Despite the many dialects of SQL, data types do have a common base. And the differences a driver may define and reference. I don't see the problem.


I find value in knowing what the database thinks a type is for both result columns and for input parameters. You are saying for result columns, provide a string representation and an interface to provide many functions you might want to do with it. I think that falls down in the input parameter case.

There may be value in letting the driver expose a function like:

type TypeCompare interface{
  Assignable(t1, t2 Type) bool
  Generic(t Type) Type
  ...
}

That could have value. But I think a standard type list, along with a driver type list is a first step. Do a survey among common and uncommon RDMBS types. They all share the vast majority of types, there is some variation as in the MS text vs varchar, but in those cases knowing the exact differences as a programmer is required (If a stored proc takes text, don't give it varchar for example).

@jimmyfrasche
Copy link
Member

@jimmyfrasche jimmyfrasche commented Aug 13, 2016

I see what you mean re #12381 though the driver could export the types for this use. That doesn't help if you're trying to remain database agnostic, though. But presumably you're passing this as a hint to the driver, so you're already not database agnostic. Maybe I don't understand the use case.

If a specific driver is defining its own types—which it would have to for databases that allow user defined types at the very least—you still need an interface. Then the Type constants in database/sql would have to implement that interface, so they could be used interchangeably with the driver's types. But then they couldn't have anything db specific, like equivalence, so none of the types in database/sql could ever contain information specific to a database, so the interface couldn't have things like AssignableTo. If you needed that information you'd need to encode that in the driver or a third location, so what's the point of having the type constants in database/sql? They'd just be a list of names that could mean different things. I do not see the point.

It would be nice for database/sql to have more features, but I'm wary of encoding too much information into it.

What is the use case for having these specific constants in database/sql? Like I said, I've only ever used it for debugging. I imagine it would be very handy for statement mappers and ORMs but I've never written such things.

@kostya-sh
Copy link
Contributor

@kostya-sh kostya-sh commented Aug 13, 2016

As far as I can see there are 3 places where types could be useful:

  • result set (this issue)
  • table/stored proc level metadata (#7408)
  • input and output query parameters (#12383)

I think it would help to discuss and understand requirements for each of these cases separately.

Before jumping to the API discussions I suggest we should understand the use case for the requested feature better.

@kirubasankars, can you please explain what you meant by "some level dynamic column value parsing"?

Has anyone needed this functionality before? What was the use case?

@kostya-sh
Copy link
Contributor

@kostya-sh kostya-sh commented Aug 13, 2016

For the reference this is what JDBC provides: http://docs.oracle.com/javase/8/docs/api/java/sql/ResultSetMetaData.html

I can imagine only a few database drivers support gettting all this information though. For example this is what you can get out of ProstgreSQL: https://www.postgresql.org/docs/9.5/static/protocol-message-formats.html (search for RowDescription (B))

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 13, 2016

@jimmyfrasche Yes, this would not be database agnostic. That would need a different system layer to filter it through. It would just expose what the RDBMS SQL type is.

Use case for results where switching on go type doesn't work:

var sql = `
select
  d.ID, d.Name, d.Mime,
 fromp.FullName as  FromFullName,  forp.FullName as ForFullName
from
  seq.Document d
  join acc.Person fromPerson fromp on fromp.ID = d.FromPerson
  left join acc.Person forPerson forp on forp.ID = d.ForPerson
where
  1=1 and ...;
`

If you need to encode a information regarding ForFullName, you can't rely on it being there in the first row, in fact it may never be represented. This isn't a problem if you ad-hoc everything, but if you push more to app specific framework, it can be more problematic.

However, per Brad in comment #12381 (comment) I think this is moot. So while Brad also said #16652 (comment) , it would need to be without specific type information being represented in the sql pacakge.

@kirubasankars
Copy link
Author

@kirubasankars kirubasankars commented Aug 15, 2016

@kostya-sh, Trying to create entire row sets as map[string]interface{}.

While executing that query, my code have no idea about how many column, their names and types

Basic types are automatically mapped to GO types, that's good. Custom type are not, (example PostgreSQL custom type)

driver's can get me those custom values by only as RawBytes

I was expecting this package should expose type and size information.

in my opinion, may be java handle this better,
java sample click here

Also have a look at JDBC sql types

@juicemia
Copy link

@juicemia juicemia commented Aug 17, 2016

Something just occurred to me.

At the risk of sounding dumb: how would I go about adding a method on an interface type?

// This is illegal because Rows is an interface type
func (rs *Rows) Schema() Schema

Adding it to the Rows interface breaks the API and thus the API tool fails. I added the old interface to except.txt but I'm not sure if that's such a good idea? Some help from more experienced contributors would be awesome.

EDIT: I'm only doing this to start playing around and getting familiar with the source. I don't feel like we've reached a consensus as to what should be done.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 17, 2016

@juicemia *sql.Rows is a struct. As such it can be added to safely.

@juicemia
Copy link

@juicemia juicemia commented Aug 17, 2016

@kardianos wow how embarrassing. 😅 i was looking in driver and totally forgot about the rest of the package. 😶

So what's the direction we should take on the types then? I agree with @jimmyfrasche that we should just expose a Type interface, and leave the implementation of the types to the database drivers.

@juicemia
Copy link

@juicemia juicemia commented Aug 23, 2016

Has discussion for this moved somewhere else, or has it just tapered off?

@kardianos
Copy link
Contributor

@kardianos kardianos commented Aug 23, 2016

@juicemia It has not moved.
However, it is now my explicit understanding that @bradfitz does not want to see SQL Database types exposed in this package. So even though it isn't closed, I put it into the "highly doubtful" range.

See #12381 (comment)

Also see: #16673 (comment) for the scope of changes I hope to tackle for go1.8.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Aug 23, 2016

@kardianos, no, I'm fine with making type information available. What I was opposed to that in that comment is making users redundantly specify type information when inserting data, since an interface value already conveys type information.

I'd like to see an API proposal for this issue.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Sep 5, 2016

@juicemia The diff can be found here: juicemia@53d1cf5
It may be easier in the future to view these in a standard CL in gerrit. Just ensure you add DO NOT REVIEW on in it.

@juicemia
Copy link

@juicemia juicemia commented Sep 7, 2016

Oh okay. Yeah sorry I should have linked the commit.

@bradfitz
Copy link
Contributor

@bradfitz bradfitz commented Sep 8, 2016

After a chat with @kardianos, we've converged on a design like:

type ColumnInfo struct {
        // unexported fields                                                                                               
        // driver handle some                                                                                              
}

func (ci ColumnInfo) Name() string { ... }
func (ci ColumnInfo) ScanType() reflect.Type  { ... } // something that's suitable for Scan                                         
func (ci ColumnInfo) Nullable() (nullable, ok bool) { ... }
func (ci ColumnInfo) DriverTypeName() string { ... }     // "CHAR" without (length)                                                

func (rs *Rows) ColumnInfo() ([]ColumnInfo, error) {

}
@kirubasankars
Copy link
Author

@kirubasankars kirubasankars commented Sep 13, 2016

Are we missing column size from the driver?

@neilotoole
Copy link

@neilotoole neilotoole commented Sep 18, 2016

Seconding @kirubasankars on column size.

Also DriverTypeName... does this mean driver-specific strings? It would be highly convenient if there were a standardized mapping of the most common types to constants a la JDBC etc. Otherwise I'll end up doing something like stdType := allDrivers['postgres'][driverTypeName] (which is basically what I do today). Not the end of the world, but could be confusing for newcomers.

@neilotoole
Copy link

@neilotoole neilotoole commented Sep 26, 2016

@kardianos @bradfitz @kirubasankars

I hope the ship hasn't sailed on the column size decision. I've been working with this very issue literally two minutes ago on my own project, decided I better drop in on this ticket again.

I'll give you a specific use-case where knowing the column size is important (and this is only just my particular case from tonight, this is a common occurrence with ETL software). I'm working on an app that moves data between heterogenous datasources (where the schemata are not known ahead of time, etc). Sometimes this means that there's a scratch DB in the middle where data munging/staging goes on. Let's say I've got oracle1.tblUser and postgres1.tblAddress, and there's a bunch of text fields on each, let's say email VARCHAR(64) and city VARCHAR(255). Right now, how I make this work is that I have a thin layer wrapping each and every SQL driver so that I can get access to the field type/size etc so that I can present it in a uniform manner to the worker (with go-mysql I made my own fork because for some inexplicable reason the driver doesn't export the field info).

Now, under the proposal above, yes I would know that a column is a text-ish field, but without knowing the size the best one can do is create a LONGTEXT col or such, even though the drivers have access to the fact that those text cols are of max size 64 and 255. If you're dealing with throughput of billions of rows, the performance difference between a VARCHAR(64) / VARCHAR(255) and a LONGTEXT does become significant (and in some databases, egregiously so).

Again, the col size is data that is ordinarily available to the driver. If for some reason the driver can't present that information, we could use -1 as a sentinel to indicate it's not available. Hell, you can make that the default in the sql layer. But please, at least make it possible for drivers to provide this information back up through the API layer to the caller. There's no downside to it.

@neilotoole
Copy link

@neilotoole neilotoole commented Sep 26, 2016

One more thing: should we not also include a method as follows?

func (ci ColumnInfo) Alias() string { ... }

This can default to ci.Name().

@neilotoole
Copy link

@neilotoole neilotoole commented Sep 26, 2016

OK, maybe just one more thing... let's not forget about:

func (ci ColumnInfo) IsPrimaryKey() bool { ... }

Although, that's not ideal either? I'd be more inclined to go this path:

func (ci ColumnInfo) Attrs() Attr
// or Attributes() to be verbose
// if ci.Attrs() & AttrPrimaryKey ... or AttrNullable, AttrIndex etc.

This would be more extensible/future-proof; implementations could choose how much (if any) detail to send back in the attributes, and would allow us to drop the ci.Nullable() method, which is really just a special case of col attributes.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Sep 26, 2016

@neilotoole I agree we need length, and possibly Scale and Precision.
I'm not familiar with a protocol that exposes Alias as a separate property then Name.
It is important to know that the design is chosen to allow adding additional method if the need arises.

@gopherbot
Copy link

@gopherbot gopherbot commented Sep 30, 2016

CL https://golang.org/cl/29961 mentions this issue.

@neilotoole
Copy link

@neilotoole neilotoole commented Oct 4, 2016

@kardianos I think you're probably correct about alias. But yes, length/size and scale/precision would be useful.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Oct 4, 2016

@neilotoole Size and length are included in the CL. Please review if you want to.

@neilotoole
Copy link

@neilotoole neilotoole commented Oct 6, 2016

@kardianos thanks, I just reviewed patch set 5.

@neilotoole
Copy link

@neilotoole neilotoole commented Oct 7, 2016

@kardianos @bradfitz

I had some time this week to look over the proposal. I'm generally ok with the proposed sql pkg API, but I think the driver pkg changes are going off in the wrong direction. I'm referencing driver.go PS7 (Gerrit).

TLDR: Replace the five new driver pkg interfaces with one ColumnTyper interface.


It seems to me that having these five separate interfaces is overly elaborate.
This is what is currently proposed (comments omitted):

package driver

type RowsColumnTypeScanType interface {
    Rows
    ColumnTypeScanType(index int) reflect.Type
}

type RowsColumnTypeDatabaseTypeName interface {
    Rows
    ColumnTypeDatabaseTypeName(index int) string
}

type RowsColumnTypeLength interface {
    Rows
    ColumnTypeLength(index int) (length int64, ok bool)
}

type RowsColumnTypeNullable interface {
    Rows
    ColumnTypeNullable(index int) (nullable, ok bool)
}

type RowsColumnTypePrecisionScale interface {
    Rows
    ColumnTypePrecisionScale(index int) (precision, scale int64, ok bool)
}

And this from the sql package:

package sql


func (rs *Rows) ColumnTypes() ([]ColumnType, error) {
    if rs.isClosed() {
        return nil, errors.New("sql: Rows are closed")
    }
    if rs.rowsi == nil {
        return nil, errors.New("sql: no Rows available")
    }
    return rowsColumnInfoSetup(rs.rowsi), nil
}

func rowsColumnInfoSetup(rowsi driver.Rows) []ColumnType {
    names := rowsi.Columns()

    list := make([]ColumnType, len(names))
    for i := range list {
        ci := &list[i]
        ci.name = names[i]

        if prop, ok := rowsi.(driver.RowsColumnTypeScanType); ok {
            ci.scanType = prop.ColumnTypeScanType(i)
        } else {
            ci.scanType = reflect.TypeOf(new(interface{})).Elem()
        }
        if prop, ok := rowsi.(driver.RowsColumnTypeDatabaseTypeName); ok {
            ci.databaseType = prop.ColumnTypeDatabaseTypeName(i)
        }
        if prop, ok := rowsi.(driver.RowsColumnTypeLength); ok {
            ci.length, ci.hasLength = prop.ColumnTypeLength(i)
        }
        if prop, ok := rowsi.(driver.RowsColumnTypeNullable); ok {
            ci.nullable, ci.hasNullable = prop.ColumnTypeNullable(i)
        }
        if prop, ok := rowsi.(driver.RowsColumnTypePrecisionScale); ok {
            ci.precision, ci.scale, ci.hasPrecisionScale = prop.ColumnTypePrecisionScale(i)
        }
    }
    return list
}

I suggest replacing those five driver interfaces with one interface:

package driver


type ColumnTyper interface {
  Rows
  ColumnTypes() ([]ColumnType, error)
}

The issues I see with this current proposal include:

  • First and foremost, why separate the interfaces at all? In practice, driver authors
    are not going to implement just a subset of this new API; what would the reasoning for that be?
  • As a consumer of the API, a partial implementation by a driver would be somewhat silly. The sort of use cases I'm thinking of that
    require this type of column metadata generally aren't satisfied by a partial implementation.
  • This increases the burden on the driver author who now has five methods they need to write test cases for, as opposed to one. Also it potentially complicates their implementation (will they need to keep data structures hanging around expecting subsequent calls for the next col etc?)
  • The interface and method names are unidiomatic.
  • This breaks the design symmetry between the sql and driver packages (as exemplified by the Rows.Columns() methods).
  • A single call to sql.Rows.ColumnTypes() results in 5 x [num cols] calls to the driver impl.
    That's potentially a great deal of calls (at Teradata I saw customer queries with hundreds of cols). Even if in practice this is not a huge perf burden, something about it seems wrong (an "API smell" if you will).
@neilotoole
Copy link

@neilotoole neilotoole commented Oct 7, 2016

If we were to go with the ColumnTyper approach: for safety, I (manually) verified that there is no exported ColumnTypes field or method on the struct implementing driver.Rows in any of the known driver packages, so this should be a non-breaking change.

List of drivers is from golang wiki. Each link is to the source for the struct that implements driver.Rows:

@neilotoole
Copy link

@neilotoole neilotoole commented Oct 7, 2016

For my own private project, I took almost this exact approach when hacking the sql API. My changes looked something like this (omitting the ColumnType type in both packages):

package driver

type ColumnTyper interface {
    ColumnTypes() ([]ColumnType, error)
}

EDIT FOR CLARITY : this copy-n-edit snippet below (which exposes the driver package) is a poor example: driver.ColumnType would need to be wrapped in sql.ColumnType.

package sql

func (rs *Rows) ColumnTypes() ([]driver.ColumnType, error) {
    if rs.closed {
        return nil, errors.New("sql: Rows are closed")
    }
    if rs.rowsi == nil {
        return nil, errors.New("sql: no Rows available")
    }

    if colTyper, ok := rs.rowsi.(driver.ColumnTyper); ok {
        return colTyper.ColumnTypes()
    }

    return nil, errors.New("sql: driver does not provide column type data")
}

In my own project I have already implemented the ColumnTyper API changes for Postgres (lib/pq), MySQL (go-sql-driver) and SQLite (mattn), so I know from first-hand implementation experience that this approach works just fine. And it seems less complicated than the proposed approach.

@kardianos
Copy link
Contributor

@kardianos kardianos commented Oct 7, 2016

First and foremost, why separate the interfaces at all? In practice, driver authors are not going to implement just a subset of this new API; what would the reasoning for that be?

The reason is for backwards compatibility and extensibility. If we need to add another property, it can be done. Also the driver package cannot reference the sql package, so we would need to copy the values from each driver.ColumnType to sql.ColumnType.

As a consumer of the API, a partial implementation by a driver would be somewhat silly. The sort of use cases I'm thinking of that require this type of column metadata generally aren't satisfied by a partial implementation.

Some drivers may not have all the information available. But I generally agree with this point.

This increases the burden on the driver author who now has five methods they need to write test cases for, as opposed to one. Also it potentially complicates their implementation (will they need to keep data structures hanging around expecting subsequent calls for the next col etc?)

I don't really buy this one. At the end of the day you are testing the same amount of properties.

This breaks the design symmetry between the sql and driver packages (as exemplified by the Rows.Columns() methods).

I don't understand this.

A single call to sql.Rows.ColumnTypes() results in 5 x [num cols] calls to the driver impl. That's potentially a great deal of calls (at Teradata I saw customer queries with hundreds of cols). Even if in practice this is not a huge perf burden, something about it seems wrong (an "API smell" if you will).

This is true. I'm not expecting this to be in a super hot loop. I also want to be able to work with tables with hundreds of columns ( http://www.jdetables.com/ sigh). In this case backwards compatibility and extensibility are the key design issues here.


Let's take a step back and impose the following restrictions on design:

  • ColumnType needs to be defined in the database/sql package.
  • If we want to add another property in the future, we need to be able to do so without
    breaking backwards compatibility.
  • We need to not break backwards compatibility on this change.

Thank you for the above links and the concrete counter proposal. Could you modify the proposal to meet the above, or describe how it meets the above? I agree that it meets the last point.

@neilotoole
Copy link

@neilotoole neilotoole commented Oct 7, 2016

The reason is for backwards compatibility and extensibility. If we need to add another property, it can be done.

Yeah, I take your point on the extensibility. I guess my assumption was that this proposal is the fix for the sql package, and that there won't be any changes coming ever again (in our lifetimes at least).

Also the driver package cannot reference the sql package, so we would need to copy the values from each driver.ColumnType to sql.ColumnType.

The example I gave above is from my own hack of the API for a private project where I didn't care about exposing the driver package (I'll add a comment to clarify this, probably should have omitted that example). But yes, I was imagining the two ColumnTypes, but rather than copying the values, the sql.ColumnType would wrap the driver.ColumnType, similar to what is done with Rows:

package sql

type Rows struct {
    dc          *driverConn // owned; must call releaseConn when closed to release
    releaseConn func(error)
    rowsi       driver.Rows
@neilotoole
Copy link

@neilotoole neilotoole commented Oct 7, 2016

I guess ^^^ would look something like this:

package sql

type ColumnType struct {
    cti driver.ColumnType
}

func (c *ColumnType) Name() string {
    return c.cti.Name()
}

func (c *ColumnType) Length() (length int64, ok bool) {
    return c.cti
}
// etc...

func (rs *Rows) ColumnTypes() ([]ColumnType, error) {
    if rs.closed {
        return nil, errors.New("sql: Rows are closed")
    }
    if rs.rowsi == nil {
        return nil, errors.New("sql: no Rows available")
    }

    if colTyper, ok := rs.rowsi.(driver.ColumnTyper); ok {

        drvCols, err := colTyper.ColumnTypes()

        if err != nil {
            sqlCols := make([]ColumnType, len(drvCols))
            for i := range drvCols {
                sqlCols[i] = ColumnType{drvCols[i]}
            }
            return sqlCols, nil
        }

        return nil, err
    }

    return nil, errors.New("sql: driver does not provide column type data")
}

Which does result in a bunch of allocations, hmmmn.

@neilotoole
Copy link

@neilotoole neilotoole commented Oct 7, 2016

@kardianos I'm coming around to your approach due to your very valid point about future extensibility (which wasn't a design consideration for me in my own hack).

Also, it's probably the case that my point about the amount of calls to the driver to construct the sql.ColumnType object is a bit of a red herring. In practice for most applications I imagine that sql.Rows.ColumnTypes() will likely only be invoked once per unique query, and the details of how that function is implemented will only have negligible overall impact on performance. I'll give it some more thought this weekend, but given that the key thing here is the API design for the sql package (as opposed to the driver package changes), I don't want to hold up the overall proposal. Thanks for taking the time to get back to me.

@quentinmit quentinmit added the Proposal label Oct 10, 2016
@quentinmit quentinmit modified the milestones: Proposal, Go1.8Maybe Oct 10, 2016
@gopherbot gopherbot closed this in 2a85578 Oct 18, 2016
@golang golang locked and limited conversation to collaborators Oct 18, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
10 participants
You can’t perform that action at this time.