Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement per-column type overrides. #123

Merged
merged 3 commits into from Nov 6, 2019

Conversation

inconshreveable
Copy link
Contributor

Often a user may want to override the Go type used for model
generation and query generation, but it's not a 1-to-1 mapping
of SQL types to Go type. Instead, the override to use depends
entirely on the column of a particular table. This commit adds the
functionality to override types of generated columns in models and query
parameters by adding a new 'overrides' configuration settings object
on a per-package basis.

This required tracking down a number of places where the column's Table
FQN wasn't properly propagated through to the the model or query
parameter Column. This change necessitated updating a bunch of tests to
match.

Lastly, I've added a test to prove the functionality works (in addition
to testing it on a large code base) and added a documentation section in
the README.

Copy link

@kyle-meter kyle-meter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the first pass. For some reason, the tests did not run. Could you rebase on master and push again? I accidentally merged one of your other changes without a test run and it broke the build.

Global and package override configuration should use the same format. The current global override format looks like this:

{
  "go_type": "uuid.UUID",
  "package": "github.com/gofrs/uuid",
  "postgres_type": "uuid"
}

I think we can take what you've added and combine it into

{
  "go_type": "github.com/gofrs/uuid.UUID",
  "table": "authors",
  "column": "id",
  "postgres_type": "uuid"
}

Table and column should be split out so that we can parse the table name as a FQN. The validation code will need to be more complex, but this means that you can specify the same overrides in different places.

Comment on lines 40 to 84
lastDot := strings.LastIndex(o.GoType, ".")
if lastDot == -1 {
return fmt.Errorf("Package override `go_type` specificier '%s' is not the proper format, expected 'package.type', e.g. 'github.com/segmentio/ksuid.KSUID'", o.GoType)
}
lastSlash := strings.LastIndex(o.GoType, "/")
if lastSlash == -1 {
return fmt.Errorf("Package override `go_type` specificier '%s' is not the proper format, expected 'package.type', e.g. 'github.com/segmentio/ksuid.KSUID'", o.GoType)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation is good, but we don't do it for the global overrides struct. Could we move it into a validation function and use it to check the other override sites?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, may have more discussion here after we decide how to unify types

}

func (o *ColumnOverride) Parse() error {
colParts := strings.Split(o.Column, ".")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The column identifier needs to support schema names. For example, the authors table in the library schema would be library.authors.id.


// set up column-based override test
o := ColumnOverride{
GoType: "example.com/pkg.CustomType",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a standard way to refer to types in Go packages? I haven't seen it used before.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not particularly well documented, but it is a convention used by a lot of tooling that deals with Go types e.g. guru: golang/go#20024 (comment), k8s: https://github.com/kubernetes/gengo/blob/master/types/types.go#L49

@inconshreveable
Copy link
Contributor Author

Thanks for the first pass. For some reason, the tests did not run. Could you rebase on master and push again? I accidentally merged one of your other changes without a test run and it broke the build.

yep, can rebase

Global and package override configuration should use the same format. The current global override format looks like this:

{
  "go_type": "uuid.UUID",
  "package": "github.com/gofrs/uuid",
  "postgres_type": "uuid"
}

I think we can take what you've added and combine it into

{
  "go_type": "github.com/gofrs/uuid.UUID",
  "table": "authors",
  "column": "id",
  "postgres_type": "uuid"
}

If we change this format, do we want to rev the config version number when we do this? Do we want to support old versions?

Table and column should be split out so that we can parse the table name as a FQN. The validation code will need to be more complex, but this means that you can specify the same overrides in different places.

seems unnecessary to separate column/table? we should just parse schema.table.column or if you specify just table.column we assume public and it becomes public.table.column?

@inconshreveable
Copy link
Contributor Author

last question: are you suggesting both that:

  • column-based overrides should be able to be specified at the global level?
  • AND type-based overrides should be able to be specified at the package level?

i agree that these are both laudable goals but it seems like something that could be pushed into a separate PR. i don't think they should be prerequisites

@kyleconroy
Copy link
Collaborator

seems unnecessary to separate column/table? we should just parse schema.table.column or if you specify just table.column we assume public and it becomes public.table.column?

Table names can actually have three parts catalog.schema.rel. Take a look at ParseRange and ParseList in catalog package. I'd like to keep that parsing consistent, hence the suggestion to split out table and column.

Another option would be to understand how PostgreSQL treats column identifiers. Maybe it allows catalog.schema.rel.column?

i agree that these are both laudable goals but it seems like something that could be pushed into a separate PR. i don't think they should be prerequisites

Yes, I am suggesting both. The reason is that I don't want two ways to do configure overrides, both with different options. A global override should be the same as a package override.

@kyleconroy
Copy link
Collaborator

If we change this format, do we want to rev the config version number when we do this? Do we want to support old versions?

I'm fine just breaking things at this point. The main users of the override functionality is you, me, and https://github.com/kevinburke

@inconshreveable
Copy link
Contributor Author

seems unnecessary to separate column/table? we should just parse schema.table.column or if you specify just table.column we assume public and it becomes public.table.column?

Table names can actually have three parts catalog.schema.rel. Take a look at ParseRange and ParseList in catalog package. I'd like to keep that parsing consistent, hence the suggestion to split out table and column.

Another option would be to understand how PostgreSQL treats column identifiers. Maybe it allows catalog.schema.rel.column?

i'd prefer the second option since it's less options for a user to think about and I don't think we lose any fidelity. rules would be:

x.y -> table.col (we assume default catalog, public schema)
x.y.z -> schema.table.col (we assume default catalog)
w.x.y.z -> catalog.schema.table.col (we assume nothing)

i agree that these are both laudable goals but it seems like something that could be pushed into a separate PR. i don't think they should be prerequisites
Yes, I am suggesting both. The reason is that I don't want two ways to do configure overrides, both with different options. A global override should be the same as a package override.

agreed. how about i just nuke package settings for the time being and move the column-based overrides to global settings for now then? that can be a separate PR if/when needed.

lastly, something we haven't addressed: we're going to need some additional validation that forces "one of" behavior that says "you can't specify both postgres_type and column" or we could separate them into separate objects e.g.

"overrides": {
  "types": [
    "postgres_type": "...",
    "go_type": "..."
  ],
  "columns": [
     "column": "...",
     "go_type": "..."
  ]
}

i think this latter proposal is cleaner, but it's more of a change

@kyleconroy
Copy link
Collaborator

i'd prefer the second option since it's less options for a user to think about and I don't think we lose any fidelity. rules would be: ...

Yeah, those rules make sense to me

lastly, something we haven't addressed: we're going to need some additional validation that forces "one of" behavior that says "you can't specify both postgres_type and column"

I prefer the validation approach.

Often a user may want to override the Go type used for model
generation and query generation, but it's not a 1-to-1 mapping
of SQL types to Go type. Instead, the override to use depends
entirely on the column of a particular table. This commit adds the
functionality to override types of generated columns in models and query
parameters by adding a new 'overrides' configuration settings object
on a per-package basis.

This required tracking down a number of places where the column's Table
FQN wasn't properly propagated through to the the model or query
parameter Column. This change necessitated updating a bunch of tests to
match.

Lastly, I've added a test to prove the functionality works (in addition
to testing it on a large code base) and added a documentation section in
the README.
@inconshreveable
Copy link
Contributor Author

@kyleconroy Rebased on master and all tests pass, I believe this is good to go. I've tested on a larger codebase as well where the overrides are being used and that all works well too.

go test ./...
?   	github.com/kyleconroy/sqlc/cmd/sqlc	[no test files]
ok  	github.com/kyleconroy/sqlc/internal/catalog	(cached)
?   	github.com/kyleconroy/sqlc/internal/cmd	[no test files]
ok  	github.com/kyleconroy/sqlc/internal/dinosql	(cached)
?   	github.com/kyleconroy/sqlc/internal/pg	[no test files]
?   	github.com/kyleconroy/sqlc/internal/postgres	[no test files]

The overrides config structure has been unified and I've added some more validation to make sure it's harder to use it incorrectly.

I decided to keep package-level settings after all, which means there's a bonus new feature where it's possible to override a postgres type on a per-package basis.

README docs have been updated as well to reflect the new functionality.

@kyleconroy kyleconroy merged commit c3f1473 into sqlc-dev:master Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants