pgx v5 beta 5 available for testing #1273
Hello, I just tested beta1 with SFTPGo by replacing lib/pq with pgx; all tests pass, but I'm a little worried about the increase in binary size. I'm using pgx as a database/sql driver:
Can we at least have a binary size increase comparable to alpha.1/.2? I'm using pgx as a database/sql driver since SFTPGo also supports MySQL/MariaDB and SQLite with the same code, so no pgx-specific features are used. Just for reference, here are my changes:
Thanks for your hard work, really appreciated!
Glad it worked! I've looked into the size increase and have found where it is occurring, but it's not obvious to me what can or should be done about it. It is the combination of a few things.

PostgreSQL supports a simple and an extended protocol with which to execute queries, and the extended protocol has several ways it can be used, so pgx supports multiple query execution modes. In pgx's normal mode of operation it uses the extended protocol and gets the statement description before executing a statement. This means it knows the type of every parameter. This enables one of pgx's most convenient features: you can add support for encoding and decoding 3rd party types without modifying those types, e.g. https://github.com/shopspring/decimal can be used directly without a wrapper type. This is only possible because pgx knows the underlying PostgreSQL type.

However, pgx also supports the simple protocol, which never knows the PostgreSQL parameter types (and one of the extended protocol modes lacks parameter type knowledge as well). These modes are necessary for people behind connection pools such as PgBouncer that do not support prepared statements. In these modes, pgx needs some way of determining what type to encode the parameter into. Thus: https://pkg.go.dev/github.com/jackc/pgx/v5@v5.0.0-beta.1/pgtype#Map.RegisterDefaultPgType. It takes a Go value and associates it with the name of a PostgreSQL type, e.g. m.RegisterDefaultPgType(int32(0), "int4"). This builds a map that is used to decide how to encode parameters.

pgx also supports arrays. PostgreSQL arrays are extremely flexible. They support multidimensional arrays, arbitrary lower bounds, and NULL values. However, that is inconvenient for applications that don't need it (which is presumably the overwhelming majority). So pgx provides generic types, and pgx needs to register all those generic types for the execution modes that lack known parameter types.
So it does this for each type:

func registerDefaultPgTypeVariants[T any](m *Map, name string) {
arrayName := "_" + name
var value T
m.RegisterDefaultPgType(value, name) // T
m.RegisterDefaultPgType(&value, name) // *T
var sliceT []T
m.RegisterDefaultPgType(sliceT, arrayName) // []T
m.RegisterDefaultPgType(&sliceT, arrayName) // *[]T
var slicePtrT []*T
m.RegisterDefaultPgType(slicePtrT, arrayName) // []*T
m.RegisterDefaultPgType(&slicePtrT, arrayName) // *[]*T
var arrayOfT Array[T]
m.RegisterDefaultPgType(arrayOfT, arrayName) // Array[T]
m.RegisterDefaultPgType(&arrayOfT, arrayName) // *Array[T]
var arrayOfPtrT Array[*T]
m.RegisterDefaultPgType(arrayOfPtrT, arrayName) // Array[*T]
m.RegisterDefaultPgType(&arrayOfPtrT, arrayName) // *Array[*T]
var flatArrayOfT FlatArray[T]
m.RegisterDefaultPgType(flatArrayOfT, arrayName) // FlatArray[T]
m.RegisterDefaultPgType(&flatArrayOfT, arrayName) // *FlatArray[T]
var flatArrayOfPtrT FlatArray[*T]
m.RegisterDefaultPgType(flatArrayOfPtrT, arrayName) // FlatArray[*T]
m.RegisterDefaultPgType(&flatArrayOfPtrT, arrayName) // *FlatArray[*T]
}

This means that a lot of code is generated for each type pgx registers (59 at present) regardless of whether the application actually uses it. Removing these registrations saves 2.3 MB (about 1.9 MB of which is array support).

pgx tries very hard to have the same behavior regardless of the query execution mode and succeeds in almost all cases. The only way I see to save that 2.3 MB is to move this code to a different package and require the application to perform some additional step if it planned to use one of those non-default query execution modes. It would probably be less than 10 additional lines of code for the affected applications, but it bothers me for it to not "just work" out of the box. But 2.3 MB wasted for most users bothers me too... I don't know...
@jackc thanks for your detailed explanation and hard work. I want to add a little more context.
Obviously the binary size does not necessarily affect the runtime speed; however, to avoid comments like this, I pay attention to binary size. The latest versions of Go (apart from 1.19) have greatly reduced the size of the SFTPGo binary, so the developers of Go seem to be paying attention too.
For example I added SQLState to
I think that pgx should "just work" out of the box too. What do you think about adding a build tag? I mean something like this: https://github.com/drakkan/sftpgo/blob/main/internal/dataprovider/pgsql_disabled.go#L15, so users that don't need these features can decide to add the build tag and reduce the final binary size. This method works fine in SFTPGo. Default build:
minimal build:
Thanks!
Thanks! I confirm that it works:
We still have about 2MB more than lib/pq.
Just tagged. Here are the changes since:
BTW, are there huge performance wins between v4 and v5?
FYI: We ran into this bug introduced in v5: #1281
Can there be an optional method that blocks until the pool has at least MinConns? We don't want to start our application accepting requests until our pool has at least MinConns.
@cristaloleg There may be minor changes one way or another, but performance should be similar in most cases.
@jameshartig We could, but this can be done easily in application code with a function that calls Acquire n times and doesn't Release them until it has acquired min conns.
If I'm understanding the code correctly, that would result in more than MinConns being created, since createIdleResources creates MinConns asynchronously, but while that's happening, Acquire could also create connections if it's called before createIdleResources finishes or before the connections are ready.
It might; I'd have to look at the code to be sure... but I don't see how that would really matter. When it is done, the actual number of connections will be >= MinConns and <= MaxConns. If you really wanted to be sure you didn't create extra connections you could loop over
Hi @jackc, regarding the new type system, what's the best implementation flow for handling cases like the following? An example query would be:

where id is a uuid.

It also appears that scanning a JSON value that is empty results in a scan fault. Example:

select data->'property' from table;

if err := rows.Scan(&item.Property); err != nil {
    return nil, err
}

error:

This seems to extend into many locations which used to handle null values but no longer do. Another example is:

error:
Regarding the NULL scanning issue, there are several possible solutions:
row := db.QueryRow(ctx, "select deleted_at from table where id = $1", id)
var timeStamp time.Time
err := row.Scan(&timeStamp)
That is the same behavior as before. I suspect the JSON example is the same issue -- trying to scan a SQL NULL.
@jackc Thank you for getting back to me. I did have a quick look into wrapping. Additionally, I can confirm regarding the second point that this did work in v4. We did try to upgrade to v4, but the new memory pinning caused us a large amount of memory allocation; in a production environment, not rotating the connections caused memory crashes. If this is not meant to work and should have errored in v4, the error is not bubbling up.
That's not what my test shows:

package main
import (
"context"
"log"
"os"
"time"
"github.com/jackc/pgx/v4"
)
func main() {
ctx := context.Background()
conn, err := pgx.Connect(ctx, os.Getenv("DATABASE_URL"))
if err != nil {
log.Fatal(err)
}
defer conn.Close(ctx)
var timeStamp time.Time
err = conn.QueryRow(ctx, "select null::timestamptz").Scan(&timeStamp)
if err != nil {
log.Fatal(err)
}
log.Println(timeStamp)
}
@jackc I have some repeatable examples. I should have been more clear, as I was focusing on JSON values, sorry. This might now be by design, but it requires a lot of annoying tweaks to code.

JSON Parsing

The following example works in V4 & V3 but no longer works in V5. The below example shows how JSON values are now returned.

type Example struct {
id int
stub bool
timestamp time.Time
}
func main() {
var example Example
query := "select id, data->'stub', data->'SourceTimestamp' from example where id=$1"
rows := conn.QueryRow(ctx, query, 1)
scanErr := rows.Scan(&example.id, &example.stub, &example.timestamp)
must(scanErr)
}

Schema

create table example
(
data jsonb,
id serial
constraint example_pk
primary key
);
create unique index example_id_uindex
on example (id);
@stephensli Thank you. That example was very helpful. I was able to step through the code.
This leads to this behavior:

b := true
err := json.Unmarshal([]byte("null"), &b)
fmt.Println(b, err) // => true <nil>

I would have expected an error, or for b to be set to false. Frankly, this strikes me as wrong behavior, especially in the context of scanning rows. This means that if you were reading and processing row by row into a single record, a value from one row could leak into the next row -- e.g. in your structure, if row 1 had a value and row 2's was JSON null, row 1's value would carry over. So I'm inclined to say this behavior should change... Not really sure...
Hi @jackc, I ran into an issue while upgrading to the latest v5 beta. I have a struct that looks something like this:

type RPS struct {
ID string `db:"id"`
Name string `db:"name"`
Timer time.Duration `db:"timer"`
MinEntry decimal.Decimal `db:"min_entry"` // this is using the shopspring decimal lib
}

Since upgrading to v5 I get this error on Insert:
@buni looks like you are missing the adapter: https://github.com/jackc/pgx-shopspring-decimal. You can set it up with:

import (
	"context"

	shop "github.com/jackc/pgx-shopspring-decimal"
	"github.com/jackc/pgx/v5"
)

func AfterConnectHook(_ context.Context, conn *pgx.Conn) error {
	shop.Register(conn.TypeMap())
	return nil
}
@stephensli and @buni As it happens, there was a common fix that solved the encoding issue for both cases.

@stephensli Regarding the json / NULL issue: after considering it more over the last couple of days, I have come to a conclusion about the new behavior.
Just tagged. Here are the changes since:
Thank you for the update, Jack!
Hello @jackc, thanks for your work. I'm trying to update pgxmock to support pgx/v5. The issue is:

// CommandTag is the status text returned by PostgreSQL for a query.
type CommandTag struct {
s string
}

Thus I cannot init the result inside my library to mock it (nor can I create a result expectation to compare them later), e.g.:

func NewResult(op string, rowsAffected int64) pgconn.CommandTag {
return pgconn.CommandTag{fmt.Sprint(op, rowsAffected)}
}

I would suggest either declaring CommandTag as:

// CommandTag is the status text returned by PostgreSQL for a query.
type CommandTag struct {
S string
}

or providing a public constructor (it is currently private):

func (pgConn *PgConn) MakeCommandTag(buf []byte) CommandTag {
return CommandTag{s: string(buf)}
}

Kind regards
Useful for mocking and testing. #1273 (comment)
@pashagolub Fair enough. I added the function.
Just tagged. Here are the changes since:

Overall, this is a very small change since beta 3. However, any change to tricky concurrent code like the connection pool is worth thorough testing. Assuming no further issues arise, this will be the final beta release.
Not sure if this is a bug or a feature. Consider this:

type S struct{
F1 int
F2 string
}
type SuperS struct{
S
F3 bool
}

Now, if I'm trying to:

rows, _ := pge.ConfigDb.Query(ctx, "select 1, 'foo', true")
dest, err := pgx.CollectRows(rows, pgx.RowToStructByPos[SuperS])

I got an error. Should pgx unwrap embedded structs?

UPD: this works:

rows, _ := pge.ConfigDb.Query(ctx, "select ROW(1, 'foo'), true")
dest, err := pgx.CollectRows(rows, pgx.RowToStructByPos[SuperS])

UPD2:

type S struct{
f1 int `db:"int_field"`
f2 string
}

UPD3:
As you found, using ROW() works. I don't think it is possible to do what you originally suggested, because somehow we would need to know whether to scan the first field into the embedded struct or the outer struct. Also, I'm assuming the private fields are just a mistake in the example code -- obviously they would need to be public for any mapping based on reflection to work at all.
Most other libraries (mongo, json, etc.) handle this by scanning into f1, because S is embedded. If it was a named field, it would be treated as nested instead.
Yes, my bad. Of course they should be public. I was shrinking real-life code to a minimal viable example. :) Thanks for noticing.
Oh, that's interesting. Hadn't thought about that distinction. Surprised to find out that is what the json library does too -- more magical than I would have expected. Just added support for this in ee2622a. Didn't really want to make a change this close to final release, but since changing it later would be a breaking change, it seemed worthwhile.
Just tagged. The only change is the fix above.
hey @jackc, thanks a lot for your fix. I want to propose RowToStructByName. Here is the implementation, followed by tests with full coverage:

// RowToStructByName returns a T scanned from row. T must be a struct. T must have the same number of named public fields as row
// has fields. The row and T fields will be matched by name.
func RowToStructByName[T any](row pgx.CollectableRow) (T, error) {
var value T
err := row.Scan(&namedStructRowScanner{ptrToStruct: &value})
return value, err
}
// RowToAddrOfStructByName returns the address of a T scanned from row. T must be a struct. T must have the same number of
// named public fields as row has fields. The row and T fields will be matched by name.
func RowToAddrOfStructByName[T any](row pgx.CollectableRow) (*T, error) {
var value T
err := row.Scan(&namedStructRowScanner{ptrToStruct: &value})
return &value, err
}
type namedStructRowScanner struct {
ptrToStruct any
}
func (rs *namedStructRowScanner) ScanRow(rows pgx.Rows) error {
dst := rs.ptrToStruct
dstValue := reflect.ValueOf(dst)
if dstValue.Kind() != reflect.Ptr {
return fmt.Errorf("dst not a pointer")
}
dstElemValue := dstValue.Elem()
scanTargets, err := rs.appendScanTargets(dstElemValue, nil, rows.FieldDescriptions())
if err != nil {
return err
}
for i, t := range scanTargets {
if t == nil {
return fmt.Errorf("struct doesn't have corresponding row field %s", rows.FieldDescriptions()[i].Name)
}
}
return rows.Scan(scanTargets...)
}
const structTagKey = "db"
func fieldPosByName(fldDescs []pgconn.FieldDescription, field string) (i int) {
i = -1
for i, desc := range fldDescs {
if strings.EqualFold(desc.Name, field) {
return i
}
}
return
}
func (rs *namedStructRowScanner) appendScanTargets(dstElemValue reflect.Value, scanTargets []any, fldDescs []pgconn.FieldDescription) ([]any, error) {
var err error
dstElemType := dstElemValue.Type()
if scanTargets == nil {
scanTargets = make([]any, len(fldDescs))
}
for i := 0; i < dstElemType.NumField(); i++ {
sf := dstElemType.Field(i)
if sf.PkgPath != "" && !sf.Anonymous {
// Field is unexported, skip it.
continue
}
// Handle anonymous struct embedding, but do not try to handle embedded pointers.
if sf.Anonymous && sf.Type.Kind() == reflect.Struct {
scanTargets, err = rs.appendScanTargets(dstElemValue.Field(i), scanTargets, fldDescs)
if err != nil {
return nil, err
}
} else {
dbTag, dbTagPresent := sf.Tag.Lookup(structTagKey)
if dbTagPresent {
dbTag = strings.Split(dbTag, ",")[0]
}
if dbTag == "-" {
// Field is ignored, skip it.
continue
}
colName := dbTag
if !dbTagPresent {
colName = sf.Name
}
fpos := fieldPosByName(fldDescs, colName)
if fpos == -1 || fpos >= len(scanTargets) {
return nil, fmt.Errorf("cannot find field %s in returned row", colName)
}
scanTargets[fpos] = dstElemValue.Field(i).Addr().Interface()
}
}
return scanTargets, err
}

Tests:

package row_test
import (
"context"
"testing"
"github.com/jackc/pgx/v5"
"github.com/stretchr/testify/assert"
)
func TestRowToStructByNameEmbeddedStruct(t *testing.T) {
type Name struct {
Last string `db:"last_name"`
First string `db:"first_name"`
}
type person struct {
Ignore bool `db:"-"`
unexported bool
Name
Age int32
}
ConfigDb := <init the database connection>
ctx := context.Background()
rows, _ := ConfigDb.Query(ctx, `select 'John' as first_name, 'Smith' as last_name, n as age from generate_series(0, 9) n`)
slice, err := pgx.CollectRows(rows, pgx.RowToStructByName[person])
assert.NoError(t, err)
assert.Len(t, slice, 10)
for i := range slice {
assert.Equal(t, "Smith", slice[i].Name.Last)
assert.Equal(t, "John", slice[i].Name.First)
assert.EqualValues(t, i, slice[i].Age)
}
// check missing fields in a returned row
rows, _ = ConfigDb.Query(ctx, `select 'Smith' as last_name, n as age from generate_series(0, 9) n`)
_, err = pgx.CollectRows(rows, pgx.RowToStructByName[person])
assert.ErrorContains(t, err, "cannot find field first_name in returned row")
// check missing field in a destination struct
rows, _ = ConfigDb.Query(ctx, `select 'John' as first_name, 'Smith' as last_name, n as age, null as ignore from generate_series(0, 9) n`)
_, err = pgx.CollectRows(rows, pgx.RowToAddrOfStructByName[person])
assert.ErrorContains(t, err, "struct doesn't have corresponding row field ignore")
}
@pashagolub Yes, I'm interested in that feature. But I don't want to add any new features before the release of v5.0.0.
The only changes since the previous beta:
I'm pleased to announce the first beta release of pgx v5 is available.
The code is in branch v5-dev. There are a significant number of changes. Check out the changelog to see what's new.
Try it with:
If no major issues are reported I plan to release v5.0.0 in September.

There have only been minor changes since v5.0.0-alpha.5:
- pgxpool.NewConfig() renamed to NewWithConfig()
- pgxpool.Pool.Reset() (and prerequisite puddle update)
- sslpassword support in pgconn from pgx v4