Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgres: support a connection string DSN #435

Closed
wants to merge 3 commits into from

Conversation

gregwebs
Copy link

use DATABASE_DSN and DBMATE_DRIVER
The format of DATABASE_DSN is defined under connection string here: https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING

The DATABASE_URL is problematic because it is parsed to a URL in Go. This requires encoding the password for the URL which does not necessarily get decoded properly.

It is simpler to not alter the parameters to fit them into a url: that is what DATABASE_DSN does. Currently it will fail if a configuration parameter has a space. However, spaces are normally not used for any of the parameters. And that limitation can be fixed by taking on a DSN parser dependency.

Testing

Unit tests for DSN manipulation.
Manual tests were done to see that DATABASE_DSN works with dbmate status. I am using it for other commands now as well.

gregwebs and others added 2 commits April 20, 2023 20:37
use DATABASE_DSN and DBMATE_DRIVER
The format of DATABASE_DSN is defined under connection string here: https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING

The DATABASE_URL is problematic because it is parsed to a URL in Go. This requires encoding the password for the URL which does not necessarily get decoded properly.

It is simpler to not alter the parameters to fit them into a url: that is what DATABASE_DSN does.
Currently it will fail if a configuration parameter has a space. However, spaces are normally not used for any of the parameters. And that limitation can be fixed by taking on a DSN parser depdendency.
Copy link
Collaborator

@dossy dossy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly not a fan of introducing dbutil.DSN at all.

First, the acronym DSN in this context is short for "Data Source Name" which is just the name of the data source being defined. What is actually being stored in this structure is an ODBC connection string. A connection string can contain the key-value "DSN" which names the data source defined by the connection string.

Second, I dislike the internals of dbmate needing to know the difference between a database URL and this new DSN structure. It feels like this change is moving against maintaining separation of concerns and the Law of Demeter here.

My preference would be to only add one new function that can parse the ODBC-style connection string and return a JDBC-style database URL from it, and the rest of dbmate can remain unchanged, as it doesn't need to know anything about connection strings or how they're parsed. The rest of the dbmate implementation only needs to know how to handle database URLs.

Is there something that can currently only be specified in the database connection string format that cannot be encoded as a database URL? Could you provide an example?

@@ -196,6 +196,8 @@ $ dbmate -u "postgres://postgres@127.0.0.1:5432/myapp_test?sslmode=disable" up

The only advantage of using `dbmate -e TEST_DATABASE_URL` over `dbmate -u $TEST_DATABASE_URL` is that the former takes advantage of dbmate's automatic `.env` file loading.

Some drivers (currently just Postgres) support specifying a DSN string instead of a url by using `DATABASE_DSN` instead of `DATABBASE_URL`. When using `DATABASE_DSN` you can specify the driver with `DBMATE_DRIVER`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix: DATABBASE_URL -> DATABASE_URL

Comment on lines +259 to +260
if dsn := os.Getenv("DATABASE_DSN"); dsn != "" {
driver := os.Getenv("DBMATE_DRIVER")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of hard-coding the environment variable names here.

I would prefer new CLI args be introduced, with this as a suggested implementation:

--dsn value          specify the database DSN
--env-dsn value      specify an environment variable containing the database DSN (default: "DATABASE_DSN")
--driver value       specify the database driver when DSN is used
--env-driver value   specify an environment variable containing the database driver (default: "DBMATE_DRIVER")

if db.DatabaseURL == nil || db.DatabaseURL.Scheme == "" {
return nil, ErrInvalidURL
if db.DatabaseDSN != nil {
scheme = strings.ToLower(db.DatabaseDSN.Driver())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike converting the case at usage time, and would rather that be done in one single place, in dbutil.NewDSN().

postgresURL.Path = "postgres"
return sql.Open("postgres", postgresURL.String())
} else if drv.databaseDSN != nil {
_ = drv.databaseDSN.SetKey("dbname=", dsnPostgresDB)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the user wants to specify an alternative database to use in their session? Is this always overwriting dbname in the DSN with the string stored in dsnPostgresDB?

Comment on lines +424 to +425
// no schema specified with table name, try search path if available
schema = drv.takeParam("search_path")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to implementing DSN support, and should probably be in a separate PR.

Comment on lines +401 to +414
func (drv *Driver) takeParam(param string) string {
if drv.databaseURL != nil {
searchPath := strings.Split(drv.databaseURL.Query().Get(param), ",")
u := dbutil.MustParseURL(connectionString(drv.databaseURL))
query := u.Query()
query.Del(param)
u.RawQuery = query.Encode()
drv.databaseURL = u
return strings.TrimSpace(searchPath[0])
}

return drv.databaseDSN.DeleteKey("search_path")
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to implementing DSN support, and should probably be in a separate PR.

Comment on lines +41 to +49
const (
envUser = "PGUSER"
envPassword = "PGPASSWORD"
envDatabase = "PGDATABASE"
envHost = "PGHOST"
envPort = "PGPORT"
envSSLMode = "PGSSL"
)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble finding where in the code these constants are being used. If they're unused, then add them?

@gregwebs
Copy link
Author

@dossy thanks for attending to pull requests. I am still working on this pull request.

At a high level it seems that the point of this pull request is still not understood.
I don't see the point in doing a code review if the purpose is not understood.
As per the description:

The DATABASE_URL is problematic because it is parsed to a URL in Go. This requires encoding the password for the URL which does not necessarily get decoded properly.

I had an auto-generated (strong) db password that did not work due to the URL encoding issues (yes I did try encoding it). Whereas the key=value connection string just works. So parsing it into a URL would defeat the purpose. Another approach could be to not parse the url but just leave it as a string (Postgres doesn't actually want it to be url encoded), but this would be a big change that would require changing every driver. And it is still worse than key=value because '@' still needs to get escaped in a password. Whereas I am quite happy to not put any spaces in my configuration.

@amacneil
Copy link
Owner

I don't think I understand the problem this PR is solving. You need to encode special characters in the password using URL encoding, that should be very simple in any programming language. Or you could instead generate a random strong password using URL-safe characters. Either option is fine.

Any connection string format will need to escape special characters (e.g. if your password contains a space such as 3jrk foo=bar will it break your parser implementation?).

If you are having problems with URL encoding please post details and we can help you.

@dossy
Copy link
Collaborator

dossy commented Apr 21, 2023

[...] I am still working on this pull request.

If this pull request isn't ready for review, I suggest you mark it as such by converting it to a draft pull request.

I had an auto-generated (strong) db password that did not work due to the URL encoding issues (yes I did try encoding it). Whereas the key=value connection string just works. So parsing it into a URL would defeat the purpose. [...]

I would appreciate seeing an example password that cannot be used by dbmate when URL-encoded correctly.

As an experiment, I gave this a try:

# start a PostgreSQL 10 test server
$ docker run --rm --name postgres-test -d -p 5432:5432 -e POSTGRES_PASSWORD=postgres postgres:10
b0ce2a7d9cf4d7b80e5721e1ff5e32a82c9108a1c580d68b5cb5122a5b95ba94

# generate a password of all printable ASCII characters
$ printables=$(i=33; while [ $i -le 126 ]; do printf "\x$(printf %x $i)"; test $i -eq 79 && printf " "; i=$(( i + 1 )); done)

# confirm that our password contains what we think it does
$ echo $printables
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO PQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

# encrypt the password for PostgreSQL
$ printf "%s%s" "$printables" test | md5sum
16d8f2a726c35fe08c61fe9c780c6562  -

# create our test user with our all-printable-ASCII password
$ PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -c "CREATE USER test WITH ENCRYPTED PASSWORD 'md516d8f2a726c35fe08c61fe9c780c6562';"
CREATE ROLE

# confirm that we can connect to our test user with psql
$ PGPASSWORD="$printables" psql -h localhost -p 5432 -U test -d postgres -c "SELECT current_user;"
 current_user 
--------------
 test
(1 row)

# let's try connecting using database URL from dbmate with an incorrect password
$ docker run --rm --name dbmate --link postgres-test:db amacneil/dbmate --url "postgres://test:test@db:5432/postgres?sslmode=disable" status
Error: pq: password authentication failed for user "test"

# now, let's encode the password
$ export printables
$ node -e 'process.stdout.write(process.env.printables);'
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNO PQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
$ encoded="$(node -e 'process.stdout.write(encodeURIComponent(process.env.printables));')"
$ echo $encoded
!%22%23%24%25%26'()*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNO%20PQRSTUVWXYZ%5B%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D~

# and try again
$ docker run --rm --name dbmate --link postgres-test:db amacneil/dbmate --url "postgres://test:$encoded@db:5432/postgres?sslmode=disable" status
Error: could not find migrations directory `./db/migrations`

# okay, let's create a migration
$ mkdir -p db/migrations
$ printf -- "-- migrate:up\nselect current_user;\n-- migrate:down\n" > db/migrations/1_test.sql

# and use it
$ docker run --rm --name dbmate --link postgres-test:db --volume $(pwd)/db:/db amacneil/dbmate --url "postgres://test:$encoded@db:5432/postgres?sslmode=disable" status
[ ] 1_test.sql

Applied: 0
Pending: 1

# and run it
$ docker run --rm --name dbmate --link postgres-test:db --volume $(pwd)/db:/db amacneil/dbmate --url "postgres://test:$encoded@db:5432/postgres?sslmode=disable" up --verbose
Applying: 1_test.sql
Rows affected: 1
Writing: ./db/schema.sql

# shut down our test postgres instance
$ docker stop postgres-test

dbmate has no problem connecting to PostgreSQL with a password containing any of the printable ASCII characters in it, if the password is properly encoded.

One Go function that parses a ODBC-style connection string and produces a properly encoded database URL is all that's needed in order for dbmate to support ODBC-style connection strings.

@gregwebs
Copy link
Author

I was able to get the password encoding working (in the encoding process I accidentally introduced a newline). It's still a hassle though and key=value would simplify things since it's very rare to want a space, but very common to want a url encoded character in a password.

I will close this PR though since the maintainers don't like it. There are still a lot of other obviously good PRs that need to get merged- better to spend energy there.

@gregwebs gregwebs closed this Apr 21, 2023
@dossy
Copy link
Collaborator

dossy commented Apr 21, 2023

I do think adding --connection-string value and parsing the ODBC-style connection string and returning a well-formed database URL is a good idea, and would provide you the functionality that you were looking for in this PR.

The difficulty here is that, unfortunately, PostgreSQL connection strings do not follow the ODBC specification for connection strings. 😞

This means having a single connection string parsing implementation won't be possible, and it'll need to be per-driver, and we'll need something like --driver value to specify which driver the connection string is intended for.

I captured these details in a new issue so when someone does go to implement connection string support, these details aren't overlooked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants