dat (Data Access Toolkit) is a fast, lightweight and intuitive Postgres library for Go. dat likes SQL.
Highlights
- Ordinal placeholders - friendlier than ?
  conn.SQL(`SELECT * FROM people WHERE state = $1`, "CA").Exec()
- Intuitive - looks like SQL
  err := conn.
  Select("id, user_name").
  From("users").
  Where("id = $1", id).
  QueryStruct(&user)
- Performant
  - ordinal placeholder logic has been optimized to be nearly as fast as ? placeholders
  - dat can interpolate queries locally before sending them to the server
import (
"database/sql"
"fmt"
"github.com/mgutz/dat"
"github.com/mgutz/dat/sqlx-runner"
_ "github.com/lib/pq"
)
// global connection (pooling provided by SQL driver)
var conn *runner.Connection
func init() {
// create a normal database connection through database/sql
db, err := sql.Open("postgres", "dbname=dat_test user=dat password=!test host=localhost sslmode=disable")
if err != nil {
panic(err)
}
// set this to enable interpolation
dat.EnableInterpolation = true
// set to log SQL, etc
dat.SetVerbose(false)
// set to check things like sessions closing.
// Should be disabled in production/release builds.
dat.Strict = false
conn = runner.NewConnection(db, "postgres")
}
type Post struct {
ID int64 `db:"id"`
Title string `db:"title"`
Body string `db:"body"`
UserID int64 `db:"user_id"`
State string `db:"state"`
UpdatedAt dat.NullTime `db:"updated_at"`
CreatedAt dat.NullTime `db:"created_at"`
}
func main() {
var post Post
err := conn.
Select("id, title").
From("posts").
Where("id = $1", 13).
QueryStruct(&post)
if err != nil {
panic(err)
}
fmt.Println("Title", post.Title)
}
Query Builder
var posts []*Post
err := conn.
Select("title", "body").
From("posts").
Where("created_at > $1", someTime).
OrderBy("id ASC").
Limit(10).
QueryStructs(&posts)
Plain SQL
conn.SQL(`
SELECT title, body
FROM posts WHERE created_at > $1
ORDER BY id ASC LIMIT 10`,
someTime,
).QueryStructs(&posts)
Note: dat does not clean the SQL string, thus any extra whitespace is transmitted to the database.
In practice, SQL is easier to write with backticks. Indeed, the reason this library exists is my dissatisfaction with other SQL builders introducing another domain language or AST-like expressions.
Query builders shine when dealing with data transfer objects, records (input structs).
Query then scan result to struct(s)
var post Post
err := sess.
Select("id, title, body").
From("posts").
Where("id = $1", id).
QueryStruct(&post)
var posts []*Post
err = sess.
Select("id, title, body").
From("posts").
Where("id > $1", 100).
QueryStructs(&posts)
Query scalar values or a slice of values
var n int64
conn.SQL("SELECT count(*) FROM posts WHERE title=$1", title).QueryScalar(&n)
var ids []int64
conn.SQL("SELECT id FROM posts", title).QuerySlice(&ids)
Control which columns get inserted or updated when processing external data
// userData came in from http.Handler, prevent them from setting protected fields
conn.InsertInto("payments").
Blacklist("id", "updated_at", "created_at").
Record(userData).
Returning("id").
QueryScalar(&userData.ID)
// ensure session user can only update his information
conn.Update("users").
SetWhitelist(user, "user_name", "avatar", "quote").
Where("id = $1", session.UserID).
Exec()
applicable when dat.EnableInterpolation == true
Simpler IN queries which expand correctly
ids := []int64{10,20,30,40,50}
b := conn.SQL("SELECT * FROM posts WHERE id IN $1", ids)
// b.MustInterpolate() == "SELECT * FROM posts WHERE id IN (10,20,30,40,50)"
dat was designed to have clear separation between SQL builders and query execers. This is why the runner is in its own package.
- sqlx-runner - based on sqlx
Use Returning and QueryStruct to insert and update struct fields in one trip.
post := Post{Title: "Swith to Postgres", State: "open"}
err := conn.
InsertInto("posts").
Columns("title", "state").
Values("My Post", "draft").
Returning("id", "created_at", "updated_at").
QueryStruct(&post)
Use Blacklist and Whitelist to control which record (input struct) fields are inserted.
post := Post{Title: "Go is awesome", State: "open"}
err := conn.
InsertInto("posts").
Blacklist("id", "user_id", "created_at", "updated_at").
Record(post).
Returning("id", "created_at", "updated_at").
QueryStruct(&post)
// use wildcard to include all columns
err := sess.
InsertInto("posts").
Whitelist("*").
Record(post).
Returning("id", "created_at", "updated_at").
QueryStruct(&post)
Insert Multiple Records
// create builder
b := conn.InsertInto("posts").Columns("title")
// add some new posts
for i := 0; i < 3; i++ {
b.Record(&Post{Title: fmt.Sprintf("Article %d", i)})
}
// OR (this is more efficient as it does not do any reflection)
for i := 0; i < 3; i++ {
b.Values(fmt.Sprintf("Article %d", i))
}
// execute statement
_, err := b.Exec()
var other Post
err = conn.
Select("id, title").
From("posts").
Where("id = $1", post.ID).
QueryStruct(&other)
Use Returning to fetch columns updated by triggers. For example, an update trigger on the "updated_at" column:
err = conn.
Update("posts").
Set("title", "My New Title").
Set("body", "markdown text here").
Where("id = $1", post.ID).
Returning("updated_at").
QueryScalar(&post.UpdatedAt)
applicable when dat.EnableInterpolation == true
To reset columns to their default DDL value, use DEFAULT. For example, to reset payment_type:
res, err := conn.
Update("payments").
Set("payment_type", dat.DEFAULT).
Where("id = $1", 1).
Exec()
Use SetBlacklist and SetWhitelist to control which fields are updated.
// create blacklists for each of your structs
blacklist := []string{"id", "created_at"}
p := paymentStructFromHandler
err := conn.
Update("payments").
SetBlacklist(p, blacklist...).
Where("id = $1", p.ID).
Exec()
Use a map of attributes
attrsMap := map[string]interface{}{"name": "Gopher", "language": "Go"}
result, err := conn.
Update("developers").
SetMap(attrsMap).
Where("language = $1", "Ruby").
Exec()
result, err = conn.
DeleteFrom("posts").
Where("id = $1", otherPost.ID).
Limit(1).
Exec()
All queries are made in the context of a session, which is acquired from the underlying SQL driver's pool.
For one-off operations, use a Connection directly:
// a global connection usually created in `init`
var conn *runner.Connection
conn = runner.NewConnection(db, "postgres")
err := conn.SQL(...).QueryStruct(&post)
For multiple operations, create a session. Note that a session is really a transaction due to database/sql connection pooling. Session.AutoCommit() or Session.AutoRollback() MUST be called:
func PostsIndex(rw http.ResponseWriter, r *http.Request) {
sess := conn.NewSession()
defer sess.AutoRollback()
// Do queries with the session
var post Post
err := sess.Select("id, title").
From("posts").
Where("id = $1", post.ID).
QueryStruct(&post)
if err != nil {
// `defer AutoRollback()` is used, no need to rollback on error
rw.WriteHeader(500)
return
}
// do more queries with session ...
// MUST commit or AutoRollback() will rollback
sess.Commit()
}
applicable when dat.EnableInterpolation == true
dat provides often-used constants for SQL statements:
- dat.DEFAULT - inserts DEFAULT
- dat.NOW - inserts NOW()
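For example, dat.NOW can be inlined wherever a value is expected. A minimal sketch, assuming a posts table with a created_at column:
// interpolates to: INSERT INTO posts (title, created_at) VALUES ('Hello', NOW())
conn.InsertInto("posts").
Columns("title", "created_at").
Values("Hello", dat.NOW).
Exec()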
UnsafeStrings and constants will panic unless dat.EnableInterpolation == true
To define SQL constants, use UnsafeString
const CURRENT_TIMESTAMP = dat.UnsafeString("NOW()")
conn.SQL("UPDATE table SET updated_at = $1", CURRENT_TIMESTAMP)
UnsafeString is exactly that, UNSAFE. If you must use it, create a constant and NEVER use UnsafeString directly as an argument. This is asking for a SQL injection attack:
conn.SQL("UPDATE table SET updated_at = $1", dat.UnsafeString(someVar))
Load scalar and slice values.
var id int64
var userID string
err := conn.
Select("id", "user_id").From("posts").Limit(1).QueryScalar(&id, &userID)
var ids []int64
err = conn.Select("id").From("posts").QuerySlice(&ids)
// Columns are mapped to fields breadth-first
type Post struct {
ID int64 `db:"id"`
Title string `db:"title"`
User *struct {
ID int64 `db:"user_id"`
}
}
var post Post
err := conn.
Select("id, title, user_id").
From("posts").
Limit(1).
QueryStruct(&post)
// dat.Null* types serialize to JSON properly
post := Post{ID: 1, Title: "Test Title"}
jsonBytes, err := json.Marshal(&post)
fmt.Println(string(jsonBytes)) // {"id":1,"title":"Test Title","created_at":null}
// Start transaction
tx, err := conn.Begin()
if err != nil {
return err
}
// safe to call tx.Rollback() or tx.Commit() when deferring AutoCommit()
defer tx.AutoCommit()
// AutoRollback() is also available if you would rather Commit() at the end
// and not deal with Rollback on every error.
// Issue statements that might cause errors
res, err := tx.
Update("posts").
Set("state", "deleted").
Where("deleted_at IS NOT NULL").
Exec()
if err != nil {
tx.Rollback()
return err
}
TL;DR: Interpolation avoids prepared statements and argument processing.
Interpolation is DISABLED by default. Set dat.EnableInterpolation = true to enable.
dat can interpolate locally to inline query arguments. Let's start with a normal SQL statement with arguments:
db.Exec(
"INSERT INTO t (a, b, c, d) VALUES ($1, $2, $3, $4)",
1, 2, 3, 4,
)
The driver first asks the database to create a prepared statement for the INSERT statement, then it sends the arguments with the prepared statement to the database.
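Written out with database/sql's explicit prepare API, those two round trips look roughly like this (a sketch of what the driver does on your behalf):
// round trip 1: the server parses and plans the statement
stmt, err := db.Prepare("INSERT INTO t (a, b, c, d) VALUES ($1, $2, $3, $4)")
if err != nil {
return err
}
defer stmt.Close()
// round trip 2: the bound arguments are sent for execution
_, err = stmt.Exec(1, 2, 3, 4)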
In contrast, dat can interpolate the statement locally, resulting in a SQL statement that often has no arguments. The code above results in this interpolated SQL:
"INSERT INTO t (a, b, c, d) VALUES (1, 2, 3, 4)"
Some of the reasons you might want to use interpolation:
- Performance improvement
- Debugging is simpler with interpolated SQL
- Use SQL constants like NOW and DEFAULT
- Expand placeholders with slice values: $1 => (1, 2, 3)
[]byte, []*byte and any unhandled values are passed through to the driver when interpolating.
Postgres 9.1+ does not allow any escape sequences by default. See String Constants with C-style Escapes. In short, all backslashes are treated literally, not as escape sequences.
It's rare to need backslashes to represent special characters in user input. Do you trust users to enter C-like expressions? dat only escapes apostrophes to double apostrophes, eg "Go's world" becomes 'Go''s world'.
As an added safety measure, dat checks the Postgres standard_conforming_strings setting value on a new connection when dat.EnableInterpolation == true. If standard_conforming_strings != "on", you should either set it to "on" or disable interpolation. dat will panic if you try to use interpolation with an unsafe setting.
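You can also verify the setting yourself with a plain query; a sketch using the QueryScalar API shown earlier:
var scs string
err := conn.SQL("SHOW standard_conforming_strings").QueryScalar(&scs)
if err == nil && scs != "on" {
// either run SET standard_conforming_strings = 'on' or leave interpolation off
dat.EnableInterpolation = false
}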
Here is a comment from the lib/pq connection source, which was prompted by me asking why Python's psycopg2 was so much faster in my benchmarks a year or so back:
// Check to see if we can use the "simpleExec" interface, which is
// *much* faster than going through prepare/exec
if len(args) == 0 {
// ignore commandTag, our caller doesn't care
r, _, err := cn.simpleExec(query)
return r, err
}
That snippet bypasses the prepare/exec roundtrip to the database.
Keep in mind that prepared statements are only valid for the current session, and unless the same query is executed MANY times in the same session there is little benefit in using prepared statements other than protecting against SQL injections. See Interpolation Safety below.
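If you do execute the same statement many times, preparing it once with plain database/sql recoups the cost; a sketch (ids here is a hypothetical slice of post IDs):
// prepare once, execute many times; database/sql transparently
// re-prepares on other pooled connections when needed
stmt, err := db.Prepare("SELECT title FROM posts WHERE id = $1")
if err != nil {
return err
}
defer stmt.Close()
for _, id := range ids {
var title string
if err := stmt.QueryRow(id).Scan(&title); err != nil {
return err
}
}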
Interpolation also offloads database workload to your application servers. The database does less work and there is less network chatter when interpolation is performed locally. It's usually much more cost effective to add application servers than to vertically scale a database server.
- Dat2 - mgutz/dat runner with 2 args
- Sql2 - database/sql with 2 args
- Sqx2 - jmoiron/sqlx with 2 args
Replace 2 with 4, 8 for variants of argument benchmarks. All source is under sqlx-runner/benchmark*
This benchmark compares the time to execute an interpolated SQL statement with zero args against executing the same SQL statement with args.
BenchmarkExecSQLDat2 5000 208345 ns/op 280 B/op 10 allocs/op
BenchmarkExecSQLSql2 5000 298789 ns/op 881 B/op 30 allocs/op
BenchmarkExecSQLSqx2 5000 296948 ns/op 881 B/op 30 allocs/op
BenchmarkExecSQLDat4 5000 210759 ns/op 296 B/op 10 allocs/op
BenchmarkExecSQLSql4 5000 306558 ns/op 978 B/op 35 allocs/op
BenchmarkExecSQLSqx4 5000 305569 ns/op 978 B/op 35 allocs/op
The logic is something like this:
// already interpolated
for i := 0; i < b.N; i++ {
conn.Exec("INSERT INTO t (a, b, c, d) VALUES (1, 2, 3, 4)")
}
// not interpolated
for i := 0; i < b.N; i++ {
db.Exec("INSERT INTO t (a, b, c, d) VALUES ($1, $2, $3, $4)", 1, 2, 3, 4)
}
To be fair, this benchmark is not meaningful. It does not take into account the time to perform the interpolation. It is only meant to show that interpolated queries avoid the overhead of arguments and skip the prepared statement logic in the underlying driver.
This benchmark compares the time to build and execute an interpolated SQL statement resulting in zero args against executing the same SQL statement with args.
BenchmarkBuildExecSQLDat2 5000 215449 ns/op 832 B/op 21 allocs/op
BenchmarkBuildExecSQLSql2 5000 296281 ns/op 881 B/op 30 allocs/op
BenchmarkBuildExecSQLSqx2 5000 296259 ns/op 881 B/op 30 allocs/op
BenchmarkBuildExecSQLDat4 5000 221287 ns/op 1232 B/op 26 allocs/op
BenchmarkBuildExecSQLSql4 5000 305807 ns/op 978 B/op 35 allocs/op
BenchmarkBuildExecSQLSqx4 5000 305671 ns/op 978 B/op 35 allocs/op
BenchmarkBuildExecSQLDat8 5000 254252 ns/op 1480 B/op 33 allocs/op
BenchmarkBuildExecSQLSql8 5000 347407 ns/op 1194 B/op 44 allocs/op
BenchmarkBuildExecSQLSqx8 5000 346576 ns/op 1194 B/op 44 allocs/op
The logic is something like this:
// dat's SQL interpolates the statement then executes it
for i := 0; i < b.N; i++ {
conn.SQL("INSERT INTO t (a, b, c, d) VALUES ($1, $2, $3, $4)", 1, 2, 3, 4).Exec()
}
// non interpolated
for i := 0; i < b.N; i++ {
db.Exec("INSERT INTO (a, b, c, d) VALUES ($1, $2, $3, $4)", 1, 2, 3, 4)
}
The results suggest that local interpolation is faster and does fewer allocations. Interpolation comes at the cost of more bytes used, as it has to inspect the args and splice them into the statement.
database/sql, when presented with arguments, prepares a statement on the connection by sending it to the database, then uses the prepared statement on the same connection to execute the query. Keep in mind that these benchmarks are local, so network latency is not a factor; latency would favor interpolation even more.
This benchmark compares the performance of interpolation within a transaction on a "level playing field" with database/sql. As mentioned in a previous section, prepared statements MUST be prepared and executed on the same connection to utilize them.
BenchmarkTransactedDat2 10000 111959 ns/op 832 B/op 21 allocs/op
BenchmarkTransactedSql2 10000 173137 ns/op 881 B/op 30 allocs/op
BenchmarkTransactedSqx2 10000 175342 ns/op 881 B/op 30 allocs/op
BenchmarkTransactedDat4 10000 115383 ns/op 1232 B/op 26 allocs/op
BenchmarkTransactedSql4 10000 182626 ns/op 978 B/op 35 allocs/op
BenchmarkTransactedSqx4 10000 181641 ns/op 978 B/op 35 allocs/op
BenchmarkTransactedDat8 10000 145419 ns/op 1480 B/op 33 allocs/op
BenchmarkTransactedSql8 10000 221476 ns/op 1194 B/op 44 allocs/op
BenchmarkTransactedSqx8 10000 222460 ns/op 1194 B/op 44 allocs/op
The logic is something like this:
// dat interpolates the statement then executes it as part of the transaction
tx, _ := conn.Begin()
defer tx.Commit()
for i := 0; i < b.N; i++ {
tx.SQL("INSERT INTO t (a, b, c, d) VALUES ($1, $2, $3, $4)", 1, 2, 3, 4).Exec()
}
// non-interpolated
tx2, _ := db.Begin()
defer tx2.Commit()
for i := 0; i < b.N; i++ {
tx2.Exec("INSERT INTO t (a, b, c, d) VALUES ($1, $2, $3, $4)", 1, 2, 3, 4)
}
Again, interpolation seems faster with fewer allocations. The underlying driver still has to process and send the arguments with the prepared statement name. I expected database/sql to do better than interpolation here. Still thinking about this one.
import "github.com/mgutz/dat"
builder := dat.Select("*").From("posts").Where("user_id = $1", 1)
// Get builder's SQL and arguments
sql, args := builder.ToSQL()
fmt.Println(sql) // SELECT * FROM posts WHERE (user_id = $1)
fmt.Println(args) // [1]
// Use raw database/sql for actual query
rows, err := db.Query(sql, args...)
// Alternatively, build the interpolated SQL statement
sql, args = builder.MustInterpolate()
if len(args) == 0 {
rows, err = db.Query(sql)
} else {
rows, err = db.Query(sql, args...)
}
To set up the task runner and create the database:
# install godo task runner
go get -u gopkg.in/godo.v1/cmd/godo
# install dependencies
cd tasks
go get -a
# back to root and run
cd ..
Then run any task
# (re)create database
godo createdb
# run tests
godo test
# run benchmarks
godo bench
# see other tasks
godo
When createdb prompts for a superuser, enter a superuser like 'postgres' to create the test database. On Mac + Postgres.app, use your own user name and password.
- more tests
- hstore query support
- stored procedure support