Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

database/sql: missing escape functions #18478

Open
nefthy opened this issue Dec 30, 2016 · 45 comments
Open

database/sql: missing escape functions #18478

nefthy opened this issue Dec 30, 2016 · 45 comments

Comments

@nefthy
Copy link

nefthy commented Dec 30, 2016

What version of Go are you using (go version)?

go version go1.7.4 linux/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/nefthy/go-test/"
GORACE=""
GOROOT="/usr/lib/go"
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
CC="x86_64-pc-linux-gnu-gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/home/nefthy/go-test/tmp/go-build451484149=/tmp/go-build -gno-record-gcc-switches"
CXX="x86_64-pc-linux-gnu-g++"
CGO_ENABLED="1"

What did you do?

There are situations when strings need to be escaped in queries that can not be done with placeholders. An example the following queries cannot be expressed with ? placeholders:

SELECT id, ? FROM table
-- Must be escaped as an identifier
SELECT id FROM ?
-- Also identifier quoting
SELECT id FROM table WHERE ? LIKE ?
-- With either the first or second parameter being a column reference

Using Sprintf is no option, since the identifiers need to be properly quoted. The quoting and escaping is inherently vendor specific and may even depend on configuration on a per database/connection basis (hello there MySql...).

What did you expect to see?

The driver must export Quoting which are passed along by the database/sql Api. As far as I can tell the folling functions are needed

  • QuoteString: quotes and escapes a string so it can be used as a string literal (ex: mysql_real_escape_string)
  • QuoteIdentifier: quote and escapes a string so it can be used as an identifier*
  • QuoteBinary: quote and escapes binary data (ex: PQescapeBytea)
  • I am not sure if all identifiers are quoted consistently among all Databases. It might be that separate functions are needed depending on the type of the identifier.

What did you see instead?

No escaping/quoting functions

@kardianos
Copy link
Contributor

There isn't anything here that must live in database/sql to be useful. What you are talking about are primitives needed for a query builder. I also build queries programmatically and agree, that QuoteIdentifier and QuoteValue are two functions which are useful in something like this.

If you are interested in this, I would create a new public go package with something like this:

package sqlutil

type Quoter interface {
    QuoteIdentifier(name string) string
    QuoteValue(v interface{}) string // operates on both string and []byte and int or other types.
}

func DBQuote(d driver.Driver) Quoter {
    if q, is := d.(Quoter); is {
        return q
    }
    // TODO: hard-code some known driver types that you want to support
}

At that point you could suggest to drivers that they upstream your Quoter interface.

You would use the Quoter interface like:

var q Quoter = // Get from db type.
sql := fmt.Sprintf(`select %[1]s  from %[2]s where Cats = %[3]s;`,
    QuoteIdentifier("Local ID"), QuoteIdentifier("My Table"),
    QuoteValue("mice's"),
)
// sql = `select [Local ID] from [My Table] where Cats = 'mice''s';` // SQL Server
// sql = `select "Local ID" from "My Table" where Cats = 'mice''s';` // Postgresql

In other words, it would always be a pre-processor step to the actual DB.Query methods. I do agree MySQL is especially bad here.

I can't really see this going into the std library. Have you talked to any sql driver maintainers about this or do you have any issues you can link here?

@nefthy
Copy link
Author

nefthy commented Dec 31, 2016

I would argue to the contrary. Escaping primitives are needed for working securely with databases and they are inherently dependent on the SQL-server behind the connection. A package using database.sql might not even know what server it is talking to and how to properly quote and escape for that server, if it just gets passed a reference.

As is, database.sql can only be used for queries known ahead of time and under the constraint, that the only dynamic entities in the query are value and never columns (think ORDER BY <user input>) or tables

@kardianos
Copy link
Contributor

database/SQL abstracts the database interface, not the query text. You can build up the query text today with go packages that exist today.

@nefthy
Copy link
Author

nefthy commented Dec 31, 2016

quoting query text is part of the database interface.

@nefthy
Copy link
Author

nefthy commented Dec 31, 2016

placeholders are also abstracted. Native postgres uses $<number> not ?, oracle :<name>

@dominikh
Copy link
Member

Placeholders are not abstracted. database/sql itself doesn't care at all, it just passes the query to a driver. And most drivers I know of don't abstract it away, either, requiring you to use $1 for PostgreSQL and ? for SQLite.

Similarly, quoting arguments would depend on the specific DBMS.

@nefthy
Copy link
Author

nefthy commented Dec 31, 2016

So proper quoting is impossible, since neither database.sql nor database.sql.driver export quoting functions and when I am sitting behind those apis my best bet is type-switching the driver to find out what I am talking to and worst case implementing that my self? That does not look solid to me.

@bradfitz
Copy link
Contributor

Maybe the current situation is not ideal, but there have been no solid proposals yet that consider all popular databases. If the driver community wants to put together a proposal, we're all ears.

@ghost
Copy link

ghost commented Mar 11, 2017

Perl/DBI does this right. Golang needs to do the same.

@bradfitz
Copy link
Contributor

@caseyallenshobe, see https://golang.org/wiki/NoMeToo

If you have a proposal, please propose.

@ghost
Copy link

ghost commented Mar 11, 2017

I didn't say "me too", actually. I referenced a good implementation. Check out DBI and quote_identifier / quote_literal functions. Just like database/sql, DBI leaves platform-specific implementation up to the drivers.

@bradfitz
Copy link
Contributor

That's more helpful. For what it's worth, I lived and breathed Perl and Perl's DBI for a decade+, and that's what inspired my writing the database/sql package.

I'll leave this to @kardianos, who's taken over maintenance of the package.

kardianos added a commit to golang-sql/sqlexp that referenced this issue Mar 30, 2017
Quoter defines an interface to safely quote identifiers and values.

Updates golang/go#18478
@kardianos
Copy link
Contributor

I'm generally in favor of doing this. However, I'm not sure if or where in the std repo it would live. For now I've implemented a first pass https://godoc.org/github.com/golang-sql/sqlexp .

@nefthy
Copy link
Author

nefthy commented Mar 31, 2017

MySQL has configuration options for quoting, so the quoting is also dependent on the connection.

As a user I would expect the function to be on the sql.DB object, since I probably already have a reference to it when I am constructing queries and sql.DB already has all the information to set up the quoting. Thats also where you find them in similar packages for other languages like Perl/DB, PHP/PDO, PHP/Mysqli.

@kardianos
Copy link
Contributor

@nefthy Good point about MySQL especially. I've updated the calls to be network friendly.

You can get the driver from sql.DB.Driver() so while it isn't a direct method from it in this shim, you can use in in a similar way right now.

Here's what I'd like to see:

  1. evolve and verify the API that is in sqlexp,
  2. get a few drivers to implement it (they usually appreciate PRs),
  3. then if it makes sense (which I think it would at that point), bring it into the std lib.

Perhaps it would be good to combine the escaper functions with the database name function. We'd probably want to make it easy to expand it in the future as well, which would mean changing it into a struct with methods.

@hareesh-blippar
Copy link

This is especially problematic for batch inserts, since database/sql doesn't currently support batch inserts, and the only way to do them today is by constructing the raw query string.

@ghost
Copy link

ghost commented Apr 13, 2017

Likewise for database-side arrays and variable-amount IN clauses.

@kardianos
Copy link
Contributor

@hareesh-blippar Could you give an example of what you mean by batch inserts, like COPY IN or Bulk Copy?

@caseyallenshobe database side arrays will be handled in a different way, by adding support for them.

@hareesh-blippar
Copy link

hareesh-blippar commented Apr 13, 2017

@kardianos For example,
INSERT INTO mytable (a,b,c) VALUES (a1,b1,c1), (a2,b2,c2), (a3,b3,c3), ...... (an,bn,cn);

@ghost
Copy link

ghost commented Apr 13, 2017

Batch inserts would be inserting multiple rows with a single statement, I believe.

And yes, Go 1.9 promises something better, as does sqlx today. My company is still stuck on Go 1.4.3 and upgrades slowly, so it will be quite a while for us yet...

Forced needs for implementing something to properly quote to overcome some shortcoming aren't the only reason either...just a good example. :)

@kardianos
Copy link
Contributor

@hareesh-blippar I don't understand why database/sql doesn't understand multi-row value insert today. You would need to add those as parameters (which would be great for a little 10 line function to do, but other then that, it should be easy enough... Are you expecting a different API to support this?

@caseyallenshobe Well, your situation would give additional weight to develop this out of tree first, then anyone can use it regardless of go version.

@hareesh-blippar
Copy link

@kardianos is that possible with database/sql today when the value of n in my example is variable and in the order of thousands?

@kardianos
Copy link
Contributor

kardianos commented Apr 14, 2017

I would think so. SQL Server can only insert 1000 rows with that syntax, so built a little helper around it to break it up into statements of 1000. So I don't see why not.

@ghost
Copy link

ghost commented May 6, 2019

One case not yet discussed: Generating SQL without executing. For a patch generator I am in dire need of the equivalent mysql_real_escape in Go (no, mysql_dump is insufficient).

As it stands, I have two options:

  1. Write a module that interfaces with libmysql.so.
  2. Write my own implementation of mysql_real_escape that worksformebutnotanyoneelse™.

Both are not very attractive, I guess I'll go with the latter option and keep fixing it till it stops breaking?

@methane
Copy link
Contributor

methane commented May 7, 2019

Even though MySQL behavior is different ANSI_QUOTES, "You can still use to quote identifiers with this mode enabled." We can use backslash always. It meansQuoteIdentifier` doesn't need connection.
It can be added to Driver directly, instead of connection.

On the other hand, QueryString can not chose how to escape backslash without
NO_BACKSLASH_ESCAPE variable.
But if we want to add it on Driver, I think it's OK to assume NO_BACKSLASH_ESCAPE
is disabled.

BTW, if we add QuoteString, why no FormatDate, FormatDecimal, FormatFloat, FormatGeometry,
FormatJSON, etc, etc...?
I am very unsure about database/sql/driver should provide such helper functions to build SQL.
How about API like DB.FormatSQL(sqlWithPlaceHolder string, args... driver.Value) (string, error)?

Note that QuoteIdentifier is special, because we can not use placeholder for it on some RDB including MySQL. So QuoteIdentifier will be worth enough even if DB.FormatSQL is added.

@dbuteyn You can copy escaping code from go-sql-driver/mysql.

@ghost
Copy link

ghost commented May 8, 2019

@methane Thanks, its actually the first place I went to but I couldn't find what I was looking for at a glance. A string replace was sufficient for my needs and I didn't want to waste any more time. Obviously what I did is in no way correct and won't work for others.

@methane
Copy link
Contributor

methane commented May 8, 2019

@David00
Copy link

David00 commented Jan 5, 2020

I'm new to Go so my apologies if this is extraordinarily obvious to others, but I have managed to use Sprintf to create a string for the query containing a dynamic table name for Sqlite3.

import (
        _ "github.com/mattn/go-sqlite3"
    )

func main() {
            
	var result string
	tableName := "testTable"
	value := "10"
	dynamicQuery := fmt.Sprintf("select myField from '%s' where myValue = '%s'", tableName, value)

        database, _ := sql.Open("sqlite3", "/path/to/my/Sqlite3.db")
	rows, err := database.Query(dynamicQuery)
	defer rows.Close()
	if err != nil {
            fmt.Println("Error running db Query.")
	    fmt.Println(err)
	} else {
            for rows.Next() {
                err := rows.Scan(&result)
                if err != nil {
                    fmt.Println("Error scanning row")
                }
	        fmt.Printf("Table result is %v", result)
	    }
	}
}

This has allowed me to query my Sqlite3 database using a dynamic table name which comes from elsewhere in my program (never as user input). Hope this helps someone though I understand this discussion is more tailored around larger DB systems. Who knows, it might work with them too!

@cben
Copy link

cben commented Apr 30, 2020

Reading this, I initially thought only "identifier.quoting" is non-portable.
'string quoting with '' doubling' should be 100% standard, going back at least to SQL-92 standard, right??

Alas, no. For example MySQL by default interprets backlash escapes inside 'strings', which violates the standard 🤦, unless one enables NO_BACKSLASH_ESCAPES.
(compare to PostgreSQL doing it better, adding a new E'strings with escapes' syntax).

Still, I wonder — do all DBs have a set of flags that make them parse literals according to standard?
If yes, maybe database/sql could offer a common interface to "give me a connection that respects standard syntax [as much as possible]"?

@kardianos
Copy link
Contributor

@cben Nope, this isn't a thing. Many databases don't have such a setting, and I wouldn't recommend such a setting if there was one. Many people write SQL that make really important assumptions about quoting.

The interface I would recommend would be something like this:
https://pkg.go.dev/github.com/golang-sql/sqlexp?tab=doc#Quoter

@David00 You're on the right track and for your purposes, that might be enough. But almost inevitably, if you are talking about escaping inputs, you'll want check for values within the string as well as placing quotes around it. Also note that SQLite escaping is different then other systems.


I'm honestly not sure the best direction for the quoter interface. Should there be a registry? Probably not. Could there be a per database quoter? Maybe. The problem is easy enough to frame. I'm not sure what the best solution would look like for this.

@crussell52
Copy link

@kardianos

Maybe I'm being naive, but I would expect that quoting would have to be the responsibility of the driver have to be at the connection level to get it right. (MySQL being the obvious case in point.)

It would then be up to the sql package to provide an interface to obtain the quoter from the connection. An error return value allows for compatibility with drivers that have not implemented a quoter. Something like:

// Quoter provides a driver-appropriate utility for quoting values and identifiers.
//
// If the driver does not provide a Quoter, then a nil value is returned with a 
// non-nil error.
func (*sql.Conn) Quoter() (Quoter, error) { ... }

If I'm in a project and I feel good with the assumption that my driver of choice provides a Quoter, then I can just ignore the error case. If I want maximum abstraction, I write a few extra lines to handle that case.

@paudley
Copy link

paudley commented Mar 29, 2022

For reference, here is the PGX Sanitize function for identifiers:

https://pkg.go.dev/github.com/jackc/pgx/v4#Identifier

@jrfondren
Copy link

I'm new to Go so my apologies if this is extraordinarily obvious to others, but I have managed to use Sprintf to create a string for the query containing a dynamic table name for Sqlite3.
...
This has allowed me to query my Sqlite3 database using a dynamic table name which comes from elsewhere in my program (never as user input). Hope this helps someone though I understand this discussion is more tailored around larger DB systems. Who knows, it might work with them too!

We now have very short steps for someone to take to a disaster:

  1. someone tries using placeholders for column names and finds that this fails
  2. this person asks around, finds out that these need special quoting, and looks for this feature in database/sql--and doesn't find it
  3. this person searches the internet, finds this issue, scrolls down, aha! Here is helpful code to use.
  4. this person doesn't notice the bold never or thinks (wrongly) that it's fine
  5. this person has a trivial SQL injection vulnerability! yay!

This person isn't me, because I'm translating code using Perl's DBI and DBI did this right decades ago. We've have decades of trivial SQL vulnerabilities due to string construction of queries from untrusted input, and when placeholders are finally pretty common, people are still pooh-poohing other necessary measures with the exact same arguments used to say that placeholders weren't necessary? This is disgraceful.

@kevin-lindsay-1
Copy link

I'm honestly kinda surprised that after 6 1/2 years there doesn't seem to be a standard way of escaping a user-input string in sql using go.

@leandrocurioso
Copy link

leandrocurioso commented Jul 5, 2023

For: github.com/go-sql-driver/mysql

I couldn't stand to keep mexing quotes with + concatenation with fmt.Sprintf so I tried to create a standard.

I know this is far from perfect but I will share what I did and fixed for me:

Code

package repository

import (
	"fmt"
	"reflect"
	"strings"
)

type Repository struct{}

type BuildQueryParams struct {
	Placeholders []BuildQueryPlaceholder
}

type BuildQueryPlaceholder struct {
	Key    string
	Params []interface{}
}

// NewRepository get a new instance
func NewRepository() *Repository {
	return &Repository{}
}

func (r *Repository) BuildQuery(rawSql string, buildQueryParams BuildQueryParams) (string, []interface{}) {
	finalQuery := strings.ReplaceAll(rawSql, "%q", "`")
	finalParams := []any{}
	for _, placeholder := range buildQueryParams.Placeholders {
		pMask, params := r.getPlaceholders(placeholder.Params...)
		finalParams = append(finalParams, params...)
		finalQuery = strings.ReplaceAll(finalQuery, fmt.Sprintf(":%s", placeholder.Key), pMask)
	}
	return finalQuery, finalParams
}

func (r *Repository) SpreadSlice(slice any) []interface{} {
	s := reflect.ValueOf(slice)
	if s.Kind() != reflect.Slice {
		panic("InterfaceSlice() given a non-slice type")
	}

	// Keep the distinction between nil and empty slice input
	if s.IsNil() {
		return nil
	}

	ret := make([]interface{}, s.Len())

	for i := 0; i < s.Len(); i++ {
		ret[i] = s.Index(i).Interface()
	}
	return ret
}

func (r *Repository) getPlaceholders(values ...interface{}) (placeholders string, parameters []interface{}) {
	n := len(values)
	p := make([]string, n)
	parameters = make([]interface{}, n)
	for i := 0; i < n; i++ {
		p[i] = "?"
		parameters[i] = values[i]
	}
	placeholders = strings.Join(p, ",")
	return placeholders, parameters
}

Usage

sql, params := r.repository.BuildQuery(`
SELECT 	
        ch.%qname%q,
	ch.client_username
FROM %qcharacter%q ch
INNER JOIN %qclient%q cl ON ch.client_username = cl.username
WHERE ch.%qname%q IN (:p1)
AND cl.client_status_id = :p2
AND ch.character_status_id = 1;`, BuildQueryParams{
Placeholders: []BuildQueryPlaceholder{
	{
		Key: "p1",
		Params: r.repository.BindSlice([]any{
			"character1", "character2", "character3",
		}),
	},
	{
		Key: "p2",
		Params: r.repository.BindSlice([]any{
			1,
		}),
	},
},
})

fmt.Println("Final SQL", sql)
fmt.Println("Final Params", params)

Output

Final SQL 
SELECT 	
               ch.`name`,
               ch.client_username
FROM `character` ch
INNER JOIN `client` cl ON ch.client_username = cl.username
WHERE ch.`name` IN (?,?,?)
AND cl.client_status_id = ?
AND ch.character_status_id = 1;

Final Params [character1 character2 character3 1]

Query it

rows, err := connection.QueryContext(executionCtx, sql, params...)

The final sql will be replaced with "sql quotes" for each %q and the placeholders automatically added for each mapped key that starts with semi colon (:).

All placerholder parameters will be combined in sequence to be used for the prepared statement bind.

BindSlice function is used to generically pass any kind of slice to be appended to final parameters.

That's is!

@ghost
Copy link

ghost commented Jul 6, 2023

For: github.com/go-sql-driver/mysql

I couldn't stand to keep mexing mixing quotes with + concatenation with fmt.Sprintf so I tried to create a standard.

I know this is far from perfect but I will share what I did and fixed for me:

Your solution does not escape input strings, its a workaround to being unable to use backticks in a multi-line literal.

This issue was opened because string replace isn't a proper solution. It makes the application vulnerable to SQL injection.

What we asking for is to add func (*sql.Conn) Escape(string) (string, err) which calls func (*driver.Driver) Escape(*sql.Conn, string) (string, err) which then does the correct driver-specific escaping by calling mysql_real_escape_string or similar.

Yes, it's possible to roll your own solution by following the docs but the whole point of the database/sql package is that users don't have to deal with driver-specific nonsense.

It's only been 7 years, there is no rush.

@nefthy
Copy link
Author

nefthy commented Jul 13, 2023

There is also the issue, that quoting rules are database or even connection dependent, so if the drivers do not provide quoting functions, there is no way to write database agnostic code. I fail to see why the maintainers chose not to require drives to provide such functions. Database vendors do provide quoting functions in their C libraries, for good reasons.

@ghost
Copy link

ghost commented Oct 16, 2023

In case you need an SQL escape method deadly:

It's Apache-2.0, Copy from: https://github.com/pingcap/tidb/blob/master/pkg/util/sqlexec/utils.go#L97

May want to use a permalink instead as code often changes over time. Regular links will eventually point to nowhere: https://github.com/pingcap/tidb/blob/eafb78a9c6b8ff6f9c00188ca16fc63b41ae4e56/pkg/util/sqlexec/utils.go#L97

This code doesn't appear to provide escaping but rather formats parameters as inline SQL which doesn't really solve anything and is prone to SQL injection. It is also using MySQL-specific syntax per readme.md of the project.

@cben
Copy link

cben commented May 1, 2024

What we asking for is to add func (*sql.Conn) Escape(string) (string, err) which ...

Nit: "Escape" sounds like it would do something like don'tdon''t which one might still concatenate with other pieces to form a longer "escaped string".
Such API invites passing around three levels of things — unescaped' stringsescaped'' string fragments"SQL'' fragments". Three is too many IMHO, too easy to mix those up an inject unescaped string directly into SQL.
Plus, we already know some DBs have different escaping & quoting rules for identifiers vs. string values, so even more distinct concepts like escaped""identifier...

I think we can skip escaped level and have a "QuoteString" API which simultaneously escapes and adds the quotes: don't'don''t'. So you only deal with two concepts: Go valuesSQL literal representation of those values.
[in an ideal world I'd wish for a polymorphic "LiteralValue" that can also serialize booleans, arrays of strings etc. but no idea if realistic.]

TLDR compare these two hypothetical styles:

-Sprintf("select myField from \"%s\" where myValue = '%s'", conn.EscapeIdentifier(tableName) conn.EscapeString(value))
+Sprintf("select myField from %s where myValue = %s", conn.QuoteIdentifier(tableName), conn.QuoteString(value))

the former might not even be portable due to hard-coded " and ' choices, but even if it was, I argue the latter is a cleaner API.
[I'm not saying either is great; there is a lot of type safety & ergonomics to add but those might belong in 3rd-party libraries(?) What we need from stdlib is minimum viable interface for DB drivers, and all I'm saying is "Quote" beats "Escape".]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests