Commit
db: fix CSV escaping by switching to jackc/pgx
lib/pq has been in maintenance mode for a while, and issue timescale#61 appears to have run into one of its idiosyncrasies: its COPY implementation assumes that you're using a query generated via pq.CopyIn(), which uses the default TEXT format, so it runs all of the incoming data through an additional escaping layer. Our code uses CSV by default (and there appears to be no way to use TEXT format, since we're using the old COPY syntax), which means that incoming CSV containing its own escapes will be double-escaped and corrupted. This is most visible with bytea columns, but the tests currently document additional problems with tab and backslash characters, and there are probably other problematic cases too.

To fix, switch from lib/pq over to jackc/pgx, and reimplement db.CopyFromLines() using the PgConn.CopyFrom() API. We were already depending on a part of this library before, so the new dependency isn't as big of a change as it would have been otherwise, but the switch isn't free. The compiled binary gains roughly 1.5 MB in size, likely due to jackc's extensive type conversion system, which is unfortunate because we're not using it. Further optimization could probably be done, at the expense of having most of the DB logic go through the low-level APIs rather than database/sql.

We make use of the new sql.Conn.Raw() method to easily drop down to the lowest API level, so bump our minimum Go version to 1.13. (1.12 has been EOL for about three years now.)

This escaping fix is a breaking change for anyone who may have already worked around this problem, so bump the utility's version to 0.4.0.
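
The double-escaping described above can be illustrated with a small standalone sketch. It is not lib/pq's actual code: escapeCopyText is a hypothetical stand-in that assumes the extra layer escapes backslash, newline, carriage return, and tab the way COPY's TEXT format expects. Applying it to a line that is already valid CSV shows how a bytea hex literal and an embedded tab get altered before the server parses them as CSV.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeCopyText is a hypothetical stand-in for a TEXT-format escaping
// layer (an assumption for illustration, not lib/pq's real function).
// It escapes the characters that are special in COPY's TEXT format.
func escapeCopyText(line string) string {
	r := strings.NewReplacer(
		`\`, `\\`, // a literal backslash becomes two backslashes
		"\n", `\n`, // newline becomes backslash + 'n'
		"\r", `\r`, // carriage return becomes backslash + 'r'
		"\t", `\t`, // tab becomes backslash + 't'
	)
	return r.Replace(line)
}

func main() {
	// A CSV line with a bytea hex literal (\x00ff) and a field that
	// contains a real tab character.
	csvLine := `1,\x00ff,` + "a\tb"

	// In CSV format the server does not undo these escapes, so the
	// extra backslashes end up stored as data.
	fmt.Println("original:", csvLine)
	fmt.Println("escaped: ", escapeCopyText(csvLine))
}
```

Because CSV mode treats a backslash as an ordinary character, the escaped line no longer carries the same field values as the original, which is the kind of corruption the commit message attributes to the extra escaping layer.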