sql: fix COPY bugs for binary/csv formats #69066
Conversation
Force-pushed from 0ef1e72 to ec6c129.
pkg/sql/copy.go
Outdated
@@ -387,6 +387,15 @@ func (c *copyMachine) readTextData(ctx context.Context, final bool) (brk bool, err error) {
}

func (c *copyMachine) readCSVData(ctx context.Context, final bool) (brk bool, err error) {
	// Don't try to parse the CSV until we read all the data. This is because
should we look at improving this in future / make an issue?
as @steven-sheehy says this behavior isn't that good, and it might even be better to leave the CSV parsing as-is instead of doing this workaround
i'll give a real fix another shot.
pkg/sql/copy.go
Outdated
@@ -424,6 +424,17 @@ func (c *copyMachine) readCSVTuple(ctx context.Context, record []string) error {
			exprs[i] = tree.DNull
			continue
		}
		switch t := c.resultColumns[i].Typ; t.Family() {
		case types.BytesFamily,
where did you get this list from? how do we know it's complete?
(it's weird to me that e.g. Timestamp is here but Time is not)
the list comes from what COPY FROM ... TEXT is already doing:
Lines 660 to 670 in 61fdb08
switch t := c.resultColumns[i].Typ; t.Family() {
case types.BytesFamily,
	types.DateFamily,
	types.IntervalFamily,
	types.INetFamily,
	types.StringFamily,
	types.TimestampFamily,
	types.TimestampTZFamily,
	types.UuidFamily:
	s = decodeCopy(s)
}
it seems old, wouldn't be surprised if it has the wrong types in it
i can extract that into its own function
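For illustration, a minimal sketch of what that extraction might look like (the helper name `decodeIfEscapedFamily` and its exact signature are hypothetical, not taken from this PR):

```
// decodeIfEscapedFamily is a hypothetical extraction of the switch quoted
// above: it applies COPY TEXT escape decoding only to the type families
// that arrive as escaped text, and leaves other values untouched.
func decodeIfEscapedFamily(t *types.T, s string) string {
	switch t.Family() {
	case types.BytesFamily,
		types.DateFamily,
		types.IntervalFamily,
		types.INetFamily,
		types.StringFamily,
		types.TimestampFamily,
		types.TimestampTZFamily,
		types.UuidFamily:
		return decodeCopy(s)
	}
	return s
}
```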
hmmm actually this is wrong. I tested in PG, and COPY CSV does not do `\xdd` or `\ddd` escaping. Left a comment on #68804 explaining why that issue is happening.
pkg/sql/pgwire/testdata/pgtest/copy
Outdated
{"Type":"ReadyForQuery","TxStatus":"I"} | ||
|
||
# Verify that byte-escape sequences are processed in CSV mode. | ||
# `MixceDU0` is base64 for `2,\x54`. |
Does this support octal encoding as well, like PostgreSQL does? Something like `1,"\157\156\145"`. Would be good to add a test if so.
yes it should. adding a test
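As a standalone sanity check (editorial, not part of the PR), the encodings mentioned in this thread decode as follows in Go:

```
package main

import (
	"encoding/base64"
	"fmt"
	"strconv"
)

func main() {
	// The pgtest payload is base64-encoded raw CSV bytes.
	raw, err := base64.StdEncoding.DecodeString("MixceDU0")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", raw) // prints: 2,\x54

	// `\x54` is bytea hex format for the single byte 0x54.
	b, _ := strconv.ParseUint("54", 16, 8)
	fmt.Printf("%c\n", b) // prints: T

	// `\157\156\145` is bytea escape format: three octal bytes.
	var decoded []byte
	for _, oct := range []string{"157", "156", "145"} {
		n, _ := strconv.ParseUint(oct, 8, 8)
		decoded = append(decoded, byte(n))
	}
	fmt.Println(string(decoded)) // prints: one
}
```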
pkg/sql/copy.go
Outdated
// we don't have a good way of knowing how many bytes to read for the next
// record ahead of time (without reimplementing all the CSV parsing logic
// here). This means the input buffer will keep growing -- the memory is
// accounted in processCopyData.
So what will happen if I want to upload a CSV file that's 10GB? It won't process a single row from it until all 10GB is uploaded to memory? If so, that just trades one problem for another.
It seems like instead the csv reader should be updated to only return complete records and put back incomplete records into the buffer if there's no newline or EOF after that incomplete record in the buffer yet.
I agree. I was thinking about how to do this and I think we'll need to fork the csv reader implementation from go. (we'd also need to do that anyway in order to support some of the other COPY options)
one added challenge is that the COPY CSV format requires a newline to appear after each record except for the last one. so if we encounter an EOF on an individual CopyData input buffer, it's hard to tell if that's the end of all the CSV input or if we need to check for another input buffer
tomorrow I'll take a look at how postgres does this and explore some other CSV parsers in go.
Would it help if I create a separate ticket just for partial records with COPY CSV with the info in my comment? That way the more complicated CSV reader refactoring doesn't block the nice improvements in this PR?
yes, sure another ticket would help. if i don't get to this soon then we can go ahead with this PR without the CSV buffering changes
I did find that Postgres solves this by buffering line-by-line, and taking into account whether it is in a quoted field or not: https://github.com/postgres/postgres/blob/9626325da5e8e23ff90091bc96535495d350f06e/src/backend/commands/copyfromparse.c#L1090-L1096
This is ready for another look -- I updated the CSV buffering logic so it searches for a non-escaped newline.
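A minimal sketch of that kind of scan, assuming the only state that matters is whether the position is inside a double-quoted CSV field (the function name is illustrative; the real logic lives in pkg/sql/copy.go):

```
// indexOfUnquotedNewline returns the offset of the first '\n' in buf that
// is not inside a double-quoted CSV field, or -1 if the buffer ends
// mid-record. A doubled quote ("") inside a quoted field toggles the
// state twice, so the field correctly remains quoted.
func indexOfUnquotedNewline(buf []byte) int {
	inQuotes := false
	for i, b := range buf {
		switch b {
		case '"':
			inQuotes = !inQuotes
		case '\n':
			if !inQuotes {
				return i
			}
		}
	}
	return -1
}
```

Everything up to and including that newline can be handed to the CSV parser as complete records, while the remainder stays buffered until the next CopyData message arrives.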
Fixes cockroachdb#63235 Release note (bug fix): When using COPY FROM .. BINARY, the correct format code will now be returned.
Fixes cockroachdb#68805 Release note (bug fix): Previously, COPY FROM ... BINARY would return an error if the input data was split across different messages. This is fixed now.
Release note (bug fix): Previously, COPY FROM ... CSV would require each CopyData message to be split at the boundary of a record. This is a bug since the COPY protocol allows messages to be split at arbitrary points. This is fixed now.
Fixes cockroachdb#68804 Release note (bug fix): COPY FROM ... CSV did not correctly handle octal byte escape sequences such as `\011` when using a BYTEA column. This is now fixed.
nice!
These are worth adding since we now have special logic for the CSV format for finding newlines. Release note: None
tftr! bors r=otan
Build succeeded.
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.

Backport to branch 21.1.x failed. See errors above.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
69446: roachtest: ignore pgx COPY error test r=otan a=rafiss

resolves #69291

This is failing after #69066 got merged. `TestConnCopyFromFailServerSideMidwayAbortsWithoutWaiting` is checking that the COPY command errors out quickly if it tries to insert a null value into a non-nullable column. Previously this would pass against CRDB, since it would actually error out due to CRDB not handling partial batches of data (#68805). Now that #68805 is fixed, the pgx test gets a bit further. It fails this assertion:

```
endTime := time.Now()
copyTime := endTime.Sub(startTime)
if copyTime > time.Second {
	t.Errorf("Failing CopyFrom shouldn't have taken so long: %v", copyTime)
}
```

The COPY command does not fail immediately against CRDB because we have logic to delay inserting rows until we see 100 rows total: https://github.com/cockroachdb/cockroach/blob/8cae60f603ccc4d83137167b3b31cab09be9d41a/pkg/sql/copy.go#L370-L374

I don't think we should change our row batching behavior, so for now it makes sense to ignore this pgx test.

Release justification: test only change
Release note: None

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
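The batching check described in that message has roughly this shape (identifiers are approximate; see the permalink above for the actual code):

```
// Rows accumulate in memory and are flushed to the table only once the
// batch fills up, so an error on an early row may not surface until the
// 100-row threshold is reached or the COPY input ends.
const copyBatchRowSize = 100

if len(c.rows) >= copyBatchRowSize {
	if err := c.processRows(ctx); err != nil {
		return err
	}
}
```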
69234: changefeedccl: signal changeAggregator shutdown from the kvfeed r=miretskiy a=stevendanna

During acceptance testing, we observed that changefeeds did not correctly restart on primary key changes and did not correctly stop when schema_change_policy was set to 'stop', when the changeFrontier and changeAggregator were running on different nodes (most production changefeeds).

The root cause of this was a bad assumption in the changeAggregator shutdown logic. Namely, we assumed that the changeAggregator (and kv feed) would see context cancellations as a result of the changeFrontier moving to draining. However, this is not guaranteed. When the changeFrontier moves to draining, all of its inputs will be drained. But a DrainRequest message is only sent to the input lazily, when the next message is received from that input. In the case of a schema change, the kv feed would stop sending messages to the changeAggregator, and thus no further messages would be sent to the changeFrontier and the drain request would not be triggered.

With this change, we now shut down the changeAggregator when the kvfeed indicates that no more messages will be returned.

Fixes #68791

Release note (enterprise change): Fixed a bug where CHANGEFEEDs would fail to correctly handle a primary key change.

69304: sql: no-op interleaved syntax for CREATE TABLE/INDEX r=fqazi a=fqazi

Fixes: #68344

Previously, CREATE TABLE/INDEX operations were completely blocked with an error once support was removed. This was inadequate because migrations may still use this syntax and we can't fully block it. To address this, this patch will make the interleaved syntax a no-op with a client warning.

Release justification: low risk and saner behaviour for customer migrations on this release.
Release note (sql change): Interleaved syntax for CREATE TABLE/INDEX is now a no-op, since support has been removed.

69439: storage/fs: remove RemoveDir function r=jbowens a=jbowens

Remove the RemoveDir function from the storage/fs.FS interface. It's a vestige from the RocksDB Env, and its Pebble implementation simply called the existing Remove function.

Release justification: non-production code changes
Release note: None

69446: roachtest: ignore pgx COPY error test r=otan a=rafiss (message quoted above)

Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
See individual commits:
sql: return correct format code for COPY
fixes #63235
sql: COPY BINARY messages can be split at arbitrary boundaries
fixes #68805
sql: COPY CSV handles arbitrary message boundaries
refs #68805
sql: fix byte escaping in COPY CSV input
fixes #68804
Release justification: Fixes for high-priority bugs in existing functionality. ("Another rule of thumb to consider: if a customer hit this bug after releasing, would we patch the release to fix it? If we would patch the release later to fix the bug, then usually it's best to fix the bug now.")