
sql: fix COPY bugs for binary/csv formats #69066

Merged (5 commits) Aug 23, 2021

Conversation

@rafiss (Collaborator) commented Aug 18, 2021

See individual commits:

sql: return correct format code for COPY
fixes #63235

sql: COPY BINARY messages can be split at arbitrary boundaries
fixes #68805

sql: COPY CSV handles arbitrary message boundaries
refs #68805

sql: fix byte escaping in COPY CSV input
fixes #68804

Release justification: Fixes for high-priority bugs in existing functionality. ("Another rule of thumb to consider: if a customer hit this bug after releasing, would we patch the release to fix it? If we would patch the release later to fix the bug, then usually it's best to fix the bug now.")

@rafiss added the backport-21.1.x label Aug 18, 2021
@rafiss requested review from otan and a team August 18, 2021 00:48
@rafiss requested a review from a team as a code owner August 18, 2021 00:48
@cockroach-teamcity (Member) commented:

This change is Reviewable

@rafiss changed the title from "Fix copy bugs" to "sql: fix COPY bugs for binary/csv formats" Aug 18, 2021
@rafiss force-pushed the fix-copy-bugs branch 2 times, most recently from 0ef1e72 to ec6c129 August 18, 2021 00:54
pkg/sql/copy.go Outdated
@@ -387,6 +387,15 @@ func (c *copyMachine) readTextData(ctx context.Context, final bool) (brk bool, e
}

func (c *copyMachine) readCSVData(ctx context.Context, final bool) (brk bool, err error) {
// Don't try to parse the CSV until we read all the data. This is because
Contributor:

should we look at improving this in future / make an issue?

Collaborator Author:

as @steven-sheehy says this behavior isn't that good, and it might even be better to leave the CSV parsing as-is instead of doing this workaround

i'll give a real fix another shot.

pkg/sql/copy.go Outdated
@@ -424,6 +424,17 @@ func (c *copyMachine) readCSVTuple(ctx context.Context, record []string) error {
exprs[i] = tree.DNull
continue
}
switch t := c.resultColumns[i].Typ; t.Family() {
case types.BytesFamily,
Contributor:

where did you get this list from? how do we know it's complete?
(it's weird to me that e.g. Timestamp is here but Time is not)

Collaborator Author:

the list comes from what COPY FROM ... TEXT is already doing:

cockroach/pkg/sql/copy.go, lines 660 to 670 (at 61fdb08):

```go
switch t := c.resultColumns[i].Typ; t.Family() {
case types.BytesFamily,
	types.DateFamily,
	types.IntervalFamily,
	types.INetFamily,
	types.StringFamily,
	types.TimestampFamily,
	types.TimestampTZFamily,
	types.UuidFamily:
	s = decodeCopy(s)
}
```

it seems old, wouldn't be surprised if it has the wrong types in it

i can extract that into its own function
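
For illustration, a minimal sketch of what that extraction could look like, reusing the type families quoted above (the helper name is hypothetical, and the import path assumes the in-tree types package):

```go
import "github.com/cockroachdb/cockroach/pkg/sql/types"

// needsCopyDecode reports whether COPY text input for a column of this
// type family should be passed through decodeCopy, mirroring the switch
// quoted above. The name is illustrative, not an actual helper.
func needsCopyDecode(t *types.T) bool {
	switch t.Family() {
	case types.BytesFamily,
		types.DateFamily,
		types.IntervalFamily,
		types.INetFamily,
		types.StringFamily,
		types.TimestampFamily,
		types.TimestampTZFamily,
		types.UuidFamily:
		return true
	default:
		return false
	}
}
```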

Collaborator Author:

hmmm actually this is wrong. I tested in PG, and COPY CSV does not do \xdd or \ddd escaping. Left a comment on #68804 explaining why that issue is happening

{"Type":"ReadyForQuery","TxStatus":"I"}

# Verify that byte-escape sequences are processed in CSV mode.
# `MixceDU0` is base64 for `2,\x54`.

Does this support octal encoding as well like PostgreSQL does? Something like 1,"\157\156\145". Would be good to add a test if so.

Collaborator Author:

yes it should. adding a test
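
For reference, a minimal, self-contained sketch of the octal-escape rule under discussion, assuming it follows PostgreSQL's COPY text-format \ooo convention (one to three octal digits); this is an illustration, not CockroachDB's actual decodeCopy:

```go
// decodeOctalEscapes replaces \ooo sequences (one to three octal
// digits) with the byte they encode, so `\157\156\145` becomes `one`
// and `\011` becomes a tab.
func decodeOctalEscapes(s string) string {
	out := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		if s[i] == '\\' && i+1 < len(s) && s[i+1] >= '0' && s[i+1] <= '7' {
			val, j := 0, i+1
			for j < len(s) && j <= i+3 && s[j] >= '0' && s[j] <= '7' {
				val = val*8 + int(s[j]-'0')
				j++
			}
			out = append(out, byte(val))
			i = j - 1
		} else {
			out = append(out, s[i])
		}
	}
	return string(out)
}
```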

pkg/sql/copy.go Outdated
// we don't have a good way of knowing how many bytes to read for the next
// record ahead of time (without reimplementing all the CSV parsing logic
// here). This means the input buffer will keep growing -- the memory is
// accounted in processCopyData.
@steven-sheehy commented Aug 18, 2021:

So what will happen if I want to upload a CSV file that's 10GB? It won't process a single row from it until all 10GBs are uploaded to memory? If so, that just trades one problem for another.

It seems like instead the csv reader should be updated to only return complete records and put back incomplete records into the buffer if there's no newline or EOF after that incomplete record in the buffer yet.

Collaborator Author:

I agree. I was thinking about how to do this and I think we'll need to fork the csv reader implementation from go. (we'd also need to do that anyway in order to support some of the other COPY options)

one added challenge is that the COPY CSV format requires a newline to appear after each record except for the last one. so if we encounter an EOF on an individual CopyData input buffer, it's hard to tell if that's the end of all the CSV input or if we need to check for another input buffer

tomorrow I'll take a look at how postgres does this and explore some other CSV parsers in go.

@steven-sheehy commented Aug 18, 2021:

Would it help if I create a separate ticket just for partial records with COPY CSV with the info in my comment? That way the more complicated CSV reader refactoring doesn't block the nice improvements in this PR?

Collaborator Author:

yes, sure another ticket would help. if i don't get to this soon then we can go ahead with this PR without the CSV buffering changes

I did find that Postgres solves this by buffering line-by-line, and taking into account whether it is in a quoted field or not: https://github.com/postgres/postgres/blob/9626325da5e8e23ff90091bc96535495d350f06e/src/backend/commands/copyfromparse.c#L1090-L1096

@rafiss (Collaborator Author) commented Aug 19, 2021

This is ready for another look -- I updated the CSV buffering logic so it searches for a non-escaped newline.
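
A minimal sketch of that search, assuming standard CSV quoting where an embedded quote is escaped by doubling it (the function name is illustrative, not the actual copy.go code):

```go
// findRecordEnd returns the index just past the first newline that is
// not inside a quoted CSV field, or -1 if the buffer ends mid-record
// and we need to wait for the next CopyData message.
func findRecordEnd(buf []byte, quote byte) int {
	inQuotes := false
	for i := 0; i < len(buf); i++ {
		switch buf[i] {
		case quote:
			if inQuotes && i+1 < len(buf) && buf[i+1] == quote {
				i++ // skip an escaped quote ("") inside a quoted field
			} else {
				inQuotes = !inQuotes
			}
		case '\n':
			if !inQuotes {
				return i + 1 // end of a complete record
			}
		}
	}
	return -1 // incomplete record; keep buffering
}
```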

Fixes cockroachdb#63235

Release note (bug fix): When using COPY FROM ... BINARY, the correct
format code will now be returned.

Fixes cockroachdb#68805

Release note (bug fix): Previously, COPY FROM ... BINARY would return
an error if the input data was split across different messages. This is
now fixed.

Release note (bug fix): Previously, COPY FROM ... CSV required each
CopyData message to be split at the boundary of a record. This was a
bug, since the COPY protocol allows messages to be split at arbitrary
points. This is now fixed.

Fixes cockroachdb#68804

Release note (bug fix): COPY FROM ... CSV did not correctly handle
octal byte escape sequences such as `\011` when used with a BYTEA
column. This is now fixed.
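
For context on the first fix: in the PostgreSQL wire protocol, the CopyInResponse message carries an overall format byte plus one per-column format code, and the per-column codes are expected to agree with the overall format (1 for BINARY, 0 for text/CSV). Below is a sketch of building that message, with illustrative names rather than CockroachDB's actual pgwire code:

```go
import (
	"bytes"
	"encoding/binary"
)

// buildCopyInResponse assembles a pgwire CopyInResponse ('G') message:
// Int32 length, Int8 overall format (0 = text/CSV, 1 = binary), Int16
// column count, then one Int16 format code per column.
func buildCopyInResponse(isBinary bool, numColumns int) []byte {
	format := int16(0)
	if isBinary {
		format = 1
	}
	var buf bytes.Buffer
	buf.WriteByte('G') // message type
	// The length counts itself plus the body: 4 + 1 + 2 + 2*numColumns.
	binary.Write(&buf, binary.BigEndian, int32(4+1+2+2*numColumns))
	buf.WriteByte(byte(format)) // overall copy format
	binary.Write(&buf, binary.BigEndian, int16(numColumns))
	for i := 0; i < numColumns; i++ {
		// Per-column codes must match the overall format; returning
		// mismatched codes is the kind of bug the first commit fixes.
		binary.Write(&buf, binary.BigEndian, format)
	}
	return buf.Bytes()
}
```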
@otan (Contributor) left a comment:

nice!

pkg/sql/copy.go Outdated (thread resolved)
These are worth adding since we now have special logic for finding
newlines in the CSV format.

Release note: None
@rafiss (Collaborator Author) commented Aug 23, 2021

tftr!

bors r=otan

@craig (bot) commented Aug 23, 2021

Build succeeded:

@craig bot merged commit 5ba04b9 into cockroachdb:master Aug 23, 2021
@blathers-crl bot commented Aug 23, 2021

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


Backport to branch 21.1.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@rafiss deleted the fix-copy-bugs branch August 24, 2021 18:07
@rafiss mentioned this pull request Aug 27, 2021
rafiss added a commit to rafiss/cockroach that referenced this pull request Aug 27, 2021
This is failing after cockroachdb#69066 got merged.
`TestConnCopyFromFailServerSideMidwayAbortsWithoutWaiting` is checking
that the COPY command errors out quickly if it tries to insert a null
value into a non-nullable column.

Previously this would pass against CRDB, since it would actually error
out due to CRDB not handling partial batches of data (cockroachdb#68805).
Now that cockroachdb#68805 is fixed,
the pgx test gets a bit further.

It fails this assertion:
```
	endTime := time.Now()
	copyTime := endTime.Sub(startTime)
	if copyTime > time.Second {
		t.Errorf("Failing CopyFrom shouldn't have taken so long: %v", copyTime)
	}
```

The COPY command does not fail immediately against CRDB because we have logic to delay
inserting rows until we see 100 rows total:
https://github.com/cockroachdb/cockroach/blob/8cae60f603ccc4d83137167b3b31cab09be9d41a/pkg/sql/copy.go#L370-L374

I don't think we should change our row batching behavior, so for now
it makes sense to ignore this pgx test.

Release justification: test only change
Release note: None
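
The batching behavior linked above can be paraphrased with the following sketch; the names and the flush callback are illustrative, and only the 100-row threshold comes from the description:

```go
// maybeFlushBatch buffers parsed rows and only sends them to the
// insert path once batchSize rows have accumulated. A per-row problem
// such as a NOT NULL violation therefore surfaces at flush time, not
// when the offending row is parsed, which is why the pgx test's
// timing assertion fails.
func maybeFlushBatch(rows [][]string, batchSize int, insert func([][]string) error) ([][]string, error) {
	if len(rows) < batchSize { // the linked code uses 100
		return rows, nil // keep buffering
	}
	if err := insert(rows); err != nil {
		return nil, err
	}
	return rows[:0], nil
}
```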
craig bot pushed a commit that referenced this pull request Aug 27, 2021

69446: roachtest: ignore pgx COPY error test r=otan a=rafiss

resolves #69291

Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
craig bot pushed a commit that referenced this pull request Aug 27, 2021
69234: changefeedccl: signal changeAggregator shutdown from the kvfeed r=miretskiy a=stevendanna

During acceptance testing, we observed that changefeeds did not
correctly restart on primary key changes and did not correctly stop
when schema_change_policy was set to 'stop' when the changeFrontier
and changeAggregator were running on different nodes (most production
changefeeds).

The root cause of this was a bad assumption in the changeAggregator
shutdown logic. Namely, we assumed that the changeAggregator (and kv
feed) would see context cancellations as a result of the
changeFrontier moving to draining. However, this is not guaranteed.

When the changeFrontier moves to draining, all of its inputs will be
drained. But, a DrainRequest message is only sent to the input lazily
when the next message is received from that input.

In the case of a schema change, the kv feed would stop sending
messages to the changeAggregator, so no further messages would be sent
to the changeFrontier and the drain request would never be triggered.

With this change, we now shut down the changeAggregator when the
kvfeed indicates that no more messages will be returned.

Fixes #68791

Release note (enterprise change): Fixed a bug where CHANGEFEEDs would
fail to correctly handle a primary key change.

69304: sql: no-op interleaved syntax for CREATE TABLE/INDEX r=fqazi a=fqazi

Fixes: #68344

Previously, CREATE TABLE/INDEX operations were completely
blocked with an error once support was removed. This
was inadequate because migrations may still use this
syntax and we can't fully block it. To address this, this
patch makes the interleaved syntax a no-op with a client
warning.

Release justification: low risk and saner behaviour for customer
migrations on this release.
Release note (sql change): Interleaved syntax for CREATE TABLE/INDEX
is now a no-op, since support has been removed.

69439: storage/fs: remove RemoveDir function r=jbowens a=jbowens

Remove the RemoveDir function from the storage/fs.FS interface. It's a
vestige from the RocksDB Env and its Pebble implementation simply called
the existing Remove function.

Release justification: non-production code changes
Release note: None

69446: roachtest: ignore pgx COPY error test r=otan a=rafiss

resolves #69291

Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Co-authored-by: Faizan Qazi <faizan@cockroachlabs.com>
Co-authored-by: Jackson Owens <jackson@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>