sql: various COPY improvements #78303
Release note: None
Reviewed 2 of 2 files at r1, 3 of 3 files at r2, 1 of 1 files at r3, 3 of 4 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @otan)
-- commits, line 7 at r2:
Can we add some testing here to validate that the options show up as unimplemented?
pkg/util/encoding/csv/reader_test.go, line 396 at r4 (raw file):
Escape: 'x', Input: `"x"",",","xxx"",x,"xxxx,"` + "\n", Output: [][]string{{`"`, `,`, `x"`, `x`, `xx,`}},
I'm not following the 4th case. If we just provide an escape character, shouldn't this show nothing or a parse error?
pkg/util/encoding/csv/reader_test.go, line 402 at r4 (raw file):
Comma: 'x', Input: `"x""x,x"xxx""x"xx"x"xxxx,"` + "\n", Output: [][]string{{`"`, `,`, `x"`, `x`, `xx,`}},
I think I'm missing something. I would have thought that this would yield:
'"',',"x"','"x','"xx'
pkg/util/encoding/csv/writer_test.go, line 54 at r4 (raw file):
{Input: [][]string{{"a", "a", "a"}}, Output: "a,a,a\n"},
{Input: [][]string{{`\.`}}, Output: "\"\\.\"\n"},
{Input: [][]string{{`"`, `,`, `x"`, `x`, `xx,`}}, Escape: 'x', Output: `"x"",",","xxx"",x,"xxxx,"` + "\n"},
More confusion on my part- why is a single x being interpreted as x instead of as the escape, and thus a parse error?
Exciting stuff! Can we list explicitly what will remain unimplemented for COPY from the linked issue?
It's everything in #41608 except for the ESCAPE option.
Awesome news!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm)
Previously, ajstorm (Adam Storm) wrote…
Can we add some testing here to validate that the options show up as unimplemented?
The new tests in `TestUnimplementedSyntax` in `parse_test.go` already do this.
pkg/util/encoding/csv/reader_test.go, line 396 at r4 (raw file):
Previously, ajstorm (Adam Storm) wrote…
I'm not following the 4th case. If we just provide an escape character, shouldn't this show nothing or a parse error?
`"x"",",","xxx"",x,"xxxx,"` shouldn't. Breaking it down:
1. `"x"",`: in `"x""`, the wrapping `"` indicates that escapes apply, and `x` escapes `"`, so the field is `"`.
2. `",",`: the `","` is just `,`.
3. `"xxx"",`: `x` escapes `x`, then `x` escapes `"`, so the field is `x"`.
4. `x,`: no quotes, so nothing has to be escaped and the field is `x` (see the writer_test.go comment below).
5. `"xxxx,"`: `x` escapes `x` twice, then a comma, so the field is `xx,`.
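The breakdown above can be sketched as a small standalone parser. This is an illustrative sketch only, not the `encoding/csv` reader from this PR: `parseRecord` and its parameters are hypothetical names, the quote character is assumed to be fixed at `"`, and an escape of 0 is assumed to fall back to the quote character. Record terminators and whitespace handling are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// quote is assumed to be fixed; the thread notes that a QUOTE option is not supported yet.
const quote = '"'

// parseRecord is a hypothetical helper that splits a single line into fields.
// Inside a quoted section, escape followed by the quote or by escape itself
// yields that following character; a bare quote closes the section.
// Outside quotes, every character except the delimiter and the quote is raw.
func parseRecord(line string, comma, escape rune) ([]string, error) {
	if escape == 0 {
		escape = quote // assumption: an unset escape defaults to the quote character
	}
	runes := []rune(line)
	var fields []string
	var field []rune
	inQuotes := false
	for i := 0; i < len(runes); i++ {
		c := runes[i]
		if inQuotes {
			if c == escape && i+1 < len(runes) && (runes[i+1] == quote || runes[i+1] == escape) {
				field = append(field, runes[i+1]) // keep the escaped character
				i++                               // and skip over it
			} else if c == quote {
				inQuotes = false
			} else {
				field = append(field, c)
			}
			continue
		}
		switch c {
		case comma:
			fields = append(fields, string(field))
			field = field[:0]
		case quote:
			inQuotes = true
		default:
			field = append(field, c) // unquoted characters are never treated as escapes
		}
	}
	if inQuotes {
		return nil, errors.New("unterminated quoted field")
	}
	return append(fields, string(field)), nil
}

func main() {
	fields, err := parseRecord(`"x"",",","xxx"",x,"xxxx,"`, ',', 'x')
	fmt.Printf("%q %v\n", fields, err)
}
```

Checking the delimiter only outside quotes is what lets the same sketch handle the `Comma: 'x'` case discussed in the next comment, where `x` is both the delimiter and the escape.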
Code quote:
`"x"",",","xxx"",x,"xxxx,"` + "\n"
pkg/util/encoding/csv/reader_test.go, line 402 at r4 (raw file):
Previously, ajstorm (Adam Storm) wrote…
I think I'm missing something. I would have thought that this would yield:
'"',',"x"','"x','"xx'
1. `"x""x` yields `"`, then a delimiter.
2. `,x` yields `,`, then a delimiter.
3. `"xxx""x` yields `x"`, then a delimiter.
4. `"xx"x`: the point of contention. Begin quote, then `x` escapes `x`, so it becomes just `x`, then a delimiter.
5. `"xxxx,"`: quote, two escaped `x`s, then a comma, so `xx,`.

FWIW, all these outputs are from PG.
pkg/util/encoding/csv/writer_test.go, line 54 at r4 (raw file):
Previously, ajstorm (Adam Storm) wrote…
More confusion on my part- why is a single x being interpreted as x instead of as the escape, and thus a parse error?
Because fields that are not in quotes are taken as their "raw value": the escape character only has meaning inside a quoted field.
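The writer side of this rule can be sketched the same way: a field is emitted raw unless it needs quoting, and only quoted fields get escape characters. This is a minimal sketch under stated assumptions, not the writer from this PR: `writeRecord` and `needsQuotes` are hypothetical names, and the quoting rule (delimiter, quote, CR, LF, or the literal `\.`) is a simplification inferred from the test cases quoted in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// quote is assumed to be fixed at the double quote.
const quote = '"'

// needsQuotes is a simplified, hypothetical quoting rule: quote a field when it
// contains the delimiter, a quote, or a line break, or when it is exactly `\.`.
func needsQuotes(field string, comma rune) bool {
	if field == `\.` {
		return true
	}
	return strings.ContainsRune(field, comma) || strings.ContainsAny(field, "\"\r\n")
}

// writeRecord is a hypothetical helper that renders one record. Inside a quoted
// field, every quote and every escape character is prefixed with escape.
// Fields that do not need quoting are written unchanged.
func writeRecord(fields []string, comma, escape rune) string {
	if escape == 0 {
		escape = quote // assumption: an unset escape defaults to the quote character
	}
	var b strings.Builder
	for i, f := range fields {
		if i > 0 {
			b.WriteRune(comma)
		}
		if !needsQuotes(f, comma) {
			b.WriteString(f) // raw value, even if it contains the escape character
			continue
		}
		b.WriteRune(quote)
		for _, c := range f {
			if c == quote || c == escape {
				b.WriteRune(escape)
			}
			b.WriteRune(c)
		}
		b.WriteRune(quote)
	}
	b.WriteByte('\n')
	return b.String()
}

func main() {
	fmt.Print(writeRecord([]string{`"`, `,`, `x"`, `x`, `xx,`}, ',', 'x'))
}
```

Under these assumptions the lone `x` field is written as `x`, because it never enters quoted mode, while `xx,` has to be quoted for its comma and so has each `x` doubled.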
735456b
to
e690e37
Compare
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @otan)
pkg/sql/pgwire/testdata/pgtest/copy, line 297 at r5 (raw file):
{"Type":"DataRow","Values":[{"text":"1"},{"text":"x\",x"}]}
{"Type":"CommandComplete","CommandTag":"SELECT 2"}
{"Type":"ReadyForQuery","TxStatus":"I"}
just to double-check, these new ones also passed against PG?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm and @otan)
pkg/util/encoding/csv/reader.go, line 133 at r4 (raw file):
// Escape, if not 0, is the character used to escape `"` characters and
// itself.
maybe mention that if it's unset, it defaults to the same as QUOTE. (which will only be more important once we support the QUOTE option)
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @otan, and @rafiss)
pkg/util/encoding/csv/reader.go, line 133 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
maybe mention that if it's unset, it defaults to the same as QUOTE. (which will only be more important once we support the QUOTE option)
Done.
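The defaulting behavior requested here can be illustrated with a tiny sketch. The `readerOptions` type, its `Quote` field, and `effectiveEscape` are hypothetical names for illustration; the thread only establishes that an unset (zero) `Escape` should behave the same as QUOTE.

```go
package main

import "fmt"

// readerOptions is a hypothetical stand-in for the reader's configuration.
// The Quote field is an assumption; the thread says QUOTE is not configurable yet.
type readerOptions struct {
	Quote  rune
	Escape rune // 0 means "unset"
}

// effectiveEscape returns Escape when it is set and otherwise falls back to Quote.
func (o readerOptions) effectiveEscape() rune {
	if o.Escape != 0 {
		return o.Escape
	}
	return o.Quote
}

func main() {
	fmt.Printf("%q\n", readerOptions{Quote: '"'}.effectiveEscape())
	fmt.Printf("%q\n", readerOptions{Quote: '"', Escape: 'x'}.effectiveEscape())
}
```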
looks good from my end
LGTM too. Thanks for clearing up my confusion about the unquoted characters.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @otan, and @rafiss)
Previously, otan (Oliver Tan) wrote…
The new tests in `TestUnimplementedSyntax` in `parse_test.go` already do this.
Thanks. I see them now.
pkg/util/encoding/csv/reader_test.go, line 396 at r4 (raw file):
Previously, otan (Oliver Tan) wrote…
`"x"",",","xxx"",x,"xxxx,"` shouldn't. Breaking it down:
1. `"x"",`: in `"x""`, the wrapping `"` indicates that escapes apply, and `x` escapes `"`, so the field is `"`.
2. `",",`: the `","` is just `,`.
3. `"xxx"",`: `x` escapes `x`, then `x` escapes `"`, so the field is `x"`.
4. `x,`: no quotes, so nothing has to be escaped and the field is `x` (see the writer_test.go comment below).
5. `"xxxx,"`: `x` escapes `x` twice, then a comma, so the field is `xx,`.
Yeah, makes sense. It was step 4 that I was missing.
pkg/util/encoding/csv/reader_test.go, line 402 at r4 (raw file):
Previously, otan (Oliver Tan) wrote…
1. `"x""x` yields `"`, then a delimiter.
2. `,x` yields `,`, then a delimiter.
3. `"xxx""x` yields `x"`, then a delimiter.
4. `"xx"x`: the point of contention. Begin quote, then `x` escapes `x`, so it becomes just `x`, then a delimiter.
5. `"xxxx,"`: quote, two escaped `x`s, then a comma, so `xx,`.

FWIW, all these outputs are from PG.
All good. Thanks.
pkg/util/encoding/csv/writer_test.go, line 54 at r4 (raw file):
Previously, otan (Oliver Tan) wrote…
because things that are not in quotes are their accepted "raw value".
Ahh, interesting. That explains all of my confusion. Thx.
f102228
to
03ee50c
Compare
Add `ESCAPE` logic to the `encoding/csv` package, for exposure to SQL at a later stage.

It is worth noting I wrote this in the "safest" backportable way possible. Ideally we'd rewrite the read logic to be more "parser"-like to account for the change in QUOTE case, but that's a lot riskier to backport.

Release note: None
Release note (sql change): Implemented the `COPY FROM ... ESCAPE ...` syntax.
thanks
bors r=rafiss,ajstorm
Build succeeded.
blathers backport 22.1
craig[bot] merged #78303 into master.
blathers backport 21.2
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 149a1db to blathers/backport-release-21.2-78303: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []
you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 21.2 failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.
Refs: #41608
See individual commits for details.