New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
importccl: count rows imported #28469
Conversation
@@ -901,11 +900,14 @@ func doDistributedCSVTransform( | |||
for i := 0; i < n; i++ { | |||
row := rows.At(i) | |||
name := row[0].(*tree.DString) | |||
size := row[1].(*tree.DInt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this field used anymore? If not should it be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yeah, I forgot we didn't care about wire compat. Done.
4c6785d
to
adbd786
Compare
This plumbs the information about imported rows from the distributed import process back to the gateway so it can be returned in the results, similar to how it looked in original IMPORT via RESTORE. Fixes cockroachdb#27281. Release note (sql change): fix imported row counts in IMPORT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you also need to bump the distsql API version. Otherwise the planner could do incorrect things in a cross version cluster. Probably requires bumping the min version too since it's backward incompatible. Should ping the distsql folks to review that part of the change too?
I think we already said we're not allowing IMPORT in mixed version because of the job / proto changes, so we'll be putting version gates in once we pick a sha that is "2.1" for the purposes of IMPORT (i.e. after we're done with breaking changes like this). |
Is a 2.0 node allowed to schedule an IMPORT worker on a 2.1 node? That would also break. |
Ok in 2.0 we call into cockroach/pkg/sql/distsql_physical_planner.go Line 971 in b60db1b
|
i could also just put the int col back -- i think it is otherwise safe for 2.0 to schedule 2.1 workers |
That's fine with me too. I just want to make sure we don't produce a panic in a cross version situation. |
b9cda60
to
3f0722f
Compare
so, after a little testing: we already have a panic in 2.0 when run in a mixed version cluster with master (or even, it seems, when resuming operation a formerly mixed version cluster). That panic is coming from sampling, so we hit it before this code since it is in just the SST writer changed here So, a) Ugh b) I think we can consider it out of the scope of this PR and shows we need to do a big cross-version test frenzy (and write some roach tests) on this before release (this time). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. LGTM then.
bors r+ |
28469: importccl: count rows imported r=dt a=dt This plumbs the information about imported rows from the distributed import process back to the gateway so it can be returned in the results, similar to how it looked in original IMPORT via RESTORE. Release note (sql change): fix imported row counts in IMPORT. Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Build succeeded |
This plumbs the information about imported rows from the distributed
import process back to the gateway so it can be returned in the results,
similar to how it looked in original IMPORT via RESTORE.
Release note (sql change): fix imported row counts in IMPORT.