importccl: count rows imported #28469

Merged
merged 1 commit into from Aug 14, 2018

Member

@dt dt commented Aug 10, 2018

This plumbs the information about imported rows from the distributed
import process back to the gateway so it can be returned in the results,
similar to how it looked in original IMPORT via RESTORE.

Release note (sql change): fix imported row counts in IMPORT.
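
As a rough illustration of the idea described above (each distributed worker reports how much it imported and the gateway sums the reports), here is a minimal sketch. The `rowCounts` type, its fields, and `collectCounts` are hypothetical names chosen for this example and are not taken from the PR; a channel stands in for the real result stream.

```go
package main

import "fmt"

// rowCounts is a hypothetical per-worker summary of imported data.
// The type actually used by the PR is not shown in this conversation.
type rowCounts struct {
	Rows      int64
	DataBytes int64
}

// add folds another worker's counts into the running total.
func (c *rowCounts) add(o rowCounts) {
	c.Rows += o.Rows
	c.DataBytes += o.DataBytes
}

// collectCounts plays the role of the gateway: it drains the results
// reported by the workers and returns their sum.
func collectCounts(results <-chan rowCounts) rowCounts {
	var total rowCounts
	for r := range results {
		total.add(r)
	}
	return total
}

func main() {
	// Three simulated workers, each reporting what it imported.
	reports := []rowCounts{
		{Rows: 10, DataBytes: 100},
		{Rows: 5, DataBytes: 40},
		{Rows: 7, DataBytes: 64},
	}
	results := make(chan rowCounts, len(reports))
	for _, r := range reports {
		results <- r
	}
	close(results)

	total := collectCounts(results)
	fmt.Printf("rows=%d bytes=%d\n", total.Rows, total.DataBytes)
}
```

Summing at the gateway keeps the workers independent of one another: each only needs to report its own totals once it finishes.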

@dt dt requested review from maddyblue and a team August 10, 2018 17:59

@@ -901,11 +900,14 @@ func doDistributedCSVTransform(
	for i := 0; i < n; i++ {
		row := rows.At(i)
		name := row[0].(*tree.DString)
		size := row[1].(*tree.DInt)
Contributor

Is this field still used? If not, should it be removed?

Member Author

Ah, yeah, I forgot we didn't care about wire compat. Done.

@dt dt force-pushed the counts branch 2 times, most recently from 4c6785d to adbd786 Compare August 13, 2018 15:07
This plumbs the information about imported rows from the distributed
import process back to the gateway so it can be returned in the results,
similar to how it looked in original IMPORT via RESTORE.

Fixes cockroachdb#27281.

Release note (sql change): fix imported row counts in IMPORT.
Contributor

@maddyblue maddyblue left a comment

I think you also need to bump the distsql API version. Otherwise the planner could do incorrect things in a cross-version cluster. This probably requires bumping the min version too, since the change is backward incompatible. Should we ping the distsql folks to review that part of the change too?

@dt
Member Author

dt commented Aug 13, 2018

I think we already said we're not allowing IMPORT in mixed-version clusters because of the job / proto changes, so we'll be putting version gates in once we pick a sha that is "2.1" for the purposes of IMPORT (i.e. after we're done with breaking changes like this).

@maddyblue
Contributor

Is a 2.0 node allowed to schedule an IMPORT worker on a 2.1 node? That would also break.

@maddyblue
Contributor

Ok in 2.0 we call into

func (dsp *DistSQLPlanner) CheckNodeHealthAndVersion(
which only adds the node to the list if it meets the version limits. MinVersion has been 6 forever. I think this means the 2.0 server would schedule it but get back an incorrect type. I am ok merging this change if the 2.0 server in this case returns an error (i.e., doesn't panic). I'm also ok with this change if for another reason not described above the 2.0 server wouldn't schedule work on the 2.1 node.
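
To make the version check discussed above concrete, here is a minimal sketch of the general idea: a planner only schedules work on nodes whose accepted version range includes the planner's own plan version. The `nodeVersion` type, the `accepts` and `eligibleNodes` helpers, and every version number other than the minimum of 6 mentioned above are assumptions made for this illustration; this is not the body of `CheckNodeHealthAndVersion`.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeVersion is a hypothetical description of the range of plan
// versions a node is willing to run.
type nodeVersion struct {
	MinAccepted uint32
	Current     uint32
}

// accepts reports whether a plan at planVer falls inside the node's range.
func accepts(planVer uint32, n nodeVersion) bool {
	return n.MinAccepted <= planVer && planVer <= n.Current
}

// eligibleNodes returns, in sorted order, the IDs of the nodes that a
// planner at planVer would be allowed to schedule work on.
func eligibleNodes(planVer uint32, nodes map[int]nodeVersion) []int {
	var ids []int
	for id, v := range nodes {
		if accepts(planVer, v) {
			ids = append(ids, id)
		}
	}
	sort.Ints(ids)
	return ids
}

func main() {
	nodes := map[int]nodeVersion{
		1: {MinAccepted: 6, Current: 10}, // same release as the planner
		2: {MinAccepted: 6, Current: 20}, // newer node, min version never bumped
		3: {MinAccepted: 15, Current: 20}, // newer node with a bumped min version
	}
	// An older planner at version 10 still selects node 2, because
	// node 2 continues to accept every version from 6 upward.
	fmt.Println(eligibleNodes(10, nodes))
}
```

Under these assumptions, only a node that raises its minimum accepted version (node 3) is excluded by the older planner, which is why an unchanged minimum does not protect against an incompatible result type.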

@dt
Member Author

dt commented Aug 13, 2018

I could also just put the int col back. I think it is otherwise safe for 2.0 to schedule 2.1 workers.

@maddyblue
Contributor

That's fine with me too. I just want to make sure we don't produce a panic in a cross version situation.
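
For context on the panic concern: in Go, a single-value type assertion such as `row[1].(*tree.DInt)` panics when the value has a different dynamic type, while the two-value ("comma ok") form lets the caller return an error instead. The sketch below shows that pattern with stand-in types; `datum`, `dString`, `dInt`, `dBytes`, and `decodeRow` are hypothetical and are not the `tree` package types or the code in this PR.

```go
package main

import "fmt"

// datum and the concrete types below are hypothetical stand-ins for
// SQL datum types; they exist only to make this example self-contained.
type datum interface{}
type dString string
type dInt int64
type dBytes []byte

// decodeRow extracts a (name, size) pair from a result row. It uses
// the two-value form of the type assertion so that a row with an
// unexpected column type produces an error rather than a panic.
func decodeRow(row []datum) (string, int64, error) {
	if len(row) < 2 {
		return "", 0, fmt.Errorf("expected at least 2 columns, got %d", len(row))
	}
	name, ok := row[0].(dString)
	if !ok {
		return "", 0, fmt.Errorf("column 0: expected string, got %T", row[0])
	}
	size, ok := row[1].(dInt)
	if !ok {
		return "", 0, fmt.Errorf("column 1: expected int, got %T", row[1])
	}
	return string(name), int64(size), nil
}

func main() {
	// A row in the expected shape decodes normally.
	fmt.Println(decodeRow([]datum{dString("1.sst"), dInt(42)}))

	// A row whose second column changed type yields an error, not a panic.
	_, _, err := decodeRow([]datum{dString("1.sst"), dBytes("counts")})
	fmt.Println(err)
}
```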

@dt dt force-pushed the counts branch 2 times, most recently from b9cda60 to 3f0722f Compare August 13, 2018 21:14
@dt
Member Author

dt commented Aug 13, 2018

So, after a little testing: we already have a panic in 2.0 when it runs in a mixed-version cluster with master (or even, it seems, when resuming operation in a formerly mixed-version cluster). That panic comes from sampling, so we hit it before reaching this code, since this PR only changes the SST writer.

So: a) ugh, and b) I think we can consider it out of scope for this PR. It shows we need to do a big cross-version test frenzy (and write some roach tests) on this before release (this time).

Contributor

@maddyblue maddyblue left a comment

Ok. LGTM then.

@dt
Member Author

dt commented Aug 14, 2018

bors r+

craig bot pushed a commit that referenced this pull request Aug 14, 2018
28469: importccl: count rows imported r=dt a=dt

Co-authored-by: David Taylor <tinystatemachine@gmail.com>
@craig
Contributor

craig bot commented Aug 14, 2018

Build succeeded
