
bulk: "unexpected consumer closed" returned from EXPORT rather than underlying error #77928

Closed
stevendanna opened this issue Mar 16, 2022 · 2 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

stevendanna (Collaborator) commented Mar 16, 2022:

Describe the problem

An EXPORT of a large table can hit a ReadWithinUncertaintyIntervalError. However, instead of that (retryable) error, the user only sees an "unexpected closure of consumer" error.

     8.024ms      3.701ms    event:kv/kvclient/kvcoord/txn_coord_sender.go:810 [n1,client=127.0.0.1:40598,hostnossl,user=root] resetting epoch-based coordinator state on retry
     8.077ms      0.053ms    event:sql/distsql_running.go:797 [n1,client=127.0.0.1:40598,hostnossl,user=root] encountered error (transitioning to draining): TransactionRetryWithProtoRefreshError: ReadWithinUncertaintyIntervalError: read at time 1646848637.274287952,0 encountered previous write with future timestamp 1646848637.275565780,0 within uncertainty interval `t <= (local=1646848637.279714058,0, global=1646848637.774287952,0)`; observed timestamps: [{1 1646848637.274287952,0} {9 1646848637.279714058,0}]: "sql txn" meta={id=9b0c8dc7 pri=0.00517326 epo=0 ts=1646848637.274287952,0 min=1646848637.274287952,0 seq=0} lock=false stat=PENDING rts=1646848637.274287952,0 wto=false gul=1646848637.774287952,0
    18.567ms     10.490ms    event:kv/kvclient/rangecache/range_cache.go:1011 [n1,client=127.0.0.1:40598,hostnossl,user=root] clearing entries overlapping r252:/Table/53/1/11{1044104-3108992} [(n8,s8):1, (n4,s4):2, (n2,s2):3, next=4, gen=32]
    18.625ms      0.058ms    event:kv/kvclient/rangecache/range_cache.go:1011 [n1,client=127.0.0.1:40598,hostnossl,user=root] clearing entries overlapping r64:/Table/53/1/15{6371000-7055000} [(n8,s8):1, (n4,s4):2, (n2,s2):3, next=4, gen=24, sticky=1646401301.063899970,0]
   886.205ms    867.580ms    event:sql/distsql_running.go:797 [n1,client=127.0.0.1:40598,hostnossl,user=root] encountered error (transitioning to draining): unexpected closure of consumer
  1269.890ms    383.685ms    event:sql/conn_executor_exec.go:669 [n1,client=127.0.0.1:40598,hostnossl,user=root] execution ends
  1269.923ms      0.033ms    event:sql/conn_executor_exec.go:669 [n1,client=127.0.0.1:40598,hostnossl,user=root] execution failed after 0 rows: unexpected closure of consumer

It appears that we may be using the EmitRow API a bit incorrectly. In the CSV exporter, we explicitly return an error:

https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/importer/exportcsv.go#L298-L300
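For illustration, here is a minimal standalone sketch of the pattern at that link. The ConsumerStatus values mirror the ones in CockroachDB's execinfra package, but everything else (the emitRows function and its wiring) is a stand-in written for this issue, not the real exporter code:

    package main

    import (
        "errors"
        "fmt"
    )

    // ConsumerStatus mirrors the tri-state status that a row consumer
    // reports back to producers in execinfra.
    type ConsumerStatus int

    const (
        NeedMoreRows ConsumerStatus = iota
        DrainRequested
        ConsumerClosed
    )

    // emitRows is a stand-in for the exporter's emit loop. The key point:
    // when the consumer stops accepting rows, this code manufactures a
    // brand-new error, which is what the user ends up seeing.
    func emitRows(status ConsumerStatus) error {
        if status != NeedMoreRows {
            // This replaces whatever error made the consumer close.
            return errors.New("unexpected closure of consumer")
        }
        return nil
    }

    func main() {
        fmt.Println(emitRows(ConsumerClosed)) // unexpected closure of consumer
    }

The problem is the manufactured error: by the time the consumer reports ConsumerClosed it has typically already recorded the real error (here, the ReadWithinUncertaintyIntervalError), and the fresh error shadows it.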

However, if we look at what the emitRow helper does and at the callers of that interface:

https://github.com/cockroachdb/cockroach/blob/master/pkg/sql/rowexec/processors.go#L47

none of those callers returns an error when it receives ConsumerClosed or DrainRequested. I haven't chased the code all the way through, but I suspect that returning an error here hides the underlying error.
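By contrast, here is a sketch of what those callers appear to do instead, reusing the ConsumerStatus stand-in from the sketch above: they stop emitting and drain rather than returning a synthesized error, which lets the error already queued in the flow reach the client:

    // handleStatus is a stand-in for how other EmitRow callers react.
    // Returning (false, nil) rather than (false, err) on a draining or
    // closed consumer is the key difference from the CSV exporter.
    func handleStatus(status ConsumerStatus) (keepEmitting bool, err error) {
        switch status {
        case NeedMoreRows:
            return true, nil // producer keeps pushing rows
        case DrainRequested:
            return false, nil // stop emitting rows, forward trailing metadata
        case ConsumerClosed:
            return false, nil // consumer is gone; exit without a new error
        }
        return false, nil
    }

(In the real code the draining transition goes through processor helpers such as MoveToDraining; the switch above only illustrates the status handling, not the actual rowexec API.)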

Jira issue: CRDB-13853

yuzefovich (Member) commented:

Thanks for tracking this down! We have seen this problem in a couple of customer incidents and never made much progress.

stevendanna (Author) commented:

I believe we fixed this in #77938.
