Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv,sql: replica command errors bubble up as-is to SQL clients #49868

Open
knz opened this issue Jun 4, 2020 · 4 comments
Open

kv,sql: replica command errors bubble up as-is to SQL clients #49868

knz opened this issue Jun 4, 2020 · 4 comments
Labels
A-error-handling Error messages, error propagations/annotations A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. T-kv KV Team T-sql-queries SQL Queries Team

Comments

@knz
Copy link
Contributor

knz commented Jun 4, 2020

Found in #49513, a KV error went through all the layers and popped up in the client unadulterated:

pq: unable to select removal target from [(n1,s1):?]; current replicas [(n2,s2):1 (n7,s7):2 (n1,s1):3 (n8,s8):4]: could not select an appropriate replica to be removed

The present issue is not about investigating the cause of the error and instead focuses on the communication of this error throughout CockroachDB.

There are a couple of problems with this situation:

  • it does not carry a pg error code
  • it does not provide SQL-level context (which type of operation was being performed)
  • it does not suggest whether the operation is retryable
  • it probably is overly confusing - there's nothing about the replica details that's remotely useful to the user of a SQL client.

The way forward would be to address this in SQL by:

  • logging the details of the error via %+v in the log file
  • replacing the details by a SQL-level summary of the situation (via errors.Handled())

cc @andreimatei if you have suggestions as to how/where to do this? Perhaps the error could be summarized by txncoordsender before it comes into SQL?

cc @asubiotto @jordanlewis for triage.

Jira issue: CRDB-4177

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. A-error-handling Error messages, error propagations/annotations labels Jun 4, 2020
@andreimatei
Copy link
Contributor

I tried tracking through the allocator code to figure out what the error really means and how expected it is, but I got lost quickly. I think the error is generated by an AdminRelocateRange request (which is an unusual one), so I guess if we'd want a better error experience, we'd need to introduce a new ErrorDetail specific to this request type.
Or maybe the answer here is some sort of retry in the store.AdminRelocateRange(). But again I don't really know what the error means.

@knz
Copy link
Contributor Author

knz commented Jun 4, 2020

Can you help us determine who would know?

@andreimatei
Copy link
Contributor

I've got nothing. We can see if @tbg or @darinpp know anything, but probably any of us are just as good.
FWIW this test already ignores a bunch of errors coming from EXPERIMENTAL_RELOCATE:

func isExpectedRelocateError(err error) bool {

I think we should keep around #49513 for deflaking the test, but otherwise close this issue frankly. Focusing on the error ergonomics for experimental_relocate in general, and this one error in particular, doesn't seem too important to me.

@yuzefovich yuzefovich added this to Incoming in KV Jun 4, 2020
@asubiotto asubiotto moved this from Triage to [GENERAL BACKLOG] Enhancements/Features/Investigations in BACKLOG, NO NEW ISSUES: SQL Execution Jun 9, 2020
@lunevalex lunevalex moved this from Incoming to Transactions in KV Jul 29, 2020
@lunevalex lunevalex added the A-kv-client Relating to the KV client and the KV interface. label Jul 29, 2020
@lunevalex lunevalex moved this from Transactions to Cold storage in KV Apr 23, 2021
@jlinder jlinder added T-sql-queries SQL Queries Team T-kv KV Team labels Jun 16, 2021
@jordanlewis jordanlewis removed their assignment Jul 2, 2022
@tbg
Copy link
Member

tbg commented Jul 19, 2023

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-error-handling Error messages, error propagations/annotations A-kv-client Relating to the KV client and the KV interface. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. T-kv KV Team T-sql-queries SQL Queries Team
Projects
BACKLOG, NO NEW ISSUES: SQL Execution
[GENERAL BACKLOG] Enhancements/Featur...
KV
On Hold
Status: Backlog
Development

No branches or pull requests

6 participants