Throttled transactions return MySQL error code 1041 ER_OUT_OF_RESOURCES #12949


ejortegau
Contributor

@ejortegau ejortegau commented Apr 21, 2023

Description

This PR makes transactions throttled by the Transaction Throttler return MySQL error code 1041 ER_OUT_OF_RESOURCES. That code seems better suited to signalling that the server is throttling transactions because of some form of resource contention than the current code, 1203 ER_TOO_MANY_USER_CONNECTIONS.
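
To make the client-visible effect concrete, here is a minimal Go sketch of the error a throttled transaction would carry after this change. It is an illustration under assumptions, not the Vitess code touched by this PR: the `throttledError` type, the `newThrottledError` function and the message text are hypothetical. The two error numbers come from this description, and `HY000` is the SQLSTATE MySQL documents for error 1041.

```go
package main

import "fmt"

// MySQL error numbers named in this PR description.
const (
	erOutOfResources         = 1041 // ER_OUT_OF_RESOURCES: proposed code for throttled transactions
	erTooManyUserConnections = 1203 // ER_TOO_MANY_USER_CONNECTIONS: code returned before this change
)

// throttledError is a hypothetical stand-in for the error sent to the client.
// It is not a Vitess type.
type throttledError struct {
	Number   uint16
	SQLState string
	Message  string
}

func (e *throttledError) Error() string {
	return fmt.Sprintf("%s (errno %d) (sqlstate %s)", e.Message, e.Number, e.SQLState)
}

// newThrottledError builds the error for a throttled transaction using the
// new code. The message text is an assumption made for this sketch.
func newThrottledError() *throttledError {
	return &throttledError{
		Number:   erOutOfResources,
		SQLState: "HY000",
		Message:  "transaction throttled",
	}
}

func main() {
	fmt.Println(newThrottledError())
}
```

Only the number and SQLSTATE seen by the client change; whether a transaction is throttled is decided elsewhere.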

Related Issue(s)

#12958

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation was added or is not required

Deployment Notes

N/A

This error code seems better suited to represent the fact that transactions are
being throttled by the server due to some form of resource contention than the
current code 1203 ER_TOO_MANY_USER_CONNECTIONS.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
@vitess-bot vitess-bot bot added the NeedsDescriptionUpdate (the description is not clear or comprehensive enough, and needs work) and NeedsWebsiteDocsUpdate (what it says) labels Apr 21, 2023
@vitess-bot
Contributor

vitess-bot bot commented Apr 21, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a test is added or modified, there should be documentation on top of the test to explain what the expected behavior is and what the test does.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@github-actions github-actions bot added this to the v17.0.0 milestone Apr 21, 2023
@ejortegau ejortegau marked this pull request as ready for review April 24, 2023 09:39
@systay
Collaborator

systay commented Apr 24, 2023

@shlomi-noach do you have opinions on this?

Contributor

@shlomi-noach shlomi-noach left a comment

ER_OUT_OF_RESOURCES means "out of memory" in MySQL's docs. So this, too, is perhaps not the most descriptive error?

@timvaillancourt
Contributor

timvaillancourt commented Apr 24, 2023

> ER_OUT_OF_RESOURCES means "out of memory" in MySQL's docs. So this, too, is perhaps not the most descriptive error?

It doesn't seem any existing MySQL error is a great fit, so I think we're stuck with "the best we can find" or proposing a new error code to the upstream project, but this new code wouldn't exist on current versions. Maybe the lack of a clear winner means it should be configurable?

ER_OUT_OF_RESOURCES feels like a generic resource saturation error, but agreed that the docs make it sound as though it's for memory only. My take is that it's a generic error that happens to be used only for memory today, but that's a guess. To make matters more confusing, there is also the error code ER_OUTOFMEMORY 🤷

ER_ERROR_DURING_BEGIN would work if it existed, but only ER_ERROR_DURING_COMMIT exists and that would be misleading as we don't throttle COMMIT. ER_ABORTING might work, but it's 8.0+ only and technically nothing is "aborted", it never "began"

ER_UNKNOWN_ERROR might work, but it removes all context and sounds like something unexpected happened. Another is ER_QUERY_INTERRUPTED: it usually means the query was KILLed, but it could work. The error message "Query execution was interrupted" seems vague yet accurate.

Some more alternatives I found that I feel are less-ideal:

  1. ER_ERROR_WHEN_EXECUTING_COMMAND
    • Errored yes, but very vague
    • Was it "executing" if it never ran BEGIN 🤔?
  2. ER_LOCK_WAIT_TIMEOUT
    • "Lock" is very misleading, but this DOES generally mean "retry your change again, I didn't change anything"
  3. ER_USER_LIMIT_REACHED or ER_TOO_MANY_USER_CONNECTIONS
    • "User" and "User connections" is misleading. User quotas aren't in play here
    • Seems ER_TOO_MANY_USER_CONNECTIONS is the default of the switch block in existing code

Given these suboptimal options I feel "OUT_OF_RESOURCES" gives the user the best picture of what's going on, although the docs will make it sound like it's memory consumption related 🤔

@shlomi-noach
Contributor

Go wild with ER_OUT_OF_RESOURCES

@timvaillancourt
Contributor

> Go wild with ER_OUT_OF_RESOURCES

Oh! After reviewing the 8.x+ error codes I found a decent fit:

ER_RESOURCE_GROUP_BUSY. Of course, we're not really a "resource group" but we ARE measuring "resources" and want to tell the client we are "busy". The concepts have some overlap 🤔

Message: Resource group %s is busy.

This doesn't really help 5.x though. Curious what thoughts you had @ejortegau / @shlomi-noach?

@shlomi-noach
Contributor

> This doesn't really help 5.x though.

Not sure if it matters at all to users of 5.x? It's about the error number the client gets, right? This doesn't go through MySQL in any way.

@timvaillancourt
Contributor

timvaillancourt commented Apr 25, 2023

> > This doesn't really help 5.x though.
>
> Not sure if it matters at all to users of 5.x? It's about the error number the client gets, right? This doesn't go through MySQL in any way.

That's a good point. I guess the question is how a 5.7 mysql shell and/or application driver would handle an error code that only exists in 8.x+. My guess is that it handles this, but I wonder if the error message would be clear 🤔

I'll find some time to investigate this scenario 👍

@ejortegau
Contributor Author

I would go with 1041 for now - seems the least bad one IMHO. But I am happy to change it if you think it's better - the nature of the change as you can see from the PR is quite simple.

@timvaillancourt
Contributor

> I would go with 1041 for now - seems the least bad one IMHO. But I am happy to change it if you think it's better - the nature of the change as you can see from the PR is quite simple.

@ejortegau 1041 sounds good to me 👍

@harshit-gangal harshit-gangal added the Type: Enhancement (logical improvement, somewhere between a bug and feature) and Component: Query Serving labels, and removed the NeedsDescriptionUpdate (the description is not clear or comprehensive enough, and needs work), NeedsWebsiteDocsUpdate (what it says) and Component: Query Serving labels May 2, 2023
@harshit-gangal harshit-gangal merged commit a2bf80a into vitessio:main May 2, 2023
timvaillancourt pushed a commit to slackhq/vitess that referenced this pull request May 10, 2023
…ES (vitessio#12949)

This error code seems better suited to represent the fact that transactions are
being throttled by the server due to some form of resource contention than the
current code 1203 ER_TOO_MANY_USER_CONNECTIONS.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request May 10, 2023
…ES (vitessio#12949) (#81)

This error code seems better suited to represent the fact that transactions are
being throttled by the server due to some form of resource contention than the
current code 1203 ER_TOO_MANY_USER_CONNECTIONS.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Co-authored-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
timvaillancourt pushed a commit to slackhq/vitess that referenced this pull request May 27, 2024
…ES (vitessio#12949)

This error code seems better suited to represent the fact that transactions are
being throttled by the server due to some form of resource contention than the
current code 1203 ER_TOO_MANY_USER_CONNECTIONS.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request May 28, 2024
* Skip recalculating the rate in MaxReplicationLagModule when it can't be done (vitessio#12620)

* Skip recalculating the rate in MaxReplicationLagModule when it can't be done

This defends against lag records with nil stats which can lead to segfaults.
See vitessio#12619

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

---------

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Throttled transactions return MySQL error code 1041 ER_OUT_OF_RESOURCES (vitessio#12949)

This error code seems better suited to represent the fact that transactions are
being throttled by the server due to some form of resource contention than the
current code 1203 ER_TOO_MANY_USER_CONNECTIONS.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* MaxReplicationLagModule.recalculateRate no longer fills the log (vitessio#14875)

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

---------

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Co-authored-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Labels
Component: Query Serving Type: Enhancement Logical improvement (somewhere between a bug and feature)

5 participants