Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

domain, session: Add new sysvarcache to replace global values cache #24359

Merged
merged 30 commits into from
May 17, 2021

Conversation

morgo
Copy link
Contributor

@morgo morgo commented Apr 28, 2021

What problem does this PR solve?

Issue Number: close #24326

Problem Summary:

The existing global vars cache only caches for 2 seconds, and does not perform notification to other servers when cache is invalid. This PR changes the design to be basically the same as the privileges system cache.

It fixes two immediate bugs:

  • Read after write consistency from setting global vars (on the initiating instance only)
  • Running SHOW VARIABLES for the first time in a session previously took 1 second in some cases(!). It should now read the values from memory.

What is changed and how it works?

What's Changed:

  • The global vars cache is replaced with a new cache.
  • All operations check the cache first rather than read from mysql.global_variables.
  • Updating a global variable sends a notification to other servers via etcd.
  • Updating a global variable is read-after-write consistent on a single server.

This opens the door, but does not fix some remaining issues - the session cache should be populated with a copy of all session vars when the session starts. Currently there is a lazy loading mechanism which is not MySQL compatible. It also caches some global variables in the session systems[] which is incorrect. In a followup PR I hope to remove the array builtinGlobalVariable and simplify loadCommonGlobalVariablesIfNeeded to just copy session vars.

Related changes

Check List

Tests

  • Unit test
  • Manual test (add detailed scripts or steps below)

I manually verified that behavior is correct with multiple servers receiving notice from etcd.

Side effects

  • There is an small upgrade issue with this PR: because the cache now lives for a long time, it relies on a notification from etcd that the cache is stale and needs to be refreshed. If the cluster is running with mixed-versions, and the SET GLOBAL statement is run on the older version, it will not send an etcd notification to newer servers that their cache is stale. I think this behavior is acceptable, since the cache will be refreshed within a few minutes, but it should be made clear in the release notes.

Release note

  • The system variables cache has been replaced to follow a similar design to privileges cache, where all TiDB servers are immediately notified of changes to variables. This lifts a previous limitation where changes to global variables may take 2 seconds to take effect. To take advantage of this feature, all TiDB servers need to be upgraded so that they are able to both publish and subscribe to system variable changes. Operating with mixed versions might result in an issue where system variable changes take longer (30 seconds) to propagate to the cluster.

@morgo morgo requested review from a team as code owners April 28, 2021 18:59
@morgo morgo requested review from qw4990 and removed request for a team April 28, 2021 18:59
@ti-chi-bot ti-chi-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Apr 28, 2021
@morgo morgo added the sig/sql-infra SIG: SQL Infra label Apr 28, 2021
@github-actions github-actions bot added the sig/execution SIG execution label Apr 28, 2021
@morgo
Copy link
Contributor Author

morgo commented Apr 28, 2021

/run-all-tests

@morgo
Copy link
Contributor Author

morgo commented Apr 28, 2021

/run-all-tests

@morgo morgo requested a review from tiancaiamao April 28, 2021 20:59
if !ok {
logutil.BgLogger().Error("LoadSysVarCacheLoop loop watch channel closed")
watchCh = do.etcdClient.Watch(context.Background(), sysVarCacheKey)
count++
Copy link
Contributor

@zhangjinpeng87 zhangjinpeng87 Apr 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it will do endless retrying if there is something wrong with PD?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This code is copied from the loop for privilege cache. If the server is up, it should keep retrying.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I remember the design of endless retrying PD connectivity came up in the jepsen test. This was the recommended behavior.


count = 0
logutil.BgLogger().Info("Rebuilding sysvar cache from etcd watch event.")
err := do.sysVarCache.RebuildSysVarCache(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think ReBuildSysVarCache Asyncrhonizely can guarantee the read-after-write consistency. After one tidb-server set the global variable and notify the etcd, another tidb-server may still use the old value in a short time range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should clarify: It is read-after-write on a single server only. This is still very useful because it simplifies things like the testsuite.

The reason why it offers read-after-write is that the notify function runs the rebuild task on the initiating server immediately. So the initiating server actually rebuilds the cache twice (same design as privilege cache).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the corresponding PR where the privilege cache was changed to run immediately on the initiating server: #8886

@xhebox
Copy link
Contributor

xhebox commented Apr 29, 2021

#22808 could be fixed by this PR. And we may need to correct the document accordingly. @morgo

@morgo
Copy link
Contributor Author

morgo commented Apr 29, 2021

#22808 could be fixed by this PR. And we may need to correct the document accordingly. @morgo

Yes, I can confirm this will be fixed by this PR. I will update the docs to remove the note about the 2 second delay from the docs when this merges (I agree it's misleading, I totally assumed it was sending a notification to other servers via etcd until I dug into this further).

There are a lot of scripts that could easily naively assume on a single-server changes have read-after-write semantics.

@purelind
Copy link
Contributor

purelind commented May 2, 2021

/run-all-tests

@morgo
Copy link
Contributor Author

morgo commented May 13, 2021

/run-all-tests

@morgo
Copy link
Contributor Author

morgo commented May 13, 2021

/run-unit-test
/run-integration-common-test

@morgo
Copy link
Contributor Author

morgo commented May 13, 2021

There seems to be a DATA RACE:

[2021-05-13T04:04:17.499Z] fatal error: concurrent map iteration and map write
[2021-05-13T04:04:17.499Z] 
[2021-05-13T04:04:17.499Z] goroutine 26001 [running]:
[2021-05-13T04:04:17.499Z] runtime.throw(0x3998cc3, 0x26)
[2021-05-13T04:04:17.499Z] 	/usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc068679bc8 sp=0xc068679b98 pc=0x1355f12
[2021-05-13T04:04:17.499Z] runtime.mapiternext(0xc068679cf8)
[2021-05-13T04:04:17.499Z] 	/usr/local/go/src/runtime/map.go:858 +0x579 fp=0xc068679c50 sp=0xc068679bc8 pc=0x1335e39
[2021-05-13T04:04:17.499Z] github.com/pingcap/tidb/domain.(*SysVarCache).RebuildSysVarCache(0xc06647a8f0, 0x3ff6780, 0xc064bd11e0, 0x0, 0x0)
[2021-05-13T04:04:17.499Z] 	/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/domain/sysvar_cache.go:121 +0x127 fp=0xc068679d68 sp=0xc068679c50 pc=0x25eae87
[2021-05-13T04:04:17.499Z] github.com/pingcap/tidb/domain.(*Domain).LoadSysVarCacheLoop.func1(0xc06647a870, 0xc066d9e1a0, 0x6fc23ac00, 0x3ff6780, 0xc064bd11e0)
[2021-05-13T04:04:17.499Z] 	/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/domain/domain.go:964 +0x224 fp=0xc068679fb8 sp=0xc068679d68 pc=0x25ee524
[2021-05-13T04:04:17.499Z] runtime.goexit()
[2021-05-13T04:04:17.499Z] 	/usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc068679fc0 sp=0xc068679fb8 pc=0x1386e41
[2021-05-13T04:04:17.499Z] created by github.com/pingcap/tidb/domain.(*Domain).LoadSysVarCacheLoop
[2021-05-13T04:04:17.499Z] 	/home/jenkins/agent/workspace/tidb_ghpr_check_2/go/src/github.com/pingcap/tidb/domain/domain.go:938 +0xef

Thanks! This is fixed in 46d5b31

@morgo morgo requested a review from tiancaiamao May 14, 2021 16:59
@tiancaiamao
Copy link
Contributor

/LGTM

@ti-chi-bot
Copy link
Member

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • tiancaiamao
  • xhebox

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by writing /lgtm in a comment.
Reviewer can cancel approval by writing /lgtm cancel in a comment.

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 17, 2021
@morgo
Copy link
Contributor Author

morgo commented May 17, 2021

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: be30709

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label May 17, 2021
@morgo
Copy link
Contributor Author

morgo commented May 17, 2021

/run-unit-test

@ti-chi-bot ti-chi-bot merged commit 0f10bef into pingcap:master May 17, 2021
@morgo morgo deleted the sysvar-cache branch May 17, 2021 14:13
Howie59 pushed a commit to Howie59/tidb that referenced this pull request May 21, 2021
…ingcap#24052)

* *: fix revoke statement for CURRENT_USER() and refine error message

planner: support set tidb_allow_mpp to `2` or `ENFORCE` to enforce use mpp mode. (pingcap#24516)

store/tikv: remove use of SchemaAmender option in store/tikv (pingcap#24408)

*: the value of tikv-client.store-liveness-timeout should not less than 0 (pingcap#24244)

store/tikv: remove use of EnableAsyncCommit option in store/tikv (pingcap#24462)

txn: Add txn state's view (pingcap#22908)

planner: ignore lock for temporary table of PointGet and BatchPointGet (pingcap#24540)

store/tikv: remove use of ReplicaRead transaction option in store/tikv (pingcap#24409)

store/driver: move error to single package (pingcap#24549)

ddl: add check table compatibility for temporary table (pingcap#24501)

store/tikv: remove use of IsStatenessReadOnly option in store/tikv (pingcap#24464)

store/tikv: change backoff type for missed tiflash peer. (pingcap#24577)

store/tikv: remove use of MatchStoreLabels transaction option in store/tikv (pingcap#24465)

executor, meta: Allocate auto id for global temporary tables (pingcap#24506)

store/tikv: remove use of SampleStep option in store/tikv (pingcap#24461)

executor: add partition pruning tests for adding and dropping partition operations (pingcap#24573)

ddl: forbid partition on temporary mode before put into queue (pingcap#24565)

ddl: speedup test case TestIndexOnMultipleGeneratedColumn (pingcap#24487)

execution: Fix issue 24439 Inconsistent error with MySQL for GRANT CREATE USER ON <specific db>.* (pingcap#24485)

*: fix errcheck (pingcap#24463)

test: make TestExtractStartTs stable (pingcap#24585)

ddl: forbid recover/flashback temporary tables (pingcap#24518)

executor: fix point_get result on clustered index when new-row-format disabled but new-collation enabled (pingcap#24544)

executor: Improve the performance of appending not fixed columns (pingcap#20969)

*: typo fix (pingcap#24564)

planner/core: refresh stale regions in cache for batch cop response (pingcap#24457)

binlog: DML on temporary tables do not write binlog (pingcap#24570)

store/tikv: remove use of GuaranteeLinearizability option in store/tikv (pingcap#24605)

store/tikv: remove use of CollectRuntimeStats option in store/tikv (pingcap#24604)

store/tikv: move Backoffer into a single package (pingcap#24525)

variables: init cte max recursive deeps in a new session (pingcap#24609)

store/tikv: move transaction options out to /kv (pingcap#24619)

store/driver: move backoff driver into single package so we can use i… (pingcap#24624)

server: close the temporary session in HTTP API to avoid memory leak (pingcap#24339)

store/tikv: use latest PD TS plus one as min commit ts (pingcap#24579)

planner: fix incorrect TableDual plan built from nulleq (pingcap#24596)

ranger: fix the case which could have duplicate ranges (pingcap#24590)

 executor, store: Pass the SQL digest down to pessimistic lock request (pingcap#24380)

kv: remove UnionStore interface (pingcap#24625)

*: enable gosimple linter (pingcap#24617)

txn: avoid the gc resolving pessimistic locks of ongoing transactions (pingcap#24601)

util: fix wrong enum building for index range  (pingcap#24632)

sessionctx: change innodb large prefix default (pingcap#24555)

store: fix data race about KVStore.tikvClient (pingcap#24655)

executor, privileges: Add dynamic privileges to SHOW PRIVILEGES (pingcap#24646)

ddl: refactor rule [4/6] (pingcap#24007)

cmd: ddl_test modify retryCnt from 5 to 20 (pingcap#24662)

executor: add correctness tests about direct reading with ORDER BY and LIMIT (pingcap#24455)

store/tikv: remove options from unionstore (pingcap#24629)

planner: fix wrongly check for update statement (pingcap#24614)

store/tikv: remove CompareTS (pingcap#24657)

planner, privilege: Add security enhanced mode part 4 (pingcap#24416)

executor: add some test cases about partition table dynamic-mode with split-region (pingcap#24665)

planner: fix wrong column offsets when processing dynamic pruning for IndexJoin (pingcap#24659)

*: Add security enhanced mode part 3 (pingcap#24412)

store/tikv: resolve ReplicaReadType dependencies (pingcap#24653)

executor: add test cases about partition table with `expression` (pingcap#24628)

tablecodec: fix write wrong prefix index value when collation is ascii_bin/latin1_bin (pingcap#24578)

*: compatibility with staleread (pingcap#24285)

session: test that temporary tables will also be retried (pingcap#24505)

domain, session: Add new sysvarcache to replace global values cache (pingcap#24359)

ddl, transaction: DDL on temporary tables won't affect transactions (pingcap#24534)

*: implement tidb_bounded_staleness built-in function (pingcap#24328)

executor: add correctness tests for partition table with different joins (pingcap#24673)

expression: fix the spelling of word arithmetical (pingcap#24713)

store/copr: balance region for batch cop task (pingcap#24521)

store, metrics: Add metrics for safetTS updating (pingcap#24687)

sem: add tidbredact log to restricted variables (pingcap#24701)

session: fix dml_batch_size doesn't load the global variable (pingcap#24710)

store/tikv: retry TSO RPC (pingcap#24682)

expression, planner: push cast down to control function with enum type. (pingcap#24542)

executor: add correctness tests about IndexMerge (pingcap#24674)

variable: change default for DefDMLBatchSize tidbOptInt64 call (pingcap#24697)

planner: add partitioning pruning tests for range partitioning (pingcap#24554)

*: add option for enum push down (pingcap#24685)

txn: break dependency from store/tikv to tidb/kv cause by TransactionOption (pingcap#24656)

executor: enhancement for ListInDisk(support writing after reading) (pingcap#24379)

kv: move TxnScope into kv (pingcap#24715)

execution: fix the incorrect use of cached plan for point get (pingcap#24749)

executor: add correctness tests about direct reading with indexJoin (pingcap#24497)

variable:  fix sysvar datarace with deep copy (pingcap#24732)

*: Implementing RENAME USER (pingcap#24413)

infoschema, executor: Add the deadlock table (pingcap#24524)

docs: Some proposal for renaming and configurations for Lock View (pingcap#24718)

planner: add range partition boundaries tests with BETWEEN expression (pingcap#24598)

oracle: simplify timestamp utilities (pingcap#24688)

executor: fix wrong enum key in point get (pingcap#24618)

ranger: fix incorrect enum range for xxx_ci collation (pingcap#24661)

executor: add some test cases about dynamic-mode and apply operator (pingcap#24683)

store/tikv: remove Variables.Hook (pingcap#24758)

ddl: speed up the execution time of `TestBackwardCompatibility`. (pingcap#24704)

*: prepare errors for CTE (pingcap#24763)

expression: support cast real/int as real (pingcap#24670)

executor: add table name in log (pingcap#24666)

expression: add builtin function ``json_pretty`` (pingcap#24675)

ddl: make `TestDropLastVisibleColumns` stable (pingcap#24790)

* ddl: make `TestDropLastVisibleColumns` stable

*: support AS OF TIMESTAMP read-only begin statement (pingcap#24740)

executor: Fix unstable TestTiDBLastTxnInfoCommitMode (pingcap#24779)

planner: add tests for partition range boundaries for LT/GT (pingcap#24574)

test: record random seed in TestIssue20658 (pingcap#24782)

store/tikv/retry: define Config instead of BackoffType (pingcap#24692)

config: ignore tiflash when show config (pingcap#24770)

privileges: improve dynamic privs registration and tests (pingcap#24773)

README: remove the link of TiDB Monthly Update (pingcap#24791)

region_cache: filter peers on tombstone or dropped stores (pingcap#24726)

util/stmtsummary: remove import package tikv (pingcap#24776)

ddl: grammar check for create unsupported temporary table (pingcap#24723)

*: update go.etcd.io/bbolt (pingcap#24799)

ddl: speed up the execution time of `ddl test` and `Test Chunk pingcap#7 ddl-other` (pingcap#24780)

executor: remove the unnecessary use of fmt.Sprintf (pingcap#24815)

executor: fix index join panic on prefix index on some cases (pingcap#24568)
@lzmhhh123
Copy link
Contributor

/run-cherry-picker

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Jul 8, 2021
Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
@ti-srebot
Copy link
Contributor

cherry pick to release-4.0 in PR #26030

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution sig/sql-infra SIG: SQL Infra size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

show session variables is very slow on first run
8 participants