sql: periodically flush sqlstats #67090
Conversation
Force-pushed from e2e4ae2 to 6049ab3
Force-pushed from f5e5b0d to 70815a2
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Azhng)
pkg/sql/sqlstats/persistedsqlstats/sqlstatsutil/json_decoding.go, line 136 at r1 (raw file):
metadata json.JSON, result *roachpb.CollectedStatementStatistics,
) error {
	err := decodeString(metadata, "stmtTyp", &result.Stats.SQLType)
You've built a lot of machinery here but it still ends up being quite verbose to use. I think I'd rework it to be more composable. You can probably also rearrange it such that the decoding and encoding can be symmetrical for each of your top-level data types. Consider the following approach:
type jsonDecoder interface {
	decodeJSON(js json.JSON) error
}

type decimal tree.DDecimal

func (d *decimal) decodeJSON(js json.JSON) error {
	dec, ok := js.AsDecimal()
	if !ok {
		return errors.New("unable to decode decimal")
	}
	d.Decimal = *dec
	return nil
}

type jsonFloat float64

func (f *jsonFloat) decodeJSON(js json.JSON) error {
	var d decimal
	if err := d.decodeJSON(js); err != nil {
		return err
	}
	asFloat, err := d.Float64()
	if err != nil {
		return err
	}
	*f = jsonFloat(asFloat)
	return nil
}

type jsonField struct {
	field string
	dec   jsonDecoder
}

func (jf jsonField) decodeJSON(js json.JSON) (err error) {
	defer func() {
		if err != nil {
			err = errors.Wrapf(err, "decoding field %s", jf.field)
		}
	}()
	field, err := safeFetchVal(js, jf.field)
	if err != nil {
		return err
	}
	return jf.dec.decodeJSON(field)
}

type jsonFields []jsonField

func (jf jsonFields) decodeJSON(js json.JSON) error {
	for i := range jf {
		if err := jf[i].decodeJSON(js); err != nil {
			return err
		}
	}
	return nil
}

// Assuming numeric aliases roachpb.NumericStat here.
type numeric roachpb.NumericStat

func (n *numeric) decodeJSON(obj json.JSON) error {
	fields := [...]jsonField{
		{"mean", (*jsonFloat)(&n.Mean)},
		{"sqDiff", (*jsonFloat)(&n.SquaredDiffs)},
	}
	return jsonFields(fields[:]).decodeJSON(obj)
}

// ...
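Following the suggestion that decoding and encoding be symmetrical, the same jsonFields table could drive an encoder as well. A minimal sketch, assuming pkg/util/json's ObjectBuilder and FromFloat64 helpers (treat the exact signatures as assumptions; this is not code from the PR):

```go
type jsonEncoder interface {
	encodeJSON() (json.JSON, error)
}

func (f *jsonFloat) encodeJSON() (json.JSON, error) {
	// json.FromFloat64 is assumed to build a JSON number from a float64.
	return json.FromFloat64(float64(*f))
}

func (jf jsonFields) encodeJSON() (json.JSON, error) {
	builder := json.NewObjectBuilder(len(jf))
	for i := range jf {
		// Each field's decoder would also implement jsonEncoder, so a
		// top-level type declares its fields once for both directions.
		enc, ok := jf[i].dec.(jsonEncoder)
		if !ok {
			return nil, errors.Newf("field %s is not encodable", jf[i].field)
		}
		js, err := enc.encodeJSON()
		if err != nil {
			return nil, err
		}
		builder.Add(jf[i].field, js)
	}
	return builder.Build(), nil
}
```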
Force-pushed from ea32d16 to b34c11c
Force-pushed from cc95d99 to 92f2f92
TPCC-1000 benchmark on a 3-node roachprod cluster (16 vCPU each):
It seems the impact is an increase of about 2ms in average latency. TPCC runs for 2 hours, and I changed the flush interval to 15 minutes so it's easier to observe through a custom time-series chart. [charts: this PR vs. master, total flush count, average flush duration, in-memory footprint] cc: @jordanlewis
Very nice! The average flush duration seems a little suspicious - 20 microseconds is too fast for an entire roundtrip to SQL. Can you double check that?
I gathered more data from the master branch and the flush branch (with different flush intervals configured) using TPCC-1000. It seems we are not creating much latency impact: master averages 59.05ms p99 latency, while flush15 (15-minute flush interval) and flush60 average 59.79ms and 60.27ms p99 latency respectively. The increase in p99 latency is below 2ms. I'm choosing to ignore the pMax latency here because it fluctuates heavily even on the master branch, making it a poor indicator of overall performance.
Overall I think this is good news for us. The next step for me is to try to overload the cluster and see how the performance varies.
Force-pushed from caee1af to 55b15b0
67866: sql: implement SQL Stats flush logic r=Azhng a=Azhng

Previous PR: #67805
Next Chained PR: #67090

## First Commit

sql: remove `count` from stmt/txn stats system table

Previously, the system.statement_statistics and system.transaction_statistics tables included a `count` column corresponding to the `roachpb.StatementStatistics.Count` and `roachpb.TransactionStatistics.Count` fields respectively. The objective of that column was to make `INSERT ON CONFLICT DO UPDATE`-style queries easy. However, since early prototyping has shown that `INSERT ON CONFLICT DO UPDATE`-style statements are quite inefficient, the SQL Stats flush mechanism will be implemented using separate INSERT and UPDATE statements. This column is no longer useful and would require special handling. Removing this column simplifies the flush logic and removes the need for that special handling.

Release note (sql change): the count column is removed from the system.statement_statistics and system.transaction_statistics tables.

## Second Commit

sql: implement persistedsqlstats flush logic

This commit implements the initial flush logic of the persisted sql stats subsystem.

Release note: None

68426: kv: assert txn unused in SetFixedTimestamp r=nvanbenschoten a=nvanbenschoten

This commit asserts that a transaction has not been used to read or to write by the time that `SetFixedTimestamp` is called on it. This was extracted from #68194 and modified to return an error from `SetFixedTimestamp` on misuse instead of fatal-ing. This provides a sufficient, temporary backstop for #68216 until the conn executor logic is fixed:

```
root@127.0.0.1:26257/movr> create table t (x int);
CREATE TABLE

root@127.0.0.1:26257/movr> insert into t values (1);
INSERT 1

root@127.0.0.1:26257/movr> select crdb_internal_mvcc_timestamp, * from t;
   crdb_internal_mvcc_timestamp  | x
---------------------------------+----
  1628094563935439000.0000000000 | 1
(1 row)

root@127.0.0.1:26257/movr> begin as of system time (1628094563935439000.0000000000-1)::string;
BEGIN

root@127.0.0.1:26257/movr OPEN> select * from t;
  x
-----
(0 rows)

root@127.0.0.1:26257/movr OPEN> prepare y as select * from t as of system time 1628094563935439000.0000000000;
ERROR: internal error: cannot set fixed timestamp, txn "sql txn" meta={id=e5e81c19 pri=0.01517572 epo=0 ts=1628094563.935438999,0 min=1628094563.935438999,0 seq=0} lock=false stat=PENDING rts=1628094563.935438999,0 wto=false gul=1628094563.935438999,0 already performed reads
SQLSTATE: XX000
DETAIL: stack trace:
github.com/cockroachdb/cockroach/pkg/kv/kvclient/kvcoord/txn_coord_sender.go:1016: SetFixedTimestamp()
github.com/cockroachdb/cockroach/pkg/kv/txn.go:1200: SetFixedTimestamp()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_prepare.go:278: populatePrepared()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_prepare.go:220: func1()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_prepare.go:226: prepare()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_prepare.go:112: addPreparedStmt()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:570: execStmtInOpenState()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:126: execStmt()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1626: func1()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1628: execCmd()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1550: run()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:627: ServeConn()
github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:645: func1()
runtime/asm_amd64.s:1371: goexit()
HINT: You have encountered an unexpected error. Please check the public issue tracker to check whether this problem is already tracked. If you cannot find it there, please report the error with details by creating a new issue. If you would rather not post publicly, please contact us directly using the support form. We appreciate your feedback.

root@127.0.0.1:26257/? ERROR>
```

68442: kv: include RangeID in rangefeed goroutine stacks r=nvanbenschoten a=nvanbenschoten

This commit includes the RangeID in each of a rangefeed processor and its registrations' associated goroutine stacks. This is a cheap and easy way to get better observability into the ranges that have active rangefeeds. It also tells us where those goroutines are spending their time. This will also become easier to use in Go 1.17, which improved the format of stack traces.

68443: parser: add VIRTUAL syntax to help r=RaduBerinde a=rafiss

Release note: None

Co-authored-by: Azhng <archer.xn@gmail.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
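Regarding the first commit's rationale, the "separate INSERT and UPDATE statements" approach can be sketched roughly as below. This is a hypothetical illustration using database/sql, with simplified column shapes (fingerprint_id, statistics) and stats merging elided; it is not the PR's actual executor code:

```go
package main

import (
	"context"
	"database/sql"
)

// upsertStmtStats first tries to UPDATE an existing row; only if no row
// matched does it INSERT a new one, avoiding INSERT ... ON CONFLICT.
func upsertStmtStats(ctx context.Context, tx *sql.Tx, fingerprintID, statsJSON string) error {
	res, err := tx.ExecContext(ctx,
		`UPDATE system.statement_statistics SET statistics = $2 WHERE fingerprint_id = $1`,
		fingerprintID, statsJSON)
	if err != nil {
		return err
	}
	if n, err := res.RowsAffected(); err != nil {
		return err
	} else if n > 0 {
		return nil // updated an existing row
	}
	// No existing row for this fingerprint: insert a fresh one.
	_, err = tx.ExecContext(ctx,
		`INSERT INTO system.statement_statistics (fingerprint_id, statistics) VALUES ($1, $2)`,
		fingerprintID, statsJSON)
	return err
}
```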
nit: Maybe remove the big First Commit header from the PR description now?
Reviewed 5 of 7 files at r23, 23 of 23 files at r24.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @jordanlewis, and @maryliag)
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 100 at r24 (raw file):
select {
case <-resetIntervalChanged:
	// Recalculate flush interval.
IIUC, in this case we stop waiting, flush immediately, and then the post statement of the for loop calculates the next flush interval / wait period, yes? That might be worth spelling out a bit more in this comment. I had to scratch my head for a bit.
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 103 at r24 (raw file):
	continue
case <-timer.C:
	timer.Read = true
This is the "normal" case, right? Can we put it first in the list?
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 104 at r24 (raw file):
case <-timer.C:
	timer.Read = true
case <-s.memoryPressureSignal:
I could use a comment here, too. Would it be something like, "We're experiencing memory pressure, so flush SQL stats to disk immediately, rather than waiting the full flush interval, in an attempt to relieve some of that pressure?"
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 120 at r24 (raw file):
// [(1 - SQLStatsFlushJitter) * SQLStatsFlushInterval,
//  (1 + SQLStatsFlushJitter) * SQLStatsFlushInterval]
func (s *PersistedSQLStats) getWaitPeriod() time.Duration {
I see how (with the jitter) this is different from the "flush interval," but I wonder if reusing those words would help clarify a little more. So maybe this method could be called `nextFlushInterval` or something?
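For concreteness, the jittered wait period described in that comment could be computed like so. This is a minimal standalone sketch; `nextWaitPeriod`, the jitter fraction, and the rng parameter are illustrative names rather than the PR's actual code:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// nextWaitPeriod samples uniformly from
// [(1 - jitter) * interval, (1 + jitter) * interval].
func nextWaitPeriod(interval time.Duration, jitter float64, rng *rand.Rand) time.Duration {
	frac := 1 + (2*rng.Float64()-1)*jitter
	return time.Duration(frac * float64(interval))
}

func main() {
	rng := rand.New(rand.NewSource(42))
	// With a 1h interval and 0.15 jitter, prints a duration in [51m, 69m].
	fmt.Println(nextWaitPeriod(time.Hour, 0.15, rng))
}
```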
This commit introduces a new persisted sql stats subsystem that wraps the existing node-local sql stats subsystem. This new subsystem is responsible for flushing the in-memory statistics into the system table periodically, or when it detects memory pressure. This replaces sql.Server's in-memory sqlStats provider.

Release note (sql change): SQL stats can now be persisted into the system.statement_statistics and system.transaction_statistics tables by enabling the sql.stats.flush.enable cluster setting. The interval of persistence is determined by the new sql.stats.flush.interval cluster setting, which defaults to 1 hour.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, @maryliag, and @matthewtodd)
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 100 at r24 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
IIUC, in this case we stop waiting, flush immediately, and then the post statement of the for loop calculates the next flush interval / wait period, yes? That might be worth spelling out a bit more in this comment. I had to scratch my head for a bit.
In this case, we would stop waiting but we do not flush immediately. The `continue` here causes the loop to restart and skip the flush. I'll add a comment here.
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 103 at r24 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
This is the "normal" case, right? Can we put it first in the list?
Done.
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 104 at r24 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
I could use a comment here, too. Would it be something like, "We're experiencing memory pressure, so flush SQL stats to disk immediately, rather than waiting the full flush interval, in an attempt to relieve some of that pressure?"
Done.
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 120 at r24 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
I see how (with the jitter) this is different from the "flush interval," but I wonder if reusing those words would help clarify a little more. So maybe this method could be called `nextFlushInterval` or something?
Done.
Reviewed 7 of 7 files at r25.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @Azhng, @jordanlewis, and @maryliag)
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 100 at r24 (raw file):
Previously, Azhng (Archer Zhang) wrote…
In this case, we would stop waiting but we do not flush immediately. The `continue` here causes the loop to restart and skip the flush. I'll add a comment here.
Oh, I see. That's interesting. I'm curious to know if there's a particular reason we don't flush here; is there something we're trying to protect against?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, @maryliag, and @matthewtodd)
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 100 at r24 (raw file):
Previously, matthewtodd (Matthew Todd) wrote…
Oh, I see. That's interesting. I'm curious to know if there's a particular reason we don't flush here; is there something we're trying to protect against?
The reset interval change occurs when a user runs the `SET CLUSTER SETTING sql.stats.flush.interval = ...` command. I think it would be a bit unexpected if changing this cluster setting caused a flush, since neither has the timer fired nor has the memory limit been reached.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, and @maryliag)
pkg/sql/sqlstats/persistedsqlstats/provider.go, line 100 at r24 (raw file):
Previously, Azhng (Archer Zhang) wrote…
The reset interval change occurs when a user runs the `SET CLUSTER SETTING sql.stats.flush.interval = ...` command. I think it would be a bit unexpected if changing this cluster setting caused a flush, since neither has the timer fired nor has the memory limit been reached.
👍
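To tie the thread together, here is a minimal self-contained sketch of the select semantics discussed above, using the standard library's time.Timer instead of CockroachDB's timeutil.Timer. The s.memoryPressureSignal and interval-change channels follow the review excerpts; everything else is illustrative, not the PR's code:

```go
package sqlstats

import "time"

type PersistedSQLStats struct {
	memoryPressureSignal chan struct{}
}

// flushLoop flushes on timer expiry or memory pressure; on an interval
// change it only recomputes the wait period.
func (s *PersistedSQLStats) flushLoop(
	nextFlushInterval func() time.Duration,
	intervalChanged, stop <-chan struct{},
	flush func(),
) {
	timer := time.NewTimer(nextFlushInterval())
	defer timer.Stop()
	for {
		select {
		case <-timer.C:
			// Normal case: the (jittered) flush interval elapsed.
		case <-s.memoryPressureSignal:
			// Memory pressure: flush immediately instead of waiting out
			// the full interval. Stop and drain the timer so the Reset
			// below is safe.
			if !timer.Stop() {
				<-timer.C
			}
		case <-intervalChanged:
			// The cluster setting changed: recompute the wait period but
			// skip the flush, since neither the timer nor the memory
			// limit fired.
			if !timer.Stop() {
				<-timer.C
			}
			timer.Reset(nextFlushInterval())
			continue
		case <-stop:
			return
		}
		flush()
		timer.Reset(nextFlushInterval())
	}
}
```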
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, and @maryliag)
TFTR! bors r+
Build succeeded:
68434: sql,jobs: create SQL Stats Compaction Job and resumer r=ajwerner,maryliag a=Azhng

Previous PR #67090
Followed by #68401

## First Commit

sql,util: refactor checking for running jobs to its own util package

Release note: None

## Second Commit

sql,jobs: create SQL Stats Compaction Job and resumer

This commit introduces the SQL Stats Compaction job type and a barebones implementation of the SQL Stats compaction.

Release justification: category 4
Release note: None

69294: changefeedccl: small test changes r=stevendanna a=stevendanna

Two small test changes:
- Use log.TestingClearServerIdentifiers() to avoid opening a new log scope.
- Don't start a schema registry when the kafka sink will already start one for us.

Release justification: low risk, test only changes (category 1)
Release note: None

Co-authored-by: Azhng <archer.xn@gmail.com>
Co-authored-by: Steven Danna <danna@cockroachlabs.com>
Previous PR: #67866
This commit introduces a new persisted sql stats subsystem
that wraps the existing node-local sql stats subsystem.
This new subsystem is responsible for flushing the in-memory
statistics into the system table periodically, or when it
detects memory pressure.
This replaces sql.Server's in-memory sqlStats provider.
Release note (sql change): SQL stats can now be persisted into
system.statement_statistics and system.transaction_statistics
tables by enabling the sql.stats.flush.enable cluster setting.
The interval of persistence is determined by the new
sql.stats.flush.interval cluster setting which defaults to 1 hour.