sql: track and display counters for successfully executed statements #37264

Merged

4 participants
@ajwerner
Collaborator

commented May 2, 2019

Before this PR, statement counters were incremented before executing a SQL
statement. This is problematic because it means that the counters include
statements which fail during execution. This changes the logic to increment the
counters when statements complete successfully.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.

@ajwerner requested a review from andreimatei on May 2, 2019

@ajwerner requested review from cockroachdb/sql-execution-prs as code owners on May 2, 2019

@cockroach-teamcity

Member

commented May 2, 2019

This change is Reviewable

@ajwerner

Collaborator Author

commented May 2, 2019

@piyush-singh I know we talked about this in the past, is there an issue I can link this to?

@piyush-singh


commented May 2, 2019

@andreimatei
Member

left a comment

I dunno bro, there's nothing in the metadata of those counters that suggests anything about "success".
I think we should have total counters and error counters, no?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

@ajwerner

Collaborator Author

commented May 2, 2019

> I dunno bro, there's nothing in the metadata of those counters that suggests anything about "success".
> I think we should have total counters and error counters, no?
>
> Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)

Sure, it's just weird when you have a cluster where most of your requests are timing out and the topmost chart looks like your throughput is going up but in fact it's going down. We don't have an easy way to subtract two time series from each other for display in the admin UI (a limitation of the tsdb, I suppose).

How about I add two more metrics, failure and success for each statement type and then propose migrating the admin UI to the success metric?

@andreimatei

Member

commented May 2, 2019

> Sure, it's just weird when you have a cluster where most of your requests are timing out and the topmost chart looks like your throughput is going up but in fact it's going down.

This is only true for benchmark clients that spam the server with as many queries as they can, not for real apps where throughput presumably stays flat.

> How about I add two more metrics, failure and success for each statement type and then propose migrating the admin UI to the success metric?

I think what I'd do is simply copy the errors metric to the overview metrics dashboard, just below SQL Queries.

@ajwerner

Collaborator Author

commented May 2, 2019

I tried to do some digging on what others do. From a cursory glance I couldn't figure out the MySQL metric semantics. Here's what I dug out of some public grafana dashboard templates:

Cassandra: Uses count from their latency histogram which implies success
Postgres: Shows transaction commits and rollbacks and rows read and rows returned, nothing about number of statements.

I'm going to update this diff to track all 3 things and we can have a separate conversation with @piyush-singh about the right thing to display. I'll admit that this will collect redundant data but if having a few extra counters breaks the bank we've got bigger problems. Reasonable?

@ajwerner

Collaborator Author

commented May 2, 2019

I'm realizing now that this also interacts with telemetry. Let me sync up with @piyush-singh before moving forward with this.

@ajwerner force-pushed the ajwerner:ajwerner/counters-only-on-success branch from b4e68ba to af087a6 on May 8, 2019

@ajwerner requested a review from cockroachdb/admin-ui-prs as a code owner on May 8, 2019

@ajwerner changed the title from "sql: only increment statement counters upon success" to "sql: track and display counters for successfully executed statements" on May 8, 2019

@ajwerner

Collaborator Author

commented May 8, 2019

Still need to fix the testing, not yet ready for a look

@ajwerner force-pushed the ajwerner:ajwerner/counters-only-on-success branch from af087a6 to 8ba3188 on May 9, 2019

@ajwerner requested a review from cockroachdb/sql-wiring-prs as a code owner on May 9, 2019

@ajwerner force-pushed the ajwerner:ajwerner/counters-only-on-success branch from 8ba3188 to a8d6e8e on May 10, 2019

@ajwerner

Collaborator Author

commented May 13, 2019

This is RFAL. I attempted to do more programmatic metric meta creation but ultimately reverted it in favor of just making another copy.

@andreimatei
Member

left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/conn_executor_exec.go, line 66 at r1 (raw file):

func (ex *connExecutor) execStmt(
	ctx context.Context, stmt Statement, res RestrictedCommandResult, pinfo *tree.PlaceholderInfo,
) (ev fsm.Event, payload fsm.EventPayload, err error) {

revert this


pkg/sql/conn_executor_exec.go, line 1305 at r1 (raw file):

}

// maybeIncrementExecutedStmtCounter checks if err and payload to determine if

s/checks if/checks


pkg/sql/conn_executor_exec.go, line 1308 at r1 (raw file):

// an error occurred and if not, increments the appropriate statement counter
// for stmt's type.
func (ex *connExecutor) maybeIncrementExecutedStmtCounter(

only one of the callers passes an err. And anything that deals with two possible errors is confusing. I'd remove the err from this function and deal with it in that one caller.
And then I won't nit about commenting `maybe..(.., nil /* err */)`

@ajwerner force-pushed the ajwerner:ajwerner/counters-only-on-success branch from a8d6e8e to bc798dd on May 13, 2019

@ajwerner
Collaborator Author

left a comment

TFTR! I moved the conditional logic into the caller

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)


pkg/sql/conn_executor_exec.go, line 66 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

revert this

Done.


pkg/sql/conn_executor_exec.go, line 1305 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

s/checks if/checks

Changed the logic.


pkg/sql/conn_executor_exec.go, line 1308 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

only one of the callers passes an err. And anything that deals with two possible errors is confusing. I'd remove the err from this function and deal with it in that one caller.
And then I won't nit about commenting `maybe..(.., nil /* err */)`

I moved the conditional logic up to the callers.

sql: only increment statement counters upon success
Before this PR, statement counters were incremented before executing a SQL
statement. This is problematic because it means that the counters include
statements which fail during execution. This commit adds an additional set
of statement counters which track successfully executed statements which use
the old metric name.

The reason the new behavior uses the old name is to retain historical data
in the UI for these charts after an upgrade. This does unfortunately mean that
the definition of this metric will change in telemetry where the old definition
will have a new name. This is both probably okay from an analytics perspective
(and maybe even preferable) and can be mitigated with some fancy SQL.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.

@ajwerner force-pushed the ajwerner:ajwerner/counters-only-on-success branch from bc798dd to b291909 on May 13, 2019

@ajwerner

Collaborator Author

commented May 16, 2019

I added some unit tests (certainly not as exhaustive as they could be) if you want to take a look before I merge it.

@ajwerner

Collaborator Author

commented May 23, 2019

bors r+

craig bot pushed a commit that referenced this pull request May 23, 2019

Merge #37264 #37492 #37603
37264: sql: track and display counters for successfully executed statements r=ajwerner a=ajwerner

Before this PR, statement counters were incremented before executing a SQL
statement.  This is problematic because it means that the counters include
statements which fail during execution. This changes the logic to increment the
counters when statements complete successfully.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.

37492: roachprod: create geo-distributed clusters with extra nodes placed at the end r=nvanbenschoten a=ajwerner

roachprod: create geo-distributed clusters with extra nodes placed at the end

Prior to this change, roachprod would create geo-distributed clusters by
placing nodes in AZs in contiguous chunks. If the number of total nodes
was not evenly divisible by the number of regions, the first regions
would be allocated one additional node. This allocation pattern is
rarely desirable. A user will commonly allocate a single extra node as a
load generator and would generally like that load node to be the final
node and for that final node to be the extra node.

This changes the allocation where the extra nodes are placed in the same
regions as before but are given node indices at the end rather than with
the other nodes in their region.

After this change a cluster created with `roachprod create $CLUSTER -n 7 --geo`
will look like:

```
ajwerner-test-roachprod-gce: [gce] 12h47m58s remaining
  ajwerner-test-roachprod-gce-0001  ajwerner-test-roachprod-gce-0001.us-east1-b.cockroach-ephemeral      10.142.0.70    34.74.58.108
  ajwerner-test-roachprod-gce-0002  ajwerner-test-roachprod-gce-0002.us-east1-b.cockroach-ephemeral      10.142.0.5     35.237.74.155
  ajwerner-test-roachprod-gce-0003  ajwerner-test-roachprod-gce-0003.us-west1-b.cockroach-ephemeral      10.138.0.99    35.199.159.104
  ajwerner-test-roachprod-gce-0004  ajwerner-test-roachprod-gce-0004.us-west1-b.cockroach-ephemeral      10.138.0.100   35.197.94.83
  ajwerner-test-roachprod-gce-0005  ajwerner-test-roachprod-gce-0005.europe-west2-b.cockroach-ephemeral  10.154.15.237  35.230.143.190
  ajwerner-test-roachprod-gce-0006  ajwerner-test-roachprod-gce-0006.europe-west2-b.cockroach-ephemeral  10.154.15.236  35.234.156.121
  ajwerner-test-roachprod-gce-0007  ajwerner-test-roachprod-gce-0007.us-east1-b.cockroach-ephemeral      10.142.0.33    35.185.62.76
```

Instead of the previous:

```
ajwerner-test-old: [gce] 12h19m21s remaining
  ajwerner-test-old-0001  ajwerner-test-old-0001.us-east1-b.cockroach-ephemeral      10.142.0.139   34.74.150.216
  ajwerner-test-old-0002  ajwerner-test-old-0002.us-east1-b.cockroach-ephemeral      10.142.0.140   34.73.154.246
  ajwerner-test-old-0003  ajwerner-test-old-0003.us-east1-b.cockroach-ephemeral      10.142.0.141   35.243.176.131
  ajwerner-test-old-0004  ajwerner-test-old-0004.us-west1-b.cockroach-ephemeral      10.138.0.71    34.83.16.1
  ajwerner-test-old-0005  ajwerner-test-old-0005.us-west1-b.cockroach-ephemeral      10.138.0.60    34.83.78.172
  ajwerner-test-old-0006  ajwerner-test-old-0006.europe-west2-b.cockroach-ephemeral  10.154.15.200  35.234.148.191
  ajwerner-test-old-0007  ajwerner-test-old-0007.europe-west2-b.cockroach-ephemeral  10.154.15.199  35.242.179.144
```

Fixes #35866.

Release note: None
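
The two allocation schemes described above can be sketched as follows. This is an illustration written from the description only, not roachprod's implementation; `assignRegions` and its `extrasAtEnd` flag are hypothetical names.

```go
package main

import "fmt"

// assignRegions returns the region for each node index (0-based).
// With extrasAtEnd=false, the first n%len(regions) regions each get one
// extra node inside their contiguous chunk (the old behavior). With
// extrasAtEnd=true, every region gets an equal contiguous chunk and the
// extra nodes are appended at the end, still placed in the first regions.
func assignRegions(n int, regions []string, extrasAtEnd bool) []string {
	base, extra := n/len(regions), n%len(regions)
	out := make([]string, 0, n)
	for i, r := range regions {
		count := base
		if !extrasAtEnd && i < extra {
			count++
		}
		for j := 0; j < count; j++ {
			out = append(out, r)
		}
	}
	if extrasAtEnd {
		out = append(out, regions[:extra]...)
	}
	return out
}

func main() {
	regions := []string{"us-east1-b", "us-west1-b", "europe-west2-b"}
	fmt.Println(assignRegions(7, regions, false))
	fmt.Println(assignRegions(7, regions, true))
}
```

For 7 nodes over the three zones, the old scheme puts nodes 1 to 3 in `us-east1-b`, while the new scheme puts nodes 1 and 2 there and makes node 7 the extra `us-east1-b` node, matching the two listings.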


37603: sql: add ALTER TABLE/INDEX .. UNSPLIT AT .. r=jeffrey-xiao a=jeffrey-xiao

Now that manual splits add a sticky bit to the range descriptor, and
the merge queue respects this sticky bit, we can expose functionality to
manually unset this sticky bit.

If the key to unsplit is not the start of a range, then the unsplit
command will throw an error. If the range was manually split (i.e., the
sticky bit is set), then the sticky bit will be unset. Otherwise, this
command is a no-op.

Syntactically, the unsplit command is identical to the split command.
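
The three outcomes described above can be modeled with a small sketch. The `rangeDesc` type, its `Sticky` field, and the `unsplit` function are hypothetical simplifications for illustration; the actual range descriptor and unsplit code path in CockroachDB differ.

```go
package main

import (
	"errors"
	"fmt"
)

// rangeDesc is a hypothetical, minimal range descriptor.
type rangeDesc struct {
	StartKey string
	Sticky   bool // set when the range was created by a manual split
}

var errNotRangeStart = errors.New("key is not the start of a range")

// unsplit reports whether it cleared a sticky bit. It returns an error if
// key is not the start key of any range, and is a no-op (false, nil) if
// the range starting at key was not manually split.
func unsplit(ranges []*rangeDesc, key string) (bool, error) {
	for _, r := range ranges {
		if r.StartKey != key {
			continue
		}
		if !r.Sticky {
			return false, nil
		}
		r.Sticky = false
		return true, nil
	}
	return false, errNotRangeStart
}

func main() {
	ranges := []*rangeDesc{{StartKey: "a", Sticky: true}, {StartKey: "m"}}
	fmt.Println(unsplit(ranges, "a")) // prints "true <nil>"
	fmt.Println(unsplit(ranges, "m")) // prints "false <nil>"
	fmt.Println(unsplit(ranges, "c")) // prints "false key is not the start of a range"
}
```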

Release note: None

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Co-authored-by: Jeffrey Xiao <jeffrey.xiao1998@gmail.com>
@craig


commented May 23, 2019

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request May 23, 2019

Merge #37264
37264: sql: track and display counters for successfully executed statements r=ajwerner a=ajwerner

Before this PR, statement counters were incremented before executing a SQL
statement.  This is problematic because it means that the counters include
statements which fail during execution. This changes the logic to increment the
counters when statements complete successfully.

Release note (admin ui change): Only include successfully executed statements
in the statement counters.

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
@craig


commented May 23, 2019

Build succeeded

@craig (bot) merged commit b291909 into cockroachdb:master on May 23, 2019

3 checks passed

GitHub CI (Cockroach): TeamCity build finished
bors: Build succeeded
license/cla: Contributor License Agreement is signed.