Twitter Query Type Statistics

Liang Guo edited this page Aug 26, 2013 · 1 revision

This feature supports query execution statistics in Twitter MySQL t12 release. Query level statistics give better information about applications, and it can be leveraged to implement dynamic MySQL server resource governance. The overhead incurred by query level stats collection is minimal, regardless the number of distinct queries to be tracked.

At the most coarse granularity, query stats are grouped by query type. A query type is defined by striping query text string of literal values, e.g. numeric values in table names, predicate expression, values list, and IN list etc. Each incoming query must be processed and mapped to its query type string.

Query type stats can be grouped at the level of client applications as well. It requires applications to pass client IDs in query comments , in the format of /* 'client_id' : 'my_client_id' */. The client ID can be any string value defined by applications, however, internally only first 32 bytes of client ID is logged.

A new global variable --twitter_query_stats [0,2] is provided to enable query stats collection at query type level (=1) and client_id level (=2). The query stats can be enabled or disabled dynamically. However, it can not be changed between levels.

A new global variable --twitter_query_stats_max [0, 1000000] is provided to configure the max number of query types to be tracked. New query types beyond the configured number of queries are ignored. This configurable system variable is mainly to make sure memory usage by the feature can be well controlled.

Query stats is reset when twitter_query_stats level changes or when FLUSH STATUS command is issued. However, for design reason, existing query types are never deleted when query stats level changes.

In the first phase, the following execution stats are collected.

  • count,
  • latency,
  • max latency.

A new INFORMATION_SCHEMA.QUERY_STATISTICS table is added to expose query stats.