
APPSERV-14 Slow SQL tracing views #4422

Merged
jbee merged 31 commits into payara:master on Jan 22, 2020

Conversation

@jbee (Contributor) commented Jan 10, 2020

Summary

On the server side, this PR:

  • adds a general data annotation mechanism to data collection (see details below)
  • adds an SQL max execution time metric based on SQL tracing
  • adds a watch for SQL max execution time whose threshold is taken from the pool configuration
  • extends watch conditions
  • adds series tag name wildcards, e.g. ?:foo (matching any tag name with value foo, including no tag at all), to allow explicit selection of a family of series where some members have further tags and others do not, as used for lists of health check alerts or a general alert list
  • extends the REST API for data requests so the client can select the type of data it needs (points, alerts, annotations, watches), avoiding client-server transfer of data the client does not use
  • extends the request tracing metric so that a useful watch can be added for it
  • adds a watch that raises alerts when request tracing time exceeds the configured threshold
  • adds "compaction" of alert frames to limit memory consumption and client-server data exchange

On the client side, this PR:

  • shows the thresholds of multiple watches in charts as stacked threshold indicator lines/areas
  • fixes threshold indicator gradient and line alignment
  • fixes alert condition formatting
  • adds annotations to alerts, with a setting to hide them
  • adds a new widget type Annotations that lists just the matching annotations in a list or table
  • adds a new widget setting Mode to switch between Table and List layout for Annotations widgets
  • gives widgets a page-wide unique id so that the same series (pattern) can be used multiple times on the same page, as done for SQL, request tracing and alerts
  • adds a new SQL preset page that shows slow SQL tracing information
  • fixes alert table vertical overflow by using a scrollbar when needed
  • adds a list of tracing data and a tracing metric with alerts to the request tracing page
  • adds a general Alerts page preset that shows all alerts organised in groups

Data Annotations

Annotations are a new concept in metric collection: they attach key-value data to a particular metric at a particular point in time. An annotation should only be made in situations that are in some way noteworthy, such as a metric's value exceeding a particular threshold. In general, annotations are event-like information stored per series in a queue of fixed size; newer annotations eventually replace older ones so that the number of annotations cannot grow out of control. The current limit is 20 annotations per metric series.

The annotation data is used by the client to enhance views, for example to enrich lists of alerts with additional data for the period of the alert, or to drive widgets of type Annotations that can, for instance, list the slow SQL metadata in a table.

Annotations provide a general mechanism that makes such values available for any metric and from all instances. To verify the general nature of the concept it has also been applied to request tracing, which allows the trace span attributes to be listed in a table as well.
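To make the fixed-size queue behaviour concrete, here is a minimal sketch of the idea; class, field and method names are illustrative and not the actual Payara implementation:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

/** Sketch only: a per-series annotation queue capped at 20 entries. */
final class SeriesAnnotationsSketch {

    static final int MAX_PER_SERIES = 20;

    static final class Annotation {
        final long time;                  // point in time the annotation refers to
        final long value;                 // metric value at that time
        final Map<String, String> attrs;  // key-value data, e.g. the SQL text and timings

        Annotation(long time, long value, Map<String, String> attrs) {
            this.time = time;
            this.value = value;
            this.attrs = attrs;
        }
    }

    private final Deque<Annotation> queue = new ArrayDeque<>();

    synchronized void add(Annotation annotation) {
        if (queue.size() >= MAX_PER_SERIES) {
            queue.removeFirst(); // the newest annotation eventually replaces the oldest
        }
        queue.addLast(annotation);
    }
}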

Testing

New unit tests were added for

  • extended variants of Condition logic
  • extended Series patterns using the added ? wild-card
  • Alert.Frame compaction

Testing Performed

Manual testing of various features. This PR changes general parts of the monitoring console and extends others, so it makes sense to also test the usual workflows even though they were not an explicit target of the changes.

Testing Instructions

General Setup:

  1. build, install and start the server
  2. use set-monitoring-console-configuration --enabled=true to deploy MC
  3. open MC at http://localhost:8080/monitoring-console/
  4. make sure browser cache for JS/CSS is cleared for MC's domain
  5. check that the following pages exist: SQL and Alerts (if not, it is most likely a browser cache issue; otherwise get in touch)

Testing slow SQL tracing:

  1. Open admin console at http://localhost:4848/
  2. navigate to Resources => JDBC => JDBC Connection Pools
  3. select H2Pool and open the Advanced tab
  4. change Slow Query Log Threshold from -1 (disabled) to 0.001 (1ms)
  5. open page SQL in MC
  6. deploy e.g. https://github.com/javaee/tutorial-examples/tree/master/persistence/order (some app using the default pool)
  7. the deployment alone should already cause alert(s) and annotations to be shown on the SQL page.
  8. use the order app a bit and see how "slow" queries cause alerts and how the executed SQL becomes available in the table on the SQL page

NOTE: the order app causes errors during deployment on server restart. Undeploy the app and redeploy it while the server is running. AFAIK this is a shortcoming of the app's setup/scripts.

Testing Request Tracing (again):

  1. Open admin console at http://localhost:4848/
  2. navigate to Configurations => server-config => Request Tracing
  3. check Enabled
  4. set Target Count to 2, Time Value to 20, Time Unit to SECONDS
  5. set Threshold Value to somewhere between 5 to 20, Threshold Unit to MILLISECONDS
  6. save changes
  7. open the MC Request Tracing page. The polling done by MC itself should be enough to make some data appear in the widgets

Testing Health Checks (again):

  1. Open admin console at http://localhost:4848/
  2. navigate to Configurations => server-config => HealthCheck
  3. open the CPU Usage tab, check Enabled and save the changes
  4. optionally open other tabs to enable more checks; the Enabled checkbox on the General tab does not have to be checked
  5. open the Health Checks page in MC and check that the graphs for the enabled checks show data and thresholds
  6. go back to the admin console configuration and change a threshold, e.g. set heap usage to a level between the low and high points currently seen over your GC cycle (see graph)
  7. check that the changed threshold is updated in the graph and that alerts are raised when heap usage exceeds it (note that the alert condition requires the threshold to be exceeded by the average of the last 15 points)

Further things in MC to test (again)

  • create a new page, name it, add some widgets, remove some widgets
  • reset preset pages after changes
  • change settings of widgets
  • change colour scheme
  • clear local storage for the page and reload the page (full reset of client side settings)

@jbee added the 3:DevInProgress and PR: DO NOT MERGE (Don't merge PR until further notice) labels on Jan 10, 2020
@jbee self-assigned this on Jan 10, 2020
@jbee (Contributor Author) commented Jan 15, 2020

jenkins test please

@jbee changed the title from "[WIP] APPSERV-14 Slow SQL tracing views" to "APPSERV-14 Slow SQL tracing views" on Jan 15, 2020
@jbee (Contributor Author) commented Jan 15, 2020

jenkins test please

public void setSlowQueryThresholdInSeconds(String seconds) {
    spec.setDetail(DataSourceSpec.SLOWSQLLOGTHRESHOLD, seconds);
    double threshold = Double.parseDouble(seconds);
    if (threshold > 0) {
        if (sqlTraceDelegator == null) {
            sqlTraceDelegator = new SQLTraceDelegator(getPoolName(), getApplicationName(), getModuleName());
        }
        sqlTraceDelegator.registerSQLTraceListener(new SlowSQLLogger((long)(threshold * 1000), TimeUnit.MILLISECONDS));
@jbee (Contributor Author) commented Jan 17, 2020:
NB. This was not needed and could even cause an outdated threshold to be used, because installed listeners are not replaced when a listener of the same class is attempted to be installed again (which would happen later when a connection is created).

@@ -506,7 +511,7 @@ public Object getConnection(Subject sub, javax.resource.spi.ConnectionRequestInfo

        if (sqlTraceDelegator == null) {
            if ((requestTracing != null && requestTracing.isRequestTracingEnabled())
-                    || (connectionPool != null && isSlowQueryLoggingEnabled())) {
+                    || (isSlowQueryLoggingEnabled())) {
@jbee (Contributor Author) commented:
NB. isSlowQueryLoggingEnabled includes the null check.
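For illustration only, a sketch of the shape such a guard typically has; the field and getter names below are made up, not the Payara API:

/** Sketch only: the helper itself performs the null check, so callers need not repeat it. */
final class SlowQueryGuardSketch {

    static final class PoolConfig {
        String slowQueryThresholdInSeconds = "-1"; // -1 means slow query logging is disabled
    }

    private PoolConfig connectionPool; // may be null when no pool configuration is bound

    boolean isSlowQueryLoggingEnabled() {
        return connectionPool != null
                && Double.parseDouble(connectionPool.slowQueryThresholdInSeconds) > 0;
    }
}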

*/
public class SQLTraceStoreAdapter implements SQLTraceListener {

    private static ThreadLocal<SQLQuery> currentQuery = new ThreadLocal<>();
@jbee (Contributor Author) commented:
NB. Sadly this cannot share code with SQLTraceLogger, which does almost the same thing: this class has to have its own thread local because otherwise the manipulation of the query would happen multiple times when both listeners are installed, which messes up the query.
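A minimal illustration of the pitfall, using simplified stand-ins rather than the real SQL tracing types: when two listeners share one thread-local accumulator, the same trace event is applied to it once per listener.

/** Sketch only: why each SQL trace listener needs its own thread-local state. */
final class ThreadLocalSharingPitfall {

    // Simplified stand-in for the per-query state both listeners would build up.
    private static final ThreadLocal<StringBuilder> sharedQuery =
            ThreadLocal.withInitial(StringBuilder::new);

    // Simplified stand-in for a SQLTraceListener callback.
    static void onSqlTrace(String fragment) {
        sharedQuery.get().append(fragment).append(';');
    }

    public static void main(String[] args) {
        // The delegator notifies every installed listener for the same event,
        // so with shared state the fragment is appended once per listener.
        onSqlTrace("SELECT * FROM ORDERS"); // "logger" listener
        onSqlTrace("SELECT * FROM ORDERS"); // "store adapter" listener
        System.out.println(sharedQuery.get()); // the query text is duplicated
    }
}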

@Pandrex247 (Member) commented:
Might be worth adding a simplified version of this comment to the code


@Service
@Singleton
public class SQLTraceStoreImpl implements SQLTraceStore, MonitoringDataSource, MonitoringWatchSource {
@jbee (Contributor Author) commented:
NB. The most interesting part is the class location: it had to be in a module that supports HK2 and that ideally already depends on the modules dealing with SQL tracing.

            .amber(600L, 3, true, 600L, 5, false)
            .green(-400L, 1, false, null, null, false);
        addWatch(watch);
        addWatch(new Watch("Metric Collection Duration", new Metric(new Series("ns:monitoring CollectionDuration")))
@jbee (Contributor Author) commented:
NB. The watch for a metric provided by InMemoryMonitoringDataRepository had to be added here, as adding it there would otherwise cause a cyclic dependency between the two.

@@ -123,12 +126,12 @@ void changedConfig(boolean enabled) {
        if (!enabled) {
            checker.stop();
        } else {
-            checker.start(executor, 2, SECONDS, this::checkWatches);
+            checker.start(executor, 1, SECONDS, this::checkWatches);
@jbee (Contributor Author) commented:
NB. Initially I thought evaluating alerts every two seconds would be good enough. While that is true in general, it can miss value spikes that should cause an alert when the most basic condition only looks at the latest value. So the interval was changed to 1 second to make sure each value is considered.
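To make the reasoning concrete, a tiny illustration with made-up numbers:

/** Illustration only: a 2-second evaluation of per-second samples skips every other value. */
final class EvaluationIntervalSketch {
    public static void main(String[] args) {
        long[] perSecondSamples = { 10, 950, 12, 11 }; // spike at t = 1s
        for (int t = 0; t < perSecondSamples.length; t += 2) {
            // A "latest value" condition checked at t = 0s and t = 2s never sees the 950 spike.
            System.out.println("checked value at t = " + t + "s: " + perSecondSamples[t]);
        }
    }
}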

@jbee (Contributor Author) commented Jan 17, 2020

jenkins test please

import org.jvnet.hk2.annotations.Contract;

@Contract
public interface SQLTraceStore {
@jbee (Contributor Author) commented:
NB. The main reason this interface is needed is module dependencies.

public MonitoringDataCollector annotate(CharSequence metric, long value, String... attrs) {
    prefixed.setLength(prefix.length());
    self.annotate(prefixed.append(metric), value, attrs);
    return this;
@jbee (Contributor Author) commented:
NB. It was a bug that none of the MonitoringDataCollector methods returned this: when the instance is used with chaining, it should be the wrapper that is called (so it does its prefixing), not the instance returned by self (which would not do the prefixing). In practice this bug never had an effect, as the only usage did not use chaining, but it now works properly for future use.
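A sketch of the chaining issue with a simplified interface (not the actual MonitoringDataCollector API):

interface Collector {
    Collector collect(String metric, long value);
}

/** Sketch only: a prefixing wrapper must return itself from chainable methods. */
final class PrefixingCollector implements Collector {

    private final Collector self;  // the underlying, un-prefixed collector
    private final String prefix;

    PrefixingCollector(Collector self, String prefix) {
        this.self = self;
        this.prefix = prefix;
    }

    @Override
    public Collector collect(String metric, long value) {
        self.collect(prefix + metric, value);
        return this; // returning the delegate instead would drop the prefix
                     // for all subsequent calls in the chain
    }
}

With return this, a chained call such as collector.collect("MaxExecutionTime", 42).collect("Count", 1) applies the prefix to both metrics; returning the delegate would silently drop it for the second call.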

@@ -102,7 +102,7 @@ private static MemoryUsage getMemoryUsage() {

    @Override
    public void collect(MonitoringWatchCollector collector) {
        collectUsage(collector, "ns:health HeapUsage", "Heap Usage", 10, true);
@jbee (Contributor Author) commented:
NB. 10 turned out to be too flaky so I increased it.

@jbee removed the 3:DevInProgress and PR: DO NOT MERGE (Don't merge PR until further notice) labels on Jan 17, 2020
@Pandrex247 (Member) left a review:

LGTM! 😃
Nothing jumps out to me as egregious, just a few nit-picks.

I ran the JPA tests in EE7 samples against it to trigger the slow SQL rather than your provided app.
I haven't checked it all works against an instance other than the DAS yet, I'll get round to that in a bit.


@jbee (Contributor Author) commented Jan 22, 2020

jenkins test please

@jbee (Contributor Author) commented Jan 22, 2020

@Pandrex247 addressed your comments.

I also fixed a potential memory leak I noticed: when SQL traces hit the new store while monitoring is disabled, the queues could grow without limit. I applied the same technique I used for request tracing, where the queue is limited to 50 entries; once it is at that size, adding a new entry throws away the oldest entry.
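A minimal sketch of that eviction technique; the class and field names are made up and this is not the actual store code:

import java.util.concurrent.ConcurrentLinkedQueue;

/** Sketch only: a queue capped at 50 entries that drops the oldest entry when full. */
final class BoundedTraceQueue<T> {

    static final int MAX_ENTRIES = 50;

    private final ConcurrentLinkedQueue<T> entries = new ConcurrentLinkedQueue<>();

    void add(T entry) {
        entries.add(entry);
        // If monitoring is disabled and never drains the queue, trim it so it cannot grow without limit.
        while (entries.size() > MAX_ENTRIES) {
            entries.poll(); // discard the oldest entry
        }
    }
}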

@jbee merged commit fe00e88 into payara:master on Jan 22, 2020
@jbee added this to the 5.201 milestone on Jan 23, 2020