New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
APPSERV-14 Slow SQL tracing views #4422
Conversation
…dget type; add instance coloring to annotation table with widget type specific legend
… adds Alerts page preset
jenkins test please |
…d for undefined coloring
jenkins test please |
public void setSlowQueryThresholdInSeconds(String seconds) { | ||
spec.setDetail(DataSourceSpec.SLOWSQLLOGTHRESHOLD, seconds); | ||
double threshold = Double.parseDouble(seconds); | ||
if (threshold > 0) { | ||
if (sqlTraceDelegator == null) { | ||
sqlTraceDelegator = new SQLTraceDelegator(getPoolName(), getApplicationName(), getModuleName()); | ||
} | ||
sqlTraceDelegator.registerSQLTraceListener(new SlowSQLLogger((long)(threshold * 1000), TimeUnit.MILLISECONDS)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. This was not needed and even could cause outdated threshold to be used as listeners installed will not be replaced when a listener of same class is attempted to be installed again (which would happen later when a connection is created).
@@ -506,7 +511,7 @@ public Object getConnection(Subject sub, javax.resource.spi.ConnectionRequestInf | |||
|
|||
if (sqlTraceDelegator == null) { | |||
if ((requestTracing != null && requestTracing.isRequestTracingEnabled()) | |||
|| (connectionPool != null && isSlowQueryLoggingEnabled())) { | |||
|| (isSlowQueryLoggingEnabled())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. isSlowQueryLoggingEnabled
includes the null
check.
*/ | ||
public class SQLTraceStoreAdapter implements SQLTraceListener { | ||
|
||
private static ThreadLocal<SQLQuery> currentQuery = new ThreadLocal<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Sadly this cannot share code with SQLTraceLogger
which does almost the same as this has to be its own thread local otherwise the manipulation of the query happens multiple times if both listeners are installed which messes up the query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding a simplified version of this comment to the code
|
||
@Service | ||
@Singleton | ||
public class SQLTraceStoreImpl implements SQLTraceStore, MonitoringDataSource, MonitoringWatchSource { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Most interesting part is the class location. It had to be in a module that supports HK2 stuff and which ideally already depends on modules dealing with SQL tracing.
.amber(600L, 3, true, 600L, 5, false) | ||
.green(-400L, 1, false, null, null, false); | ||
addWatch(watch); | ||
addWatch(new Watch("Metric Collection Duration", new Metric(new Series("ns:monitoring CollectionDuration"))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. The watch for a metric provided by InMemoryMonitoringDataRepository
had to be added here as this would otherwise cause a cyclic dependency between the two.
@@ -123,12 +126,12 @@ void changedConfig(boolean enabled) { | |||
if (!enabled) { | |||
checker.stop(); | |||
} else { | |||
checker.start(executor, 2, SECONDS, this::checkWatches); | |||
checker.start(executor, 1, SECONDS, this::checkWatches); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Initially I thought evaluating alerts every two seconds is good enough. While this is true in general this does allow to miss value spikes that should cause an alert in case of most basic condition that only looks at the latest value. So it was changed to 1 second to make sure each value is considered.
jenkins test please |
import org.jvnet.hk2.annotations.Contract; | ||
|
||
@Contract | ||
public interface SQLTraceStore { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. Main reason this interface is needed is because of module dependencies.
public MonitoringDataCollector annotate(CharSequence metric, long value, String... attrs) { | ||
prefixed.setLength(prefix.length()); | ||
self.annotate(prefixed.append(metric), value, attrs); | ||
return this; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. It was a bug to not return this
for any of the methods of MonitoringDataCollector
because when the instance is used with chaining it should be the wrapper that is called (so it does its prefix thing), not the instance returned by self
(that would not do the prefix thing). In practice this bug never had an effect as the only usage would not use chaining but for future this now works properly.
@@ -102,7 +102,7 @@ private static MemoryUsage getMemoryUsage() { | |||
|
|||
@Override | |||
public void collect(MonitoringWatchCollector collector) { | |||
collectUsage(collector, "ns:health HeapUsage", "Heap Usage", 10, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NB. 10 turned out to be too flaky so I increased it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! 😃
Nothing jumps out to me as egregious, just a few nit-picks.
I ran the JPA tests in EE7 samples against it to trigger the slow SQL rather than your provided app.
I haven't checked it all works against an instance other than the DAS yet, I'll get round to that in a bit.
*/ | ||
public class SQLTraceStoreAdapter implements SQLTraceListener { | ||
|
||
private static ThreadLocal<SQLQuery> currentQuery = new ThreadLocal<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth adding a simplified version of this comment to the code
appserver/jdbc/jdbc-ra/jdbc-core/src/main/java/fish/payara/jdbc/SQLTraceStoreAdapter.java
Outdated
Show resolved
Hide resolved
...ver/monitoring-console/core/src/main/java/fish/payara/monitoring/model/SeriesAnnotation.java
Outdated
Show resolved
Hide resolved
...uesttracing-core/src/main/java/fish/payara/nucleus/requesttracing/RequestTracingService.java
Show resolved
Hide resolved
jenkins test please |
@Pandrex247 addressed your comments. I also fixed a potential memory leak I noticed. When SQL traces would hit the new store but monitoring would be disabled the queues could grow without limit. I applied same technique I used for request tracing where the queue is limited to 50 entries. Should it be that size when new entry is added the oldest entry is thrown away. |
Summary
Adds on server side:
?:foo
(any tag name with valuefoo
including no tag) to allow explicit selection of family of series where some have further tags and others have not like used for lists of health check alerts or a general alert listAdds on client side:
Data Annotations
Annotations are a new concept in collection of metrics to attach key-value data to a particular metric at a particular point in time. Such annotation should only be done in situations that are in some way noteworthy such as a metric's value exceeding a particular threshold. In general annotations are event like information that are stored per series in a queue of fixed size. Newer annotations eventually replace older annotations to avoid number of annotations growing out of control.
Current limit is set at 20 per metric series.
The annotation data is used by the client to enhance views such as lists of alerts with more data for the period of the alert or to allow widgets of type
annotation
that for example can list the slow SQL meta data in table.Annotations provide a general mechanism that makes such values available for any metric and from all instances. To verify the general nature of the concept it has also been applied to request tracing which allows to also list the trace span attributes in a table.
Testing
New unit tests were added for
Condition
logicSeries
patterns using the added?
wild-cardAlert.Frame
compactionTesting Performed
Manual testing of various features. This PR does change general parts and extends others. It makes sense to test usual workflows even though they were not an explicit target of the changes.
Testing Instructions
General Setup:
set-monitoring-console-configuration --enabled=true
to deploy MCTesting slow SQL tracing:
-1
(disabled) to0.001
(1ms)NOTE: the
order
app causes errors during deploy on server restart. Undeploy the app and redeploy with server running. AFAIK this is a shortcoming of the app's setup/scripts.Testing Request Tracing (again):
2
, Time Value to20
, Time Unit toSECONDS
5
to20
, Threshold Unit toMILLISECONDS
Testing Health Checks (again):
Further things in MC to test (again)