toni-moreno/domainhealth-ng

DomainHealth-NG

DomainHealth-NG is a fork of the DomainHealth tool by Paul Done, which you can find at

http://sourceforge.net/projects/domainhealth

DomainHealth is a tool designed to give administrators a quick and easy way to monitor a set of WebLogic servers effectively. It is engineered to have a minimal performance impact on the managed servers in the domain. DomainHealth supports all WebLogic versions from 9.0 to 12.1.x.

The main goal of this fork is to add graphite as a backend to store metric data and analyze it.

This project focuses its efforts on collecting and selecting metrics from any WebLogic server rather than on storing or rendering them, so we have removed the graphical frontend contained in the original project.

Collected Metrics

From each managed server, DomainHealth tries to gather the following metrics.

  • jvm:
    • Loaded/Unloaded Classes.
    • Compilation time.
    • Heap/NonHeap memory data.
    • Memory Pools information.
    • Garbage Collection metrics.
  • Core:
    • Server State and Open Sockets
    • JVM heap size, free, % used.
    • ThreadPool stats.
    • JTA Transaction stats.
  • Datasource:
    • For each datasource you will get active connections, available connections, delay time, threads waiting for a connection, etc.
  • JMS Destinations:
    • For each JMS destination queue you will get message and consumer counters.
  • WebApps:
    • For each deployed webapp you will get current sessions.
  • EJB:
    • For each EJB you will get EJB pool stats and transaction stats.
  • HostMachineStats:
    • To collect host OS stats you need WLHostMachineStats, a small agent (JMX MBean) that runs in every WebLogic Server to collect OS statistics (e.g. CPU/memory/network usage): http://sourceforge.net/projects/wlhostmchnstats/ This is a good choice when working with a standalone DomainHealth installation. In a multi-domain environment with graphite, you will gather OS stats better and faster with other graphite collection tools such as collectd or hekad.
  • ExtendedStats:
    • (Only when gathering stats with WLDF) You will get work manager and server channel stats.

NOTES

  • The JVM section only works for JVM versions equal to or higher than 1.5.
  • JVM stats in the "core" section only appear if the "jvm" section, which gathers a more complete set of JVM metrics, is not selected.

Building From Source

This project includes an Ant buildfile in the root directory to enable the project to be completely re-built from source and modified and enhanced where necessary. The project also includes an Eclipse '.project' Project file, enabling developers to optionally use Eclipse to modify the source (just import DomainHealth as an existing project into Eclipse).

To re-build the project, first ensure the Java 1.5.x SDK and Ant 1.6+ are installed and their 'bin' directories are present in the PATH environment variable, then check the values in the local.properties file in the project's root directory to ensure they reflect your local WebLogic environment settings.

Run the following commands to clean the project, compile the source code and build the WAR web-application:

 > ant clean
 > ant

OPTIONAL: To run the unit tests for the project, copy the JUnit archive ('junit.jar') from this project's 'lib' directory into 'ANT_HOME/lib', and then run:

 > ant test

OPTIONAL: To automatically deploy the generated WAR web-application to a running WebLogic Server, first modify the 'local.properties' file in the root of the project, to reflect the required WebLogic settings and then run:

 > ant deploy

To undeploy the application, run:

 > ant undeploy

Install

Once you have the domainhealth-XXX.war package, you can install it in two ways:

  1. With the ant deploy task, as described in the previous section.
  2. With the console in the AdminServer.

Configuration

DomainHealth-NG keeps the old configuration system from the original DomainHealth, with -D<parameter_key>= style options, but also introduces a new dh_global.properties file to centralize all config parameters and simplify changes and maintenance.

This way you can configure all settings through a single parameter, either by adding dh_config_file as the only JAVA_OPTIONS entry or by leaving a config file named "dh_global.properties" in the domain root path.

A default dh_global.properties is located at the root of the sources directory.

a) You can configure it by editing setDomainEnv.sh and adding the following at the end.

case "${SERVER_NAME}" in
    AdminServer)
        echo "Enabling DomainHealth NG Config"
        export JAVA_OPTIONS="$JAVA_OPTIONS -Ddh_config_file=/absolute_path/dh_global.properties"
        ;;
esac

b) You can also configure it by placing the dh_global.properties file in the domain root path.

$DOMAIN_ROOT/dh_global.properties

After that, you can edit dh_global.properties to set the parameters described in the following section.

Configuration Parameters

Base Configuration Parameters

  • dh_always_use_jmxpoll: Forces DH to always use JMX polling to collect metrics rather than allowing DH to decide for itself what to use (which in WLS 10.3+ would otherwise default to using WLDF Harvesting).
  • dh_query_interval_secs: The gap in seconds between consecutive statistic collections (making this too small could impact server performance).
  • dh_backend_output: Selects which backend to use: graphite, csvfile, or both.
  • dh_output_log_path: The file to which all DomainHealth-NG output is logged.
  • dh_output_log_level: Sets the log threshold level. Log events with a level lower than the threshold are ignored by the appender. Default: INFO

Metric Configuration Parameters

  • dh_metric_type_set: The types of metric to gather, chosen from (jvm,core,datasource,jmsdestination,webapp,ejb,hostmachine,extended). Default: All
  • dh_metric_deep_set: The depth of the metric set to gather (basic/full). Default: full
  • dh_component_blacklist: The list of deployed application names which should not have statistics collected or displayed. This is usually used to prevent WebLogic internal applications from appearing in results.

CSV Configuration Parameters

  • dh_stats_output_path: Defines the absolute or relative (to server start-up dir) path of the root directory where DH should store captured CSV statistic files. Default: ./logs/statistics
  • dh_csv_retain_num_days: The number of days to retain captured CSV data log files for (older ones are automatically removed by DomainHealth to help limit file-system capacity consumption).

Graphite Configuration Parameters

  • dh_graphite_carbon_host: Hostname of the Graphite carbon server. Default: localhost
  • dh_graphite_carbon_port: Port of the Graphite carbon server. Default: 2003
  • dh_graphite_reconnect_timeout: Time in seconds to wait before attempting to reconnect after the connection is lost. Default: 60 seconds
  • dh_graphite_force_reconnect_timeout: Forces the connection to be closed and reopened after the given number of seconds, or 0 to disable. This parameter can be important in load-balanced environments, where it allows load to be rebalanced when several DomainHealth instances are sending metrics to the same graphite backend.
  • dh_graphite_send_buffer_size: Graphite output buffer size. On heavily loaded systems, bigger buffers work better. Default: 1Mb
  • dh_graphite_metric_use_host: Enables a metric tree based on host rather than the domain-based tree. Default: True
  • dh_graphite_metric_host_prefix: Hostname prefix. Default: pro.bbdd
  • dh_graphite_metric_host_suffix: Hostname suffix. Default: wls
  • dh_graphite_default_host: The “Machine” name used in the host-based approach when the machine is not properly configured in the console, and the hostname used for the AdminServer data. Default: default_host
  • dh_graphite_metric_force_domain_name: Works around a bug reading the WebLogic domain name on server startup in 9.2 by forcing the domain name. Default: my_domain
  • dh_graphite_report_dhstats: Sends data about DomainHealth's own data retrieval (only reported over the graphite backend). Default: True
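
As a rough illustration of how the parameters above fit together, the following sketch writes a minimal dh_global.properties for a graphite-only setup. It assumes the file uses plain Java properties syntax (key=value) with the same keys as the -D parameters, and that list values are comma-separated. DH_CONF_DIR is a hypothetical variable, and the host graphite.example.com, the 30-second interval and the chosen metric types are placeholder values, not project defaults.

```shell
# Sketch: generate a minimal dh_global.properties (values are illustrative).
# DH_CONF_DIR is a hypothetical variable; point it at your domain root.
DH_CONF_DIR="${DH_CONF_DIR:-.}"

cat > "$DH_CONF_DIR/dh_global.properties" <<'EOF'
# Base configuration
dh_query_interval_secs=30
dh_backend_output=graphite
dh_output_log_level=INFO

# Metric selection (a subset of the documented types)
dh_metric_type_set=jvm,datasource,webapp
dh_metric_deep_set=full

# Graphite backend
dh_graphite_carbon_host=graphite.example.com
dh_graphite_carbon_port=2003
dh_graphite_reconnect_timeout=60
dh_graphite_metric_use_host=true
EOF
```

Compare the result with the default dh_global.properties shipped at the root of the sources before using it, since that file is the authoritative reference for key names and value formats.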

Graphite Tree Model

1.- Host Based (default)

<HOST_PREFIX>.<HOST>.<HOST_SUFFIX>.<DOMAIN_NAME>.<SERVER_INSTANCE_NAME>.<RESOURCE_TYPE>.<RESOURCE_NAME>.<METRIC_NAME>

2.- Domain Based.

<DOMAIN_NAME>.<SERVER_INSTANCE_NAME>.<RESOURCE_TYPE>.<RESOURCE_NAME>.<METRIC_NAME>
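
To make the two layouts concrete, here is a small sketch that assembles a metric path for either model and formats it as a line of Carbon's plaintext protocol ("path value timestamp"). The function names dh_metric_path and dh_carbon_line are hypothetical helpers, not part of DomainHealth-NG, and the resource and metric names in the usage note are made up for illustration. The prefix, suffix and fallback host reuse the documented defaults.

```shell
# Sketch: build a Graphite metric path for either tree model.
# Usage: dh_metric_path MODEL DOMAIN SERVER RES_TYPE RES_NAME METRIC [HOST]
# HOST_PREFIX / HOST_SUFFIX mirror dh_graphite_metric_host_prefix / _suffix.
HOST_PREFIX="${HOST_PREFIX:-pro.bbdd}"
HOST_SUFFIX="${HOST_SUFFIX:-wls}"

dh_metric_path() {
    model=$1 domain=$2 server=$3 rtype=$4 rname=$5 metric=$6
    host=${7:-default_host}
    # Both models share the same tail from the domain name downwards.
    base="$domain.$server.$rtype.$rname.$metric"
    case "$model" in
        host)   printf '%s.%s.%s.%s\n' "$HOST_PREFIX" "$host" "$HOST_SUFFIX" "$base" ;;
        domain) printf '%s\n' "$base" ;;
        *)      echo "unknown model: $model" >&2; return 1 ;;
    esac
}

# Carbon's plaintext protocol expects "<path> <value> <unix timestamp>" per line.
# The timestamp defaults to the current time when omitted.
dh_carbon_line() {
    printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}
```

For example, dh_metric_path host my_domain ManagedServer1 datasource MyDS ActiveConnections wlhost01 would print pro.bbdd.wlhost01.wls.my_domain.ManagedServer1.datasource.MyDS.ActiveConnections. A line produced by dh_carbon_line can be piped to a tool such as nc pointed at the carbon host and port to test the backend by hand.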

In both models you can also see how many metrics are gathered from each managed server, and how long retrieval takes, under the domain tree.

XXXXX.<DOMAIN_NAME>.dh_stats.servers.<SERVER_INSTANCE_NAME>.retrieve_time
XXXXX.<DOMAIN_NAME>.dh_stats.servers.<SERVER_INSTANCE_NAME>.number_metrics

NOTE: the core metrics include a "stateVal" metric that reports the instance status. It can take one of these values:

  • SHUTDOWN = 0;
  • STARTING = 1;
  • RUNNING = 2;
  • STANDBY = 3;
  • SUSPENDING = 4;
  • RESUMING = 5;
  • SHUTTING_DOWN = 6;
  • FAILED = 7;
  • UNKNOWN = 8;
  • SHUTDOWN_PENDING = 9;
  • SHUTDOWN_IN_PROCESS = 10;
  • FAILED_RESTARTING = 11;
  • ACTIVATE_LATER = 12;
  • FAILED_NOT_RESTARTABLE = 13;
  • FAILED_MIGRATABLE = 14;
  • DISCOVERED = 15;
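
When reading stateVal back from graphite, for instance in an alerting script, it can help to translate the number into its name. The sketch below does this with the table above; dh_state_name is a hypothetical helper, and the handling of a fractional part assumes the value may come back as a float such as 2.0.

```shell
# Sketch: translate a numeric stateVal into the state name listed above.
dh_state_name() {
    code=${1%%.*}   # drop any fractional part, e.g. "2.0" -> "2"
    case "$code" in
        0)  echo SHUTDOWN ;;
        1)  echo STARTING ;;
        2)  echo RUNNING ;;
        3)  echo STANDBY ;;
        4)  echo SUSPENDING ;;
        5)  echo RESUMING ;;
        6)  echo SHUTTING_DOWN ;;
        7)  echo FAILED ;;
        8)  echo UNKNOWN ;;
        9)  echo SHUTDOWN_PENDING ;;
        10) echo SHUTDOWN_IN_PROCESS ;;
        11) echo FAILED_RESTARTING ;;
        12) echo ACTIVATE_LATER ;;
        13) echo FAILED_NOT_RESTARTABLE ;;
        14) echo FAILED_MIGRATABLE ;;
        15) echo DISCOVERED ;;
        *)  echo "INVALID($1)"; return 1 ;;   # not one of the documented codes
    esac
}
```

A check such as [ "$(dh_state_name "$value")" = RUNNING ] then reads more clearly than comparing against the bare number 2.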

Security Issues

This tool works fine in any default WebLogic installation. If you wish to add authentication to the JVM by adding:

-Xmanagement:authenticate=true

you also need to add:

-Djavax.management.builder.initial=weblogic.management.jmx.mbeanserver.WLSMBeanServerBuilder

if you want to get JVM statistics.
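
As a sketch of how the two options might be added together, the fragment below appends them to JAVA_OPTIONS in the same style as the setDomainEnv.sh snippet in the Configuration section. Where exactly to place it in your start scripts is an assumption; adapt it to your environment.

```shell
# Sketch for setDomainEnv.sh: append both options when JVM authentication is wanted.
# Each option must be a single token, with no space or line break inside it.
JAVA_OPTIONS="${JAVA_OPTIONS:-} -Xmanagement:authenticate=true"
JAVA_OPTIONS="$JAVA_OPTIONS -Djavax.management.builder.initial=weblogic.management.jmx.mbeanserver.WLSMBeanServerBuilder"
export JAVA_OPTIONS
```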