Tuning Discovery Performance

Most complex software packages have a number of parameters that affect their performance, and IBM® Tivoli® Application Dependency Discovery Manager (TADDM) is no exception. The default settings that TADDM ships with have been found to perform well in general. However, every environment is different, so some adjustment of these settings may be needed to deliver optimal discovery performance.

The TADDM GUI provides some information that is useful for monitoring discovery performance.

Each sensor displayed on this panel is in one of three states, shown in the Status column: done, error, or in progress. Sensors in the done or error states are no longer being processed, so only the in progress sensors are useful for monitoring performance. Each in progress sensor is in one of three stages of execution, shown in the Description column of the panel. To group the sensors on the display by stage, click the Description column heading.

The first stage is started. A sensor in this stage is in the process of discovering a CI or CIs.

The second stage is discovered. A sensor in this stage has finished discovering a CI or CIs but is still waiting for its results to be persisted in the datastore.

The final stage of an in progress sensor is storing. As the name implies, sensors in this stage are having their results persisted in the database.

By observing a discovery run and comparing the number of in progress sensors in the started stage with the number in the discovered or storing stages, it is possible to assess whether attribute discovery is faster or slower than attribute storage for a particular environment.
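
The comparison described above can be sketched in a few lines of Python. This is only an illustration: the stage counts are assumed to be read by hand from the Description column of the GUI, and the function name and the labels it returns are hypothetical, not part of TADDM.

```python
def assess_bottleneck(started, discovered, storing):
    """Compare in-progress sensor counts by stage (read by hand from the GUI).

    The returned labels are illustrative names, not TADDM terminology.
    """
    waiting_on_storage = discovered + storing
    if started > waiting_on_storage:
        return "discovery-bound"  # most sensors are still discovering CIs
    if started < waiting_on_storage:
        return "storage-bound"    # most sensors are waiting to be persisted
    return "balanced"


# Hypothetical counts observed during a discovery run.
print(assess_bottleneck(started=14, discovered=3, storing=2))
```

A single observation can be misleading, so it is probably worth repeating the comparison several times during a run before drawing a conclusion.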

The attribute discovery rate is the area with the most potential for tuning. Most of TADDM's modifiable parameters are contained in the collation.properties file:

<TADDM_install_dir>/dist/etc/collation.properties

As its name implies, this is a Java™ properties file containing a list of name-value pairs, each separated by an equals sign (=). In this file, the property with the most impact on performance is the number of discovery worker threads:

# Max number of discover worker threads
com.collation.discover.dwcount=16

Provided the server has sufficient spare capacity, this setting can be increased. This allows more sensors to run in parallel. Two similar properties are listed below:

# Max number of seed creators
com.collation.discover.sccount=2
# Max Number of Agent selectors
com.collation.discover.ascount=1

However, these values should not be changed.
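
Before changing any of these settings, it can be useful to check the current values programmatically. The following Python sketch parses the plain name=value form shown above. It is a simplified reader written for illustration, not a full implementation of the Java properties format (it ignores line continuations and escape sequences), and the function name is an assumption.

```python
def parse_properties(text):
    """Parse simple name=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no equals sign
            props[name.strip()] = value.strip()
    return props


# Sample content copied from the properties listed above.
sample = """\
# Max number of discover worker threads
com.collation.discover.dwcount=16
# Max number of seed creators
com.collation.discover.sccount=2
# Max Number of Agent selectors
com.collation.discover.ascount=1
"""

props = parse_properties(sample)
print(props["com.collation.discover.dwcount"])
```

In practice the text would come from reading the collation.properties file rather than from an inline string.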

In most cases, to discover attributes of a CI, a sensor requires an SSH or WMI session with the host computer where the CI resides. To improve performance by reducing session creation overhead, these sessions are pooled and cached. The default pool sizes are sufficient in most cases. However, when they are too small, they can limit the discovery rate. To monitor for this condition, the following property in the collation.properties file can be changed from false to true:

com.collation.platform.session.ExtraDebugging=false

As with all changes to the collation.properties file, the server must be restarted for the change to take effect. Once the change is in place and a discovery has been run, the DiscoverManager logs can be searched for session pool wait time issues. The text to search for in the logs is "seconds for pool lock". Below is an example from the DiscoverManager.log file of performance degradation caused by session pool contention:

2006-08-04 16:11:50,733 DiscoverManager [DiscoverWorker-34] WindowsComputerSystemAgent(192.168.16.181) INFO session.SessionClientPool - Session client [3x ssh2:/admlxz@151.179.84.85]#9612508 waited 158.682 seconds for pool lock
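
Searching a large log by eye is tedious, so a small script can help. The Python sketch below extracts the session identifier and wait time from messages like the one above. The regular expression is inferred from that single sample line, so it is an assumption about the log format rather than a documented specification, and the 10-second threshold is an arbitrary illustrative value.

```python
import re

# Pattern inferred from the one sample log message; the real format may vary.
POOL_WAIT = re.compile(
    r"Session client \[(?P<session>[^\]]+)\]#\d+ "
    r"waited (?P<seconds>\d+(?:\.\d+)?) seconds for pool lock"
)


def find_pool_waits(log_text, threshold_seconds=0.0):
    """Return (session, seconds) pairs for waits longer than the threshold."""
    results = []
    for match in POOL_WAIT.finditer(log_text):
        seconds = float(match.group("seconds"))
        if seconds > threshold_seconds:
            results.append((match.group("session"), seconds))
    return results


sample = (
    "2006-08-04 16:11:50,733 DiscoverManager [DiscoverWorker-34] "
    "WindowsComputerSystemAgent(192.168.16.181) INFO "
    "session.SessionClientPool - Session client "
    "[3x ssh2:/admlxz@151.179.84.85]#9612508 waited 158.682 seconds for pool lock"
)

waits = find_pool_waits(sample, threshold_seconds=10)
print(waits)
```

Grouping the results by the host portion of the session identifier would show which hosts suffer the most contention.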

If the log shows excessive waiting for a session from the pool, the pool size can be increased. There are two ways to do this. First, the per host session pool can be increased globally by changing the following property in the collation.properties file:

com.collation.platform.session.PoolSize=3

It is, however, unlikely that there will be session contention for all or even most hosts in the environment. The contention will likely be restricted to a small number of large hosts on which a great many sensors run. TADDM has a concept of a Scoped Property, which means that many of the properties in the collation.properties file can have one value for general targets and a different value for specific targets. This is done by appending an IP address or Scope name to the property name, as in the following example:

com.collation.platform.session.PoolSize.10.10.250.1=20

In this case, the PoolSize is 20 for 10.10.250.1 and 3 for all other hosts. By looking at log messages like the one above, it is easy to determine which hosts need a larger session PoolSize than the default and to make the appropriate changes to the collation.properties file.
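
The lookup behavior of a Scoped Property can be illustrated with a short Python sketch. This mimics the rule described above (a target-specific value overrides the general one) and is not TADDM's actual implementation. The function name is hypothetical, and only IP address targets are handled; Scope names would need an additional mapping from host to scope.

```python
def resolve_scoped(props, name, target):
    """Return the value scoped to the target if present, else the general value."""
    scoped_key = f"{name}.{target}"
    if scoped_key in props:
        return props[scoped_key]
    return props.get(name)  # None if the property is not set at all


POOL_SIZE = "com.collation.platform.session.PoolSize"

# Values taken from the two property lines shown above.
props = {
    "com.collation.platform.session.PoolSize": "3",
    "com.collation.platform.session.PoolSize.10.10.250.1": "20",
}

print(resolve_scoped(props, POOL_SIZE, "10.10.250.1"))
print(resolve_scoped(props, POOL_SIZE, "10.10.250.2"))
```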

A related setting is the Gateway pool size. It sets the number of sessions allowed between the server and the Windows Gateway. The property is:

com.collation.platform.session.GatewayPoolSize=10

In environments that consist mainly of Windows computer systems, this property should be adjusted upwards to be equal to the number of Discover worker threads.

Tuning Storage Performance

The second major area for tuning is storage. Storage of the discovery results is the discovery performance bottleneck if the number of sensors in the storing state hovers around the value of the property:

com.collation.discover.observer.topopumpcount

This property sets the number of parallel storage threads and is one of the main settings for controlling discovery storage performance. It must, however, be adjusted carefully. Setting it too high can cause contention when more than one thread tries to update the same object at the same time, to the point that overall throughput decreases. Contention is more likely when a very small discovery scope is used or when a very large portion of the discovered CIs come from a single server. If either scenario is commonplace in the environment of a given TADDM server, set this property to a small number, possibly as low as 1.
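
The bottleneck test described at the start of this section (the number of sensors in the storing stage hovering around the topopumpcount value) can be expressed as a simple check. In the Python sketch below, the samples are assumed to be counts read periodically from the GUI, and the function name, the tolerance, the "more than half" rule, and the topopumpcount value of 8 are all illustrative assumptions rather than TADDM defaults.

```python
def storage_looks_saturated(storing_samples, topopumpcount, tolerance=1):
    """Return True if most observed 'storing' counts sit near the thread limit."""
    if not storing_samples:
        return False
    near_limit = [s for s in storing_samples if s >= topopumpcount - tolerance]
    return len(near_limit) > len(storing_samples) / 2


# Hypothetical observations taken at intervals during a discovery run.
busy_run = [8, 7, 8, 8, 6]
quiet_run = [1, 2, 0, 3]

print(storage_looks_saturated(busy_run, topopumpcount=8))
print(storage_looks_saturated(quiet_run, topopumpcount=8))
```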

Another property that can be altered is the generateExplicitRelationship parameter. This property has a very significant impact on storage performance. By default, this value is set to true.

com.collation.topomgr.generateExplicitRelationship=true

If this TADDM server is not part of a full Tivoli CCMDB deployment with Process Managers, this property can be set to false. This dramatically increases the throughput of the discovered attribute storage.

TADDM does not place large demands on its backend database. Oracle tends to perform adequately with default settings. DB2® benefits from a few setting adjustments and some table maintenance. The following DB2 settings are extensively tested and provide good results in the lab environment:

APP_CTL_HEAP_SZ 1024
LOGFILSIZ 4096
LOGPRIMARY 6
LOGSECOND 20
LOCKLIST 1500
MAXLOCKS 35
CATALOGCACHE_SZ 384
PCKCACHESZ 512
AVG_APPLS 3
SORTHEAP 1024
LOGBUFSZ 512
NUM_IOCLEANERS 3
NUM_IOSERVERS 5
DFT_DEGREE ANY
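
One way to apply the values listed above is through the DB2 command line processor, using the update db cfg command. The Python sketch below only generates the command strings; it does not execute them. The database name TADDMDB is a placeholder, and the commands should be reviewed against the documentation for the DB2 version in use, since some parameters may be deprecated or automatically managed in later releases.

```python
# Parameter values copied from the list above.
DB2_SETTINGS = [
    ("APP_CTL_HEAP_SZ", "1024"),
    ("LOGFILSIZ", "4096"),
    ("LOGPRIMARY", "6"),
    ("LOGSECOND", "20"),
    ("LOCKLIST", "1500"),
    ("MAXLOCKS", "35"),
    ("CATALOGCACHE_SZ", "384"),
    ("PCKCACHESZ", "512"),
    ("AVG_APPLS", "3"),
    ("SORTHEAP", "1024"),
    ("LOGBUFSZ", "512"),
    ("NUM_IOCLEANERS", "3"),
    ("NUM_IOSERVERS", "5"),
    ("DFT_DEGREE", "ANY"),
]


def db2_update_commands(db_name):
    """Build one 'db2 update db cfg' command per setting for the given database."""
    return [
        f"db2 update db cfg for {db_name} using {param} {value}"
        for param, value in DB2_SETTINGS
    ]


commands = db2_update_commands("TADDMDB")  # TADDMDB is a hypothetical name
for command in commands:
    print(command)
```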

In addition to setting these database parameters, the following DB2 environment setting provides added performance improvements.

db2set DB2_EVALUNCOMMITTED=YES

A word of caution about this setting: it is instance-wide and alters the way concurrent transactions are processed. While this is good for TADDM, it is potentially bad for other databases in the instance.

The DB2 query optimizer benefits from having up-to-date statistics on the TADDM tables. There is a program in the <TADDM_install_dir>/dist/support/bin directory called gen_db_stats.jy. This program outputs the database commands for either Oracle or DB2 to update the statistics on the TADDM tables. The following example shows how the program is used:

cd <TADDM_install_dir>/dist/support/bin
./gen_db_stats.jy > <tmpdir>/TADDM_table_stats.sql

Here, <tmpdir> is a directory where this file can be created. Once this is complete, copy the file to the database server and run the following command:

DB2: db2 -tvf
Oracle: sqlplus

Additional Tuning Parameters

There are a few additional parameters that can be set which affect performance but are not directly related to discovery. Since TADDM is a Java application, the various JVM™ parameters can be modified from their default value. There are two ways to do this:

To change an existing JVM option to a different value, edit <TADDM_install_dir>/dist/deploy-tomcat/ROOT/WEB-INF/cmdb-context.xml. If eCMDB is in use, modify <TADDM_install_dir>/dist/deploy-tomcat/ROOT/WEB-INF/ecmdb-context.xml instead.

To edit one of these files to change the settings for one of the TADDM services, first find the service in the file. Here is an example of the beginning of a service definition in the XML file:
```
<bean id="Discover" class="com.collation.platform.jini.ServiceLifecycle" init-method="start" destroy-method="stop">
  <property name="serviceName">
    <value>Discover</value>
  </property>
```
Within the definition there are some elements and attributes that control the JVM arguments. For example:
```
<property name="jvmArgs">
  <value>-Xms8M;-Xmx512M;-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider</value>
</property>
```
The JVM arguments can be set as a semicolon-separated list in the <property name="jvmArgs"><value> element.

Alternatively, if existing JVM arguments do not need to be changed but new arguments are required, they can be added to the collation.properties file. The relevant properties are as follows:
```
com.collation.GigaSpaces.jvmargs.ibm
com.collation.EventsCore.jvmargs.ibm
com.collation.Discover.jvmargs.ibm
com.collation.DiscoverAdmin.jvmargs.ibm
com.collation.Proxy.jvmargs.ibm
com.collation.Topology.jvmargs.ibm
com.collation.GigaSpaces.jvmargs.sun
com.collation.EventsCore.jvmargs.sun
com.collation.Discover.jvmargs.sun
com.collation.DiscoverAdmin.jvmargs.sun
com.collation.Proxy.jvmargs.sun
com.collation.Topology.jvmargs.sun
```
Any arguments in these properties are appended to the arguments in the XML file. Where possible, these properties should be used before modifying the XML file because these changes are preserved during an upgrade. If TADDM is running on a Solaris platform, the sun properties are used; on all other platforms, the ibm properties are used. The settings in the XML file are used on all platforms.
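
The way the two sources combine can be sketched in Python as follows. This is an illustration of the rules stated above, not TADDM's own code. The function name and the platform strings are hypothetical, the -verbose:gc value is an arbitrary example, and the sketch assumes that arguments inside a jvmargs property are separated by whitespace, which the text does not specify.

```python
def effective_jvm_args(xml_value, props, service, platform):
    """Combine the XML jvmArgs list with the matching collation.properties entry."""
    # The XML value is a semicolon-separated list.
    args = [arg for arg in xml_value.split(";") if arg]
    # Solaris uses the sun properties; all other platforms use the ibm ones.
    vendor = "sun" if platform == "solaris" else "ibm"
    extra = props.get(f"com.collation.{service}.jvmargs.{vendor}", "")
    # Assumption: extra arguments are whitespace-separated.
    args.extend(extra.split())
    return args


xml_value = (
    "-Xms8M;-Xmx512M;"
    "-Djava.nio.channels.spi.SelectorProvider=sun.nio.ch.PollSelectorProvider"
)
props = {"com.collation.Discover.jvmargs.ibm": "-verbose:gc"}

linux_args = effective_jvm_args(xml_value, props, "Discover", "linux")
solaris_args = effective_jvm_args(xml_value, props, "Discover", "solaris")
print(linux_args)
print(solaris_args)
```

In this example the extra argument appears only in the non-Solaris result, because no sun property is defined for the Discover service.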

Conclusion

Deploying TADDM in a production environment is an iterative process involving initial discovery, refinement of access lists, refinement of scopes, refinement of custom servers, deployment of application descriptors, and rediscovery. During this process, it may be necessary to review and refine the various settings that impact performance. While the default settings generally provide adequate performance out of the box, it is possible to substantially increase performance by tailoring the configuration of TADDM to the environment it is managing.