
Memory leak on new connections #261

Closed
jakelandis opened this issue Feb 14, 2018 · 0 comments
jakelandis (Contributor) wrote:
As of 4.3.0 of this plugin, the Sequel connection to the database is opened and closed for each scheduled run. While this fixed some issues, it also appears to have introduced a memory leak.

Sequel's connect method clearly documents:

  # If a block is not given, a reference to this database will be held in
  # <tt>Sequel::DATABASES</tt> until it is removed manually.  This is by
  # design, and used by <tt>Sequel::Model</tt> to pick the default
  # database.  It is recommended to pass a block if you do not want the
  # resulting Database object to remain in memory until the process
  # terminates, or use the <tt>keep_reference: false</tt> Database option.

The connection code in this plugin neither passes a block nor sets the keep_reference flag. Inspection of the heap found that the database object was still referenced, preventing it from being reclaimed after each connect/disconnect cycle.

The fix for this is simple: we need to pass the keep_reference flag as false on connection. Local testing with a pipeline containing multiple jdbc inputs on a 1-second interval, with a non-trivial number of rows selected by the SQL, will, given enough time, show a relatively slow memory leak when profiling.
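To illustrate the retention pattern described above (this is a minimal sketch of the mechanism, not Sequel's actual internals; the Registry and connect names are hypothetical), a global registry that holds a reference to every connection unless the caller opts out looks like this:

```ruby
# Illustrative sketch only: Registry stands in for Sequel's global
# DATABASES registry; it is NOT Sequel's real implementation.
module Registry
  DATABASES = []

  def self.connect(opts = {})
    db = Object.new # stands in for the Database object
    # Unless the caller passes keep_reference: false, the registry
    # keeps a reference, so the object survives disconnect and can
    # never be garbage collected.
    DATABASES << db unless opts[:keep_reference] == false
    db
  end
end

Registry.connect(keep_reference: true)  # retained by the registry
Registry.connect(keep_reference: false) # eligible for GC after use
puts Registry::DATABASES.size           # prints 1: only the first is retained
```

With Sequel itself, the equivalent fix would be passing keep_reference: false to Sequel.connect (or supplying a block, per the documentation quoted above, so the reference is dropped when the block returns).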

Specifically, examining an hprof heap dump in VisualVM with the following OQL query [1] shows excessively high counts for keys of the configuration map, for example:

{
shallow_size = 7246080.0,
string = user,
count = 120768.0
}

{
shallow_size = 7244640.0,
string = password,
count = 120744.0
}

{
shallow_size = 7244280.0,
string = pool_timeout,
count = 120738.0
}

Profiling before and after setting this flag:
Before (the dip is me forcing GC): [profiler memory graph]
After: [profiler memory graph]

PR coming soon.

[1]

var countByValue = {};
var sizeByValue = {};

// Walk every RubyHash entry and aggregate count and shallow size by key symbol
heap.forEachObject(
  function(strObject) {
    var key = strObject.key.symbol;
    if (key) {
      key = key.toString();
    } else {
      key = 'unknown';
    }
    var count = countByValue[key];
    var size = sizeByValue[key] ? sizeByValue[key] : 0;
    countByValue[key] = count ? count + 1 : 1;
    sizeByValue[key] = size + sizeof(strObject);
  },
  'org.jruby.RubyHash$RubyHashEntry',
  false
);

// Transform the map into array
var mapEntries = [];
for (var i = 0, keys = Object.keys(countByValue), total = keys.length; i < total; i++) {
  mapEntries.push({
    count : countByValue[keys[i]],
    string : keys[i],
    shallow_size : sizeByValue[keys[i]]
  });
}

// Sort by counts
sort(mapEntries, 'rhs.count - lhs.count');

// Sort by size instead:
// sort(mapEntries, 'rhs.shallow_size - lhs.shallow_size');
jakelandis added a commit to jakelandis/logstash-input-jdbc that referenced this issue Feb 14, 2018