Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filebeat init script hangs on configtest #1213

Closed
jwadolowski opened this issue Mar 23, 2016 · 4 comments

Comments

Projects
None yet
2 participants
@jwadolowski
Copy link

commented Mar 23, 2016

Hey guys,

I've just started filebeat deployment on my local Vagrant machine and it turned out approach to start in init script file is a little bit controversial in my opinion.

Deployment details:

[root@vagrant-filebeat ~]# cat /etc/redhat-release 
CentOS release 6.7 (Final)
[root@vagrant-filebeat ~]# rpm -qa | grep filebeat
filebeat-1.1.2-1.x86_64
[root@vagrant-filebeat ~]# cat /etc/filebeat/filebeat.yml 
filebeat:
  prospectors: []
  registry_file: "/var/lib/filebeat/registry"
  config_dir: "/etc/filebeat/conf.d"
output:
  redis:
    host: localhost
    port: 6379
logging:
  level: info 
  to_syslog: true

The case is that you guys do a config test before start, which is perfectly fine and absolutely expected, but the options that have been used to execute that may cause some issues.

Here's the relevant code from /etc/init.d/filebeat:

[ -f /etc/sysconfig/filebeat ] && . /etc/sysconfig/filebeat
pidfile=${PIDFILE-/var/run/filebeat.pid}
agent=${PB_AGENT-/usr/bin/filebeat}
args="-c /etc/filebeat/filebeat.yml"
test_args="-e -configtest"
wrapper="filebeat-god"
wrapperopts="-r / -n -p $pidfile"
RETVAL=0

test() {
    $agent $args $test_args
}

start() {
    echo -n $"Starting filebeat: "
    test
    if [ $? -ne 0 ]; then
        echo
        exit 1
    fi
    daemon $daemonopts $wrapper $wrapperopts -- $agent $args
    RETVAL=$?
    echo
    return $RETVAL
}

In my case before start /usr/bin/filebeat -c /etc/filebeat/filebeat.yml -e -configtest is executed. The problem is that:

  • -e means file/syslog logging settings from YML file are completely ignored and all output goes to stderr, so script won't start in background and will just hang forever (ignore redis warnings for a while, explained in next point)
[root@vagrant-filebeat ~]# /etc/init.d/filebeat start
Starting filebeat: 2016/03/23 13:10:02.597246 geolite.go:24: INFO GeoIP disabled: No paths were set under output.geoip.paths
2016/03/23 13:10:02.597273 redis.go:72: WARN Redis Output is deprecated. Please use the Redis Output Plugin from Logstash instead.
2016/03/23 13:10:02.597294 redis.go:100: INFO Reconnect Interval set to: 1s
2016/03/23 13:10:02.597301 redis.go:117: INFO [RedisOutput] Using Redis server localhost:6379
2016/03/23 13:10:02.597308 redis.go:121: INFO [RedisOutput] Redis connection timeout 5s
2016/03/23 13:10:02.597315 redis.go:122: INFO [RedisOutput] Redis reconnect interval 1s
2016/03/23 13:10:02.597320 redis.go:123: INFO [RedisOutput] Using index pattern filebeat
2016/03/23 13:10:02.597324 redis.go:124: INFO [RedisOutput] Topology expires after 15s
2016/03/23 13:10:02.597328 redis.go:125: INFO [RedisOutput] Using db 0 for storing events
2016/03/23 13:10:02.597331 redis.go:126: INFO [RedisOutput] Using db 1 for storing topology
2016/03/23 13:10:02.597336 redis.go:127: INFO [RedisOutput] Using 0 data type
2016/03/23 13:10:02.597953 redis.go:183: WARN Error connecting to Redis (dial tcp [::1]:6379: getsockopt: connection refused). Retrying in 1s
2016/03/23 13:10:03.598405 redis.go:183: WARN Error connecting to Redis (dial tcp [::1]:6379: getsockopt: connection refused). Retrying in 1s
2016/03/23 13:10:04.599630 redis.go:183: WARN Error connecting to Redis (dial tcp [::1]:6379: getsockopt: connection refused). Retrying in 1s
  • even if I removed -e from test_args variable it doesn't really change much (it still hangs, but is just less verbose):
[root@vagrant-filebeat ~]# /etc/init.d/filebeat start
Starting filebeat: 

My expectation was that -configtest flag is quite fast and returns 0 when everything is fine when it comes to config file and non-zero exit code otherwise. By fine I mean not terribly broken (i.e. lack of output, wrong values, etc). I definitely didn't expect that output I defined (Redis in my case) needs to be up and running upfront. It's not ideal approach in my opinion. I'd expect a warning message in log file, but it shouldn't stop filebeat daemon from starting, as there's no reason to do so. I create dozens of machines at the same time (terraform + Chef) and have no idea when Redis or Logstash will become available. All I know is that eventually these services will be online.

Cheers,
Jakub

@jwadolowski

This comment has been minimized.

Copy link
Author

commented Mar 23, 2016

Didn't noticed that earlier, but test() is also executed in restart().

@tsg tsg added bug Filebeat labels Mar 23, 2016

@tsg

This comment has been minimized.

Copy link
Collaborator

commented Mar 23, 2016

Marked this as bug, I didn't expect this behavior either.

@tsg tsg added the P2 label Apr 27, 2016

@tsg

This comment has been minimized.

Copy link
Collaborator

commented Apr 27, 2016

To clarify, it looks like we're attempting to connect on -configtest. If the connect returns an error, we still print Config OK as expected. However, if the connect hangs, -configtest also hangs which is not nice.

I can confirm this as still happens in current master (~5.0.0-alpha1) with elasticsearch and redis outputs at least. Steps to reproduce:

  • Configure in topbeat.yml the elasticsearch host to something like google.com.
  • Start topbeat with: ./topbeat -e -c topbeat.dev.yml -d "*" -configtest

Program only returns after the DNS name is resolved.

CC @urso

@tsg

This comment has been minimized.

Copy link
Collaborator

commented Apr 27, 2016

Here are the logs from the above:

2016/04/27 12:03:44.594577 beat.go:196: INFO Home path: [/Users/tsg/src/github.com/elastic/beats/topbeat] Config path: [/Users/tsg/src/github.com/elastic/beats/topbeat] Data path: [/Users/tsg/src/github.com/elastic/beats/topbeat/data] Logs path: [/Users/tsg/src/github.com/elastic/beats/topbeat/logs]
2016/04/27 12:03:44.594634 filter.go:41: DBG  configuration []
2016/04/27 12:03:44.594665 filter.go:52: DBG  filters:
2016/04/27 12:03:44.594679 beat.go:202: DBG  Filters:
2016/04/27 12:03:44.594730 topbeat.go:98: DBG  Init topbeat
2016/04/27 12:03:44.594747 topbeat.go:99: DBG  Follow processes [".*"]
2016/04/27 12:03:44.594761 topbeat.go:100: DBG  Period 5s
2016/04/27 12:03:44.594776 topbeat.go:101: DBG  System statistics true
2016/04/27 12:03:44.594783 topbeat.go:102: DBG  Process statistics true
2016/04/27 12:03:44.594809 topbeat.go:103: DBG  File system statistics true
2016/04/27 12:03:44.594820 topbeat.go:104: DBG  Cpu usage per core false
2016/04/27 12:03:44.594826 beat.go:225: INFO Setup Beat: topbeat; Version: 5.0.0-SNAPSHOT
2016/04/27 12:03:44.594831 beat.go:227: DBG  Initializing output plugins
2016/04/27 12:03:44.594843 geolite.go:24: INFO GeoIP disabled: No paths were set under shipper.geoip.paths
2016/04/27 12:03:44.594943 output.go:140: INFO Loading template enabled. Reading template file: /Users/tsg/src/github.com/elastic/beats/topbeat/topbeat.template.json
2016/04/27 12:03:44.595165 client.go:452: DBG  ES Ping(url=http://google.com:9200, timeout=1m30s)


2016/04/27 12:04:03.294439 client.go:457: DBG  Ping request failed with: Head http://google.com:9200: dial tcp 216.58.213.14:9200: getsockopt: operation timed out
2016/04/27 12:04:03.294472 outputs.go:88: INFO Activated elasticsearch as output plugin.
2016/04/27 12:04:03.294488 publish.go:236: DBG  Create output worker
2016/04/27 12:04:03.294597 publish.go:278: DBG  No output is defined to store the topology. The server fields might not be filled.
2016/04/27 12:04:03.294631 publish.go:292: INFO Publisher name: Tudors-MacBook-Pro.local
2016/04/27 12:04:03.294960 async.go:63: INFO Flush Interval set to: 1s
2016/04/27 12:04:03.294976 async.go:64: INFO Max Bulk Size set to: 50
2016/04/27 12:04:03.294988 async.go:72: DBG  create bulk processing worker (interval=1s, bulk size=50)
Config OK
2016/04/27 12:04:03.295078 beat.go:261: INFO topbeat cleanup
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.