Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWX 11.0.0 Logstash connection refused #6794

Closed
ifeelcream opened this issue Apr 22, 2020 · 22 comments
Closed

AWX 11.0.0 Logstash connection refused #6794

ifeelcream opened this issue Apr 22, 2020 · 22 comments

Comments

@ifeelcream
Copy link

ifeelcream commented Apr 22, 2020

ISSUE TYPE
  • Bug Report
SUMMARY

AWX is not able to send logs to Logstash. The same settings are working in Tower (v.3.6.3 with Ansible 2.9.6). The "Test" Button tells me that the test was sent successfully. In the awx_web Container I can see the following log messages:

2020-04-22 11:40:25,479 ERROR    awx AWX Connection Test Message
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib64/python3.6/logging/handlers.py", line 936, in emit
    self.socket.send(msg)
OSError: [Errno 9] Bad file descriptor

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/logging/handlers.py", line 857, in _connect_unixsocket
    self.socket.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib64/python3.6/logging/handlers.py", line 939, in emit
    self._connect_unixsocket(self.address)
  File "/usr/lib64/python3.6/logging/handlers.py", line 868, in _connect_unixsocket
    self.socket.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
Call stack:
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/wsgi.py", line 141, in __call__
    response = self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/base.py", line 75, in get_response
    response = self._middleware_chain(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 93, in __call__
    response = self.process_request(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/sso/middleware.py", line 40, in process_request
    return self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/crum/__init__.py", line 97, in __call__
    response = self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/utils/deprecation.py", line 94, in __call__
    response = response or self.get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib64/python3.6/contextlib.py", line 52, in inner
    return func(*args, **kwds)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/views/decorators/csrf.py", line 54, in wrapped_view
    return view_func(*args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/django/views/generic/base.py", line 71, in view
    return self.dispatch(request, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/api/generics.py", line 299, in dispatch
    return super(APIView, self).dispatch(request, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/rest_framework/views.py", line 502, in dispatch
    response = handler(request, *args, **kwargs)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/conf/views.py", line 173, in post
    logging.getLogger('awx').error('AWX Connection Test Message')
  File "/usr/lib64/python3.6/logging/__init__.py", line 1337, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/usr/lib64/python3.6/logging/__init__.py", line 1444, in _log
    self.handle(record)
  File "/usr/lib64/python3.6/logging/__init__.py", line 1454, in handle
    self.callHandlers(record)
  File "/usr/lib64/python3.6/logging/__init__.py", line 1516, in callHandlers
    hdlr.handle(record)
  File "/usr/lib64/python3.6/logging/__init__.py", line 865, in handle
    self.emit(record)
  File "/var/lib/awx/venv/awx/lib/python3.6/site-packages/awx/main/utils/handlers.py", line 22, in emit
    return super(RSysLogHandler, self).emit(msg)
Message: 'AWX Connection Test Message'
Arguments: ()
ERROR:awx:AWX Connection Test Message
ENVIRONMENT
  • AWX version: 11.0.0
  • AWX install method: docker on linux
  • Ansible version: 2.9.5
  • Operating System: CentOS 7.7.1908
  • Web Browser: Brave
STEPS TO REPRODUCE

Trying to send a test message to Logstash.

EXPECTED RESULTS

A new log entry should be visible in Kibana.

ACTUAL RESULTS

No log entry in Kibana.

@ryanpetrello
Copy link
Contributor

@ifeelcream in the web container, could you run:

~ supervisorctl -c /supervisor.conf status

...and also (with any sensitive details redacted):

cat /var/lib/awx/rsyslog/rsyslog.conf

@ifeelcream
Copy link
Author

@ryanpetrello Thanks for your answer. Below are the outputs:

supervisorctl -c /supervisor.conf status

awx-config-watcher               RUNNING   pid 79, uptime 2 days, 0:29:57
tower-processes:awx-rsyslogd     FATAL     Exited too quickly (process log may have details)
tower-processes:daphne           RUNNING   pid 82, uptime 2 days, 0:29:57
tower-processes:nginx            RUNNING   pid 80, uptime 2 days, 0:29:57
tower-processes:uwsgi            RUNNING   pid 81, uptime 2 days, 0:29:57
tower-processes:wsbroadcast      RUNNING   pid 83, uptime 2 days, 0:29:57

cat /var/lib/awx/rsyslog/rsyslog.conf

$WorkDirectory /var/lib/awx/rsyslog
$MaxMessageSize 700000
$IncludeConfig /var/lib/awx/rsyslog/conf.d/*.conf
module(load="imuxsock" SysSock.Use="off")
input(type="imuxsock" Socket="/var/run/awx-rsyslog/rsyslog.sock" unlink="on")
template(name="awx" type="string" string="%rawmsg-after-pri%")
action(type="omfwd" target="redacted" port="514" protocol="udp" action.resumeRetryCount="-1" action.resumeInterval="5" template="awx")

@ryanpetrello
Copy link
Contributor

supervisorctl -c /supervisor.conf restart tower-processes:awx-rsyslogd

If you run this, do you see any clues in the web container logs? I'm guessing that rsyslogd is failing to bind to its socket when it comes online (or something similar); could be a permission issue.

@ifeelcream
Copy link
Author

I get the following errors:

2020-04-22 12:46:38,924 INFO spawned: 'awx-rsyslogd' with pid 290
rsyslogd: pidfile '/var/run/awx-rsyslog/rsyslog.pid' and pid 290 already exist.
If you want to run multiple instances of rsyslog, you need to specify
different pid files for them (-i option).
rsyslogd: run failed with error -3000 (see rsyslog.h or try https://www.rsyslog.com/e/3000 to learn what that number means)
2020-04-22 12:46:38,934 INFO exited: awx-rsyslogd (exit status 1; not expected)
2020-04-22 12:46:39,937 INFO spawned: 'awx-rsyslogd' with pid 291
rsyslogd: pidfile '/var/run/awx-rsyslog/rsyslog.pid' and pid 291 already exist.
If you want to run multiple instances of rsyslog, you need to specify
different pid files for them (-i option).
rsyslogd: run failed with error -3000 (see rsyslog.h or try https://www.rsyslog.com/e/3000 to learn what that number means)
2020-04-22 12:46:39,944 INFO exited: awx-rsyslogd (exit status 1; not expected)
2020-04-22 12:46:41,951 INFO spawned: 'awx-rsyslogd' with pid 292
rsyslogd: pidfile '/var/run/awx-rsyslog/rsyslog.pid' and pid 292 already exist.
If you want to run multiple instances of rsyslog, you need to specify
different pid files for them (-i option).
rsyslogd: run failed with error -3000 (see rsyslog.h or try https://www.rsyslog.com/e/3000 to learn what that number means)
2020-04-22 12:46:41,958 INFO exited: awx-rsyslogd (exit status 1; not expected)
2020-04-22 12:46:44,964 INFO spawned: 'awx-rsyslogd' with pid 293
rsyslogd: pidfile '/var/run/awx-rsyslog/rsyslog.pid' and pid 293 already exist.
If you want to run multiple instances of rsyslog, you need to specify
different pid files for them (-i option).
rsyslogd: run failed with error -3000 (see rsyslog.h or try https://www.rsyslog.com/e/3000 to learn what that number means)
2020-04-22 12:46:44,971 INFO exited: awx-rsyslogd (exit status 1; not expected)
2020-04-22 12:46:45,973 INFO gave up: awx-rsyslogd entered FATAL state, too many start retries too quickly

It seems that even though it changes the PID it is already used.

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Apr 22, 2020

Do you already have an rsyslogd daemon running in this container? What does ps say?

@ifeelcream
Copy link
Author

With ps I get no rsyslogd daemon shown:

    1 ?        00:00:05 tini
    6 ?        00:00:00 bash
   76 ?        00:00:45 supervisord
   79 ?        00:00:01 python3
   80 ?        00:00:00 nginx
   81 ?        00:00:08 uwsgi
   82 ?        00:01:06 daphne
   83 ?        00:01:16 awx-manage
   86 ?        00:00:23 uwsgi
   87 ?        00:00:24 uwsgi
   88 ?        00:00:22 uwsgi
   89 ?        00:00:34 uwsgi
   90 ?        00:00:27 uwsgi
  294 ?        00:00:00 ps

Also I didn't manually start or stop something in the container, just ran the install script.

@ryanpetrello
Copy link
Contributor

Is there a pre-existing /var/run/awx-rsyslog/rsyslog.pid after rsyslogd goes into fatal?

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Apr 22, 2020

This issue maybe looks related, but I don't see a resolution:

rsyslog/rsyslog#2134

@ryanpetrello
Copy link
Contributor

Also, can you share ls -la /var/run/awx-rsyslog/ ?

@ifeelcream
Copy link
Author

ifeelcream commented Apr 22, 2020

There was a rsyslog.pid. I deleted it and restarted the process with supervisorctl -c /supervisor.conf restart tower-processes:awx-rsyslogdand all the services are running now:

awx-config-watcher               RUNNING   pid 79, uptime 2 days, 0:57:19
tower-processes:awx-rsyslogd     RUNNING   pid 355, uptime 0:03:27
tower-processes:daphne           RUNNING   pid 82, uptime 2 days, 0:57:19
tower-processes:nginx            RUNNING   pid 80, uptime 2 days, 0:57:19
tower-processes:uwsgi            RUNNING   pid 81, uptime 2 days, 0:57:19
tower-processes:wsbroadcast      RUNNING   pid 83, uptime 2 days, 0:57:19

ls -la /var/run/awx-rsyslog/

-rw-r--r--. 1 root root 2 Apr 20 11:53 rsyslog.pid
srw-rw-rw-. 1 root root 0 Apr 20 11:53 rsyslog.sock

Now it is unfortunately still not working and the log is of no help:

2020-04-22 13:09:17,329 ERROR    awx AWX Connection Test Message
ERROR:awx:AWX Connection Test Message
[pid: 90|app: 0|req: 321/1440] 172.18.0.1 () {66 vars in 3190 bytes} [Wed Apr 22 13:09:17 2020] POST /api/v2/settings/logging/test/ => generated 0 bytes in 68 msecs (HTTP/1.1 202) 7 headers in 201 bytes (1 switches on core 0)
172.18.0.1 - - [22/Apr/2020:13:09:17 +0000] "POST /api/v2/settings/logging/test/ HTTP/1.1" 202 0 "https://localhost:9002/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36" "-"

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Apr 22, 2020

@ifeelcream let's raise the rsyslogd verbosity a bit and see if we can find any clues.

In the web container, edit /supervisor.conf, find the rsyslogd block, and change it from:

command = rsyslogd -n

...to...

command = rsyslogd -dn

Next, find the supervisord pid in your container (using ps - your comment above #6794 (comment) suggests 76), and reload supervisor:

kill -1 <pid>

This will cause rsyslogd to restart in debug mode, and if you do:

docker logs -f <web container> we should see some more verbose output from rsyslogd that should give us some clues.

@ryanpetrello
Copy link
Contributor

Hey @ifeelcream in the task container, do you see something like this?

bash-4.4# grep rsyslogd /supervisor_task.conf
[program:awx-rsyslogd]
command = rsyslogd -n -i /awx_devel/rsyslog.pid
programs=dispatcher,callback-receiver,awx-rsyslogd

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Apr 22, 2020

Okay, I think we found a bug specific to rsyslogd in the local docker installer that explains the behavior you're seeing @ifeelcream:

#6796

Trying this out will involve building your own images (or waiting until we've cut a new AWX release).

@ifeelcream
Copy link
Author

Thanks for all your help so far, really appreciate it!

Here is the output when testing the logging part:

2020-04-22 14:00:44,134 ERROR    awx AWX Connection Test Message
4044.136253179:imuxsock.c     : imuxsock.c: Message from UNIX socket: #4, size 213
4044.136270595:imuxsock.c     : parser.c: dropped LF at very end of message (DropTrailingLF is set)
4044.136275029:imuxsock.c     : datetime.c: ParseTIMESTAMP3339: invalid year: 0, pszTS: '"'
4044.136293717:imuxsock.c     : main Q: queue.c: EnqueueMsg advised worker start
4044.136297848:imuxsock.c     : imuxsock.c: --------imuxsock calling poll() on 2 fds
4044.136322313:main Q:Reg/w0  : wti 0x55b87b1cf370: wti.c: worker awoke from idle processing
4044.136330870:main Q:Reg/w0  : queue.c: DeleteProcessedBatch: we deleted 0 objects and enqueued 0 objects
4044.136333598:main Q:Reg/w0  : queue.c: rger: deleteBatchFromQStore, nElem 0
4044.136336265:main Q:Reg/w0  : queue.c: doDeleteBatch: delete batch from store, new sizes: log 1, phys 1
4044.136339319:main Q:Reg/w0  : main Q: queue.c: entry deleted, size now log 0, phys 1 entries
4044.136341854:main Q:Reg/w0  : main Q: queue.c: dequeued 1 consumable elements, szlog 0 sz phys 1
4044.136349304:main Q:Reg/w0  : ruleset.c: processBATCH: batch of 1 elements must be processed
4044.136351863:main Q:Reg/w0  : ruleset.c: processBATCH: next msg 0: <11>{"@timestamp": "2020-04-22T14:00:44.134Z", "message": "AWX Connection Test Message", "host": "awxweb", "level": "ERROR", "lo
4044.136355694:main Q:Reg/w0  : rainerscript.c:     ACTION 0 [builtin:omfwd:action(type="builtin:omfwd" ...)]
4044.136362608:main Q:Reg/w0  : ruleset.c: executing action 0
4044.136365967:main Q:Reg/w0  : ../action.c: action 'action-0-builtin:omfwd': called, logging to builtin:omfwd (susp 0/0, direct q 1)
4044.136371020:main Q:Reg/w0  : ../action.c: action 'action-0-builtin:omfwd': is transactional - executing in commit phase
4044.136374229:main Q:Reg/w0  : ../action.c: actionPrepare[action-0-builtin:omfwd]: enter
4044.136377306:main Q:Reg/w0  : ../action.c: checking external state file
4044.136379844:main Q:Reg/w0  : ../action.c: done checking external state file, iRet=0
4044.136382214:main Q:Reg/w0  : omfwd.c: omfwd: beginTransaction
4044.136385104:main Q:Reg/w0  : omfwd.c: omfwd: doTryResume redacted iRet 0
4044.136388595:main Q:Reg/w0  : ../action.c: action[action-0-builtin:omfwd] transitioned to state: itx
4044.136393963:main Q:Reg/w0  : ../action.c: action 'action-0-builtin:omfwd': set suspended state to 0
4044.136396200:main Q:Reg/w0  : ruleset.c: END batch execution phase, entering to commit phase [processed 1 of 1 messages]
4044.136398669:main Q:Reg/w0  : ../action.c: actionCommitAllDirect: action 0, state 1, nbr to commit 1 isTransactional 1
4044.136401052:main Q:Reg/w0  : ../action.c: actionCommit[action-0-builtin:omfwd]: enter, 1 msgs
4044.136402990:main Q:Reg/w0  : ../action.c: actionCommit[action-0-builtin:omfwd]: processing...
4044.136405059:main Q:Reg/w0  : ../action.c: actionTryCommit[action-0-builtin:omfwd] enter
4044.136406981:main Q:Reg/w0  : ../action.c: actionPrepare[action-0-builtin:omfwd]: enter
4044.136409317:main Q:Reg/w0  : ../action.c: doTransaction: have commitTransaction IF, using that, pWrkrInfo 0x55b87b1cf420
4044.136411575:main Q:Reg/w0  : ../action.c: entering actionCallCommitTransaction[action-0-builtin:omfwd], state: itx, nMsgs 1
4044.136414331:main Q:Reg/w0  : omfwd.c: omfwd: doTryResume redacted iRet 0
4044.136416894:main Q:Reg/w0  : omfwd.c:  redacted:514/udp
4044.136483361:main Q:Reg/w0  : ../action.c: actionCallCommitTransaction[action-0-builtin:omfwd] state: itx mod commitTransaction returned 0
4044.136497722:main Q:Reg/w0  : ../action.c: action[action-0-builtin:omfwd] transitioned to state: rdy
4044.136500056:main Q:Reg/w0  : ../action.c: actionCommit[action-0-builtin:omfwd]: return actionTryCommit 0
4044.136502190:main Q:Reg/w0  : ../action.c: actionCommit[action-0-builtin:omfwd]: done, iRet 0
4044.136504248:main Q:Reg/w0  : ruleset.c: processBATCH: batch of 1 elements has been processed
4044.136507390:main Q:Reg/w0  : queue.c: regular consumer finished, iret=0, szlog 0 sz phys 1
4044.136509996:main Q:Reg/w0  : queue.c: DeleteProcessedBatch: etry 0 state 3
4044.136513596:main Q:Reg/w0  : queue.c: DeleteProcessedBatch: we deleted 1 objects and enqueued 0 objects
4044.136515501:main Q:Reg/w0  : queue.c: rger: deleteBatchFromQStore, nElem 1
4044.136517652:main Q:Reg/w0  : queue.c: doDeleteBatch: delete batch from store, new sizes: log 0, phys 0
4044.136519774:main Q:Reg/w0  : main Q: queue.c: dequeued 0 consumable elements, szlog 0 sz phys 0
4044.136522032:main Q:Reg/w0  : queue.c: regular consumer finished, iret=4, szlog 0 sz phys 0
4044.136524399:main Q:Reg/w0  : wti.c: main Q:Reg/w0: worker IDLE, waiting for work.
[pid: 510|app: 0|req: 1/10] 172.18.0.1 () {66 vars in 3198 bytes} [Wed Apr 22 14:00:43 2020] POST /api/v2/settings/logging/test/ => generated 0 bytes in 405 msecs (HTTP/1.1 202) 7 headers in 201 bytes (1 switches on core 0)

When running the grep command in the task container I have the same output:

[program:awx-rsyslogd]
command = rsyslogd -n -i /awx_devel/rsyslog.pid
programs=dispatcher,callback-receiver,awx-rsyslogd

@ifeelcream
Copy link
Author

Okay, I think we found a bug specific to rsyslogd in the local docker installer that explains the behavior you're seeing @ifeelcream:

#6796

Trying this out will involve building your own images (or waiting until we've cut a new AWX release).

That is good to hear! Do you have a rough idea how long it would take to get into the new release? Right now I am still testing AWX for production so I am not in a rush.

@ryanpetrello
Copy link
Contributor

@ifeelcream usually we cut a new release every 4-8 weeks? It generally depends on the severity of discovered issues.

@ifeelcream
Copy link
Author

@ryanpetrello Alright, thanks for the feedback and all your help :)

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Apr 22, 2020

So your rsyslogd output suggests to me that rsyslogd is getting events from AWX and is shipping them out. Are you certain that your log aggregator is open and reachable?

You might want to do something like:

nc -vz <host> 514
tcpdump -i eth0 'dst port 514' -s1024 -vvv

@ifeelcream
Copy link
Author

ncat tells me it is connected to the host:

Ncat: Version 7.70 ( https://nmap.org/ncat )
Ncat: Connected to redacted:514.
Ncat: 0 bytes sent, 0 bytes received in 0.01 seconds.

tcpdump:

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 1024 bytes

14:21:23.562804 IP (tos 0x0, ttl 64, id 33526, offset 0, flags [DF], proto UDP (17), length 236)
    awxweb.33231 > redacted.syslog: [bad udp cksum 0xd564 -> 0x606f!] [|syslog]
14:21:36.239296 IP (tos 0x0, ttl 64, id 41237, offset 0, flags [DF], proto UDP (17), length 236)
    awxweb.33231 > redacted.syslog: [bad udp cksum 0xd564 -> 0x5971!] [|syslog]
14:22:41.830562 IP (tos 0x0, ttl 64, id 2659, offset 0, flags [DF], proto UDP (17), length 584)
    awxweb.33231 > redacted.syslog: [bad udp cksum 0xd6c0 -> 0xc4ba!] [|syslog]
14:22:43.740552 IP (tos 0x0, ttl 64, id 14728, offset 0, flags [DF], proto TCP (6), length 60)
    awxweb.43594 > redacted.shell: Flags [S], cksum 0xd4a9 (incorrect -> 0x23b4), seq 3237654412, win 29200, options [mss 1460,sackOK,TS val 719885828 ecr 0,nop,wscale 7], length 0
14:22:43.741513 IP (tos 0x0, ttl 64, id 14729, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43594 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0x1db3), seq 3237654413, ack 297480902, win 229, options [nop,nop,TS val 719885829 ecr 421349449], length 0
14:22:43.741602 IP (tos 0x0, ttl 64, id 14730, offset 0, flags [DF], proto TCP (6), length 261)
    awxweb.43594 > redacted.shell: Flags [P.], cksum 0xd572 (incorrect -> 0xc358), seq 0:209, ack 1, win 229, options [nop,nop,TS val 719885829 ecr 421349449], length 209
14:22:43.744394 IP (tos 0x0, ttl 64, id 50513, offset 0, flags [DF], proto TCP (6), length 60)
    awxweb.43596 > redacted.shell: Flags [S], cksum 0xd4a9 (incorrect -> 0x9682), seq 2858242902, win 29200, options [mss 1460,sackOK,TS val 719885831 ecr 0,nop,wscale 7], length 0
14:22:43.745018 IP (tos 0x0, ttl 64, id 50514, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43596 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0x6b50), seq 2858242903, ack 2180638645, win 229, options [nop,nop,TS val 719885832 ecr 421349452], length 0
14:22:43.745071 IP (tos 0x0, ttl 64, id 50515, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43596 > redacted.shell: Flags [F.], cksum 0xd4a1 (incorrect -> 0x6b4f), seq 0, ack 1, win 229, options [nop,nop,TS val 719885832 ecr 421349452], length 0
14:22:43.751229 IP (tos 0x0, ttl 64, id 50516, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43596 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0x6b42), seq 1, ack 2, win 229, options [nop,nop,TS val 719885838 ecr 421349458], length 0
14:22:54.282574 IP (tos 0x0, ttl 64, id 14731, offset 0, flags [DF], proto TCP (6), length 261)
    awxweb.43594 > redacted.shell: Flags [P.], cksum 0xd572 (incorrect -> 0xa552), seq 209:418, ack 1, win 229, options [nop,nop,TS val 719896370 ecr 421349450], length 209
14:22:54.294457 IP (tos 0x0, ttl 64, id 4173, offset 0, flags [DF], proto TCP (6), length 60)
    awxweb.43606 > redacted.shell: Flags [S], cksum 0xd4a9 (incorrect -> 0xc607), seq 4118654832, win 29200, options [mss 1460,sackOK,TS val 719896381 ecr 0,nop,wscale 7], length 0
14:22:54.295049 IP (tos 0x0, ttl 64, id 4174, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43606 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0xb42b), seq 4118654833, ack 538117904, win 229, options [nop,nop,TS val 719896382 ecr 421360002], length 0
14:22:54.295070 IP (tos 0x0, ttl 64, id 4175, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43606 > redacted.shell: Flags [F.], cksum 0xd4a1 (incorrect -> 0xb42a), seq 0, ack 1, win 229, options [nop,nop,TS val 719896382 ecr 421360002], length 0
14:22:54.300035 IP (tos 0x0, ttl 64, id 61047, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43606 > redacted.shell: Fl^C
[root@nins-runner-dev ~]# docker exec 12050a5827b3 tcpdump -i eth0 'dst port 514' -s1024 -vvv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 1024 bytes
14:23:35.467640 IP (tos 0x0, ttl 64, id 14732, offset 0, flags [DF], proto TCP (6), length 261)
    awxweb.43594 > redacted.shell: Flags [P.], cksum 0xd572 (incorrect -> 0xd275), seq 3237654831:3237655040, ack 297480902, win 229, options [nop,nop,TS val 719937555 ecr 421359991], length 209
14:23:35.480869 IP (tos 0x0, ttl 64, id 6672, offset 0, flags [DF], proto TCP (6), length 60)
    awxweb.43636 > redacted.shell: Flags [S], cksum 0xd4a9 (incorrect -> 0x2db2), seq 571636272, win 29200, options [mss 1460,sackOK,TS val 719937568 ecr 0,nop,wscale 7], length 0
14:23:35.481518 IP (tos 0x0, ttl 64, id 6673, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43636 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0xfdf6), seq 571636273, ack 4073107799, win 229, options [nop,nop,TS val 719937569 ecr 421401189], length 0
14:23:35.481558 IP (tos 0x0, ttl 64, id 6674, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43636 > redacted.shell: Flags [F.], cksum 0xd4a1 (incorrect -> 0xfdf5), seq 0, ack 1, win 229, options [nop,nop,TS val 719937569 ecr 421401189], length 0
14:23:35.487056 IP (tos 0x0, ttl 64, id 4877, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43636 > redacted.shell: Flags [.], cksum 0xfde9 (correct), seq 1, ack 2, win 229, options [nop,nop,TS val 719937574 ecr 421401195], length 0
14:23:48.033255 IP (tos 0x0, ttl 64, id 14733, offset 0, flags [DF], proto TCP (6), length 261)
    awxweb.43594 > redacted.shell: Flags [P.], cksum 0xd572 (incorrect -> 0x04b0), seq 209:418, ack 1, win 229, options [nop,nop,TS val 719950120 ecr 421401176], length 209
14:23:48.045305 IP (tos 0x0, ttl 64, id 38563, offset 0, flags [DF], proto TCP (6), length 60)
    awxweb.43642 > redacted.shell: Flags [S], cksum 0xd4a9 (incorrect -> 0x8b39), seq 101320599, win 29200, options [mss 1460,sackOK,TS val 719950132 ecr 0,nop,wscale 7], length 0
14:23:48.045950 IP (tos 0x0, ttl 64, id 38564, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43642 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0x8372), seq 101320600, ack 547236471, win 229, options [nop,nop,TS val 719950133 ecr 421413754], length 0
14:23:48.046001 IP (tos 0x0, ttl 64, id 38565, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43642 > redacted.shell: Flags [F.], cksum 0xd4a1 (incorrect -> 0x8371), seq 0, ack 1, win 229, options [nop,nop,TS val 719950133 ecr 421413754], length 0
14:23:48.051824 IP (tos 0x0, ttl 64, id 5345, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43642 > redacted.shell: Flags [.], cksum 0x8364 (correct), seq 1, ack 2, win 229, options [nop,nop,TS val 719950139 ecr 421413760], length 0
14:23:51.633234 IP (tos 0x0, ttl 64, id 14734, offset 0, flags [DF], proto TCP (6), length 261)
    awxweb.43594 > redacted.shell: Flags [P.], cksum 0xd572 (incorrect -> 0xc5b7), seq 418:627, ack 1, win 229, options [nop,nop,TS val 719953720 ecr 421413742], length 209
14:23:51.644662 IP (tos 0x0, ttl 64, id 35358, offset 0, flags [DF], proto TCP (6), length 60)
    awxweb.43652 > redacted.shell: Flags [S], cksum 0xd4a9 (incorrect -> 0xe72e), seq 2385974618, win 29200, options [mss 1460,sackOK,TS val 719953732 ecr 0,nop,wscale 7], length 0
14:23:51.645311 IP (tos 0x0, ttl 64, id 35359, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43652 > redacted.shell: Flags [.], cksum 0xd4a1 (incorrect -> 0xd295), seq 2385974619, ack 1198522985, win 229, options [nop,nop,TS val 719953732 ecr 421417353], length 0
14:23:51.645367 IP (tos 0x0, ttl 64, id 35360, offset 0, flags [DF], proto TCP (6), length 52)
    awxweb.43652 > redacted.shell: Flags [F.], cksum 0xd4a1 (incorrect -> 0

The Ansible Tower host is in the same network with the same Firewall Rules and OS and has no problems connecting. If this looks like some network problem I will try and investigate again if there was some misconfiguration.

@ifeelcream
Copy link
Author

I looked into Kibana again and now I can see the messages from yesterday! It seems the halted rsyslogd service was the initial problem and that I looked for a hostname instead of an IP in Kibana. So I guess I have to adjust the name of the docker web container in the inventory file and then all should be fine.

@ryanpetrello
Copy link
Contributor

ryanpetrello commented Apr 23, 2020

@ifeelcream just a heads up, we decided to cut a release today (awx 11.1.0) 😄, and it includes the aforementioned bug fix, which I think is what was ailing you originally:

#6796

Thanks for your help digging!

@ifeelcream
Copy link
Author

@ryanpetrello That was much faster than expected, perfect :D Thank you again for all your support!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants