Improved log messages and exceptions handling #23480

Merged

nico-stefani
Member

Related issue
#23471

Description

This PR closes #23471. It improves the exception-related log messages of the HAProxy helper and adds better handling for WazuhHAPHelperError.
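As a rough illustration of the logging behavior shown in the examples of this PR (a single error line at the default debug level, the same line plus a traceback at a higher debug level), here is a minimal sketch. `HelperError` and `log_helper_error` are hypothetical names used only for illustration; they are not the actual Wazuh implementation, and the real `WazuhHAPHelperError` constructor may differ.

```python
import logging


class HelperError(Exception):
    """Hypothetical stand-in for WazuhHAPHelperError (not the real class)."""

    def __init__(self, code: int, message: str, extra_message: str = ""):
        self.code = code
        text = f"Error {code} - {message}"
        if extra_message:
            text = f"{text}: {extra_message}"
        super().__init__(text)


def log_helper_error(logger: logging.Logger, exc: Exception) -> None:
    """Log the error message; attach the traceback only when DEBUG is enabled."""
    debug_enabled = logger.isEnabledFor(logging.DEBUG)
    # Passing the exception instance as exc_info makes logging render its traceback.
    logger.error(str(exc), exc_info=exc if debug_enabled else False)
```

With a logger set to INFO, `log_helper_error` would emit only the `Error <code> - ...` line; with the logger set to DEBUG, the traceback of the caught exception follows it.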

Logs/Alerts example

wazuh_clusterd.debug=0

  • Start HAProxy helper when the DataplaneAPI is not running
root@wazuh-master:/var/ossec# framework/python/bin/python3 framework/scripts/wazuh_clusterd.py -f
Starting cluster in foreground (pid: 351160)
2024/05/16 23:45:13 INFO: [Local Server] [Main] Serving on /var/ossec/queue/cluster/c-internal.sock
2024/05/16 23:45:13 INFO: [Master] [Main] Serving on ('0.0.0.0', 1516)
2024/05/16 23:45:13 INFO: [Master] [Local integrity] Starting.
2024/05/16 23:45:13 INFO: [Master] [Local agent-groups] Sleeping 30s before starting the agent-groups task, waiting for the workers connection.
2024/05/16 23:45:13 CRITICAL: [HAPHelper] [Main] Error 3043 - Could not initialize Proxy API: Check connectivity and the configuration in the `ossec.conf`
2024/05/16 23:45:13 INFO: [HAPHelper] [Main] Task ended
2024/05/16 23:45:13 INFO: [Master] [Local integrity] Finished in 0.085s. Calculated metadata of 34 files.
2024/05/16 23:45:16 INFO: [Worker] [Main] Connection from ('172.27.0.4', 47448)
2024/05/16 23:45:16 INFO: [Worker] [Main] Connection from ('172.27.0.3', 52276)
  • HAProxy is stopped during the helper cycle
2024/05/16 23:55:13 INFO: [Worker worker1] [Agent-groups send] Starting.
2024/05/16 23:55:13 INFO: [Worker worker2] [Agent-groups send] Starting.
2024/05/16 23:55:13 INFO: [Worker worker1] [Agent-groups send] Finished in 0.006s. Updated 1 chunks.
2024/05/16 23:55:13 INFO: [Worker worker2] [Agent-groups send] Finished in 0.006s. Updated 1 chunks.
2024/05/16 23:55:14 ERROR: [HAPHelper] [Main] Error 3045 - Could not connect to HAProxy: no data for wazuh_cluster/master-node: not found
2024/05/16 23:55:14 WARNING: [HAPHelper] [Main] Tasks may not perform as expected. Sleeping 60s before trying again...
2024/05/16 23:55:14 INFO: [Master] [Local integrity] Starting.
2024/05/16 23:55:14 INFO: [Master] [Local integrity] Finished in 0.003s. Calculated metadata of 34 files.
2024/05/16 23:55:20 INFO: [Worker worker1] [Integrity check] Starting.
2024/05/16 23:55:20 INFO: [Worker worker1] [Integrity check] Finished in 0.006s. Received metadata of 34 files. Sync not required.
  • DataplaneAPI is stopped during the helper cycle
2024/05/16 23:58:33 INFO: [Worker worker1] [Agent-groups send] Starting.
2024/05/16 23:58:33 INFO: [Worker worker2] [Agent-groups send] Starting.
2024/05/16 23:58:33 INFO: [Worker worker1] [Agent-groups send] Finished in 0.006s. Updated 1 chunks.
2024/05/16 23:58:33 INFO: [Worker worker2] [Agent-groups send] Finished in 0.020s. Updated 1 chunks.
2024/05/16 23:58:33 ERROR: [HAPHelper] [Main] Error 3044 - Could not connect to the HAProxy dataplane API: All connection attempts failed
2024/05/16 23:58:33 WARNING: [HAPHelper] [Main] Tasks may not perform as expected. Sleeping 60s before trying again...
2024/05/16 23:58:39 INFO: [Master] [Local integrity] Starting.
2024/05/16 23:58:39 INFO: [Master] [Local integrity] Finished in 0.004s. Calculated metadata of 34 files.
2024/05/16 23:58:41 INFO: [Worker worker2] [Agent-info sync] Starting.
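
The debug=0 output above suggests two paths: a failure while initializing the proxy ends the helper task, while a failure inside the helper cycle is logged and retried after a pause. A minimal sketch of that control flow follows. The function `run_helper`, its parameters, and the `HelperError` class are assumptions made for illustration and do not mirror the real `hap_helper.py` code.

```python
import asyncio
import logging


class HelperError(Exception):
    """Hypothetical stand-in for WazuhHAPHelperError (not the real class)."""


async def run_helper(initialize, cycle, logger, sleep_time=60, max_cycles=None):
    """Run `cycle` repeatedly; `max_cycles` exists only to make the sketch testable."""
    try:
        try:
            await initialize()
        except HelperError as exc:
            # Initialization failure: report it and give up.
            logger.critical(str(exc))
            return
        done = 0
        while max_cycles is None or done < max_cycles:
            try:
                await cycle()
            except HelperError as exc:
                # Failure inside the cycle: report it and try again later.
                logger.error(str(exc))
                logger.warning(
                    "Tasks may not perform as expected. "
                    "Sleeping %ds before trying again...", sleep_time
                )
            await asyncio.sleep(sleep_time)
            done += 1
    finally:
        logger.info("Task ended")
```

Keeping the two `except` blocks separate is what lets an unreachable Dataplane API at startup stop the task, whereas a later outage only delays the next cycle.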

wazuh_clusterd.debug=2

  • Start HAProxy helper when the DataplaneAPI is not running
root@wazuh-master:/var/ossec# framework/python/bin/python3 framework/scripts/wazuh_clusterd.py -f
2024/05/17 00:03:04 DEBUG: [Cluster] [Main] Removing '/var/ossec/queue/cluster/'.
2024/05/17 00:03:04 DEBUG: [Cluster] [Main] Removed '/var/ossec/queue/cluster/'.
Starting cluster in foreground (pid: 363679)
2024/05/17 00:03:04 INFO: [Local Server] [Main] Serving on /var/ossec/queue/cluster/c-internal.sock
2024/05/17 00:03:04 DEBUG: [Local Server] [Keep alive] Calculating.
2024/05/17 00:03:04 DEBUG: [Local Server] [Keep alive] Calculated.
2024/05/17 00:03:04 INFO: [Master] [Main] Serving on ('0.0.0.0', 1516)
2024/05/17 00:03:04 DEBUG: [Master] [Keep alive] Calculating.
2024/05/17 00:03:04 DEBUG: [Master] [Keep alive] Calculated.
2024/05/17 00:03:04 INFO: [Master] [Local integrity] Starting.
2024/05/17 00:03:04 INFO: [Master] [Local agent-groups] Sleeping 30s before starting the agent-groups task, waiting for the workers connection.
2024/05/17 00:03:04 ERROR: [HAPHelper] [Main] Error 3043 - Could not initialize Proxy API: Check connectivity and the configuration in the `ossec.conf`
Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection.py", line 99, in handle_async_request
    raise exc
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection.py", line 76, in handle_async_request
    stream = await self._connect(request)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection.py", line 122, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_backends/auto.py", line 30, in connect_tcp
    return await self._backend.connect_tcp(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 114, in connect_tcp
    with map_exceptions(exc_map):
  File "/var/ossec/framework/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: All connection attempts failed
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 107, in initialize
    response = await client.get(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1801, in get
    return await self.request(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1574, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1661, in send
    response = await self._send_handling_auth(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1689, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1726, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1763, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 372, in handle_async_request
    with map_httpcore_exceptions():
  File "/var/ossec/framework/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: All connection attempts failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/hap_helper.py", line 513, in start
    await helper.initialize_proxy()
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/hap_helper.py", line 86, in initialize_proxy
    await self.proxy.initialize()
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 536, in initialize
    await self.api.initialize()
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 116, in initialize
    raise WazuhHAPHelperError(
wazuh.core.exception.WazuhHAPHelperError: Error 3043 - Could not initialize Proxy API: Check connectivity and the configuration in the `ossec.conf`
2024/05/17 00:03:04 INFO: [HAPHelper] [Main] Task ended
2024/05/17 00:03:04 INFO: [Master] [Local integrity] Finished in 0.094s. Calculated metadata of 34 files.
2024/05/17 00:03:07 INFO: [Worker] [Main] Connection from ('172.27.0.3', 49922)
2024/05/17 00:03:07 DEBUG: [Worker] [Main] Command received: b'hello'
2024/05/17 00:03:07 INFO: [Worker] [Main] Connection from ('172.27.0.4', 52906)
2024/05/17 00:03:07 DEBUG: [Worker] [Main] Command received: b'hello'
^C2024/05/17 00:03:07 INFO: [Cluster] [Main] SIGINT received. Bye!
  • HAProxy is stopped during the helper cycle
2024/05/17 00:06:46 INFO: [Master] [Local agent-groups] Finished in 0.001s.
2024/05/17 00:06:46 INFO: [Worker worker1] [Agent-groups send] Starting.
2024/05/17 00:06:46 INFO: [Worker worker2] [Agent-groups send] Starting.
2024/05/17 00:06:46 DEBUG: [Worker worker1] [Agent-groups send] Sending chunks.
2024/05/17 00:06:46 DEBUG: [Worker worker2] [Agent-groups send] Sending chunks.
2024/05/17 00:06:46 DEBUG: [Worker worker1] [Main] Command received: b'syn_w_g_e'
2024/05/17 00:06:46 INFO: [Worker worker1] [Agent-groups send] Finished in 0.005s. Updated 1 chunks.
2024/05/17 00:06:46 DEBUG: [Worker worker2] [Main] Command received: b'syn_w_g_e'
2024/05/17 00:06:46 INFO: [Worker worker2] [Agent-groups send] Finished in 0.006s. Updated 1 chunks.
2024/05/17 00:06:46 DEBUG2: [HAPHelper] [Proxy] Obtained proxy servers
2024/05/17 00:06:46 ERROR: [HAPHelper] [Main] Error 3045 - Could not connect to HAProxy: no data for wazuh_cluster/master-node: not found
Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/hap_helper.py", line 374, in manage_wazuh_cluster_nodes
    await self.backend_servers_state_healthcheck()
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/hap_helper.py", line 139, in backend_servers_state_healthcheck
    if await self.proxy.is_server_drain(server_name=server):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 824, in is_server_drain
    server_stats = await self.api.get_backend_server_runtime_settings(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 422, in get_backend_server_runtime_settings
    return await self._make_hap_request(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 185, in _make_hap_request
    raise WazuhHAPHelperError(3045, extra_message=response.json()['message'])
wazuh.core.exception.WazuhHAPHelperError: Error 3045 - Could not connect to HAProxy: no data for wazuh_cluster/master-node: not found
2024/05/17 00:06:46 WARNING: [HAPHelper] [Main] Tasks may not perform as expected. Sleeping 60s before trying again...
  • DataplaneAPI is stopped during the helper cycle
2024/05/17 00:09:39 INFO: [Worker worker1] [Agent-groups send] Finished in 0.006s. Updated 1 chunks.
2024/05/17 00:09:39 DEBUG: [Worker worker2] [Main] Command received: b'syn_w_g_e'
2024/05/17 00:09:39 INFO: [Worker worker2] [Agent-groups send] Finished in 0.009s. Updated 1 chunks.
2024/05/17 00:09:39 ERROR: [HAPHelper] [Main] Error 3044 - Could not connect to the HAProxy dataplane API: All connection attempts failed
Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
    yield
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
    raise exc from None
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
    response = await connection.handle_async_request(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection.py", line 99, in handle_async_request
    raise exc
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection.py", line 76, in handle_async_request
    stream = await self._connect(request)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_async/connection.py", line 122, in _connect
    stream = await self._network_backend.connect_tcp(**kwargs)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_backends/auto.py", line 30, in connect_tcp
    return await self._backend.connect_tcp(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 114, in connect_tcp
    with map_exceptions(exc_map):
  File "/var/ossec/framework/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc) from exc
httpcore.ConnectError: All connection attempts failed
The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 160, in _make_hap_request
    response = await client.request(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1574, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1661, in send
    response = await self._send_handling_auth(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1689, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1726, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_client.py", line 1763, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 372, in handle_async_request
    with map_httpcore_exceptions():
  File "/var/ossec/framework/python/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
    raise mapped_exc(message) from exc
httpx.ConnectError: All connection attempts failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/hap_helper.py", line 374, in manage_wazuh_cluster_nodes
    await self.backend_servers_state_healthcheck()
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/hap_helper.py", line 138, in backend_servers_state_healthcheck
    for server in (await self.proxy.get_current_backend_servers()).keys():
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 701, in get_current_backend_servers
    api_response = await self.api.get_backend_servers(self.wazuh_backend)
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 281, in get_backend_servers
    return await self._make_hap_request(
  File "/var/ossec/framework/python/lib/python3.10/site-packages/wazuh/core/cluster/hap_helper/proxy.py", line 168, in _make_hap_request
    raise WazuhHAPHelperError(3044, extra_message=str(request_exc))
wazuh.core.exception.WazuhHAPHelperError: Error 3044 - Could not connect to the HAProxy dataplane API: All connection attempts failed
2024/05/17 00:09:39 WARNING: [HAPHelper] [Main] Tasks may not perform as expected. Sleeping 60s before trying again...
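
The tracebacks above contain two different separators: "The above exception was the direct cause of the following exception" (where the raising code uses `raise ... from exc`, as httpx does) and "During handling of the above exception, another exception occurred" (where a new exception is raised inside an `except` block without `from`, as the `raise WazuhHAPHelperError(...)` lines appear to do). The sketch below reproduces both forms with a hypothetical `HelperError`; it only illustrates Python's exception chaining and is not taken from the Wazuh code.

```python
import traceback


class HelperError(Exception):
    """Hypothetical stand-in for WazuhHAPHelperError (not the real class)."""


def explicit_chain():
    try:
        raise ConnectionError("All connection attempts failed")
    except ConnectionError as exc:
        # Sets __cause__: rendered as "direct cause of the following exception".
        raise HelperError("Error 3044") from exc


def implicit_chain():
    try:
        raise ConnectionError("All connection attempts failed")
    except ConnectionError:
        # Only __context__ is set: rendered as "During handling of the above exception".
        raise HelperError("Error 3044")


def render(func) -> str:
    """Return the full chained traceback produced by calling `func`."""
    try:
        func()
    except HelperError as exc:
        return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    return ""
```

Calling `render(explicit_chain)` and `render(implicit_chain)` returns the two traceback styles as strings, which can be compared against the output above.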

@nico-stefani nico-stefani self-assigned this May 17, 2024
@nico-stefani nico-stefani force-pushed the fix/23471-improve-log-messages branch from c212596 to 591726a Compare May 17, 2024 00:25
@fdalmaup fdalmaup linked an issue May 17, 2024 that may be closed by this pull request
4 tasks
@fdalmaup fdalmaup self-requested a review May 17, 2024 10:58
Member

@fdalmaup fdalmaup left a comment

LGTM!

@Selutario Selutario merged commit 1d11841 into epic-20887-migrate-haproxy-helper May 17, 2024
6 of 7 checks passed
@Selutario Selutario deleted the fix/23471-improve-log-messages branch May 17, 2024 15:53
Development

Successfully merging this pull request may close these issues.

Improve logs for the HAProxy Helper