There is a race condition involving clusterd and modulesd that occurs between lines 362 and 365. If modulesd removes the temp file created by the `safe_move` function before line 372, a `No such file or directory` exception is raised.
A possible workaround is to change the temporary file name from `myfile.tmp` to `.myfile.tmp`, so that modulesd ignores it.
I strongly recommend performing a refactor in all daemons so any resource (file, database or whatever) is managed by only one process. Any access to these resources should be delegated to its owner.
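The hidden-temp-file workaround described above could look roughly like the sketch below. This is not the actual Wazuh `safe_move` code: `hidden_tmp_path` and `safe_move_hidden` are hypothetical names, and the sketch assumes Python 3 and that the temp file and the destination live on the same filesystem, so the final rename is atomic.

```python
import os
import shutil


def hidden_tmp_path(dest):
    """Return '<dir>/.<name>.tmp' for a destination path (hypothetical helper)."""
    directory, name = os.path.split(dest)
    return os.path.join(directory, "." + name + ".tmp")


def safe_move_hidden(source, dest):
    """Move source to dest through a dot-prefixed temp file next to dest.

    Assumption: other daemons skip dot-prefixed files, so the temp file is
    not removed between the copy and the rename.
    """
    tmp = hidden_tmp_path(dest)
    shutil.copy2(source, tmp)  # stage the content under the hidden name
    os.replace(tmp, dest)      # rename into place in a single step
    os.remove(source)          # drop the original once dest is in place
```

This only narrows the race by making the temp file invisible to a process that skips dot-files. It does not replace the single-owner refactor recommended above.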
Hello @crd1985,
Do you think this issue might also affect agent status in clustered setups?
I am facing an annoying problem in such a setup, which to the best of my understanding is related to the "Error updating agent group/status ... No such file or directory" message.
The problem:
According to the cluster (CLI tools, API, Kibana app), some agents are disconnected; however, I still receive their events normally. Specifically, all disconnected agents are associated with one worker node.
The agent-info files are updated normally on this worker node, but the info is not synced back to the master.
I have gone through the cluster logs and there are a ton of error messages.
If I restart the wazuh-manager on the master, all agents immediately show as connected.
Note that I took one worker down for a week and did not experience any such problems; all agents were connected.
Setup: 1 master, 2 workers, No LB.
Wazuh agents are configured in failover mode, and I have manually modified the order of "managers" in each agent config file.
Do you think this is related to the issue you describe? Should I try another version?
Sometimes an error appears in `cluster.log`. The error arises due to the following function:
`wazuh/framework/wazuh/utils.py`, lines 348 to 372 in ee0943d