New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix unneeded stops and restarts in AD monitoring #1340
Conversation
rebuild |
s.connect((host, port)) | ||
connected = True | ||
for i in range(0, max_tries): | ||
# Make three attempts to get SRV records from DNS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Make max_tries attempts...." - current value of 3 is definable
gui/common/freenasldap.py
Outdated
r.lifetime = _fs().directoryservice.activedirectory.dns.lifetime | ||
r.timeout = r.lifetime / 3 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the reasons you made it this way and that you don't want to mess with sysctl() updates, but I'd keep ability to modify timeout through the sysctl(). How about:
r.timeout = _fs().directoryservice.activedirectory.dns.timeout / 2
Or
r.lifetime = _fs().directoryservice.activedirectory.dns.lifetime * 2
if host_list: | ||
break | ||
else: | ||
self.logger.debug("[ServiceMonitorThread] Attempt %d to query SRV records failed " % (i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, try to avoid legacy "%x" % (v)
syntax. Use either f'{v}'
or '{}'.format(v)
instead.
|
||
if connected: | ||
break | ||
for h in host_list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a bit hard to see it here, but looks like host_list
you are referencing here is declared, if any, in another scope. I actually commented about this to you before.
It's declared on an inner scope regarding this one. Also, if if self.name != 'activedirectory':
- it's not declared at all. Move host_list = []
outside of the if
.
elif service == 'ldap': | ||
fnldap = FreeNAS_LDAP(flags=FLAGS_DBINIT) | ||
else: | ||
fnldap = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor thing, but you can just declare above fnldap = None
and avoid having else:
clause.
@@ -82,31 +85,36 @@ def isEnabled(self, service): | |||
return enabled | |||
|
|||
@private | |||
def tryConnect(self, host, port): | |||
def tryConnect(self, host, port, fnldap): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to declare it as: def tryConnect(self, host, port, fnldap = None):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, as we only call get_ldap_servers
we don't need all this code around fnldap
, see below.
@miwi-fbsd says that Jenkins fails cause your branch is outdated and requires rebase. Can you try it with next iteration? |
for i in range(0, max_tries): | ||
# Make max_tries attempts to get SRV records from DNS | ||
host_list = fnldap.get_ldap_servers(host) | ||
if host_list: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, who is the host
here? Originally that was one of the DCs, but get_ldap_servers()
expects domain
instead:
def get_ldap_servers(domain, site=None):
dcs = []
if not domain:
return dcs
host = "_ldap._tcp.%s" % domain
if site:
host = "_ldap._tcp.%s._sites.%s" % (site, domain)
dcs = FreeNAS_ActiveDirectory_Base.get_SRV_records(host)
return dcs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Host is ultimately from freenas-v1.db here:
sqlite> SELECT sm_host FROM services_servicemonitor ;
sm_host = THEGIBSON.LAN
thread = ServiceMonitorThread(
id=s['id'], frequency=s['sm_frequency'], retry=s['sm_retry'],
host=s['sm_host'], port=s['sm_port'], name=thread_name,
logger=self.logger, middleware=self.middleware
)
It is the domain.
for i in range(0, max_tries): | ||
# XXX What about UDP? | ||
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | ||
s.settimeout(timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def port_is_listening(host, port, errors=[]):
looks very similar to the given code, except it doesn't set any timeout for the socket connection. It may be reasonable to add:
def port_is_listening(host, port, errors=[], timeout=0):
....
if timeout:
s.settimeout(timeout)
in the freenasldap.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked about this Andrew, timur is correct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Made changes.
connected = True | ||
for i in range(0, max_tries): | ||
# Make max_tries attempts to get SRV records from DNS | ||
host_list = fnldap.get_ldap_servers(host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simply call:
host_list = FreeNAS_ActiveDirectory_Base.get_ldap_servers(host)
as it's a static method of the class. So no need for creating fnldap
object at all, neither for passing it around.
Found in the |
Incorporate suggestions form Timur to clean up usage of freenasldap.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like my bike sheds painted hot pink
@@ -639,6 +635,14 @@ def _started_notify(self, verb, what): | |||
res = True | |||
return res | |||
|
|||
async def _reload_activedirectory(self, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about naming this "_restart_activedirectory" since that is what it is actually doing ;-) If you are going to make a reload method, then have to reload samba_server ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a _restart_active directory method that calls "/etc/directoryservice/ActiveDirectory/ctl restart". I don't see any reason to change the existing restart method because we want to start from a clean slate to ensure proper AD join.
What I would ideally prefer here is regenerating smb4.conf (if needed) and killing / restarting winbind process as an "AD reload". I.e. _reload_cifs, then restart winbind. This will force new DC connection, but not kill existing SMB sessions.
It should be something less than a full "AD restart", but more than just reloading a config file.
@@ -17,7 +16,10 @@ | |||
if not apps.ready: | |||
django.setup() | |||
|
|||
from freenasUI.common.freenassysctl import freenas_sysctl as _fs | |||
from freenasUI.common.freenasldap import ( | |||
FreeNAS_ActiveDirectory_Base, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just import FreeNAS_ActiveDirectory here, there is no need to use the base class
for i in range(0, max_tries): | ||
# XXX What about UDP? | ||
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) | ||
s.settimeout(timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked about this Andrew, timur is correct
connected = True | ||
for i in range(0, max_tries): | ||
# Make max_tries attempts to get SRV records from DNS | ||
host_list = FreeNAS_ActiveDirectory_Base.get_ldap_servers(host) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/FreeNAS_ActiveDirectory_Base/FreeNAS_ActiveDirectory/
s.settimeout(None) | ||
s.close() | ||
for h in host_list: | ||
port_is_listening = FreeNAS_ActiveDirectory_Base.port_is_listening(str(h.target), h.port, errors=[]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/FreeNAS_ActiveDirectory_Base/FreeNAS_ActiveDirectory/
midclt call servicemonitor.stop resulted in self.finished.set() being called, which effectively removed the frequency control for service monitoring. As long as the service remained "connected", "started", and "enabled", then service monitoring would never stop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to go
Ticket #33453
There are a several distinct situations where our AD monitoring will either turn off the Active Directory service or restart the Active Directory service when it is not required.
The improvements are as follows:
Simplify checks to see if the AD service is properly running, by reducing it to a single test "wbinfo -t". This reduces the liklihood of a false-positive that we are no longer joined to the domain.