
autofs: distinguish between empty cache and offline state #5343

Closed
wants to merge 6 commits

Conversation

@pbrezina (Member) commented Oct 1, 2020

If the cache is empty and SSSD is offline, we now return a proper
error code to autofs so it can implement proper retry logic to
fetch the maps when SSSD comes back online.

This also requires changes in autofs which are not yet available,
so the best way to test it with SSSD is to use the autofs test
tool (autofs_test_client) to verify that it returns the correct
error codes.

Resolves:
#3413

@alexey-tikhonov (Member)

If the cache is empty and SSSD is offline, we now return a proper
error code to autofs so it can implement proper retry logic to
fetch the maps when SSSD comes back online.

This is, of course, out of SSSD's scope, but what happens (currently, and as planned) if nsswitch.conf contains:
automount: db1 db2
and db1 returns EHOSTDOWN? (IIUC, in the case of ENOENT autofs would ask db2.)

In some sense this is a change of the API between autofs and the plugin. Is this change backward compatible with an autofs that doesn't implement proper handling of this error code yet?

@pbrezina (Member, Author)

Even though autofs uses nsswitch.conf for database order, it does not use its logic (the autofs API is not part of glibc). But the code suggests that it's going to fail, so I asked Ian once again in the Bugzilla for reassurance.

@pbrezina force-pushed the autofs branch 2 times, most recently from 412c622 to acf1a0a on November 4, 2020 14:24
@pbrezina (Member, Author) commented Nov 4, 2020

I added _sss_auto_protocol_version to solve the circular dependency between autofs and SSSD; this is yet to be checked with Ian.

@pbrezina (Member, Author) commented Nov 5, 2020

Ian approved these changes. Please continue with the review. The autofs test client got a new -p number parameter to set the requested protocol (0 or 1).

@@ -33,6 +33,32 @@
/* How many entries shall _sss_getautomntent_r retrieve at once */
#define GETAUTOMNTENT_MAX_ENTRIES 512

unsigned int _protocol = 0;
Member:

Hi,

I think _protocol should be:

  1. static, at the very least;
  2. strictly speaking, access to this var should be thread-safe.
     I would prefer C11 atomic_int, but sss_client in general uses libpthread.

Contributor:

Atomics are better in that they do not incur the heavy weight of acquiring a mutex (which is done via atomics anyway, under the hood). So I see no problem in using that.

Member Author:

And how about thread_local, do you see any problem there?

Member:

Imo, this won't hurt, but I'm not sure it makes much sense.

_Thread_local here would cover the case "an application wants to use the sssd autofs lib with a different protocol version from different threads".

Frankly, I can't imagine a sane reason to do so. But even if there is one, and we want to support it, then we should document it somehow/somewhere. Currently, looking at the API itself, one can't tell whether there is a thread-specific context.

Member Author:

There is no functional reason; autofs will always try to use its latest protocol version, so an atomic should be enough for that.
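A minimal sketch of the outcome of this thread, assuming a file-scope C11 atomic for the protocol variable (the signature and semantics of _sss_auto_protocol_version here are assumptions based on the discussion, not the merged code):

```c
#include <stdatomic.h>

/* Assumed sketch: a static C11 atomic holding the negotiated protocol
 * version, so reads/writes are thread-safe without a mutex. */
static atomic_uint _protocol = 0;

/* The caller (autofs) requests a protocol version; we store and echo
 * back the highest version we support (here: 1, per the PR). */
unsigned int _sss_auto_protocol_version(unsigned int requested)
{
    unsigned int granted = (requested >= 1) ? 1 : 0;
    atomic_store(&_protocol, granted);
    return granted;
}
```

Making the variable static keeps it out of the library's exported symbols, and atomic_uint addresses the thread-safety concern without _Thread_local's per-thread semantics.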

@alexey-tikhonov (Member)

Thank you for the updates. I have some concerns regarding the global var. Please see inline.

@alexey-tikhonov (Member)

btw, "pbrezina added 5 commits on Mar 6" - date looks weird...

@pbrezina (Member, Author)

:-) It's fine, I had the first patch set ready in March but it was blocked for a long time by Ian's work.

BTW RHEL 7 and CentOS 7 apparently do not have stdatomic.h (a gcc 4.8 bug, resolved in 4.9), so if we choose to use this we need to stop running prci against rhel7/centos7. I'm fine with that.

@alexey-tikhonov (Member)

BTW RHEL 7 and CentOS 7 apparently do not have stdatomic.h (a gcc 4.8 bug, resolved in 4.9), so if we choose to use this we need to stop running prci against rhel7/centos7. I'm fine with that.

Perhaps we could add a ./configure check for this header to error out gracefully.

So we do not publish an internal error code.

Resolves:
SSSD#3413
If the backend is offline when autofs starts and reads the auto.master map,
we don't want to wait 60 seconds before the offline flag is reset. We
need to allow autofs to retry the call much sooner.

Resolves:
SSSD#3413
@pbrezina (Member, Author)

Added.

@alexey-tikhonov (Member)

Added.

It checks:

checking stdatomic.h usability... no
checking stdatomic.h presence... no
checking for stdatomic.h... no

but it doesn't error out.

I would expect ./configure to stop here, saying the project won't build on this system?
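A configure.ac fragment along these lines would make the missing header fatal (a sketch only; the exact message and placement in SSSD's configure.ac may differ):

```m4
AC_CHECK_HEADERS([stdatomic.h], [],
    [AC_MSG_ERROR([stdatomic.h (C11 atomics) is required; use gcc >= 4.9 or clang])])
```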

Recent autofs patches add a dependency on the atomic_uint/_Atomic type from the C11
standard. This has been supported in both gcc and clang for a long time now.
@alexey-tikhonov (Member)

Thank you, ACK.

@pbrezina (Member, Author) commented Dec 4, 2020

Pushed PR: #5343

  • master
    • 075519b - configure: check for stdatomic.h
    • 8a22d4a - autofs: correlate errors for different protocol versions
    • 34c519a - autofs: disable fast reply
    • 9098108 - autofs: translate ERR_OFFLINE to EHOSTDOWN
    • e50258d - autofs: return ERR_OFFLINE if we fail to get information from backend and cache is empty
    • 3f0ba4c - cache_req: allow cache_req to return ERR_OFFLINE if all dp request failed
