files: split update into batches #5552
Conversation
Hi, I didn't include the talloc-pool changes in this PR because currently I cannot reproduce the original performance improvements I have seen; I will investigate this further. I think #5091 is fixed as a side effect of this patch, but so far I haven't found a proper ticket for this PR. @alexey-tikhonov, do you have a suggestion, or shall I create a new ticket for the batch processing? As you can see, some delays and counts are hardcoded which could easily be made configurable, but so far I think there is no need for configuration options. bye,
So far when we talked about "batches" I thought we meant accumulating a few consecutive update events and processing them in a single "batch" (once) - to avoid an issue in case a user populates a very large /etc/passwd with a script. But this PR aims to fix another valid issue: a single operation during a single update taking > 3 * WD.
Hi, I'm sorry for not being more specific here. I meant the initial delay in the patch, which can be used to wait some time for more changes to the files before doing the real update.
Would you mind opening a suitable ticket and linking it to the related downstream tickets? Thanks. bye,
Hi,

Force-pushed from 31092b2 to f238da3 (compare).
Hi, the latest version only updates the commit message. bye,
DELETE_GROUPS,
READ_GROUPS,
SAVE_GROUPS,
UPDATE_FINISH,
UPDATE_FINISH and UPDATE_DONE sound very similar to me. Maybe UPDATE_COMPLETED instead of UPDATE_FINISH, or PROCESSING_COMPLETED?
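For orientation, the progression through these steps can be sketched as a simple enum walk. This is a sketch only, with a hypothetical `next_step` helper: the names are taken from the diff hunk and the debug logs quoted later in this thread, and in the PR a step can run several times (once per batch) before advancing.

```c
/* Sketch only, not the PR's code: step names come from the diff hunk and
 * the quoted debug logs; the real enum likely contains more steps, and a
 * step may repeat once per batch before the state machine advances. */
enum update_step {
    READ_USERS,
    DELETE_GROUPS,
    READ_GROUPS,
    SAVE_GROUPS,
    UPDATE_FINISH
};

/* Advance to the next batch step; UPDATE_FINISH is terminal. */
static enum update_step next_step(enum update_step s)
{
    return (s == UPDATE_FINISH) ? UPDATE_FINISH : (enum update_step)(s + 1);
}
```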
state->batch_size = 10;
state->obj_idx = 0;
state->file_idx = 0;
state->initial_delay = 100;
I think some meaningful define would be better here. Or at least add information about the unit of the value (us / ms).
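One possible shape for such defines, with hypothetical names; the values are taken from the hunk above, and the unit is microseconds on the assumption that the delay is passed as the usecs argument of `tevent_timeval_current_ofs(0, delay)` as in the hunk quoted later.

```c
/* Hypothetical names; the PR hardcodes these values inline.  The delay
 * unit is microseconds, since the value is used as the usecs argument of
 * tevent_timeval_current_ofs(0, delay). */
#define SF_UPDATE_BATCH_SIZE        10
#define SF_UPDATE_INITIAL_DELAY_US 100
```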
DEBUG(SSSDBG_CRIT_FAILURE,
      "Failed to add session recording attribute, ignored.\n");
}
ret = sysdb_transaction_commit(id_ctx->domain->sysdb);
Is it safe to stay in the transaction state for that long? The total time will be the same as before plus the time spent waiting for the tevent timers to trigger the function.
}

tv = tevent_timeval_current_ofs(0, state->initial_delay);
state->te = tevent_add_timer(id_ctx->be->ev, state, tv,
This is triggered by a signal that the file has been modified. What about the scenario where the file is modified multiple times in a very short time span (t << state->initial_delay)? Won't we create multiple tevent chains running in parallel?
I ran a small test:
- Changed the delay to 100ms from 100us
- Added some debug info every time sf_enum_files_first_step() is called
- Assigned each batch run a random number to track it in the logs
- Called
touch /etc/passwd & touch /etc/passwd & touch /etc/passwd & touch /etc/passwd &
The result was the creation of 3 almost parallel tracks of the batching procedure:
./sssd_files.log:3558:(2021-04-01 16:30:17): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_USERS, track: 10
./sssd_files.log:3559:(2021-04-01 16:30:17): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_USERS, track: 5
./sssd_files.log:3560:(2021-04-01 16:30:17): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_USERS, track: 76
./sssd_files.log:3561:(2021-04-01 16:30:17): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: DELETE_GROUPS, track: 10
./sssd_files.log:3562:(2021-04-01 16:30:17): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: DELETE_GROUPS, track: 5
./sssd_files.log:3563:(2021-04-01 16:30:17): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: DELETE_GROUPS, track: 76
./sssd_files.log:3564:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_GROUPS, track: 10
./sssd_files.log:3620:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_GROUPS, track: 5
./sssd_files.log:3676:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_GROUPS, track: 76
./sssd_files.log:3732:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 10
./sssd_files.log:3809:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 5
./sssd_files.log:3832:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 76
./sssd_files.log:3855:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 10
./sssd_files.log:3939:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 5
./sssd_files.log:3963:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 76
./sssd_files.log:3987:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 10
./sssd_files.log:4071:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 5
./sssd_files.log:4095:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 76
./sssd_files.log:4119:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 10
./sssd_files.log:4203:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 5
./sssd_files.log:4227:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 76
./sssd_files.log:4251:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 10
./sssd_files.log:4335:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 5
./sssd_files.log:4359:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 76
./sssd_files.log:4383:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 10
./sssd_files.log:4428:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 5
./sssd_files.log:4443:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: SAVE_GROUPS, track: 76
./sssd_files.log:4458:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_GROUPS, track: 10
./sssd_files.log:4459:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_GROUPS, track: 5
./sssd_files.log:4460:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: READ_GROUPS, track: 76
./sssd_files.log:4461:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: UPDATE_FINISH, track: 10
./sssd_files.log:4489:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: UPDATE_FINISH, track: 5
./sssd_files.log:4517:(2021-04-01 16:30:18): [be[files]] [sf_enum_files_first_step] (0x0200): BATCH step: UPDATE_FINISH, track: 76
With the default 100us timeout there is only a small chance this will happen from the command line or bash scripts, but it is still possible if someone sends inotify events fast enough some other way.
I think every new batching routine may want to check whether an old routine is still running and use this knowledge somehow (cancelling the old routine since the new one will overwrite its results, building on top of the old routine by merging results, etc.).
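The single-instance check suggested here could be sketched as a small decision function. All names are hypothetical, not the PR's code; it only illustrates the idea of coalescing rapid notifications into one batching chain.

```c
/* Hypothetical sketch: coalesce rapid file-change notifications so that
 * only one batching chain exists at a time. */
enum refresh_state { REFRESH_IDLE, REFRESH_WAITING, REFRESH_RUNNING };

enum refresh_action {
    ACTION_START_TIMER,      /* arm the initial-delay timer */
    ACTION_RESCHEDULE_TIMER, /* push the pending timer back; merge events */
    ACTION_MARK_RESTART      /* let the running chain finish, then rerun */
};

/* Decide what to do when a new modification notification arrives. */
static enum refresh_action on_file_modified(enum refresh_state state)
{
    switch (state) {
    case REFRESH_WAITING:
        return ACTION_RESCHEDULE_TIMER;
    case REFRESH_RUNNING:
        return ACTION_MARK_RESTART;
    case REFRESH_IDLE:
    default:
        return ACTION_START_TIMER;
    }
}
```

With such a scheme, the four near-simultaneous `touch` calls from the test above would arm one timer and merely reschedule it three times, instead of spawning three parallel tracks.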
What is the reason for these patches? To be able to deliver the domain-inconsistent message? If that's the case, you can send the message via the monitor connection.
IIUC, to avoid a single large blocking operation, see #5557
What do you mean?
IMHO in this case it would be better to do a diff update to improve performance first. It would also solve #4614. Of course it wouldn't solve the initial cache population, so some batching is still needed, but it will affect the design a lot, therefore it might be better to start with it.
The reason the message is not delivered is that we send it over
I did some testing comparing this PR with vanilla SSSD, with extra logging added. The test environment was an sssd-test-suite VM (1 CPU core, 2G RAM) running Fedora 31. TL;DR: estimated time needed to mark the domain as inconsistent:
Time consumed by calling operations on
Conclusion: While this PR shortens the time needed to mark a domain as inconsistent, it is still affected by the size of the processed files. To speed up marking the domain as inconsistent there are three options:
Why isn't this change constant? 9m -> 20m is worrying.
To be precise, it was 30 + 100 users and 30 + 500 users. I was using the default 30 /etc/passwd entries as a sort of zero reference.
Please take a look at this comment: #5552 (comment)
I modified this PR by adding an initial zero step which only starts the database transaction and returns to the tevent loop. Please take a look at the updated test results (previous test results included). Estimated time needed to mark the domain as inconsistent:
Time consumed by calling operations on
Conclusion: With the dummy zero step this PR will be able to solve #5091 by marking the domain as inconsistent fast enough.
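The zero-step idea reduces to: the first timer callback does almost nothing besides opening the transaction, so control returns to the event loop (where pending SBUS messages can be dispatched) before the heavy enumeration starts. A minimal sketch follows, with hypothetical names and the transaction and work simulated by flags; the real code would re-arm a tevent timer to get from one step to the next.

```c
/* Hypothetical sketch of the "zero step": the first callback only opens
 * the transaction (simulated by tx_open) and yields back to the event
 * loop; the expensive work starts on the next callback. */
enum step { STEP_ZERO, STEP_WORK, STEP_DONE };

static enum step run_step(enum step s, int *tx_open, int *work_done)
{
    switch (s) {
    case STEP_ZERO:
        *tx_open = 1;      /* start the sysdb transaction, nothing else */
        return STEP_WORK;  /* re-arm timer; event loop runs in between */
    case STEP_WORK:
        *work_done = 1;    /* the expensive enumeration happens here */
        return STEP_DONE;
    case STEP_DONE:
        break;
    }
    return STEP_DONE;
}
```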
Hi @pbrezina, this sounds like the proper solution for #5091. There were discussions about dropping the monitor and relying on systemd to start the SSSD components. I do not remember the current state of those discussions, but if we ever want to move that way this would be another task of the monitor which must be replaced. About @elkoniu's suggestion to wait with the processing of the files until the messages are sent: this can easily be combined with @alexey-tikhonov's original understanding of batches, where we should not immediately start processing when notified of a change but wait for e.g. 1s of inactivity on the files before starting. With this we should be in the event loop long enough to properly finish the communication with the NSS responder. But I guess you would still recommend sending the message via the monitor? bye,
Hi @pbrezina, where do you see the difficulties in adding a diff approach? The step that removes the cache has to be replaced by the diff step, and then only the entries which changed will be processed. Even with the diff approach we might need to process the changes in batches if there are many, to not block the backend for too long. bye,
Ideally, we want to spawn a private session of dbus-broker and use only one private message bus instead of running multiple. But so far we can use the monitor for situations like this. About the transition to systemd, I'm all for it. We already rely on systemd in some places anyway, so I think it is time to finally drop the monitor. But let's not discuss it here.
If the processing is delayed then the provider gets to the loop and sends the message, so the monitor is not needed in this case.
Let me ask a question first: if the domain is correctly marked as inconsistent, why do we need to have the files backend responsive?
Can we agree to fix #5091 as part of this PR with this zero-step approach rather than using the monitor message bus?
Hi, I meant "block" or "responsive" with respect to the watchdog. To avoid the backend being killed by the watchdog we should make sure to return to the main loop regularly when we know that processing the data needs more time. bye,
Ok, feel free to continue with this PR.
Hi, thanks for running this kind of test. The fallback to nss was not complete because the cache_req tried to find the related object in the cache. I added a new patch to cache_req to check whether the domain is inconsistent or not. @pbrezina, do you agree with 715b407? bye,
Just ran a test against 5000 users to check marking the domain as inconsistent. This PR can be considered a solution for #5091.
It looks like in the latest version this issue is solved. |
Another observation I have is that with 5000 users |
Two more issues exist so far: [ISSUE 1]
The process will remove 1 user and will hang with [ISSUE 2] - on top of the steps from [ISSUE 1]
userdel running on top of nss (SSSD stopped) still blocks and takes about 30s to remove a single user. It also looks like those two issues are related to each other. The open question is whether they are directly related to this PR, as in [ISSUE 2] SSSD is not running and it is on top of the "raw" files provider.
[ISSUE 1] and [ISSUE 2] described above are present on a master branch build too, so I do not think we are introducing any regression with this PR.
Hi, I think the latest issues you've found are not strictly related to the patches. The reason is that if there are many users and groups in the cache the refresh takes a considerable amount of time. It might be possible to mitigate this by increasing the initial delay before the refresh starts. Another way might be to speed up the refresh by not deleting and adding everything from scratch but with a diff-like approach; but as discussed earlier this should be part of a separate PR. It might also be possible to modify
bye,
To not block callers while SSSD's files provider is doing a refresh of /etc/passwd or /etc/group, allow falling back to the next nss module, which is typically libnss_files. Resolves: SSSD#5557 :config: Add new config option 'fallback_to_nss'
To avoid constant refreshes if /etc/passwd or /etc/group are modified multiple times in a short interval, the refresh is only started after 1s of inactivity. Additionally the request makes sure that only one instance is running. Resolves: SSSD#5557
To make sure current and valid data is used when a certificate should be matched to a user from the files provider, the request has to wait until a running refresh is finished. Resolves: SSSD#5557
If a domain is inconsistent the cached data might be inconsistent as well, so better not to return it. Resolves: SSSD#5557
Hi, the latest version just adds the
bye,
LGTM, a few minor comments.
case REFRESH_NOT_RUNNIG:
    break;
case REFRESH_WAITING_TO_START:
    refresh_ctx->passwd_start_again = true;
For the passwd refresh we have some extra debug messages which are not present for the group refresh.
refresh_ctx->certmap_req_list = NULL;
}

ret = check_state(refresh_ctx, flags);
This check_state in fact evaluates the state of the process.
@@ -34,6 +34,12 @@

 #include "providers/data_provider/dp.h"

 enum refresh_task_status {
I am not sure if this needs to be exposed in the header.
The thing is, IIRC, the real use of sss_cache here is to drop the mem-cache. Invalidation of the sysdb cache (for all domains!) is an undesirable side effect that does only harm - for the "files" provider the expiration time doesn't have any effect anyway.
My questions were addressed. Thank you. Covscan is clean.
Pushed PR: #5552 |
Hi @alexey-tikhonov, I think it is addressed by a part of this PR, commit eae03fb.
If the files managed by the files provider contain many users or groups,
processing them might take a considerable amount of time. To keep the
backend responsive this patch splits the update into multiple steps
running one after the other but returning to the main loop in between.
This avoids issues during startup because the watchdog timer state is
reset properly. Additionally SBUS messages are processed, and as a result
the domain can be marked inconsistent in the frontends properly.
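The splitting described in this commit message can be sketched, independently of tevent, as a resumable state that processes one bounded batch per invocation and reports whether more work remains. All names are hypothetical; the real code re-arms a tevent timer between calls instead of looping, which is what lets the watchdog and SBUS traffic run in between.

```c
#include <stddef.h>

/* Minimal sketch (not SSSD code): process items in fixed-size batches,
 * returning control to the caller between batches so that other events
 * (watchdog resets, bus messages) can be serviced in the meantime. */
struct batch_state {
    const int *items;
    size_t count;
    size_t idx;        /* next item to process */
    size_t batch_size;
    long   sum;        /* stands in for the real per-item work */
};

/* Process one batch; return 1 if more work remains, 0 when done. */
static int process_one_batch(struct batch_state *st)
{
    size_t end = st->idx + st->batch_size;
    if (end > st->count) {
        end = st->count;
    }
    while (st->idx < end) {
        st->sum += st->items[st->idx++];
    }
    return st->idx < st->count;
}
```

In the real provider each return to the caller corresponds to scheduling the next step with tevent_add_timer, so the event loop regains control after every batch.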