WIP - toggle between refreshable and one-time instantiation of pac4j clients #5315
Thank you very much for submitting this pull request! The patch presented here seems to contain no adequate unit and/or integration tests. Patches that alter CAS core Please re-open the pull request (or ask project maintainers to do this for you) when you're ready and have included tests to verify Furthermore,
If you believe this message to be an error, please post your explanation here as a comment and it will be reviewed. Thanks again! |
@mmoayyed while the implementation is debatable, I think this PR raises a good question about whether the pac4j clients should be "refreshable" or not, and the related drawbacks. |
Thank you for the patch and the notes. What you describe is quite true, but the implementation here is not something we can work with, especially since it contains no tests or any way to verify the broken behavior or the fix. So I'll go ahead and close this for now, and we'll see if we can provide a remedy in future RCs for 6.5. Thank you! |
Maybe the solution could rather be provided at the pac4j level. Just to be sure, what is the reason we have the |
@leleuj This is, to me, an important question - is there any real deploy-time use case for |
@mmoayyed will tell us the use case needed by |
Since pac4j v5, the What do you think? |
Or even an My feeling is that if there is any issue, it should be solved by pac4j itself. |
The use case for refreshable clients is exactly as the name suggests. We want to outsource the construction of identity providers to external systems, APIs that provide administration and management of clients in a way that is completely detached from CAS, allowing each client construction to change at will, and be managed entirely separately from the CAS configuration, without necessarily having to deal with the complexity and/or overhead of the Spring Cloud Config server and family. Suffice it to say, if something is part of CAS, there is a valid, real, deploy-time use case for it to justify its existence with support and maintainer ability to take care of it. It also means somewhere someone is paying, one way or another, for us maintainers to support and back it. If you don't have the simplicity or complexity to manage this type of setup, worry not. It's not for you. We work on 10s, sometimes 100s of deployments throughout the year; not everything found in the codebase is right for everyone, and that's why the menu is large to choose from. However, this particular behavior should not be the default, and certainly should be controllable via configuration, and that is the case today in the current master with the most recent changes. So, there is nothing more to do here. Furthermore, this is not something that should go into pac4j per se. One of the side benefits of this change is the ability to push pac4j down the stack and not expose it directly to client code and integrations and customizations. A |
I would encourage you to test and verify the changes in master today, and report feedback. When you do, please make sure you have tests that reproduce an issue, or verify a fix with tests. Fixes or feedback without tests take time to verify, and that is free time we don't have to spare. A non-tested, non-testable fix is a fix that does not exist. I will also be doing a sweep some time later this week or the next to see if some things can be backported to 6.4.x. |
OK. Thanks for the thorough explanations. I think I get the point. Taking a look at master, I'm a bit worried by the source code:
@Override
public Collection<IndirectClient> build() {
    if (this.clients.isEmpty() || !casProperties.getAuthn().getPac4j().getCore().isLazyInit()) {
        this.clients.clear();
        configureCasClient(clients);
        ...
        configureHiOrgServerClient(clients);
    }
    return clients;
}
We clear the list of clients before adding the clients again. Under heavy loads and many requests, as the |
Good point! I think we have pac4j-level puppeteer tests that cover this. Can you see if you can duplicate this as an issue with a test? |
...or perhaps:
? |
I don't think a puppeteer test is appropriate here; it's really about heavy load and multiple requests at the same time. The scenario would be: a first request comes in to trigger an authn delegation for the SAML2Client, and the list is rebuilt. At the same time, a second request comes in to trigger an authn delegation for the CasClient, and the list is rebuilt again. I think the better option would be to build a new list and only assign it once it's ready:
@Override
public Collection<IndirectClient> build() {
    if (this.clients.isEmpty() || !casProperties.getAuthn().getPac4j().getCore().isLazyInit()) {
        // Populate a fresh list so concurrent readers never see a half-built one.
        val newClients = new ArrayList<IndirectClient>();
        configureCasClient(newClients);
        ...
        configureHiOrgServerClient(newClients);
        this.clients = newClients;
    }
    return clients;
}
I will make a test and let you know. |
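The build-then-swap idea proposed above can be sketched in isolation. This is a minimal illustration, not the CAS implementation: `SwappingClientFactory` is a hypothetical class, client names are plain strings standing in for `IndirectClient` instances, and the `volatile` field is an added assumption so that the reference assignment is visible to other threads.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the factory discussed in the thread.
class SwappingClientFactory {
    // volatile: a reader sees either the previous list or the new, fully built one.
    private volatile List<String> clients = List.of();
    private final boolean lazyInit;

    SwappingClientFactory(final boolean lazyInit) {
        this.lazyInit = lazyInit;
    }

    Collection<String> build() {
        List<String> current = clients;
        if (current.isEmpty() || !lazyInit) {
            // Build into a private list that no other thread can observe yet.
            final List<String> newClients = new ArrayList<>();
            newClients.add("CasClient");
            newClients.add("SAML2Client");
            // Publish with one reference assignment; the old list is never cleared.
            current = List.copyOf(newClients);
            clients = current;
        }
        return current;
    }
}
```

Two concurrent calls may still both rebuild, but each returns a complete list, which is the property the proposal relies on.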
It's exactly the sort of test puppeteer can help with; we have many other tests that do very similar things for ticket registries, especially JPA. But any test is a good test as long as it reproduces an issue. |
Indeed, the JPA test does that. I didn't know that. I will see what I can do. |
OK. I spent quite some time on this. First of all, I followed your advice and tried to create a puppeteer test which fails. Here is the PR: #5318 Then, I took another approach based on the following demo: https://github.com/casinthecloud/cas-pac4j-oauth-demo and changed the CAS source code to create a fake delay of 10 seconds in the
@Override
public Collection<IndirectClient> build() {
    this.clients.clear();
    try {
        Thread.sleep(10000);
    } catch (final InterruptedException e) {
        LOGGER.error("Pause interrupted", e);
    }
    configureCasClient(clients);
    configureFacebookClient(clients);
    configureOidcClient(clients);
    configureOAuth20Client(clients);
    configureSamlClient(clients);
    configureTwitterClient(clients);
    configureDropBoxClient(clients);
    configureFoursquareClient(clients);
    configureGitHubClient(clients);
    configureGoogleClient(clients);
    configureWindowsLiveClient(clients);
    configureYahooClient(clients);
    configureLinkedInClient(clients);
    configurePayPalClient(clients);
    configureWordPressClient(clients);
    configureBitBucketClient(clients);
    configureHiOrgServerClient(clients);
    return clients;
}
I open the login page in two Chrome tabs. In the first tab, I click "CAS"; in the second tab, I click "SAML2CLIENT". The second tab returns an error: "Authorization Denied". If I change the source code to the fix I proposed, the error is gone for the same test (two tabs, one for CAS, the other for SAML):
@Override
public Collection<IndirectClient> build() {
    val newClients = new HashSet<IndirectClient>();
    try {
        Thread.sleep(10000);
    } catch (final InterruptedException e) {
        LOGGER.error("Pause interrupted", e);
    }
    configureCasClient(newClients);
    configureFacebookClient(newClients);
    configureOidcClient(newClients);
    configureOAuth20Client(newClients);
    configureSamlClient(newClients);
    configureTwitterClient(newClients);
    configureDropBoxClient(newClients);
    configureFoursquareClient(newClients);
    configureGitHubClient(newClients);
    configureGoogleClient(newClients);
    configureWindowsLiveClient(newClients);
    configureYahooClient(newClients);
    configureLinkedInClient(newClients);
    configurePayPalClient(newClients);
    configureWordPressClient(newClients);
    configureBitBucketClient(newClients);
    configureHiOrgServerClient(newClients);
    this.clients = newClients;
    return newClients;
}
Does that make sense?
|
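The two-tab experiment above can be approximated without a browser. The following is a rough sketch under stated assumptions: `RebuildRaceDemo` is a hypothetical class, client names are plain strings, and a pair of latches replaces the 10-second sleep so the rebuild is paused at a known point while a second "request" looks up SAML2Client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo: pause a rebuild halfway and check what a concurrent reader sees.
class RebuildRaceDemo {
    private volatile List<String> clients =
        new CopyOnWriteArrayList<>(List.of("CasClient", "SAML2Client"));

    // swap == false: clear and refill the shared list in place.
    // swap == true: fill a private list and publish it only when complete.
    boolean readerFindsSamlMidRebuild(final boolean swap) {
        final CountDownLatch halfway = new CountDownLatch(1);
        final CountDownLatch resume = new CountDownLatch(1);
        final Thread rebuilder = new Thread(() -> {
            final List<String> target = swap ? new ArrayList<String>() : clients;
            if (!swap) {
                target.clear(); // the shared list is empty from here until refilled
            }
            halfway.countDown();
            try {
                resume.await(); // stands in for the slow client construction
            } catch (final InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            target.add("CasClient");
            target.add("SAML2Client");
            if (swap) {
                clients = new CopyOnWriteArrayList<>(target);
            }
        });
        rebuilder.start();
        try {
            halfway.await();
            // The second request arrives while the rebuild is still in progress.
            final boolean found = clients.contains("SAML2Client");
            resume.countDown();
            rebuilder.join();
            return found;
        } catch (final InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Demo interrupted", e);
        }
    }
}
```

Calling `new RebuildRaceDemo().readerFindsSamlMidRebuild(false)` returns `false` (the reader hits the emptied list, matching the "Authorization Denied" symptom), while passing `true` returns `true` because the reader still sees the previous list.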
It does, yes. Does the issue go away if you switch to a synchronized set, or simply add @Synchronized to the build method? |
What I am saying is that, while I understand the issue, I think you'd run into the same problem with your proposed fix. It's a matter of when, no? I think a more solid fix perhaps is to sync all ops that deal with the |
I haven't made the test yet, but I think @synchronized would solve the issue but not a synchronized set. You're right, we could sync the whole ops that deal with the With my proposed fix, despite multiple |
I would be OK with that, sure. Do we need to worry about other methods/ops in that component that deal with the |
I guess |
Thank you. I'll take a look. |
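The alternative raised in this exchange, synchronizing every operation that touches the client collection, might look roughly like the sketch below. `SynchronizedClientRegistry` and `findClient` are hypothetical names, and a plain private lock object is used here; Lombok's `@Synchronized` takes a similar approach by locking on a private field rather than on `this`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical registry: every read and write of the list goes through one lock.
class SynchronizedClientRegistry {
    private final Object lock = new Object();
    private final List<String> clients = new ArrayList<>();

    List<String> rebuild() {
        synchronized (lock) {
            clients.clear();
            clients.add("CasClient");
            clients.add("SAML2Client");
            // Hand out a snapshot so callers never iterate the guarded list.
            return List.copyOf(clients);
        }
    }

    Optional<String> findClient(final String name) {
        synchronized (lock) {
            // Cannot run between clear() and the last add() of a rebuild.
            return clients.stream().filter(name::equals).findFirst();
        }
    }
}
```

The trade-off is that lookups block for the full duration of a slow rebuild, whereas the build-then-swap variant lets readers continue with the previous list.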
A change was introduced in aa5c1c6 to allow runtime refreshing of pac4j clients. Unfortunately, this is extremely costly, particularly when running several delegated clients: with a CAS client, an OIDC client, and (particularly slow) two SAML clients, we are experiencing hot load times of over a second for /cas/login, as every one of those clients is built from scratch for each login flow.
The other effect of this behavior is to render the lazy-init flag essentially useless, as the clients aren't being persisted and are as such effectively always lazy initialized.
This patch more or less restores the code as it existed previously, using lazy-init as a toggle between using RefreshableDelegatedClients vs a static list. Though this is not included in the PR, setting the default value for lazy-init to true would almost certainly reflect the behavior most end users hope for: losing a second or two during startup is nothing compared to default-enabled, persistent performance degradation across all requests.
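As a rough illustration of the toggle described here, the sketch below caches the client list when the flag is on and rebuilds it on every call when it is off. The direction of the flag follows the `isEmpty() || !isLazyInit()` condition quoted from master earlier in the thread, which is an assumption about how the PR maps it; `ToggleableClientSource` and its `Supplier`-based builder are hypothetical and not part of CAS or pac4j.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical toggle between a build-once list and a refreshable one.
class ToggleableClientSource {
    private final boolean lazyInit;
    private final Supplier<List<String>> builder; // stands in for the costly client construction
    private volatile List<String> cached;

    ToggleableClientSource(final boolean lazyInit, final Supplier<List<String>> builder) {
        this.lazyInit = lazyInit;
        this.builder = builder;
    }

    List<String> getClients() {
        if (!lazyInit) {
            // Refreshable mode: pay the construction cost on every call.
            return List.copyOf(builder.get());
        }
        List<String> current = cached;
        if (current == null) {
            synchronized (this) {
                current = cached;
                if (current == null) {
                    // Static mode: build on first use, then reuse for later requests.
                    current = List.copyOf(builder.get());
                    cached = current;
                }
            }
        }
        return current;
    }
}
```

With the flag on, the expensive builder runs once no matter how many login flows ask for the clients; with it off, every call triggers a rebuild, which is the per-request cost the PR description reports.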