New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Trust Router rekeying functionality #2007
Conversation
* Trimmed trailing whites. * Adopt proper line length. * Add missing spaces after commas.
TLS keys associated to dynamic home servers and realms expire. However, FR does not know anything about this expiration time. Instead, FR tries to use a home server until it fail to establish the connection. Then a new REALM is negotiated and used until the new keys expire. This behaviour has the problem that it requires a failed connection (and hence authentication) to realize REALM's keys are no longer usable. This commit introduces a new filed into the home_server_t struct, called "expiration" that marks the point in time after the REALM should not be used. Ideally this field is established based on the received information from libtr. However, an accessor for this information is missing. While this accessor is implemented, a new configuration option "realm_lifetime" is provided to manually indicate the lifetime of a realm. NOTE: Whereas "home_server_t.expiration" is an absolute time_t value indicating a timestamp, "realm_lifetime" is a relative amount of time to be added to time() at a REALM creation.
* Added rekeyer thread running a specific event list. * Added option to enable/disable rekey functionality. Each time a REALM is successfully queried from the trust router server, a rekey event is scheduled to be executed 120s before the TLS keys associated to that REALM expire (based on the home_server_t.expiration filed). A rekey event consists on executing a new TIDC query using the same parameters as for the original creation of the dynamic REALM, obtaining a fresh set of home servers with new TLS keys. If the rekey process succeeds, a new rekey event is scheduled as described above. If a rekey event fails, it tries again after 10 seconds. If it fails 5 times in a row, it stops trying to rekey the realm.
TIDC calls need to be serialized to prevent failures. This commits introduces a new mutex for that purpose, and refactors tr_query_realm() to make use of the new helper simplifying the code.
5cf6a16
to
8038df8
Compare
Hi guys, is there anything I can do to help with the review/acceptance process of this PR? I decided to split it into 4 focused commits, but I could split it even more if required. |
It mostly looks good. I'll have to pull the changes over manually, as all of the trust router things should be in a config sub-section. I'll likely do that Monday. |
@alejandro-perez you guys should also start thinking about how you want this implemented in v4.0.x. rlm_realm has been removed completely, so we need to determine what you need exactly for trust router, and whether it can be implemented in policies instead of modules. |
@alandekok Thanks. Let me know if I can help somehow. On the IDP side there is a dedicated daemon (tids) which inserts the negotiated PSK keys into a sqlite database, which is then queried by a FR policy to get the proper key. A similar approach could be followed on the client. I can create a dedicated Issue to discuss that, in order to avoid mixing things into this PR and to call the rest of the TR community into the conversation. |
I did raise this over a year ago (at moonshot meetings) when initial notes of rlm_realm going in v4 were raised. the more interesting part would be of new policy requirements to ensure that IF its not a moonshot capable construct, then do NOT use it and instead use the normal RADIUS path :) |
@alejandro-perez yes, a separate issue would be good. Dynamic realms are definitely on the roadmap, and will be implemented by the new rlm_radius module. |
I'm OK with these patches going into v3, as they're bug / stability fixes. For v4, the framework should allow for dynamic home servers. At which point the trust router could likely be a module all to itself. It would just have to return IP / port / cert information to the server core, which would then track the connection independently of anything else. And all of the lifetime / cache / renew management for dynamic home servers has to be done anyways. So removing that from the trust router code is a good thing. |
Sqlite3 does appear to deal ok with concurrent accesses from different applications, with the caveat that queries might return 'busy' and need to be retried. For remote access a REST interface would definitely be preferable to a dedicated C module just query the tids. |
Indeed, and given that you mentioned that renew management functionality will be in the core of v4, they can be seen as a backport in some sense (since v3 does not have that).
I guess PSK would still be possible, wouldn't it?
Yes. |
Please, use #2019 for discussion about how to deal with v4.0 and keep this thread to discuss about this specific PR. |
There's only one new config item, and I don't think moving the config items to a subsection is a good idea for v3.0.x, so i'm going to merge this. |
Hi,
This set of focused patches provide rekeying capabilities for the Trust Router client in rlm_realm.
This rekeying functionality basically consists of a dedicated thread which negotiates, in the background, new TLS keys for existing dynamic REALMs before they expire. The achieved aim is to prevent the performance issues that appear when this re-negotiation happens in the foreground as a consequence of a end user authentication, which can then take up to 10 seconds (due to the TR protocol).
Performing this task in the background with the enough margin of time ensures that new keys are used before expiration happens.