How to include the Trust Router functionality in v4.0 #2019

Open
alejandro-perez opened this Issue Jul 14, 2017 · 9 comments

Contributor

alejandro-perez commented Jul 14, 2017

Feature description

This thread is intended to discuss how the existing Trust Router client functionality can be implemented in v4.0, where the rlm_realm module is gone.

Initial ideas/thoughts can be found in #2007:

Introduction (in a really small nutshell)

The trust router is a service that allows establishing shared secrets between AAA entities based on the shared trust they have in an entity called the "trust router server". In practice (i.e. Moonshot), it is used to dynamically negotiate TLS-PSKs between FreeRADIUS endpoints that had no knowledge of each other prior to that negotiation.
In FR v3.0:

  1. On the client side, when the rlm_realm module detects a request destined for an unknown realm, it uses the trust router libraries to query the "trust router server" and get the details for the new realm (that is, IP, port, keying material, expiration...). The realm is then added to the realm btree. In addition, because these security associations expire, a rekeying feature has recently been added to actively refresh them before they become unusable.
  2. On the server side, there is a separate daemon (tids) which stores the negotiated keying material in an sqlite DB. We then use FR's ability to set TLS-PSK from a SQL DB to make this work in a very straightforward way.

Problem

In FR v4.0, rlm_realm has been removed, and the framework should allow for dynamic home servers.
This raises several questions/challenges (there may be more).

  1. How can we make FR query the TR, and use the results, when a realm is unknown? If realm handling is done in the core server, do you envisage some sort of callback or anything similar, such as "resolver" modules? Or is this something that can be done in the "pre-proxy" section with a regular module?

  2. How do we handle rekeying? The process of establishing trust router security associations can take a while (several seconds), especially under certain circumstances. That's why it is a good idea to perform a rekeying process before TLS keys expire. Otherwise, the first end user trying to use an expired association will have to wait until it is refreshed. Now, this implies some sort of "proactivity" on the client, which typically works in a "reactive" way (that's why I added the dedicated thread to the rlm_realm module).

  3. On the server side, can everything be kept the same as it is now for v3, or does it need to change?

adam-bishop Jul 14, 2017

How can we make FR query the TR, and use the results, when a realm is unknown?

Sounds like some kind of defined API is needed, where FreeRADIUS cycles through a list of "realm providers" asking them if they know a realm.
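
A minimal sketch of that "list of realm providers" idea, in Python for brevity. Everything here is hypothetical (the `HomeServer` fields, `static_provider`, `resolve_realm`); it is not a FreeRADIUS API, only an illustration of asking each provider in turn until one knows the realm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence


@dataclass(frozen=True)
class HomeServer:
    """Details a provider returns for a realm (fields are illustrative)."""
    realm: str
    host: str
    port: int


# A provider takes a realm name and returns a HomeServer, or None if unknown.
RealmProvider = Callable[[str], Optional[HomeServer]]


def static_provider(table: Dict[str, HomeServer]) -> RealmProvider:
    """Provider backed by a fixed table, standing in for any real data source."""
    def lookup(realm: str) -> Optional[HomeServer]:
        return table.get(realm)
    return lookup


def resolve_realm(realm: str, providers: Sequence[RealmProvider]) -> Optional[HomeServer]:
    """Ask each provider in order; the first one that knows the realm wins."""
    for provider in providers:
        server = provider(realm)
        if server is not None:
            return server
    return None


# Hypothetical usage: a local table first, then a stand-in for the trust router.
local = static_provider({"example.org": HomeServer("example.org", "192.0.2.10", 2083)})
trust_router = static_provider({"example.net": HomeServer("example.net", "192.0.2.20", 2083)})
found = resolve_realm("example.net", [local, trust_router])
```

The ordering of the provider list is the policy: cheap local sources go first, slow negotiated ones last.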

How do we handle rekeying?

I think DNS has a good model for this. Realms could be returned with Expire/Minimum/Refresh/Retry values. I seem to recall v4 has some kind of event queue? Could this be used to fire an asynchronous realm lookup?
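
To make the DNS analogy concrete, here is a rough sketch of how SOA-style timers could drive renewal attempts. The `RealmTimers` type and `renewal_attempts` function are made up for illustration, the values are seconds after the realm was learned, and the SOA "Minimum" field is left out because it does not affect renewal timing.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RealmTimers:
    """DNS SOA-style timers, in seconds after the realm was learned."""
    refresh: int  # when to first try to renew the realm
    retry: int    # delay between renewal attempts after a failure
    expire: int   # when the realm must stop being used


def renewal_attempts(timers: RealmTimers) -> List[int]:
    """Offsets at which renewal would be attempted if every attempt failed."""
    if timers.retry <= 0:
        raise ValueError("retry must be positive")
    attempts = []
    t = timers.refresh
    # Keep retrying until the hard expiry; no attempt is made at or after it.
    while t < timers.expire:
        attempts.append(t)
        t += timers.retry
    return attempts


# Hypothetical values: first renewal at 50 minutes, retries every 200 s, expiry at 1 hour.
schedule = renewal_attempts(RealmTimers(refresh=3000, retry=200, expire=3600))
```

With these example values the realm gets three renewal attempts before it expires, so a slow negotiation has some margin.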

On the server side, can everything be kept the same as it is now for v3, or does it need to change?

I hope so!

A lot of the issues we have would also be problems for radsec dynamic discovery (and more interesting things - realm lookups via http/sql?).

alandekok Jul 14, 2017

Member

Sounds like some kind of defined API is needed, where FreeRADIUS cycles through a list of "realm providers" asking them if they know a realm.

That's a module. :) And configurable in unlang.

When a home server is added, it can have its own expire / refresh / etc. values. That should all be in the server core.

A lot of the issues we have would also be problems for radsec dynamic discovery (and more interesting things - realm lookups via http/sql?).

Yes. The goal is to allow dynamic realms / home-servers from any source. That will be much more flexible than v3.

alejandro-perez Jul 14, 2017

Contributor

That's a module. :) And configurable in unlang.

I guess this is a new type of module? Because typical modules do not have an interface to query realms, do they?

When a home server is added, it can have its own expire / refresh / etc. values. That should all be in the server core.

Yes, but every "realm" source may have different ways of handling expiration/refresh events. Will their "callbacks" be called upon these events? That'd be superb.

Yes. The goal is to allow dynamic realms / home-servers from any source. That will be much more flexible than v3

That's really great. A lot of flexibility will come from here. Indeed, I see two major options:

  1. Have a full-fledged module, which does everything within FR.
  2. Or have an external daemon (as we already do for the server side), which provides a REST/SQL/socket interface to the module. The module in FR would then be really thin, and would act like a proxy.

Benefits of 2) are decoupling and the ability to use different languages (or even machines).
Disadvantages of 2) include worse performance, depending on how many remote calls are needed and how often, and having to run a second daemon, which I don't usually like if avoidable.

alandekok Jul 14, 2017

Member

I guess this is a new type of module? Because typical modules do not have an interface to query realms, do they?

The point is that in v3 and earlier, realms were magic. They were welded into the server core and rlm_realm. In v4, there's no magic. There is no "proxying" in the server core. Instead, there's a RADIUS client module which queries home servers, just like the SQL module queries SQL databases.

At that point, dynamic home servers just become dynamic updates to a module. Realms lose all magic, and just become module configuration.

Yes, but every "realm" source may have different ways of handling expiration/refresh events. Will their "callbacks" be called upon these events?

Probably not. The goal is to allow create / renew / expire with configurable timers. The RADIUS client module then just either expires the realm / home server, or gets told to keep it alive.

The point is to separate data sources from how that data is used. The more tightly they are tied together, the harder it is to change anything.

Perhaps you could explain what callbacks you need when a home server expires.

A lot of flexibility will come from here. Indeed, I see two major options:

Both of those are the same option:

  1. something somehow gets realm information, and sends it to the RADIUS client module. The communication between the two is attributes, just like v3 with dynamic clients.

You're free to write a custom module to query your own DB, or you can use unlang to query SQL / REST / whatever, and get attributes that way.

alejandro-perez Jul 15, 2017

Contributor

The point is that in v3 and earlier, realms were magic. They were welded into the server core and rlm_realm. In v4, there's no magic. There is no "proxying" in the server core. Instead, there's a RADIUS client module which queries home servers, just like the SQL module queries SQL databases.

At that point, dynamic home servers just become dynamic updates to a module. Realms lose all magic, and just become module configuration.

Sounds like a good design, and certainly a far less hackish approach for us.

The point is to separate data sources from how that data is used. The more tightly they are tied together, the harder it is to change anything.

Agreed.

Perhaps you could explain what callbacks you need when a home server expires.

What I need is to be able to renew the TLS keys associated with a particular realm before they expire. An "expire + get new one" procedure does not work well for us, since the establishment process is slow and hence some user authentications might seem really slow to the user for no apparent reason.

So the callback would be something such as "renew_realm_info()", which would have a configurable timer < expiration_time. That is, a realm is not used beyond its expiration time, but, say, 60s beforehand I get notified so I can start the negotiation and establish new keys before expiration (this would indeed be very similar to the IKEv2/IPsec rekeying process). It is important that, while the refresh is in progress, users can still use the old realm info for authenticating, so there is no disruption for them.

In option 2), that is, having an external daemon that does the refreshing, this would not be needed, as upon expiration FR will query for the new keys, which will already be there.

Both of those are the same option:

something somehow gets realm information, and sends it to the RADIUS client module. The communication between the two is attributes, just like v3 with dynamic clients.
You're free to write a custom module to query your own DB, or you can use unlang to query SQL / REST / whatever, and get attributes that way.

I see what you mean. Again, the main "challenge" I see is how you are going to handle expiration/renewal of realms. It can be:

  1. Passive. A request comes in, and FR looks up the realm information in the cache. It is expired. FR calls the unlang code to retrieve new information. In this case, proactive rekeying has to be done outside FR.
  2. Proactive. A timer is set so that when the realm expires, FR calls the unlang code to retrieve new information. In this case, this refreshing is the actual rekeying. The problem with this approach is that, if a user authentication happens in another thread while refreshing, it either fails because of the expired key, or has to wait until the new keys are established, since the old ones are expired.
  3. Proactive with soft expiration. A timer is set before the actual expiration. FR calls the unlang code to retrieve new information BUT the old realm can still be used by other threads until it is updated, since it is not expired yet.

Obviously, my preference is 3). That is the "callback" I was referring to.
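
To illustrate option 3), here is a small single-threaded sketch of a cache with soft expiration. The names (`RealmEntry`, `RealmCache`, `renew_at`, `expires_at`) are invented for this example and the keying material is a placeholder string. A real implementation would also need locking across threads, which is omitted here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class RealmEntry:
    psk: str            # keying material (placeholder string here)
    renew_at: float     # soft expiry: start rekeying at this time
    expires_at: float   # hard expiry: never use the entry at or after this time
    renewing: bool = False


class RealmCache:
    def __init__(self) -> None:
        self._entries: Dict[str, RealmEntry] = {}

    def store(self, realm: str, entry: RealmEntry) -> None:
        """Insert or replace an entry, e.g. when rekeying completes."""
        self._entries[realm] = entry

    def lookup(self, realm: str, now: float) -> Tuple[Optional[RealmEntry], bool]:
        """Return (usable entry or None, whether the caller should start a renewal)."""
        entry = self._entries.get(realm)
        if entry is None or now >= entry.expires_at:
            # Hard expiry (or unknown realm): nothing usable, caller must fetch.
            self._entries.pop(realm, None)
            return None, True
        if now >= entry.renew_at and not entry.renewing:
            # Soft expiry: keep serving the old entry, but ask for one renewal.
            entry.renewing = True
            return entry, True
        return entry, False
```

Between `renew_at` and `expires_at`, every lookup still returns the old entry, and only the first one is told to start a renewal. Once the new keys are negotiated, `store()` swaps them in, so no authentication has to wait.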

alandekok Jul 15, 2017

Member

The short answer is that we need a "renewal" timer which is different from the "expiry" timer.

The new design has an event loop per thread. Modules can add events to that event loop. Which means that when the "dynamic home server" module hits the "renew" timer, it can take action to (somehow) renew / refresh the data.
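
As a rough illustration of separate "renew" and "expire" timers on an event loop, here is a toy timer queue in Python. This is not the v4 event API: `EventLoop`, `add_timer`, `run_until` and `add_home_server` are all invented for the sketch, and time is passed in explicitly instead of being read from a clock.

```python
import heapq
import itertools
from typing import Callable, List, Tuple


class EventLoop:
    """Tiny timer queue standing in for a per-thread event loop."""

    def __init__(self) -> None:
        self._timers: List[Tuple[float, int, Callable[[float], None]]] = []
        self._seq = itertools.count()  # tie-breaker so callbacks are never compared

    def add_timer(self, when: float, callback: Callable[[float], None]) -> None:
        heapq.heappush(self._timers, (when, next(self._seq), callback))

    def run_until(self, now: float) -> None:
        """Fire every timer due at or before `now`, in time order."""
        while self._timers and self._timers[0][0] <= now:
            when, _, callback = heapq.heappop(self._timers)
            callback(when)


def add_home_server(loop: EventLoop, realm: str, learned_at: float,
                    renew_after: float, expire_after: float, log: list) -> None:
    """Register separate renew and expire timers for a dynamic home server."""
    loop.add_timer(learned_at + renew_after, lambda t: log.append((t, "renew", realm)))
    loop.add_timer(learned_at + expire_after, lambda t: log.append((t, "expire", realm)))


# Hypothetical usage: renew 60 s before a 600 s expiry.
loop = EventLoop()
events: list = []
add_home_server(loop, "example.org", learned_at=0, renew_after=540, expire_after=600, log=events)
loop.run_until(550)
```

Here the callbacks only record what fired. In a real module, the renew callback would start the negotiation, and a successful renewal would replace the pending expire timer with a later one.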

alejandro-perez Jul 16, 2017

Contributor

The short answer is that we need a "renewal" timer which is different from the "expiry" timer.

Exactly

The new design has an event loop per thread. Modules can add events to that event loop. Which means that when the "dynamic home server" module hits the "renew" timer, it can take action to (somehow) renew / refresh the data.

It seems pretty straightforward. I think implementing the TR client functionality in v4.0 will actually be easier than it was for v3.0.
Is this new design already implemented, so it can be tested?

alandekok Jul 16, 2017

Member

v4.0.x is running now. The APIs are pretty much stable at this point.

You may want to wait a bit for me to finish the RADIUS client module (outgoing packets / proxying). That will make the rest of the work clearer.

alejandro-perez Jul 16, 2017

Contributor

Sure, I will. Thanks!
