Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove extra '00' padding of VTI interface names. #4071

Closed
wants to merge 13 commits into from

Conversation

BrettMerrick
Copy link

@BrettMerrick BrettMerrick commented Jun 18, 2019

FreeBSD will not accept an interface number greater than 32767, therefore this padding limits the number of VTI interfaces to 32.
This change will increase this to 3276.

ref: https://redmine.pfsense.org/issues/9592
FreeBSD will not accept an interface number greater than 32767, therefore the padding limits the number of VTI interfaces to 32.
This change will increase this to 3276.
Copy link
Contributor

@jim-p jim-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle, this should be OK, but in practice, it can't be changed that abruptly as it would break existing configurations. Anyone who assigned the interface using the old name would have their configuration broken by the change. There are a couple possible paths forward here, and at least one of them would have to be implemented in the PR before this could be merged.

  • Option 1: Make this optional and disabled by default. Using a knob somewhere, such as the Advanced tab in the IPsec settings, allow someone to opt into this behavior. They would have to remove all interface assignments for VTIs before enabling the option, however.
  • Option 2: Add upgrade code to adjust existing VTI assignments and interface references to the new names.

@BrettMerrick
Copy link
Author

Agreed.

It seems the code is trying to split numbering between IKEv1 - where P2 each get their own interface - and IKEv2 - where P2 each get a single interface and alias addresses - in a way that ensures the interface names persist.

The problem is simply that the 0 padding method puts artificial pressure on the upper limit of each P2 type.

I couldn't immediately see anywhere the structure of the numeric component was being used in reverse to pass information back to pfsense, so if anyone has any insight here please let me know.

I propose the following changes:

  • The mapping of OS interface names to P2 connections for VTIs are stored in the config.
  • Upgrade code creates appropriate mappings for existing VTIs
  • Upgrade code updates all opt references to the existing OS interface names in the config where these have numeric values > 32767 and have to be changed.
  • Subsequent VTI interface creations use numeric values generated by a function that ensures unique values less than 32767.

I suspect there is also some logic required around the removal of interfaces where the P2 config changes but interface assignments haven't been removed leaving dependent config that could be inadvertently attached to a new VTI. This will require further investigation.

@jim-p
Copy link
Contributor

jim-p commented Jun 19, 2019

Since the interface names must be unique, the reqid was a convenient unique ID and strong means of identification in both directions, which allowed us to easily determine which interface belonged to which tunnel and vice versa.

While it would certainly be possible to map them in other ways, as you have found, it's not so easy to keep that association so it works in both directions in a way that would never change. Unfortunately, the ipsec interfaces don't let us use separators or we could do something like ipsec<reqid>.<index> without padding, similar to VLANs. There may be some other way to map them by renaming the interfaces but that also seems a bit more confusing.

Without that kind of separator then we'd have to embed the interface name in the config but the difference between how things are handled in IKEv1/IKEv2/split connections makes that awkward as well.

Using reqid on its own is another possibility but it does need some padding or a base number because otherwise it can hit a conflict if strongSwan automatically generates a reqid, but even then you would still need to somehow take the index into account, and it can't just be a single digit since someone could have >10 P2s, though that wouldn't be common with VTI so it's probably not a huge concern. Once you start thinking about how the automatically generated names work at scale, a lot of schemes fall apart when you hit a large number of entries.

I definitely do not want to get into a situation where they are handled inconsistently for old vs new or low vs high. That would be even more difficult to keep track of and confusing for everyone. Whatever new scheme is chosen will need to be applied to all entries, and upgrade code can change the mappings as needed from old to new.

@BrettMerrick
Copy link
Author

Is it an absolute requirement that the interface name directly identifies the tunnel?

I would be in favour of sequential numbering, mapping stored in the config, and functions to access the mappings that performed validation.

I couldn't see anywhere in the existing code that would require a lookup in that direction - ie. which tunnel does interface ipsec13002 belong to. I may have missed this, if so please point me to it.

Yes, upgrade code could simply change all mappings to the new format for consistency.

@jim-p
Copy link
Contributor

jim-p commented Jun 20, 2019

While the automatic lookup functions would generate the names in the config->interface name direction, when users see the interface names it is helpful for them to be able to easily tell the relationship. The interface name might show up bare in things like remote logs, daemon configs, etc. Since the interface index and reqid match currently it's simple to correlate information from things like IPsec status to interface status in either direction, especially when working at the console, and doubly so for support personnel diagnosing issues with customer installations. So while it may be possible to do without it, I'd prefer to see that kind of relationship preserved if at all possible.

@BrettMerrick
Copy link
Author

Understood.

We need to achieve recognisable interface names and still have a reasonable upper limit on either IKEv1 P2/IKEv2 split interfaces or IKEv2 interfaces.

I have conferred with Steven who encountered the issue to see if simply shifting the split to the third digit instead of the second could work. While it meets the needs of his current deployment we believe the result would still not provide enough room for larger deployments:

At present, a new reqid is the lowest available ph2 reqid less than 16000, but in reality this is limited to 32 by the padding used to create the interface.
This means that:

  • for IKEv1/v2 split, if the assigned reqid is a single digit <=3 there can be up to 99 ph2 interfaces.
  • for IKEv1/v2 split, if the assigned reqid is a single digit >3 or double digit <32 there can be up to 9 ph2 interfaces per reqid
  • for IKEv2, there can only be 32 ph2 interfaces in the whole config

If we shifted the split right by one place instead of the two I had in the initial pull request, the ph1 key id would be increased to 327.
It would then follow that:

  • for IKEv1/v2 split, if the assigned reqid is 1 through 31 there can be up to 990 ph2 interfaces. (with code to exclude ph2's ending 00)
  • for IKEv1/v2 split, if the assigned reqid is 32 there can be up to 760 ph2 interfaces.(with code to exclude ph2's ending 00)
  • for IKEv1/v2 split, if the assigned reqid is 33 through 99 there can be up to 99 ph2 interfaces. (no ph2's ending 00)
  • for IKEv1/v2 split, if the assigned reqid is 100 through 326 there can be up to 99 ph2 interfaces. (no ph2's ending 00)
  • for IKEv1/v2 split, if the assigned reqid is 327 there can be up to 67 ph2 interfaces. (no ph2's ending 00)
  • for IKEv2, there can only be 327 ph2 interfaces in the whole config

I cannot see any way of improving beyond this limit of 327 without foregoing the readability of the ipsec interface names.

Would pre-approval or signoff in principal be possible for either:

  • a maximum of 327 IKEv2 VTI's, or
  • a non human readable numbering system,

before I allocate time to writing any code?

@jim-p
Copy link
Contributor

jim-p commented Jun 21, 2019

The more I think about it, the bigger the mess becomes. The best path forward is probably just to keep a mapping of interface names to reqid values in the config then. Problem now is where to keep it, since depending on the config it might be relevant to the P1 or specific P2s, or could switch. Maybe a separate entry under IPsec, so an array at $config['ipsec']['vtimaps'].

@BrettMerrick
Copy link
Author

OK Great! I'll make a start down that route next week.

@BrettMerrick
Copy link
Author

I think the approach I'll take will be to replace the existing constructs with calls to functions using the same keys:
$ipsecifnum = "{$ph1ent['ikeid']}00{$idx}";
becomes
$ipsecifnum = get_ipsecifnum($ph1ent['ikeid'],$idx);
This way the mappings are still readable in the config file, and it minimises the changes required.

I think the config structure $config['ipsec']['vtimaps']['reqid']['index']= would present this clearly.

The function get_ipsecifnum will then be responsible for either locating an existing mapping in the config, or creating and storing a new one.

You mentioned StrongSwan may create reqids that could clash. Is this still a potential issue now we're using sequential mapped ifnums? ie. would a get_ipsecifnum function need to scrape the output of something like /sbin/setkey -D to avoid ifnum clashes too?

@jim-p
Copy link
Contributor

jim-p commented Jun 25, 2019

You mentioned StrongSwan may create reqids that could clash. Is this still a potential issue now we're using sequential mapped ifnums? ie. would a get_ipsecifnum function need to scrape the output of something like /sbin/setkey -D to avoid ifnum clashes too?

It shouldn't be an issue so long as the reqids don't change, just the interface numbers.

Added and included calls to new function get_ipsecifnum
@rbgarga
Copy link
Member

rbgarga commented Sep 3, 2019

@BrettMerrick do you still have plans to work on this PR?

@BrettMerrick
Copy link
Author

Yes. We are currently testing the code. Will post update shortly.

@BrettMerrick
Copy link
Author

Still need to do some more testing, will report back here once I'm happy with the results.

@rbgarga
Copy link
Member

rbgarga commented Sep 4, 2019

Still need to do some more testing, will report back here once I'm happy with the results.

I noted you are making config changes inside get_ipsecifnum() to convert old to new format. This is not the proper place to do this kind of change. You should bump latest config version at globals.inc and then write a function that will convert all items to new format inside upgrade_config.inc

@BrettMerrick
Copy link
Author

The changes to config are just adding new entries/structure, not modifying existing data, so I don't have anything to add to upgrade_config.inc with this approach.

When I looked at what upgrade_config would need to include I realised that, without picking apart every package to see if they ever referenced the vti interface names in their config, it would open up significant potential for breakage.

What I have done, is ensured that the new function stores and returns ipsecifnums that are identical to the old format up to the point where the old format reached the limit, then it backfills to utilise the remaining numbers.

This has the additional benefit of not requiring a change in behaviour for those who haven't encountered this limit.

for example, if you have 10 vti phase 2 interfaces on your 12th phase 1, they would still get interface numbers 12000 - 12009. However now, instead of an error when you add another it will get the lowest available number. This might be 7 or 57 depending on what has already been stored in $config['ipsec']['vtimaps'].

I still need to checks for clashes of gap filling numbers with old format numbers and vice versa in the new function, and then it's ready for review. I'll finish that in the next few days.

@rbgarga
Copy link
Member

rbgarga commented Sep 11, 2019

The changes to config are just adding new entries/structure, not modifying existing data, so I don't have anything to add to upgrade_config.inc with this approach.

When I looked at what upgrade_config would need to include I realised that, without picking apart every package to see if they ever referenced the vti interface names in their config, it would open up significant potential for breakage.

What I have done, is ensured that the new function stores and returns ipsecifnums that are identical to the old format up to the point where the old format reached the limit, then it backfills to utilise the remaining numbers.

This has the additional benefit of not requiring a change in behaviour for those who haven't encountered this limit.

for example, if you have 10 vti phase 2 interfaces on your 12th phase 1, they would still get interface numbers 12000 - 12009. However now, instead of an error when you add another it will get the lowest available number. This might be 7 or 57 depending on what has already been stored in $config['ipsec']['vtimaps'].

I still need to checks for clashes of gap filling numbers with old format numbers and vice versa in the new function, and then it's ready for review. I'll finish that in the next few days.

I still don't like the idea of get_ipsecifnum(), a function that is supposed to get the interface number from a .inc file, keep changing config.xml.

What do you think about that @jim-p?

@jim-p
Copy link
Contributor

jim-p commented Sep 11, 2019

I still don't like the idea of get_ipsecifnum(), a function that is supposed to get the interface number from a .inc file, keep changing config.xml.

What do you think about that @jim-p?

Same for me. From the way it's used, it looks to me like it should just loop through the IPsec tunnels and calculate things during upgrade so it wouldn't even need to do write_config() itself. Though I didn't trace through all the code around it to look at how that might affect things.

Either that or only have it recalculate when the user edits/saves a tunnel.

Copy link
Contributor

@jim-p jim-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now has conflicts that must be resolved. Please review the significant changes made to IPsec for https://redmine.pfsense.org/issues/9603 before attempting to resolve the conflicts.

@rbgarga
Copy link
Member

rbgarga commented Feb 7, 2020

@BrettMerrick do you still plan to work on that?

@BrettMerrick
Copy link
Author

I've been diverted to other projects, however the problem remains. I may come back to it at a later stage if it is not resolved before I am available.

@rbgarga
Copy link
Member

rbgarga commented Feb 13, 2020

Superseded by #4190

@rbgarga rbgarga closed this Feb 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants