MTU recommendation about Azure VMs #69477

Closed
yinoa opened this issue Jan 26, 2021 · 26 comments

Comments

@yinoa

yinoa commented Jan 26, 2021

In this article, it is mentioned that the Azure platform MTU is 1400 and that out-of-order fragments will be dropped because of a Linux OS issue.

Should we recommend setting the Azure VM MTU to 1400?

Is there any difference between TCP and UDP fragmentation processing on the Azure platform?

Azure and VM MTU
The default MTU for Azure VMs is 1,500 bytes. The Azure Virtual Network stack will attempt to fragment a packet at 1,400 bytes.

Note that the Virtual Network stack isn't inherently inefficient because it fragments packets at 1,400 bytes even though VMs have an MTU of 1,500. A large percentage of network packets are much smaller than 1,400 or 1,500 bytes.

Azure and fragmentation
Virtual Network stack is set up to drop "out of order fragments," that is, fragmented packets that don't arrive in their original fragmented order. These packets are dropped mainly because of a network security vulnerability announced in November 2018 called FragmentSmack.

FragmentSmack is a defect in the way the Linux kernel handled reassembly of fragmented IPv4 and IPv6 packets. A remote attacker could use this flaw to trigger expensive fragment reassembly operations, which could lead to increased CPU and a denial of service on the target system.



@rimayber
Contributor

rimayber commented Jan 26, 2021 via email

@yinoa
Author

yinoa commented Jan 26, 2021

Thanks @rimayber.

I agree with "fragmentation is not bad", but I'm still confused about why the Azure Virtual Network stack attempts to fragment a packet at 1,400 bytes and then might drop it for being "out of order".

To give a more straightforward example: suppose a TCP/IP packet from the internet to Azure is 1460 bytes (payload length) + 20 bytes (TCP header) + 20 bytes (IP header) = 1500 bytes.

  1. Will the Azure platform fragment it into two pieces, 1400 + 60?
  2. If those two fragments reach the destination in Azure out of order, can they be dropped (because the Virtual Network stack is set up to drop "out of order fragments")?
  3. If "out of order" fragments come from the internet, will Azure drop them for the same reason when they reach the Azure destination?
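
For reference, the split in question 1 can be worked out with standard IPv4 fragmentation arithmetic (RFC 791). This is only a sketch: it assumes the Azure stack applies ordinary IPv4 rules with a 1,400-byte limit, which the thread does not confirm, and `ipv4_fragments` is a hypothetical helper name. Note that the TCP header travels inside the IP payload, and every fragment except the last must carry a payload that is a multiple of 8 bytes, so the result is not a clean "1400 + 60".

```python
def ipv4_fragments(total_len: int, mtu: int, ihl: int = 20) -> list:
    """Return (payload_offset, fragment_total_length, more_fragments) per fragment.

    total_len: length of the original IPv4 packet, header included.
    mtu:       largest IPv4 packet the link will carry (assumed 1400 here).
    ihl:       IPv4 header length in bytes (20 without options).
    """
    if total_len <= mtu:
        return [(0, total_len, False)]
    payload = total_len - ihl
    # Non-final fragments must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - ihl) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        size = min(per_fragment, payload - offset)
        more = offset + size < payload
        fragments.append((offset, ihl + size, more))
        offset += size
    return fragments


# The 1500-byte packet from the example, fragmented at an assumed 1400 bytes.
print(ipv4_fragments(1500, 1400))
```

Under these assumptions the packet becomes one 1,396-byte fragment and one 124-byte fragment, each with its own 20-byte IP header.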

@rimayber
Contributor

rimayber commented Jan 26, 2021 via email

@yinoa
Author

yinoa commented Jan 26, 2021

Yes, out-of-order fragments can occur in the Azure cloud because of multi-path routing. For out-of-order fragmented TCP packets, it seems the Azure destination is able to reassemble them, and issues are rarely reported.

But for out-of-order fragmented UDP packets, we need to enable the UDP re-ordering flag for a specific subscription, per region.

I just want to confirm whether TCP and UDP really differ in this way. If so, should we add a more detailed description of how Azure drops "out of order" fragments?

Azure and fragmentation
Virtual Network stack is set up to drop "out of order fragments," that is, fragmented packets that don't arrive in their original fragmented order. These packets are dropped mainly because of a network security vulnerability announced in November 2018 called FragmentSmack.

@rimayber
Contributor

rimayber commented Jan 26, 2021 via email

@yinoa
Author

yinoa commented Jan 27, 2021

According to the "Azure and fragmentation" section, "out of order" fragments will be dropped.

Is that partially untrue?

For "out of order" fragmented TCP packets, the Azure destination can reassemble them and will not drop them.

For "out of order" fragmented UDP packets, the Azure destination drops them by design. But there is an option, "enable-udp-fragment-reordering", that reassembles those packets to avoid the drops.
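
The difference between the two behaviours described above can be illustrated with a toy reassembly model. This is a sketch only: the thread does not describe how Azure implements either behaviour, and `Fragment`, `reassemble`, and `allow_reordering` are hypothetical names that stand in for the default policy and the "enable-udp-fragment-reordering" option.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    offset: int   # byte offset of this fragment's payload in the original datagram
    data: bytes   # payload bytes carried by this fragment
    more: bool    # True if further fragments follow (the IPv4 MF flag)


def reassemble(fragments, allow_reordering):
    """Return the reassembled payload, or None if the datagram is dropped.

    With allow_reordering=False, any fragment that arrives out of order
    causes the whole datagram to be dropped (the default described above).
    With allow_reordering=True, fragments are sorted by offset first.
    """
    if not allow_reordering:
        expected = 0
        for frag in fragments:
            if frag.offset != expected:
                return None  # arrived out of order: drop
            expected += len(frag.data)
    ordered = sorted(fragments, key=lambda f: f.offset)
    if not ordered or ordered[-1].more:
        return None  # final fragment never arrived
    payload = b""
    for frag in ordered:
        if frag.offset != len(payload):
            return None  # gap or overlap between fragments
        payload += frag.data
    return payload


first = Fragment(offset=0, data=b"A" * 1376, more=True)
last = Fragment(offset=1376, data=b"B" * 104, more=False)

print(reassemble([last, first], allow_reordering=False))       # dropped
print(len(reassemble([last, first], allow_reordering=True)))   # reassembled
```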

@gideonoost

Does this enable-udp-fragment-reordering option really exist? The only place I have found any mention of it is in this thread, and nowhere else.

@hotgore

hotgore commented May 19, 2021

Does this enable-udp-fragment-reordering option really exist? The only place I have found any mention of it is in this thread, and nowhere else.

It does, I had to ask for it to resolve an issue with a UDP app. The support person wasn't sure about it until I linked this thread and was able to get rimayber to help.

It is weird, but I don't think MS wants to accept that fragmenting UDP results in tons of out of order packets on the other end, screwing with some apps. I wish they didn't bother fragmenting anything below 1400, seems like a great way to create weird problems.

@jackchenwork

@hotgore could you please provide a bit of information on how to set this "enable-udp-fragment-reordering" option? I'm troubleshooting a RADIUS EAP-TLS authentication issue, and it seems to be related to IPsec to Azure and lost fragmented UDP datagrams.

@rimayber
Contributor

rimayber commented Mar 11, 2022 via email

@rimayber
Contributor

rimayber commented Mar 11, 2022 via email

@plessisa

plessisa commented Oct 13, 2022

Basically you need a support ticket to request it. The doc link is for internal support employees
Hi @rimayber,

I opened a support ticket for "enable-udp-fragment-reordering".
So far, support has enabled "enable-udp-fragment-reordering" only for inbound traffic on the VM public IP. When using an Azure Load Balancer or a VPN via a Virtual Network Gateway, the fragments are still being dropped.

Best,

@nibanks

nibanks commented Nov 9, 2022

I was just informed of this discussion, and I would tend to disagree with "fragmentation is not bad." It causes a whole host of issues, which is why the entire QUIC protocol (built on UDP and really the next protocol for the internet) has taken the stance of setting the "Don't Fragment" bit to make sure fragmentation doesn't happen.

But to that end, does the Azure network stack fully respect this bit? Or is there some virtualization layer that does some kind of fragmentation regardless? If so, that's really bad.

BTW, I don't understand the reasoning for this claim: "Most packets are much smaller than 1400 or 1500 – so there’s no risk of fragmentation". Generally, smart protocols will dynamically discover the largest MTUs that "work". But if the network fragments the larger MTUs it ends up causing more problems...
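
The dynamic discovery mentioned above can be sketched as a binary search over probe sizes. This is a simplified illustration, not any particular protocol's algorithm: `probe` is a hypothetical callback that would send a packet of the given size with "Don't Fragment" set and report whether it was acknowledged, and the 1200-byte floor is borrowed from QUIC's minimum datagram size as an assumed known-good starting point.

```python
from typing import Callable


def discover_path_mtu(probe: Callable[[int], bool],
                      low: int = 1200, high: int = 1500) -> int:
    """Return the largest size in [low, high] for which probe(size) succeeds.

    Assumes `low` is already known to work and that success is monotonic:
    if a size gets through, every smaller size does too.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid        # mid got through; search above it
        else:
            high = mid - 1   # mid was lost; search below it
    return low


# Simulated path that silently loses anything larger than 1400 bytes.
print(discover_path_mtu(lambda size: size <= 1400))
```

A search like this only finds a useful answer if oversized probes are actually lost or rejected. If the network fragments them transparently instead, the probe succeeds and the sender settles on a size that still gets fragmented in transit.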

@vbrozik

vbrozik commented Nov 9, 2022

@nibanks I agree. These statements seem to be fabricated. They are not supported by real data:

  • "fragmentation is not bad."
  • "Most packets are much smaller than 1400 or 1500 – so there’s no risk of fragmentation"

I commented along the same lines on an issue that was closed without any sound reasoning: #65888 (comment)


...the entire QUIC protocol ... has taken the stance to set the "Don't Fragment" bit to make sure fragmentation doesn't happen.

Are you sure? The "Don't Fragment" bit is normally being set by the operating system as part of the Path MTU Discovery algorithm.

@nibanks

nibanks commented Nov 9, 2022

Are you sure? The "Don't Fragment" bit is normally being set by the operating system as part of the Path MTU Discovery algorithm.

About what part? The OS is involved in TCP based communication and may use the Don't Fragment bit when doing PMTUD, yes. When it's done in QUIC, the OS isn't necessarily involved in anything more than providing a UDP socket. Then, it's the app layer that's telling the UDP socket to set the Don't Fragment bit.
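
As an illustration of the app layer asking a UDP socket to set the Don't Fragment bit, here is a minimal Linux-only sketch. The numeric fallbacks are the Linux values of `IP_MTU_DISCOVER` and `IP_PMTUDISC_DO` from `<linux/in.h>`, used in case the Python `socket` module does not export them; `udp_socket_with_df` is a hypothetical helper name, and this is not taken from any QUIC implementation.

```python
import socket
import sys

# Linux socket option values from <linux/in.h>, used as fallbacks when the
# socket module does not define these names.
IP_MTU_DISCOVER = getattr(socket, "IP_MTU_DISCOVER", 10)
IP_PMTUDISC_DO = getattr(socket, "IP_PMTUDISC_DO", 2)


def udp_socket_with_df() -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing datagrams carry DF (Linux only)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if sys.platform.startswith("linux"):
        # IP_PMTUDISC_DO: always set DF; oversized sends fail locally
        # instead of being fragmented by the kernel.
        sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DO)
    return sock


sock = udp_socket_with_df()
if sys.platform.startswith("linux"):
    print(sock.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER))
sock.close()
```

Other operating systems expose the same behaviour through different socket options, so the sketch leaves the socket unchanged there.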

@fedda-no

UDP fragmentation is a pain in Azure. I have a particular problem with EAP-TLS in RADIUS UDP packets:

  1. A standard Azure VNet drops all EAP-TLS traffic where the UDP packet gets fragmented (NAS to RADIUS). When the NAS encapsulates the client certificate in the UDP packet, the packet has to be fragmented, and Azure drops it (the NAS logs: Radius Server Dead, Radius server: Not Response from NAS).
    • This applies to ALL RADIUS traffic traversing Azure. For example, if you use Azure VWAN, site to Azure, and Azure Express to an on-prem DC (where RADIUS is hosted), the UDP packets get "dropped".
  2. The initial fix for this is to:
    • Get Azure to enable enable-udp-fragment-reordering.
  3. However, I still have a problem when hosting, for example, Cisco ISE 3.2 in Azure: the Azure VNet drops the "server certificate response". Cisco ISE 3.2 deployed from the marketplace is set up with the standard MTU of 1500, and we see Azure still randomly dropping the RADIUS response from Cisco ISE. The NAS reports: Radius Server dead, jump to next radius server in list. ISE responds to the next request with "Authentication process already in process" and drops the request. But this problem is random, not consistent like #1.
    • Fix in progress: trying to change the MTU on the Cisco ISE server in Azure to 1400 to prevent any fragmentation of the RADIUS response (especially when RADIUS has to encapsulate the server certificate in the response, which requires fragmenting the packet). Not 100% sure this resolves the problem.

So, a final rule of thumb: when running 802.1X EAP traffic with a RADIUS server in Azure, or sending 802.1X RADIUS traffic through Azure, you have to be very, very careful, because the UDP fragmentation rules in Azure break the RFC.

Microsoft should make sure that Azure VNet follows the RFC, and not blame it on some "security reason here and there" that breaks the entire UDP stack. If they have a problem with the RFC for the UDP protocol, then they should work on changing the RFC.

@jackchenwork

jackchenwork commented Nov 17, 2022

@fedda-no I had the same problem using FortiAuthenticator as the RADIUS server and FortiGate as the site-to-site VPN server in Azure. EAP-TLS certificate-based authentication sometimes worked and sometimes didn't, because fragmented UDP datagrams got lost. FortiGate has a "set ip-fragmentation pre-encapsulation" option, and that (plus reducing the MTU) fixed the problem for me.

@hotgore

hotgore commented Mar 13, 2023

Did changing the MTU fix the issue?

@asudbring
Contributor

Thank you for your dedication to our documentation.

Unfortunately, we have been unable to review this issue in a timely manner. We sincerely apologize for the delayed response. We are closing this issue. If you feel that the problem persists, please respond to this issue with additional information.

Please continue to provide feedback about the documentation. We appreciate your contributions to our community.

#please-close

@gideonoost

gideonoost commented Mar 17, 2023 via email

@gideonoost

gideonoost commented Mar 17, 2023 via email

@fedda-no

fedda-no commented Mar 17, 2023 via email

@hotgore

hotgore commented Mar 17, 2023

Issue is not resolved.

Funny that Microsoft requires customers to ask for a secret flag to fix their broken network design. Yeah, build the network on GRE in a way so customers can't use GRE themselves, keep the MTU default so you have to fragment, and then take a stateless protocol, fragment it and deliver the fragments out of order. No one in the world would think that fragmenting UDP packets and delivering them out of order is a good idea. When customers complain their only option is the secret code word whereby the customer can't actually verify it is enabled. This idiocy will only get worse as more and more protocols use UDP (QUIC).

@neilyoung

neilyoung commented Mar 24, 2023

I'm having a problem with an RTSP server, which emits RTP payloads of at most 1472 bytes. This comes to a maximum of 1514 bytes on ETH, surely too big for MTU 1500. I traced the output of the Azure network interface I have access to and found that all UDP packets left with the Don't Fragment bit set. As a result, the RTSP client couldn't successfully fetch RTSP via UDP, while TCP/TLS worked fine.

I managed to convince the nice developer to give me a way to reduce the RTP payload size, and the very first attempt was a success (payload 1400). I think the max RTP payload is something like 1458, which in turn would come to 1500 on ETH. Don't Fragment is still set, but the packets "come through" now.

Full story bluenviron/mediamtx#1588

Not sure why everybody says Azure fragments at 1400...
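
The sizes quoted above can be checked with a few lines of arithmetic. This sketch assumes the quoted figures are UDP payload sizes (RTP header included), which the comment does not state, and that the packets use IPv4 without options over plain Ethernet. Keep in mind that an interface MTU is conventionally measured at the IP layer, so the 14-byte Ethernet header does not count toward it.

```python
ETH_HEADER = 14   # Ethernet II header, no VLAN tag, no FCS
IPV4_HEADER = 20  # IPv4 header without options
UDP_HEADER = 8


def sizes(udp_payload: int) -> tuple:
    """Return (ip_packet_length, ethernet_frame_length) for a UDP payload size."""
    ip_packet = udp_payload + UDP_HEADER + IPV4_HEADER
    return ip_packet, ip_packet + ETH_HEADER


for payload in (1472, 1458, 1400):
    ip_len, frame_len = sizes(payload)
    print(f"payload {payload}: IP packet {ip_len}, Ethernet frame {frame_len}")
```

Under these assumptions a 1472-byte payload gives a 1500-byte IP packet and a 1514-byte frame, a 1458-byte payload gives a 1500-byte frame, and a 1400-byte payload gives a 1428-byte IP packet, which is still above 1400.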

@maZuFC

maZuFC commented Jul 31, 2023

For me, your 'fix in progress' is what resolved our auth issues. Setting the MTU on the ISE node to 1300 did the trick.
Initial testing is good, and as long as nothing breaks with the MTU at 1300, we will leave it at that. We only use ISE for 802.1X.
