Q: what is tx_send_full? #36
Comments
Hi woosley, the high tx_send_full number may be normal. Haiyang should be able to explain it in detail. Can you please share more info:

- Can you share the VM's full serial console log (we can get this from the Azure portal if the VM has Boot Diagnostics enabled), the dmesg output, the syslog file, and the full output of "ethtool -S eth0"? I assume that when the network issue happens, you're still able to log in to the VM via the Azure Serial Console?
- Which Linux distro(s) and kernel version(s) are you using (please share "uname -r")?
- What's the symptom of the network issue: are you unable to ssh to the VM due to a connection timeout? Are you still able to ping the VM from another good VM in the same VNET?
- How easily can you reproduce the issue? What's the workload that reproduces it? Do you think it would be possible for us to easily reproduce the issue with the same workload?
- When the network issue happens, if you're still able to log in to the VM via the Azure serial console: do you notice any other symptoms, e.g. slow disk read/write or high CPU utilization?
- If your kernel creates the channels directory (find /sys -name channels), can you find the channels directory of your affected NIC (e.g. eth0) and dump the state of the ring buffers, as in the sketch below?
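A minimal sketch of dumping the ring-buffer state, assuming the affected NIC is eth0 and that its vmbus device exposes a channels/ directory in sysfs (the exact path varies by kernel version, hence the find above):

```bash
#!/bin/bash
# Dump every per-channel state file for eth0's vmbus device.
# The path below is an assumption; locate it with "find /sys -name channels".
dev=/sys/class/net/eth0/device
for f in "$dev"/channels/*/*; do
    printf '== %s ==\n' "$f"
    cat "$f"
done
```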
tx_send_full only means the netvsc send buffer is temporarily full; the outgoing packets can still go out via the GPA (guest physical address) list.
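To check the current value of the counter on a VM (eth0 assumed to be the affected NIC):

```bash
# Read the tx_send_full counter from the NIC's driver statistics.
ethtool -S eth0 | grep tx_send_full
```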
Hi @dcui,
This issue is not happening on other VMs, including the k8s master nodes. I compared the sysctl settings, the "ethtool -S" output, etc.; the only difference I can find so far is that on those slave nodes tx_send_full is high, while on other VMs this value is always 0.
FYI: here is some info about Accelerated Networking (AN). Your syslog and "ethtool -S eth0" output show that your VM doesn't have AN enabled, so it looks like you're seeing a bug in the Hyper-V network device driver hv_netvsc. Your kernel version is 3.10.0-862.11.6.el7.x86_64 (i.e. CentOS 7.5). There is a known issue with this kernel version, and the fix is "hv_netvsc: Fix napi reschedule while receive completion is busy". We also know CentOS 7.5 has other related bugs that can cause similar symptoms if the VM is under heavy workloads. I suggest you upgrade to the latest CentOS 7.7 (or at least upgrade the kernel to the 7.7 version, i.e. 3.10.0-1062.x), because 7.7 has all the related fixes. If you have to stay with CentOS 7.5, please consider replacing the built-in Hyper-V drivers (i.e. the LIS drivers) with the Microsoft-maintained version, LIS 4.3.4, which also includes all the related fixes. Please let us know if either solution works for you.
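A hedged sketch of the two remediation paths described above (the tarball name and extracted directory are assumptions; get the current download link from Microsoft's Linux Integration Services page):

```bash
# Option 1: upgrade the kernel to the CentOS 7.7 stream (3.10.0-1062.x).
sudo yum update kernel && sudo reboot
# After reboot, verify:
uname -r

# Option 2 (sketch): replace the built-in drivers with Microsoft LIS 4.3.4.
# Download the LIS tarball from Microsoft first; names below are assumptions.
tar xzf lis-rpms-4.3.4.tar.gz
cd LISISO
sudo ./install.sh && sudo reboot
```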
Thanks @dcui.
I just realized another bug may cause a similar symptom, and the fix is the 2019-05-03 patch "hv_netvsc: fix race that may miss tx queue wakeup". This fix is not present in RHEL 7.7 and earlier; it has been incorporated into RHEL 7.8-beta. I expect the formal RHEL 7.8 to be released in a few months, and I suggest you upgrade to RHEL 7.8 then. The Microsoft-maintained version of the Linux drivers (i.e. LIS 4.3.4) also includes the fix, so you may consider installing LIS 4.3.4 on your RHEL/CentOS 7.5/7.6/7.7 or older (if you're using RHEL rather than CentOS, please consult your RHEL technical support before doing so).
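One hedged way to check whether a given netvsc fix has been backported into the running RHEL/CentOS kernel is to grep the kernel RPM's changelog (the exact changelog wording varies by distro build):

```bash
# List netvsc-related entries in the installed kernel's RPM changelog.
rpm -q --changelog kernel-$(uname -r) | grep -i netvsc
```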
@dcui, can you clarify the relationship between tx_send_full and the send buffer?
The guest has two ways to send a packet to the host (the host then sends the packet to the network on behalf of the guest):

1) For a packet that's <= 6 KB, the guest copies the packet's data into a host/guest-shared buffer and tells the host the offset of the data.
2) If the packet is larger than 6 KB, or if 1) fails (i.e. the shared buffer is full because the host is not fetching data from it fast enough; in this case tx_send_full is incremented by 1), the guest tells the host the physical address and length of the packet, and the host sends it from there.

Method 2) avoids an extra copy of the data, but the host has to map/unmap the guest's physical address, so it also has a cost, and we only use it for big packets (> 6 KB).

So typically a non-zero tx_send_full is not itself an error, but if tx_send_full grows very large, it might mean: 1) there may be a guest bug (the guest is not managing the shared buffer correctly); or 2) the host is very busy and cannot fetch data from the buffer fast enough; or 3) the host is not managing the buffer correctly. So far we don't know which one is more likely based on the existing info.

I suggest you dump the NIC's state (please refer to my first reply on Feb 5) in your CentOS/RHEL 7.7 VM by checking the files in the channels directory of the affected NIC, and please also share the full output of "ethtool -S eth0" (here I assume eth0 is the affected NIC in your VM). With that info we should have a better understanding of any possible guest bug; a quick way to watch the counter over time is sketched below. Please also try LIS 4.3.4, which integrates fixes for all the known issues we're aware of.

Having said all that, I don't know why you see the timeouts sometimes and the retransmission of the SYN packets. IMO that may be caused by a bug in the guest NIC driver, or by some issue in the local/remote applications, the bridge drivers, or the intermediate network link.
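A minimal sketch for sampling the counter over time, to see whether tx_send_full is merely non-zero or actively growing under load (eth0 is an assumption; substitute the affected NIC):

```bash
#!/bin/bash
# Print tx_send_full (plus tx_packets for context) once per second.
while sleep 1; do
    printf '%s' "$(date +%T)"
    ethtool -S eth0 | awk '/tx_send_full|tx_packets/ {printf " %s %s", $1, $2}'
    echo
done
```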
This time I did not see the issue.
Below is the log of the eth0 channel status:
Also, I am a bit confused by the SYN retransmission symptom. IMO, if the output of tcpdump shows the SYN packet leaving eth0, that should rule out any issue on the guest VM, since the packet has already left the NIC. If there is a retransmission, it basically means the packet was lost somewhere, and that has nothing to do with the guest.
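For reference, a capture filter along these lines shows the outgoing SYNs (this exact invocation is an assumption about how the trace was taken; repeated SYNs to the same destination with no SYN-ACK back indicate the loss):

```bash
# Show TCP packets with SYN set and ACK clear (i.e. initial SYNs and
# their retransmissions) leaving eth0.
sudo tcpdump -ni eth0 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'
```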
The output of "ethtool -S" is normal. The info in channels/ is also normal: it shows both the guest-to-host ring and the host-to-guest ring are empty, so I don't see an issue in the hv_netvsc NIC driver. As for the SYN retransmission symptom, I suspect some packets are being lost somewhere outside of the guest.
FYI: I'm not sure if Azure drops some SYN packets due to SNAT port exhaustion.
@dcui Hi, actually in our China East 2 region setup, outbound traffic is routed through an Azure Firewall with 4 public IPs precisely to avoid SNAT port exhaustion. Plus, since we are just beginning to deploy apps in the East 2 region, there is not much outbound traffic.
Does the symptom go away if you install the LIS 4.3.4 package? If that doesn't work either, then it's even less likely to be a guest issue, and I suspect we cannot debug it further from within the guest. You probably need to open a support ticket so Azure support can get involved and help check whether any packets are being dropped outside of the guest. BTW, the two bugs I mentioned previously only happen under very heavy network workloads; I really don't think you would hit them just by running "curl https://public_url" in a for loop 100 times.
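For reference, a minimal form of that workload ("public_url" is a placeholder carried over from the discussion above):

```bash
# Run 100 sequential requests and report any that time out or fail.
for i in $(seq 1 100); do
    curl -sS -o /dev/null --max-time 10 "https://public_url" \
        || echo "request $i failed"
done
```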
I have installed LIS 4.3.4 and the alerts seem to have dropped a lot, so the issue might be fixed.
Closing this issue since we did not see it again.
Original issue:
Hi there,
I am tracing a production network issue in the Azure cloud. We see random outbound timeouts on some of the VMs; on those VMs, tcpdump shows a lot of SYN retransmissions to external hosts.
When we look at the output of "ethtool -S eth0", the only abnormal stat is tx_send_full.
I traced the kernel code, and this seems to be where the counter comes from. So, can you explain what this metric is and why it increases?