Communication fails in underlay mode when more than 2 KB of data is requested through a Service #2825

Closed
Git4Mark opened this issue May 17, 2023 · 15 comments · Fixed by #2834

Labels
question Further information is requested

Comments

@Git4Mark

Expected Behavior

The underlay network imposes no limit on data size.

Actual Behavior

The underlay network limits the data size.

Steps to Reproduce the Problem

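1. Install Kubernetes.
2. Deploy kube-ovn 1.11.3 in underlay mode.
3. Create an Nginx pod and a Service to test the network.
4. From a host, requests to the Nginx service via the pod IP succeed. Requests from the same host via the Service IP get no response once the file size reaches 2 KB.
5. A packet capture on the br-provider interface of the host running the target pod shows only the packets that ACK the HTTP request. No packets carrying data are sent.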
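
Step 4 can be sketched as two small helpers: one creates a test file of an exact size for Nginx to serve, the other fetches a URL with a timeout and reports how many bytes arrived. The function names and the default timeout are illustrative assumptions, not part of the original report.

```bash
# Sketch only: the helper names and the timeout are hypothetical choices.

# Create a test file of exactly N bytes (to be served by Nginx).
make_test_file() {
    # $1 = size in bytes, $2 = output path
    head -c "$1" /dev/zero > "$2"
}

# Fetch a URL and print the number of bytes downloaded,
# or "failed" if curl errors out or hits the timeout.
probe_url() {
    # $1 = URL, $2 = timeout in seconds (default 5)
    size=$(curl -s -o /dev/null --max-time "${2:-5}" -w '%{size_download}' "$1") || {
        echo "failed"
        return 1
    }
    echo "$size"
}
```

Running `probe_url "http://<pod-ip>/<file>"` and `probe_url "http://<svc-ip>/<file>"` for files on either side of 2 KB should show whether only the Service path stalls.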

Additional Info

- Kubernetes version (output of `kubectl version`):

WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.4", GitCommit:"95ee5ab382d64cfe6c28967f36b53970b8374491", GitTreeState:"clean", BuildDate:"2022-08-17T18:54:23Z", GoVersion:"go1.18.5", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.4", GitCommit:"95ee5ab382d64cfe6c28967f36b53970b8374491", GitTreeState:"clean", BuildDate:"2022-08-17T18:47:37Z", GoVersion:"go1.18.5", Compiler:"gc", Platform:"linux/amd64"}


- kube-ovn version: 1.11.3

- Operating system/kernel version (output of `awk -F '=' '/PRETTY_NAME/ { print $2 }' /etc/os-release` and `uname -r`):

"Red Hat Enterprise Linux 8.2 (Ootpa)"
4.18.0-193.el8.x86_64

- OVN installation parameters:
```bash
IPV6=${IPV6:-false}
DUAL_STACK=${DUAL_STACK:-false}
ENABLE_SSL=${ENABLE_SSL:-false}
ENABLE_VLAN=${ENABLE_VLAN:-true}
CHECK_GATEWAY=${CHECK_GATEWAY:-true}
LOGICAL_GATEWAY=${LOGICAL_GATEWAY:-true}
U2O_INTERCONNECTION=${U2O_INTERCONNECTION:-false}
ENABLE_MIRROR=${ENABLE_MIRROR:-true}
VLAN_NIC=${VLAN_NIC:-myeth0}
HW_OFFLOAD=${HW_OFFLOAD:-false}
ENABLE_LB=${ENABLE_LB:-true}
ENABLE_NP=${ENABLE_NP:-true}
ENABLE_EIP_SNAT=${ENABLE_EIP_SNAT:-false}
LS_DNAT_MOD_DL_DST=${LS_DNAT_MOD_DL_DST:-true}
ENABLE_EXTERNAL_VPC=${ENABLE_EXTERNAL_VPC:-true}
CNI_CONFIG_PRIORITY=${CNI_CONFIG_PRIORITY:-01}
ENABLE_LB_SVC=${ENABLE_LB_SVC:-false}
ENABLE_KEEP_VM_IP=${ENABLE_KEEP_VM_IP:-true}

EXCHANGE_LINK_NAME=${EXCHANGE_LINK_NAME:-true}

POD_CIDR="10.50.0.0/18" # Do NOT overlap with NODE/SVC/JOIN CIDR
POD_GATEWAY="10.50.0.1"
SVC_CIDR="10.50.128.0/18" # Do NOT overlap with NODE/POD/JOIN CIDR
JOIN_CIDR="10.50.64.0/18" 

VLAN_ID="0"
```

[image]

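@Git4Mark
Author

Further testing shows that requests for files larger than 2 KB through the Service succeed from the host where the pod is running, but fail from other hosts.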

@oilbeater
Collaborator

Check whether the MTU is misconfigured anywhere along the path.
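
One way to collect the MTU of every interface on a node is to parse the one-line output of `ip -o link show`. The helper below is a sketch; the name `list_mtus` is made up for illustration, and it only reads text from stdin.

```bash
# Reads `ip -o link show` output on stdin and prints "<interface> <mtu>" per line.
list_mtus() {
    awk '{
        name = $2
        sub(/:$/, "", name)      # drop the trailing colon
        sub(/@.*$/, "", name)    # drop the "@peer" suffix on veth devices
        for (i = 3; i < NF; i++) {
            if ($i == "mtu") { print name, $(i + 1); break }
        }
    }'
}

# Usage on a node:
#   ip -o link show | list_mtus
```

Comparing the result on the source host and on the host running the pod makes a mismatch easy to spot.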

@oilbeater added the question label on May 18, 2023
@Git4Mark
Author

@oilbeater br-provider and myeth0 are both 1500, ovn0 is 1400, and genev_sys_6081 is 65000.

> Check whether the MTU is misconfigured anywhere along the path.

ovn0 mtu 1400
myeth0 mtu 1500
br-provider mtu 1500
genev_sys_6081 mtu 1500
pod eth0 mtu 1500
pod peer veth mtu 1500

@Git4Mark
Author

@oilbeater Correction: genev_sys_6081 mtu is 65000. I mistyped it above, sorry.

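@Git4Mark
Author

In addition, accessing the service from other Kubernetes nodes via the NodePort on the IP of the host running the pod works fine, but accessing the NodePort from hosts outside the Kubernetes cluster still hits the 2 KB limit.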

@Git4Mark
Author

[image] Packet capture on br-provider of the host running the pod.
[image] Packet capture on the source host.

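@Git4Mark
Author

@oilbeater To summarize, only Service access is affected. Access via the pod IP works. Access via the Service IP hits the 2 KB limit. Access via NodePort from outside the cluster hits the 2 KB limit. From a cluster node, NodePort access through the IP of the host running the target pod works, but NodePort access through the IP of any other host also hits the 2 KB limit.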

@Git4Mark
Author

@oilbeater When pinging the pod IP from the host, the largest payload that gets through is 1372 bytes.
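
The 1372-byte figure lines up with the ovn0 MTU of 1400 reported earlier, assuming an IPv4 header without options (20 bytes) and an ICMP echo header (8 bytes). A small worked calculation, with helper names chosen only for illustration:

```bash
# IPv4 header without options (20 bytes) plus ICMP echo header (8 bytes).
ICMP_OVERHEAD=28

# Largest ping payload (the -s value) that fits a given MTU without fragmentation.
max_ping_payload() {
    echo $(( $1 - ICMP_OVERHEAD ))
}

# MTU implied by the largest ping payload that got through.
mtu_from_ping_payload() {
    echo $(( $1 + ICMP_OVERHEAD ))
}

mtu_from_ping_payload 1372   # prints 1400
max_ping_payload 1500        # prints 1472
```

On Linux, `ping -M do -s 1372 <pod-ip>` sends such a probe with fragmentation prohibited, so the largest size that succeeds reflects the smallest MTU on the path.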

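@Git4Mark
Author

@oilbeater
I tried adjusting the MTU of kube-ovn-cni. In practice the largest value that works is 1442. With anything larger, requests made directly to the pod IP also become size-limited, and the largest file an HTTP request can fetch is 1390 bytes. Raising the MTU also increases the amount of data that can be requested through the Service while decreasing the amount that can be requested through the pod IP. I don't know how to tune this.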
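
One possible reading of the 1442 ceiling is that the traffic is being Geneve-encapsulated: with a 1500-byte physical MTU and OVN's Geneve encapsulation over IPv4, 1442 bytes is what remains for the inner packet. The sketch below works through that arithmetic. The per-header sizes are stated assumptions (IPv4 outer header without options, one OVN metadata option), and the function names are illustrative.

```bash
# Assumed per-packet overhead of OVN's Geneve encapsulation over IPv4.
OUTER_IPV4=20      # outer IPv4 header, no options
OUTER_UDP=8        # outer UDP header
GENEVE_BASE=8      # fixed Geneve header
OVN_OPTION=8       # one Geneve option with OVN metadata (4-byte header + 4 bytes of data)
INNER_ETHERNET=14  # encapsulated Ethernet header

geneve_overhead() {
    echo $(( OUTER_IPV4 + OUTER_UDP + GENEVE_BASE + OVN_OPTION + INNER_ETHERNET ))
}

# Largest inner (pod-side) MTU that still fits the physical MTU once encapsulated.
max_inner_mtu() {
    echo $(( $1 - $(geneve_overhead) ))
}

geneve_overhead      # prints 58
max_inner_mtu 1500   # prints 1442
```

If this reading is right, a pod MTU above 1442 would produce encapsulated packets larger than the 1500-byte NIC MTU whenever the path goes through the tunnel.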

@oilbeater
Collaborator

@zhangzujian Please take a look. This is probably related to logical_gateway being enabled.

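@Git4Mark
Author

> @zhangzujian Please take a look. This is probably related to logical_gateway being enabled.

I tested this: with a physical gateway there is no problem. However, my network is restricted and I cannot use the physical network as the pod subnet. How should I handle this with a logical gateway?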

@zhangzujian
Member

Check the routes on the node with `ip route`.

@Git4Mark
Author

Git4Mark commented May 19, 2023

Source host: [image 1684465836451]
Host of the target pod: [image 1684465877658]

> Check the routes on the node with `ip route`.

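@Git4Mark
Author

I still think the cause is that the OVN MTU is below 1500. The egress NIC on the source host has an MTU of 1500, but the ovn NIC on the destination host has an MTU of 1400. In principle, underlay mode has no tunnel encapsulation, so it should be possible to set the MTU to 1500. For some reason, though, once the MTU is set above 1442, communication between the host and the pod breaks.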

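@zhangzujian
Member

The node and the pod are in different subnets, so the traffic goes through the tunnel. Either move the underlay network to a separate, dedicated NIC with its MTU set to 1400, or switch to overlay.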
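
Whether a node address falls inside the pod subnet can be checked with plain integer arithmetic. The sketch below uses the `POD_CIDR` from the install parameters above. The node address `192.168.1.10` is a hypothetical placeholder because the issue does not state the node IPs, and the helper names are illustrative.

```bash
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeeds if the IPv4 address $1 lies inside the CIDR $2.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# 192.168.1.10 is a hypothetical node IP; 10.50.0.0/18 is POD_CIDR from this issue.
if in_cidr 192.168.1.10 10.50.0.0/18; then
    echo "node and pods share a subnet"
else
    echo "node and pods are in different subnets"
fi
```

When the check reports different subnets, the situation matches the one described in this comment, where node-to-pod traffic is carried over the tunnel.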

@zhangzujian linked a pull request on May 19, 2023 that will close this issue
@zhangzujian removed a link to a pull request on May 19, 2023
@zhangzujian linked a pull request on May 19, 2023 that will close this issue