
K8s install: when deploying in hostNetwork mode, support configuring the listening ports to avoid conflicts #734

Closed
johnlanni opened this issue Dec 25, 2023 · 8 comments · Fixed by #829

Comments

@johnlanni
Collaborator

This isn't configurable yet; I'll open an issue to track it.

Originally posted by @johnlanni in #382 (comment)

@Uncle-Justice
Contributor

I'd like to claim this issue.

  • Proposed approach (a sketch follows this list):

Add containerHttpPort, hostHttpPort, containerHttpsPort, and hostHttpsPort to the Helm values.yaml, and map them to the corresponding fields in the gateway's Deployment configuration.

  • Planned verification:
  1. iptables
  2. curl
  3. Run part of the gateway e2e tests
  • Other issues the PR touches:
  1. When hostNetwork=true (default false), the pod shares the host's network namespace, so replicas can only be 1 (the current default is 2); otherwise multiple gateway pods may conflict over ports.
  2. When hostNetwork=true, the hostPort value appears to be ignored and only containerPort actually takes effect. I have not yet tried hostNetwork=true with hostPort set to a value different from containerPort; reports online suggest hostPort is simply overridden.
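To make the proposal concrete, here is a minimal sketch of the intended wiring, assuming the value names above and an illustrative template layout (not the chart's actual structure):

# values.yaml (proposed additions)
gateway:
  containerHttpPort: 80
  hostHttpPort: 80
  containerHttpsPort: 443
  hostHttpsPort: 443

# gateway deployment template (illustrative excerpt)
ports:
  - name: http
    containerPort: {{ .Values.gateway.containerHttpPort }}
    hostPort: {{ .Values.gateway.hostHttpPort }}
  - name: https
    containerPort: {{ .Values.gateway.containerHttpsPort }}
    hostPort: {{ .Values.gateway.hostHttpsPort }}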

@johnlanni removed the "help wanted" label Jan 18, 2024
@johnlanni
Collaborator Author

@Uncle-Justice In hostNetwork mode, whatever port the container listens on is exactly what gets exposed on the host, so only httpPort and httpsPort need to be configurable. That said, the code may have some hardcoded dependencies on 80 and 443; please check for those during implementation.
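For reference, this is plain Kubernetes behavior rather than anything Higress-specific; a minimal sketch (hypothetical pod, not from the chart):

apiVersion: v1
kind: Pod
metadata:
  name: hostnet-demo            # hypothetical example
spec:
  hostNetwork: true             # pod shares the node's network namespace
  containers:
    - name: envoy
      image: envoyproxy/envoy:v1.27.0
      ports:
        - containerPort: 8080   # whatever the process binds is bound on the node itself
          hostPort: 8080        # under hostNetwork, hostPort must equal containerPort
                                # (the API server rejects a mismatch), so it adds nothing

This matches the observation in the list above that hostPort has no independent effect when hostNetwork=true.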

@Uncle-Justice
Contributor

@johnlanni In hostNetwork mode, is the gateway only reachable via node-ip:node-port? I tried it: with the gateway container in hostNetwork mode and both container port and host port set to 80, after the K8s install a curl against that port failed to connect.

I also went onto the node to inspect the port mappings, and only saw the DNAT rules generated by the gateway's LoadBalancer, whereas what I expected under hostNetwork was a node-ip:node-port → pod-ip:pod-port mapping:

root@higress-control-plane:/#  iptables -nvL -t nat  | grep DNAT
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            172.19.0.1           tcp dpt:53 to:127.0.0.11:34179
   98  8216 DNAT       udp  --  *      *       0.0.0.0/0            172.19.0.1           udp dpt:53 to:127.0.0.11:41293
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-gateway:http2 */ tcp to:172.19.0.2:80
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-controller:http */ tcp to:10.244.0.5:8888
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-controller:grpc */ tcp to:10.244.0.5:15051
    1    60 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-controller:https-dns */ tcp to:10.244.0.5:15012
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp */ tcp to:10.244.0.2:53
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* default/kubernetes:https */ tcp to:172.19.0.2:6443
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-controller:http-monitoring */ tcp to:10.244.0.5:15014
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:metrics */ tcp to:10.244.0.2:9153
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:metrics */ tcp to:10.244.0.4:9153
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-controller:grpc-xds */ tcp to:10.244.0.5:15010
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns-tcp */ tcp to:10.244.0.4:53
   32  3255 DNAT       udp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns */ udp to:10.244.0.4:53
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-controller:https-webhook */ tcp to:10.244.0.5:15017
   39  3939 DNAT       udp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* kube-system/kube-dns:dns */ udp to:10.244.0.2:53
    0     0 DNAT       tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            /* higress-system/higress-gateway:https */ tcp to:172.19.0.2:443

Could the cause be the LoadBalancer itself, or something else?


The gateway pod's details look fine; its IP does match the node IP:

root@higress-control-plane:/# kubectl describe pod higress-gateway -n higress-system
Name:             higress-gateway-7cbb7cd79d-gzfql
Namespace:        higress-system
Priority:         0
Service Account:  higress-gateway
Node:             higress-control-plane/172.19.0.2
Start Time:       Sun, 04 Feb 2024 05:37:55 +0000
Labels:           app=higress-gateway
                  higress=higress-system-higress-gateway
                  pod-template-hash=7cbb7cd79d
                  sidecar.istio.io/inject=false
Annotations:      prometheus.io/path: /stats/prometheus
                  prometheus.io/port: 15020
                  prometheus.io/scrape: true
                  sidecar.istio.io/inject: false
Status:           Running
IP:               172.19.0.2
IPs:
  IP:           172.19.0.2
Controlled By:  ReplicaSet/higress-gateway-7cbb7cd79d
Containers:
  higress-gateway:
    Container ID:  containerd://85054bb67dc88d63502c7905799790ad95b8f6fb683878a5b64c11fca26406fc
    Image:         higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:sha-87c39d3
    Image ID:      higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway@sha256:6184a7d584f4bc59cc9de37afcdafa0a727921a886db64bfaccff705c7cafa3a
    Ports:         15090/TCP, 80/TCP, 443/TCP
    Host Ports:    15090/TCP, 80/TCP, 443/TCP
    Args:
      proxy
      router
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --proxyLogLevel=warning
      --proxyComponentLogLevel=misc:error
      --log_output_level=all:info
      --serviceCluster=higress-gateway
    State:          Running
      Started:      Sun, 04 Feb 2024 05:42:40 +0000
    Ready:          True
    Restart Count:  0
    Readiness:      http-get http://:15021/healthz/ready delay=1s timeout=3s period=2s #success=1 #failure=30
    Environment:
      NODE_NAME:                    (v1:spec.nodeName)
      POD_NAME:                    higress-gateway-7cbb7cd79d-gzfql (v1:metadata.name)
      POD_NAMESPACE:               higress-system (v1:metadata.namespace)
      INSTANCE_IP:                  (v1:status.podIP)
      HOST_IP:                      (v1:status.hostIP)
      SERVICE_ACCOUNT:              (v1:spec.serviceAccountName)
      PILOT_XDS_SEND_TIMEOUT:      60s
      PROXY_XDS_VIA_AGENT:         true
      ENABLE_INGRESS_GATEWAY_SDS:  false
      JWT_POLICY:                  third-party-jwt
      ISTIO_META_HTTP10:           1
      ISTIO_META_CLUSTER_ID:       Kubernetes
      INSTANCE_NAME:               higress-gateway
    Mounts:
      /etc/istio/config from config (rw)
      /etc/istio/pod from podinfo (rw)
      /etc/istio/proxy from proxy-socket (rw)
      /var/lib/istio/data from istio-data (rw)
      /var/run/secrets/istio from istio-ca-root-cert (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-fhnwr (ro)
      /var/run/secrets/tokens from istio-token (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  istio-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  43200
  istio-ca-root-cert:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      higress-ca-root-cert
    Optional:  false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      higress-config
    Optional:  false
  istio-data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  proxy-socket:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  podinfo:
    Type:  DownwardAPI (a volume populated by information about the pod)
    Items:
      metadata.labels -> labels
      metadata.annotations -> annotations
      requests.cpu -> cpu-request
      limits.cpu -> cpu-limit
  kube-api-access-fhnwr:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason       Age                From               Message
  ----     ------       ----               ----               -------
  Normal   Scheduled    48m                default-scheduler  Successfully assigned higress-system/higress-gateway-7cbb7cd79d-gzfql to higress-control-plane
  Warning  FailedMount  47m (x7 over 48m)  kubelet            MountVolume.SetUp failed for volume "istio-ca-root-cert" : configmap "higress-ca-root-cert" not found
  Normal   Pulling      47m                kubelet            Pulling image "higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:sha-87c39d3"
  Normal   Pulled       43m                kubelet            Successfully pulled image "higress-registry.cn-hangzhou.cr.aliyuncs.com/higress/gateway:sha-87c39d3" in 3m40.802395633s
  Normal   Created      43m                kubelet            Created container higress-gateway
  Normal   Started      43m                kubelet            Started container higress-gateway

@johnlanni
Collaborator Author

Once hostNetwork is enabled, the container listens on ports directly on the host. Did you run netstat -lntp on the host and not see Envoy listening on 80/443? If not, could it be that no Ingress was created, or that something there is misconfigured?

@Uncle-Justice
Contributor

@johnlanni After creating an Ingress for the gateway container's Service, I did observe Envoy listening on 80/443 on the host. However, if I change the gateway container's listening port to something other than 80, netstat shows Envoy still listening on 80.

So is Higress's logic that the envoy gateway only gets triggered into working order after higress-controller observes the creation of an Ingress resource, rather than as soon as the gateway container starts successfully?

If Envoy actually listens on port 80 no matter which listening port is configured on the gateway container, then the hardcoding you mentioned earlier probably lives mainly in the part of the controller that bridges Ingress and Envoy?

@johnlanni
Collaborator Author

Higress controls which ports the data plane listens on by converting Ingress resources into the Istio gateway API. Take a look at the convertGateway logic in the Ingress conversion implementation.
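For orientation, the conversion conceptually yields an Istio Gateway along these lines (an illustrative sketch, not verbatim output; the name and selector are assumptions). The servers[].port.number fields are where the 80/443 values end up, so that is the spot this issue wants to make configurable:

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: higress-gateway           # illustrative
  namespace: higress-system
spec:
  selector:
    higress: higress-system-higress-gateway
  servers:
    - port:
        number: 80                # currently fixed in convertGateway
        name: http
        protocol: HTTP
      hosts:
        - "*"
    - port:
        number: 443
        name: https
        protocol: HTTPS
      hosts:
        - "*"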

@Uncle-Justice
Contributor

After manually changing the ports in the ingressv1.convertGateway function, I can indeed observe (via netstat) that the ports higress-gateway actually listens on change accordingly.

But for a K8s install, how should the data-plane port configuration be kept consistent with the listening ports declared in the higress-gateway pod's container YAML? Would we need to read the gateway pod's configured ports at the Istio layer? My impression is that Istio is mainly responsible for pushing new configuration down to the data plane.

@johnlanni
Collaborator Author

@Uncle-Justice Control the listening ports via environment variables: set those environment variables in the Helm template from the Helm parameters, and keep them consistent with the ports in the YAML.
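A minimal sketch of that wiring, with hypothetical names (GATEWAY_HTTP_PORT/GATEWAY_HTTPS_PORT and .Values.gateway.httpPort/httpsPort are placeholders; the final names were settled in the PR):

# values.yaml
gateway:
  httpPort: 80
  httpsPort: 443

# gateway deployment template (excerpt)
env:
  - name: GATEWAY_HTTP_PORT                # read by the gateway to pick its listeners
    value: "{{ .Values.gateway.httpPort }}"
  - name: GATEWAY_HTTPS_PORT
    value: "{{ .Values.gateway.httpsPort }}"
ports:
  - name: http
    containerPort: {{ .Values.gateway.httpPort }}    # same source of truth as the env var
  - name: https
    containerPort: {{ .Values.gateway.httpsPort }}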
