
help request: Setting the upstream discovery type to kubernetes does not work #7026

Closed
sweetpotatoman opened this issue May 11, 2022 · 34 comments

@sweetpotatoman

Description

Setting the upstream discovery type to kubernetes does not work.

We installed APISIX using Helm.

  • apisix - config.yaml
...
discovery:
  kubernetes: { }
...
  • upstream - config

discovery_type: kubernetes
service_name: testnet/xxx-reader:http-80

{
  "timeout": {
    "connect": 6,
    "send": 6,
    "read": 6
  },
  "type": "roundrobin",
  "scheme": "http",
  "discovery_type": "kubernetes",
  "pass_host": "pass",
  "name": "xxx-reader",
  "desc": "xxx-reader",
  "service_name": "testnet/xxx-reader:http-80",
  "keepalive_pool": {
    "idle_timeout": 60,
    "requests": 1000,
    "size": 320
  }
}
  • route - config
{
  "uri": "/explorer",
  "name": "onemore",
  "desc": "time",
  "methods": [
    "POST"
  ],
  "host": "test-explorer-api.xxx.io",
  "upstream_id": "407087917031228312", // upstream_name is 'xxx-reader'
  "labels": {
    "app": "one"
  },
  "status": 1
}
  • kubernetes - endpoints

[screenshot of the Kubernetes endpoints]

  • apisix - log
2022/05/11 04:08:42 [error] 388#388: *80384 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 172.20.17.238, server: _, request: "POST /explorer HTTP/1.1"

When the log returns no valid upstream node: nil, we are not quite sure what's wrong.
Why are we still unable to find an upstream node with this configuration?

Environment

  • APISIX version (run apisix version): 2.13.1
  • Operating system (run uname -a): Linux apisix-86b8f89954-m9qbp 5.4.141-67.229.amzn2.x86_64 #1 SMP Mon Aug 16 12:51:43 UTC 2021 x86_64 Linux
  • OpenResty / Nginx version (run openresty -V or nginx -V): openresty/1.19.9.1
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): 3.4.18
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):
@tzssangglass
Member

When the log returns no valid upstream node: nil, we are not quite sure what's wrong.
Why are we still unable to find an upstream node with this configuration?

Are there any other error logs?
Or you can adjust the log level to debug to get more logs.

This is usually the case when APISIX is unable to fetch valid data from k8s.

@sweetpotatoman
Author

When the log returns no valid upstream node: nil, we are not quite sure what's wrong.
Why are we still unable to find an upstream node with this configuration?

Are there any other error logs? Or you can adjust the log level to debug to get more logs.

This is usually the case when APISIX is unable to fetch valid data from k8s.

The log level is set to debug now.

...
nginx_config:
  error_log_level: "debug"
...

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] radixtree.lua:564: match_route_opts(): hosts match: true, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:388: http_access_phase(): matched route: {"clean_handlers":{},"modifiedIndex":177,"createdIndex":176,"value":{"labels":{"app":"one"},"create_time":1652326626,"status":1,"name":"graphiql","priority":0,"uri":"\/explorer\/graphiql","desc":"123","methods":["GET","POST"],"upstream_id":"407087917031228312","host":"xxx-api.xxx.io","id":"407345815439279060","update_time":1652326641},"key":"\/apisix\/routes\/407345815439279060","orig_modifiedIndex":177,"update_count":0,"has_domain":false}, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:305: get_upstream_by_id(): parsed upstream: {"clean_handlers":{},"createdIndex":79,"value":{"create_time":1652172906,"discovery_type":"kubernetes","update_time":1652241806,"service_name":"testnet\/explorer-reader:http-80","hash_on":"vars","scheme":"http","type":"roundrobin","pass_host":"pass","keepalive_pool":{"idle_timeout":60,"size":320,"requests":1000},"timeout":{"read":6,"send":6,"connect":6},"name":"explorer-reader","desc":"explorer-reader","id":"407087917031228312","parent":{"clean_handlers":"table: 0x7fdd4917e0b0","createdIndex":79,"value":"table: 0x7fdd4cad5c70","key":"\/apisix\/upstreams\/407087917031228312","modifiedIndex":162,"has_domain":false}},"key":"\/apisix\/upstreams\/407087917031228312","modifiedIndex":162,"has_domain":false}, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:55 [error] 43#43: *35664 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"
2022/05/12 03:37:57 [info] 45#45: *37332 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 172.20.29.104, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
2022/05/12 03:37:57 [info] 45#45: *37332 [lua] init.lua:383: http_access_phase(): not find any matched route, client: 172.20.29.104, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
2022/05/12 03:37:57 [info] 46#46: *37336 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 172.20.4.221, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
2022/05/12 03:37:57 [info] 46#46: *37336 [lua] init.lua:383: http_access_phase(): not find any matched route, client: 172.20.4.221, server: _, request: "GET / HTTP/1.1", host: "172.20.61.219:9080"
172.20.4.221 - - [12/May/2022:03:37:55 +0000] xxx-api.xxx.io "GET /explorer/graphiql HTTP/1.1" 503 596 0.000 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36" - - - "http://xxx-api.xxx.io"

@zhixiongdu027
Contributor

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"

The log message shows that Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

Can you provide the ServiceAccount information used by the pod where APISIX is located?
We need to make sure this ServiceAccount has permission to list and watch Endpoints.
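For reference, a quick sketch of how to check those permissions with kubectl auth can-i (the namespace and ServiceAccount name below are placeholders; substitute the ones used by your APISIX pod):

# Replace <namespace> and <name> with the APISIX pod's ServiceAccount.
kubectl auth can-i list endpoints --as=system:serviceaccount:<namespace>:<name>
kubectl auth can-i watch endpoints --as=system:serviceaccount:<namespace>:<name>
# Both commands should print "yes"; a "no" means the RBAC binding is missing or wrong.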

@sweetpotatoman
Author

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"

The log message shows that Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

Can you provide the ServiceAccount information used by the pod where APISIX is located? We need to make sure this ServiceAccount has permission to list and watch Endpoints.

Yes, I know Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

My understanding is that APISIX is deployed in k8s.

It is configured with:

...
discovery:
  kubernetes: { }
...

I haven't actually verified the ServiceAccount; I'll try it.

@sweetpotatoman
Author

sweetpotatoman commented May 12, 2022

log message

2022/05/12 03:37:55 [info] 43#43: *35664 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT testnet/explorer-reader, client: 172.20.4.221, server: _, request: "GET /explorer/graphiql HTTP/1.1", host: "xxx-api.xxx.io"

The log message shows that Kubernetes discovery did not get the "testnet/explorer-reader" endpoints value from k8s.

Can you provide the ServiceAccount information used by the pod where APISIX is located? We need to make sure this ServiceAccount has permission to list and watch Endpoints.

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: apisix
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: apisix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: apisix
subjects:
  - kind: ServiceAccount
    name: default
    namespace: apisix

Not working.

@zhixiongdu027
Contributor

Did you see any other logs printed by Kubernetes discovery?
In debug log mode, Kubernetes discovery will print the endpoints information it receives:

core.log.debug(core.json.delay_encode(endpoint))

@sweetpotatoman
Author

Did you see any other logs printed by Kubernetes discovery? In debug log mode, Kubernetes discovery will print the endpoints information it receives:

core.log.debug(core.json.delay_encode(endpoint))

Debug mode is turned on, but those debug lines do not show up.

@zhixiongdu027
Contributor

zhixiongdu027 commented May 13, 2022

Kubernetes discovery also prints log information at other execution points besides the one mentioned above, for example:

core.log.info("--raw=", informer.path, "?", list_query(informer))

core.log.info("--raw=", informer.path, "?", watch_query(informer))

core.log.info("begin to connect ", apiserver.host, ":", apiserver.port)

core.log.error("list failed, kind: ", informer.kind,

core.log.error("watch failed, kind: ", informer.kind,

Can you check whether any of these appear in your logs? Otherwise it is difficult to locate the cause of the problem.
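For example, a rough way to pull those lines out of the log (a sketch; the pod name and namespace are placeholders, and error_log is assumed to go to stderr as in the configs shown in this thread):

kubectl logs <apisix-pod> -n <namespace> | grep -E -- '--raw=|begin to connect|list failed|watch failed'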

@huangyutongs

I also had the same problem

apisix daemonset configuration file

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: apisix
  namespace: default
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: apisix
      app.kubernetes.io/name: apisix
  template:
    metadata:
      annotations:
        checksum/config: 7fcdf2496b815f03e6da46a2b4f9ccf62862af978f875828bb86d12f75c94107
      labels:
        app.kubernetes.io/instance: apisix
        app.kubernetes.io/name: apisix
    spec:
      containers:
      - image: 192.168.101.30/devops/apache/apisix:2.13.1-alpine
        imagePullPolicy: IfNotPresent
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - sleep 30
        name: apisix
        ports:
        - containerPort: 80
          hostPort: 80
          name: http
          protocol: TCP
        - containerPort: 443
          hostPort: 443
          name: tls
          protocol: TCP
        - containerPort: 9180
          hostPort: 9180
          name: admin
          protocol: TCP
        readinessProbe:
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          successThreshold: 1
          tcpSocket:
            port: 80
          timeoutSeconds: 1
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /usr/local/apisix/conf/config.yaml
          name: apisix-config
          subPath: config.yaml
        - mountPath: /etc/localtime
          name: timezone
          readOnly: true
      dnsPolicy: ClusterFirstWithHostNet
      hostNetwork: true
      initContainers:
      - command:
        - sh
        - -c
        - until nc -z apisix-etcd.default.svc.cluster.local 2379; do echo waiting
          for etcd `date`; sleep 2; done;
        image: busybox:1.28
        imagePullPolicy: IfNotPresent
        name: wait-etcd
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: apisix
      serviceAccountName: apisix
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      volumes:
      - configMap:
          defaultMode: 420
          name: apisix
        name: apisix-config
      - hostPath:
          path: /etc/localtime
          type: ""
        name: timezone
  updateStrategy:
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
    type: RollingUpdate

ServiceAccount

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: apisix
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: apisix
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: apisix
subjects:
  - kind: ServiceAccount
    name: apisix
    namespace: default
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: apisix

configmap

apiVersion: v1
data:
  config.yaml: |-
    #
    # Licensed to the Apache Software Foundation (ASF) under one or more
    # contributor license agreements.  See the NOTICE file distributed with
    # this work for additional information regarding copyright ownership.
    # The ASF licenses this file to You under the Apache License, Version 2.0
    # (the "License"); you may not use this file except in compliance with
    # the License.  You may obtain a copy of the License at
    #
    #     http://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.
    #
    apisix:
      node_listen: 80             # APISIX listening port
      enable_heartbeat: true
      enable_admin: true
      enable_admin_cors: true
      enable_debug: false

      enable_dev_mode: false                       # Sets nginx worker_processes to 1 if set to true
      enable_reuseport: true                       # Enable nginx SO_REUSEPORT switch if set to true.
      enable_ipv6: false # Enable nginx IPv6 resolver
      config_center: etcd                          # etcd: use etcd to store the config value
                                                   # yaml: fetch the config value from local yaml file `/your_path/conf/apisix.yaml`

      #proxy_protocol:                 # Proxy Protocol configuration
      #  listen_http_port: 9181        # The port with proxy protocol for http, it differs from node_listen and port_admin.
                                      # This port can only receive http request with proxy protocol, but node_listen & port_admin
                                      # can only receive http request. If you enable proxy protocol, you must use this port to
                                      # receive http request with proxy protocol
      #  listen_https_port: 9182       # The port with proxy protocol for https
      #  enable_tcp_pp: true           # Enable the proxy protocol for tcp proxy, it works for stream_proxy.tcp option
      #  enable_tcp_pp_to_upstream: true # Enables the proxy protocol to the upstream server

      proxy_cache:                     # Proxy Caching configuration
        cache_ttl: 10s                 # The default caching time if the upstream does not specify the cache time
        zones:                         # The parameters of a cache
        - name: disk_cache_one         # The name of the cache; the administrator can specify
                                      # which cache to use by name in the admin api
          memory_size: 50m             # The size of shared memory, it's used to store the cache index
          disk_size: 1G                # The size of disk, it's used to store the cache data
          disk_path: "/tmp/disk_cache_one" # The path to store the cache data
          cache_levels: "1:2"           # The hierarchy levels of a cache
      #  - name: disk_cache_two
      #    memory_size: 50m
      #    disk_size: 1G
      #    disk_path: "/tmp/disk_cache_two"
      #    cache_levels: "1:2"

      allow_admin:                  # http://nginx.org/en/docs/http/ngx_http_access_module.html#allow
        - 127.0.0.1/24
        - 0.0.0.0/0
      #   - "::/64"
      port_admin: 9180

      # Default token when use API to call for Admin API.
      # *NOTE*: Highly recommended to modify this value to protect APISIX's Admin API.
      # Disabling this configuration item means that the Admin API does not
      # require any authentication.
      admin_key:
        # admin: can everything for configuration data
        - name: "admin"
          key: edd1c9f034335f136f87ad84b625c8f1
          role: admin
        # viewer: only can view configuration data
        - name: "viewer"
          key: 4054f7cf07e344346cd3f287985e76a2
          role: viewer
      router:
        http: 'radixtree_uri'         # radixtree_uri: match route by uri(base on radixtree)
                                      # radixtree_host_uri: match route by host + uri(base on radixtree)
        ssl: 'radixtree_sni'          # radixtree_sni: match route by SNI(base on radixtree)
      stream_proxy:                 # TCP/UDP proxy
        only: false
        tcp:                        # TCP proxy port list
          - 9100
        udp:                        # UDP proxy port list
          - 9200
      dns_resolver_valid: 30
      resolver_timeout: 5
      ssl:
        enable: true
        enable_http2: true
        listen_port: 443
        ssl_protocols: "TLSv1 TLSv1.1 TLSv1.2 TLSv1.3"
        ssl_ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:DHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES256-SHA256:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA"
      control:
        ip: 127.0.0.1
        port: 9091

    discovery:
      kubernetes: { }
    nginx_config:                     # config for rendering the template to generate nginx.conf
      error_log: "/dev/stderr"
      error_log_level: "debug"         # warn,error
      worker_rlimit_nofile: 20480     # the number of files a worker process can open, should be larger than worker_connections
      event:
        worker_connections: 10620
      http:
        enable_access_log: true
        access_log: "/dev/stdout"
        access_log_format: "$remote_addr - $remote_user [$time_local] $http_host \"$request\" $status $body_bytes_sent $request_time \"$http_referer\" \"$http_user_agent\" $upstream_addr $upstream_status $upstream_response_time \"$upstream_scheme://$upstream_host$upstream_uri\""
        access_log_format_escape: default
        keepalive_timeout: 60s         # timeout during which a keep-alive client connection will stay open on the server side.
        client_header_timeout: 60s     # timeout for reading client request header, then 408 (Request Time-out) error is returned to the client
        client_body_timeout: 60s       # timeout for reading client request body, then 408 (Request Time-out) error is returned to the client
        send_timeout: 10s              # timeout for transmitting a response to the client.then the connection is closed
        underscores_in_headers: "on"   # default enables the use of underscores in client request header fields
        real_ip_header: "X-Real-IP"    # http://nginx.org/en/docs/http/ngx_http_realip_module.html#real_ip_header
        real_ip_from:                  # http://nginx.org/en/docs/http/ngx_http_realip_module.html#set_real_ip_from
          - 127.0.0.1
          - 'unix:'
      http_configuration_snippet: |-
        server_names_hash_bucket_size 128;
        proxy_buffer_size 128k;
        proxy_buffers 32 256k;
        proxy_busy_buffers_size 256k;

    etcd:
      host:                                 # it's possible to define multiple etcd hosts addresses of the same etcd cluster.
        - "http://apisix-etcd.default.svc.cluster.local:2379"
      prefix: "/apisix"     # apisix configurations prefix
      timeout: 30   # 30 seconds
    plugins:                          # plugin list
      - api-breaker
      - authz-keycloak
      - basic-auth
      - batch-requests
      - consumer-restriction
      - cors
      - echo
      - fault-injection
      - grpc-transcode
      - hmac-auth
      - http-logger
      - ip-restriction
      - ua-restriction
      - jwt-auth
      - kafka-logger
      - key-auth
      - limit-conn
      - limit-count
      - limit-req
      - node-status
      - openid-connect
      - authz-casbin
      - prometheus
      - proxy-cache
      - proxy-mirror
      - proxy-rewrite
      - redirect
      - referer-restriction
      - request-id
      - request-validation
      - response-rewrite
      - serverless-post-function
      - serverless-pre-function
      - sls-logger
      - syslog
      - tcp-logger
      - udp-logger
      - uri-blocker
      - wolf-rbac
      - zipkin
      - traffic-split
      - gzip
      - real-ip
      - ext-plugin-pre-req
      - ext-plugin-post-req
      - server-info
      - ldap-auth
    stream_plugins:
      - mqtt-proxy
      - ip-restriction
      - limit-conn

    plugin_attr:
      prometheus:
        export_uri: /apisix/prometheus/metrics
        metric_prefix: apisix_
        enable_export_server: true
        export_addr:
          ip: 127.0.0.1
          port: 9092
      server-info:
        report_ttl: 60
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
  name: apisix
  namespace: default

route

{
  "uri": "/*",
  "name": "asd",
  "host": "test.test.cn",
  "upstream": {
    "timeout": {
      "connect": 6,
      "send": 6,
      "read": 6
    },
    "type": "roundrobin",
    "scheme": "http",
    "discovery_type": "kubernetes",
    "pass_host": "pass",
    "service_name": "default/nginx:80",
    "keepalive_pool": {
      "idle_timeout": 60,
      "requests": 1000,
      "size": 320
    }
  },
  "status": 1
}

Error log generated for a single request

2022/09/23 16:22:41 [info] 45#45: *111554 [lua] route.lua:72: create_radixtree_uri_router(): insert uri route: {"id":"426665771179967242","host":"hyt.test.cn","methods":["GET","POST","PUT","DELETE","PATCH","HEAD","OPTIONS","CONNECT","TRACE"],"uri":"\/*","upstream":{"nodes":[{"weight":1,"port":9443,"host":"hub-console.hub-tenant"}],"type":"roundrobin","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"scheme":"https","parent":{"createdIndex":994,"has_domain":true,"modifiedIndex":994,"key":"\/apisix\/routes\/426665771179967242","value":{"id":"426665771179967242","host":"hyt.test.cn","methods":"table: 0x7f1839175cc8","uri":"\/*","upstream":"table: 0x7f1839141328","priority":0,"status":1,"update_time":1663842217,"name":"minio","create_time":1663842217},"clean_handlers":{},"update_count":0,"orig_modifiedIndex":994},"hash_on":"vars"},"priority":0,"status":1,"update_time":1663842217,"name":"minio","create_time":1663842217}, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] route.lua:72: create_radixtree_uri_router(): insert uri route: {"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":{"type":"roundrobin","keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"hash_on":"vars","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"discovery_type":"kubernetes","scheme":"http","parent":{"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":{"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":"table: 0x7f1833bed410","priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765},"clean_handlers":{},"update_count":0,"orig_modifiedIndex":1108},"service_name":"default\/nginx:80"},"priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765}, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] route.lua:94: create_radixtree_uri_router(): route items: [{"paths":"\/*","handler":"function: 0x7f1833b36678","priority":0,"hosts":"hyt.test.cn","methods":["GET","POST","PUT","DELETE","PATCH","HEAD","OPTIONS","CONNECT","TRACE"]},{"handler":"function: 0x7f1833b39a98","priority":0,"paths":"\/*","hosts":"test.test.cn"}], client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:346: pre_insert_route(): path: / operator: <=, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:234: insert_route(): insert route path: / dataprt: 1, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:346: pre_insert_route(): path: / operator: <=, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: true, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:488: compare_param(): pcre pat: \/((.|\n)*), client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:388: http_access_phase(): matched route: {"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":{"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":{"type":"roundrobin","keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"hash_on":"vars","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"discovery_type":"kubernetes","scheme":"http","parent":{"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":"table: 0x7f1833bed3c8","clean_handlers":{},"update_count":0,"orig_modifiedIndex":1108},"service_name":"default\/nginx:80"},"priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765},"clean_handlers":"table: 0x7f1833bed4b0","update_count":0,"orig_modifiedIndex":1108}, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT default/nginx, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [error] 45#45: *111554 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 192.168.100.88, server: _, request: "GET / HTTP/1.1", host: "test.test.cn"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: false, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:564: match_route_opts(): hosts match: true, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] radixtree.lua:488: compare_param(): pcre pat: \/((.|\n)*), client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:388: http_access_phase(): matched route: {"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":{"id":"426790842154353418","host":"test.test.cn","uri":"\/*","upstream":{"type":"roundrobin","keepalive_pool":{"idle_timeout":60,"requests":1000,"size":320},"hash_on":"vars","pass_host":"pass","timeout":{"send":6,"connect":6,"read":6},"discovery_type":"kubernetes","scheme":"http","parent":{"createdIndex":1011,"has_domain":false,"modifiedIndex":1108,"key":"\/apisix\/routes\/426790842154353418","value":"table: 0x7f1833bed3c8","clean_handlers":{},"update_count":0,"orig_modifiedIndex":1108},"service_name":"default\/nginx:80"},"priority":0,"update_time":1663921234,"status":1,"name":"asd","create_time":1663916765},"clean_handlers":"table: 0x7f1833bed4b0","update_count":0,"orig_modifiedIndex":1108}, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [info] 45#45: *111554 [lua] init.lua:317: nodes(): get empty endpoint version from discovery DICT default/nginx, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"
2022/09/23 16:22:41 [error] 45#45: *111554 [lua] init.lua:512: http_access_phase(): failed to set upstream: no valid upstream node: nil, client: 192.168.100.88, server: _, request: "GET /favicon.ico HTTP/1.1", host: "test.test.cn", referrer: "http://test.test.cn/"

  • APISIX version (run apisix version): 2.13.1-alpine
  • Operating system (run uname -a): 5.15.0-39-generic
  • OpenResty / Nginx version (run openresty -V or nginx -V): openresty/1.19.9.1
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): bitnami/etcd:3.4.18-debian-10-r14
  • APISIX Dashboard version, if relevant: apisix-dashboard:2.13-alpine
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):

@huangyutongs

What other information do I need to provide?

@huangyutongs

root@master1:~/apisix-2.13.1# kubectl get endpoints nginx -oyaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    endpoints.kubernetes.io/last-change-trigger-time: "2022-09-23T04:31:11Z"
  creationTimestamp: "2022-09-06T10:52:08Z"
  labels:
    component: nginx
  name: nginx
  namespace: default
  resourceVersion: "18220449"
  uid: 58efb351-60b3-49ee-8a3c-c83d0c849c0e
subsets:
- addresses:
  - hostname: nginx-0
    ip: 10.42.0.12
    nodeName: master1
    targetRef:
      kind: Pod
      name: nginx-0
      namespace: default
      uid: 99a23fa8-bc88-4e9e-907d-68da41d36daa
  ports:
  - name: http
    port: 80
    protocol: TCP

root@master1:~/apisix-2.13.1# kubectl describe endpoints nginx 
Name:         nginx
Namespace:    default
Labels:       component=nginx
Annotations:  endpoints.kubernetes.io/last-change-trigger-time: 2022-09-23T04:31:11Z
Subsets:
  Addresses:          10.42.0.12
  NotReadyAddresses:  <none>
  Ports:
    Name  Port  Protocol
    ----  ----  --------
    http  80    TCP

Events:  <none>

@zhixiongdu027
Contributor

[screenshot of the Endpoints ports]
Your endpoints have a port.name "http",

so you should set the upstream service_name to "default/nginx:http".
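That is, keeping everything else in the route's upstream block above the same, only the service_name line changes (a sketch, not the full object):

  "discovery_type": "kubernetes",
  "service_name": "default/nginx:http",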

@Hyt05

@huangyutongs

[screenshot of the Endpoints ports] Your endpoints have a port.name "http",

so you should set the upstream service_name to "default/nginx:http".

@Hyt05
I just made that modification and it doesn't work; the error is the same as above. Does this have something to do with my use of hostNetwork?

@zhixiongdu027
Contributor

zhixiongdu027 commented Sep 23, 2022

Maybe you only modified the configmap, but didn't restart the apisix pod?
@Hyt05

@huangyutongs

I'm sure I restarted the apisix pod

@zhixiongdu027
Contributor

@Hyt05
Sorry, I didn't see errors in the configuration.

Can you send your QQ or WeChat to my email (root@libssl.com)?
Maybe we can communicate more quickly over IM.

@tokers
Contributor

tokers commented Sep 26, 2022

Maybe you only modified the configmap, but didn't restart the apisix pod? @Hyt05

Modify the ConfigMap? You just need to update the route object through the Admin API or the Dashboard.

@zhixiongdu027
Contributor

zhixiongdu027 commented Sep 26, 2022

Debugging in the user's environment found that with the apisix-2.13.3:debian and apisix-2.13.3:alpine images, the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables are not injected as expected, which in turn causes Kubernetes discovery to fail to work.

Everything works fine with the apisix-2.13.3:centos image.

This looks like a bug related to environment variables or the schema.

@spacewander @tokers @tzssangglass

Thanks @Hyt05 for providing a test environment.

@tokers
Contributor

tokers commented Sep 26, 2022

Strange, this behavior should be consistent across the different OS base images.

@huangyutongs

I can provide an environment to reproduce the problem at any time if needed

@soulbird
Contributor

It may be because apisix:2.13.3-centos mistakenly shipped APISIX 2.15.0. I have re-pushed the image; you can try again.

@huangyutongs

It may be because apisix:2.13.3-centos mistakenly shipped APISIX 2.15.0. I have re-pushed the image; you can try again.

Just tested: apisix:2.13.3-centos doesn't work anymore. So should I switch to 2.15? I want to stay compatible with my dashboard.

@zhixiongdu027
Contributor

I will test whether Kubernetes discovery in APISIX 2.13.3 works both natively and in a container.

@zhixiongdu027
Contributor

@soulbird @tokers @Hyt05

After comparing the release versions and the PR record:

Kubernetes-related environment variable injection only started in version 2.15.x.
In version 2.13.3, you need to set the related environment variables in config.yaml yourself (see the working config in the comments below).

And you can read this issue.

@huangyutongs

Discovery doesn't work. Error log:

init_worker_by_lua error: /usr/local/apisix/apisix/discovery/kubernetes/init.lua:342: not found environment variable KUBERNETES_SERVICE_HOST
        /usr/local/apisix/apisix/discovery/kubernetes/init.lua:342: in function 'init_worker'


@huangyutongs
Copy link

Thanks to @zhixiongdu027's guidance, Kubernetes discovery is now running correctly with the following configuration:

    discovery:
      kubernetes: { }
    nginx_config:                     # config for rendering the template to generate nginx.conf
      envs:
        - KUBERNETES_SERVICE_HOST
        - KUBERNETES_SERVICE_PORT
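As a sanity check (the pod name below is a placeholder), the variables should then be visible inside the APISIX container:

kubectl exec <apisix-pod> -- env | grep KUBERNETES_SERVICE
# Expect at least KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT;
# the actual values depend on your cluster.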

@tokers
Contributor

tokers commented Oct 7, 2022

I think this detail can be recorded in the FAQ. @Hyt05 Could you help submit a PR to add an FAQ item about this? Thanks!

@huangyutongs

I think this detail can be recorded in the FAQ. @Hyt05 Could you help submit a PR to add an FAQ item about this? Thanks!

I'd love to submit a PR on this, but I'm not familiar with the whole process. Is there an example or documentation to refer to?

@tzssangglass
Member

I'd love to submit a PR on this, but I'm not familiar with the whole process. Is there an example or documentation to refer to?

https://apisix.apache.org/docs/general/contributor-guide/

@robertluoxu

robertluoxu commented Dec 22, 2022

I have a new question. APISIX version: 2.15.0.
172.18.25.37:30662 "GET /mew/traffic/v1/roadStatusList HTTP/1.1" 503 596 0.001 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" - - - "http://172.18.25.37:30662"

I deploy APISIX in the gateway namespace and deploy business services in the traffic namespace. I need to use service discovery to access the business services. How do I do that?
config.yaml

discovery:
  kubernetes: 
    service:
      schema: https
      host: ${KUBERNETES_SERVICE_HOST}
      port: ${KUBERNETES_SERVICE_PORT}
    client:
      token_file: ${KUBERNETES_CLIENT_TOKEN_FILE}

upstream

{
  "timeout": {
    "connect": 6,
    "send": 6,
    "read": 6
  },
  "type": "roundrobin",
  "scheme": "http",
  "discovery_type": "kubernetes",
  "pass_host": "pass",
  "name": "traffic",
  "service_name": "traffic/mew-traffic-webapi-nodeport:31002",
  "keepalive_pool": {
    "idle_timeout": 60,
    "requests": 1000,
    "size": 320
  }
}

@tzssangglass
Member

Getting more debug logs as described in #7026 (comment) will help us determine whether it is the same issue.

@zhixiongdu027
Contributor

So far, the faults I have found in the use of Kubernetes discovery mainly fall into four categories:

  1. Using Kubernetes discovery in version 2.13, if a configuration value refers to environment variables (the default configuration does), they need to be injected through nginx_config.envs:

    discovery:
      kubernetes: { }
    nginx_config:                     # config for rendering the template to generate nginx.conf
      envs:
        - KUBERNETES_SERVICE_HOST
        - KUBERNETES_SERVICE_PORT

  2. The service_name address configuration is incorrect (see the example after this list).

service_name should match the pattern: [namespace]/[name]:[portName]
namespace: the namespace where the Kubernetes Endpoints object is located
name: the name of the Kubernetes Endpoints object
portName: the ports.name value in the Kubernetes Endpoints object; if there is no ports.name, use targetPort or port instead

  3. The ServiceAccount permissions are insufficient.

Q: What permissions does the [ServiceAccount](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/) require?

A: The ServiceAccount requires cluster-level [ get, list, watch ] permissions on endpoints resources; a declarative RBAC definition is shown earlier in this thread.

  4. The proxy network timeout does not match the timeout of the watch on the apiserver.

    See issue #8313 (help request: As a user, I use kubernetes service discovery, same apisix instance, it took a long time to get the changed ip).

    You can check against this list. @robertluoxu
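As a concrete example of the pattern in item 2: the Endpoints object shown earlier in this thread (namespace default, name nginx, ports.name http) maps to:

service_name: default/nginx:http
# If the port had no name, the port number would be used instead:
# service_name: default/nginx:80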


github-actions bot commented Dec 8, 2023

This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the dev@apisix.apache.org list. Thank you for your contributions.

github-actions bot added the stale label Dec 8, 2023

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) Dec 23, 2023