
Consul Services Register with port instead of nodePort #299

Closed
reddog335 opened this issue Jul 23, 2020 · 8 comments
Labels
area/sync (Related to catalog sync), type/question (Question about product, ideally should be pointed to discuss.hashicorp.com)

Comments


reddog335 commented Jul 23, 2020

Hi all. We are migrating from Swarm orchestration to Kubernetes orchestration and I'm having a heck of a time getting Consul service registration sorted out. We have an external Consul cluster, and I've successfully deployed the Consul agents and the consul-k8s catalog sync. My new service successfully registers to the Consul cluster; however, it always registers with the port/targetPort instead of the nodePort. We do not want to get into the business of managing static ports, so we do not specify a nodePort in our Kubernetes service manifest and let the cluster assign a free port. I need to force it to register with the nodePort that gets assigned by the K8s cluster. Does anyone have ideas on how to force a service to register with nodePort instead of port?

```yaml
apiVersion: v1
kind: Service
metadata:
  name: k8-consul-test
  namespace: vcs-lab
  annotations:
    "consul.hashicorp.com/service-name": k8-consul-test
    "consul.hashicorp.com/service-port": spec.ports[].nodePort
    "consul.hashicorp.com/service-sync": "true"
spec:
  type: NodePort
  ports:
    - nodePort:
      targetPort: 80
      port: 80
  selector:
    app: k8-consul-test
```

I originally opened a Consul Discussion Topic (https://discuss.hashicorp.com/t/consul-services-register-with-port-instead-of-nodeport/11566) and was directed to open an issue here as it appears to be a bug.

@thisisnotashwin added the type/bug label Jul 27, 2020
@kschoche added the area/sync label Jul 28, 2020
@kschoche
Contributor

Hi @reddog335 - I'm having some difficulty in reproducing the issue, could you provide a bit more information on the specifics of what you're seeing?

I was able to use the following workflow: deploying a Service and a Deployment that references it, using YAML similar to what you've provided, and I was then able to reach the service using the ServicePort referenced in Consul.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: k8-consul-test
  annotations:
    "consul.hashicorp.com/service-name": k8-consul-test
    "consul.hashicorp.com/service-port": spec.ports[].nodePort
    "consul.hashicorp.com/service-sync": "true"
spec:
  type: NodePort
  ports:
    - targetPort: 9001
      port: 9000
  selector:
    app: k8-consul-test
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8-consul-test
  labels:
    app: k8-consul-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: k8-consul-test
  template:
    metadata:
      labels:
        app: k8-consul-test
    spec:
      containers:
      - name: k8-consul-test
        image: hashicorp/counting-service:0.0.2
        ports:
        - containerPort: 9001
          name: http
      serviceAccountName: k8-consul-test
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: k8-consul-test
```

Here is the metadata for the service as registered in Consul:

```
$ curl http://127.0.0.1:8500/v1/catalog/service/k8-consul-test | jq .
<snip>
  },
    "ServiceMeta": {
      "external-k8s-ns": "default",
      "external-source": "kubernetes",
      "port-": "9000"
    },
    "ServicePort": 30487,  <----------- this port is used by the service from k8s
    "ServiceEnableTagOverride": false,
    "ServiceProxy": {
      "MeshGateway": {},
      "Expose": {}
    },
<snip>
```

I was then able to curl the externalIP:servicePort from my laptop:

```
$ curl $EXTERNAL_IP:30487
{"count":2,"hostname":"k8-consul-test-5559975597-b6k2f"}%
```

Is this an accurate reproduction of your environment?
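
As an aside, in case it's relevant to your setup: how NodePort services are synced (specifically, which node address is paired with the assigned nodePort) is controlled by the `syncCatalog.nodePortSyncType` Helm value, which defaults to `ExternalFirst`. A minimal sketch of what that looks like in a values file (assuming the defaults; check the chart docs for your consul-helm version):

```yaml
syncCatalog:
  enabled: true
  # One of: ExternalOnly, InternalOnly, ExternalFirst (the default).
  # ExternalFirst prefers the node's external IP if one exists,
  # falling back to the internal IP otherwise.
  nodePortSyncType: ExternalFirst
```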

@kschoche added the waiting-reply label Jul 28, 2020
@reddog335
Author

reddog335 commented Jul 30, 2020

Hi @kschoche, thank you for your response, and I apologize for the delayed update; I just got back from vacation. It appears the issue must be with my Consul agent. I copy/pasted your deployment/service YAML and deployed it into my Kubernetes cluster. Unlike your test, which correctly registered the assigned nodePort, mine registered the targetPort instead of the nodePort.

```
$ curl -k https://consul-sandbox.mutualofomaha.com/v1/catalog/service/k8-consul-test | jq .

    "ServiceMeta": {},
    "ServicePort": 9001,
    "ServiceEnableTagOverride": false,
    "ServiceProxy": {
      "MeshGateway": {},
      "Expose": {}
    },
```

Below is the yaml for my consul agent daemon set:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    meta.helm.sh/release-name: sandbox
    meta.helm.sh/release-namespace: vcs-dev
  labels:
    app: consul
    app.kubernetes.io/managed-by: Helm
    chart: consul-helm
    heritage: Helm
    release: sandbox
  name: sandbox
  namespace: vcs-dev
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: consul
      chart: consul-helm
      component: client
      hasDNS: "true"
      release: sandbox
  template:
    metadata:
      annotations:
        consul.hashicorp.com/connect-inject: "false"
      creationTimestamp: null
      labels:
        app: consul
        chart: consul-helm
        component: client
        hasDNS: "true"
        release: sandbox
    spec:
      containers:
      - command:
        - /bin/sh
        - -ec
        - |
          CONSUL_FULLNAME="sandbox"

          exec /bin/consul agent \
            -node="${NODE}" \
            -advertise="${ADVERTISE_IP}" \
            -bind=0.0.0.0 \
            -client=0.0.0.0 \
            -node-meta=pod-name:${HOSTNAME} \
            -hcl='leave_on_terminate = true' \
            -hcl='ports { grpc = 8502 }' \
            -config-dir=/consul/config \
            -datacenter=ho \
            -data-dir=/consul/data \
            -retry-join="<redacted>:7001" \
            -domain=consul
        env:
        - name: ADVERTISE_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: NODE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP
        image: consul:1.8.0
        imagePullPolicy: IfNotPresent
        name: consul
        ports:
        - containerPort: 8500
          hostPort: 8500
          name: http
          protocol: TCP
        - containerPort: 8502
          hostPort: 8502
          name: grpc
          protocol: TCP
        - containerPort: 8301
          hostPort: 8301
          name: serflan-tcp
          protocol: TCP
        - containerPort: 8301
          hostPort: 8301
          name: serflan-udp
          protocol: UDP
        - containerPort: 8302
          name: serfwan
          protocol: TCP
        - containerPort: 8300
          name: server
          protocol: TCP
        - containerPort: 8600
          name: dns-tcp
          protocol: TCP
        - containerPort: 8600
          name: dns-udp
          protocol: UDP
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -ec
            - |
              curl http://127.0.0.1:8500/v1/status/leader \
              2>/dev/null | grep -E '".+"'
          failureThreshold: 3
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /consul/data
          name: data
        - mountPath: /consul/config
          name: config
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: sandbox-client
      serviceAccountName: sandbox-client
      terminationGracePeriodSeconds: 10
      volumes:
      - hostPath:
          path: /opt/consul
          type: DirectoryOrCreate
        name: data
      - hostPath:
          path: /etc/consul
          type: ""
        name: config
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
```
Is there an agent configuration setting I'm missing to tell the consul agent to register the nodePort instead of the targetPort?

@kschoche
Contributor

Hi @reddog335! Thank you for getting back to me. I don't see anything obvious in your agent config. Could you pass along the custom values file you used when installing via Helm? If there is a misconfiguration, it is likely there.

@reddog335
Author

reddog335 commented Jul 31, 2020

Hi @kschoche! I've attached the values.yaml file I used when installing via helm.

values.yaml.txt

@kschoche removed the waiting-reply label Jul 31, 2020
@reddog335
Author

@kschoche Do you need any additional information?

@kschoche
Contributor

kschoche commented Aug 25, 2020

Hi @reddog335! Sorry for the delayed response. I have not been able to reproduce this issue in-house yet, but coming back to it I think I've just stumbled on what might be going on here.

There are a couple of syntax errors in your values file which appear to have let you set up a configuration that I believe shouldn't be functional. Because of the syntax errors you have probably deployed servers, which allowed you to proceed in the first place, but in a weird config; without those servers you'd have noticed the Consul clients never come online.

You have `server.enabled = "false"`, and the string `"false"` actually evaluates to true, so you'll need to drop the quotes.
Secondly, you have the same syntax error on `client.enabled = "true"`, but that one happens to evaluate to true anyway in our Helm templates.
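
To illustrate with a hedged sketch (the chart's actual template conditionals are more involved, but the principle is the same: in Go templates any non-empty string is truthy):

```yaml
# values.yaml
server:
  enabled: "false"   # a non-empty string, so a check like {{ if .Values.server.enabled }} treats it as true
  # enabled: false   # a real YAML boolean, so the conditional behaves as intended
```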

If I swap these back to the valid syntax of `false` and `true` respectively, I end up with a broken installation where the clients never become Ready, because you have not set up `externalServers`.
You have `externalServers.enabled = false` and need to populate the `externalServers` stanza to reach your remote Consul cluster.

I suspect if you set this up correctly you'll probably be good to go.

Also, as a friendly tip: with Helm you only need to maintain a custom values file containing the deltas from the default Helm values; it is much easier to read and maintain this way, e.g.:

```yaml
global:
  name: kyle-consul
  enabled: false
server:
  enabled: false   # <-- I removed the quotes
client:
  enabled: true    # <-- Removed these quotes also
externalServers:
  enabled: false   # <-- this stanza needs to be populated appropriately
```
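
For reference, a rough sketch of a populated `externalServers` stanza (the host address here is a placeholder for your environment; check the chart docs for your consul-helm version for the full set of options):

```yaml
externalServers:
  enabled: true
  # Placeholder address; point this at your external Consul cluster.
  hosts: ["consul.example.internal"]
  httpsPort: 8501
```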

If you're able to get up and running with this please do let me know!

@kschoche added the waiting-reply label Aug 25, 2020
@reddog335
Author

Hey @kschoche! Thank you for your response. Man, I jacked that up by the numbers. I'll try your suggestions and get back to you. Thanks for the pro-tip on helm and for all of your help!

@lkysow
Member

lkysow commented Sep 2, 2020

I'm going to close this for now but if that's not working let us know and we'll re-open.

@lkysow closed this as completed Sep 2, 2020
@lkysow added the type/question label and removed the type/bug and waiting-reply labels Sep 2, 2020