
v1.2.0+ trying to connect to CA when Connect protocol is disabled #4421

Closed
pztrn opened this issue Jul 20, 2018 · 10 comments
Labels
theme/connect (Anything related to Consul Connect, Service Mesh, Side Car Proxies), type/bug (Feature does not function as expected)
Milestone

Comments

@pztrn

pztrn commented Jul 20, 2018

Here's my setup:

  1. Kubernetes (minikube on my machine).
  2. Three Consul instances successfully clustered, from the consul:1.2.1 image.
  3. Configuration:
{
    "connect": {
        "enabled": false
    },
    "encrypt_verify_incoming": false,
    "encrypt_verify_outgoing": false,
    "log_level": "trace",
    "performance": {
        "raft_multiplier": 1
    },
    "verify_incoming": false,
    "verify_incoming_rpc": false,
    "verify_outgoing": false,
    "verify_server_hostname": false
}
  4. Using github.com/hashicorp/consul/api like this:
func (s *Service) RegisterAtConsul(port int, versionMajor string, versionFull string, registerHealthCheck bool) error {
    s.versionFull = versionFull
    s.versionFullDashes = strings.Replace(s.versionFull, ".", "-", -1)
    s.versionMajor = versionMajor

    // Register with consul with only major version.
    agent := s.Client.Agent()
    serviceMajorVersion := &api.AgentServiceRegistration{
        Name:    s.Name + "-" + s.versionMajor,
        Tags:    s.Tags,
        Address: s.localIP,
        Port:    port,
    }

    serviceMajorVersion.ID = s.ID + "-" + s.versionMajor

    if registerHealthCheck {
        serviceMajorVersion.Check = &api.AgentServiceCheck{
            Name:     serviceMajorVersion.ID + " HTTP check",
            HTTP:     "http://" + s.localIP + ":" + strconv.Itoa(port) + "/api/v1/healthCheck/",
            Method:   http.MethodPost,
            Interval: "5s",
            Timeout:  "2s",
        }
    }

    err := agent.ServiceRegister(serviceMajorVersion)
    if err != nil {
        return err
    }

    // Register at consul with full version.
    serviceFullVersion := &api.AgentServiceRegistration{
        Name:    s.Name + "-" + s.versionFullDashes,
        Tags:    s.Tags,
        Address: s.localIP,
        Port:    port,
    }

    serviceFullVersion.ID = s.ID + "-" + s.versionFullDashes

    if registerHealthCheck {
        serviceFullVersion.Check = &api.AgentServiceCheck{
            Name:     serviceFullVersion.ID + " HTTP check",
            HTTP:     "http://" + s.localIP + ":" + strconv.Itoa(port) + "/api/v1/healthCheck/",
            Method:   http.MethodPost,
            Interval: "5s",
            Timeout:  "2s",
        }
    }

    err1 := agent.ServiceRegister(serviceFullVersion)
    if err1 != nil {
        return err1
    }

    return nil
}

where s.Client comes from api.NewClient().
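For reference, a minimal sketch of how such a client is typically constructed with the Go api package (the default agent address is an assumption; adjust as needed):

package main

import "github.com/hashicorp/consul/api"

func main() {
	// DefaultConfig reads CONSUL_HTTP_ADDR and related env vars,
	// falling back to 127.0.0.1:8500.
	cfg := api.DefaultConfig()
	client, err := api.NewClient(cfg)
	if err != nil {
		panic(err)
	}
	_ = client // s.Client is set to a client like this one
}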

I haven't created any AgentServiceConnect structures, so I assume Connect should be disabled by default.

Everything works: the service registers, and its address can be reached via HTTP or DNS requests. The problem is CA bootstrapping, which I want to disable, as the Consul cluster runs in a private environment without internet access. On service registration and deregistration, something executes a request that comes back with this error:

2018/07/17 09:45:30 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54062

After the first message appeared, Consul started to constantly eat 100-150% CPU of the 4-core host. The first message appeared right after the first service registration.

consul info output from the leader (which is the node eating CPU):

agent:
        check_monitors = 0
        check_ttls = 0
        checks = 0
        services = 0
build:
        prerelease =
        revision = 39f93f01
        version = 1.2.1
consul:
        bootstrap = false
        known_datacenters = 1
        leader = true
        leader_addr = 172.20.0.4:8300
        server = true
raft:
        applied_index = 146
        commit_index = 146
        fsm_pending = 0
        last_contact = 0
        last_log_index = 146
        last_log_term = 2
        last_snapshot_index = 0
        last_snapshot_term = 0
        latest_configuration = [{Suffrage:Voter ID:3634851e-5b58-d0ad-3f22-546de4d294c5 Address:172.20.0.4:8300} {Suffrage:Voter ID:ee5204e4-2a30-f83f-3dde-6c2e0d46902d Address:172.20.0.5:8300} {Suffrage:Voter ID:27a3b189-89e6-ec95-0177-003b023fdefb Address:172.20.0.6:8300}]
        latest_configuration_index = 1
        num_peers = 2
        protocol_version = 3
        protocol_version_max = 3
        protocol_version_min = 0
        snapshot_version_max = 1
        snapshot_version_min = 0
        state = Leader
        term = 2
runtime:
        arch = amd64
        cpu_count = 2
        goroutines = 12433
        max_procs = 2
        os = linux
        version = go1.10.1
serf_lan:
        coordinate_resets = 0
        encrypted = false
        event_queue = 0
        event_time = 2
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 4
        members = 3
        query_queue = 0
        query_time = 1
serf_wan:
        coordinate_resets = 0
        encrypted = false
        event_queue = 0
        event_time = 1
        failed = 0
        health_score = 0
        intent_queue = 0
        left = 0
        member_time = 5
        members = 3
        query_queue = 0
        query_time = 1

Complete log for a register-deregister cycle at TRACE level:

2018/07/17 10:05:07 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54892
2018/07/17 10:05:07 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54888
2018/07/17 10:05:07 [INFO] agent: Synced service "test-192.168.137.144-0"
2018/07/17 10:05:08 [INFO] agent: Synced service "test-192.168.137.144-0-1-1"
2018/07/17 10:05:09 [INFO] agent: Deregistered service "test-192.168.137.144-0"
2018/07/17 10:05:09 [INFO] agent: Deregistered service "test-192.168.137.144-0-1-1"
2018/07/17 10:05:09 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54902
2018/07/17 10:05:10 [INFO] agent: Synced service "test-192.168.137.144-0"
2018/07/17 10:05:11 [INFO] agent: Synced service "test-192.168.137.144-0-1-1"
2018/07/17 10:05:11 [INFO] agent: Deregistered service "test-192.168.137.144-0"
2018/07/17 10:05:12 [INFO] agent: Deregistered service "test-192.168.137.144-0-1-1"
2018/07/17 10:05:12 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54888
2018/07/17 10:05:12 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54886
2018/07/17 10:05:12 [ERR] http: Request GET /v1/agent/connect/ca/leaf/test-192.168.137.144, error: cluster has no CA bootstrapped from=192.168.99.1:54908
2018/07/17 10:05:13 [INFO] agent: Synced service "test-192.168.137.144-0"
2018/07/17 10:05:13 [INFO] agent: Synced service "test-192.168.137.144-0-1-1"
2018/07/17 10:05:14 [INFO] agent: Deregistered service "test-192.168.137.144-0"
2018/07/17 10:05:14 [INFO] agent: Deregistered service "test-192.168.137.144-0-1-1"

Configuration for Kubernetes:

  1. Service definition:
apiVersion: v1
kind: Service
metadata:
  name: consul
  labels:
    name: consul
spec:
  clusterIP: None
  ports:
    - name: http
      port: 8500
      targetPort: 8500
    - name: https
      port: 8443
      targetPort: 8443
    - name: rpc
      port: 8400
      targetPort: 8400
    - name: serflan-tcp
      protocol: "TCP"
      port: 8301
      targetPort: 8301
    - name: serflan-udp
      protocol: "UDP"
      port: 8301
      targetPort: 8301
    - name: serfwan-tcp
      protocol: "TCP"
      port: 8302
      targetPort: 8302
    - name: serfwan-udp
      protocol: "UDP"
      port: 8302
      targetPort: 8302
    - name: server
      port: 8300
      targetPort: 8300
    - name: consuldns
      port: 8600
      targetPort: 8600
  selector:
    app: consul
  2. StatefulSet definition:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: consul
spec:
  serviceName: consul
  replicas: 3
  selector:
    matchLabels:
      app: consul
  template:
    metadata:
      labels:
        app: consul
    spec:
      #affinity:
      #  podAntiAffinity:
      #    requiredDuringSchedulingIgnoredDuringExecution:
      #      - labelSelector:
      #          matchExpressions:
      #            - key: app
      #              operator: In
      #              values:
      #                - consul
      #        topologyKey: kubernetes.io/hostname
      terminationGracePeriodSeconds: 10
      containers:
        - name: consul
          image: "consul:1.2.1"
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          args:
            - "agent"
            - "-advertise=$(POD_IP)"
            - "-bind=0.0.0.0"
            - "-bootstrap-expect=3"
            - "-retry-join=consul-0.consul.$(NAMESPACE).svc.cluster.local"
            - "-retry-join=consul-1.consul.$(NAMESPACE).svc.cluster.local"
            - "-retry-join=consul-2.consul.$(NAMESPACE).svc.cluster.local"
            - "-client=0.0.0.0"
            - "-config-file=/consul/myconfig/config.json"
            - "-datacenter=dc1"
            - "-data-dir=/consul/mydata"
            - "-server"
            - "-ui"
            - "-disable-host-node-id"
          volumeMounts:
            - name: data
              mountPath: /consul/mydata
            - name: config
              mountPath: /consul/myconfig
          lifecycle:
            preStop:
              exec:
                command:
                - /bin/sh
                - -c
                - consul leave
          ports:
            - containerPort: 8500
              name: ui-port
            - containerPort: 8400
              name: alt-port
            - containerPort: 53
              name: udp-port
            - containerPort: 8443
              name: https-port
            - containerPort: 8080
              name: http-port
            - containerPort: 8301
              name: serflan
            - containerPort: 8302
              name: serfwan
            - containerPort: 8600
              name: consuldns
            - containerPort: 8300
              name: server
      volumes:
        - name: config
          configMap:
            name: consul
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi

The cluster was created with these commands:

kubectl create configmap consul --from-file=config.json=server.json
kubectl create -f service.yaml
kubectl create -f statefulset-dev.yaml
@banks banks added this to the 1.2.2 milestone Jul 20, 2018
@banks banks added the type/bug (Feature does not function as expected) and theme/connect (Anything related to Consul Connect, Service Mesh, Side Car Proxies) labels Jul 20, 2018
@banks
Member

banks commented Jul 20, 2018

Hi @pztrn

I asked in the mailing list thread about this for a couple of additional details that would help us understand exactly how you hit this issue and got into the current state.

But even without understanding exactly how you got into this state, I can guess what the bug is: when Connect is disabled and there is no CA configured, we assume nothing will attempt to fetch certificates; if something does, it hits this bug during the watch because the CA response never blocks.

Unverified, but I think the fix is either to make the CA and certificate endpoints on the server still block even when Connect is not enabled, or to make them return a real error (500 or similar) so that clients back off or even stop retrying instead of busy-looping.
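To illustrate the second option, a minimal sketch of the "back off instead of busy-loop" behaviour a watching client could apply (fetchLeaf is a hypothetical stand-in, not Consul's actual internals):

package main

import (
	"errors"
	"time"
)

// fetchLeaf is a hypothetical stand-in for one blocking certificate query.
func fetchLeaf() error { return errors.New("cluster has no CA bootstrapped") }

func main() {
	backoff := time.Second
	const maxBackoff = 30 * time.Second // assumed cap
	for {
		if err := fetchLeaf(); err != nil {
			time.Sleep(backoff) // wait before retrying rather than spinning
			if backoff < maxBackoff {
				backoff *= 2
			}
			continue
		}
		backoff = time.Second // reset after a successful blocking query
	}
}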


If you are interested in understanding what's going on in your cluster (I would be), here is my train of thought for you to follow and confirm whether it's correct.

You have connect disabled in config and you are not registering a managed proxy in your service call.

But something is requesting a leaf Connect certificate at /v1/agent/connect/ca/leaf/test-192.168.137.144. The only software I know of that makes that request is the built-in Connect proxy.
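For completeness, the same request can also be issued directly through the Go api package; a minimal sketch, assuming its Agent().ConnectCALeaf helper from 1.2.x:

package main

import (
	"fmt"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		panic(err)
	}
	// Issues GET /v1/agent/connect/ca/leaf/test-192.168.137.144 against the local agent.
	leaf, _, err := client.Agent().ConnectCALeaf("test-192.168.137.144", nil)
	if err != nil {
		fmt.Println(err) // e.g. "cluster has no CA bootstrapped" when Connect is disabled
		return
	}
	fmt.Println("leaf certificate issued for service:", leaf.Service)
}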

The possible explanations I can think of are:

  1. you have other services being registered on the agent with config/code you are not showing here.
  2. you once did register a service with a connect proxy via the API on this agent, and it persisted that state and still attempts to start a proxy for it
  3. you once did register a service with a managed connect proxy, shut down your agent and deleted its state dir so the agent no longer knows about the proxy, but the proxy is still running as an orphan (since agent restarts can't drop proxy connections).
  4. there is a really strange bug that causes proxies to start even when nothing is configured (this seems very unlikely at this point).

If you can confirm any one of those is true, then great.

If not then it would be useful to see:

  • the logs from when the agent starts up (this will tell us if it's loading proxy state from disk). Ideally full logs in a separate gist or similar. Feel free to email if you are concerned about posting them publicly.
  • what ps aux | grep consul looks like. I suspect you have at least one consul connect proxy process running. You can kill any proxies you find: they will either be gone for good if the agent no longer knows about them, or they will be respawned by the agent until the service that requested them is deregistered.

Hope this is useful.

@pztrn
Author

pztrn commented Jul 23, 2018

Hey, sorry for being quiet, I was out of town (and without internet) :)

First, I haven't registered any Connect proxies. The code was:

func (s *Service) RegisterAtConsul(port int, versionMajor string, versionFull string, registerHealthCheck bool) error {
	s.versionFull = versionFull
	s.versionFullDashes = strings.Replace(s.versionFull, ".", "-", -1)
	s.versionMajor = versionMajor

	// Register with consul for major version.
	agent := s.Client.Agent()
	serviceMajorVersion := &api.AgentServiceRegistration{
		Name:    s.Name + "-" + s.versionMajor,
		Tags:    s.Tags,
		Address: s.localIP,
		Port:    port,
	}

	serviceMajorVersion.ID = s.ID + "-" + s.versionMajor

	if registerHealthCheck {
		serviceMajorVersion.Check = &api.AgentServiceCheck{
			Name:     serviceMajorVersion.ID + " HTTP check",
			HTTP:     "http://" + s.localIP + ":" + strconv.Itoa(port) + "/api/v1/healthCheck/",
			Method:   http.MethodPost,
			Interval: "5s",
			Timeout:  "2s",
		}
	}

	err := agent.ServiceRegister(serviceMajorVersion)
	if err != nil {
		return err
	}

	// Register at consul with full version.
	serviceFullVersion := &api.AgentServiceRegistration{
		Name:    s.Name + "-" + s.versionFullDashes,
		Tags:    s.Tags,
		Address: s.localIP,
		Port:    port,
	}

	serviceFullVersion.ID = s.ID + "-" + s.versionFullDashes

	if registerHealthCheck {
		serviceFullVersion.Check = &api.AgentServiceCheck{
			Name:     serviceFullVersion.ID + " HTTP check",
			HTTP:     "http://" + s.localIP + ":" + strconv.Itoa(port) + "/api/v1/healthCheck/",
			Method:   http.MethodPost,
			Interval: "5s",
			Timeout:  "2s",
		}
	}

	err1 := agent.ServiceRegister(serviceFullVersion)
	if err1 != nil {
		return err1
	}

	return nil
}

where s.Client comes from api.NewClient().

As you can see, there is only service registration, no proxying.

Moreover, every experiment was done after removing the persistent storage, so no state was restored. Also, the last log line from the agent was about successful cluster synchronization after leader election, at any log level.

About the possible explanations: it's number 4. Again, nothing was registered with (or as) a Connect proxy, unless Consul's Golang API package does something nasty and, despite Connect being disabled, still tries to use it.

I'll attach logs and outputs in the next message, within the next couple of minutes.

@pztrn
Author

pztrn commented Jul 23, 2018

ps aux output BEFORE the register-deregister run:

PID   USER     TIME   COMMAND
    1 root       0:00 {docker-entrypoi} /usr/bin/dumb-init /bin/sh /usr/local/bin/docker-entrypoint.sh agent -advertise=172.20.0.4 -bind=0.0.0.0 -bootstrap-expect=3 -retry-join=consul-0.consul.default.svc.cluster.local -retry-join=consul-1.consul.default.svc.cluster.local -retry-join=consul-2.consul.default.svc.cluster.local -client=0.0.0.0 -config-file=/consul/myconfig/config.json -datacenter=dc1 -data-dir=/consul/mydata -server -ui -disable-host-node-id
    6 consul     0:02 consul agent -data-dir=/consul/data -config-dir=/consul/config -advertise=172.20.0.4 -bind=0.0.0.0 -bootstrap-expect=3 -retry-join=consul-0.consul.default.svc.cluster.local -retry-join=consul-1.consul.default.svc.cluster.local -retry-join=consul-2.consul.default.svc.cluster.local -client=0.0.0.0 -config-file=/consul/myconfig/config.json -datacenter=dc1 -data-dir=/consul/mydata -server -ui -disable-host-node-id
   22 root       0:00 consul monitor -log-level=trace
   35 root       0:00 ps aux

Here we go: no proxy, no Connect, nothing :)

Logs:

consul-before-reg-unreg.log

And after the REGISTER-DEREGISTER run:

ps aux:

PID   USER     TIME   COMMAND
    1 root       0:00 {docker-entrypoi} /usr/bin/dumb-init /bin/sh /usr/local/bin/docker-entrypoint.sh agent -advertise=172.20.0.4 -bind=0.0.0.0 -bootstrap-expect=3 -retry-join=consul-0.consul.default.svc.cluster.local -retry-join=consul-1.consul.default.svc.cluster.local -retry-join=consul-2.consul.default.svc.cluster.local -client=0.0.0.0 -config-file=/consul/myconfig/config.json -datacenter=dc1 -data-dir=/consul/mydata -server -ui -disable-host-node-id
    6 consul     0:18 consul agent -data-dir=/consul/data -config-dir=/consul/config -advertise=172.20.0.4 -bind=0.0.0.0 -bootstrap-expect=3 -retry-join=consul-0.consul.default.svc.cluster.local -retry-join=consul-1.consul.default.svc.cluster.local -retry-join=consul-2.consul.default.svc.cluster.local -client=0.0.0.0 -config-file=/consul/myconfig/config.json -datacenter=dc1 -data-dir=/consul/mydata -server -ui -disable-host-node-id
   22 root       0:00 consul monitor -log-level=trace
   46 root       0:00 ps aux

Full logs:

consul-after-reg-unreg.log

@banks
Member

banks commented Jul 23, 2018

Hmm @pztrn

Apologies if I'm confused. The logs you posted there are indeed clean, with no sign of proxies; however, they also don't show the bug you described here (no failed attempts to fetch a leaf certificate), so it's hard to know how much that helps.

Have you completely wiped this agent?

About the possible explanations: it's number 4

I'm still not convinced, mostly because this is the only such case we've seen in nearly a month. Even if it is a bug, it's clearly not an obvious one where we accidentally forgot to check whether Connect is enabled and always run proxies no matter what!

We'll fix the known bug here (the CPU burn loop) regardless, but I still don't have a good story for how you have some Connect client running despite apparently never intending to start one.

I realise you've pretty much said this lots of times, but just to be totally sure we are not talking past each other, could you answer these questions explicitly by number/quote so we can rule it out completely?

  1. Have you ever had connect enabled on this agent -- even if state was wiped out since you did?
  2. Have you ever registered a service with connect proxy on this agent -- even if state was wiped out? (this is not redundant - you might have done this even without it being enabled in config which is important)
  3. Are you using the "native" SDK anywhere? That means hashicorp/consul/connect and registering a service with connect.NewService()? (I guess not, just want to rule it out)
  4. Are you running proxies manually using consul connect proxy maybe as separate containers (since it seems your agent is running in Docker)? E.g. following the nomad guide?
  5. Do you still see those failing leaf requests - the last logs you posted had none in them?

Here is another clue: GET /v1/agent/connect/ca/leaf/test-192.168.137.144

The last segment of that URL is the service_id. If you look at your originally posted logs right next to it, and at the code you posted, all the services that code registers have the version (major or full) appended, like test-192.168.137.144-0. That means whatever registered a proxy and is trying to fetch a certificate for it was not the code you posted here (or at least was an earlier version of it).

Hope this helps you track down what happened. If you aren't seeing it anymore, then it's up to you whether you want to continue trying to work out how it started; as mentioned, we'll fix the actual CPU bug here anyway. I'm 99.9% sure it's not a bug that started a proxy completely on its own without ever being asked, but even if you clear out state, that isn't a full reset, since proxy processes are left running by the agent and keep going in an orphaned state. That's still my best guess for your case.

@pztrn
Author

pztrn commented Jul 23, 2018

Have you completely wiped this agent?

Yes. Storage was definitely wiped.

this is the only case we've seen in nearly a month

Probably I have a unique setup :D

I'm sorry for not answering your questions, because I've just found that I was wrong: there is a connect.NewService() line in my code... Aaargh, sorry for wasting your time on that :(. I probably followed a bad quick-start guide I googled, because your site contains none of them :D.
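For anyone reading along, a call along these lines is all it takes; a minimal sketch of the native SDK usage, with the service name from the logs above used as a placeholder:

package main

import (
	"github.com/hashicorp/consul/api"
	"github.com/hashicorp/consul/connect"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		panic(err)
	}
	// Creating the service starts background watches for the Connect roots and
	// the leaf certificate of "test-192.168.137.144" via the local agent, which
	// matches the /v1/agent/connect/ca/leaf/... requests seen above.
	svc, err := connect.NewService("test-192.168.137.144", client)
	if err != nil {
		panic(err)
	}
	defer svc.Close()
}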

So, is a CA required for a Connect service to work? If so, then why is there no message in the log like "trying to register a Connect proxy without a bootstrapped CA"?

Anyway, we have two questions then:

  1. CPU burning loop.
  2. The uninformative log line. The CA wasn't bootstrapped - and what then? :)

Should separate issues be created for them?

@banks
Member

banks commented Jul 23, 2018

I'm sorry for not answering your questions, because I've just found that I was wrong: there is a connect.NewService() line in my code... Aaargh, sorry for wasting your time on that :(.

No worries! Glad we figured that out.

I probably followed a bad quick-start guide I googled, because your site contains none of them :D.

We do have https://www.consul.io/intro/getting-started/connect.html as well as documentation on native app integration.

We can certainly do better though. Can you give some feedback I can share with the team about what you were trying to do and whether or not it's covered in those docs? If it is, then maybe we need to make them more discoverable somehow.

So, is a CA required for a Connect service to work?

Yes. Connect is a TLS feature so it certainly needs to be enabled and certainly needs a CA to actually sign certificates!

then why is there no message in the log like "trying to register a Connect proxy without a bootstrapped CA"?

Great suggestion, we should try to make that more obvious. Mostly it just didn't occur to us that people would try to run proxies/native apps without Connect support enabled, but it's totally reasonable to make that not buggy and to have a better error message.

In practice though, the agent doesn't necessarily know that connect is disabled (since we only require that config on the servers) until something (the proxy or a native app) tries to request certificates.

In this case your connect.NewService was registering a native integration, not a proxy, which means the agent had no chance to give you a meaningful error until your native app actually tried to load its certificates. I'll at least make that not burn CPU, but it already logs cluster has no CA bootstrapped, which is pretty descriptive of what happened. I can maybe make that point out that this is because Connect is disabled too, if that helps?

Should separate issues be created for them?

In this case I'll fix the real bug and try to do what I can to make the logs helpful here, so no separate issue is needed, thanks!

@banks
Member

banks commented Jul 23, 2018

As expected, this is trivial to reproduce:

  1. Run an agent with connect disabled
    $ consul agent -dev -hcl 'connect { enabled = false }'
    
  2. Try to fetch a certificate for a made-up service
    $ curl -s -i http://127.0.0.1:8500/v1/agent/connect/ca/leaf/foo
    HTTP/1.1 500 Internal Server Error
    Vary: Accept-Encoding
    Date: Mon, 23 Jul 2018 19:39:15 GMT
    Content-Length: 30
    Content-Type: text/plain; charset=utf-8
    
    cluster has no CA bootstrapped
    
  3. Watch CPU melt:

(screenshot of CPU usage)

@pztrn
Author

pztrn commented Jul 24, 2018

We can certainly do better though. Can you give some feedback I can share with the team about what you were trying to do and whether or not it's covered in those docs? If it is, then maybe we need to make them more discoverable somehow.

You have a Golang API - make a quick start for using it, with common caveats like using Connect without a bootstrapped CA. I was one step away from forking Consul and deleting the whole Connect thing for internal usage :D

the agent had no chance to give you a meaningful error until your native app actually tried to load its certificates

There is always a chance to give such an error. For example, by adding an additional HTTP endpoint that checks whether Connect is enabled on the Consul cluster. As it would use a configuration value cached in RAM, it would be blazing fast, and it would make it possible to additionally print "Connect is disabled but a remote client tried to register" in the agent logs and to return the same error to the client that tried to register. This could also prevent launching the certificate watch in a separate goroutine.
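In the meantime, something close to that check seems already possible from the client side with the existing API; a minimal sketch (assuming the api package's Agent().ConnectCARoots helper), as an approximation of the dedicated endpoint suggested above:

package main

import (
	"fmt"

	"github.com/hashicorp/consul/api"
)

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		panic(err)
	}
	// GET /v1/agent/connect/ca/roots: no roots (or an error) means there is no
	// usable Connect CA, so skip the Connect integration instead of retrying.
	roots, _, err := client.Agent().ConnectCARoots(nil)
	if err != nil || roots == nil || len(roots.Roots) == 0 {
		fmt.Println("Connect CA unavailable, skipping Connect setup:", err)
		return
	}
	fmt.Println("Connect CA present, active root:", roots.ActiveRootID)
}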

@banks
Member

banks commented Jul 24, 2018

Thanks for the feedback!

return the same error to the client that tried to register

Yeah, specifically in this case, when you use the SDK's connect.NewService it is not doing a registration - we expect you to continue registering the service the same way you otherwise would, which might be via a cluster scheduler, a config file, etc. So there is no "register" happening there to return an error from.
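For illustration, the separate registration of a Connect-native service would look roughly like this with the api package; a sketch assuming the AgentServiceConnect.Native flag from 1.2.x (name and port are placeholders):

package main

import "github.com/hashicorp/consul/api"

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		panic(err)
	}
	// The registration itself is still an ordinary agent registration; the
	// Connect block only advertises that the service speaks Connect natively.
	reg := &api.AgentServiceRegistration{
		Name:    "test",
		Port:    8080,
		Connect: &api.AgentServiceConnect{Native: true},
	}
	if err := client.Agent().ServiceRegister(reg); err != nil {
		panic(err)
	}
}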

The fact that this does not do service registration is a known issue in the sense that it surprises people and isn't clearly documented, so we'll certainly be fixing that.

The actual certificate request does get an error, both in the agent logs and in the application, that currently says the Connect CA is not bootstrapped yet. We could make that more explicit with "This may be because Connect is not enabled", but that's not the only reason for this case (it could be a timing issue).

Anyway I'll fix this and take that on board, thanks!

@valarauca

Is there a method to disable SPIFFE authentication on endpoints and CA certificate generation?

I'm wondering what the value of SPIFFE authentication is when the underlying SNI isn't validated against the host records.
