
BAD CERTIFICATE with consul connect #16617

Open
fred-gb opened this issue Mar 11, 2023 · 2 comments
fred-gb commented Mar 11, 2023

Hi 👋🏻

Overview of the Issue

I'm trying to set up a single-node Hashistack with ACLs and TLS enabled. When I launch a test job that uses Consul Connect, it doesn't work.

I'm posting this issue here after this thread on Discuss: Nomad discuss, where I was advised to post it on the Consul side.

Consul 1.15.1
Nomad 1.5

I can see in the changelog the changes that are needed for this to work, but I don't know how to apply them to a Nomad job deployment.

I'm not sure whether the issue comes from Consul or from Nomad.


Reproduction Steps

Create a single-node Hashistack with Consul 1.15.1, Vault 1.13 (Consul storage backend), and Nomad 1.5.

Launch a first job with a Connect sidecar:

job "mosquitto" {
  region = "global"
  datacenters = ["dc1"]
  type = "service"


  group "mosquitto" {

    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "10s"
      mode = "delay"
    }

    network {
      mode = "bridge"

        port "mqtt" {
        to = 1883
        static = 1883
      }
    }

    service {
      name = "mqtt"
      port = "1883"

      connect {
        sidecar_service {}

        sidecar_task {
          resources {
            cpu    = 64
            memory = 64
          }
        }
      }
    }

    task "mosquitto" {
      driver = "docker"

      config {
        image = "eclipse-mosquitto:latest"

        mount {
          type = "bind"
          target = "/mosquitto/config/mosquitto.conf"
          source = "local/mosquitto.conf"
          readonly = false
          bind_options {
            propagation = "rshared"
          }
        }

        ports = ["mqtt"]
      }

      template {
        data = <<EOH
listener 1883
allow_anonymous true
EOH
        destination = "local/mosquitto.conf"
      }

      template {
        data = <<EOH
ANSIBLE_FORCE_COLOR=TRUE

EOH
        destination = "secrets/file.env"
        env         = true
      }

      resources {
        cpu    = 1024
        memory = 1024
      }
    }
  }
}

and a second job that connects through the sidecar:

job "tester" {
  region = "global"
  datacenters = ["dc1"]
  type = "service"

  group "tester" {

    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay = "10s"
      mode = "delay"
    }

    network {
      mode = "bridge"
    }

    service {
      name = "mesh"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "mqtt"
              local_bind_port  = "1883"
            }
          }
        }
        sidecar_task {
          resources {
            cpu    = 16
            memory = 16
          }
        }
      }
    }

    task "tester" {
      driver = "docker"

      config {
        image = "alpine:latest"
        entrypoint = ["/bin/sleep", "3600"]
      }

      resources {
        cpu    = 128
        memory = 128
      }
    }
  }
}

Consul info for both Client and Server:

agent:
	check_monitors = 0
	check_ttls = 1
	checks = 9
	services = 10
build:
	prerelease = 
	revision = 7c04b6a0
	version = 1.15.1
	version_metadata = 
consul:
	acl = enabled
	bootstrap = true
	known_datacenters = 1
	leader = true
	leader_addr = 192.168.64.69:8300
	server = true
raft:
	applied_index = 18146
	commit_index = 18146
	fsm_pending = 0
	last_contact = 0
	last_log_index = 18146
	last_log_term = 10
	last_snapshot_index = 16384
	last_snapshot_term = 10
	latest_configuration = [{Suffrage:Voter ID:8f79ed11-c2e9-aefe-796c-a216b9c08055 Address:192.168.64.69:8300}]
	latest_configuration_index = 0
	num_peers = 0
	protocol_version = 3
	protocol_version_max = 3
	protocol_version_min = 0
	snapshot_version_max = 1
	snapshot_version_min = 0
	state = Leader
	term = 10
runtime:
	arch = arm64
	cpu_count = 2
	goroutines = 243
	max_procs = 2
	os = linux
	version = go1.20.1
serf_lan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 1
	event_time = 10
	failed = 0
	health_score = 0
	intent_queue = 1
	left = 0
	member_time = 10
	members = 1
	query_queue = 0
	query_time = 1
serf_wan:
	coordinate_resets = 0
	encrypted = true
	event_queue = 0
	event_time = 1
	failed = 0
	health_score = 0
	intent_queue = 0
	left = 0
	member_time = 1
	members = 1
	query_queue = 0
	query_time = 1

Client and Server config, single node (JSON):

{
    "acl": {
        "default_policy": "deny",
        "down_policy": "extend-cache",
        "enable_token_persistence": true,
        "enabled": true,
        "token_ttl": "30s",
        "tokens": {
            "initial_management": "dcdac9a4-e224-5b59-b9dc-2f6bfb55362e",
            "replication": "cfbb5111-31ff-5954-8ec7-8f561bab8c67"
        }
    },
    "addresses": {
        "dns": "0.0.0.0",
        "grpc_tls": "0.0.0.0",
        "http": "0.0.0.0",
        "https": "0.0.0.0"
    },
    "advertise_addr": "192.168.64.69",
    "advertise_addr_wan": "192.168.64.69",
    "auto_encrypt": {},
    "autopilot": {
        "cleanup_dead_servers": false,
        "last_contact_threshold": "200ms",
        "max_trailing_logs": 250,
        "server_stabilization_time": "10s"
    },
    "bind_addr": "192.168.64.69",
    "bootstrap": false,
    "bootstrap_expect": 1,
    "client_addr": "127.0.0.1",
    "connect": {
        "enabled": true
    },
    "data_dir": "/opt/consul",
    "datacenter": "dc1",
    "disable_update_check": false,
    "domain": "consul",
    "enable_local_script_checks": false,
    "enable_script_checks": false,
    "encrypt": "wfuQxs/nL0zNgFtJ54JxK+V+k3aTGBGO9G0PPsVPPDY=",
    "encrypt_verify_incoming": true,
    "encrypt_verify_outgoing": true,
    "log_file": "/var/log/consul/consul.log",
    "log_level": "INFO",
    "log_rotate_bytes": 0,
    "log_rotate_duration": "24h",
    "log_rotate_max_files": 0,
    "performance": {
        "leave_drain_time": "5s",
        "raft_multiplier": 1,
        "rpc_hold_timeout": "7s"
    },
    "ports": {
        "dns": 8600,
        "grpc": 8502,
        "grpc_tls": 8503,
        "http": -1,
        "https": 8501,
        "serf_lan": 8301,
        "serf_wan": 8302,
        "server": 8300
    },
    "primary_datacenter": "dc1",
    "raft_protocol": 3,
    "retry_interval": "30s",
    "retry_interval_wan": "30s",
    "retry_join": [
        "192.168.64.69"
    ],
    "retry_max": 0,
    "retry_max_wan": 0,
    "server": true,
    "tls": {
        "defaults": {
            "ca_file": "/etc/ssl/hashistack/hashistack-ca.pem",
            "cert_file": "/etc/ssl/hashistack/dc1-server-consul.pem",
            "key_file": "/etc/ssl/hashistack/dc1-server-consul.key",
            "tls_min_version": "TLSv1_2",
            "verify_incoming": true,
            "verify_outgoing": true
        },
        "https": {
            "verify_incoming": false
        },
        "internal_rpc": {
            "verify_incoming": true,
            "verify_server_hostname": true
        }
    },
    "translate_wan_addrs": false,
    "ui_config": {
        "enabled": true
    }
}
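For readers unfamiliar with the Consul 1.14+ TLS layout (this is context I'm adding, not part of the original report): settings in `tls.defaults` apply to every listener (HTTPS, internal RPC, gRPC) unless a protocol-specific block overrides them, exactly as the `https` block above overrides `verify_incoming`. The same pattern as an HCL sketch:

```hcl
# Sketch of the override pattern used in the JSON config above (not a
# complete config): values in "defaults" flow down to each listener
# unless a protocol block sets its own value.
tls {
  defaults {
    verify_incoming = true   # applies to HTTPS, internal RPC, and gRPC...
    verify_outgoing = true
  }
  https {
    verify_incoming = false  # ...except where a listener overrides it
  }
}
```

With `verify_incoming = true` in `defaults` and no `grpc` override, the gRPC TLS listener (port 8503 above) requires every client to present a certificate signed by `ca_file`.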

According to these docs: Nomad Consul Connect integration

Nomad conf:

consul {
    # The address to the Consul agent.
    address      = "127.0.0.1:8501"		
    grpc_address = "127.0.0.1:8503"
    ssl = true
    grpc_ca_file = "/etc/ssl/hashistack/hashistack-ca.pem"
    ca_file = "/etc/ssl/hashistack/hashistack-ca.pem"
    cert_file = "/etc/ssl/hashistack/dc1-server-consul.pem"
    key_file = "/etc/ssl/hashistack/dc1-server-consul.key"
    token = "ebfb82e3-1d84-95d3-22d0-269b427136fb"
    # The service name to register the server and client with Consul.
    server_service_name = "nomad-servers"
    client_service_name = "nomad-clients"
    tags = {}

    # Enables automatically registering the services.
    auto_advertise = true

    # Enabling the server and client to bootstrap using Consul.
    server_auto_join = true
    client_auto_join = true
}

Operating system and Environment details

Ubuntu 22.04 (in VM for testing)

Log Fragments

In the Nomad UI (Envoy sidecar logs):

[2023-03-11 09:41:49.044][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 376s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:42:02.143][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 389s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:42:15.919][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 403s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:42:32.197][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 419s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:42:59.585][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 447s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:43:21.912][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 469s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:43:48.853][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 496s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE
[2023-03-11 09:44:06.962][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 514s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268436498:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_CERTIFICATE

I created the certs with OpenSSL and Ansible, and they work: as long as no job is launched, there are no communication errors between the Hashistack components.

Help! 🆘

fred-gb commented Mar 14, 2023

Found a workaround via this topic: Discuss Hashicorp.

In the Consul config:

tls {
  grpc {
    verify_incoming = false
  }
[...]

But I don't really understand whether a proper solution exists for Consul 1.15+.
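For what it's worth, my understanding (not confirmed in this thread) is that Envoy's xDS stream to the local agent does not present a client certificate, so any `verify_incoming = true` that reaches the gRPC listener makes the handshake fail with BAD_CERTIFICATE. The workaround scopes the exception to gRPC only, leaving HTTPS and internal RPC verification intact; as an HCL sketch:

```hcl
# Sketch: keep mutual TLS everywhere except the gRPC (xDS) listener,
# which Envoy dials without a client certificate.
tls {
  defaults {
    verify_incoming = true
    verify_outgoing = true
  }
  grpc {
    verify_incoming = false  # Envoy presents no client cert on xDS
  }
}
```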

Thanks

fred-gb commented Mar 17, 2023

After many tries, it's no longer functional.

I tried creating a separate CA and cert for the gRPC listener:

tls = {
  defaults = {
    ca_file = "/etc/ssl/hashistack-ca.pem"
    cert_file = "/etc/ssl/dc1-server-consul.pem"
    key_file = "/etc/ssl/dc1-server-consul.key"
    tls_min_version = "TLSv1_2"
    verify_incoming = true
    verify_outgoing = true
  }
  grpc = {
    ca_file = "/etc/ssl/envoy-ca.pem"
    cert_file = "/etc/ssl/dc1-server-envoy.pem"
    key_file = "/etc/ssl/dc1-server-envoy.key"
  }
  https = {
    verify_incoming = false
  }
  internal_rpc = {
    verify_incoming = true
    verify_server_hostname = true
  }
}
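A guess at why this still fails (not verified here): Envoy validates the agent's gRPC certificate against the CA file that Nomad hands it via `grpc_ca_file`. If the gRPC listener now serves a cert signed by `envoy-ca.pem` while Nomad still passes `hashistack-ca.pem`, verification would fail exactly like this. A sketch of the matching Nomad side, assuming the split-CA layout in the `tls` block above:

```hcl
# Sketch: grpc_ca_file must be the CA that signed the cert served on the
# gRPC listener (here dc1-server-envoy.pem), not the general cluster CA.
consul {
  grpc_address = "127.0.0.1:8503"
  grpc_ca_file = "/etc/ssl/envoy-ca.pem"  # path taken from the grpc block above
  # ... remaining consul settings unchanged ...
}
```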

The error message changed! 🥳

Now I have:

[2023-03-17 07:54:56.493][1][warning][config] [./source/common/config/grpc_stream.h:201] DeltaAggregatedResources gRPC config stream to local_agent closed since 371s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED

😢
