Istio support statefulset #10659

Open
hzxuzhonghu opened this issue Dec 26, 2018 · 89 comments

@hzxuzhonghu (Member) commented Dec 26, 2018

Istio currently does not support StatefulSets. There are already many related issues, so this is an umbrella issue.

#10053 #1277 #10490 #10586 #9666

Anyone can add missing issues below.

@hzxuzhonghu hzxuzhonghu self-assigned this Dec 26, 2018
@hzxuzhonghu hzxuzhonghu added this to the 1.2 milestone Dec 26, 2018
@hzxuzhonghu (Member, Author) commented Dec 26, 2018

We should build listeners and clusters for each StatefulSet pod, e.g. outbound|9042||cassandra-0.cassandra.default.svc.cluster.local, not just outbound|9042||cassandra.default.svc.cluster.local.
https://kubernetes.io/docs/tutorials/stateful-application/cassandra/
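
For context, the per-pod DNS names above come from the headless Service used in the linked Cassandra tutorial; a minimal version of that Service (roughly as in the tutorial) looks like this:

apiVersion: v1
kind: Service
metadata:
  name: cassandra
  labels:
    app: cassandra
spec:
  clusterIP: None    # headless: gives each StatefulSet pod a stable DNS name such as cassandra-0.cassandra.default.svc.cluster.local
  ports:
  - port: 9042
  selector:
    app: cassandra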

@costinm (Contributor) commented Jan 8, 2019

AFAIK we do support StatefulSets, using original destination, with the restriction that StatefulSets can't share a port.

Cassandra: it is not clear what is wrong. I tried it once and the Envoy config looked right, but it didn't work. We need to repro again and get tcpdumps.

We can't build per-pod n.cassandra... clusters - there would be too many, and we would need too many listeners as well.

On-demand LDS may allow more flexibility and port sharing, but I think a dedicated port is a reasonable option until we have that.

@aminmithil commented Jan 17, 2019

@costinm I have tried this with ZooKeeper and Kafka. I am facing a similar issue because we use a headless service for inter-pod communication. What should the workaround be? I have tried this solution - #7495 - but the ZooKeeper pods still cannot talk to each other. Istio resolves zookeeper.zookeeper.svc.cluster.local instead of zookeeper-0.zookeeper.svc.cluster.local.

@cyucelen commented Jan 19, 2019

Cassandra cannot handshake even without mTLS.

@diemtvu (Contributor) commented Jan 31, 2019

FYI, #10053 is not caused by the StatefulSet. After changing the Cassandra listen_address to localhost, everything now works, with or without mTLS.

AFAIK, StatefulSets are working fine as of today (1.0.x, 1.1).
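
For reference, the change described above is usually made in cassandra.yaml, or via an environment variable when the container image honors it (the official cassandra Docker image does); this is only an illustration, not the exact config from #10053:

# In cassandra.yaml:
listen_address: localhost

# Or, equivalently for the official cassandra Docker image, on the
# StatefulSet container spec:
env:
- name: CASSANDRA_LISTEN_ADDRESS
  value: "localhost"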

@supereagle (Member) commented Feb 3, 2019

After changing the Cassandra listen_address to localhost, everything now works, with or without mTLS.

Does this method also work for other StatefulSet applications, like ZooKeeper?

@hzxuzhonghu (Member, Author) commented Mar 22, 2019

@ramaraochavali (Contributor) commented Mar 22, 2019

@hzxuzhonghu I am trying StatefulSets with a simple TCP server. Do I have to create multiple DestinationRules, one for each host? Any idea which steps I should follow? Does it even work?

@hzxuzhonghu (Member, Author) commented Mar 22, 2019

Sorry, I do not understand you.

@ramaraochavali (Contributor) commented Mar 22, 2019

@hzxuzhonghu Sorry for being vague. My question was: is StatefulSet functionality fully supported in Istio? I am seeing conflicting docs, and when I try a TCP service I am not able to make it work. So my questions are:

  1. Are there any docs that explain how to configure a stateful service?
  2. Some issues mention that we have to create destination services for each of the stateful service nodes - is that how it will work?
@hzxuzhonghu (Member, Author) commented Mar 22, 2019

Is StatefulSet functionality fully supported in Istio?

I think no. At least this #10659 (comment) is not implemented.

Are there any docs that explain how to configure a stateful service?

No.

Some issues mention that we have to create destination services for each of the stateful service nodes - is that how it will work?

Do you mean DestinationRule? If so, I can hardly think of how to configure it. But I think we can create a ServiceEntry for the StatefulSet pods as a workaround.
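
For illustration, such a per-pod ServiceEntry might look roughly like the following (the hosts and port are hypothetical; a concrete attempt appears later in this thread):

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: zookeeper-pods        # hypothetical name
  namespace: default
spec:
  hosts:
  - zookeeper-0.zookeeper.default.svc.cluster.local
  - zookeeper-1.zookeeper.default.svc.cluster.local
  - zookeeper-2.zookeeper.default.svc.cluster.local
  location: MESH_INTERNAL
  ports:
  - number: 2181
    name: tcp-client
    protocol: TCP
  resolution: NONE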

@ramaraochavali (Contributor) commented Mar 22, 2019

Yes, I meant 'DestinationRule'. OK, I also tried creating a 'ServiceEntry', but let me look at that as well.

@ramaraochavali (Contributor) commented Apr 10, 2019

@hzxuzhonghu I tried a StatefulSet service with three nodes running a simple TCP service and found the following:

  • Pilot is creating an upstream cluster with ORIGINAL_DST LB for each of the three nodes deployed as a StatefulSet.
  • However, Pilot is creating only one egress listener, which points to the regular EDS cluster.
    So if a request lands on this egress listener, it can go to any of the nodes.
    I think, as you mentioned in #10659 (comment), we need three egress listeners, each pointing to one of the ORIGINAL_DST clusters. Is that what you are also thinking?
@hzxuzhonghu (Member, Author) commented Apr 10, 2019

How did you get an ORIGINAL_DST cluster - with ServiceEntry.Resolution=NONE?

I think, as you mentioned in #10659 (comment), we need three egress listeners, each pointing to one of the ORIGINAL_DST clusters. Is that what you are also thinking?

Yes, I mean we need three listeners if the service uses a TCP protocol. Each one would look like:

    {
        "name": "xxxxxxx_8000",
        "address": {
            "socketAddress": {
                "address": "xxxxxxx",   // pod0 ip
                "portValue": 8000
            }
        },

...

    {
        "name": "xxxxxx2_8000",
        "address": {
            "socketAddress": {
                "address": "xxxxxxx2",   // pod2 ip
                "portValue": 8000
            }
        },
@ramaraochavali (Contributor) commented Apr 11, 2019

@hzxuzhonghu I just deployed a stateful TCP service with the following Service object:

apiVersion: v1
kind: Service
metadata:
  name: istio-tcp-stateful
  labels:
    app: istio-tcp-stateful
    type: LoadBalancer
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: istio-tcp-stateful
  ports:
  - port: 9000
    name: tcp
    protocol: TCP
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: istio-tcp-stateful
spec:
  serviceName: istio-tcp-stateful
  replicas: 3
  template:
    metadata:
      labels:
        app: istio-tcp-stateful
        version: v1
    spec:
      containers:
      - name: tcp-echo-server
        image: ops0-artifactrepo1-0-prd.data.sfdc.net/docker-sam/rama.rao/ramaraochavali/tcp-echo-server:latest
        args: [ "9000", "one" ] # prefix: one
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9000

With the above definition:

  • It generated one ingress listener and one inbound cluster, which is as expected.
  • It generated one ORIGINAL_DST cluster for outbound - not sure if this is expected:
    {
     "version_info": "2019-04-11T09:30:43Z/70",
     "cluster": {
      "name": "outbound|9000||istio-tcp-stateful.default.svc.cluster.local",
      "type": "ORIGINAL_DST",
      "connect_timeout": "1s",
      "lb_policy": "ORIGINAL_DST_LB",
      "circuit_breakers": {
       "thresholds": [
        {
         "max_retries": 1024
        }
       ]
      }
     },
        "last_updated": "2019-04-11T09:30:51.987Z"
    }
  • And one egress listener - which is definitely an issue as mentioned above.

Did you see anything different in your tests?

@ramaraochavali (Contributor) commented Apr 12, 2019

I tried with a stateful gRPC service and it generated the following cluster of type EDS. Should it not generate as many clusters as there are nodes?

{
     "version_info": "2019-04-12T07:22:04Z/25",
     "cluster": {
      "name": "outbound|7443||istio-test.default.svc.cluster.local",
      "type": "EDS",
      "eds_cluster_config": {
       "eds_config": {
        "ads": {}
       },
       "service_name": "outbound|7443||istio-test.default.svc.cluster.local"
      },
      "connect_timeout": "1s",
      "circuit_breakers": {
       "thresholds": [
        {
         "max_retries": 1024
        }
       ]
      },
      "http2_protocol_options": {
       "max_concurrent_streams": 1073741824
      }
     },
@arielb135 commented Apr 14, 2019

Is there any update on this?
I've tried to attach Istio to RabbitMQ - with mTLS it failed, without it succeeded.

I've tried adding a ServiceEntry, but it just ignored the TLS and worked without it.

I'm reading mixed posts about it - what's the verdict?

@ramaraochavali (Contributor) commented Apr 15, 2019

I've tried to attach Istio to RabbitMQ - with mTLS it failed, without it succeeded.

Can you share your RabbitMQ service definition, how you attached Istio, and how you are accessing an individual node without mTLS? Even that is not working for me - I did not test RabbitMQ specifically, but a simple TCP service. I want to understand where the gap is.

@ramaraochavali (Contributor) commented Apr 15, 2019

@hzxuzhonghu Did you try an HTTP/gRPC stateful service? What do you think about #10659 (comment)? If we create only one EDS cluster, like for regular services, how can we access an individual node? WDYT?

@arielb135 commented Apr 16, 2019

I've tried to attach Istio to RabbitMQ - with mTLS it failed, without it succeeded.

Can you share your RabbitMQ service definition, how you attached Istio, and how you are accessing an individual node without mTLS? Even that is not working for me - I did not test RabbitMQ specifically, but a simple TCP service. I want to understand where the gap is.

Hi, so eventually I managed to set up mTLS successfully for RabbitMQ - which means it can be done with any StatefulSet.

The main gotchas are:

  • Register each stateful pod and its ports as a ServiceEntry.
  • If a pod uses its pod IP, that's a big no; it should use localhost / 127.0.0.1. If that is not possible, you have to exclude that port from mTLS.
  • Note that headless services do not participate in mTLS communication.

I've written a full article and attached working Helm charts with Istio and mTLS support:
https://github.com/arielb135/RabbitMQ-with-istio-MTLS

@ramaraochavali (Contributor) commented Apr 16, 2019

Register each stateful pod and its ports as a ServiceEntry

I did try registering each stateful pod as a ServiceEntry. But it did not create the necessary clusters/listeners in Istio 1.1.2. Which version of Istio did you try this with?

@arielb135 commented Apr 17, 2019

Register each stateful pod and its ports as a ServiceEntry

I did try registering each stateful pod as a ServiceEntry. But it did not create the necessary clusters/listeners in Istio 1.1.2. Which version of Istio did you try this with?

I'm using Istio 1.0.
Can you please describe your errors? Does RabbitMQ crash with the epmd error? If so, you'll have to exclude this port from the headless service, as I state in my article.

For example:

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:    
  name: rabbitmq-disable-mtls
  namespace: rabbitns
spec:
  targets:
  - name: rabbitmq-headless
    ports:
    - number: 4369

Another option is to go into the epmd code and change it to use localhost / 127.0.0.1 instead of the local IP, but I guess it's not worth it.

@ramaraochavali (Contributor) commented Apr 17, 2019

I am trying a simple TCP and HTTP service, not RabbitMQ (sorry for the confusion - I was thinking any TCP service should work similarly to RabbitMQ, which is why I asked for the config). I am creating a ServiceEntry as shown below:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: istio-http1-stateful-service-entry
  namespace: default
spec:
  hosts:
  - istio-http1-stateful-0.istio-http1-stateful.default.svc.cluster.local
  - istio-http1-stateful-1.istio-http1-stateful.default.svc.cluster.local
  - istio-http1-stateful-2.istio-http1-stateful.default.svc.cluster.local
  location: MESH_INTERNAL
  ports:
  - number: 7024
    name: http-port
    protocol: TCP
  resolution: NONE

and I am creating the Service as shown below:

apiVersion: v1
kind: Service
metadata:
  name: istio-http1-stateful
  labels:
    app: istio-http1-stateful
spec:
  ports:
  - name: http-port
    port: 7024
  selector:
    app: istio-http1-stateful
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: istio-http1-stateful
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: istio-http1-stateful
        version: v1
    spec:
      containers:
      - name: istio-http1-stateful
        image: h1-test-server:latest
        args:
        - "0.0.0.0:7024"
        imagePullPolicy: Always
        ports:
        - containerPort: 7024

I am not using mTLS. With this I am expecting clusters to be created for the ServiceEntry, but they are not. Could it be an Istio version difference?

@mbanikazemi (Contributor) commented Aug 8, 2019

StatefulSets work fine as long as the corresponding headless service (clusterIP: None) contains the port information. No need for ServiceEntries or anything else. Istio mTLS can be enabled/disabled as usual. The only caveat is that StatefulSets should avoid sharing ports.
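
As an illustration of the point above, a headless Service that declares its ports might look like this (the names and port are hypothetical):

apiVersion: v1
kind: Service
metadata:
  name: my-stateful-app        # hypothetical
spec:
  clusterIP: None              # headless
  selector:
    app: my-stateful-app
  ports:                       # the ports must be declared here
  - name: tcp-data
    port: 9000
    protocol: TCP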

@andrewjjenkins (Contributor) commented Aug 8, 2019

@mbanikazemi another caveat is that you cannot currently use mTLS if a Pod in a StatefulSet wants to connect to its own PodIP (#12551). This is common in many Kafka configurations.

@vadimeisenbergibm (Contributor) commented Sep 3, 2019

@andrewjjenkins @arielb135 @esnible @howardjohn I am confused about using ServiceEntries with host names for TCP protocols like the ones used by RabbitMQ. For a TCP protocol, the original host name is not known, so how could the ServiceEntry in https://github.com/arielb135/RabbitMQ-with-istio-MTLS be useful? How will Envoy use the rabbitmq-n.rabbitmq-discovery.rabbitns.svc.cluster.local hosts for an AMQP port?

@howardjohn (Member) commented Sep 3, 2019

@vadimeisenbergibm For TCP, hostnames are just used for DestinationRules/VirtualServices as far as I know, not by Envoy directly.

@hzxuzhonghu (Member, Author) commented Oct 15, 2019

Now that we support listeners for headless service instances and have split the inbound and outbound listeners, I think this has been fixed. @rshriram @lambdai Could we close this?

@rshriram (Member) commented Oct 15, 2019

yep

@hzxuzhonghu (Member, Author) commented Oct 15, 2019

/close

If anyone has some other issue, feel free to reopen it.

@howardjohn howardjohn closed this Oct 15, 2019
@cscetbon commented Oct 17, 2019

I think you should reopen this umbrella issue. I tried 1.3.3 and Cassandra nodes still can't connect to their local POD_IP, which prevents the cluster from working. I was expecting the inbound/outbound listener split to solve that issue, but it does not, if that split is in 1.3.3.

$ helm ls istio                  
NAME      	REVISION	UPDATED                 	STATUS  	CHART           	APP VERSION	NAMESPACE
istio     	1       	Wed Oct 16 22:57:36 2019	DEPLOYED	istio-1.3.3     	1.3.3      	istio-system
istio-init	1       	Wed Oct 16 22:52:19 2019	DEPLOYED	istio-init-1.3.3	1.3.3      	istio-system

$ kl --tail=10 cassandra-1 cassandra 
INFO  03:13:18 Cannot handshake version with cassandra-0.cassandra.cassandra-e2e.svc.cluster.local/10.244.1.15
INFO  03:13:18 Handshaking version with cassandra-0.cassandra.cassandra-e2e.svc.cluster.local/10.244.1.15
@ssoerensen commented Oct 24, 2019

@hzxuzhonghu Do you have a link to the relevant documentation or changelog?

@hzxuzhonghu (Member, Author) commented Oct 25, 2019

@huwany commented Oct 25, 2019

@ssoerensen commented Oct 28, 2019

@hzxuzhonghu That is awesome. Let me ask something different: what should we do differently or change? Right now we are using the RabbitMQ example that has been circulating around various issues.

  • Do we still need specific entries for each pod in a StatefulSet?
  • What do we do when we have both a headless and a ClusterIP service for a StatefulSet?
  • Is there any special step we need to take when using mTLS?
@hzxuzhonghu (Member, Author) commented Oct 28, 2019

Do we still need specific entries for each pod in a StatefulSet?

No.

What do we do when we have both a headless and a ClusterIP service for a StatefulSet?

From a client, you can access it by service name like a normal service. For peer-to-peer communication, you need to specify the pod DNS name (pattern shown below).

Is there any special step we need to take when using mTLS?

For some apps, you need to take care. FYI https://istio.io/faq/security/#mysql-with-mtls
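
For clarity, the pod DNS name mentioned above follows the usual pattern for a StatefulSet behind a headless Service; the example host below is hypothetical:

<pod-name>.<headless-service-name>.<namespace>.svc.cluster.local
e.g. rabbitmq-0.rabbitmq-headless.rabbitns.svc.cluster.local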

@cscetbon commented Nov 26, 2019

@howardjohn it seems you missed my comment #10659 (comment)

@arielb135 I've read your article and still can't make it work with Cassandra by disabling mTLS on the gossip ports. I'm currently trying to disable mTLS for all the ports it uses (and will, in the end, just encrypt the client connections). Here's what I use; if you can help, that'd be more than appreciated: https://pastebin.com/raw/pgDSH0RN. I can't understand why they still can't communicate. I see that my DestinationRule is applied:

{
        "name": "outbound|7001||cassandra.cassandra-e2e.svc.cluster.local",
        "type": "ORIGINAL_DST",
....
        "metadata": {
            "filterMetadata": {
                "istio": {
                    "config": "/apis/networking/v1alpha3/namespaces/cassandra-e2e/destination-rule/tls-only-native-port"
....

We can also see that mTLS is disabled for the service:

HOST:PORT                                                       STATUS       SERVER      CLIENT           AUTHN POLICY                                 DESTINATION RULE
cassandra.cassandra-e2e.svc.cluster.local:7000                  OK           DISABLE     DISABLE          cassandra-e2e/policy-disable-mtls            cassandra-e2e/tls-only-native-port
cassandra.cassandra-e2e.svc.cluster.local:7001                  OK           DISABLE     DISABLE          cassandra-e2e/policy-disable-mtls            cassandra-e2e/tls-only-native-port
cassandra.cassandra-e2e.svc.cluster.local:7199                  OK           DISABLE     DISABLE          cassandra-e2e/policy-disable-mtls            cassandra-e2e/tls-only-native-port
cassandra.cassandra-e2e.svc.cluster.local:9042                  OK           DISABLE     DISABLE          cassandra-e2e/policy-disable-mtls            cassandra-e2e/tls-only-native-port
@Kampe (Contributor) commented Jan 15, 2020

Can confirm this is still a problem in 1.4

@anannaya commented Jan 23, 2020

Is there any workaround to solve this issue?

@esnible esnible reopened this Jan 23, 2020
@esnible (Contributor) commented Jan 23, 2020

Reopening because this problem is still being reported.

If the maintainers feel this is mostly solved, please create a new issue for the parts that aren't solved and track that work there. We can't keep ignoring feedback regarding StatefulSets.

@anannaya commented Jan 24, 2020

https://github.com/istio/istio/pull/19992/files fixes the headless service problem. Now the actual problem is with stateful apps. Any workaround would be helpful.

@kaushiksrinivas commented Jan 30, 2020

We are facing an issue when trying mTLS with ZooKeeper.
Below is a summary of what has been done (a sketch of items 1 and 2 follows the summary):

  1. A default mTLS Policy with STRICT mode in the client/server namespace (both are in a single namespace).
  2. A default DestinationRule with the *.local wildcard host, with ISTIO_MUTUAL mode, in the same namespace.
  3. global.mtls is disabled.
  4. ZooKeeper quorumListenOnAllIPs=true, to make ZooKeeper listen on all IPs.
  5. No VirtualService or Gateway created.
  6. The 2888/3888 ports are excluded from Envoy sidecar interception:
    traffic.sidecar.istio.io/includeInboundPorts: "2181"
    traffic.sidecar.istio.io/excludeInboundPorts: "2888,3888"
    traffic.sidecar.istio.io/excludeOutboundIPRanges: "0.0.0.0/0"
    traffic.sidecar.istio.io/includeOutboundIPRanges: ""

With these settings, only the 2181 client port is exposed via Envoy, with mTLS.
There is a single ZooKeeper replica.
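
For reference, a minimal sketch of items 1 and 2 above, using the pre-1.5 authentication API that this comment appears to be using (resource names are illustrative; the namespace is taken from the FQDN used below):

apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
  namespace: istiotest
spec:
  peers:
  - mtls:
      mode: STRICT
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: default
  namespace: istiotest
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL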

To test this, I introduced another pod, with the ZooKeeper shell libraries and mTLS enabled, as a client.
Now, when the ZooKeeper shell is run from the client pod against the ZooKeeper pod FQDN (zk-zkist-0.zk-zkist-headless.istiotest.svc.cluster.local), we see the errors and warnings below:

WARN Session 0x0 for server zk-zkist-0.zk-zkist-headless.istiotest.svc.cluster.local/x.x.x.x:2181, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.io.IOException: Packet len352518400 is out of range!
at org.apache.zookeeper.ClientCnxnSocket.readLength(ClientCnxnSocket.java:113)
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:79)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /
at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1541)
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1569)
at org.apache.zookeeper.ZooKeeperMain.processZKCmd(ZooKeeperMain.java:732)
at org.apache.zookeeper.ZooKeeperMain.processCmd(ZooKeeperMain.java:600)
at org.apache.zookeeper.ZooKeeperMain.executeLine(ZooKeeperMain.java:372)
at org.apache.zookeeper.ZooKeeperMain.run(ZooKeeperMain.java:358)
at org.apache.zookeeper.ZooKeeperMain.main(ZooKeeperMain.java:291)

zookeeper logs the below:
"type":"log", "host":"zk-zkist-0", "level":"WARN", "neid":"zookeeper-902a7179214b4780bf0189fceb111b59", "system":"zookeeper", "time":"2020-01-30T07:19:31.766Z", "timezone":"UTC", "log":{"message":"NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181 - org.apache.zookeeper.server.NIOServerCnxn - Unable to read additional data from client sessionid 0x0, likely client has closed socket"}}
{"type":"log", "host":"zk-zkist-0", "level":"INFO", "neid":"zookeeper-902a7179214b4780bf0189fceb111b59", "system":"zookeeper", "time":"2020-01-30T07:19:31.766Z", "timezone":"UTC", "log":{"message":"NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181 - org.apache.zookeeper.server.NIOServerCnxn - Closed socket connection for client /127.0.0.1:47750 (no session established for client)"}}

Note: the ZooKeeper process is up; I confirmed this by running the same command from within the ZooKeeper container itself, and it works fine.
Also, when we delete the mTLS policy, things work fine even from the client pod.

This issue occurs only when mTLS is enabled.
Do you see any issues in the configuration above?

@anannaya commented Jan 30, 2020

@kaushiksrinivas From your test client, is DNS resolution happening for zk-zkist-0.zk-zkist-headless.istiotest.svc.cluster.local?

@kaushiksrinivas commented Jan 30, 2020

It is happening: I can see the warning logs on the ZooKeeper server as soon as I run my client commands, and when mTLS is disabled this works fine. I even tried adding a ServiceEntry with this FQDN, and it still does not work.

telnet zk-zkist-0.zk-zkist-headless.istiotest.svc.cluster.local 2181
Trying 192.168.1.61...
Connected to zk-zkist-0.zk-zkist-headless.istiotest.svc.cluster.local.

@rpocase commented Feb 13, 2020

Edit: I've since been able to get this to work, but I'm preserving the text below in case it's helpful for anyone else. I ended up needing to label grpcs services as tcp instead of http or grpc (see the sketch at the end of this comment). This still feels like a bug on Istio/Envoy's end, but it could just be an incompatibility with my current application requirements (i.e., if I were using Istio mTLS or grpc instead of grpcs, maybe it would work).

I am working on integrating sidecars into a PERMISSIVE default-profile istioctl installation. I am hitting issues once I attempt to communicate with a handful of StatefulSet services (specifically, Hyperledger Fabric nodes). Communication works prior to turning on sidecar injection.

These nodes are configured with one-way grpcs. I've seen a lot of independent problems with directly configured TLS in permissive mode, StatefulSets, and grpc. Are there any instructions/debugging tips on how to narrow down where my problem might lie?

If it helps, APGGroeiFabriek/PIVT is my reference implementation (using the single-node Raft samples as a starting point). I've made local modifications to implement a headless service for both the orderer and the peer. I'm happy to try to provide a fork with the relevant changes to narrow down issues, but it will take some time to untangle them from internal assumptions.
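
For reference, the port-renaming workaround described in the edit above (Istio selects the protocol from the port-name prefix) might look roughly like this; the service name and port are illustrative, not taken from the PIVT charts:

apiVersion: v1
kind: Service
metadata:
  name: fabric-peer            # illustrative name
spec:
  clusterIP: None
  selector:
    app: fabric-peer
  ports:
  - name: tcp-grpcs            # previously named e.g. grpc-peer; the tcp- prefix makes Istio treat the traffic as opaque TCP
    port: 7051
    protocol: TCP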
