Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bad things happen when a DaemonSet with hostNetwork:true is deployed #366

Closed
jpweber opened this issue Feb 15, 2018 · 5 comments
Closed
Assignees
Labels
Milestone

Comments

@jpweber
Copy link

jpweber commented Feb 15, 2018

When deploying a DaemonSet with hostNetwork set to true an assortment of network related issues happened. kube-apiserver was not reachable. Hosts could not be SSHed in to, conduit gui could not connect to conduit components etc.

I'm raising this issue not because I want to deploy a DaemonSet with conduit injected in to and use hostNetwork. But I don't think it should effectively take down a cluster either. There has to be a way to protect from this. To replicate the issue deploy the manifest below.

This occurred in the following environment.
k8s 1.9.3
ubuntu 16.04

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  labels:
    app: node-exporter
    name: node-exporter
  name: node-exporter
spec:
  clusterIP: None
  ports:
  - name: scrape
    port: 9100
    protocol: TCP
  selector:
    app: node-exporter
  type: ClusterIP
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  template:
    metadata:
      labels:
        app: node-exporter
      name: node-exporter
    spec:
      containers:
      - image: prom/node-exporter
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100 #comment these out for it to work
          name: scrape
      hostNetwork: true #comment these out for it to work
      hostPID: true
      tolerations:
        - effect: NoSchedule
          operator: Exists
@briansmith
Copy link
Contributor

Is this caused buy the init container rewriting the iptables rules for the entire host when hostNetwork: true?

I believe conduit inject needs to either ignore hostNetwork: true pods (with a warning to stderr?) or exit with an error message and non-zero exit code when a hostNetwork: true pod is used.

See istio/istio.io#655.

/cc @pcalcado

@briansmith
Copy link
Contributor

Also, are we really intending to inject into deamonsets by default? We have code that appears to do so, but I don't know that we actually intend to do that. I don't think the UI is expecting injections into non-pod objects, at least. For example, the UI doesn't discern the type of object and might even group a pod and a deamonset with the same name together or do other bad things. I think we should restrict injection to pods exclusively and file a follow-up issue about spec'ing what, if anything, to do about deamonsets.

@olix0r
Copy link
Member

olix0r commented Feb 16, 2018

Wow, @jpweber, nice find!


@briansmith

are we really intending to inject into deamonsets by default?

the UI doesn't discern the type of object and might even group a pod and a deamonset with the same name together or do other bad things

This isn't how I understand daemonsets -- daemonsets are like replicacontrollers or deployments, essentially, in that they are a parent resource that creates pods. So I think daemonsets are very much analogous to what we refer to as "deployments" in the UI at the moment.

There's no reason that I'm aware of that we wouldn't want to support injecting daemonsets.

However, it seems we absolutely must avoid adding proxy-init into any resource that has hostNetwork: true.

@wmorgan
Copy link
Member

wmorgan commented Feb 16, 2018

@olix0r should we try to get a fix in for 0.3? "conduit inject may effectively take down a cluster" is a pretty big caveat for a release.

@olix0r olix0r added this to the Conduit 0.3 milestone Feb 16, 2018
@briansmith briansmith self-assigned this Feb 16, 2018
@olix0r olix0r added the review/ready Issue has a reviewable PR label Feb 16, 2018
@wmorgan
Copy link
Member

wmorgan commented Feb 17, 2018

@jpweber We'll fix this for 0.3, scheduled for next week.

briansmith added a commit that referenced this issue Feb 18, 2018
Refactor `conduit inject` code to make it unit-testable.

Refactor the conduit inject code to make it easier to add unit tests. This work was done by @deebo91 in #365. This is the same PR without the conduit install changes, so that it can land ahead of #365. In particular, this will be used for testing the fix for high-priority bug #366.

Signed-off-by: Dennis Adjei-Baah <dennis@buoyant.io>
Signed-off-by: Brian Smith <brian@briansmith.org>
briansmith added a commit that referenced this issue Feb 18, 2018
The init container injected by conduit inject rewrites the iptables configuration for its network namespace. This causes havoc when the network namespace isn't restricted to the pod, i.e. when hostNetwork=true.

Skip pods with hostNetwork=true to avoid this problem.

Fixes #366.

Signed-off-by: Brian Smith <brian@briansmith.org>
@olix0r olix0r removed the review/ready Issue has a reviewable PR label Feb 18, 2018
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 18, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants