Skip to content

Alpha Nodes continually running out of memory and OOM killing #5400

@guhan-v

Description

@guhan-v

What version of Dgraph are you using?

20.03.1

Have you tried reproducing the issue with the latest release?

This is the latest release

What is the hardware spec (RAM, OS)?

K8s running on 3 M5.xLarge (4 vcpu/16 gb ram)

Steps to reproduce the issue (command/config used to run Dgraph).

K8s yaml for cluster setup

apiVersion: v1
kind: Service
metadata:
  name: dgraph-alpha-public
  labels:
    app: dgraph-alpha
    monitor: alpha-dgraph-io
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    targetPort: 8080
    name: alpha-http
  - port: 9080
    targetPort: 9080
    name: alpha-grpc
  selector:
    app: dgraph-alpha
---
apiVersion: v1
kind: Service
metadata:
  name: dgraph-alpha-private
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
  labels:
    app: dgraph-alpha
    monitor: alpha-dgraph-io
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    targetPort: 8080
    name: alpha-http
  - port: 9080
    targetPort: 9080
    name: alpha-grpc
  selector:
    app: dgraph-alpha
# ---
# # This service is created in-order to debug & profile a specific alpha.
# # You can create one for each alpha that you need to profile.
# # For a more general HTTP APIs use the above service instead.
# apiVersion: v1
# kind: Service
# metadata:
#   name: dgraph-alpha-0-http-public
#   labels:
#     app: dgraph-alpha
# spec:
#   type: LoadBalancer
#   ports:
#   - port: 8080
#     targetPort: 8080
#     name: alpha-http
#   selector:
#     statefulset.kubernetes.io/pod-name: dgraph-alpha-0
---
apiVersion: v1
kind: Service
metadata:
  name: dgraph-ratel-public
  labels:
    app: dgraph-ratel
spec:
  type: LoadBalancer
  ports:
  - port: 8000
    targetPort: 8000
    name: ratel-http
  selector:
    app: dgraph-ratel
---
# This is a headless service which is necessary for discovery for a dgraph-zero StatefulSet.
# https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#creating-a-statefulset
apiVersion: v1
kind: Service
metadata:
  name: dgraph-zero
  labels:
    app: dgraph-zero
spec:
  ports:
  - port: 5080
    targetPort: 5080
    name: zero-grpc
  clusterIP: None
  # We want all pods in the StatefulSet to have their addresses published for
  # the sake of the other Dgraph Zero pods even before they're ready, since they
  # have to be able to talk to each other in order to become ready.
  publishNotReadyAddresses: true
  selector:
    app: dgraph-zero
---
# This is a headless service which is necessary for discovery for a dgraph-alpha StatefulSet.
# https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/#creating-a-statefulset
apiVersion: v1
kind: Service
metadata:
  name: dgraph-alpha
  labels:
    app: dgraph-alpha
spec:
  ports:
  - port: 7080
    targetPort: 7080
    name: alpha-grpc-int
  clusterIP: None
  # We want all pods in the StatefulSet to have their addresses published for
  # the sake of the other Dgraph alpha pods even before they're ready, since they
  # have to be able to talk to each other in order to become ready.
  publishNotReadyAddresses: true
  selector:
    app: dgraph-alpha
---
# This StatefulSet runs 3 Dgraph Zero.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dgraph-zero
spec:
  serviceName: "dgraph-zero"
  replicas: 3
  selector:
    matchLabels:
      app: dgraph-zero
  template:
    metadata:
      labels:
        app: dgraph-zero
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values: ["dgraph-zero"]
              topologyKey: kubernetes.io/hostname
      containers:
      - name: zero
        image: dgraph/dgraph:v20.03.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 5080
          name: zero-grpc
        - containerPort: 6080
          name: zero-http
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        env:
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: GODEBUG
            value: madvdontneed=1
        command:
          - bash
          - "-c"
          - |
            set -ex
            [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
            ordinal=${BASH_REMATCH[1]}
            idx=$(($ordinal + 1))
            if [[ $ordinal -eq 0 ]]; then
              exec dgraph zero --my=$(hostname -f):5080 --idx $idx --replicas 3 -v=2
            else
              exec dgraph zero --my=$(hostname -f):5080 --peer dgraph-zero-0.dgraph-zero.${POD_NAMESPACE}.svc.cluster.local:5080 --idx $idx --replicas 3 -v=2
            fi
        livenessProbe:
          httpGet:
            path: /health
            port: 6080
          initialDelaySeconds: 15
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 6
          successThreshold: 1
        readinessProbe:
          httpGet:
            path: /state
            port: 6080
          initialDelaySeconds: 15
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 6
          successThreshold: 1
      terminationGracePeriodSeconds: 60
      volumes:
      - name: datadir
        persistentVolumeClaim:
          claimName: datadir
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 5Gi
---
# This StatefulSet runs 3 replicas of Dgraph Alpha.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dgraph-alpha
spec:
  serviceName: "dgraph-alpha"
  replicas: 3
  selector:
    matchLabels:
      app: dgraph-alpha
  template:
    metadata:
      labels:
        app: dgraph-alpha
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values: ["dgraph-alpha"]
              topologyKey: kubernetes.io/hostname
      containers:
      - name: alpha
        image: dgraph/dgraph:v20.03.1
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 7080
          name: alpha-grpc-int
        - containerPort: 8080
          name: alpha-http
        - containerPort: 9080
          name: alpha-grpc
        volumeMounts:
        - name: datadir
          mountPath: /dgraph
        resources:
          requests:
            memory: "2Gi"
            cpu: "1"
          limits:
            memory: "12Gi"
            cpu: "2"
        env:
          # This should be the same namespace as the dgraph-zero
          # StatefulSet to resolve a Dgraph Zero's DNS name for
          # Alpha's --zero flag.
          - name: POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
          - name: GODEBUG
            value: madvdontneed=1
        command:
          - bash
          - "-c"
          - |
            set -ex
            dgraph alpha --my=$(hostname -f):7080 --lru_mb 2048 --zero dgraph-zero-0.dgraph-zero.${POD_NAMESPACE}.svc.cluster.local:5080 -v=2
        livenessProbe:
          httpGet:
            path: /health?live=1
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 6
          successThreshold: 1
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 6
          successThreshold: 1
      terminationGracePeriodSeconds: 600
      volumes:
      - name: datadir
        persistentVolumeClaim:
          claimName: datadir
  updateStrategy:
    type: RollingUpdate
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes:
        - "ReadWriteOnce"
      resources:
        requests:
          storage: 100Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dgraph-ratel
  labels:
    app: dgraph-ratel
spec:
  selector:
    matchLabels:
      app: dgraph-ratel
  template:
    metadata:
      labels:
        app: dgraph-ratel
    spec:
      containers:
      - name: ratel
        image: dgraph/dgraph:v20.03.1
        ports:
        - containerPort: 8000
        command:
          - dgraph-ratel

Hitting the Alpha load balancer with ~25-75 mutations/sec to ingest data into the graph or really any consistent flow of data into the alpha nodes

Expected behaviour and actual result.

Expected behaviour:

  • Alpha nodes release unused memory and DONT OOM kill themselves and restart. They are drinking every drop of memory available to them, which was bumped up to 12GB, and it still did not release that memory. Given that the mutation rate is only 25-75/sec it seems like 12GB/alpha Pod should be plenty of memory.
    Behavior Continued EVEN when GODEBUG=madvdontneed=1 was used

Actual Result:

  • Below shows spiking and failing of dgraph alpha nodes(Back when I put the memory limit at 6GB. Same things are happening with 12 GB

Dgraph - Grafana 2020-05-07 18-44-56

  • Continual cycle of OOM kill and crash loop backoff causing restarting
    dgraph-kubernetes-pulumi 2020-05-08 09-11-24

  • Image of the Pod cycling through OOM errors
    dgraph-kubernetes-pulumi 2020-05-08 09-12-36

  • Pprof logs do not show the same memory restraints, and only show 2GB used even when container is dying.
    Pprof inuse_objects show a rather high amount of objects though.
    I believe the issue lies with lack of GC, or potentially a Memory leak within the Alpha pods

pprof.dgraph.alloc_objects.alloc_space.inuse_objects.inuse_space.009.pb.gz
pprof.dgraph.samples.cpu.001.pb.gz

This issue is Blocking our team, so any help would be greatly appreciated

Metadata

Metadata

Assignees

No one assigned

    Labels

    area/graphqlIssues related to GraphQL support on Dgraph.kind/bugSomething is broken.status/acceptedWe accept to investigate/work on it.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions