
Ceph RBD CSI plugin #117

Merged (2 commits into main, Jun 20, 2023)

Conversation


@tgross (Member) commented Apr 14, 2022

A pack for deploying the Ceph RBD CSI plugin and a Ceph demo

Example runs in my Vagrant cluster environment:

render ceph output
$ nomad-pack render ceph
ceph/ceph.nomad:

job "ceph" {
  namespace   = "default"
  region      = "global"
  datacenters = ["dc1"]

  group "ceph" {
    constraint {
      attribute = "${attr.kernel.name}"
      value     = "linux"
    }

    constraint {
      attribute = "${attr.driver.docker.privileged.enabled}"
      value     = true
    }

    network {
      # we can't configure networking in a way that will both satisfy the Ceph
      # monitor's requirement to know its own IP address *and* be routable
      # between containers, without either CNI or fixing
      # https://github.com/hashicorp/nomad/issues/9781
      #
      # So for now we'll use host networking to keep this demo understandable.
      # That also means the controller plugin will need to use host addresses.
      mode = "host"
    }

    service {
      name = "ceph-mon"
      port = 3300
    }

    service {
      name = "ceph-dashboard"
      port = 5000

      check {
        type           = "http"
        interval       = "5s"
        timeout        = "1s"
        path           = "/"
        initial_status = "warning"
      }
    }

    task "ceph" {
      driver = "docker"

      config {
        image        = "ceph/daemon:latest-octopus"
        args         = ["demo"]
        network_mode = "host"
        privileged   = true

        mount {
          type   = "bind"
          source = "local/ceph"
          target = "/etc/ceph"
        }
      }
      resources {
        cpu    = 256
        memory = 600
      }

      template {

        data = <<EOT
MON_IP={{ sockaddr "with $ifAddrs := GetDefaultInterfaces | include \"type\" \"IPv4\" | limit 1 -}}{{-
range $ifAddrs -}}{{ attr \"address\" . }}{{ end }}{{ end " }}
CEPH_PUBLIC_NETWORK=0.0.0.0/0
CEPH_DEMO_UID=demo
CEPH_DEMO_BUCKET=example
EOT

        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }

      template {

        data        = <<EOT
[global]
fsid = eb34e46e-90c1-4a72-932e-5c305c31e1ed
mon initial members = {{ env "attr.unique.hostname" }}
mon host = v2:{{ sockaddr "with $ifAddrs := GetDefaultInterfaces | include \"type\" \"IPv4\" | limit 1
-}}{{- range $ifAddrs -}}{{ attr \"address\" . }}{{ end }}{{ end " }}:3300/0

osd crush chooseleaf type = 0
osd journal size = 100
public network = 0.0.0.0/0
cluster network = 0.0.0.0/0
osd pool default size = 1
mon warn on pool no redundancy = false
osd_memory_target =  939524096
osd_memory_base = 251947008
osd_memory_cache_min = 351706112
osd objectstore = bluestore

[osd.0]
osd data = /var/lib/ceph/osd/ceph-0

[client.rgw.linux]
rgw dns name = {{ env "attr.unique.hostname" }}
rgw enable usage log = true
rgw usage log tick interval = 1
rgw usage log flush threshold = 1
rgw usage max shards = 32
rgw usage max user shards = 1
log file = /var/log/ceph/client.rgw.linux.log
rgw frontends = beast  endpoint=0.0.0.0:8080


EOT

        destination = "${NOMAD_TASK_DIR}/ceph/ceph.conf"
      }
    }
  }
}
run ceph output
$ nomad-pack run ./ceph
  Evaluation ID: 497970f1-684f-d89d-4e8e-f5c4ccd221f3
  Job 'ceph' in pack deployment 'ceph' registered successfully
Pack successfully deployed. Use ./ceph to manage this deployed instance with plan, stop,
destroy, or info

For the Ceph RBD CSI pack, you'll need the ceph_cluster_id and
ceph_monitor_service_name variables. If you haven't set ceph_cluster_id, it
will have been automatically generated and you can find it in the Ceph
allocation file system. Get the "fsid" value here:

    nomad alloc fs :alloc_id ceph/local/ceph/ceph.conf | awk -F' = ' '/fsid/{print $2}'
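
To chain those two steps together, you can capture the fsid into a shell
variable and hand it straight to the CSI pack. A rough sketch (substitute the
real allocation ID for :alloc_id):

    CEPH_CLUSTER_ID=$(nomad alloc fs :alloc_id ceph/local/ceph/ceph.conf | awk -F' = ' '/fsid/{print $2}')
    nomad-pack run -var ceph_cluster_id="$CEPH_CLUSTER_ID" ./ceph_rbd_csi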

To create volumes, you'll need the client.admin key value from the Ceph mon
keyring. Read that with:

   nomad alloc fs :alloc_id ceph/local/ceph/ceph.mon.keyring

You'll set this value in the volume secrets block. For example:

secrets {
  userID  = "admin"
  userKey = "AQDsIoxgHqpe..ZdIzA=="
}
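
If you'd rather script that step too, something along these lines pulls just
the key out of the keyring (a sketch that assumes the usual keyring layout,
where the [client.admin] section carries a "key = ..." line):

    nomad alloc fs :alloc_id ceph/local/ceph/ceph.mon.keyring | awk '/\[client.admin\]/{found=1} found && /key = /{print $3; exit}'
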
render ceph_rbd_csi output
$ nomad-pack render -var ceph_cluster_id=e4008158-522b-4fd7-aed8-62b5e79dd293 ./ceph_rbd_csi
ceph_rbd_csi/node.nomad:

job "ceph_rbd_node" {
  # you can run node plugins as service jobs as well, but this ensures
  # that all nodes in the DC have a copy.
  type = "system"
  namespace   = "default"
  region      = "global"
  datacenters = ["dc1"]

  group "nodes" {

    constraint {
      attribute = "${attr.kernel.name}"
      value     = "linux"
    }
    constraint {
      attribute = "${attr.driver.docker.privileged.enabled}"
      value     = true
    }

    network {
      port "prometheus" {}
    }

    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "quay.io/cephcsi/cephcsi:canary"

        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--nodeserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]

        privileged = true
        ports      = ["prometheus"]
      }

      template {
        data = <<-EOT
POD_ID=${NOMAD_ALLOC_ID}
NODE_ID=${node.unique.id}
CSI_ENDPOINT=unix://csi/csi.sock
EOT

        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }

      csi_plugin {
        id        = "rbd.csi.ceph.com"
        type      = "node"
        mount_dir = "/csi"
      }
      resources {
        cpu    = 256
        memory = 256
      }

    }
  }
}

ceph_rbd_csi/controller.nomad:

job "ceph_rbd_controller" {

  namespace   = "default"
  region      = "global"
  datacenters = ["dc1"]

  group "controllers" {

    constraint {
      attribute = "${attr.kernel.name}"
      value     = "linux"
    }

    network {
      port "prometheus" {}
    }

    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "quay.io/cephcsi/cephcsi:canary"

        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--controllerserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]

        ports      = ["prometheus"]

        # we need to be able to write key material to disk in this location
        mount {
          type     = "bind"
          source   = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }

        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }

      }

      template {
        data = <<-EOT
POD_ID=${NOMAD_ALLOC_ID}
NODE_ID=${node.unique.id}
CSI_ENDPOINT=unix://csi/csi.sock
EOT

        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }

      # ceph configuration file
      template {

        data = <<EOF
[{
    "clusterID": "e4008158-522b-4fd7-aed8-62b5e79dd293",
    "monitors": [
        {{range $index, $service := service "ceph-mon"}}{{if gt $index 0}}, {{end}}"{{.Address}}"{{end}}
    ]
}]
EOF

        destination = "ceph-csi-config/config.json"
      }

      csi_plugin {
        id        = "rbd.csi.ceph.com"
        type      = "controller"
        mount_dir = "/csi"
      }
      resources {
        cpu    = 256
        memory = 256
      }

    }
  }
}

run ceph_rbd_csi output
$ nomad-pack run -var ceph_cluster_id=e4008158-522b-4fd7-aed8-62b5e79dd293 ./ceph_rbd_csi
  Evaluation ID: 0bcb04c4-0619-0d8c-a2c0-cdfc75aa4d7a
  Job 'ceph_rbd_controller' in pack deployment 'ceph_rbd_csi' registered successfully
  Evaluation ID: 5c4e991d-0df5-12a3-1680-a1bc33efd6be
  Job 'ceph_rbd_node' in pack deployment 'ceph_rbd_csi' registered successfully
Pack successfully deployed. Use ./ceph_rbd_csi to manage this deployed instance with plan,
stop, destroy, or info

example volume spec

id        = "myvolume"
name      = "myvolume"
namespace = "default"
type      = "csi"
plugin_id = "rbd.csi.ceph.com"

capacity_min = "10GiB"
capacity_max = "30GiB"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "block-device"
}

# get this secret from the Ceph allocation:
# /etc/ceph/ceph.client.admin.keyring
secrets {
  userID  = "admin"
  userKey = "AQDsIoxgHqpe...spTbvwZdIzA=="
}

parameters {
  clusterID     = "e4008158-522b-4fd7-aed8-62b5e79dd293"
  pool          = "rbd"
  imageFeatures = "layering"
}

Resulting healthy plugins:

$ nomad plugin status rbd
ID                   = rbd.csi.ceph.com
Provider             = rbd.csi.ceph.com
Version              = canary
Controllers Healthy  = 2
Controllers Expected = 2
Nodes Healthy        = 2
Nodes Expected       = 2

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
8d50c40b  0357bda7  controllers 2        run      running  1m26s ago  21s ago
9cda73da  37514e8a  controllers 2        run      running  1m26s ago  30s ago
d8e712d6  37514e8a  nodes       2        run      running  1m26s ago  30s ago
7101dd8e  0357bda7  nodes       2        run      running  1m26s ago  30s ago
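
With both plugins healthy, the volume spec above can be created (a sketch,
assuming it's saved as volume.hcl):

    nomad volume create ./volume.hcl

and a minimal job that claims it might look like this (the job name, image,
and mount path are placeholders, not part of the pack):

job "csi-demo" {
  datacenters = ["dc1"]

  group "app" {
    volume "myvolume" {
      type            = "csi"
      source          = "myvolume"
      access_mode     = "single-node-writer"
      attachment_mode = "file-system"
    }

    task "web" {
      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        args    = ["-v", "-f", "-p", "8001", "-h", "/srv"]
      }

      # mount the Ceph RBD volume into the container
      volume_mount {
        volume      = "myvolume"
        destination = "/srv"
      }

      resources {
        cpu    = 128
        memory = 128
      }
    }
  }
}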

@mikenomitch merged commit 58f0692 into main on Jun 20, 2023