Ceph

Node reboot

  1. Disable rebalancing temporarily

    $ ceph osd set noout
    noout is set
    $ ceph osd set norebalance
    norebalance is set
    $ ceph -s
      cluster:
        id:     xxx
        health: HEALTH_WARN
                noout,norebalance flag(s) set
    [...]
  2. Reboot the node

    $ sudo reboot
  3. When the reboot is complete, enable cluster rebalancing again

    $ ceph osd unset noout
    noout is unset
    $ ceph osd unset norebalance
    norebalance is unset
    $ ceph -s
      cluster:
        id:     xxx
        health: HEALTH_OK
    [...]
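
When rebooting several nodes in sequence, wait until the cluster reports HEALTH_OK again before proceeding with the next node. A minimal sketch, assuming the ceph CLI is available in the current shell:

    $ while ! ceph health | grep -q HEALTH_OK; do sleep 10; done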

Cluster start and stop

Stop

Ensure that any services/clients using Ceph are stopped and that the cluster is in a healthy state.

  1. Set OSD flags

    $ ceph osd set noout
    $ ceph osd set nobackfill
    $ ceph osd set norecover
    $ ceph osd set norebalance
    $ ceph osd set nodown
    $ ceph osd set pause
    $ ceph -s
      cluster:
      [...]
        health: HEALTH_WARN
                pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover flag(s) set
    
      services:
      [...]
        osd: x osds: y up, z in
             flags pauserd,pausewr,nodown,noout,nobackfill,norebalance,norecover
  2. Stop the management services (manager, mds, ...), node by node

    $ sudo systemctl stop ceph-mgr\*.service
  3. Stop the osd services (node by node)

    $ sudo systemctl stop ceph-osd\*.service
  4. Stop the monitor service (node by node)

    $ sudo systemctl stop ceph-mon\*.service

Start

  1. Start the monitor services (node by node)

    $ sudo systemctl start ceph-mon\*.service
  2. Start the osd services (node by node)

    $ sudo systemctl start ceph-osd@DEVICE.service
  3. Start the management services (manager, mds, ...), node by node

    $ sudo systemctl start ceph-mgr\*.service
  4. Unset OSD flags

    $ ceph osd unset pause
    $ ceph osd unset nodown
    $ ceph osd unset norebalance
    $ ceph osd unset norecover
    $ ceph osd unset nobackfill
    $ ceph osd unset noout
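
Setting and unsetting all flags can also be scripted; a small convenience sketch:

    $ for flag in noout nobackfill norecover norebalance nodown pause; do ceph osd set $flag; done
    $ for flag in pause nodown norebalance norecover nobackfill noout; do ceph osd unset $flag; done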

Check

$ sudo systemctl status ceph\*.service
$ ceph -s
  cluster:
    id:     x
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum A,B,C
    mgr: A(active), standbys: B, C
    mds: cephfs-0/0/1 up
    osd: x osds: y up, z in

  data:
    pools:   7 pools, 176 pgs
    objects: 2816 objects, 18856 MB
    usage:   69132 MB used, 44643 GB / 44711 GB avail
    pgs:     176 active+clean

Deep scrub distribution

  • Distribution per weekday:

    $ for date in $(ceph pg dump | grep active | awk '{ print $20 }'); do date +%A -d $date; done | sort | uniq -c
  • Distribution per hour:

    $ for date in $(ceph pg dump | grep active | awk '{ print $21 }'); do date +%H -d $date; done | sort | uniq -c
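  • Combined distribution per weekday and hour; a sketch, noting that the awk column numbers of the deep-scrub timestamp ($20/$21 here) depend on the Ceph release:

    $ ceph pg dump | grep active | awk '{ print $20, $21 }' | while read day time; do date +"%A %H" -d "$day $time"; done | sort | uniq -c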

Set the number of placement groups

$ ceph osd pool set {pool-name} pg_num {pg_num}
set pool x pg_num to {pg_num}
$ ceph osd pool set {pool-name} pgp_num {pgp_num}
set pool x pgp_num to {pgp_num}

The new number of PGs should also be updated in environments/ceph/configuration.yml.

A health warning such as the following indicates that the PG count of a pool should be adjusted:

1 pools have many more objects per pg than average
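
The current values can be queried before changing them; the pool name sample is only an example here:

$ ceph osd pool get sample pg_num
pg_num: 32
$ ceph osd pool get sample pgp_num
pgp_num: 32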

Logging

  • Ceph daemons are configured to log to the console instead of log files. OSDs are configured to log to MONs.

    $ docker logs ceph-mon-ceph01
  • Logs can become very large. docker logs provides useful parameters to show only the newest entries and to follow new messages as they appear.

    $ docker logs --tail 100 --follow ceph-mon-ceph01
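  • docker logs also accepts --since to restrict the output to a time window; relative values such as 1h work as well:

    $ docker logs --since 1h ceph-mon-ceph01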

Add new OSD

  • Add the new device to the devices list in the inventory of the corresponding host
  • Execute osism-ceph osds -l HOST on the manager node
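  • For example, assuming the new device /dev/sdl was added to the devices list of host ceph04 (names are illustrative):

    $ osism-ceph osds -l ceph04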

Replace defective OSD

  • Locate the defective OSD

    $ ceph osd metadata osd.22
      "bluefs_slow_dev_node": "sdk",
      "hostname": "ceph04",
    
    $ ssh ceph04
    $ dmesg -T | grep sdk | grep -i error
      ...
      blk_update_request: I/O error, dev sdk, sector 7501476358
      Buffer I/O error on dev sdk1, logical block 7470017030, async page read
      blk_update_request: I/O error, dev sdk, sector 7501476359
      Buffer I/O error on dev sdk1, logical block 7470017031, async page read
  • Identify and replace the affected hardware

    $ sudo udevadm info --query=all --name=/dev/sdk
    $ sudo hdparm -I /dev/sdk
  • Disable the defective OSD/disk

    $ ceph osd out 22
    $ sudo systemctl stop ceph-osd@sdk.service
    $ ceph osd purge osd.22
  • Prepare new OSD

    $ docker start -ai ceph-osd-prepare-ceph04-sdk
    $ sudo systemctl start ceph-osd@sdk.service
  • Add the OSD back to the CRUSH tree

    $ ceph osd df tree
       CLASS WEIGHT REWEIGHT SIZE   USE    AVAIL  %USE  VAR TYPE NAME
                7.4       -  3709G  2422G  1287G 65.30 1.06  hdd ceph04-hdd
        hdd     3.7       0      0      0      0     0    0        osd.22
        hdd     3.7 1.00000  3709G  2422G  1287G 65.30 1.08        osd.6
        ...
        hdd     0.0       0      0      0      0     0    0 osd.27
    
    $ ceph osd crush create-or-move osd.22 3.7 hdd=ceph04-hdd
    $ ceph osd df tree
       CLASS WEIGHT REWEIGHT SIZE   USE    AVAIL  %USE  VAR TYPE NAME
                7.4       -  3709G  2422G  1287G 65.30 1.06  hdd ceph04-hdd
        hdd     3.7 1.00000  3709G      0  3709G     0    0        osd.22
        hdd     3.7 1.00000  3709G  2422G  1287G 65.30 1.08        osd.6
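  • Watch the cluster until the backfill onto the new OSD has finished (ceph -w streams cluster status events; stop it with Ctrl-C)

    $ ceph -w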

Add new pool

$ ceph osd pool create sample 32 32
pool 'sample' created
$ ceph osd pool application enable sample rbd
enabled application 'rbd' on pool 'sample'
$ ceph auth get client.cinder
[client.cinder]
   key = ...
   caps mon = "allow r"
   caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=vms, allow rx pool=images"
exported keyring for client.cinder
$ ceph auth caps client.cinder mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=images, allow rwx pool=vms, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=sample'
updated caps for client.cinder
$ ceph auth get client.nova
[client.nova]
   key = ...
   caps mon = "allow r"
   caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=images, allow rwx pool=vms, allow rwx pool=volumes, allow rwx pool=backups"
exported keyring for client.nova
$ ceph auth caps client.nova mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=images, allow rwx pool=vms, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=sample'
updated caps for client.nova
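
Whether the pool and its application tag were created as intended can be verified with:

$ ceph osd pool ls detail | grep sample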

Export image

$ rbd export --pool=volumes volume-035f3636-ad68-4562-88f5-11d7e295d03e /home/dragon/035f3636-ad68-4562-88f5-11d7e295d03e.img
$ docker cp cephclient_cephclient_1:/home/dragon/035f3636-ad68-4562-88f5-11d7e295d03e.img /tmp
$ docker exec -it cephclient_cephclient_1 rm -f /home/dragon/035f3636-ad68-4562-88f5-11d7e295d03e.img
$ rm -f /tmp/035f3636-ad68-4562-88f5-11d7e295d03e.img
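
The counterpart for bringing an exported image back into the cluster is rbd import; a sketch, with the target image name chosen freely:

$ rbd import --pool=volumes /tmp/035f3636-ad68-4562-88f5-11d7e295d03e.img volume-035f3636-ad68-4562-88f5-11d7e295d03e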

Repair PGs

  • Check the health of the Ceph cluster
$ sudo ceph status
  cluster:
    id:     0155072f-6a71-4f5c-8967-f86e5307033f
    health: HEALTH_ERR
            4 scrub errors
            Possible data damage: 1 pg inconsistent

$ sudo ceph health detail
HEALTH_ERR 4 scrub errors; Possible data damage: 1 pg inconsistent
OSD_SCRUB_ERRORS 4 scrub errors
PG_DAMAGED Possible data damage: 1 pg inconsistent
    pg 54.76 is active+clean+inconsistent, acting [39,6,15]
  • Repair the PG
$ sudo ceph pg repair 54.76
instructing pg 54.76 on osd.39 to repair
  • Give the Ceph cluster some time to repair the PG, then check the health again
$ sudo ceph health detail
HEALTH_OK

$ sudo ceph status
  cluster:
    id:     0155072f-6a71-4f5c-8967-f86e5307033f
    health: HEALTH_OK
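  • The inconsistent objects of a damaged PG can also be inspected in detail before repairing; a sketch using the PG from the health output above (list-inconsistent-obj is available in recent Ceph releases)
$ sudo rados list-inconsistent-obj 54.76 --format=json-pretty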