CephFS FUSE on CloudLab

  1. Spin up an instance on CloudLab. This example uses a 2-node client0-osd0 configuration.
  2. Become root on all nodes immediately: sudo su -
  3. Set the hostname on each node to avoid networking confusion.
root@client0:~# hostname client0
root@client0:~# HOSTNAME=client0
root@osd0:~# hostname osd0
root@osd0:~# HOSTNAME=osd0
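A quick check that the hostname change took effect can save confusion later. The following is a minimal sketch, not part of the original procedure; the function name check_hostname is hypothetical, and it assumes uname -n reports the same node name that the hostname command sets.

```shell
# Compare the running node name with the label this node should carry.
# check_hostname is a hypothetical helper; the expected label is passed in.
check_hostname() {
  expected=$1
  actual=$(uname -n)
  if [ "$actual" = "$expected" ]; then
    echo "ok: hostname is $actual"
  else
    echo "mismatch: hostname is $actual, expected $expected" >&2
    return 1
  fi
}
```

For example, run check_hostname client0 on client0 and check_hostname osd0 on osd0.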
  1. On client0, copy these files from the shared directory and create a nodes.txt file that lists the node labels, one per line.
cp /proj/skyhook-PG0/projscripts/format-sd* . ;
cp /proj/skyhook-PG0/projscripts/  . ;
cp /proj/skyhook-PG0/projscripts/ . ;
echo client0 >> nodes.txt ;
echo osd0 >> nodes.txt ;
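Because the echo lines above append with >>, re-running them adds duplicate entries to nodes.txt. A minimal sketch of an alternative that overwrites the file in one step is shown here; write_nodes_file is a hypothetical helper name, not one of the project scripts.

```shell
# Write the node labels with a single truncating redirect (>), so re-running
# the step does not append duplicates the way >> would.
# write_nodes_file is a hypothetical helper: first argument is the output
# file, the remaining arguments are the node labels.
write_nodes_file() {
  out=$1
  shift
  printf '%s\n' "$@" > "$out"
}
```

Usage for the 2-node layout in this example: write_nodes_file nodes.txt client0 osd0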
  1. On each osd node, copy only the format scripts.
cp /proj/skyhook-PG0/projscripts/format-sd* . ;
  1. On client0, configure ssh for the cluster.
assumes there is a local file 'nodes.txt' with the correct node names of all machines:clientX through osdX
assumes this node has its ~/.ssh/id_rsa key present and permissions are 0600
will copy ssh keys and known host signatures to the following nodes in 10 seconds:
# client0 SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.13
# client0 SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.13
no hostkey alg
# osd0 SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.13
# osd0 SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.13
no hostkey alg
Warning: Permanently added the RSA host key for IP address '' to the list of known hosts.
id_rsa                                                                            100% 3247     3.2KB/s   00:00    
known_hosts                                                                       100% 1326     1.3KB/s   00:00    
Warning: Permanently added the RSA host key for IP address '' to the list of known hosts.
id_rsa                                                                            100% 3247     3.2KB/s   00:00    
known_hosts                                                                       100% 1768     1.7KB/s   00:00    
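After the keys are copied, it may be worth confirming that every node accepts a passwordless login before ceph-deploy relies on it. The sketch below is an assumption-laden illustration: check_ssh_nodes is a hypothetical helper, and the SSH variable exists only so the client can be swapped out for a dry run.

```shell
# Run a no-op command on every node in the list without prompting.
# BatchMode=yes makes ssh fail instead of asking for a password;
# -n keeps ssh from swallowing the rest of the node list on stdin.
# SSH may be overridden for a dry run; it defaults to the real ssh client.
check_ssh_nodes() {
  nodes_file=$1
  failed=0
  while read -r node; do
    [ -n "$node" ] || continue
    if ${SSH:-ssh} -n -o BatchMode=yes -o ConnectTimeout=5 "$node" true; then
      echo "ok: $node"
    else
      echo "FAILED: $node" >&2
      failed=1
    fi
  done < "$nodes_file"
  return "$failed"
}
```

Usage: check_ssh_nodes nodes.txt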
  1. On all nodes, make sure sda4 is the 500GB SSD (check with lsblk), then run the format script. If sda4 is not the 500GB SSD, replace the string 'sda4' in the script with the appropriate identifier.
root@client0:~# sh 
mke2fs 1.42.9 (4-Feb-2014)
Discarding device blocks: done                            
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
27869184 inodes, 111444824 blocks
5572241 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
3402 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done     

Filesystem                              Size  Used Avail Use% Mounted on
/dev/sda1                                16G  1.8G   14G  13% /
none                                    4.0K     0  4.0K   0% /sys/fs/cgroup
udev                                     94G  4.0K   94G   1% /dev
tmpfs                                    19G  1.4M   19G   1% /run
none                                    5.0M     0  5.0M   0% /run/lock
none                                     94G     0   94G   0% /run/shm
none                                    100M     0  100M   0% /run/user
                                        100G   97G  3.9G  97% /proj/skyhook-PG0
                                         50G  2.0G   49G   4% /share
/dev/sda4                               419G   71M  397G   1% /mnt/sda4
root@client0:~# lsblk
sda      8:0    0 447.1G  0 disk 
├─sda1   8:1    0    16G  0 part /
├─sda2   8:2    0     3G  0 part 
├─sda3   8:3    0     3G  0 part [SWAP]
└─sda4   8:4    0 425.1G  0 part /mnt/sda4
sdb      8:16   0   1.1T  0 disk 
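The "check with lsblk" advice can also be scripted so that a wrong device is caught before it is formatted. This is a sketch under assumptions: both function names are hypothetical, and the 400 GiB threshold in the usage line is just an illustrative value below the 425.1G partition shown above.

```shell
# Return success when a size in bytes is at least min_gib GiB.
size_at_least() {
  bytes=$1
  min_gib=$2
  [ "$bytes" -ge $((min_gib * 1024 * 1024 * 1024)) ]
}

# Look up a block device's size with lsblk (-b bytes, -d no child devices,
# -n no header, -o SIZE only that column) and compare it to a minimum.
device_size_ok() {
  dev=$1
  min_gib=$2
  bytes=$(lsblk -b -d -n -o SIZE "$dev" | tr -d '[:space:]')
  [ -n "$bytes" ] || return 2
  size_at_least "$bytes" "$min_gib"
}
```

Usage: device_size_ok /dev/sda4 400 || echo "sda4 is not the large SSD partition; edit the script first"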
  1. On client0, also reformat the 1TB HDD, which is sdb in the case of c220g5 nodes. Use the script to drive the reformatting process, but replace the sdc string in the script with the appropriate identifier for the 1TB hard disk for your nodes.
root@client0:~# sh 
mke2fs 1.42.9 (4-Feb-2014)
/dev/sdb is entire device, not just one partition!
Proceed anyway? (y,n) y
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
73261056 inodes, 293028246 blocks
14651412 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
8943 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 
	4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 
	102400000, 214990848

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done     

Filesystem                              Size  Used Avail Use% Mounted on
/dev/sda1                                16G  1.8G   14G  13% /
none                                    4.0K     0  4.0K   0% /sys/fs/cgroup
udev                                     94G  4.0K   94G   1% /dev
tmpfs                                    19G  1.4M   19G   1% /run
none                                    5.0M     0  5.0M   0% /run/lock
none                                     94G     0   94G   0% /run/shm
none                                    100M     0  100M   0% /run/user
                                        100G   97G  3.9G  97% /proj/skyhook-PG0
                                         50G  2.0G   49G   4% /share
/dev/sda4                               419G   71M  397G   1% /mnt/sda4
/dev/sdb                                1.1T   71M  1.1T   1% /mnt/sdb
root@client0:~# lsblk
sda      8:0    0 447.1G  0 disk 
├─sda1   8:1    0    16G  0 part /
├─sda2   8:2    0     3G  0 part 
├─sda3   8:3    0     3G  0 part [SWAP]
└─sda4   8:4    0 425.1G  0 part /mnt/sda4
sdb      8:16   0   1.1T  0 disk /mnt/sdb
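Replacing the device string in a script, as this step asks, can be done with sed rather than by hand. The sketch below uses a hypothetical helper named swap_device; the script file to edit is passed as an argument, and a .bak copy of the original is kept.

```shell
# Replace every occurrence of one device name with another in a script file,
# keeping a .bak copy of the original next to it.
# swap_device is a hypothetical helper: file, old name, new name.
swap_device() {
  file=$1
  old=$2
  new=$3
  sed -i.bak "s/$old/$new/g" "$file"
}
```

For c220g5 nodes the call would look like swap_device <script> sdc sdb, where <script> is whichever format script is being adapted. Note that the substitution also rewrites mount paths that contain the old name (for example /mnt/sdc becomes /mnt/sdb), which matches the mount point shown in the output above.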
  1. Install some version of Ceph. The line below installs from deb files saved in the shared dir; change the path to something appropriate. This will take a while, so you may want to run it in tmux (which would need to be installed first).
apt-get update ; apt-get -f install ;  apt-get update; sudo dpkg -i /proj/skyhook-PG0/cephbits/kat_skyhook1227_ub14_g5/*.deb; sudo apt-get install -f -y; sudo dpkg -i /proj/skyhook-PG0/cephbits/kat_skyhook1227_ub14_g5/*.deb
  1. Install ceph-deploy (it must be version 1.5.37) on client0 only and run the initial setup commands.
sudo apt-get -f install ;
sudo apt-get install -y python-virtualenv ;
mkdir cluster ;
cd cluster ;
virtualenv env ;
env/bin/pip install ceph-deploy==1.5.37 ;
env/bin/ceph-deploy new client0 ;
env/bin/ceph-deploy mon create-initial ;
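Since the ceph-deploy version matters here, a small guard can confirm that the virtualenv really holds 1.5.37 before the cluster commands run. This is a sketch only: check_ceph_deploy_version is a hypothetical helper, and it assumes that ceph-deploy --version prints just the bare version string.

```shell
# Compare the installed ceph-deploy version with the one this guide pins.
# Assumption: "--version" prints only the version string (stderr is captured
# too, because older Python argparse writes the version there).
check_ceph_deploy_version() {
  want=$1
  bin=${2:-env/bin/ceph-deploy}
  got=$("$bin" --version 2>&1) || return 2
  if [ "$got" = "$want" ]; then
    echo "ok: ceph-deploy $got"
  else
    echo "unexpected ceph-deploy version: $got (wanted $want)" >&2
    return 1
  fi
}
```

Usage from ~/cluster: check_ceph_deploy_version 1.5.37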
  1. On client0, copy nodes.txt and the OSD disk setup script (its contents are shown below) into ~/cluster.
  2. On client0 in ~/cluster, remove 'client0' from nodes.txt.
  3. On client0 in ~/cluster, replace 'sdc' in the script with the appropriate label for the 1TB HDDs on the osd nodes (e.g. sdb on c220g5 hardware).
  4. On client0 in ~/cluster, run the script to reformat the 1TB HDDs on the osd nodes.
root@client0:~/cluster# cat 

set -e

# Zap and create an OSD on the 1TB HDD (sdb here) of each node in nodes.txt.
for n in `cat nodes.txt`; do
  echo $n
  env/bin/ceph-deploy disk zap $n:sdb
  env/bin/ceph-deploy osd create $n:sdb
done

root@client0:~/cluster# sh 
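Removing 'client0' from nodes.txt, as the steps above require, can be done with grep instead of an editor. A minimal sketch, with osd_nodes_only as a hypothetical helper name:

```shell
# Print every line of the node list except the exact line "client0".
# -x matches whole lines only, -v inverts the match.
osd_nodes_only() {
  grep -v -x 'client0' "$1"
}
```

Usage in ~/cluster: osd_nodes_only nodes.txt > nodes.tmp && mv nodes.tmp nodes.txt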
  1. Push the admin keyring to client0, make it readable, and disable scrubbing.
env/bin/ceph-deploy admin client0 ;
sudo chmod a+r /etc/ceph/ceph.client.admin.keyring ;
ceph osd set noscrub ;
ceph osd set nodeep-scrub ;
  1. Create MDS for filesystem.
env/bin/ceph-deploy mds create client0 ;
  1. Create metadata and data pools for the filesystem.
root@client0:~/cluster# ceph osd pool create katfs_meta 64 64 replicated
pool 'katfs_meta' created
root@client0:~/cluster# ceph osd pool create katfs_data 64 64 replicated
pool 'katfs_data' created
root@client0:~/cluster#  ceph osd pool set katfs_meta size 1
set pool 1 size to 1
root@client0:~/cluster# ceph osd pool set katfs_data size 1
set pool 2 size to 1
  1. Create the filesystem.
root@client0:~/cluster# ceph fs new katfs katfs_meta katfs_data
new fs with metadata pool 1 and data pool 2
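The pool and filesystem commands above follow one pattern per filesystem name, so they can be wrapped in a small function. The sketch below is illustrative only: run and create_fs are hypothetical helpers, and DRY_RUN=1 prints the ceph commands instead of executing them, which makes it easy to review them first.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create <fs>_meta and <fs>_data pools with the given PG count and replica
# size, then create the filesystem on top of them (same commands as above).
create_fs() {
  fs=$1
  pgs=$2
  size=$3
  for pool in "${fs}_meta" "${fs}_data"; do
    run ceph osd pool create "$pool" "$pgs" "$pgs" replicated || return 1
    run ceph osd pool set "$pool" size "$size" || return 1
  done
  run ceph fs new "$fs" "${fs}_meta" "${fs}_data"
}
```

Usage: DRY_RUN=1 create_fs katfs 64 1 prints the five commands; dropping DRY_RUN runs them.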
  1. Make a directory to use as the CephFS mount point: mkdir /mnt/katfs
  2. Check status to make sure MDS is up and active.
root@client0:~/cluster# ceph -s
    id:     e63a2e05-c752-48e2-8df3-5cdc31125072
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set
            no active mgr
    mon: 1 daemons, quorum client0
    mgr: no daemons active
    mds: katfs-1/1/1 up  {0=client0=up:active}
    osd: 1 osds: 1 up, 1 in
         flags noscrub,nodeep-scrub
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
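The status check can be reduced to a yes/no answer by looking for the up:active state in the mds line. A minimal sketch, with mds_is_active as a hypothetical helper that reads the status text on stdin:

```shell
# Read status text on stdin and succeed only if an MDS reports up:active.
mds_is_active() {
  grep -q 'up:active'
}
```

Usage: ceph -s | mds_is_active && echo "MDS is up and active"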
  1. Mount cephfs in the newly made directory.
root@client0:~/cluster# ceph-fuse -k /etc/ceph/ceph.client.admin.keyring -c /etc/ceph/ceph.conf /mnt/katfs
2019-05-19 22:44:23.454117 7efd36ecf000 -1 init, newargv = 0x7efd40413120 newargc=9
ceph-fuse[9823]: starting ceph client
ceph-fuse[9823]: starting fuse
  1. Check if it worked by consulting mount.
root@client0:~/cluster# mount | grep ceph
ceph-fuse on /mnt/katfs type fuse.ceph-fuse (rw,nosuid,nodev,allow_other)
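The same mount check can be made scriptable by matching the mount point and filesystem type fields instead of eyeballing the grep output. In this sketch is_ceph_fuse_mount is a hypothetical helper that reads mount output on stdin and assumes the "device on path type fstype (options)" layout shown above.

```shell
# Succeed if stdin (output of "mount") contains a fuse.ceph-fuse entry
# for the given mount point. Fields: $3 is the path, $5 is the type.
is_ceph_fuse_mount() {
  awk -v mp="$1" '$3 == mp && $5 == "fuse.ceph-fuse" { found = 1 } END { exit !found }'
}
```

Usage: mount | is_ceph_fuse_mount /mnt/katfs && echo "cephfs is mounted"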
  1. Make some files in the mounted directory and check if stuff's being written by listing the objects in the meta and data pools. e.g.:
rados -p katfs_data ls -
rados -p katfs_meta ls -
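To generate something to look for, a few test files can be written into the mounted directory first. The sketch below is one possible way to do that; write_test_files is a hypothetical helper, and the 1 MiB file size is an arbitrary choice.

```shell
# Write COUNT files of 1 MiB of zeros each into DIR.
# bs is given in bytes so the call behaves the same across dd variants.
write_test_files() {
  dir=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    dd if=/dev/zero of="$dir/testfile_$i" bs=1048576 count=1 2>/dev/null || return 1
    i=$((i + 1))
  done
}
```

Usage: write_test_files /mnt/katfs 4, then re-run the two rados listings above and compare them with the listings taken before the write.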
