Permission denied when creating a journal in a new OSD container #171

Closed · fungiboletus opened this issue Dec 3, 2015 · 85 comments

@fungiboletus (Author) commented Dec 3, 2015

Hi, thank you for your work on these containers. I am running into a small issue; this is the output:

DEBUG:ceph-disk:OSD id is 0
DEBUG:ceph-disk:Initializing OSD...
INFO:ceph-disk:Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.y_qXKn/activate.monmap
got monmap epoch 1
INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 0 --monmap /var/lib/ceph/tmp/mnt.y_qXKn/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.y_qXKn --osd-journal /var/lib/ceph/tmp/mnt.y_qXKn/journal --osd-uuid 29cd5619-cb46-41da-be4e-05f86180b67c --keyring /var/lib/ceph/tmp/mnt.y_qXKn/keyring --setuser ceph --setgroup ceph
2015-12-03 16:18:44.042200 7f8c7bbb1940 -1 filestore(/var/lib/ceph/tmp/mnt.y_qXKn) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.y_qXKn/journal: (13) Permission denied
2015-12-03 16:18:44.042222 7f8c7bbb1940 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2015-12-03 16:18:44.042257 7f8c7bbb1940 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.y_qXKn: (13) Permission denied
ERROR:ceph-disk:Failed to activate
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.y_qXKn
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.y_qXKn

I start the OSD container using this command:

sudo docker run -d --net=host \
  --privileged=true \
  -v /var/lib/ceph/:/var/lib/ceph/ \
  -v /dev/:/dev/ \
  -e OSD_DEVICE=/dev/sda \
  -e OSD_TYPE=disk \
  -e MON_IP_AUTO_DETECT=4 \
  -e KV_TYPE=consul \
  -e KV_IP=192.168.1.6 \
  -e KV_PORT=8500 \
  -e ESD_FORCE_ZAP=1 \
  ceph/daemon osd

The ceph monitor should be running, and it is started like this:

sudo docker run -d --net=host \
  -v /var/lib/ceph/:/var/lib/ceph/ \
  -e MON_NAME=ceph_node1 \
  -e MON_IP=192.168.1.41 \
  -e CEPH_PUBLIC_NETWORK=192.168.0.0/24 \
  -e CEPH_CLUSTER_NETWORK=192.168.0.0/24 \
  -e MON_IP_AUTO_DETECT=4 \
  -e KV_TYPE=consul \
  -e KV_IP=192.168.1.6 \
  -e KV_PORT=8500 \
  ceph/daemon mon

I have the default settings from populate.sh (on a side topic, kviator wasn't working in a golang container, so I used curl instead).

I also have 6 hard drives per node; how can I use multiple OSD_TYPEs? Should I start one OSD per hard drive?

Thank you

@leseb (Contributor) commented Dec 3, 2015

Is SELinux enabled?
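A quick way to check on the host, assuming the standard SELinux utilities are installed:

getenforce   # prints Enforcing, Permissive, or Disabled
sestatus     # more detailed status, if available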

@leseb (Contributor) commented Dec 3, 2015

If yes, run the following and then retry running your containers:

chcon -Rt svirt_sandbox_file_t /etc/ceph
chcon -Rt svirt_sandbox_file_t /var/lib/ceph

@fungiboletus (Author) commented Dec 3, 2015

The host is running Ubuntu 15.10, and it doesn't have selinux. The version of docker is 1.9.1 (a34a1d5).

@leseb (Contributor) commented Dec 3, 2015

Are you using the latest image? Is the Ceph version Infernalis?

@fungiboletus (Author) commented Dec 3, 2015

Yes, I just pulled the latest image from the docker registry. I am not sure how to check the Ceph version in the ceph/daemon image.

@Ulexus (Contributor) commented Dec 3, 2015

In your report, you mention you are using ESD_FORCE_ZAP=1, which should be OSD_FORCE_ZAP=1.

You're seeing a permission denied message in creating the journal. I'm going to guess there is already a filesystem on that referenced disk /dev/sda, and without the OSD_FORCE_ZAP=1, it is not replacing it.

However, are you certain you want to use /dev/sda as your OSD? What is your main host OS drive? That would normally be your first disk /dev/sda.
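For reference, your original command with the corrected variable name would read (everything else unchanged):

# Note: OSD_FORCE_ZAP, not ESD_FORCE_ZAP, is the variable the image actually reads
sudo docker run -d --net=host \
  --privileged=true \
  -v /var/lib/ceph/:/var/lib/ceph/ \
  -v /dev/:/dev/ \
  -e OSD_DEVICE=/dev/sda \
  -e OSD_TYPE=disk \
  -e MON_IP_AUTO_DETECT=4 \
  -e KV_TYPE=consul \
  -e KV_IP=192.168.1.6 \
  -e KV_PORT=8500 \
  -e OSD_FORCE_ZAP=1 \
  ceph/daemon osd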

@fungiboletus (Author) commented Dec 3, 2015

The permission denied happens when creating the journal. I added the FORCE_ZAP parameter to restart the process because /dev/sda did indeed have a filesystem after the first crash. I will check whether using OSD_FORCE_ZAP instead of ESD_FORCE_ZAP helps; it might be a typo.

Otherwise yes, the nodes have PCI Express SSDs, and /dev/sda to /dev/sdf are ready to be used :-)

@Ulexus (Contributor) commented Dec 3, 2015

If you wouldn't mind trying it for diagnostic purposes, could you manually partition the disk and create a filesystem on /dev/sda1? Since you're running in privileged mode, the drives are SSDs, and you aren't using a separate disk for journaling, you might as well choose btrfs.

Anyway, create the filesystem and mount it to /var/lib/ceph/osd/ceph-0 from the host, then run with OSD_TYPE=directory. I suspect that will work, but I am curious to have that verified.
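A rough sketch of those steps on the host (destructive to /dev/sda; the filesystem choice and the ceph-0 id are assumptions to adapt):

# WARNING: wipes /dev/sda
sudo parted -s /dev/sda mklabel gpt mkpart primary 0% 100%
sudo mkfs.btrfs -f /dev/sda1
sudo mkdir -p /var/lib/ceph/osd/ceph-0
sudo mount /dev/sda1 /var/lib/ceph/osd/ceph-0
# then start the ceph/daemon container with -e OSD_TYPE=directory instead of OSD_TYPE=disk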

@fungiboletus (Author) commented Dec 4, 2015

Thanks. It looks like it works in directory mode. I used ext4 (bad memories from my last attempt at using btrfs).

However, the command docker exec <mon-container-id> ceph osd create doesn't work (ERROR: missing keyring, cannot use cephx for authentication), but I am not sure whether I should worry or just ignore it.

@Ulexus (Contributor) commented Dec 4, 2015

In that case (using ext4), make sure you run your journal on a different drive. I think there are some other caveats to using ext4, too. If you're afraid of btrfs, I would recommend at least using xfs. I run a mixture of btrfs and xfs myself.

You should definitely be able to create an OSD from a mon container. Perhaps the environment isn't being set up in your docker exec command because you are not supplying the -it switch?
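For example (container id is a placeholder):

# run the exec with a TTY attached, as suggested above
docker exec -it <mon-container-id> ceph osd create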

@fungiboletus (Author) commented Dec 4, 2015

Ok, thanks for the help. I pulled the latest image and restarted using xfs, and I am now stuck on another error when starting the OSD container in directory mode:

Adding bootstrap keyrings
warning: unable to create /var/run/ceph: (13) Permission denied
2015-12-04 16:17:48.258293 7fdd56828940 -1 asok(0x564d01048000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.0.asok': (2) No such file or directory
2015-12-04 16:17:48.530555 7fdd56828940 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-12-04 16:17:48.760239 7fdd56828940 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2015-12-04 16:17:48.761033 7fdd56828940 -1 filestore(/var/lib/ceph/osd/ceph-0) could not find -1/23c2fcde/osd_superblock/0 in index: (2) No such file or directory
2015-12-04 16:17:49.293910 7fdd56828940 -1 created object store /var/lib/ceph/osd/ceph-0 journal /var/lib/ceph/osd/ceph-0/journal for osd.0 fsid e0dcc8b8-96c4-4032-bf43-b7aef1bed3a1
2015-12-04 16:17:49.293961 7fdd56828940 -1 auth: error reading file: /var/lib/ceph/osd/ceph-0/keyring: can't open /var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2015-12-04 16:17:49.294107 7fdd56828940 -1 created new key in keyring /var/lib/ceph/osd/ceph-0/keyring
HEALTH_WARN 192 pgs stuck inactive; 192 pgs stuck unclean
Error EINVAL: entity osd.0 exists but key does not match

@Ulexus (Contributor) commented Dec 4, 2015

Yeah, because you wiped the OSD store, you've lost your OSD's configuration. Just delete osd.0 from ceph (from the monitor, for instance), then recreate it and start afresh.

@fungiboletus (Author) commented Dec 4, 2015

I removed everything I could find (the consul configuration, all the containers including the monitor, the /var/lib/ceph directory, I formatted the hard drive), but I still had the same error. I tried a second time checking more carefully whether everything was removed, but without success.

@Ulexus (Contributor) commented Dec 4, 2015

So it doesn't sound like you deleted the OSD in the Ceph database. The opposite of ceph osd create... I'd reference the doc, but it appears that site is down right now.

@fungiboletus (Author) commented Dec 4, 2015

Where is the Ceph database stored? Is it an online service? Because I feel like I did remove everything and restarted from scratch.

@Ulexus (Contributor) commented Dec 4, 2015

No, it's an internal KV store to Ceph. The monitors maintain it; you have to manipulate it through the ceph CLI tool.

Removing OSDs

The command I'm talking about is ceph osd rm 0... however, make sure you follow the whole procedure, in case it has been added to CRUSH, etc.
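For completeness, the full removal sequence with the stock ceph CLI usually looks roughly like this (osd id 0 assumed; run wherever an admin keyring is available):

ceph osd out 0                  # mark it out so data can migrate (no-op on an empty cluster)
ceph osd crush remove osd.0     # remove it from the CRUSH map, if it was added
ceph auth del osd.0             # delete its cephx key
ceph osd rm 0                   # finally remove the OSD id itself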

@streamnsight (Contributor) commented Dec 7, 2015

I had the same issue with permission denied on CoreOS:

2015-12-07 03:12:01.063550 7fadd14e2940  0 set uid:gid to 64045:64045
2015-12-07 03:12:01.063632 7fadd14e2940  0 ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299), process ceph-osd, pid 30438
2015-12-07 03:12:01.064885 7fadd14e2940 -1 asok(0x564408152000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.1.asok': (13) Permission denied
2015-12-07 03:12:01.065055 7fadd14e2940  1 filestore(/var/lib/ceph/osd/ceph-1) mkfs in /var/lib/ceph/osd/ceph-1
2015-12-07 03:12:01.065082 7fadd14e2940 -1 filestore(/var/lib/ceph/osd/ceph-1) mkfs: failed to open /var/lib/ceph/osd/ceph-1/fsid: (13) Permission denied
2015-12-07 03:12:01.065091 7fadd14e2940 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2015-12-07 03:12:01.065124 7fadd14e2940 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-1: (13) Permission denied

When I change the script so that ceph-osd runs with --setuser root --setgroup root, it works, but that doesn't seem like the right thing to do.

The SELinux commands did not work on my CoreOS distro; I'm not sure there is an equivalent. I could write to /etc/ceph and /var/lib/ceph to write the config and run the monitor, so I'm not sure this is even an issue.
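For context, the workaround amounts to forcing root for the object-store creation, something like this (not a recommendation; the flags and paths mirror the log above, and the OSD id is an assumption):

# Workaround only: run the object store creation as root instead of ceph:ceph
ceph-osd --cluster ceph --mkfs --mkkey -i 1 \
  --osd-data /var/lib/ceph/osd/ceph-1 \
  --osd-journal /var/lib/ceph/osd/ceph-1/journal \
  --setuser root --setgroup root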

The process is pretty confusing:
I thought running ceph/daemon osd was creating an OSD, but from this thread I gather that I need to run ceph osd create from the monitor prior to that. Is that right? How does the number reported by ceph osd create relate to the OSD_ID? I don't quite follow here: what should the number be?
Can it be the same on all nodes, or should it be unique? I am used to taking the IP of my node, removing the dots, and using it as an ID, so that it is unique in the cluster and can be determined on the fly, but that didn't work with Ceph; probably the number is too big. When I used 0 or 1 it worked, but I eventually want to provision nodes without having to fix IDs ahead of time, so if the ID needs to be unique I need another strategy.

Another issue I see is that the OSD daemon scans a directory for IDs, which means one Docker container can run multiple OSDs, as opposed to running one OSD per container. Short of modifying the script, there is no easy way to define the OSD_ID to use in a container.

The doc is great when you know what you're looking for, but as a beginner I find all this very confusing.
Thanks for enlightening me...

@leseb (Contributor) commented Dec 7, 2015

I haven't read the entire thread, but I assume we hit an issue because of the new 'ceph' user that runs the ceph process and owns the directories. It's weird, because I tested all the scenarios prior to merging the Infernalis support :/ I don't have much time at the moment, but I'll see what I can do. Sorry about that.

@fungiboletus (Author) commented Dec 7, 2015

@Ulexus I removed the monitors, which I guess is a more brutal way to remove the OSDs.

@leseb Ok, no problem :-) I will probably wait until next year to try again.

@streamnsight (Contributor) commented Dec 7, 2015

I need to do more testing, but thinking about it, it may very well be because the ceph:ceph user:group does not exist on the host and Docker runs as root.

Can someone comment on the OSD_ID? After much fiddling around today, it seems like the process goes as follows:

  • start a monitor
  • create an OSD
  • create the folder on the host at /var/lib/ceph/${CLUSTER}-${OSD_ID}, with the OSD_ID returned by the ceph osd create command.
  • start the osd daemon

I noticed a few things:

  1. If I create a folder with some random number and create the OSD afterwards, it does not show in the tree under my host but seems detached (?!?), and the cluster remains in a degraded state. The OSD may be up but not 'in', or even 'in' but not clean. Why? Is the process above correct and mandatory? If so, the script should create the OSD if it doesn't exist and use the returned number to create the directory.

  2. ceph/config is not up to date and conflicts with ceph/daemon: the path of the keys is different. Is it deprecated?

  3. One really needs to run the example/populate.sh to get reasonable defaults. Without it, the OSD daemon won't work, as the keys are not populated. On CoreOS, one needs a kviator image/binary on the host, that is, the script has to go into a Docker image with kviator; I ended up having to put it in the daemon image. It would be good to integrate this into the startup script.

  4. If I start more than one OSD with an osd_directory on the same host/physical disk, the available disk space is multiplied. (This makes sense, but it should be clear that one should really not have more than one OSD dir per host/physical drive, and therefore only one OSD daemon container.)

I'm all up for providing a pull request, but I'd like to make sure I understand this correctly. I feel like I'm missing a bunch of details.

@Ulexus (Contributor) commented Dec 7, 2015

@streamnsight @leseb Yes, I remember the ceph:ceph execution addition. The account does not need to exist on the host. It is merely a mapping to the underlying UID using the name map inside the container. That should be fine. Even though your docker binary runs as root, the execution in the container can (and does) drop privileges when told to do so (as it now is). This is surely the problem that several people are noticing here.

@streamnsight Your summary describes the original implementation that I wrote however long ago that was. I had to externally add the OSD (usually from a monitor) because I had no facility to do that from within the OSD container. @leseb has a much better system, now, which uses bootstrap keys. These keys allow the container running the OSD to add the OSD to Ceph, allowing for a much simpler and more automatable process. The osd_directory method, however, still mostly uses my old code, which does not leverage this feature.

To your points:

  1. Yes, the OSD ids must be unique across the cluster, and the IDs of the directories must match a valid OSD in the cluster (they cannot be randomly assigned; ceph osd create must have returned the id). This presents a chicken-and-egg problem, of course, which is the reason for the more automated OSD types (like osd_disk). The tradeoff with those is that the container is responsible for creating the filesystem, which requires greater privilege.

  2. I have mostly looked at ceph/config as being deprecated, or at least not a part of the standard flow. I made it originally with this grand idea of having a floating data store which contained all the static configuration (basically, /etc/ceph/*). It never really seemed to be a great idea, however. As far as I am aware, it is not useful.

  3. The KV-store population is the biggest sticking point, to me. We have all agreed that that is the best way forward, but none of us have had the time to actually complete its integration. It is provided as an externally-executable tool, but it doesn't work automatically with any of the systems.

In this way, we're stuck with two different systems which each automate a different thing and which each have some different manual component.

  4. I had never even considered that someone may do that (though it makes perfect sense, now that you have done so). You are quite correct. There are a couple reasons why we do this. The first is that, without the --pid=host (or another way to share PID space between all of the OSDs running on a given host), you have to run all OSDs on a host within the same container, or you will have your ports exhausted very, very rapidly. The second is that the script now auto-detects the OSD id(s) based on the directory names.

The osd_directory theory of operation says that the host machine manager should handle all the mounts of the physical drives connected to the system. In that way, the OSD daemon configuration can be automated based on the directory layout it is given.

The osd_disk theory of operation says that an OSD is a self-contained entity whose lifecycle should be managed by itself. It expects to be given some "disk" (or other arbitrary block device), and it will partition it, format it, mount it, create its OSD, and bootstrap everything. Less well-defined here is the removal of the OSD from the system... but one step at a time, I guess.

We definitely appreciate any help, and thanks for your investigations thus far.

@streamnsight (Contributor) commented Dec 7, 2015

Hi @Ulexus,

Thanks for the quick feedback.
The reason I tried ceph/config is because ceph_mon didn't work OOTB (values not populated), so it seemed like the obvious 'missing' step to start with. If I tried running multiple OSD daemons on the same machine, it's because I watched the video that starts two disk OSDs on each host...

As I said, I'm new to this and it's a bit confusing.

I use osd_directory because I don't have much choice of hosts, and I only have one drive, which must also host the OS and other stuff. Not ideal, I know, but that's what I have to work with.

So coming back to the issues here:
1) ceph:ceph user:
I'm not sure I follow your point. You say it should work, but it doesn't, so what would be the way to fix this? Create the user inside the container? Make sure it exists on the host? (You seem to say it is not an issue, but if the dir is owned by root, how can Docker drop privileges and still have access?)

2) OSD_IDs and OSD creation/init:
Why is it a chicken-and-egg situation? Maybe with multiple osd_directory instances on the same drive, but that doesn't work well, so I think it should be prevented.
To create/init an OSD, the steps I followed can be automated easily:

  • Is there an OSD dir?
  • Yes? Check if it's initialized; init if needed.
  • No? Run ceph osd create, get the returned ID, create the dir with this ID, and init.

No need to run osd create or create the directory externally; fewer manual steps, more foolproof (see the sketch below).
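Roughly, the check could look like this inside the entrypoint (a sketch only; it assumes the bootstrap-osd keyring is mounted and ${CLUSTER} is set, as in the existing script):

# Sketch: create or reuse an OSD directory automatically
if ls -d /var/lib/ceph/osd/${CLUSTER}-* >/dev/null 2>&1; then
  echo "Existing OSD directory found; initialize it if needed"
else
  OSD_ID=$(ceph --name client.bootstrap-osd \
    --keyring /var/lib/ceph/bootstrap-osd/${CLUSTER}.keyring osd create)
  mkdir -p /var/lib/ceph/osd/${CLUSTER}-${OSD_ID}
  chown ceph. /var/lib/ceph/osd/${CLUSTER}-${OSD_ID}
fi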

To deal with privilege issues, it may be good to split 'init an OSD' from 'run an OSD'. The script does both at once for disk and osd_directory, but the init step should really be an ephemeral container, and run should be a long-running container that can also 'restart' an OSD if the container stopped/failed for some reason.

Finally:
3) KV store defaults:
Here again, a 'config' step sounds better to me: if defaults are loaded within the script before running the daemons, there is no easy way to intervene and override them. After the fact, overriding a setting requires running manual ceph commands. I'm not sure why you say ceph/config was not a good idea.

Am I missing something important here? I'm not sure if the key management could be a source of issues in these scenarios.

Thanks for your comments
Cheers.

@Ulexus (Contributor) commented Dec 7, 2015

  1. ceph:ceph user

The container already has the ceph user and group defined and mapped to a UID. At the host level, all it sees are the UID:GID, which don't need to be "created"; they're just numbers. You are definitely correct, though: the /var/lib/ceph/osd/* directories must allow read/write/execute permissions to the ceph:ceph user. On your host, that means you must chown 64045:64045 (which is what the names in the container are mapped to). I was fairly certain I had documented this somewhere... but I cannot find it, now.

  2. OSD_IDs and OSD creation/init
    OSD IDs are unique across the ceph cluster. So regardless of whether you run only a single OSD per machine or multiples, each OSD must be registered and unique.

The trick is with your third step. The cephx key for an OSD typically doesn't have access to create an OSD. You need to have access to either an osd bootstrap cephx key or an administrator cephx key. We can't presume (inside an OSD container) that these keys exist (of course, we could always just try them and see if we can).

Even then, though, we would have the problem of mapping the directory names to the OSD IDs. In your case, since you are trying to run your OSDs on the same filesystem as the rest of your container (definitely not recommended or even tested), that's not a big deal. It's just a directory name. In a standard installation, however, that is a major problem: the directory is mounted outside the scope (and hence, outside the privilege zone) of the container.

The concept of having separate executions for "init" and "run" is a good idea, and we have tossed it about before, with no real decisions being made.

  3. KV store defaults

Yes, you are probably correct. We could probably have an init target of the ceph/daemon container which formally initializes the cluster, generating the necessary keys and storing them in the KV store, as configured. It would need to tolerate being executed multiple times without blowing away any existing configuration, but that sounds like a good way of doing things.

I'm not sure I see how that could be generalized into, say, an osd_init target, since we have three different, probably conflicting, methods of handling OSDs.... but ideas are definitely welcome.

@streamnsight (Contributor) commented Dec 7, 2015

@Ulexus

Thanks for the explanations.

Here is what I have come up with so far, on my fork:

  • I use ceph osd create to create the OSD and create the /var/lib/ceph/osd/${CLUSTER}-${OSD_ID} directory.
    I added checks to make sure the /var/lib/ceph/osd dir is owned by ceph:ceph, as well as the ${CLUSTER}-${OSD_ID} subfolders. That gets rid of the permission denied issues.

The OSD is created, the folder is created, and everything seems to work to 'initialize' the OSD, but then I get the error that was also reported here:

2015-12-04 16:17:49.293961 7fdd56828940 -1 auth: error reading file: /var/lib/ceph/osd/ceph-0/keyring: can't open /var/lib/ceph/osd/ceph-0/keyring: (2) No such file or directory
2015-12-04 16:17:49.294107 7fdd56828940 -1 created new key in keyring /var/lib/ceph/osd/ceph-0/keyring
HEALTH_WARN 192 pgs stuck inactive; 192 pgs stuck unclean
Error EINVAL: entity osd.0 exists but key does not match

If I just 'restart' the OSD with the same docker command, the OSD starts...
So the key seems to be the issue here, but again I don't quite understand the problem: the ceph-osd --mkfs --mkkey ... command creates a new key, as I understand it,
and then further down in the script that key is added (from the folder):

ceph ${CEPH_OPTS} --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/${CLUSTER}.keyring auth add osd.${OSD_ID} -i /var/lib/ceph/osd/${CLUSTER}-${OSD_ID}/keyring osd 'allow *' mon 'allow profile osd'
chown ceph. /var/lib/ceph/osd/${CLUSTER}-${OSD_ID}/keyring
chmod 0600 /var/lib/ceph/osd/${CLUSTER}-${OSD_ID}/keyring      

So it just seems like the OSD is somehow started before this happens?!?
The second time around, the key is in the folder as expected and it starts properly.

I just don't get how that happens...

@streamnsight (Contributor) commented Dec 7, 2015

Further down in my investigation: the line

# Add the OSD key
ceph ${CEPH_OPTS} --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/${CLUSTER}.keyring auth add osd.${OSD_ID} -i /var/lib/ceph/osd/${CLUSTER}-${OSD_ID}/keyring osd 'allow *' mon 'allow profile osd'

in the script exits with the error Error EINVAL: entity osd.0 exists but key does not match.
If I catch that error, things seem to work as expected.
This seems to be due to the fact that the key was registered in a previous attempt and is not removed by ceph osd rm.
After I use ceph auth del osd.<id> to remove the key of the OSD that I removed, the script runs without errors, and it seems like I can create the OSD, the dir, the key, and start the OSD all in one call to ceph/daemon osd_directory.

@Ulexus (Contributor) commented Dec 8, 2015

Yes, the ceph-osd ... --mkkey ... generates the key (locally). The ceph ... auth ... registers the key with the Ceph cluster with the given permissions. OSDs are intended to be durable entities; you can replace them, but they would generally scale beyond the scope of a container.

In this case, it looks like you are not retaining your OSDs across container instances. You'll need to follow all the instructions for removing an OSD, if you are starting from scratch each time.

@streamnsight (Contributor) commented Dec 8, 2015

Yes, I was testing and didn't realize I had to remove the key too.
I added error catching as a fix in my bug-fixes PR.

@tobilg (Contributor) commented Dec 23, 2015

I'm running into the same permission problems on CoreOS now, after being able to successfully start my MONs:

Docker command:

docker run -d --net=host \
--privileged=true \
--pid=host \
-v /var/lib/ceph/:/var/lib/ceph/ \
-v /dev/:/dev/ \
-e OSD_DEVICE=/dev/sdb \
-e OSD_TYPE=disk \
-e KV_TYPE=etcd \
-e KV_IP=192.168.200.163 \
-e KV_PORT=4001 \
ceph/daemon osd

Log output:

INFO:ceph-disk:Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 0 --monmap /var/lib/ceph/tmp/mnt.N1pjbF/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.N1pjbF --osd-journal /var/lib/ceph/tmp/mnt.N1pjbF/journal --osd-uuid e8a13004-e8e2-44c0-9dc4-5fbb1631b9ba --keyring /var/lib/ceph/tmp/mnt.N1pjbF/keyring --setuser ceph --setgroup ceph
2015-12-23 10:15:32.503390 7f49da72e940 -1 filestore(/var/lib/ceph/tmp/mnt.N1pjbF) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.N1pjbF/journal: (13) Permission denied
2015-12-23 10:15:32.503399 7f49da72e940 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2015-12-23 10:15:32.503414 7f49da72e940 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.N1pjbF: (13) Permission denied
ERROR:ceph-disk:Failed to activate
DEBUG:ceph-disk:Unmounting /var/lib/ceph/tmp/mnt.N1pjbF
INFO:ceph-disk:Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.N1pjbF
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 3576, in <module>
    main(sys.argv[1:])
  File "/usr/sbin/ceph-disk", line 3530, in main
    args.func(args)
  File "/usr/sbin/ceph-disk", line 2424, in main_activate
    dmcrypt_key_dir=args.dmcrypt_key_dir,
  File "/usr/sbin/ceph-disk", line 2197, in mount_activate
    (osd_id, cluster) = activate(path, activate_key_template, init)
  File "/usr/sbin/ceph-disk", line 2360, in activate
    keyring=keyring,
  File "/usr/sbin/ceph-disk", line 1950, in mkfs
    '--setgroup', get_ceph_user(),
  File "/usr/sbin/ceph-disk", line 349, in command_check_call
    return subprocess.check_call(arguments)
  File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '0', '--monmap', '/var/lib/ceph/tmp/mnt.N1pjbF/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.N1pjbF', '--osd-journal', '/var/lib/ceph/tmp/mnt.N1pjbF/journal', '--osd-uuid', 'e8a13004-e8e2-44c0-9dc4-5fbb1631b9ba', '--keyring', '/var/lib/ceph/tmp/mnt.N1pjbF/keyring', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1

Is the PR @streamnsight is talking about already merged, or what's the current status? Thanks in advance!

@h0tbird (Contributor) commented Dec 23, 2015

+1 (same problems)

@streamnsight (Contributor) commented Dec 23, 2015

Yes, it was merged. It seems like there are still some folder/file permission issues.
I will check on this today.

@tobilg (Contributor) commented Dec 26, 2015

@streamnsight @Ulexus That's exactly my use case as well.

@tobilg (Contributor) commented Jan 4, 2016

I was just trying the manual process of adding an OSD as @Ulexus outlined above, namely

$ sudo /usr/sbin/mkfs.btrfs -f /dev/sdb
$ docker exec <imageId> ceph osd create
$ sudo mkdir -p /var/lib/ceph/osd/ceph-<osdId>
$ sudo mount /dev/sdb /var/lib/ceph/osd/ceph-<osdId>

This results in the following error message:

$ docker run -it --net=host --name ceph-osd-1 --privileged=true --pid=host -v /var/lib/ceph/:/var/lib/ceph/ -e KV_TYPE=etcd -e KV_IP=192.168.200.163 -e KV_PORT=4001 ceph/daemon osd
2016-01-04T09:37:53Z coreos-hw-node-1 confd[14840]: INFO Backend set to etcd
2016-01-04T09:37:53Z coreos-hw-node-1 confd[14840]: INFO Starting confd
2016-01-04T09:37:53Z coreos-hw-node-1 confd[14840]: INFO Backend nodes set to 192.168.200.163:4001
2016-01-04T09:37:53Z coreos-hw-node-1 confd[14840]: INFO Target config /etc/ceph/ceph.conf out of sync
2016-01-04T09:37:53Z coreos-hw-node-1 confd[14840]: INFO Target config /etc/ceph/ceph.conf has been updated
Adding bootstrap keyrings
Bootstrapped OSD(s) found; using OSD directory
warning: unable to create /var/run/ceph: (13) Permission denied
2016-01-04 09:37:53.858941 7fc800c24940 -1 asok(0x5610ae636000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.1.asok': (2) No such file or directory
2016-01-04 09:37:53.859126 7fc800c24940 -1 filestore(/var/lib/ceph/osd/ceph-1) mkfs: failed to open /var/lib/ceph/osd/ceph-1/fsid: (13) Permission denied
2016-01-04 09:37:53.859134 7fc800c24940 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2016-01-04 09:37:53.859155 7fc800c24940 -1  ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-1: (13) Permission denied

If I try to chown the relevant folders, I receive a

$ sudo chown -R ceph. /var/lib/ceph/
chown: invalid user: 'ceph.'

error. I have neither a user ceph on my CoreOS system nor a group named ceph... Where would these have been established?

The /var/lib/ceph directory has the following permissions:

$ ls -la /var/lib/ceph    
total 56
drwxr-xr-x  7 root  root  4096 Jan  4 09:30 .
drwxr-xr-x 24 root  root  4096 Jan  4 00:00 ..
drwxr-xr-x  2 root  root  4096 Dec 23 09:51 bootstrap-mds
drwxr-xr-x  2 root  root  4096 Dec 23 09:51 bootstrap-osd
drwxr-xr-x  2 root  root  4096 Dec 23 09:51 bootstrap-rgw
drwxr-xr-x  3 root  root  4096 Dec 23 09:51 mon
drwxr-xr-x  3 64045 64045 4096 Jan  4 09:30 osd

I have three MONs running on all three hosts:

$ docker exec ef77efd7a909 ceph -s
    cluster 3840a923-6e11-4784-a0b1-f9c5132da243
     health HEALTH_WARN
            64 pgs stuck inactive
            64 pgs stuck unclean
     monmap e3: 3 mons at {coreos-hw-node-1=192.168.200.161:6789/0,coreos-hw-node-2=192.168.200.162:6789/0,coreos-hw-node-3=192.168.200.163:6789/0}
            election epoch 8, quorum 0,1,2 coreos-hw-node-1,coreos-hw-node-2,coreos-hw-node-3
     osdmap e3: 2 osds: 0 up, 0 in
            flags sortbitwise
      pgmap v4: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

Any ideas?

@streamnsight (Contributor) commented Jan 4, 2016

That's another one of those tricks: on the host you need to use chown 64045:64045.

@Ulexus (Contributor) commented Jan 4, 2016

The user 'ceph' is not defined in your host's /etc/passwd file. That's fine, but it means you will have to specify the ownership by UID. The /var/lib/ceph/osd directory itself is already properly owned, but I would guess your OSD's directory is not:

chown -R 64045:64045 /var/lib/ceph/osd/ceph-*

I am certain I have documented this before, but again, I cannot find it, so it is obviously not adequately documented.

@Ulexus (Contributor) commented Jan 4, 2016

I have updated the ceph_directory documentation to note this.

@tobilg (Contributor) commented Jan 6, 2016

I applied the chown command and tried to run the Docker OSD container again (same as above). The output is:

$ docker logs ceph-osd-1
2016-01-06T09:43:15Z coreos-hw-node-1 confd[11933]: INFO Backend set to etcd
2016-01-06T09:43:15Z coreos-hw-node-1 confd[11933]: INFO Starting confd
2016-01-06T09:43:15Z coreos-hw-node-1 confd[11933]: INFO Backend nodes set to 192.168.200.163:4001
2016-01-06T09:43:15Z coreos-hw-node-1 confd[11933]: INFO Target config /etc/ceph/ceph.conf out of sync
2016-01-06T09:43:15Z coreos-hw-node-1 confd[11933]: INFO Target config /etc/ceph/ceph.conf has been updated
Adding bootstrap keyrings
Bootstrapped OSD(s) found; using OSD directory
warning: unable to create /var/run/ceph: (13) Permission denied
2016-01-06 09:43:15.476136 7fab2f8a4940 -1 asok(0x562cb51f4000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/ceph-osd.1.asok': (2) No such file or directory
2016-01-06 09:43:15.792641 7fab2f8a4940 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
btrfs SNAP_DESTROY failed as non-root; remount with -o user_subvol_rm_allowed
2016-01-06 09:43:16.032005 7fab2f8a4940 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2016-01-06 09:43:16.032454 7fab2f8a4940 -1 filestore(/var/lib/ceph/osd/ceph-1) could not find -1/23c2fcde/osd_superblock/0 in index: (2) No such file or directory
2016-01-06 09:43:16.162065 7fab2f8a4940 -1 created object store /var/lib/ceph/osd/ceph-1 journal /var/lib/ceph/osd/ceph-1/journal for osd.1 fsid 3840a923-6e11-4784-a0b1-f9c5132da243
2016-01-06 09:43:16.162083 7fab2f8a4940 -1 auth: error reading file: /var/lib/ceph/osd/ceph-1/keyring: can't open /var/lib/ceph/osd/ceph-1/keyring: (2) No such file or directory
2016-01-06 09:43:16.162154 7fab2f8a4940 -1 created new key in keyring /var/lib/ceph/osd/ceph-1/keyring
HEALTH_WARN 64 pgs stuck inactive; 64 pgs stuck unclean
added key for osd.1
done adding key
create-or-move updating item name 'osd.1' weight 1.82 at location {host=coreos-hw-node-1,root=default} to crush map
*** Running /etc/rc.local...
*** Booting runit daemon...
*** Runit started as PID 12080
store-daemon: starting daemon on coreos-hw-node-1...
2016-01-06 09:43:18.639157 7fdc82604940  0 ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299), process ceph-osd, pid 12089
starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
2016-01-06 09:43:18.646196 7fdc82604940  0 filestore(/var/lib/ceph/osd/ceph-1) backend btrfs (magic 0x9123683e)
2016-01-06 09:43:18.646354 7fdc82604940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2016-01-06 09:43:18.646360 7fdc82604940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2016-01-06 09:43:18.646371 7fdc82604940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: splice is supported
2016-01-06 09:43:18.689503 7fdc82604940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-01-06 09:43:18.689551 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: CLONE_RANGE ioctl is supported
2016-01-06 09:43:18.689567 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: failed to create simple subvolume test_subvol: (17) File exists
2016-01-06 09:43:18.689580 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: SNAP_CREATE is supported
2016-01-06 09:43:18.689612 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: SNAP_DESTROY is supported
2016-01-06 09:43:18.695735 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: START_SYNC is supported (transid 16)
2016-01-06 09:43:18.721764 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: WAIT_SYNC is supported
2016-01-06 09:43:18.721787 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: removing old async_snap_test
2016-01-06 09:43:18.724738 7fdc82604940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: SNAP_CREATE_V2 is supported
2016-01-06 09:43:18.725000 7fdc82604940 -1 filestore(/var/lib/ceph/osd/ceph-1) ERROR: /var/lib/ceph/osd/ceph-1/current/nosnap exists, not rolling back to avoid losing new data
2016-01-06 09:43:18.725006 7fdc82604940 -1 filestore(/var/lib/ceph/osd/ceph-1) Force rollback to old snapshotted version with 'osd use stale snap = true'
2016-01-06 09:43:18.725008 7fdc82604940 -1 filestore(/var/lib/ceph/osd/ceph-1) config option for --osd-use-stale-snap startup argument.
2016-01-06 09:43:18.725012 7fdc82604940 -1 osd.1 0 OSD:init: unable to mount object store
2016-01-06 09:43:18.725018 7fdc82604940 -1  ** ERROR: osd init failed: (95) Operation not supported

This error message repeats itself every second. When I tried to stop the container, Docker crashed (never experienced this before)...

So, I assume that I have to chown /var/run/ceph as well?
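Presumably something along these lines, using the 64045:64045 mapping mentioned above:

# Hypothetical follow-up: pre-create the socket directory with the ceph UID/GID
sudo mkdir -p /var/run/ceph
sudo chown 64045:64045 /var/run/ceph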

@tobilg (Contributor) commented Jan 6, 2016

After manually creating and chowning /var/run/ceph I see this:

$ docker run -it --net=host --name ceph-osd-1 --privileged=true -v /var/lib/ceph/:/var/lib/ceph/ -e KV_TYPE=etcd -e KV_IP=192.168.200.163 -e KV_PORT=4001 ceph/daemon osd
2016-01-06T10:42:11Z coreos-hw-node-1 confd[12]: INFO Backend set to etcd
2016-01-06T10:42:11Z coreos-hw-node-1 confd[12]: INFO Starting confd
2016-01-06T10:42:11Z coreos-hw-node-1 confd[12]: INFO Backend nodes set to 192.168.200.163:4001
2016-01-06T10:42:11Z coreos-hw-node-1 confd[12]: INFO Target config /etc/ceph/ceph.conf out of sync
2016-01-06T10:42:11Z coreos-hw-node-1 confd[12]: INFO Target config /etc/ceph/ceph.conf has been updated
Adding bootstrap keyrings
Bootstrapped OSD(s) found; using OSD directory
*** Running /etc/rc.local...
*** Booting runit daemon...
*** Runit started as PID 36
store-daemon: starting daemon on coreos-hw-node-1...
2016-01-06 10:42:12.702337 7fcc0f3c7940  0 ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299), process ceph-osd, pid 38
starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
2016-01-06 10:42:12.711172 7fcc0f3c7940  0 filestore(/var/lib/ceph/osd/ceph-1) backend btrfs (magic 0x9123683e)
2016-01-06 10:42:12.930565 7fcc0f3c7940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2016-01-06 10:42:12.930573 7fcc0f3c7940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2016-01-06 10:42:12.930586 7fcc0f3c7940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: splice is supported
2016-01-06 10:42:13.007575 7fcc0f3c7940  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-01-06 10:42:13.007625 7fcc0f3c7940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: CLONE_RANGE ioctl is supported
2016-01-06 10:42:13.065872 7fcc0f3c7940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: SNAP_CREATE is supported
2016-01-06 10:42:13.065928 7fcc0f3c7940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: SNAP_DESTROY is supported
2016-01-06 10:42:13.072511 7fcc0f3c7940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: START_SYNC is supported (transid 1513)
2016-01-06 10:42:13.100337 7fcc0f3c7940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: WAIT_SYNC is supported
2016-01-06 10:42:13.106459 7fcc0f3c7940  0 btrfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_feature: SNAP_CREATE_V2 is supported
2016-01-06 10:42:13.106821 7fcc0f3c7940 -1 filestore(/var/lib/ceph/osd/ceph-1) ERROR: /var/lib/ceph/osd/ceph-1/current/nosnap exists, not rolling back to avoid losing new data
2016-01-06 10:42:13.106828 7fcc0f3c7940 -1 filestore(/var/lib/ceph/osd/ceph-1) Force rollback to old snapshotted version with 'osd use stale snap = true'
2016-01-06 10:42:13.106829 7fcc0f3c7940 -1 filestore(/var/lib/ceph/osd/ceph-1) config option for --osd-use-stale-snap startup argument.
2016-01-06 10:42:13.106833 7fcc0f3c7940 -1 osd.1 0 OSD:init: unable to mount object store
2016-01-06 10:42:13.106838 7fcc0f3c7940 -1  ** ERROR: osd init failed: (95) Operation not supported

EDIT: I just saw #176 and I can verify that the OSD starts when I drop the --privileged=true flag...

@Ulexus (Contributor) commented Jan 6, 2016

It baffles me why dropping the --privileged flag causes it to work. This is on a BTRFS-formatted disk mounted to /var/lib/ceph/osd/ceph-1?

@tobilg (Contributor) commented Jan 6, 2016

Yes, this is correct. I followed the steps outlined above to create and mount the /dev/sdb (empty) disk.

@fmeppo (Contributor) commented Jan 19, 2016

If you go with the default behavior, docker has just partitioned the raw disk into /dev/sdd1 (XFS for the OSD) and /dev/sdd2 (for journal) - and /dev/sdd2 is owned by root:disk, so ceph:ceph can't write to it. ceph-disk activate runs as ceph:ceph, and follows the journal symlink in the OSD FS (to /dev/sdd2, in my case), and bombs out.

I'm trying to work around this, but it's not immediately clear to me how to either change the device's permissions (since chown didn't work) or get ceph-disk to not drop perms. I've added "disk" as a supplementary group to the "ceph" user, but to no avail. How are permissions dropped to ceph:ceph, anyway? Does the whole entrypoint.sh run with these perms?

@Ulexus (Contributor) commented Jan 19, 2016

No, the entrypoint script runs with whatever privileges the container runs with. The ceph daemon itself drops privileges.

@Ulexus (Contributor) commented Jan 19, 2016

Incidentally, I'm reasonably certain the --privileged flag oddity listed above is something of a red herring. I ran into just such a permissions-type error. It turns out that ceph does operate differently based on that flag, but either way will work. The issue is that its use needs to be consistent.

@fmeppo (Contributor) commented Jan 19, 2016

And my debug logic was wrong, and bad. Turns out /dev/sdd actually IS getting chowned to ceph:ceph properly. No idea why I'm still getting the permission denied problem from mkfs...

@Ulexus (Contributor) commented Jan 20, 2016

I ran across that also last weekend, and I'm kicking myself for not writing down what I found. I'm pretty sure that it doesn't have anything to do with filesystem permissions, though. I spent several hours building and rebuilding an OSD, stepping through the process, verifying permissions.

@fmeppo (Contributor) commented Jan 20, 2016

It looks like this may be a perms issue after all. ceph-disk activate invokes a bunch of other programs, some of which (probably sgdisk or the partition probe) cause udev to rescan the partitions and reset permissions in /dev. When ceph-disk activate runs some later ceph commands, it asks for them to run as ceph:ceph, and it can't finish activating the journal. I've tried modifying ceph-disk itself to request ceph:disk, and things work as expected. That suggests two possible fixes:

  1. change the GID of the ceph group to match the GID of the disk group
  2. patch ceph-disk, to use a specific group name other than "ceph" (I'd go for "disk")

Thoughts?
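A rough sketch of option 1, if done inside the image (assuming nothing else in the container depends on the ceph group keeping its original GID):

# Sketch: give the ceph group the same GID as the disk group
DISK_GID=$(getent group disk | cut -d: -f3)
groupmod -g "${DISK_GID}" ceph
chgrp -R ceph /var/lib/ceph   # re-own files that still reference the old GID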

@Ulexus (Contributor) commented Jan 20, 2016

I'm not a proponent of ceph-disk for a variety of reasons such as this: it presumes far too much.

I imagine there is a way to configure the UID/GID under which Ceph runs, which should be sufficient (just supply an alternate UID/GID as environment variables for execution). That would seem to be the cleanest way to approach this.

@fmeppo (Contributor) commented Jan 20, 2016

Not without a patch (to ceph-disk). You'd think it would take command-line args, or something, but....nope. It checks to see if the user "ceph" exists, and if so sets both the user and group args for ceph-osd to ceph. If "ceph" doesn't exist, they go to root. Not great. Lemme work up a patch, and see where that goes...

@Ulexus (Contributor) commented Jan 20, 2016

/me avoids being a broken record...

Thanks!

@fmeppo (Contributor) commented Jan 25, 2016

Just created a pull request. This all seems to work on my end (and the OSD runs as ceph:disk, rather than ceph:ceph).

@yxwzaxns (Contributor) commented Jun 11, 2016

So, was the problem solved? I also hit this problem:

command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 2 --monmap /var/lib/ceph/tmp/mnt.X8eUdj/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.X8eUdj --osd-journal /var/lib/ceph/tmp/mnt.X8eUdj/journal --osd-uuid e52763fe-b63a-4932-9c92-7134b23f2212 --keyring /var/lib/ceph/tmp/mnt.X8eUdj/keyring --setuser ceph --setgroup disk
2016-06-11 10:17:03.936116 7f65ea5b5800 -1 filestore(/var/lib/ceph/tmp/mnt.X8eUdj) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.X8eUdj/journal: (13) Permission denied
2016-06-11 10:17:03.936132 7f65ea5b5800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2016-06-11 10:17:03.936173 7f65ea5b5800 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.X8eUdj: (13) Permission denied
mount_activate: Failed to activate
unmount: Unmounting /var/lib/ceph/tmp/mnt.X8eUdj
command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.X8eUdj

@xiaoquqi commented Jun 12, 2016

Same question: is this problem fixed or not? Is there any workaround? Thanks.

@fmeppo (Contributor) commented Jun 12, 2016

I believe this is an issue I found and fixed a few months ago. Unfortunately, the move to a newer build system caused a regression for a number of related fixes for Infernalis. Luckily, they're all still in the Jewel branch, and the fix for this particular issue has been accepted upstream and can be found in 10.2 and all subsequent releases.

If you move to ceph-docker's "Jewel" releases, you should be fine. If not, please chime in (so this can get debugged and sorted out again).


@xiaoquqi commented Jun 12, 2016

It seems I still get the same error when I start the OSD using:

sudo docker run -d --net=host -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ --privileged=true -e OSD_FORCE_ZAP=1 -e OSD_DEVICE=/dev/sdb ceph/daemon:jewel osd_ceph_disk

command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 4 --monmap /var/lib/ceph/tmp/mnt.BT8FXG/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.BT8FXG --osd-journal /var/lib/ceph/tmp/mnt.BT8FXG/journal --osd-uuid 89e240e1-17e9-4d6c-8d4f-f1a3e0278b91 --keyring /var/lib/ceph/tmp/mnt.BT8FXG/keyring --setuser ceph --setgroup disk
2016-06-12 23:37:26.180610 7f8889654800 -1 filestore(/var/lib/ceph/tmp/mnt.BT8FXG) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.BT8FXG/journal: (2) No such file or directory
2016-06-12 23:37:26.180752 7f8889654800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -2
2016-06-12 23:37:26.180918 7f8889654800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.BT8FXG: (2) No such file or directory
mount_activate: Failed to activate
unmount: Unmounting /var/lib/ceph/tmp/mnt.BT8FXG
command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.BT8FXG

@wyllys66 commented Jun 14, 2016

I don't think this issue is specific to Docker; rather, it is a general ceph-disk issue. I have the same problem creating disks with dmcrypt (plain keys) using ceph 10.2.1 on Ubuntu 14.04.4.

Here is the upstart log:

main_trigger: main_trigger: Namespace(cluster='ceph', dev='/dev/sdc1', dmcrypt=None, dmcrypt_key_dir='/etc/ceph/dmcrypt-keys', func=<function main_trigger at 0x7f67311e0d70>, log_stdout=True, prepend_to_path='/usr/bin', prog='ceph-disk', setgroup=None, setuser=None, statedir='/var/lib/ceph', sync=True, sysconfdir='/etc/ceph', verbose=True)
command: Running command: /sbin/init --version
command: Running command: /sbin/blkid -o udev -p /dev/sdc1
command: Running command: /sbin/blkid -o udev -p /dev/sdc1
main_trigger: trigger /dev/sdc1 parttype 4fbd7e29-9d25-41b8-afd0-5ec00ceff05d uuid 755c6c87-0993-47e2-9614-5bf38298f56e
command: Running command: /usr/sbin/ceph-disk --verbose activate --dmcrypt /dev/sdc1
main_trigger: 
main_trigger: main_activate: path = /dev/sdc1
get_dm_uuid: get_dm_uuid /dev/sdc1 uuid path is /sys/dev/block/8:33/dm/uuid
command: Running command: /sbin/blkid -o udev -p /dev/sdc1
command: Running command: /sbin/blkid -o udev -p /dev/sdc1
command: Running command: /sbin/blkid -o udev -p /dev/sdc1
command: Running command: /sbin/blkid -o udev -p /dev/sdc1
command: Running command: /usr/bin/ceph --name client.osd-lockbox.755c6c87-0993-47e2-9614-5bf38298f56e --keyring /var/lib/ceph/osd-lockbox/755c6c87-0993-47e2-9614-5bf38298f56e/keyring config-key get dm-crypt/osd/755c6c87-0993-47e2-9614-5bf38298f56e/luks
get_dmcrypt_key: stderr obtained 'dm-crypt/osd/755c6c87-0993-47e2-9614-5bf38298f56e/luks'

run: cryptsetup --key-file - create 755c6c87-0993-47e2-9614-5bf38298f56e /dev/sdc1 --key-size 256
run: 
run: 
command_check_call: Running command: /bin/chown ceph:ceph /dev/mapper/755c6c87-0993-47e2-9614-5bf38298f56e
command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/mapper/755c6c87-0993-47e2-9614-5bf38298f56e
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
mount: Mounting /dev/mapper/755c6c87-0993-47e2-9614-5bf38298f56e on /var/lib/ceph/tmp/mnt.1cqcTR with options noatime,inode64
command_check_call: Running command: /bin/mount -t xfs -o noatime,inode64 -- /dev/mapper/755c6c87-0993-47e2-9614-5bf38298f56e /var/lib/ceph/tmp/mnt.1cqcTR
activate: Cluster uuid is 74c33b34-ece5-11e3-aed4-000c2970ff98
command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
activate: Cluster name is ceph
activate: OSD uuid is 755c6c87-0993-47e2-9614-5bf38298f56e
allocate_osd_id: Allocating OSD id...
command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 755c6c87-0993-47e2-9614-5bf38298f56e
command: Running command: /bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.1cqcTR/whoami.20927.tmp
activate: OSD id is 3
activate: Initializing OSD...
command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/tmp/mnt.1cqcTR/activate.monmap
got monmap epoch 1
command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs --mkkey -i 3 --monmap /var/lib/ceph/tmp/mnt.1cqcTR/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.1cqcTR --osd-journal /var/lib/ceph/tmp/mnt.1cqcTR/journal --osd-uuid 755c6c87-0993-47e2-9614-5bf38298f56e --keyring /var/lib/ceph/tmp/mnt.1cqcTR/keyring --setuser ceph --setgroup ceph
2016-06-07 13:50:02.003675 7fc73a1d7800 -1 filestore(/var/lib/ceph/tmp/mnt.1cqcTR) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.1cqcTR/journal: (13) Permission denied
2016-06-07 13:50:02.003741 7fc73a1d7800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -13
2016-06-07 13:50:02.003798 7fc73a1d7800 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.1cqcTR: (13) Permission denied
mount_activate: Failed to activate
unmount: Unmounting /var/lib/ceph/tmp/mnt.1cqcTR
command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.1cqcTR
Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4964, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4915, in main
    args.func(args)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3269, in main_activate
    reactivate=args.reactivate,
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3026, in mount_activate
    (osd_id, cluster) = activate(path, activate_key_template, init)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3202, in activate
    keyring=keyring,
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 2695, in mkfs
    '--setgroup', get_ceph_group(),
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 439, in command_check_call
    return subprocess.check_call(arguments)
  File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '3', '--monmap', '/var/lib/ceph/tmp/mnt.1cqcTR/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.1cqcTR', '--osd-journal', '/var/lib/ceph/tmp/mnt.1cqcTR/journal', '--osd-uuid', '755c6c87-0993-47e2-9614-5bf38298f56e', '--keyring', '/var/lib/ceph/tmp/mnt.1cqcTR/keyring', '--setuser', 'ceph', '--setgroup', 'ceph']' returned non-zero exit status 1

Traceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4964, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4915, in main
    args.func(args)
  File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4352, in main_trigger
    raise Error('return code ' + str(ret))
ceph_disk.main.Error: Error: return code 1

@russell-islam commented Jun 30, 2016

Same issue here:
command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 0 --monmap /var/lib/ceph/tmp/mnt.onP01K/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.onP01K --osd-journal /var/lib/ceph/tmp/mnt.onP01K/journal --osd-uuid 7902072a-e34d-41d6-b091-bdc624640650 --keyring /var/lib/ceph/tmp/mnt.onP01K/keyring --setuser ceph --setgroup disk
mount_activate: Failed to activate
unmount: Unmounting /var/lib/ceph/tmp/mnt.onP01K
command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.onP01K
Traceback (most recent call last):
File "/usr/sbin/ceph-disk", line 9, in
load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4994, in run
main(sys.argv[1:])
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4945, in main
args.func(args)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3299, in main_activate
reactivate=args.reactivate,
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3056, in mount_activate
(osd_id, cluster) = activate(path, activate_key_template, init)
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3232, in activate
keyring=keyring,
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2725, in mkfs
'--setgroup', get_ceph_group(),
File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2672, in ceph_osd_mkfs
raise Error('%s failed : %s' % (str(arguments), error))
ceph_disk.main.Error: Error: ['ceph-osd', '--cluster', 'ceph', '--mkfs', '--mkkey', '-i', '0', '--monmap', '/var/lib/ceph/tmp/mnt.onP01K/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.onP01K', '--osd-journal', '/var/lib/ceph/tmp/mnt.onP01K/journal', '--osd-uuid', '7902072a-e34d-41d6-b091-bdc624640650', '--keyring', '/var/lib/ceph/tmp/mnt.onP01K/keyring', '--setuser', 'ceph', '--setgroup', 'disk'] failed : 2016-06-30 12:56:44.787350 7f2423018800 -1 filestore(/var/lib/ceph/tmp/mnt.onP01K) mkjournal error creating journal on /var/lib/ceph/tmp/mnt.onP01K/journal: (2) No such file or directory
2016-06-30 12:56:44.787447 7f2423018800 -1 OSD::mkfs: ObjectStore::mkfs failed with error -2
2016-06-30 12:56:44.787522 7f2423018800 -1 ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.onP01K: (2) No such file or directory

@ghost commented Jul 22, 2018

+1
Same problem on the latest ceph/daemon image (2018-07-22).

@phoerious commented Dec 20, 2018

I am seeing the same thing when I try to activate OSDs, but for me it's a Permission denied error, not "No such file or directory".
I was able to deploy a Ceph cluster using ceph-deploy osd create on the host, but with the Docker image I keep seeing this error and couldn't find a way to work around it.
