WIP: first iteration of a unique daemon image #78

Merged: 1 commit into master on Jun 15, 2015

Conversation

@leseb (Member) commented Jun 10, 2015

The idea is pretty straightforward: we simply pass a new env var and boot a monitor like this:

`sudo docker run -d --net=host -v /etc/ceph:/etc/ceph -e CEPH_DAEMON=MON -e MON_IP=192.168.0.20 -e CEPH_NETWORK=192.168.0.0/24 ceph/daemon`

So far, I've been able to successfully bootstrap MON, MDS and RGW.
I had to fix an MDS issue. Because we use `set -e`, we cannot really
trap a command's return code in a variable: the script will exit before
that, since the command potentially returns something other than 0.

For the OSD, it's probably me...

I couldn't find a better name than "daemon" for now.
Let's first discuss the implementation.
I also added some meaningful log messages for the OSD.

Signed-off-by: Sébastien Han seb@redhat.com

@leseb (Member, Author) commented Jun 10, 2015

This probably needs a bit more documentation...

@Ulexus (Contributor) commented Jun 10, 2015

Thanks @leseb. "daemon" works for me.

A few ideas:

  • CEPH_DAEMON should accept a space-delimited list of daemons to run. Yes, recommended practice is to not combine them, but (among other things), this would make your demo package easier.
  • For the two machine-bound services (mon and osd), I wonder if we could build on the OSD's autodetection concept by checking to see if directories for mon and osd services exist in standard locations. In this way, we could have the daemons start automatically based on directory structure, allowing a fully generic host service definition (i.e., "start Ceph on this server").

The bootstrapping, to me, offers an interesting quandary. If we are not instructed to start a mon, but we are not bootstrapped (no ceph config or keys), should the container simply die? Should it attempt to create a mon and bootstrap? Perhaps if there is no daemon specified, it could bootstrap a mon?
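
A rough sketch of what that directory-based detection could look like in the entrypoint (illustrative only; the exact directory layout and daemon flags would need checking):

```bash
#!/bin/bash
set -e

# Illustrative sketch: start whatever daemons already have data directories
# under the standard /var/lib/ceph layout.
start_detected_daemons() {
  for mon_path in /var/lib/ceph/mon/ceph-*; do
    [ -d "${mon_path}" ] || continue
    mon_id="$(basename "${mon_path}" | sed 's/^ceph-//')"
    echo "Found monitor data in ${mon_path}, starting mon.${mon_id}"
    ceph-mon -d -i "${mon_id}" &
  done

  for osd_path in /var/lib/ceph/osd/ceph-*; do
    [ -d "${osd_path}" ] || continue
    osd_id="$(basename "${osd_path}" | sed 's/^ceph-//')"
    echo "Found OSD data in ${osd_path}, starting osd.${osd_id}"
    ceph-osd -d -i "${osd_id}" &
  done

  wait
}
```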

@leseb (Member, Author) commented Jun 10, 2015

I was initially thinking of allowing multiple daemons to run in a single container. Later I remembered that this is not what we recommend, and the implementation might be tricky as well and could lead to undesired behaviours.
Moreover, I don't think this will help the demo container, as the demo entrypoint is really its own thing. The way services are configured there is intended for a demo and nothing else (low pg/pgp counts, hardcoded pool name, etc.).
I don't really want to mix what we consider for production usage and what we recommend as a sandbox.
This is why I'd like to keep 'daemon' and 'demo' separate. This will probably avoid confusion too.

So yes, in the end I'm more inclined to force users to run micro-service containers instead of running multiple daemons. Given that containers are really lightweight, I don't see any reason why someone would want to run more than one daemon inside a container.

There is definitely room for improvement in the OSD part, as I believe the current state is too complex and not user-friendly at all. We should think of a new design, I guess.

Regarding the bootstrapping, we could let daemons die, but a check to verify that communication can be established with a monitor is probably better. This will avoid unpleasant debugging. I can work on something.
Finally, should we bootstrap a mon if none exists? I'd say why not, but generally I'd rather return an error if the user didn't follow the proper steps, so the error provides some guidance on how to do things properly. Doing things under the hood by working around user mistakes is not really a good idea. If it fails, the user will learn why and will set it up properly.
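
For the monitor-reachability check mentioned above, a minimal sketch could be (the timeout value and wording are arbitrary):

```bash
# Sketch: fail early with a clear message if no monitor can be reached
# with the mounted /etc/ceph/ceph.conf and keyring.
check_mon_reachable() {
  if ! timeout 10 ceph health > /dev/null 2>&1; then
    echo "ERROR: cannot reach any monitor."
    echo "Bootstrap a monitor first (CEPH_DAEMON=MON) or mount a valid /etc/ceph."
    exit 1
  fi
}
```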

@Ulexus (Contributor) commented Jun 10, 2015

> Doing things under the hood by working around user mistakes is not really a good idea

That's a good way of thinking about it. Yes, I would have to agree.

You're also correct about the multiple daemons (aside from the OSD and workarounds required there...but that doesn't have any bearing on the daemon selection).

Another thing to consider with the single-container thing: we don't have to worry about backward compatibility, so we should probably reorganize the osd entrypoint, since it's rather messy. Maybe the mon, too. I've been thinking of the various ways to integrate these with etcd/confd (in a flexible and not-mandatory way).

I'll definitely say that I am presently caught up in the concept of directory-described execution... and I think we could provide a means of pulling down configs and keys (from etcd, Consul, S3, a URL, etc.).

@Ulexus (Contributor) commented Jun 10, 2015

(perhaps?) Configuration and key extraction procedure:

  1. Check for local <file> in /etc/ceph/
  2. Check for CONFIG_METHOD (defaults to none)
  3. Attempt to pull <file> via CONFIG_METHOD handler
  4. (For <file> other than ceph.conf and ceph.client.admin.keyring) Call ceph auth get-or-create
  5. Otherwise fail or bootstrap, as appropriate
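
In entrypoint terms, that precedence might look roughly like this (a sketch; the CONFIG_METHOD values, key layout and CONFIG_URL are hypothetical):

```bash
# Hypothetical sketch of the precedence described above.
get_config_file() {
  local file="$1"

  # 1. Prefer a file already present in /etc/ceph/.
  if [ -f "/etc/ceph/${file}" ]; then
    return 0
  fi

  # 2./3. Otherwise try the configured store, if any.
  # NB: with `set -e`, the caller should test this function inside an `if`.
  case "${CONFIG_METHOD:-none}" in
    etcd) etcdctl get "/ceph/${file}" > "/etc/ceph/${file}" ;;          # hypothetical key layout
    url)  curl -fsSL "${CONFIG_URL}/${file}" -o "/etc/ceph/${file}" ;;  # hypothetical CONFIG_URL
    none) return 1 ;;                                                   # 5. caller fails or bootstraps
  esac
}

# 4. Keyrings other than ceph.conf and ceph.client.admin.keyring could
#    instead be generated with `ceph auth get-or-create`.
```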

@Ulexus (Contributor) commented Jun 10, 2015

I think the current OSD detection routine is actually fairly good; it should just be simplified and better documented. (referring to the directory detection).

Bootstrapping OSDs, though, is a bit painful at the moment. Specifically, the need to create the OSD outside of the container is counter-intuitive. If the client.admin keyring is available, though, we should be able to have the script fully bootstrap an OSD (including creation of that OSD). This would come close to @hookenz's addition concept: create and mount the directory, and the container takes care of everything else.

Even better if we had a small execution wrapper instead of the current startup script, which could continually monitor the osd directory structure to add (and maybe remove) OSDs as they appear.
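
A toy version of such a wrapper, just to illustrate the idea (polling; a real one would also want to reap dead processes):

```bash
# Illustrative polling loop: start a ceph-osd for every OSD directory that
# appears under /var/lib/ceph/osd, remembering which ones are already running.
declare -A running
while true; do
  for osd_path in /var/lib/ceph/osd/ceph-*; do
    [ -d "${osd_path}" ] || continue
    osd_id="$(basename "${osd_path}" | sed 's/^ceph-//')"
    if [ -z "${running[${osd_id}]:-}" ]; then
      echo "New OSD directory detected: ${osd_path}, starting osd.${osd_id}"
      ceph-osd -i "${osd_id}" &
      running[${osd_id}]=$!
    fi
  done
  sleep 10
done
```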

@leseb (Member, Author) commented Jun 10, 2015

Just did some cleanup and added some functions to check several things (ceph.conf and admin key).

If we focus on the CONFIG_METHOD, I agree that we should be compliant with several config stores.
It looks like you want to use these stores to 'store' keys and ceph.conf.
I also think that we could use this for configuration options too, though that is actually more difficult to do.

I can try to prototype a default push config, or maybe simply rely on the monitors' config store just like @hunter suggested in #34.
I think we can start with the 'default' backend, which I'd like to call `file`, and then try to implement `ceph-monitor`.
Or we could just use the monitor store as the default store. However, this will require access to the Ceph cluster (conf and key), which ends up being a chicken-and-egg problem :-(

@leseb (Member, Author) commented Jun 10, 2015

Can we maybe first merge the daemon prototype and then in another PR I'll work on the config store backends?
I think it's just too much to do in one shot.

@leseb force-pushed the single-container branch 2 times, most recently from 9f71a2d to a517cac on June 10, 2015 at 16:58
@Ulexus (Contributor) commented Jun 10, 2015

Oh, certainly.


@hookenz (Contributor) commented Jun 10, 2015

This looks interesting.

By the way, why do you pass CEPH_DAEMON= as an environment variable and not as a command to run?
And why are you using environment variables for CEPH_NETWORK when you could read this from ceph.conf?

@hunter (Contributor) commented Jun 11, 2015

Looks like a nice approach.

Is the assumption that multiple OSDs will be run from a single container to get around the issues with inter-host communication?

@leseb (Member, Author) commented Jun 11, 2015

@hookenz mainly because we need to configure the container upfront; if we simply run the command, we need to know the id of the daemon. Does that answer your question? I'm not sure I got it right.
We use the CEPH_NETWORK var only for the monitors, while building the first monitor and its ceph.conf. Since monitors can't run without the `--net=host` flag, we need to tell the container which host network to use.
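
To illustrate why those two variables are needed up front: the very first monitor has to generate a minimal ceph.conf along these lines (a simplified sketch, not the exact file the script writes):

```bash
# Simplified sketch of the ceph.conf the first monitor has to generate.
cat > /etc/ceph/ceph.conf <<EOF
[global]
fsid = $(uuidgen)
mon initial members = $(hostname -s)
mon host = ${MON_IP}
public network = ${CEPH_NETWORK}
cluster network = ${CEPH_NETWORK}
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
EOF
```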

@hunter to be honest I'm not sure what the plan is. What I am sure of is that I'd like to refactor the OSD part and restart from scratch. For example, I think we should start using ceph-disk to bootstrap them.

@hunter (Contributor) commented Jun 11, 2015

Agreed. Since the OSDs are such a critical part of running a scalable Ceph cluster it's important that we build something that's going to be flexible for mounting, hot swapping, journals, etc. ceph-disk will probably help with a number of those things. I'll test out a few ideas when I get a spare minute.

@leseb (Member, Author) commented Jun 11, 2015

I started to play with ceph-disk inside a container; the results seem rather random. From what I observed, privileged mode is needed.
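
For reference, that kind of run would look something like this, mirroring the OSD example in the diff but with privileged mode (the device name is just an example):

```bash
# Sketch: ceph-disk needs privileged mode to partition the device and see /dev.
sudo docker run -d --privileged=true \
  -v /etc/ceph:/etc/ceph \
  -v /var/lib/ceph/:/var/lib/ceph/ \
  -v /dev/:/dev/ \
  -e CEPH_DAEMON=OSD \
  -e OSD_DEVICE=/dev/vdd \
  ceph/daemon
```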

@leseb (Member, Author) commented Jun 11, 2015

@Ulexus I just added support for bootstrap keys, since it's a best practice to use them instead of always requiring the admin key.

@leseb force-pushed the single-container branch 6 times, most recently from 153e117 to 278ed44 on June 11, 2015 at 17:47
@leseb mentioned this pull request on Jun 12, 2015
@hookenz (Contributor) commented Jun 12, 2015

I don't think you should default to starting any daemon if none is specified. That might lead to rogue or orphaned mons if you're not careful. Better to just exit with an error and some help text.
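
i.e. something as simple as this in the entrypoint (a sketch; the listed values are just the daemons discussed in this PR):

```bash
# Sketch: refuse to guess which daemon to run.
if [ -z "${CEPH_DAEMON}" ]; then
  echo "ERROR: CEPH_DAEMON is not set. Valid values: MON, OSD, MDS, RGW."
  echo "Example: docker run -e CEPH_DAEMON=MON ... ceph/daemon"
  exit 1
fi
```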

@leseb (Member, Author) commented Jun 13, 2015

@hookenz no worries I don't :)

@leseb force-pushed the single-container branch 6 times, most recently from 84e444a to af6f846 on June 15, 2015 at 12:38
@leseb (Member, Author) commented Jun 15, 2015

@Ulexus can you do a last round on this one?
I think we are good to merge it.

* Run multiple OSDs within the same container

To run multiple OSDs within the same container, simply bind-mount each OSD datastore directory:
* `docker run -v /osds/1:/var/lib/ceph/osd/ceph-1 -v /osds/2:/var/lib/ceph/osd/ceph-2`
Review comment from a Contributor on the hunk above:

Or, since we are now providing the OSD directory option as a top-level feature, the example here should probably reflect that: export the osd directory

@Ulexus (Contributor) commented Jun 15, 2015

Architecturally, I don't know that we particularly need to separate the execution commands for OSD_DEVICE and OSD_DIRECTORY. We should be able to simply check the /var/lib/ceph/osd tree on start to see if there exist OSDs in that directory, then check for OSD_DEVICE, then fail. That also means that you can bootstrap an OSD using --privileged, but on subsequent runs, have the host mount the OSD and run without --privileged.
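
In pseudo-shell, that precedence would be roughly (a sketch; the two helper functions are illustrative names, not existing code):

```bash
# Sketch of the proposed order: existing OSD directories first, then a device, then fail.
if compgen -G "/var/lib/ceph/osd/ceph-*" > /dev/null; then
  start_osds_from_directories                    # run every OSD found under /var/lib/ceph/osd
elif [ -n "${OSD_DEVICE}" ]; then
  prepare_and_start_osd_device "${OSD_DEVICE}"   # needs --privileged on the first run
else
  echo "ERROR: no OSD directories found and OSD_DEVICE is not set."
  exit 1
fi
```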

```
-v /var/lib/ceph/:/var/lib/ceph/ \
-v /dev/:/dev/ \
-e CEPH_DAEMON=OSD \
-e OSD_DEVICE=/dev/vdd \
```
Review comment from a Contributor on the hunk above:

This doesn't support multiple devices?

What about the docker run `--device` option? It's redundant with mounting /dev, but is there any point in also using it?

Reply from the Member (Author):

I've tried the `--device` option without luck. Even doing `-v /dev/vdb` didn't work. The only way for me to get it working was to use `-v /dev:/dev`...

Reply from the Contributor:

I'm guessing ceph-disk needs access to the other parts of /dev? /dev/disk/by-*

Reply from the Member (Author):

Correct. sgdisk uses the disk's UUID.

@Ulexus (Contributor) commented Jun 15, 2015

One other thing: is there any particular reason to require the daemon selection to be an environment variable instead of a parameter? Since this is the entrypoint script, docker automatically passes the image arguments on to it. Hence, we could simply execute `docker run ceph/daemon osd` instead of `docker run -e CEPH_DAEMON=OSD ceph/daemon`. The former seems more natural to me.

@leseb (Member, Author) commented Jun 15, 2015

I agree it makes more sense to run `docker run ceph/daemon osd`; unfortunately I don't know how to do this :) Assistance required on this.
Is it simply a `$1`?

@Ulexus (Contributor) commented Jun 15, 2015

Ah, no problem. I say we go ahead and merge this but not publish it to Docker Hub yet. Then we can work on it... WIP, after all.
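
For what it's worth, the parameter approach discussed above would boil down to something like this at the top of the entrypoint (a sketch; the start_* helpers are illustrative names):

```bash
# Sketch: take the daemon type from the first argument, falling back to CEPH_DAEMON.
CEPH_DAEMON="${1:-${CEPH_DAEMON}}"

case "$(echo "${CEPH_DAEMON}" | tr '[:upper:]' '[:lower:]')" in
  mon) start_mon ;;
  osd) start_osd ;;
  mds) start_mds ;;
  rgw) start_rgw ;;
  *)
    echo "ERROR: unknown or missing daemon type '${CEPH_DAEMON}'"
    exit 1
    ;;
esac
```

With that in place, both `docker run ceph/daemon osd` and `docker run -e CEPH_DAEMON=OSD ceph/daemon` would work.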

leseb added a commit that referenced this pull request Jun 15, 2015
WIP: first iteration of a unique daemon image
@leseb merged commit 5bb877d into master on Jun 15, 2015
* use `OSD_DIRECTORY` where you specify an OSD mount point to your container


### Ceph disk ###
Review comment from a Contributor on the hunk above:

It seems the Ceph disk mode is for single use only? It looks like it checks the partition and exits (unless a zap is needed). Should it be documented that directory mode should be used on subsequent runs?

Reply from the Member (Author):

Yes, we should clarify that preparation steps are needed before running the container and exposing the OSD directory.

leseb added a commit that referenced this pull request Jun 17, 2015
Since #78, we introduced the daemon container that centralizes all the
Ceph daemons instead of having separate container images. We now only
have one.

Signed-off-by: Sébastien Han <seb@redhat.com>
leseb added commits with the same message that referenced this pull request on Jul 21, 2015, Oct 16, 2015 and Dec 3, 2015
@leseb deleted the single-container branch on December 17, 2015
mkkie pushed a commit to mkkie/ceph-container that referenced this pull request Nov 1, 2017
Docker under v1.12 can't build with symbolic link file