cassandra cookbook and related fixes #81

Closed
pcn opened this Issue Dec 9, 2011 · 5 comments

@pcn
pcn commented Dec 9, 2011

I needed to add this:

:commitlog_dir => { :uid => :user, :gid => :group, },
:saved_caches_dir => { :uid => :user, :gid => :group, },

to the top of standard_dirs.rb to get the cassandra server minimally functional. Is there supposed to be a way to indicate additional "STANDARD_DIRS" are necessary from within the recipe that's using it?

Also, in access.properties.erb the first line interpolated by Chef must not iterate over @acls.keys.each_pair (keys returns an array, which has no each_pair). It needs to look like this:

<%- @acls.each_pair do |keyspace, cfs| %>
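For illustration, here is a minimal sketch of the corrected loop rendered with plain ERB outside Chef. The sample data, the `render_acls` helper, and the output line format are assumptions made for the example, not the cookbook's actual template.

```ruby
require 'erb'

# Hypothetical sample data; in the cookbook, @acls comes from the data bag.
SAMPLE_ACLS = { 'abc' => { 'rw' => ['ab'], 'ro' => ['test'] } }.freeze

# Illustrative template only. It iterates over the hash itself, because
# Hash#keys returns an Array and Array does not define each_pair.
ACL_TEMPLATE = <<~TEMPLATE
  <%- acls.each_pair do |keyspace, cfs| -%>
  <%= keyspace %>.<rw>=<%= cfs['rw'].join(',') %>
  <%- end -%>
TEMPLATE

# Render the template against a hash of keyspace => permissions.
# trim_mode '-' enables the <%- and -%> whitespace-trimming tags.
def render_acls(acls)
  ERB.new(ACL_TEMPLATE, trim_mode: '-').result(binding)
end

puts render_acls(SAMPLE_ACLS)
```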

The documentation for the format of the data bag has a ruby data structure. Could something like the following be added to clarify the structure if it's edited by hand?

# An example of the resulting json data looks like this:
#     "TestCassandra": {
#       "authentication": {
#         "use_md5": true,
#         "users": {
#           "ab": "cd",
#           "test": "testpass"
#         }
#       },
#       "authority": {
#         "keyspace": {
#           "abc": {
#             "rw": [
#               "ab"
#             ],
#             "ro": [
#               "test"
#             ]
#           },
#           "_": {
#             "rw": [
#               "ab"
#             ],
#             "ro": [
#               "test"
#             ]
#           }
#         }
#       }
#     },
@mrflip
Member
mrflip commented Dec 15, 2011

What we should probably do is have standard_dirs accept either

  • a symbol for a thing it knows about -- in which case the STANDARD_DIRS constant rules;
  • a symbol for a thing it doesn't, in which case you'll get [root, root, 755] and you'll like it; or
  • a hash, which overrides those defaults.

I'd rather just replace it with the JSON altogether, esp. because I suspect it's not all the way up-to-date. Can you send a pull request with just the JSON -- best of all if it's just a simplified version of what you actually use?

@pcn
pcn commented Dec 15, 2011

You said:

I'd rather just replace it with the JSON altogether, esp. because I suspect it's not all the way up-to-date. Can you send a pull request with just the JSON -- best of all if it's just a simplified version of what you actually use?

I think the JSON that I used is in the referenced pull request. The issue of standard_dirs is still in play.

What is the API you imagine for extending the directories that can be handled by standard_dirs? I'm still a ruby newbie, but it looks like something that could be merged into STANDARD_DIRS is what you're talking about? So for this case:

standard_dirs('cassandra') do
  directories   [:conf_dir, :log_dir, :lib_dir, :pid_dir, :data_dirs, :commitlog_dir, :saved_caches_dir]
  group         'root'
end

in the infochimps cookbook repo standard_dirs doesn't know what to do with :commitlog_dir or :saved_caches_dir, right?

So reading your thoughts, this means that:

  • a symbol for a thing it knows about -- in which case the STANDARD_DIRS constant rules;

Meaning that for :conf_dir, :log_dir, :lib_dir, :pid_dir, and :data_dirs, The Right Thing(tm) will be done as it is now if I understand?

  • a symbol for a thing it doesn't, in which case you'll get [root, root, 755] and you'll like it; or

In this case, for e.g. :commitlog_dir, it wouldn't know what to do, so it would look up node['cassandra'][:commitlog_dir], find the path, and, knowing nothing else about it, make the directory root, root, 0755. Is that right?

  • a hash, which overrides those defaults.

In this case, instead of passing in just :commitlog_dir, the recipe would pass in something like {:commitlog_dir => {:user => 'cassandra', :group => 'cassandra', :mode => '0755' }}. The key :commitlog_dir would be used to augment STANDARD_DIRS, so that the rest of the function would work by looking up the directory name via node[:cassandra][:commitlog_dir] and applying the user, group, and mode provided in the hash.

Do I understand what you're proposing? If so, it seems the "symbol for a thing it doesn't [know about]" case should also emit a warning, since that's probably the desired action. However, it seems that since this is in the cassandra recipe:

default[:cassandra][:user]              = 'cassandra'
default[:cassandra][:group]             = 'nogroup'

it's worth thinking about also looking for node[:cassandra][:user] and node[:cassandra][:group] since this is often defined, and more often than not applications making directories will want to use a user+group that is already defined in the package. Do you think this convention is established firmly and broadly enough that it'd be useful to encourage it and use it?

@mrflip
Member
mrflip commented Dec 15, 2011

I missed the other pull request -- thanks, that was great.


Most of your recap sounds spot on. For case three, however, I was thinking something like

standard_dirs('cassandra') do
  directories   [:conf_dir, :log_dir, :lib_dir, :pid_dir, :data_dirs,
    { :name => :commitlog_dir, :user => 'root', :group => 'databases', },
    { :name => :saved_caches_dir, :mode => '0700' } ]
end

Not the prettiest thing in the world, but will work. I would leave STANDARD_DIRS alone... the hash would just apply to this call.
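A rough sketch of how such a mixed list might be resolved, assuming nothing about the real implementation: `KNOWN_DIRS` is a hypothetical stand-in for STANDARD_DIRS (its entries are invented), and `resolve_dir` is an illustrative helper name. Per-call hashes are merged on top of the defaults without modifying the constant.

```ruby
# Hypothetical stand-in for the cookbook's STANDARD_DIRS constant; the real
# entries and their attributes may differ.
KNOWN_DIRS = {
  conf_dir: { user: 'root',      group: 'root',      mode: '0755' },
  log_dir:  { user: 'cassandra', group: 'cassandra', mode: '0775' }
}.freeze

# Defaults for anything the constant does not know about.
FALLBACK_DIR = { user: 'root', group: 'root', mode: '0755' }.freeze

# Turn one entry of the `directories` list into a full settings hash.
def resolve_dir(entry)
  case entry
  when Symbol
    # Known symbols use the constant; unknown ones get root, root, 0755.
    { name: entry }.merge(KNOWN_DIRS.fetch(entry, FALLBACK_DIR))
  when Hash
    # A hash must carry :name; its other keys override the defaults
    # for this call only.
    name = entry.fetch(:name)
    { name: name }.merge(KNOWN_DIRS.fetch(name, FALLBACK_DIR)).merge(entry)
  else
    raise ArgumentError, "expected a Symbol or Hash, got #{entry.inspect}"
  end
end

dirs = [:conf_dir, { name: :commitlog_dir, group: 'databases' },
        { name: :saved_caches_dir, mode: '0700' }]
dirs.each { |d| p resolve_dir(d) }
```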

I actually think there's enough cases of 'a journal or commitlog directory' and of 'a set of caches even more or differently ephemeral than what goes in tmp' to make them STANDARD_DIRS.

  • does the saved_caches_dir properly behave like other cache_dirs (that is, would you put it in /var/cache/cassandra if bulk storage weren't a concern)?
  • I mildly prefer the name journal_dir to commitlog_dir. Actually, it's more that I hate 'commitlog' (it makes me think it's a log) and only slightly dislike journal_dir. If anyone suggests an alternative, or upvotes journal_dir we can add it to the set of standard dirs.

Yes, I'd like to have it discover the user and group from the node metadata -- volume_dirs already does this, so just need to move to that helper. (It looks for node[:cassandra][:server][:user], then node[:cassandra][:user], then falls back.)
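The lookup order described above could be sketched roughly as follows. This uses a plain Hash in place of Chef's node object; the `discover_attr` name and the 'root' fallback value are assumptions for the example, not the actual volume_dirs code.

```ruby
# Find an attribute such as :user for a component, trying the most specific
# location first: node[system][component][attr], then node[system][attr],
# then the fallback. `node` is a plain Hash standing in for Chef's node.
def discover_attr(node, system, component, attr, fallback = 'root')
  sys = node.fetch(system, {})
  sys.fetch(component, {})[attr] || sys[attr] || fallback
end

node = { cassandra: { user: 'cassandra', group: 'nogroup' } }
puts discover_attr(node, :cassandra, :server, :user)   # found at node[:cassandra][:user]
puts discover_attr(node, :cassandra, :server, :mode)   # nothing set, so the fallback
```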

@mrflip
Member
mrflip commented Dec 15, 2011

Upon reflection, I think standard_dirs should be the express lane only, and there should be another helper with simple sugar:

The standard_dirs definition is only for things blessed in the STANDARD_DIRS collection; throws an error otherwise. It uses that hash and appropriate node metadata to define directories. Any overrides apply to everything in the directories line of this invocation:

standard_dirs('cassandra') do
  directories   [:conf_dir, :log_dir, :lib_dir, :pid_dir, :data_dirs]
  owner 'bob'  # overrides for all.
end
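One way the strict behaviour might look, as a sketch only: `BLESSED_DIRS` is a hypothetical subset standing in for STANDARD_DIRS, and `strict_dirs` is an illustrative name. Unknown names raise an error, and any overrides are applied to every directory in the call.

```ruby
# Hypothetical subset standing in for STANDARD_DIRS; real entries may differ.
BLESSED_DIRS = {
  conf_dir: { owner: 'root', group: 'root', mode: '0755' },
  log_dir:  { owner: 'root', group: 'root', mode: '0755' }
}.freeze

# Resolve only blessed directory names; anything else is an error.
# Overrides are merged into every directory named in this call.
def strict_dirs(names, overrides = {})
  unknown = names - BLESSED_DIRS.keys
  raise ArgumentError, "not in STANDARD_DIRS: #{unknown.inspect}" unless unknown.empty?

  names.map { |name| [name, BLESSED_DIRS[name].merge(overrides)] }.to_h
end

p strict_dirs([:conf_dir, :log_dir], owner: 'bob')
```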

This is the helper for everything else. It will use node metadata to, for example, find the user and group:

standard_dirs('cassandra') do
  path   :commitlog_dir
  owner  :user    # default is still 'root'
  group  :group   # default is still 'root'
end

standard_dirs('cassandra') do
  path   :saved_caches_dir
  owner  :user
  mode   '0700'
end

Need to take a hard look at volume_dirs, standard_dirs and this last one and make sure they all have the same mouthfeel.

@pcn
pcn commented Dec 16, 2011

OK. I can't speak to volume_dirs (I'm not using that feature at this time, so I have no experience to speak from), but this proposal sounds like it'd make a lot more sense, since it provides Chef-like syntax.

@temujin9 temujin9 closed this Apr 19, 2012
@mrflip mrflip pushed a commit that referenced this issue Apr 28, 2012
Philip (flip) Kromer Merge branch 'master' of github.com:infochimps-labs/ironfan
* 'master' of github.com:infochimps-labs/ironfan:
  Removing completed TODO items related to Cluster Chef -> Ironfan rename
  Removing completed TODO item
  From conversation with Flip: the remainder of #81 should be a TODO item, not an issue
  Make ironfan work with string (in addition to array) for cluster_path, to conform to chef semantics (fixes #130)
68fcc8f