bricks_waiting_to_join - undefined method `empty?' for nil:NilClass #60

Closed
dpattmann opened this issue May 25, 2016 · 17 comments

@dpattmann
Contributor

Hi,

After the first chef-run, I get this error message from my chef-client (12.7.2).

  ================================================================================
  Recipe Compile Error in /var/chef/cache/cookbooks/gluster/recipes/server.rb
  ================================================================================

  NoMethodError
  -------------
  undefined method `empty?' for nil:NilClass

  Cookbook Trace:
  ---------------
    /var/chef/cache/cookbooks/gluster/recipes/server_extend.rb:40:in `block in from_file'
    /var/chef/cache/cookbooks/gluster/recipes/server_extend.rb:1:in `each'
    /var/chef/cache/cookbooks/gluster/recipes/server_extend.rb:1:in `from_file'
    /var/chef/cache/cookbooks/gluster/recipes/server.rb:24:in `from_file'

  Relevant File Content:
  ----------------------
  /var/chef/cache/cookbooks/gluster/recipes/server_extend.rb:

   33:        unless brick_in_volume?(peer_name, brick, volume_name)
   34:          node.default['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join'] << " #{peer_name}:#{brick}"
   35:        end
   36:      end
   37:    end
   38:  
   39:    replica_count = volume_values['replica_count']
   40>>   next if node['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join'].empty?
   41:    # The number of bricks in bricks_waiting_to_join has to be a modulus of the replica_count we are using for our gluster volume
   42:    if (brick_count % replica_count) == 0
   43:      Chef::Log.info("Attempting to add new bricks into volume #{volume_name}")
   44:      execute "gluster volume add-brick #{volume_name} #{node['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join']}" do
   45:        action :run
   46:      end
   47:      node.set['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join'] = ''
   48:    elsif volume_values['volume_type'] == 'striped'
   49:      Chef::Log.warn("#{volume_name} is a striped volume, adjusting replica count to match new number of bricks")


  Running handlers:
[2016-05-25T15:43:08+00:00] ERROR: Running exception handlers
  Running handlers complete
[2016-05-25T15:43:08+00:00] ERROR: Exception handlers complete
  Chef Client failed. 1 resources updated in 07 seconds
[2016-05-25T15:43:08+00:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2016-05-25T15:43:08+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2016-05-25T15:43:08+00:00] ERROR: undefined method `empty?' for nil:NilClass
[2016-05-25T15:43:09+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

I fixed this error by setting bricks_waiting_to_join to '' for each volume, but it looks like a bug 🐛

@shortdudey123
Owner

That attribute is defined at https://github.com/shortdudey123/chef-gluster/blob/master/recipes/server_extend.rb#L21 and that code being skipped most likely means 1 of 2 things: 1) peer_names or peers is empty or 2) the chef server does not contain node entries for any of the peers.

Can you confirm if one of these is true?

I am thinking that code needs to be moved outside the peers.each block otherwise chef runs will bomb w/o a peer list
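
To illustrate the suspected failure mode, here is a minimal plain-Ruby sketch. It does not use any Chef APIs: `bricks_waiting` and `lookup` are hypothetical stand-ins for the recipe's loop and its per-peer node lookup, and the hash stands in for the volume's node attributes. It assumes the recipe skips a peer whenever its node cannot be found.

```ruby
# Plain-Ruby stand-in for the recipe's control flow; no Chef APIs are used.
# `lookup` is a hypothetical replacement for the per-peer node lookup.
def bricks_waiting(peers, lookup)
  volume = {} # stands in for node['gluster']['server']['volumes'][volume_name]
  peers.each do |peer|
    peer_node = lookup.call(peer)
    next if peer_node.nil? # lookup failed: skip this peer entirely

    # The initialization lives inside the loop, so it only runs for peers that were found.
    volume['bricks_waiting_to_join'] = '' unless volume.key?('bricks_waiting_to_join')
  end
  volume['bricks_waiting_to_join']
end

found_none = bricks_waiting(%w[gluster-01 gluster-02], ->(_peer) { nil })
empty_list = bricks_waiting([], ->(_peer) { {} })
found_all  = bricks_waiting(%w[gluster-01], ->(_peer) { {} })
```

In this sketch, `found_none` and `empty_list` are both `nil`, so calling `.empty?` on either raises the same NoMethodError as in the report, while `found_all` is an empty string.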

@dpattmann
Contributor Author

Hi @shortdudey123,

I'm sorry, but peer_names is not set and peers is set to 4 FQDNs.

I attached my role settings.

    "gluster": {
      "version": "3.7",
      "server": {
        "peer_retries": 10,
        "peer_retry_delay": 60,
        "disks": [
          "/dev/xvdb"
        ],
        "brick_mount_path": "/data",
        "volumes": {
          "gv0": {
            "peers": [
              "gluster-01.example.org",
              "gluster-02.example.org",
              "gluster-03.example.org",
              "gluster-04.example.org"
            ],
            "replica_count": 2,
            "volume_type": "distributed-replicated",
            "size": "10G",
            "bricks_waiting_to_join": ""
          }
        }
      }
    }

@dpattmann
Contributor Author

To me it looks like https://github.com/shortdudey123/chef-gluster/blob/master/recipes/server_extend.rb#L20-L22 isn't working correctly.

My fix for this error would be this.

diff --git a/recipes/server_extend.rb b/recipes/server_extend.rb
index 65e4e4e..7201939 100644
--- a/recipes/server_extend.rb
+++ b/recipes/server_extend.rb
@@ -17,10 +17,6 @@ node['gluster']['server']['volumes'].each do |volume_name, volume_values|
       next
     end

-    unless node.default['gluster']['server']['volumes'][volume_name].attribute?('bricks_waiting_to_join')
-      node.default['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join'] = ''
-    end
-
     peer_bricks = chef_node['gluster']['server']['volumes'][volume_name]['bricks'].select { |brick| brick.include? volume_name }
     brick_count += (peer_bricks.count || 0)
     peer_bricks.each do |brick|
@@ -37,7 +33,7 @@ node['gluster']['server']['volumes'].each do |volume_name, volume_values|
   end

   replica_count = volume_values['replica_count']
-  next if node['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join'].empty?
+  next unless node['gluster']['server']['volumes'][volume_name].attribute?('bricks_waiting_to_join')
   # The number of bricks in bricks_waiting_to_join has to be a modulus of the replica_count we are using for our gluster volume
   if (brick_count % replica_count) == 0
     Chef::Log.info("Attempting to add new bricks into volume #{volume_name}")

@shortdudey123
Owner

You can't remove the initialization, otherwise other stuff will break (`node['gluster']['server']['volumes'][volume_name]['bricks_waiting_to_join']` would be nil when used later):

[1] pry(main)> nil << 'test'
NoMethodError: undefined method `<<' for nil:NilClass
from (pry):1:in `__pry__'
[2] pry(main)> 

When the code gets to https://github.com/shortdudey123/chef-gluster/blob/master/recipes/server_extend.rb#L24, what is the value of node['gluster']['server']['volumes']['gv0']['bricks_waiting_to_join']?

@dpattmann
Contributor Author

Chef fails during the compiling phase, so at this time no code is executed and the variable is not set.

  ================================================================================
  Recipe Compile Error in /var/chef/cache/cookbooks/gluster/recipes/server.rb
  ================================================================================

@shortdudey123
Owner

Right, but the exception happened at server_extend.rb:40, so line 24 was either run or skipped due to no chef nodes :)

@dpattmann
Contributor Author

I know it's on line 40 😉 but I don't understand why it works at the first chef-run and then breaks.

In my node attributes, if I use knife node show NODENAME -l, I don't see bricks_waiting_to_join set to anything, but it should be set during the first chef-run, right?

@shortdudey123
Owner

Hmm, that's true. Let me do some testing and see if I can work out what's happening.

When you do the 2nd chef run, have all 4 nodes in the peers list successfully converged at least once?

@dpattmann
Contributor Author

dpattmann commented May 30, 2016

Yes they have.

@shortdudey123
Owner

OK, I will try to replicate this over the weekend.

@dpattmann
Contributor Author

Any updates here? :-)

@shortdudey123
Owner

Have not had a chance to test. Will try to replicate this week.

@vchung-nz
Contributor

vchung-nz commented Jul 5, 2016

> I am thinking that code needs to be moved outside the peers.each block otherwise chef runs will bomb w/o a peer list

Moving the initialisation code to just before the peers.each block did indeed fix the problem for me. Functionally it is the same as before, so it should be safe to change.

If you look a bit further up in your chef-client output, you should see a few "WARN: Unable to find a chef node for ..." just before the Recipe Compile Error. (At least that is the case for me).

I think that part failed on subsequent runs because peers has been defined, but during the compile phase Chef::Node.load does not actually run anything ( https://github.com/shortdudey123/chef-gluster/blob/master/recipes/server_extend.rb#L14 ). So the initialisation block is skipped.
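
As a rough sketch of the change described here (plain Ruby, not the actual recipe), the initialisation can sit before the loop so the value is always a String, even when every peer is skipped. `bricks_waiting_fixed` and `lookup` are hypothetical names, and the `'bricks'` key on the looked-up hash is an assumption made for illustration.

```ruby
# Plain-Ruby sketch of initialising before the peers loop; no Chef APIs are used.
# `lookup` is a hypothetical stand-in for the per-peer node lookup.
def bricks_waiting_fixed(peers, lookup)
  volume = {} # stands in for the volume's node attributes
  # Initialise before iterating, using a mutable empty String.
  volume['bricks_waiting_to_join'] ||= String.new

  peers.each do |peer|
    peer_node = lookup.call(peer)
    next if peer_node.nil? # skipping a peer no longer leaves the value unset

    peer_node.fetch('bricks', []).each do |brick|
      volume['bricks_waiting_to_join'] << " #{peer}:#{brick}"
    end
  end
  volume['bricks_waiting_to_join']
end

skipped = bricks_waiting_fixed(%w[gluster-01], ->(_peer) { nil })
joined  = bricks_waiting_fixed(%w[gluster-01], ->(_peer) { { 'bricks' => ['/data/gv0'] } })
```

With the initialisation hoisted, `skipped.empty?` is a valid call and the later `next if ... .empty?` check can short-circuit instead of raising.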

@shortdudey123
Owner

@vchung-nz makes sense, can you open a PR with your fix? I think that would work better than the diff that @dpattmann has above, since that will keep it from being nil in the end.

@shortdudey123
Owner

@dpattmann can you try out master? should be good now

@dpattmann
Contributor Author

@shortdudey123 LGTM 😄 👍

@shortdudey123
Owner

cool :)
