init process does not start, but I can start exhibitor from the command line #51

Closed
alkhatib opened this issue Jan 27, 2014 · 11 comments

@alkhatib

Trying this out on an AWS instance.

I modified default.rb, mainly the S3 settings, and changed the user from zookeeper to ec2-user.

On a side note, I had to install "patch" on the machine before the recipe was successfully installed.

Then once it reaches the end, it hangs on:

x.x.x.x   * service[exhibitor] action start
x.x.x.x     - start service service[exhibitor]
x.x.x.x
x.x.x.x   * service[exhibitor] action stop (up to date)

x.x.x.x   * service[exhibitor] action start
x.x.x.x     - start service service[exhibitor]

When I check the running processes on the machine, exhibitor is not running, but the check-local-zk.py script is.

Every time I try to start the service:

sudo start -v -n exhibitor

I get the same behaviour and the following output in /var/log/messages:

.
.
.
Jan 27 21:09:36 ip-x init: exhibitor main process ended, respawning
Jan 27 21:09:36 ip-x init: exhibitor main process (29668) terminated with status 1
Jan 27 21:09:36 ip-x init: exhibitor main process ended, respawning
Jan 27 21:09:36 ip-x init: exhibitor main process (29671) terminated with status 1
Jan 27 21:09:36 ip-x init: exhibitor respawning too fast, stopped
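
The respawn loop in these messages can be tallied with a short script. This is a minimal sketch in Python, assuming the init message format seen in the excerpt above; the function name `summarize` and both regular expressions are illustrative and are not part of the cookbook.

```python
import re
from collections import Counter

# Patterns inferred from the /var/log/messages excerpt above (an assumption,
# not an official specification of the init log format).
TERMINATED = re.compile(
    r"init: (?P<job>\S+) (?P<proc>\S+) process "
    r"\((?P<pid>\d+)\) terminated with status (?P<status>\d+)"
)
GAVE_UP = re.compile(r"init: (?P<job>\S+) respawning too fast, stopped")


def summarize(lines):
    """Tally terminations per (job, process, status) and collect jobs init stopped."""
    exits = Counter()
    stopped = set()
    for line in lines:
        match = TERMINATED.search(line)
        if match:
            key = (match["job"], match["proc"], int(match["status"]))
            exits[key] += 1
            continue
        match = GAVE_UP.search(line)
        if match:
            stopped.add(match["job"])
    return exits, stopped


# The five log lines quoted in the report.
SAMPLE = """\
Jan 27 21:09:36 ip-x init: exhibitor main process ended, respawning
Jan 27 21:09:36 ip-x init: exhibitor main process (29668) terminated with status 1
Jan 27 21:09:36 ip-x init: exhibitor main process ended, respawning
Jan 27 21:09:36 ip-x init: exhibitor main process (29671) terminated with status 1
Jan 27 21:09:36 ip-x init: exhibitor respawning too fast, stopped
""".splitlines()

exits, stopped = summarize(SAMPLE)
```

On the sample, `exits` records two status-1 exits of the exhibitor main process and `stopped` contains `exhibitor`, which is the point at which init gave up respawning the job.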

I tried this with the latest commit:

*   8622711 (HEAD, origin/master, origin/HEAD, master) Merge pull request #50 from N3TWORK/opsworks

If I copy the exec line from the init script and run it on the command line, it works just fine.

exec sudo -u $USER bash -c "java -Dlog4j.configuration=file:///opt/exhibitor/log4j.properties -jar /opt/exhibitor/1.5.0.jar \
    --configtype s3 --defaultconfig /var/chef/cache/defaultconfig.exhibitor --hostname x.x.x.x --port 8080 --s3config config:name --s3credentials /opt/exhibitor/exhibitor.s3.properties --s3region us-east-1"

Any help would be appreciated.

@mwhooker
Contributor

It might be best to try the latest tagged version, 1.4.9, until we can troubleshoot.

thanks for the report.

@alkhatib
Author

Would the latest 1.4.9 be the same version that is in the Chef community cookbook?

That is the one I initially tried before trying the latest commit, and I got the same problem.

Thanks

@mwhooker
Contributor

Yeah, 1.4.9 will be the latest community cookbook. That's good to know.

Debugging now, I'll update you soon.

@mwhooker
Contributor

Having trouble reproducing.

check-local-zk.py blocks the init script until it can reach zookeeper. It tries for 5 minutes. When you run exhibitor manually, it doesn't run that script, which is why it's starting okay.
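
The blocking behaviour described here amounts to a poll-until-ready loop. The sketch below is not the cookbook's actual check-local-zk.py; it is a minimal Python illustration under stated assumptions: the function name `wait_for_http`, the 5-second retry interval, and the URL in the usage note are all hypothetical, and only the 5-minute limit comes from the description above.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=300.0, interval=5.0,
                  opener=urllib.request.urlopen,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll url until it answers or timeout seconds have passed.

    Returns True on the first successful response and False once the
    deadline is reached. opener, sleep and clock are injectable so the
    loop can be exercised without a network or real waiting.
    """
    deadline = clock() + timeout
    while True:
        try:
            # Any response that urlopen accepts counts as "reachable".
            opener(url, timeout=interval).close()
            return True
        except OSError as err:
            # urllib.error.URLError is an OSError subclass; a refused
            # connection prints as "<urlopen error [Errno 111] ...>".
            print(err)
        if clock() >= deadline:
            return False
        sleep(interval)
```

A wrapper would call something like `wait_for_http("http://localhost:8080/")` (hypothetical URL) and exit non-zero when it returns False, so that the caller sees a failure after the timeout.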

Could you run that script after starting exhibitor manually and tell me what the output is?

It would also be useful to see the changes you made to the attributes.

@alkhatib
Author

Sure, I did this on a new Amazon Linux machine.
Added ec2-user to the sudoers file:

sudo visudo
+ ec2-user ALL=(ALL)      ALL

Did a yum update and installed patch, which the recipe needs (maybe a dependency needs to be declared somewhere?):
sudo yum update
sudo yum install patch

Made the parent directory that contains my snapshot, log, and transaction directories:
mkdir ~/zookeeper

My only changes were to default.rb. Here is my diff of that file:

nottaway:zookeeper aalkhatib$ git diff
diff --git a/attributes/default.rb b/attributes/default.rb
index 8045329..8ff515f 100644
--- a/attributes/default.rb
+++ b/attributes/default.rb
@@ -2,8 +2,8 @@ default[:zookeeper][:version] = "3.4.5"
 default[:zookeeper][:mirror] = "http://mirrors.ibiblio.org/apache/zookeeper/zookeeper-#{default[:zookeeper][:version]}/zookeeper-#{default[:zookeeper][:version]}.tar.gz"
 default[:zookeeper][:checksum] = 'e92b634e99db0414c6642f6014506cc22eefbea42cc912b57d7d0527fb7db132'
 default[:zookeeper][:install_dir] = "/opt/zookeeper"
-default[:zookeeper][:user] = "zookeeper"
-default[:zookeeper][:group] = "zookeeper"
+default[:zookeeper][:user] = "ec2-user"
+default[:zookeeper][:group] = "ec2-user"

 default[:gradle][:version] = "1.5"
 default[:gradle][:mirror] = "http://services.gradle.org/distributions/gradle-#{default[:gradle][:version]}-bin.zip"
@@ -14,9 +14,9 @@ default[:exhibitor][:install_dir] = "/opt/exhibitor"

 default[:exhibitor][:script_dir] = '/usr/local/bin/'

-default[:exhibitor][:snapshot_dir] = "/tmp/zookeeper"
-default[:exhibitor][:transaction_dir] = "/tmp/zookeeper"
-default[:exhibitor][:log_index_dir] = "/tmp/zookeeper_log_indexes"
+default[:exhibitor][:snapshot_dir] = "/home/ec2-user/zookeeper/snapshots"
+default[:exhibitor][:transaction_dir] = "/home/ec2-user/zookeeper/txn"
+default[:exhibitor][:log_index_dir] = "/home/ec2-user/zookeeper/logs"
 default[:exhibitor][:log_to_syslog] = "1"

 # Port for the HTTP Server
@@ -24,15 +24,13 @@ default[:exhibitor][:opts][:port] = "8080"
 default[:exhibitor][:opts][:hostname] =  node[:ipaddress]
 default[:exhibitor][:opts][:defaultconfig] = "#{node[:exhibitor][:install_dir]}/exhibitor.properties"

-default[:exhibitor][:opts][:configtype] = "file"
+default[:exhibitor][:opts][:configtype] = "s3"

 default[:exhibitor][:loglevel] = "info"

 # For --configtype s3, set:
-# [:exhibitor][:s3key] = "key"
-# [:exhibitor][:s3secret] = "secret"
-# [:exhibitor][:opts][:s3config] = "bucket:config-key"
-# [:exhibitor][:opts][:s3region] = "region" # i.e. us-east-1
+default[:exhibitor][:s3key] = "MY-KEY"
+default[:exhibitor][:s3secret] = "MY-SECRET"
+default[:exhibitor][:opts][:s3config] = "MYBUCKET:KEY"
+default[:exhibitor][:opts][:s3region] = "us-east-1" # i.e. us-east-1

 # For --contiftype file
 default[:exhibitor][:opts][:fsconfigdir] = "/tmp"

Then I start the bootstrap:

knife cookbook upload zookeeper
Uploading zookeeper      [1.4.9]
Uploaded 1 cookbook.
knife bootstrap x.x.x.x --ssh-user ec2-user --run-list "recipe[zookeeper]" --sudo

@alkhatib
Author

So it actually hung at:

  * remote_file[/var/chef/cache/zookeeper-3.4.5.tar.gz] action create

I checked the machine and no chef process was running, so I just restarted it, and will update when it is finished.

@alkhatib
Author

OK, so the bootstrap hung again at the point where it was starting the process and waiting for the check-local-zk.py script to return.

In /var/log/messages:

Jan 28 16:02:48 init: exhibitor main process (31063) terminated with status 1
Jan 28 16:02:48 init: exhibitor main process ended, respawning
Jan 28 16:07:49 init: exhibitor post-start process (31064) terminated with status 1
Jan 28 16:07:49 init: exhibitor main process (31077) terminated with status 1

Then I started the zookeeper instance manually by copying and pasting the command from /etc/init/exhibitor.conf:

sudo -u $USER bash -c "java -Dlog4j.configuration=file:///opt/exhibitor/log4j.properties -jar /opt/exhibitor/1.5.0.jar \
    --configtype s3 --defaultconfig /opt/exhibitor/exhibitor.properties --hostname x.x.x.x --port 8080 --s3config bucket:name --s3credentials /opt/exhibitor/exhibitor.s3.properties --s3region us-east-1 | logger -t zookeeper"

That worked: I was able to connect to zookeeper, the s3 configuration was updated, and that machine joined my ensemble.
The check-local-zk.py script also exits with a 0 status:

[ec2-user@machine ~]$ /usr/local/bin/check-local-zk.py
[ec2-user@machine ~]$ echo $?
0

Killing the exhibitor/zookeeper I started and trying to start it as a service:

sudo start -v exhibitor

/var/log/messages:
Jan 28 16:26:32 init: exhibitor main process (31611) terminated with status 1
Jan 28 16:26:32 init: exhibitor main process ended, respawning

and the script:
python /usr/local/bin/check-local-zk.py
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>
<urlopen error [Errno 111] Connection refused>

If you need any more information, or would like me to try anything, let me know.

@alkhatib
Author

alkhatib commented Feb 3, 2014

Any other information I can provide to help debug this issue?

@mwhooker
Contributor

mwhooker commented Feb 4, 2014

Let me try with an Amazon Linux image.

@mwhooker
Contributor

mwhooker commented Feb 4, 2014

Dunno why this didn't click, but it's probably because this cookbook supports upstart, which is an Ubuntu thing. You might try applying the patches in #19.

I'll see if I can add support for runit.

@mwhooker
Contributor

mwhooker commented Feb 4, 2014

Closing. Please track in #54.

@mwhooker mwhooker closed this as completed Feb 4, 2014