CoreOS/SystemD Service Start Fix #327

Merged
merged 1 commit into from Mar 2, 2016

@akutz akutz commented Mar 2, 2016

This patch fixes issue #326, where CoreOS places each SSH session into a CGroup and, when that session ends, kills any REX-Ray daemon not specifically launched via SystemD.

The change adjusts REX-Ray's behavior not just on CoreOS, but on any Linux system using SystemD. All SystemD-based systems now prefer SystemD for service-control-management (SCM) commands.
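
To illustrate the CGroup distinction described above, here is a minimal sketch (not REX-Ray code) that classifies a process from the text of its `/proc/<pid>/cgroup` file. The helper names and the sample session path are assumptions made for illustration; only `/system.slice/rexray.service` comes from the output in this PR.

```python
import re

# Hypothetical helpers, not part of REX-Ray. They classify a process by the
# systemd cgroup path found in the text of /proc/<pid>/cgroup.
SESSION_SCOPE = re.compile(r"/session-[^/]+\.scope$")
SERVICE_UNIT = re.compile(r"/[^/]+\.service$")


def systemd_cgroup_path(proc_cgroup_text):
    """Return the systemd cgroup path, or None if no such line is present.

    Each line has the form "hierarchy-ID:controller-list:cgroup-path".
    cgroup v1 hosts expose a "name=systemd" hierarchy; cgroup v2 hosts
    expose a single "0::<path>" line.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy_id, controllers, path = parts
        if controllers == "name=systemd" or (hierarchy_id == "0" and controllers == ""):
            return path
    return None


def classify(proc_cgroup_text):
    """Return "service", "session", "other", or "unknown"."""
    path = systemd_cgroup_path(proc_cgroup_text)
    if path is None:
        return "unknown"
    if SESSION_SCOPE.search(path):
        return "session"  # lives in a login session's scope
    if SERVICE_UNIT.search(path):
        return "service"  # lives in a unit started by systemd
    return "other"


# Sample inputs. The session path is an illustrative assumption.
DAEMON_CGROUP = "1:name=systemd:/system.slice/rexray.service\n"
SSH_CGROUP = "1:name=systemd:/user.slice/user-500.slice/session-1.scope\n"
```

A daemon started through SystemD should classify as `service`, while one launched directly from an SSH shell would be expected to classify as `session`, which is the case this patch avoids.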

1. Verifying the version

core@CoreOS ~ $ rexray version
Binary: /opt/bin/rexray
SemVer: 0.3.2-rc5+2+dirty
OsArch: Linux-x86_64
Branch: bugfix/coreos-start-service
Commit: 13be71f6ad962f859bc5b284f18bf69179954737
Formed: Wed, 02 Mar 2016 12:26:23 UTC

2. Becoming the root user

core@CoreOS ~ $ sudo su -

3. Starting REX-Ray via rexray start as root

CoreOS ~ # env REXRAY_MOCKDRIVERS=true REXRAY_STORAGEDRIVERS=mockStorageDriver rexray start 
● rexray.service - rexray
   Loaded: loaded (/etc/systemd/system/rexray.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-03-02 12:43:43 UTC; 9ms ago
 Main PID: 754 (rexray)
   Memory: 1.7M
      CPU: 5ms
   CGroup: /system.slice/rexray.service
           └─754 /opt/bin/rexray start -f

Mar 02 12:43:43 CoreOS systemd[1]: Started rexray.
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=debug msg="updated log level" logLevel=debug
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=debug msg="invoking service start" os.Args=[/opt/bin/rexray start -f]

4. Dropping to a normal user session on CoreOS

CoreOS ~ # exit
logout

5. Checking the status of REX-Ray via rexray status as a normal user

core@CoreOS ~ $ rexray status
● rexray.service - rexray
   Loaded: loaded (/etc/systemd/system/rexray.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2016-03-02 12:43:43 UTC; 6s ago
 Main PID: 754 (rexray)
   Memory: 4.1M
      CPU: 16ms
   CGroup: /system.slice/rexray.service
           └─754 /opt/bin/rexray start -f

Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg="os driver initialized" moduleName=default-docker provider=linux
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=debug msg="checking volume path cache setting" pathCache=true
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg=vdm.List driverName=docker moduleName=default-docker
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg="listing volumes" driverName=docker moduleName=default-docker
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg=sdm.GetVolume driverName=mockStorageDriver moduleName=default-docker volumeID= volumeName=
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg=odm.GetMounts deviceName= driverName=linux moduleName=default-docker mountPoint=
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=debug msg="docker voldriver spec file" path="/etc/docker/plugins/rexray.spec"
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg="started module" address="unix:///run/docker/plugins/rexray.sock" name=default-docker typeName=docker
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg="service sent registered modules start signals"
Mar 02 12:43:43 CoreOS rexray[754]: time="2016-03-02T12:43:43Z" level=info msg="service successfully initialized, waiting on stop signal"

6. Verifying REX-Ray's PID

core@CoreOS ~ $ ps alx | grep rexray
4     0   754     1  20   0  42592 20892 futex_ Ssl  ?          0:00 /opt/bin/rexray start -f
0   500   775   739  20   0   6740   796 pipe_w S+   pts/0      0:00 grep --colour=auto rexray

7. Exiting the SSH session to CoreOS

core@CoreOS ~ $ exit
logout

Connection to 192.168.120.17 closed.
[0]akutz@pax:rexray$ 

8. Reconnecting to CoreOS via SSH and verifying REX-Ray is still running

[0]akutz@pax:rexray$ ssh coreos
Last login: Wed Mar  2 12:43:36 2016 from 192.168.120.3
CoreOS stable (835.13.0)
core@CoreOS ~ $ ps alx | grep rexray
4     0   754     1  20   0  42592 20892 futex_ Ssl  ?          0:00 /opt/bin/rexray start -f
0   500   811   807  20   0   6740   912 pipe_w S+   pts/0      0:00 grep --colour=auto rexray
core@CoreOS ~ $ 
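
The comparison in steps 6 and 8 can also be done programmatically. The sketch below parses `ps alx` output and checks that the same REX-Ray PID, parented by PID 1, appears before and after the SSH reconnect. The function name is hypothetical; the sample rows are copied from the output above.

```python
import os


def find_daemon(ps_alx_output, name="rexray"):
    """Return (pid, ppid) pairs for processes whose executable is `name`.

    Assumes the `ps alx` column order:
    F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
    """
    found = []
    for line in ps_alx_output.splitlines():
        fields = line.split(None, 12)
        if len(fields) < 13 or not (fields[2].isdigit() and fields[3].isdigit()):
            continue  # skip the header row and malformed lines
        command = fields[12].split()
        # Match on the executable name so the "grep rexray" row is ignored.
        if command and os.path.basename(command[0]) == name:
            found.append((int(fields[2]), int(fields[3])))
    return found


# Rows copied from steps 6 and 8.
BEFORE = """\
4     0   754     1  20   0  42592 20892 futex_ Ssl  ?          0:00 /opt/bin/rexray start -f
0   500   775   739  20   0   6740   796 pipe_w S+   pts/0      0:00 grep --colour=auto rexray
"""
AFTER = """\
4     0   754     1  20   0  42592 20892 futex_ Ssl  ?          0:00 /opt/bin/rexray start -f
0   500   811   807  20   0   6740   912 pipe_w S+   pts/0      0:00 grep --colour=auto rexray
"""

# The daemon survived the logout if the same PID is present in both listings.
survived = find_daemon(BEFORE) == find_daemon(AFTER) != []
```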

@akutz akutz added the bugfix label Mar 2, 2016
@akutz akutz self-assigned this Mar 2, 2016
@akutz akutz added this to the 0.3.2 milestone Mar 2, 2016

akutz commented Mar 2, 2016

Hi @clintonskitson,

I'm assigning this to you for testing. I validated it on CoreOS (SystemD) and CentOS 7 (SystemD) to verify that REX-Ray now uses SystemD for starting, stopping, and checking the service status. I then verified that on Ubuntu 12.04 (SystemV) REX-Ray still does things the old way for SCM commands.
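
For readers unfamiliar with the change, the selection logic can be sketched roughly as follows. This is not the REX-Ray implementation: the function names, the `/run/systemd/system` check (the same directory test documented for `sd_booted(3)`), and the `/etc/init.d/rexray` fallback path are all assumptions made for illustration.

```python
import os


def uses_systemd(runtime_dir="/run/systemd/system"):
    """Return True if the host appears to have been booted with systemd.

    Assumption: detection mirrors sd_booted(3), which checks whether this
    directory exists. How REX-Ray detects systemd is not stated here.
    """
    return os.path.isdir(runtime_dir)


def scm_command(action, service="rexray", systemd=None):
    """Build the argv for a service-control command.

    Prefers systemctl on systemd hosts. Otherwise it falls back to an init
    script, whose path is a hypothetical stand-in for "the old way".
    """
    if action not in ("start", "stop", "status"):
        raise ValueError("unsupported action: %r" % (action,))
    if systemd is None:
        systemd = uses_systemd()
    if systemd:
        return ["systemctl", action, service + ".service"]
    return [os.path.join("/etc/init.d", service), action]
```

With this shape, `scm_command("start", systemd=True)` yields a `systemctl start rexray.service` invocation, so the daemon ends up in the service's own CGroup rather than the caller's session.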

@akutz akutz assigned clintkitson and unassigned akutz Mar 2, 2016

akutz commented Mar 2, 2016

Hi @clintonskitson,

You can file your PR for the doc fix to add to this PR if you like. That way both changes are documented in a single PR.

@clintkitson

Ok, testing now.

@clintkitson

@akutz This tests out good for me, great job. I have another doc PR which is a bit broader so I think it makes sense to file it separately. I can move it here if you disagree.

@clintkitson clintkitson assigned akutz and unassigned clintkitson Mar 2, 2016

akutz commented Mar 2, 2016

Hi @clintonskitson,

I only mentioned collapsing the commits into this PR because when I checked this morning I did not see your PR listed. I'm sure it's fine separately as well. I'm going to first create a PR bumping the release train to RC6 for active dev, and then I'll merge this.

This patch is a fix for issue rexray#326, where CoreOS is placing SSH sessions
into a CGroup, thus killing any REX-Ray daemons not specifically
launched via SystemD.

This patch actually adjusts REX-Ray's behavior for not just CoreOS
systems, but any Linux using SystemD. All SystemD-based systems now
prefer SystemD for service-control-management (SCM) commands.
@akutz akutz force-pushed the bugfix/coreos-start-service branch from 6078ded to 47f2500 Compare March 2, 2016 17:31
akutz added a commit that referenced this pull request Mar 2, 2016
@akutz akutz merged commit 54e188c into rexray:release/0.3.2 Mar 2, 2016
@akutz akutz deleted the bugfix/coreos-start-service branch March 2, 2016 17:32
akutz added a commit to akutz/rexray that referenced this pull request Mar 2, 2016
ENHANCEMENTS
- Improved installation documentation rexray#331

FIXES
- Fixes issue with daemon process getting cleaned as part of SystemD
  Cgroup rexray#327
@akutz akutz mentioned this pull request Mar 2, 2016
akutz added a commit to akutz/rexray that referenced this pull request Mar 4, 2016
This patch marks the release of REX-Ray 0.3.2!

NEW FEATURES
* Support for Docker 1.10 and Volume Plugin Interface 1.2 rexray#273
* Stale PID File Prevents Service Start rexray#258
* Module/Personality Support rexray#275
* Isilon Preemption rexray#231
* Isilon Snapshots rexray#260
* boot2Docker Support rexray#263
* ScaleIO Dynamic Storage Pool Support rexray#267

ENHANCEMENTS
* Improved installation documentation rexray#331
* ScaleIO volume name limitation rexray#304
* Docker cache volumes for path operations rexray#306
* Config file validation rexray#312
* Better logging rexray#296
* Documentation Updates rexray#285

BUG FIXES
* Fixes issue with daemon process getting cleaned as part of
  SystemD Cgroup rexray#327
* Fixes regression in 0.3.2 RC3/RC4 resulting in no log file rexray#319
* Fixes no volumes returned on empty list rexray#322
* Fixes "Unsupported FS" when mounting/unmounting with EC2 rexray#321
* ScaleIO re-authentication issue rexray#303
* Docker XtremIO create volume issue rexray#307
* Service status is reported correctly rexray#310

UPDATES
* <del>Go 1.6 rexray#308</del>

THANK YOU
* Dan Forrest
* Kapil Jain
* Alex Kamalov
This was referenced Mar 4, 2016