Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get real user name from username and email (for submission as real user with drmaa) #4096

Merged
merged 23 commits into from Jun 27, 2017
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
4f31290
- get real user name from username and email
bernt-matthias May 22, 2017
b07ebda
- added path check to external chown script
bernt-matthias May 23, 2017
6841faa
- removed my paths
bernt-matthias May 23, 2017
d25e552
- removed uneccessary import
bernt-matthias May 23, 2017
3a16b36
made determination of actual user configurable
bernt-matthias May 24, 2017
5ee0d7e
moved sudo calls to galaxy.ini
bernt-matthias May 26, 2017
01cffa6
imroved external_chown_script now uses Popen and checks return code
bernt-matthias May 26, 2017
64f5e2e
Fix linting issues
rhpvorderman Jun 12, 2017
e197cad
fix typo
rhpvorderman Jun 13, 2017
71cd8dd
Add system_user_pwent function
rhpvorderman Jun 13, 2017
32d7a30
Add system_user_pwent function to upload and jobs
rhpvorderman Jun 13, 2017
2ec8054
Change paramter typo
rhpvorderman Jun 13, 2017
e81a2ac
Fix removed duplication in external chown script
rhpvorderman Jun 13, 2017
c275d25
Fix linting
rhpvorderman Jun 13, 2017
aa200fe
implemented the changes requested by @mvdbeek:
bernt-matthias Jun 13, 2017
e8f44de
bugfix missing colon
bernt-matthias Jun 13, 2017
ae1903d
shlex.split also for external kill and run scripts
bernt-matthias Jun 13, 2017
7fbd3f2
added missing empty line
bernt-matthias Jun 13, 2017
074b90d
bug fixes
Jun 13, 2017
f83871d
moved assign_all_groups parameter to config
Jun 13, 2017
351adc6
Merge remote-tracking branch 'upstream/dev' into dev
bernt-matthias Jun 14, 2017
57c1124
fixed typos in galaxy.ini
bernt-matthias Jun 15, 2017
698075c
Merge remote-tracking branch 'upstream/dev' into dev
bernt-matthias Jun 19, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
21 changes: 16 additions & 5 deletions lib/galaxy/jobs/__init__.py
Expand Up @@ -1743,7 +1743,14 @@ def _change_ownership( self, username, gid ):
job = self.get_job()
# FIXME: hardcoded path
external_chown_script = self.get_destination_configuration("external_chown_script", None)
cmd = [ '/usr/bin/sudo', '-E', external_chown_script, self.working_directory, username, str( gid ) ]

cmd = [ '/usr/bin/sudo', '-E', ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded sudo is actually a remnant of the original implementation. Ideally, if one needs sudo it should just be part of the value of the corresponding *_script call, at which point you'd have full control over the environment variables passed or not passed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont get what you are suggesting. We could have a little bash helper/wrapper:

dosudo.sh:

#!/bin/bash
sudo -E AND_WHATEVER_IS_NEEDED "$@"

From python we could then call: dosudo.sh externalhelper.py ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got it. I moved the sudo calls to galaxy.ini

for v in ["PATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be necessary, as 'sudo -E' does preserve the environment.
This is a small snippet that you can use to verify this:

import os
import subprocess


print('without -E:')
subprocess.call(['sudo', './echo.sh'], shell=False)
print('with -E:')
subprocess.call(['sudo', '-E', './echo.sh'], shell=False)

Write a echo.sh script and echo some environment variable only available to the user.
You'll see that only with '-E' you get the environment variable echoed correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The security policy of our cluster has the env_reset option enabled (see man 5 sudoers) which in particular resets PATH. This seems to be the default on CentOS 6.8. In addition the LD_LIBRARY_PATH and PKG_CONFIG_PATH are reset. I'm not sure if PKG_CONFIG_PATH is actually needed - I just added everything that was removed.

I think that the policy of sudo to reset the PATH and LIBRARY_PATH (etc) is actually a good idea. Just imagine a script with sudo rights that calls a binary B oder uses a library L. If you have the possibility to modify the PATH or LIBRARY_PATH before sudoing you have essentially the right to do everything.

I think this modification is likely not of interest for many (maybe only me). It could also be that it is completely wrong to solve my problem like this. The question would be how to proceed here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, could you ask your admins to make an exception for the external_chown_script.py ?
If I look at https://unix.stackexchange.com/a/13246 it appear you can specify the env vars to keep for a specific path. In a sense you're already circumventing this policy by injecting the the environment variables anyway ... . If not I guess we can again make the list of env vars configurable in the galaxy.ini file, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PATH we also thought that it is sufficient to hard code the path to the python binary in the script. But then there is still the problem with the python shared object file which is not found. Maybe one of you has an idea? Maybe I could build a special statically linked python binary just for these scripts. Could be a better idea than adding a potential security hole to galaxy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I don't think forcing users to build special version of the python interpreter just for this purpose is great. I kind of like injecting PATH (and LD_LIBRARY_PATH -urgs-, if necessary) instead of propagating all variables with sudo -E.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be an option to call

/path/to/python external_chown_script.py ...

instead of

external_chown_script.py ... (using the shellbang)

Then we could just use the python of galaxy's virtual environment if this path is accessible from the code. But this still leaves the problem of the library path (I never understood why the python binary but not the shared object is copied to the venv).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer that this list be configurable, but see also my comment about not defaulting to using sudo at all.

try:
cmd.append('%s=%s'%(v, os.environ[ v ]))
except KeyError:
pass
cmd.extend( [ external_chown_script, self.working_directory, username, str( gid ) ] )
log.debug( '(%s) Changing ownership of working directory with: %s' % ( job.id, ' '.join( cmd ) ) )
p = subprocess.Popen( cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE )
# TODO: log stdout/stderr
Expand All @@ -1770,10 +1777,14 @@ def reclaim_ownership( self ):
def user_system_pwent( self ):
if self.__user_system_pwent is None:
job = self.get_job()
try:
self.__user_system_pwent = pwd.getpwnam( job.user.email.split('@')[0] )
except:
pass

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extra whitespace here and below isn't needed (sorry for the nitpick).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the helper/wrapper it would be configurable (at a single place)

# try to get user from email address/username
for cand in [job.user.email.split('@')[0], job.user.username]:
Copy link
Member

@mvdbeek mvdbeek May 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this to be configurable via a variable in galaxy.ini (sth. like real_system_username, with options 'user_email' and 'username'), close to the external_chown_script variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds possible.

The question is if this is necessary, since testing both adds only a little computation (I guess its the same as the if-statement that would be necessary when we make this configurable).

For me both possibilities would be fine.

If we leave it as is I could imagine in case that both ways to determine a user are successful a test checking if they yield the same user would be a good idea.

Otherwise: Can somebody point me to examples showing me how to access configuration variables?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried about corner cases like users who register an email address whose first part matches another users' username ... who then would get billed for computation time.
And explicit is better then implicit.

Copy link
Member

@mvdbeek mvdbeek May 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For adding options to galaxy.ini you'd write your option in the galaxy.ini.sample (with a bit of documentation) and then you reference it in the config object here. This will then be available as app.config.your_new_option (in this class it's self.app.config.your_new_option)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Will do it this way.

try:
self.__user_system_pwent = pwd.getpwnam( cand )
except KeyError:
pass

return self.__user_system_pwent

@property
Expand Down
19 changes: 17 additions & 2 deletions lib/galaxy/jobs/runners/drmaa.py
Expand Up @@ -316,7 +316,15 @@ def stop_job( self, job ):
self.ds.kill( ext_id )
else:
# FIXME: hardcoded path
subprocess.Popen( [ '/usr/bin/sudo', '-E', kill_script, str( ext_id ), str( self.userid ) ], shell=False )

command = [ '/usr/bin/sudo', '-E', ]
for v in ["PATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH"]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I don't understand why this is necessary, 'sudo -E' should really preserve the environment.

try:
command.append('%s=%s'%(v, os.environ[ v ]))
except KeyError:
pass
command.extend( [ kill_script, str( ext_id ), str( self.userid ) ])
subprocess.Popen( command, shell=False )
log.debug( "(%s/%s) Removed from DRM queue at user's request" % ( job.get_id(), ext_id ) )
except drmaa.InvalidJobException:
log.debug( "(%s/%s) User killed running job, but it was already dead" % ( job.get_id(), ext_id ) )
Expand Down Expand Up @@ -362,11 +370,18 @@ def external_runjob(self, external_runjob_script, jobtemplate_filename, username
"""
script_parts = external_runjob_script.split()
script = script_parts[0]
command = [ '/usr/bin/sudo', '-E', script]
command = [ '/usr/bin/sudo', '-E', ]
for v in ["PATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH"]:
try:
command.append('%s=%s'%(v, os.environ[ v ]))
except KeyError:
pass
command.append( script )
for script_argument in script_parts[1:]:
command.append(script_argument)

command.extend( [ str(username), jobtemplate_filename ] )
command.append( "--assign_all_groups" )
Copy link
Member

@mvdbeek mvdbeek Jun 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add this within the galaxy.ini (drmaa_external_runjob_script = sudo -E scripts/drmaa_external_runner.py --assign_all_groups), so I would prefer to not have this hardcoded here.

log.info("Running command %s" % command)
p = subprocess.Popen(command,
shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
Expand Down
18 changes: 16 additions & 2 deletions lib/galaxy/tools/actions/upload_common.py
Expand Up @@ -282,8 +282,22 @@ def create_paramfile( trans, uploaded_datasets ):
"""
def _chown( path ):
try:
pwent = pwd.getpwnam( trans.user.email.split('@')[0] )
cmd = [ '/usr/bin/sudo', '-E', trans.app.config.external_chown_script, path, pwent[0], str( pwent[3] ) ]
# get username from email/username
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of duplication here that should be moved to a single function or method called by all who need it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Would the helper script solve this problem already? Otherwise could you be more precise what should be packed in a function (just getting the user or also calling the external scripts). Where would such a function fit? I.e. in which python module?

pwent = None
for cand in [ trans.user.email.split('@')[0], trans.user.username]:
try:
pwent = pwd.getpwnam( cand )
except KeyError:
pass
assert pwent != None

cmd = [ '/usr/bin/sudo', '-E', ]
for v in ["PATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH"]:
try:
cmd.append('%s=%s'%(v, os.environ[ v ]))
except KeyError:
pass
cmd.extend( [ trans.app.config.external_chown_script, path, pwent[0], str( pwent[3] ) ] )
log.debug( 'Changing ownership of %s with: %s' % ( path, ' '.join( cmd ) ) )
p = subprocess.Popen( cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE )
stdout, stderr = p.communicate()
Expand Down
22 changes: 21 additions & 1 deletion scripts/external_chown_script.py
@@ -1,14 +1,34 @@
#!/usr/bin/env python
import os
import os.path
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This import is not required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I removed it.

import sys

# you may configure the paths below which modifications are allowed.
# should contain the full paths to job_working_directory and new_file_path.
# if set to None every file can be modified by the script.
# this can increase security in particular if write access to this
# script is removed by the admin.
# ALLOWED_PATHS = [ job_working_directory, new_file_path ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you quote job_working_directory and new_file_path ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

ALLOWED_PATHS = None

def validate_paramters():
if len(sys.argv) < 4:
sys.stderr.write("usage: %s path user_name gid\n" % sys.argv[0])
exit(1)

path = sys.argv[1]
path = os.path.abspath( sys.argv[1] )
if ALLOWED_PATHS == None:
allowed = True
else:
allowed = False
for p in ALLOWED_PATHS:
if path.startswith( p ):
allowed = True
break
if not allowed:
sys.stderr.write( "owner and group modifications in %s are not allowed\n" %path )
sys.exit( 1 )

galaxy_user_name = sys.argv[2]
gid = sys.argv[3]

Expand Down