
Various bits of hardening #4604

Merged
merged 23 commits into galaxyproject:dev from erasche:hardening on Sep 25, 2017

@erasche
Member

erasche commented Sep 13, 2017

I.e., patching completely theoretical vulnerabilities with no known or actionable exploits.

  • replaced yaml.load with yaml.safe_load
  • rewrote most os.system/os.exec/subprocess.Popen/etc. calls to remove shell=True and to use command lists rather than strings (and thus have everything properly quoted)
  • converted urllib calls to requests, since requests is bundled with Galaxy and does not support URLs like file://. Requests supports only http:// and https://, which provides a good measure of safety for us.
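The shell=True removal in the second bullet can be sketched as follows (a minimal illustration, assuming a POSIX echo binary on PATH; the filename is hypothetical):

```python
import subprocess

# With an argument list and shell=False (the default), the program is
# exec'd directly and no shell parses the string, so shell metacharacters
# inside arguments stay literal instead of being interpreted.
malicious = "innocent; touch /tmp/pwned"
out = subprocess.check_output(["echo", malicious])
# The whole string is printed verbatim; no second command runs.
```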
@natefoo

Member

natefoo commented Sep 13, 2017

This is fantastic. Thank you @erasche!

rewrote most os.system/os.exec/subprocess.Popen/etc calls to remove shell=True and to use command lists rather than strings (and thus have everything properly quoted)

Yessssssssssss. Plus with shell=False nothing actually has to be quoted.

As a general suggestion to anyone using such calls in the future (we need to put a coding guide or something in the developer docs), if I'm going to print the command line as a log message I'll do something like:

from six.moves import shlex_quote
log.debug("Executing: %s", ' '.join([shlex_quote(x) for x in cmd]))
subprocess.Popen(cmd, shell=False, ...)

So that if someone copies and pastes the command line it'll work as written.
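The same pattern in its Python 3 spelling, shlex.quote (the PR targets Python 2 as well, hence six.moves.shlex_quote above); the command is a made-up example:

```python
import shlex

# Quote each element so a copied-and-pasted log line runs unchanged
# in a shell: arguments containing spaces get single-quoted.
cmd = ["grep", "foo bar", "my file.txt"]
printable = " ".join(shlex.quote(x) for x in cmd)
```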

@@ -693,8 +694,7 @@ def process_split_file(data):
else:
commands = Sequence.get_split_commands_sequential(is_gzip(input_name), input_name, output_name, start_sequence, sequence_count)
for cmd in commands:
if 0 != os.system(cmd):
raise Exception("Executing '%s' failed" % cmd)
subprocess.check_call(cmd, shell=True)

@bgruening

bgruening Sep 13, 2017

Member

remove shell=True?

@erasche

erasche Sep 14, 2017

Author Member

I'm not going to do this one now. There are multiple and varying numbers of pipes in the commands passed to it, so this is non-trivial to refactor at the moment.
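For reference, the kind of refactoring a piped command would eventually need (a sketch only, with stand-in commands; the real commands here contain several pipe stages):

```python
import subprocess

# Replacing "producer | sort" without shell=True: wire the stages
# together explicitly with Popen, pipe to pipe.
p1 = subprocess.Popen(["printf", "b\na\n"], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["sort"], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()  # so p1 receives SIGPIPE if p2 exits early
out, _ = p2.communicate()
p1.wait()
```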

if result != 0:
raise Exception('Result %s from %s' % (result, cmd))
cmd = ['egrep', '-v', '-h' '^@'] + split_files[1:] + ['>>', output_file]
subprocess.check_call(cmd, shell=True)

@bgruening

bgruening Sep 13, 2017

Member

remove shell=True?

@erasche

erasche Sep 13, 2017

Author Member

This one wasn't changed because of the >>, which I need to rewrite more code to solve.
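One way the >> redirection could eventually be removed (a sketch, using a temporary file and echo as stand-ins):

```python
import os
import subprocess
import tempfile

# Instead of passing ">> output_file" through a shell, open the target
# in append mode and hand the file object to subprocess as stdout.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "a") as out:
    subprocess.check_call(["echo", "first"], stdout=out)
    subprocess.check_call(["echo", "second"], stdout=out)
with open(path) as fh:
    content = fh.read()
os.remove(path)
```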

@@ -382,8 +383,7 @@ def __prepare_input_files_locally(self, job_wrapper):
prepare_input_files_cmds = getattr(job_wrapper, 'prepare_input_files_cmds', None)
if prepare_input_files_cmds is not None:
for cmd in prepare_input_files_cmds: # run the commands to stage the input files
if 0 != os.system(cmd):
raise Exception('Error running file staging command: %s' % cmd)
subprocess.check_call(cmd, shell=True)

@bgruening

bgruening Sep 13, 2017

Member

remove shell=True?

@erasche

erasche Sep 13, 2017

Author Member

prepare_input_files_cmds comes from arbitrary places; I will need to audit all possible code paths that reach here.

@nsoranzo
Member

nsoranzo left a comment

Thanks @erasche!

I haven't finished reviewing everything, but I didn't want to leave it unreviewed for long.

except:
print("#Unable to connect to " + site)
continue
text = page.read()
text = page.text

@nsoranzo

nsoranzo Sep 14, 2017

Member

It would be nice to have the same implementation as above, i.e.

text = requests.get(site).text

in line 23 and remove this line.

# Format is <version x.y.z>+htslib-<a.b.c>
if p.returncode == 0:
try:
output = subprocess.check_output(['samtools', '--version-only'], stderr=subprocess.PIPE)

@nsoranzo

nsoranzo Sep 14, 2017

Member

According to https://docs.python.org/2/library/subprocess.html#subprocess.check_output : "Do not use stderr=PIPE with this function as that can deadlock based on the child process error volume. Use Popen with the communicate() method when you need a stderr pipe."

You can probably remove the stderr parameter.
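The replacement the docs recommend looks roughly like this (a sketch; the chatty child process is simulated with a Python one-liner):

```python
import subprocess
import sys

# Popen plus communicate() drains stdout and stderr concurrently,
# so a child that writes a lot to stderr cannot deadlock the parent,
# unlike check_output(stderr=PIPE).
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stderr.write('noise'); sys.stdout.write('data')"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()
```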

@erasche

erasche Sep 14, 2017

Author Member

Ah, ok. I was trying to silently consume/discard stderr, since some output seems to have been sent to a tool, which failed as a result.


output = subprocess.Popen(['samtools'], stderr=subprocess.PIPE, stdout=subprocess.PIPE).communicate()[1]
output = subprocess.check_output(['samtools'], stderr=subprocess.PIPE)

@nsoranzo

nsoranzo Sep 14, 2017

Member

This has the same problem as above, but also the original code was retrieving the stderr (using communicate()[1]), not the stdout, because that's where the old samtools prints its help.
Also, subprocess.check_output() will raise a subprocess.CalledProcessError, because old samtools < 1.0 returns 1 if called without arguments.
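The behaviour the original Popen code relied on, sketched with a stand-in for old samtools (a Python one-liner that prints usage on stderr and exits 1):

```python
import subprocess
import sys

# check_output() would raise CalledProcessError here because of the
# nonzero exit; communicate() captures stderr and lets the caller
# inspect the return code itself.
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stderr.write('Usage: samtools ...'); sys.exit(1)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
help_text = proc.communicate()[1]
rc = proc.returncode
```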

@erasche

erasche Sep 14, 2017

Author Member

I missed the 1, thanks, great catch!

return True
else:
return False
mime_type = subprocess.check_output(['file', '--mime-type', filename]).strip()

@nsoranzo

nsoranzo Sep 14, 2017

Member

.strip() can probably be removed.

# Format is <version x.y.z>+htslib-<a.b.c>
if p.returncode == 0:
try:
output = subprocess.check_output(['samtools', '--version-only'], stderr=subprocess.PIPE)

@nsoranzo

nsoranzo Sep 14, 2017

Member

Remove , stderr=subprocess.PIPE

response = urllib2.urlopen(full_url)
ldda_ids = loads(response.read())["ids"]
response = requests.get(full_url).text
ldda_ids = json.loads(response)["ids"]
response.close()

@nsoranzo

nsoranzo Sep 14, 2017

Member

Remove this line.

@@ -2762,8 +2762,8 @@ def lucene_search(trans, cntrller, search_term, search_url, **kwd):
message = escape(kwd.get('message', ''))
status = kwd.get('status', 'done')
full_url = "%s/find?%s" % (search_url, urllib.urlencode({"kwd" : search_term}))
response = urllib2.urlopen(full_url)
ldda_ids = loads(response.read())["ids"]
response = requests.get(full_url).text

@nsoranzo

nsoranzo Sep 14, 2017

Member

s/text/json()/ and remove the call to json.loads() below?
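For illustration, Response.json() parses the body for you, i.e. it is equivalent to json.loads(response.text); the stand-in string below takes the place of an actual HTTP response body:

```python
import json

# With requests, `requests.get(full_url).json()["ids"]` replaces
# the manual json.loads(response)["ids"] shown in the diff above.
body = '{"ids": [7, 8, 9]}'   # stand-in for response.text
ldda_ids = json.loads(body)["ids"]
```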

@@ -3,7 +3,7 @@
"""
import cgi
import os
import urllib
import requests

from paste.httpexceptions import HTTPNotFound, HTTPBadGateway

@nsoranzo

nsoranzo Sep 14, 2017

Member
import os

import requests
from paste.httpexceptions import HTTPNotFound, HTTPBadGateway
@@ -1,9 +1,9 @@
import logging
import os
import requests

@nsoranzo

nsoranzo Sep 14, 2017

Member

Should go in its own section between import tempfile and from galaxy import util

@@ -21,6 +20,8 @@
from tool_shed.util import (basic_util, commit_util, common_util, encoding_util,
hg_util, metadata_util, repository_util, shed_util_common as suc, xml_util)

import requests

@nsoranzo

nsoranzo Sep 14, 2017

Member

This should go at line 10.

@nsoranzo
Member

nsoranzo left a comment

Final review comments, thanks again @erasche!

ofilename = dataset.file_name
log.exception('Command "%s" failed. Could not convert the Jupyter Notebook to HTML, defaulting to plain text.', cmd)
log.exception('Command "%s" failed. Could not convert the Jupyter Notebook to HTML, defaulting to plain text.', map(shlex_quote, cmd))

@nsoranzo

nsoranzo Sep 14, 2017

Member

I think you have to also add ' '.join(...) around map(shlex_quote, cmd)

cmd = '%s %s' % (self.command, tj.id)
log.debug('Transfer command is: %s' % cmd)
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
cmd = self.command.append(tj.id)

@nsoranzo

nsoranzo Sep 14, 2017

Member

Sorry, I've made a mistake suggesting you to use .append() here, because this modifies self.command, which is reused in this for loop.
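The pitfall in miniature (variable names are stand-ins for self.command and the transfer jobs):

```python
# list.append() mutates in place and returns None, so
# `cmd = self.command.append(tj.id)` leaves cmd as None AND makes
# self.command grow on every loop iteration. Concatenation builds a
# fresh list and leaves the original untouched.
command = ["transfer_script"]   # stand-in for self.command
cmds = []
for tj_id in ["1", "2"]:        # stand-in for the transfer jobs
    cmd = command + [tj_id]     # new list; command itself is unchanged
    cmds.append(cmd)
leftover = [].append("x")       # append's return value really is None
```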

@@ -16,7 +16,7 @@

import os
import sys
import urllib2
import requests

@nsoranzo

nsoranzo Sep 14, 2017

Member

Should be in its own section after the standard library imports.

@nsoranzo

nsoranzo Sep 18, 2017

Member

This is still to be fixed; it should be:

import os
import sys
from xml import etree

import requests

sys.path.insert(1, os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, 'lib')))

@erasche

erasche Sep 18, 2017

Author Member

argh, sorry. I even ran tox linting locally this time (though I have extra files in my Galaxy directory and mostly end up just using Travis to do the linting for me).

@@ -8,7 +8,7 @@
import sys
import time
from ftplib import FTP
from urllib2 import urlopen
import requests

@nsoranzo

nsoranzo Sep 14, 2017

Member

Should be right before from BeautifulSoup import BeautifulSoup

else:
break
src = requests.get(download_url, stream=True)
with open(file_path, 'wb') as handle:

@nsoranzo

nsoranzo Sep 14, 2017

Member

You need to remove lines 114-115:

                 if dst:
                     dst.close()

Also I'd keep the variable name dst (not really important anyway).
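The chunked copy that stream=True enables can be sketched like this (BytesIO objects stand in for the HTTP response and the destination file):

```python
import io

# Read fixed-size chunks and write them out as they arrive, so the
# whole download never sits in memory at once. With requests this is
# what iterating over src.iter_content(chunk_size) does.
src = io.BytesIO(b"x" * 100000)  # stand-in for the response body
dst = io.BytesIO()               # stand-in for open(file_path, 'wb')
while True:
    chunk = src.read(16384)
    if not chunk:
        break
    dst.write(chunk)
total = len(dst.getvalue())
```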

@nsoranzo
Member

nsoranzo left a comment

Thanks for the fixes @erasche, there is a unit test failing:

https://travis-ci.org/galaxyproject/galaxy/jobs/276755018#L1616

should be easy to fix by using subprocess.check_output in lib/galaxy/tools/deps/mulled/mulled_build_channel.py

@@ -16,7 +16,7 @@

import os
import sys
import urllib2
import requests

@nsoranzo

nsoranzo Sep 18, 2017

Member

This is still to be fixed; it should be:

import os
import sys
from xml import etree

import requests

sys.path.insert(1, os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir, 'lib')))
@@ -21,13 +21,13 @@
import os
import sys
import time
import subprocess

@nsoranzo

nsoranzo Sep 18, 2017

Member

Should go between os and sys, sorry!

@erasche

Member Author

erasche commented Sep 18, 2017

gosh I miss gofmt...

@nsoranzo

Member

nsoranzo commented Sep 18, 2017

I've tried to restart the API tests, but the builds keep failing because of integrity script errors:

Exception: Failed to write job script, could not verify job script integrity.

see e.g. https://jenkins.galaxyproject.org/job/docker-api/8485/testReport/junit/api.test_tools/ToolsTestCase/test_multirun_in_repeat/

I suspect this change is making the code faster, so we may need to increase DEFAULT_INTEGRITY_SLEEP or DEFAULT_INTEGRITY_COUNT.

@nsoranzo

Member

nsoranzo commented Sep 22, 2017

@erasche This PR has a conflict now, if you have time to take a look.
@jmchilton Any advice for the API tests?

@erasche

Member Author

erasche commented Sep 22, 2017

Yep, no problem @nsoranzo. I'll finish this up Monday and can hopefully make some progress on that integrity issue (unless John has some magic solution).

I'd be interested to hear what the goal of that integrity check is. Mostly when I think "please integrity test this script to ensure it was written correctly" I think of things like the following:

function do_it() {
   ... all of the complex logic you want to ensure is correct
}
# Finally, call it at the end, here we know it was written fully and the stream wasn't interrupted. 
do_it

is it a different kind of integrity here?

@nsoranzo

Member

nsoranzo commented Sep 25, 2017

@erasche The code that checks the job script integrity is at https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/jobs/runners/util/job_script/__init__.py#L113 , basically it is meant to verify that the job script has been written to disk. I had to disable it on my server (where Galaxy is on NFS) because it was failing many jobs, which instead just run fine without it.

@erasche

Member Author

erasche commented Sep 25, 2017

So it's testing write + immediate readback? Ok, interesting, thanks.

erasche added some commits Sep 13, 2017

Only permit yaml.safe_loading of data
Even trusted data; belt + suspenders method.
Convert urllib to requests
Which has built-in protection against urls like "file:///tmp/a"
These we will revert to their original yaml.load
It would be nice to fix them eventually, but we'll settle for an
incremental improvement.

@galaxyproject galaxyproject deleted a comment from jmchilton Sep 25, 2017

@jmchilton

Member

jmchilton commented Sep 25, 2017

I'd increase the default script integrity count - I guess by speeding it up you shaved some total time off the check. It is bonkers to me that the script is still having problems - seems impossible that anything should work on a filesystem that fails the integrity check.

@@ -119,7 +119,7 @@ def _handle_script_integrity(path, config):
sleep_amt = getattr(config, "check_job_script_integrity_sleep", DEFAULT_INTEGRITY_SLEEP)
for i in range(count):
try:
proc = subprocess.Popen([path], shell=True, env={"ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ": "1"})
proc = subprocess.Popen([path], env={"ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ": "1"})
proc.wait()
if proc.returncode == 42:

@nsoranzo

nsoranzo Sep 25, 2017

Member

You can probably use subprocess.call() while you are here:

            returncode = subprocess.call([path], env={"ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ": "1"})
            if returncode == 42:

@jmchilton jmchilton merged commit 0c6dfd7 into galaxyproject:dev Sep 25, 2017

6 checks passed

api test Build finished. 292 tests run, 4 skipped, 0 failed.
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
framework test Build finished. 161 tests run, 0 skipped, 0 failed.
Details
integration test Build finished. 46 tests run, 0 skipped, 0 failed.
Details
lgtm analysis: JavaScript No alert changes
Details
toolshed test Build finished. 579 tests run, 0 skipped, 0 failed.
Details

@erasche erasche deleted the erasche:hardening branch Sep 26, 2017

@jmchilton

Member

jmchilton commented Sep 26, 2017

This PR broke a bunch of workflow management Selenium test cases - probably pasting a workflow URL doesn't work anymore? I'm investigating. https://jenkins.galaxyproject.org/job/selenium/614/.

@@ -4,7 +4,7 @@
import logging
import os
import sgmllib
import urllib2
import requests

@jmchilton

jmchilton Sep 26, 2017

Member

There is a requests controller - so this isn't getting the requests top-level package. I'll open a PR with a fix.

@erasche

erasche Sep 26, 2017

Author Member

Oh jeez, ok, good catch. Thanks John

jmchilton added a commit to jmchilton/galaxy that referenced this pull request Sep 26, 2017

Fix invalid requests imports introduced in galaxyproject#4604.
Even when we remove the requests controller the pyc files will stick around - so probably best to keep these absolute imports for these controllers indefinitely. This broke workflow import from URL coming through the GUI and so it broke a bunch of Selenium tests.

dannon added a commit that referenced this pull request Sep 26, 2017

Merge pull request #4699 from jmchilton/fix_import
Fix invalid requests imports introduced in #4604.
@mvdbeek

Member

mvdbeek commented Sep 28, 2017

Any chance this is causing

galaxy.jobs.runners: ERROR: (107) Unhandled exception calling queue_job
Traceback (most recent call last):
  File "/galaxy/lib/galaxy/jobs/runners/__init__.py", line 104, in run_next
    method(arg)
  File "/galaxy/lib/galaxy/jobs/runners/local.py", line 87, in queue_job
    command_line, exit_code_path = self.__command_line(job_wrapper)
  File "/galaxy/lib/galaxy/jobs/runners/local.py", line 76, in __command_line
    self.write_executable_script(job_file, job_file_contents)
  File "/galaxy/lib/galaxy/jobs/runners/__init__.py", line 327, in write_executable_script
    write_script(path, contents, self.app.config, mode=mode)
  File "/galaxy/lib/galaxy/jobs/runners/util/job_script/__init__.py", line 110, in write_script
    _handle_script_integrity(path, config)
  File "/galaxy/lib/galaxy/jobs/runners/util/job_script/__init__.py", line 141, in _handle_script_integrity
    raise Exception("Failed to write job script, could not verify job script integrity.")
Exception: Failed to write job script, could not verify job script integrity.

in random jenkins jobs (https://jenkins.galaxyproject.org/job/docker-api/8660/testReport/junit/api.test_tools/ToolsTestCase/test_combined_mapping_and_subcollection_mapping/, https://jenkins.galaxyproject.org/job/docker-api/8660/testReport/junit/api.test_tools/ToolsTestCase/test_identifier_with_multiple_normal_datasets/)? Any idea how we can get around this?

proc = subprocess.Popen([path], shell=True, env={"ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ": "1"})
proc.wait()
if proc.returncode == 42:
returncode = subprocess.call([path], env={"ABC_TEST_JOB_SCRIPT_INTEGRITY_XYZ": "1"})

@mvdbeek

mvdbeek Sep 28, 2017

Member

Any chance the exception happens before the inner try/except clause, and we just fail directly?

@mvdbeek

mvdbeek Sep 28, 2017

Member

If the script doesn't exist we should get an OSError: [Errno 2] No such file or directory, which would be caught later on and look like the integrity verification failed.
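This hypothesis can be demonstrated in isolation (a sketch; the path and helper name are made up):

```python
import subprocess

# Launching a missing script raises OSError (FileNotFoundError on
# Python 3) before the child ever runs, so a broad except around the
# call would misreport it as an integrity failure. Catching OSError
# separately makes the log message actionable.
def integrity_ok(path):
    try:
        return subprocess.call([path]) == 42
    except OSError as e:
        return "launch failed: %s" % type(e).__name__

result = integrity_ok("/nonexistent/job_script.sh")
```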

@erasche

erasche Sep 28, 2017

Author Member

possibly? It would be nice to have a logging message on the exception.

@erasche

erasche Sep 28, 2017

Author Member

I'll open a PR.

@mvdbeek

mvdbeek Sep 28, 2017

Member

#4718 -- if it fails with anything except for OSError we should have a better idea, if it doesn't then this was probably the issue, and it would also explain why this test never worked for me on NFS


@erasche erasche referenced this pull request Sep 28, 2017

Closed

Integrity exceptions #4719

natefoo added a commit to natefoo/galaxy that referenced this pull request Oct 13, 2017

Fix invalid requests imports introduced in galaxyproject#4604.
Even when we remove the requests controller the pyc files will stick around - so probably best to keep these absolute imports for these controllers indefinitely. This broke workflow import from URL coming through the GUI and so it broke a bunch of Selenium tests.

nsoranzo added a commit that referenced this pull request Oct 13, 2017

Merge pull request #4801 from natefoo/backport-hardening-16.07
[16.07] Various bits of hardening from #4604

nsoranzo added a commit that referenced this pull request Oct 14, 2017

Merge pull request #4804 from natefoo/backport-hardening-16.10
[16.10] Various bits of hardening from #4604

natefoo added a commit to natefoo/galaxy that referenced this pull request Oct 15, 2017

Fix invalid requests imports introduced in galaxyproject#4604.
Even when we remove the requests controller the pyc files will stick around - so probably best to keep these absolute imports for these controllers indefinitely. This broke workflow import from URL coming through the GUI and so it broke a bunch of Selenium tests.

natefoo added a commit to natefoo/galaxy that referenced this pull request Oct 15, 2017

Fix invalid requests imports introduced in galaxyproject#4604.
Even when we remove the requests controller the pyc files will stick around - so probably best to keep these absolute imports for these controllers indefinitely. This broke workflow import from URL coming through the GUI and so it broke a bunch of Selenium tests.

nsoranzo added a commit that referenced this pull request Oct 17, 2017

Merge pull request #4807 from natefoo/backport-hardening-17.01
[17.01] Various bits of hardening from #4604

nsoranzo added a commit that referenced this pull request Oct 17, 2017

Merge pull request #4816 from natefoo/backport-hardening-17.05
[17.05] Various bits of hardening from #4604

nsoranzo added a commit that referenced this pull request Oct 18, 2017

Merge pull request #4824 from natefoo/backport-hardening-17.09
[17.09] Various bits of hardening from #4604

@martenson martenson removed the status/WIP label Jul 16, 2018
