Error uploading large files into histories and libraries #47

Closed
hackdna opened this issue Apr 22, 2016 · 9 comments

@hackdna

hackdna commented Apr 22, 2016

AMI: ami-b45e59de, Galaxy 16.01.

Steps to reproduce

In a Python interpreter:

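from bioblend.galaxy.objects import GalaxyInstance

# Connect to the CloudMan-launched Galaxy server ("api_key" is a placeholder)
gi_aws = GalaxyInstance("galaxy-dev.aws.stemcellcommons.org", "api_key")
# Create a new history and upload a multi-gigabyte local file into it
h_aws = gi_aws.histories.create()
h_aws.upload_dataset('/path/to/multigigabytefile')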

Observed results

In the Python interpreter:

ConnectionError: Unexpected response from galaxy: 504: <html>
<head><title>504 Gateway Time-out</title></head>
<body bgcolor="white">
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.4.6 (Ubuntu)</center>
</body>
</html>

Nginx log:

2016/04/22 19:03:19 [error] 12825#0: *14311 upstream timed out (110: Connection timed out) while sending request to upstream, client: 134.174.183.88, server: , request: "POST /api/tools HTTP/1.1", upstream: "http://127.0.0.1:8080/api/tools", host: "galaxy-dev.aws.stemcellcommons.org"

Galaxy log:

134.174.183.88 - - [22/Apr/2016:19:01:56 +0000] "POST /api/tools HTTP/1.0" 500 - "-" "python-requests/2.9.1"
Error - <class 'webob.request.DisconnectionError'>: The client disconnected while sending the POST/PUT body (1183825676 more bytes were expected)
URL: http://galaxy-dev.aws.stemcellcommons.org/api/tools
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/middleware/error.py', line 151 in __call__
  app_iter = self.application(environ, sr_checker)
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/paste/recursive.py', line 85 in __call__
  return self.application(environ, start_response)
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/paste/httpexceptions.py', line 640 in __call__
  return self.application(environ, start_response)
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/base.py', line 126 in __call__
  return self.handle_request( environ, start_response )
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/base.py', line 153 in handle_request
  trans = self.transaction_factory( environ )
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/webapp.py', line 66 in <lambda>
  self.set_transaction_factory( lambda e: self.transaction_chooser( e, galaxy_app, session_cookie ) )
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/webapp.py', line 97 in transaction_chooser
  return GalaxyWebTransaction( environ, galaxy_app, self, session_cookie )
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/webapp.py', line 193 in __init__
  self.error_message = self._authenticate_api( session_cookie )
File '/mnt/galaxy/galaxy-app/lib/galaxy/web/framework/webapp.py', line 308 in _authenticate_api
  api_key = self.request.params.get('key', None)
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/webob/request.py', line 853 in params
  params = NestedMultiDict(self.GET, self.POST)
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/webob/request.py', line 789 in POST
  self.make_body_seekable()
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/webob/request.py', line 943 in make_body_seekable
  self.copy_body()
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/webob/request.py', line 963 in copy_body
  did_copy = self._copy_body_tempfile()
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/webob/request.py', line 980 in _copy_body_tempfile
  data = input.read(min(todo, 65536))
File '/mnt/galaxy/galaxy-app/.venv/local/lib/python2.7/site-packages/webob/request.py', line 1549 in readinto
  + "(%d more bytes were expected)" % self.remaining
DisconnectionError: The client disconnected while sending the POST/PUT body (1183825676 more bytes were expected)

CGI Variables
-------------
  CONTENT_LENGTH: '4628837012'
  CONTENT_TYPE: 'multipart/form-data; boundary=cf3af77a8a3445f98ce91060761385a7'
  HTTP_ACCEPT: '*/*'
  HTTP_ACCEPT_ENCODING: 'gzip, deflate'
  HTTP_CONNECTION: 'close'
  HTTP_HOST: 'galaxy-dev.aws.stemcellcommons.org'
  HTTP_USER_AGENT: 'python-requests/2.9.1'
  HTTP_X_FORWARDED_FOR: '134.174.183.88'
  HTTP_X_FORWARDED_HOST: 'galaxy-dev.aws.stemcellcommons.org'
  ORGINAL_HTTP_HOST: 'galaxy_app'
  ORGINAL_REMOTE_ADDR: '127.0.0.1'
  PATH_INFO: '/api/tools'
  REMOTE_ADDR: '134.174.183.88'
  REQUEST_METHOD: 'POST'
  SERVER_NAME: '127.0.0.1'
  SERVER_PORT: '8080'
  SERVER_PROTOCOL: 'HTTP/1.0'

WSGI Variables
--------------
  application: <paste.recursive.RecursiveMiddleware object at 0x7fcce108fcd0>
  is_api_request: True
  paste.expected_exceptions: [<class 'paste.httpexceptions.HTTPException'>]
  paste.httpexceptions: <paste.httpexceptions.HTTPExceptionHandler object at 0x7fcce108fc50>
  paste.httpserver.proxy.host: 'dummy'
  paste.httpserver.proxy.scheme: 'http'
  paste.httpserver.thread_pool: <paste.httpserver.ThreadPool object at 0x7fcce0aabb90>
  paste.recursive.forward: <paste.recursive.Forwarder from />
  paste.recursive.include: <paste.recursive.Includer from />
  paste.recursive.include_app_iter: <paste.recursive.IncluderAppIter from />
  paste.recursive.script_name: ''
  paste.throw_errors: True
  request_id: 'af50af2608bc11e6b6a90a4d98d6c597'
  webob._body_file: (<_io.BufferedReader>, <socket._fileobject object at 0x7fccc43f62d0 length=4628837012>)
  webob._parsed_query_vars: (GET([]), '')
  wsgi process: 'Multithreaded'

Expected results

Large file upload should succeed

Notes

Large file uploads work in a non-cloud Galaxy 16.01 instance that is not running behind Nginx.
/etc/nginx/sites-enabled/default.server:

    # This file is maintained by CloudMan.
    # Changes will be overwritten!


    upstream galaxy_app {
        server 127.0.0.1:8080;
    }
    server {
        listen                  80;
        client_max_body_size    10G;
        proxy_read_timeout      1200s;

        include /etc/nginx/sites-enabled/*.locations;
    }

I tried adding proxy_send_timeout 1200s; (and restarted Nginx from the command line so that the config file edits were preserved), but still got the same error, only after a much longer period of time.

@afgane
Contributor

afgane commented Apr 23, 2016

@natefoo Would you have any hints about this?

@hackdna
Author

hackdna commented Apr 26, 2016

It looks like nginx accepts the incoming data transfer over the network and saves it to disk, and the error occurs during the subsequent forwarding of the request to Galaxy. There doesn't seem to be a particular file size threshold (for example, 2 GB and 2.7 GB files can be uploaded successfully).
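
The numbers in the Galaxy log above seem consistent with a failure partway through the forwarding step: part of the request body had reached Galaxy when the connection dropped. A minimal sketch of the arithmetic, using only the two values from the log (the function and variable names are illustrative, not part of Galaxy or bioblend):

```python
# Values copied from the Galaxy log in this report.
CONTENT_LENGTH = 4628837012  # CGI variable CONTENT_LENGTH
BYTES_MISSING = 1183825676   # "more bytes were expected" in the DisconnectionError


def bytes_received(content_length, bytes_missing):
    """Return how many bytes of the request body arrived before the disconnect."""
    return content_length - bytes_missing


received = bytes_received(CONTENT_LENGTH, BYTES_MISSING)  # 3445011336
fraction = float(received) / CONTENT_LENGTH

print("Galaxy received %d of %d bytes (%.1f%% of the body)"
      % (received, CONTENT_LENGTH, 100.0 * fraction))
```

So roughly three quarters of the body was forwarded before nginx reported the upstream timeout.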

@ngehlenborg

@afgane and @natefoo: Do you have any insights into what could be causing this issue or pointers for @hackdna? We would greatly appreciate your help with this as it turns out to be a major blocker for our Cloudman-based Refinery stack on AWS.

@hackdna
Author

hackdna commented Apr 26, 2016

Also, this might be related to issue #48.

@afgane
Contributor

afgane commented Apr 27, 2016

Just an FYI, I'm traveling for a couple of weeks, so unfortunately I won't be able to look at any of this during that time.

@hackdna
Author

hackdna commented Apr 27, 2016

Thanks for the heads-up. Do you think this would be worth sending to the Galaxy dev list?

@afgane
Contributor

afgane commented Apr 27, 2016

It certainly wouldn't hurt. Phrasing it as a general question rather than a cloud-specific one (even if just in the subject) may draw more attention.

Sorry for the delay.

@ngehlenborg

Thanks for letting us know and enjoy your travels.

@hackdna
Author

hackdna commented May 5, 2016

Just a heads-up: we were able to use upload_file_from_url() as a workaround.
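
For anyone hitting the same problem, here is a rough sketch of what the workaround can look like. It assumes the upload_file_from_url() method of bioblend's LibraryClient (reachable as the libraries attribute of bioblend.galaxy.GalaxyInstance) and a file that is already available at a URL the Galaxy server can reach. The helper name, the URL scheme check, and all server, key, library, and URL values are placeholders for illustration, not what was actually run.

```python
def upload_library_dataset_from_url(library_client, library_id, file_url, folder_id=None):
    """Ask Galaxy to fetch file_url itself, so the file is not streamed
    from the client through the Nginx proxy.

    library_client is expected to behave like bioblend's LibraryClient.
    """
    # Assumption for this sketch: only http(s) and ftp URLs are passed on.
    if not file_url.startswith(("http://", "https://", "ftp://")):
        raise ValueError("file_url must be an http(s) or ftp URL: %r" % (file_url,))
    return library_client.upload_file_from_url(library_id, file_url, folder_id=folder_id)


# Usage (requires bioblend and a running Galaxy server; values are placeholders):
#
#   from bioblend.galaxy import GalaxyInstance
#   gi = GalaxyInstance("http://galaxy.example.org", key="api_key")
#   upload_library_dataset_from_url(
#       gi.libraries, "library_id", "https://data.example.org/multigigabytefile")
```

The client object is passed in rather than created inside the helper, so the same function works with any already-configured connection.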
