Unable to upload large files into Galaxy #999
@hackdna: Can you try to run this on an AWS instance?
@flekschas: Could you paste the Celery and Galaxy logs?
Celeryd-w1.log
I killed Celery and cleared the log; that's why we see "Warm shutdown". @hackdna, where can I find the Galaxy logs?
Need to see the Galaxy log.
Celeryd-w2.log
I am not running Galaxy as a daemon, hence no logs. But as I said, Galaxy didn't do anything the whole time.
It looks like the failure occurred in …
The FastQC workflow also failed on the refinery-dev instance during import into Galaxy (as far as I can tell, the import never went beyond "Pending"). I was able to import the file in question into a Galaxy history on the same server and to run FastQC on it successfully. Notes:
Refinery app log:
Celery log:
Galaxy log:
Thanks, @hackdna. This appears to be an issue with bioblend. Do you agree?
The error (HTTP 500) comes from Galaxy and is simply being reported back by urllib2.
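For reference, a minimal sketch (not Refinery's actual code) of how a Galaxy-side 500 surfaces through bioblend; the instance URL, API key, history name, and file path below are placeholders:

```python
# A minimal sketch, assuming bioblend is installed and a Galaxy instance is
# reachable at the placeholder URL/key. bioblend wraps non-2xx responses in
# bioblend.ConnectionError, so Galaxy's error can be inspected client-side.
from bioblend import ConnectionError
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="http://127.0.0.1:8080", key="<api-key>")
history = gi.histories.create_history(name="upload-test")

try:
    # Streams the local file to Galaxy in a multipart POST to /api/tools
    gi.tools.upload_file("2S_HL60_ACAGTG_lane8_read2.fastq.gz", history["id"])
except ConnectionError as e:
    # The 500 originates in Galaxy; the client merely reports it back
    print("Upload failed on the Galaxy side:", e)
```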
Galaxy may not be able to handle file uploads larger than a certain size: https://biostar.usegalaxy.org/p/10044/
One potential workaround: "In addition, since it's possible to upload to Amazon's Simple Storage Service (S3) in parallel, using Galaxy CloudMan may be a faster alternative. We are investigating incorporating easy access to S3 buckets for Galaxy instances on the Amazon Elastic Compute Cloud (EC2). But you don't need to wait for the pretty interface; you can already access the contents of S3 buckets by pasting links to their contents in the 'URL/Text:' field of the 'Upload File' tool."
@hackdna: I was able to upload the file into a history in the same Galaxy instance directly (by providing the corresponding HTTP URL in the Galaxy upload tool), so at least in principle the file is not too big for this Galaxy instance.
Yes, that would be a workaround. However, this makes a GET request from Galaxy, which is fundamentally different from making a POST request to Galaxy (the source of the error in question).
The error is reproducible on Galaxy 16.01 (CloudMan) but with a different message. Galaxy log:
Opened a CloudMan issue: galaxyproject/cloudman#47
One workaround is to use upload_file_from_url().
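For illustration, a sketch of that workaround using bioblend's put_url, which fills the upload tool's "URL/Text:" field so that Galaxy fetches the file itself with a GET instead of receiving a large POST body; upload_file_from_url() presumably wraps something similar, and the instance URL, API key, and history name are placeholders:

```python
# A sketch of the URL-paste workaround via bioblend's put_url: Galaxy
# downloads the file itself, avoiding the large client-side POST body.
# The URL below is the input file from this issue; URL/key are placeholders.
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="http://127.0.0.1:8080", key="<api-key>")
history = gi.histories.create_history(name="url-upload-test")

# Equivalent to pasting the link into the upload tool's "URL/Text:" field
gi.tools.put_url(
    "http://stemcellcommons.org/sites/default/files/"
    "xf_bioassay_files/2S_HL60_ACAGTG_lane8_read2.fastq.gz",
    history["id"],
)
```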
Relevant issue for refactoring use of …
This can be closed upon merging: #1092
Commit: 8562f47
Dataset: http://stemcellcommons.org/sites/default/files/isa/isa_13293_727513.zip
Input file: http://stemcellcommons.org/sites/default/files/xf_bioassay_files/2S_HL60_ACAGTG_lane8_read2.fastq.gz
Steps to reproduce
Observed behavior
The download took about 10 minutes. Afterwards, the system got stuck exporting the file to Galaxy; I cleared the queue after more than an hour. VBoxHeadless was running at 100% CPU, while on the VM no single process was using more than 3% CPU. On my host, Galaxy was doing nothing as well.
Expected behavior
The export to Galaxy shouldn't take that long.