test bulk uploading #43

Closed
kevinkle opened this issue Jun 22, 2017 · 17 comments

@kevinkle
Member

No description provided.

@kevinkle
Member Author

Came across an error when testing this:

webserver_1              | [uwsgi-body-read] Timeout reading 49152 bytes. Content-Length: 27981462718 consumed: 17121280 left: 27964341438

Looks like unbit/uwsgi#363 documents a fix involving increasing the --socket-timeout. Will have to fix this in superphy/backend.

@kevinkle
Member Author

The main reason for the timeout in the first place seems to be the copy that Flask performs to move all uploaded files to /datastore. On the local box, this moves files from the upload buffer to another disk for storage (/datastore is defined on a separate, larger disk).

While we'll move to Warehouse for long-term storage and store the raw bytes of each file in Blazegraph, we may need to create a task or some other method for copying files over so the initial analysis modules can run on them.
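
A minimal sketch of what such a copy task could look like, assuming the copy is moved out of the request handler and run as a queued job. The function name `copy_to_datastore` and its arguments are hypothetical and are not taken from the spfy code base.

```python
import os
import shutil


def copy_to_datastore(src_paths, datastore_dir):
    """Copy uploaded files into datastore_dir and return the new paths.

    Hypothetical helper: meant to run outside the request handler (for
    example as a queued job) so a slow cross-disk copy cannot hold the
    upload request open.
    """
    if not os.path.isdir(datastore_dir):
        os.makedirs(datastore_dir)
    copied = []
    for src in src_paths:
        target = os.path.join(datastore_dir, os.path.basename(src))
        # copy2 keeps timestamps; the buffered upload is left in place so
        # the caller decides when to clean it up.
        shutil.copy2(src, target)
        copied.append(target)
    return copied
```

With RQ, something like `queue.enqueue(copy_to_datastore, paths, '/datastore')` would hand this to a worker, although whether that fits the existing task layout is an open question.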

kevinkle added the bug label Jun 26, 2017
@kevinkle
Member Author

Looks like #43 (comment) is accurate.

@kevinkle
Member Author

  proxy_connect_timeout       1800;
  proxy_send_timeout          1800;
  proxy_read_timeout          1800;
  send_timeout                1800;

The settings above didn't fix it:

[uwsgi-body-read] Timeout reading 65536 bytes. Content-Length: 27981462718 consumed: 87556096 left: 27893906622
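
To put the two timeout lines in perspective, here is a small sketch that parses the `[uwsgi-body-read]` message and reports how much of the body was read. The helper name `upload_progress` is an assumption for illustration; the log line is the one quoted above. For that line it works out to roughly 0.3% of a ~28 GB body.

```python
import re

# Matches the byte counters in the "[uwsgi-body-read] Timeout ..." lines
# quoted in this thread.
BODY_READ = re.compile(
    r"Content-Length: (?P<total>\d+) consumed: (?P<consumed>\d+) left: (?P<left>\d+)"
)


def upload_progress(log_line):
    """Return (consumed_bytes, total_bytes, fraction), or None if no match."""
    match = BODY_READ.search(log_line)
    if match is None:
        return None
    total = int(match.group("total"))
    consumed = int(match.group("consumed"))
    return consumed, total, consumed / float(total)


line = ("[uwsgi-body-read] Timeout reading 65536 bytes. "
        "Content-Length: 27981462718 consumed: 87556096 left: 27893906622")
consumed, total, fraction = upload_progress(line)
print("%.2f%% of %.1f GB read before the timeout" % (fraction * 100, total / 1e9))
```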

@kevinkle
Member Author

If I remember correctly, when we switched from bare-metal to docker, we switched uwsgi to listen to a socket instead of a port. This was around the time uploading 5353 files stopped working.

@kevinkle
Member Author

From the docs, it looks like the default for uwsgi reads/sends (set in nginx.conf) is 60s.

http://nginx.org/en/docs/http/ngx_http_uwsgi_module.html#uwsgi_read_timeout

In nginx.conf, setting these instead:

  uwsgi_send_timeout          1800;
  uwsgi_read_timeout          1800;

@kevinkle
Member Author

Timeout seems fixed, but now jobs are being enqueued before Flask returns a blob id. Maybe this has to do with how they are generated?

@kevinkle
Member Author

From the logs, it looks like Flask is taking a long time in the for loop to enqueue all the files (the spfy() call). Hence, the front-end isn't getting a blob id back yet.
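
One way to picture a fix is to generate the blob id first and return it right away, while the per-file enqueue loop runs elsewhere. The sketch below uses a background thread purely as a stand-in; `submit_batch` and `enqueue_one` are hypothetical names, and this is not necessarily how spfy resolves it.

```python
import threading
import uuid


def submit_batch(files, enqueue_one):
    """Return a blob id at once; enqueue the per-file jobs in the background.

    enqueue_one is a hypothetical callable (blob_id, filename) -> None that
    stands in for whatever enqueues the jobs for a single file.
    """
    blob_id = "blob" + uuid.uuid4().hex
    pending = list(files)  # snapshot, so later changes by the caller are ignored

    def worker():
        for name in pending:
            enqueue_one(blob_id, name)

    thread = threading.Thread(target=worker)
    thread.daemon = True  # do not block interpreter shutdown
    thread.start()
    # The thread is returned only so callers (and tests) can join it.
    return blob_id, thread
```

Under uWSGI, threads have to be enabled (`--enable-threads`) for this to work, so a single parent job on the task queue that fans out the per-file jobs may be the more natural fit. Either way, the status endpoint has to tolerate a blob id whose jobs are not all enqueued yet.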

@kevinkle
Member Author

With the blob ids, it takes ~10 seconds to check jobs for all 5353 files (5353*10 = 53,530 jobs). Will need to increase the polling time in reactapp for large tasks.
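
As a rough illustration of scaling the polling interval with the size of the task, the sketch below waits at least a multiple of however long the previous status check took. It is written in Python for consistency with the backend, even though the real change would live in reactapp, and every name and default value here is an assumption.

```python
def next_poll_delay(last_check_seconds, base_delay=5.0, factor=2.0, max_delay=120.0):
    """Pick how long to wait before the next status poll.

    Waits at least `factor` times as long as the previous check took, so a
    slow check (such as ~10 s for 53,530 jobs) is never immediately followed
    by another one. All defaults are illustrative, not measured values.
    """
    return min(max_delay, max(base_delay, factor * last_check_seconds))


print(next_poll_delay(10.0))  # a ~10 s check leads to a 20.0 s wait
```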

@kevinkle
Member Author

Blob ID is being generated okay, but every few GETs return a 500 error.

Looks like this is tied to the code generating our redis connections for checking statuses: redis/hiredis#58

superphy/spfy#94 should be addressed asap too, as not doing so is slowing down responses for large files.

The server is getting:

webserver_1              | spawned uWSGI worker 3 (pid: 46, cores: 1)
webserver_1              | [2017-06-30 17:41:23,015] ERROR in app: Exception on /api/v0/results/blob8845864416295104592 [GET]
webserver_1              | Traceback (most recent call last):
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
webserver_1              |     response = self.full_dispatch_request()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
webserver_1              |     rv = self.handle_user_exception(e)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask_cors/extension.py", line 161, in wrapped_function
webserver_1              |     return cors_after_request(app.make_response(f(*args, **kwargs)))
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
webserver_1              |     reraise(exc_type, exc_value, tb)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
webserver_1              |     rv = self.dispatch_request()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
webserver_1              |     return self.view_functions[rule.endpoint](**req.view_args)
webserver_1              |   File "./routes/ra_statuses.py", line 75, in job_status_reactapp
webserver_1              |     return job_status_reactapp_grouped(job_id)
webserver_1              |   File "./routes/ra_statuses.py", line 55, in job_status_reactapp_grouped
webserver_1              |     job = fetch_job(key)
webserver_1              |   File "./routes/job_utils.py", line 14, in fetch_job
webserver_1              |     job = q.fetch_job(job_id)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/queue.py", line 110, in fetch_job
webserver_1              |     job = self.job_class.fetch(job_id, connection=self.connection)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 294, in fetch
webserver_1              |     job.refresh()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 395, in refresh
webserver_1              |     obj = decode_redis_hash(self.connection.hgetall(key))
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/client.py", line 1861, in hgetall
webserver_1              |     return self.execute_command('HGETALL', name)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/client.py", line 578, in execute_command
webserver_1              |     connection.send_command(*args)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/connection.py", line 563, in send_command
webserver_1              |     self.send_packed_command(self.pack_command(*args))
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/connection.py", line 538, in send_packed_command
webserver_1              |     self.connect()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/connection.py", line 442, in connect
webserver_1              |     raise ConnectionError(self._error_message(e))
webserver_1              | ConnectionError: Error 99 connecting to redis:6379. Cannot assign requested address.
webserver_1              | [pid: 45|app: 0|req: 239/405] 172.18.0.1 () {44 vars in 771 bytes} [Fri Jun 30 17:40:23 2017] GET /api/v0/results/blob8845864416295104592 => generated 291 bytes in 59408 msecs (HTTP/1.1 500) 4 headers in 150 bytes (1 switches on core 0)
webserver_1              | 172.18.0.1 - - [30/Jun/2017:17:41:23 +0000] "GET /api/v0/results/blob8845864416295104592 HTTP/1.1" 500 291 "http://localhost:8090/results" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36" "-"
webserver_1              | [2017-06-30 17:41:23,022] ERROR in app: Exception on /api/v0/results/blob8845864416295104592 [GET]
webserver_1              | Traceback (most recent call last):
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
webserver_1              |     response = self.full_dispatch_request()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
webserver_1              |     rv = self.handle_user_exception(e)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask_cors/extension.py", line 161, in wrapped_function
webserver_1              |     return cors_after_request(app.make_response(f(*args, **kwargs)))
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
webserver_1              |     reraise(exc_type, exc_value, tb)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
webserver_1              |     rv = self.dispatch_request()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/flask/app.py", line 1598, in dispatch_request
webserver_1              |     return self.view_functions[rule.endpoint](**req.view_args)
webserver_1              |   File "./routes/ra_statuses.py", line 75, in job_status_reactapp
webserver_1              |     return job_status_reactapp_grouped(job_id)
webserver_1              |   File "./routes/ra_statuses.py", line 55, in job_status_reactapp_grouped
webserver_1              |     job = fetch_job(key)
webserver_1              |   File "./routes/job_utils.py", line 14, in fetch_job
webserver_1              |     job = q.fetch_job(job_id)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/queue.py", line 110, in fetch_job
webserver_1              |     job = self.job_class.fetch(job_id, connection=self.connection)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 294, in fetch
webserver_1              |     job.refresh()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/rq/job.py", line 395, in refresh
webserver_1              |     obj = decode_redis_hash(self.connection.hgetall(key))
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/client.py", line 1861, in hgetall
webserver_1              |     return self.execute_command('HGETALL', name)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/client.py", line 578, in execute_command
webserver_1              |     connection.send_command(*args)
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/connection.py", line 563, in send_command
webserver_1              |     self.send_packed_command(self.pack_command(*args))
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/connection.py", line 538, in send_packed_command
webserver_1              |     self.connect()
webserver_1              |   File "/opt/conda/envs/backend/lib/python2.7/site-packages/redis/connection.py", line 442, in connect
webserver_1              |     raise ConnectionError(self._error_message(e))
webserver_1              | ConnectionError: Error 99 connecting to redis:6379. Cannot assign requested address.
webserver_1              | [pid: 44|app: 0|req: 167/406] 172.18.0.1 () {44 vars in 771 bytes} [Fri Jun 30 17:40:43 2017] GET /api/v0/results/blob8845864416295104592 => generated 291 bytes in 39447 msecs (HTTP/1.1 500) 4 headers in 150 bytes (1 switches on core 0)
webserver_1              | 172.18.0.1 - - [30/Jun/2017:17:41:23 +0000] "GET /api/v0/results/blob8845864416295104592 HTTP/1.1" 500 291 "http://localhost:8090/results" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.109 Safari/537.36" "-"
webserver_1              | worker 1 killed successfully (pid: 44)
webserver_1              | uWSGI worker 1 cheaped.
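
Error 99 is EADDRNOTAVAIL on Linux, and on connect it often means the client has run out of local ephemeral ports after opening many short-lived connections. If that is what is happening here, reusing one connection per process instead of creating a new one per status check would avoid it. The sketch below shows only the reuse pattern: `FakeConnection` is a stand-in for a real Redis client, and `get_connection` is a hypothetical helper.

```python
import threading

_lock = threading.Lock()
_connection = None


def get_connection(factory):
    """Return one shared connection per process, creating it on first use."""
    global _connection
    with _lock:
        if _connection is None:
            _connection = factory()
        return _connection


class FakeConnection(object):
    """Stand-in for a real Redis client; counts how many were created."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1


for _ in range(1000):  # simulate 1000 status checks
    get_connection(FakeConnection)
print(FakeConnection.created)  # prints 1
```

With redis-py, the factory might build a client on top of a shared `redis.ConnectionPool`, and the same client could then be passed to the RQ queue used by `fetch_job`.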

@kevinkle
Member Author

kevinkle commented Jul 4, 2017

Should be addressed alongside superphy/spfy#94

@kevinkle
Member Author

kevinkle commented Jul 4, 2017

As of superphy/spfy#138 this is fixed on the backend and you can submit all 5353 test genomes using the subtyping task. Will create a new task in reactapp for bulk uploading before closing the issue.

@kevinkle
Member Author

kevinkle commented Jul 5, 2017

#43 (comment) is added in #50

@kevinkle
Member Author

kevinkle commented Jul 5, 2017

Merged in superphy/spfy#143 . Closing issue.

kevinkle closed this as completed Jul 5, 2017
@kevinkle
Member Author

kevinkle commented Jul 5, 2017

Need to make some fixes.

kevinkle reopened this Jul 5, 2017
@kevinkle
Member Author

kevinkle commented Jul 5, 2017

Edge cases fixed in superphy/spfy#144

@kevinkle
Member Author

This was addressed locally, but fixes are needed to make this work in production. Tracking on superphy/spfy#188
