Collections API: Update Celery to be Synchronous #2167

rajadain · 2017-08-18T17:45:21Z

Overview

Update Celery to use the new synchronous geoprocessing service, instead of the older Spark JobServer setup. See original issue and commit messages for details.

Connects #2102
Connects #2104

Testing Instructions

Ensure your Worker VM has been provisioned at least once on this feature/collections-api branch
~~Get the latest version of mmw-geoprocessing and run cibuild to generate api-assembly-3.0.0-alpha.jar~~
Check out this branch
~~Copy that new JAR into /opt/geoprocessing/mmw-geoprocessing-3.0.0-alpha.jar in the Worker VM, and then reload the Worker VM~~
Provision the Worker
Open the app :8000/
Draw any shape, and proceed to Analyze. Ensure that all analyses complete successfully.
Proceed to TR-55. Draw modifications, move the precipitation slider. Ensure that all model runs complete successfully.

Since the new geoprocessing service runs as the `mmw` user in the Worker VM, that user must have access to AWS credentials. Instead of being mounted into `/aws`, the developer's credentials are now mounted into the `mmw` user's home folder. Both `~/.aws/credentials` and `~/.aws/config` must have their permissions set to 644.
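A quick way to verify the permission requirement above is a small check like the following. This is only a sketch: the helper names `has_mode` and `check_aws_files` are hypothetical and not part of this PR, and the home directory passed in is whatever the `mmw` user's home turns out to be on your Worker VM.

```python
import os
import stat


def has_mode(path, expected=0o644):
    """Return True if `path` exists and its permission bits equal `expected`."""
    try:
        mode = stat.S_IMODE(os.stat(path).st_mode)
    except OSError:
        # Missing or unreadable file: treat as not satisfying the requirement.
        return False
    return mode == expected


def check_aws_files(home):
    """Map each expected AWS file under `home` to whether it is mode 644."""
    names = ('credentials', 'config')
    return {n: has_mode(os.path.join(home, '.aws', n)) for n in names}
```

Calling `check_aws_files(os.path.expanduser('~'))` as the `mmw` user should report `True` for both files if the mount is set up as described.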

kellyi · 2017-08-21T16:50:26Z

Just made a new mmw-geoprocessing release -- https://github.com/WikiWatershed/mmw-geoprocessing/releases/tag/3.0.0-alpha-2 -- so we can add a commit here to pull the latest geoprocessing service when provisioning the worker on this branch.

kellyi · 2017-08-21T16:59:08Z

src/mmw/apps/modeling/geoprocessing.py

-    combines it with input data, and submits it to Spark JobServer.
-
-    This task must always be succeeded by `finish` below.
+    combines it with input data, and submits it to Geoprocessing Service.


Typo: to Geoprocessing Service -> to the Geoprocessing Service.

kellyi · 2017-08-21T17:23:56Z

src/mmw/apps/modeling/geoprocessing.py

    """
-    Start a geoproessing operation.
+    Run a geoproessing operation.


Typo: geoproessing -> geoprocessing

kellyi · 2017-08-21T19:19:11Z

Not sure why but I keep getting this error from the geoprocessing logs when trying to analyze:

Error during processing of request: 'Unable to load AWS credentials from any provider in the chain'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.

rajadain · 2017-08-21T20:19:47Z

deployment/ansible/roles/model-my-watershed.geoprocessing/tasks/main.yml

@@ -1,4 +1,8 @@
 ---
+- name: Update CA Certificates
+  shell: "update-ca-certificates -f"


We have to do this for AWS credentials to work with OpenJDK. Otherwise we get this error:

Unable to execute HTTP request: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty

Does this make sense, or is it something completely out of left field? @hectcastro @tnation14

I got this from https://stackoverflow.com/a/29313285/6995854

This issue is unfamiliar to me; we use OpenJDK with the AWS SDK on a few other projects and I don't think I've seen it before. Before I give the OK on this solution I'd like to take a deeper look at the error locally. I think I remember how to reproduce this from our conversation on Friday.

Sounds good. Let me know if you'd like to pair on this, I'm available for that this week. Alternatively, just check out this branch, remove this commit, and follow the testing instructions to see the issue.

Confirmed that this is an instance of ca-certificates-java bug 1396760. This is actually an issue that should be fixed in azavea.java; I opened azavea/ansible-java#26 to track the issue, but I'm going to fix it right now.

rajadain · 2017-08-22T14:41:58Z

> keep getting this error from the geoprocessing logs

Paired with @kellyi yesterday, and got that error to abate with edb8d8b. Still waiting on confirmation from the ops folks on whether that is the right thing to do.

tnation14 · 2017-08-22T18:51:58Z

Once azavea/ansible-java#27 is closed, this PR should use azavea.java version ~~6.0.1~~ 0.6.1

rajadain · 2017-08-22T20:43:05Z

@tnation14 We're currently at 0.5.0. Would we now go to 6.0.1?

tnation14 · 2017-08-22T21:07:03Z

I meant 0.6.1, sorry.

We add a task `run` and a helper method `geoprocess`. The `run` task converts the input into the desired format, and `geoprocess` communicates with the geoprocessing service and returns results.

`run` is a combination of `start` and `finish`: it checks whether the result is cacheable and already cached, and if so returns the cached value. Otherwise it runs `geoprocess`.

`geoprocess` is similar to `sjs_submit` in that it POSTs to an endpoint. Unlike `sjs_submit`, which gets back a job id, `geoprocess` receives the actual results and returns them.

`run` is designed to replace the `start` and `finish` tasks in Celery chains. So if a previous Celery chain was:

    chain(geoprocessing.start.s(data), geoprocessing.finish.s(), mytasks.process_results.s())

it will now be:

    chain(geoprocessing.run.s(data), mytasks.process_results.s())

We likely do not need to use `choose_worker` anymore, since each request is independent and can be run on any worker (in the right colored stack). However, this probably needs some more thought, and thus will be addressed in the separate issue #2117.

These old async operations are no longer used.

This version includes RasterGroupedCount and RasterGroupedAverage operations, which make it sufficient for Analyze and TR-55 tasks.

We need azavea/ansible-java#27 to solve AWS access issues with OpenJDK.

rajadain · 2017-08-22T22:26:31Z

Alright, this should be ready for another look.

hectcastro · 2017-08-23T00:09:04Z

src/mmw/apps/modeling/mapshed/tasks.py

-        finish.s().set(exchange=exchange,
-                       routing_key=worker) |
+        run.s(opname, data, wkaoi).set(exchange=exchange,
+                                       routing_key=worker) |


I don't think that we need any of the exchange and routing_key stuff now that SJS is out of the picture. The main reason for it was to pin tasks to the worker where things started so that they'd finish there too. It may be another task, but probably worth doing.

@hectcastro We do have #2117 to address that, but it'd be helpful if you could comment on the statement I made in there about dark stack routing - that was my recollection, but I wasn't confident it was accurate.

kellyi · 2017-08-23T15:34:34Z

I destroyed then rebuilt my worker VM on this branch and everything seems to be working well!

It feels pretty quick for small AOIs; going to try out a few larger AOIs and some simultaneous requests just to see how that works.

kellyi

+1. This is working well! I destroyed then rebuilt my Worker VM on this branch and everything worked correctly.

The responses feel pretty quick now, and we'll be tweaking the geoprocessing API performance in future work, too.

rajadain · 2017-08-23T19:32:34Z

Thanks for the reviews, everyone!

rajadain assigned kellyi Aug 18, 2017

rajadain requested a review from kellyi August 18, 2017 17:45

rajadain added the in progress label Aug 18, 2017

kellyi reviewed Aug 21, 2017

rajadain force-pushed the tt/collections-api-sync-celery branch from e3a94fa to edb8d8b Compare August 21, 2017 20:18

rajadain commented Aug 21, 2017

This was referenced Aug 22, 2017

Link JAVA SSL certificates after installing OpenJDK azavea/ansible-java#26

Closed

Install CA Certificates after installing OpenJDK azavea/ansible-java#27

Merged

rajadain added 5 commits August 22, 2017 18:23

Replace start and finish with run

30327aa

Remove unused tasks and methods

88cf528

Use latest alpha of geoprocessing service

df38e84

Update Ansible Java to latest with bugfix

4e77070

rajadain force-pushed the tt/collections-api-sync-celery branch from edb8d8b to 4e77070 Compare August 22, 2017 22:23

hectcastro reviewed Aug 23, 2017

kellyi approved these changes Aug 23, 2017

kellyi assigned rajadain and unassigned kellyi Aug 23, 2017

rajadain merged commit b06a9e9 into feature/collections-api Aug 23, 2017

hectcastro removed the in progress label Aug 23, 2017

rajadain deleted the tt/collections-api-sync-celery branch August 23, 2017 19:32

rajadain mentioned this pull request Aug 24, 2017

Collections API: Update Celery to Use Real Values #2104

Closed

rajadain mentioned this pull request Oct 16, 2017

Release 1.20.0 #2304

Closed
