Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create geoprocessing API django app #2201

Merged
merged 8 commits into from
Aug 31, 2017
Merged

Conversation

arottersman
Copy link

@arottersman arottersman commented Aug 28, 2017

Overview

  • Creates apps.geoprocessing_api
  • /api/modeling/start/analyze/... ---> /api/analyze/...
  • /api/modeling/start/rwd/ ---> /api/rwd/
  • Add /api/jobs/<$job_uuid>/ (imports view from apps.modeling)
  • Updates client to use new routes
  • Modifies analyze and rwd views to correctly parse json encoding

EDIT: Part II

  • Rename not public /api urls to mmw ; /api/modeling --> /mmw/modeling
  • Modify analyze request input dbe7976
  • Add location header with /jobs url to poll f2362b9

Connects #2184

Demo

analyze_payload.json.txt
analyze_geojson.json.txt

http POST 'http://localhost:8000/api/analyze/animals/' < analyze_geojson.json
HTTP/1.1 200 OK
Allow: POST, OPTIONS
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/json
Date: Mon, 28 Aug 2017 21:34:44 GMT
Location: http://localhost:8000/api/jobs/e19e6117-4f53-423a-b475-c1bfc6582c1c/
Server: nginx/1.12.0
Transfer-Encoding: chunked
Vary: Accept-Encoding
Vary: Accept, Cookie

{
    "job": "e19e6117-4f53-423a-b475-c1bfc6582c1c",
    "status": "started"
}
http GET 'http://localhost:8000/api/jobs/e19e6117-4f53-423a-b475-c1bfc6582c1c/'
HTTP/1.1 200 OK
Content-Encoding: gzip
Content-Type: application/json
Date: Mon, 28 Aug 2017 21:35:47 GMT
Server: nginx/1.12.0
Transfer-Encoding: chunked
Vary: Accept-Encoding
Vary: Accept, Cookie

{
    "error": "",
    "finished": "2017-08-28T21:34:44.207Z",
    "job_uuid": "e19e6117-4f53-423a-b475-c1bfc6582c1c",
    "result": "{\"survey\": {\"displayName\": \"Animals\", \"name\": \"animals\", \"categories\": [{\"aeu\": 0, \"typ
e\": \"Sheep\"}, {\"aeu\": 0, \"type\": \"Horses\"}, {\"aeu\": 0, \"type\": \"Turkeys\"}, {\"aeu\": 0, \"type\": \"Ch
ickens, Layers\"}, {\"aeu\": 0, \"type\": \"Cows, Beef\"}, {\"aeu\": 0, \"type\": \"Pigs/Hogs/Swine\"}, {\"aeu\": 0, 
\"type\": \"Cows, Dairy\"}, {\"aeu\": 0, \"type\": \"Chickens, Broilers\"}]}}",                                     
    "started": "2017-08-28T21:34:44.141Z",
    "status": "complete"
}

Testing Instructions

You will need

  • rwd-docker running

App still works?

  • Pull this branch, bundle
  • Confirm you can run both rwd endpoints from draw. The request should be going to api/geoprocessing /api, and not api/modeling
  • Analyze an AOI and confirm each of the analyze tasks still works. Their requests should also poll /api/geoprocessing /api
  • Confirm TR-55 still works

API Accepts JSON now?
Using the json in the Demo section or similarly formatted json of your choosing, exercise the API:

http POST 'http://localhost:8000/api/analyze/animals/' < analyze_geojson.json

error = save_job_error.s(job.id)
@decorators.api_view(['GET'])
@decorators.permission_classes((AllowAny, ))
def get_job(request, job_uuid, format=None):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This got reordered in the file, so the diff looks off. It should be implemented identically to before.

@mmcfarland
Copy link
Contributor

Running through the functionality and things are posting and polling to the right URLs, but I'm getting this on the raster analyze steps (the vectors are coming back).

File "/opt/app/apps/geoprocessing_api/tasks.py", line 124, in analyze_soil
    raise Exception('[analyze_soil] {}'.format(result['error']))
Exception: [analyze_soil] Geoprocessing Error.
Details: HTTP method not allowed, supported methods: GET

Seems like it may be related to the underlying GT implementation. I'll try a reprovision and see if that sorts things out for this branch - but just checking that those work correctly on your branch.

@arottersman
Copy link
Author

Seems like it may be related to the underlying GT implementation. I'll try a reprovision and see if that sorts things out for this branch - but just checking that those work correctly on your branch.

They do. @rajadain helped me work through the same error. If you haven't destroyed and rebuilt your worker, spark jobserver is likely still running on it. You'll have to vagrant ssh worker -c 'sudo stop spark-jobserver'

Copy link
Contributor

@mmcfarland mmcfarland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good baseline. Some issues I've noticed, but will let you sort out whether to include them here or in subsequent issues:

  • URL names (this was discussed internally); I think that api should be reserved for just this app
  • Route names /api/geoprocessing/start/analyze/<thing> is pretty verbose. /api/analyze/<thing> or similar would seem sufficient
  • We'll likely want to rethink how we name the input. If all we're sending is geometry, we may want to accept formal geojson and not a json document with an area_of_interest key.
  • Errors to the api (like 404) return HTML and should return JSON
  • Initial responses for submitting an async job should have a mechanism to programmatically call in for the results. We do something similar with a Location header elsewhere.

@arottersman
Copy link
Author

URL names (this was discussed internally); I think that api should be reserved for just this app

I'll fix that here. I liked your idea of using mmw/ for all the other routes.

Route names /api/geoprocessing/start/analyze/ is pretty verbose. /api/analyze/ or similar would seem sufficient

I'll fix that here, too

We'll likely want to rethink how we name the input. If all we're sending is geometry, we may want to accept formal geojson and not a json document with an area_of_interest key.

I agree. This seems like it should be its own issue, but I'm going to go for it. Hopefully it will make working on the tasks to come a little cleaner.

Errors to the api (like 404) return HTML and should return JSON

I think this could be done with #2200. I'll update the issue to that effect.

Initial responses for submitting an async job should have a mechanism to programmatically call in for the results. We do something similar with a Location header elsewhere.

Seems straightforward enough to add here.

@rajadain
Copy link
Member

Taking a look at this.

Copy link
Member

@rajadain rajadain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. This is working correctly. Two minor comments:

  • geoprocessing in the API feels heavy, as does having a geoprocessing_api app. Not sure what captures the entire intent in a shorter form, but geo and geop could be potential candidates.
  • Currently the Analyze and RWD tasks are grouped together. Consider splitting in to analyze/tasks and rwd/tasks, since there is no overlap in the code. Also, consider moving calcs to analyze/calcs to better imply the coupling there.

def point_source_pollution(geojson):
"""
Given a GeoJSON shape, retrieve point source pollution data
from the `ms_pointsource` or `ms_pointsource_drb table to display
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing end tick on ms_pointsource_drb

@arottersman
Copy link
Author

geoprocessing in the API feels heavy, as does having a geoprocessing_api app. Not sure what captures the entire intent in a shorter form, but geo and geop could be potential candidates.

Could we potentially just call it api (eg /api/analyze, /api/rwd? @mmcfarland are there plans for another API separate from our "geoprocessing" one?

@mmcfarland
Copy link
Contributor

Could we potentially just call it api (eg /api/analyze ....

Yep, I made the same comment in my PR review above.

@arottersman
Copy link
Author

Currently the Analyze and RWD tasks are grouped together. Consider splitting in to analyze/tasks and rwd/tasks, since there is no overlap in the code. Also, consider moving calcs to analyze/calcs to better imply the coupling there.

This is a nice idea, but for now I don't mind the flatness. Seems to go nicely with views.

@arottersman
Copy link
Author

I added three commits to address feedback. See their commit messages for details. I've also tried to keep the PR description at the top up to date.

Quick note on restructuring the expected request input: I left the rwd JSON as is for now. I'll open a card for dealing with it separately. The analyze input seemed weirder/more worthy of an expedited update.

@arottersman
Copy link
Author

Fixups are to fix failing tests.

@rajadain
Copy link
Member

Some flake8 checks are failing:

/opt/app/apps/geoprocessing_api/views.py:7:1: F401 'json' imported but unused
/opt/app/apps/geoprocessing_api/views.py:170:1: E302 expected 2 blank lines, found 1
/opt/app/apps/geoprocessing_api/views.py:177:80: E501 line too long (99 > 79 characters)
/opt/app/apps/modeling/views.py:543:1: E302 expected 2 blank lines, found 1
/opt/app/apps/modeling/views.py:546:8: E111 indentation is not a multiple of four
/opt/app/apps/modeling/views.py:585:80: E501 line too long (81 > 79 characters)
/opt/app/apps/modeling/views.py:592:80: E501 line too long (85 > 79 characters)
flake8 check failed

Copy link
Contributor

@mmcfarland mmcfarland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments about fn arguments and about reverting the change to the app accessible URLs. Having trouble sorting out some geop credentials, but intend to give this a good testing.

@@ -45,7 +46,8 @@ def start_rwd(request, format=None):
{
'job': task_list.id,
'status': 'started',
}
},
headers={'Location': reverse('get_job', args=[task_list.id])}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to test in dev, but we'll want to make sure that this picks up the https of staging/production.

url(r'^micro/', include(apps.water_balance.urls)),
url(r'^user/', include(apps.user.urls))
url(r'^mmw/user/', include(apps.user.urls))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should go beyond what was previously under /api, as these modifications actually change the URL at which the app is served and from where links have been added to webpages and bookmarked, etc. The redirects would likely keep it functioning, but I think changing the app URLs is a step too far.

For example, a project page is now http://localhost:8000/mmw/project/82/scenario/159

@@ -539,6 +540,27 @@ def get_layer_shape(table_code, id):
else:
return None

def parse_area_of_interest(area_of_interest, geojson=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function could use some comments about the arguments. It's not clear why there's a flag for geoJson but the function raises an error if there is invalid geojson provided, despite the flag. If we have two workflows that use this function with different types, it may be worth modifying the calls so they are both passing a single type. Goes for the related function below.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add comment to these two and the parse_input in the geoprocessing_api.

It's not clear why there's a flag for geoJson but the function raises an error if there is invalid geojson provided, despite the flag.

The function always expects area_of_interest to be valid geojson. The geojson flag controls whether the function returns a geojson string or its dict representation. It comes from the original modeling.views.parse_input implementation; I'll see if we can take it out.

@@ -85,7 +90,9 @@ def start_analyze_soil(request, format=None):
@decorators.permission_classes((AllowAny, ))
def start_analyze_animals(request, format=None):
user = request.user if request.user.is_authenticated() else None
area_of_interest, __ = parse_input(request.data['analyze_input'])

wkaoi_str = request.query_params.get('wkaoi', None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems strange to parse out a wkaoi_str if we're going to ignore the results of the lookup done in parse_input. Can this be removed in all the non-raster jobs since wkaoi_str has a default value of None in the function?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where no area of interest was sent, we use the wkaoi_str to look up the aoi, whether the job is raster or not.

Even though we don't benefit from the wkaoi caching for these jobs, I think we should still accept the wkaoi id as input. This makes our requests on the app more uniform and makes it so the user doesn't have to keep track of which jobs are raster and which aren't.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'm not advocating for changing the request shape, but the method reads like it does not utilize the argument based on it ignoring the wkaoi return. I think an improvement might be to rename things from parse_* to load_* or similar, since it's actually fetching data, not just parsing input. Also, the second return variable which is only occasionally used appears to either be None or the id passed in as an argument initially. Seems like this can be removed (since the caller necessarily has it), so it's clear that this function only returns an area of interest, regardless of what mixture of inputs is received. It reads like it returns something special only if wkaoi is provided.

Alice Rottersman added 3 commits August 31, 2017 13:52
* Create `apps.geoprocessing_api`. It is dependent on `apps.modeling`
* Move the analyze views, tasks, calcs, and tests out of `apps.modeling` and into
`apps.geoprocessing_api`
* Move the rwd views and tasks out of `apps.modeling` and into
`apps.geoprocessing_api`

* All analyze urls are now `api/geoprocessing/start/analyze/...`
* Rwd url is now `api/geoprocessing/start/rwd/

* Add `api/geoprocessing/jobs/<job_uuid>`, which uses `modeling.views.get_job`
* Parse the analyze and rwd requests such that they could be either
  form-urlencoded (from the app) or json (easier for an api user)
* Update the client to call new geoprocessing routes for analyze and rwd
@arottersman arottersman changed the base branch from feature/collections-api to develop August 31, 2017 17:53
@arottersman
Copy link
Author

arottersman commented Aug 31, 2017

  • Rebased and retargeted on develop
  • 79478b2 Revert change to /user endpoints
  • eb6404c Refactor parse_input

@mmcfarland
Copy link
Contributor

Changes look good, but looks like /project didn't get reverted. App is still serving it with mmw,

http://localhost:8000/mmw/project/88/scenario/168

url(r'^api/bigcz/', include(apps.bigcz.urls)),
url(r'^api/geocode/', include(apps.geocode.urls)),
url(r'^api/modeling/', include(apps.modeling.urls)),
url(r'^mmw/bigcz/', include(apps.bigcz.urls)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will require a corresponding change in data_catalog/models.js which is still calling /api/bigcz/search. If you switch to ?bigcz=true, the search won't work.

On the topic, perhaps mmw/bigcz/... seems a little off. I'd be open to mmw, api, bigcz being the high level non-html rendering routes. But it's an internal app URL, so not critical.

@arottersman
Copy link
Author

Thanks for catching those and the rwd input!

  • Was maybe a little overzealous with that /mmw route prefix. Changed bigcz to just bigcz, and fixed /project
  • rwd now takes pure json, not string encoded. Adjusted the client to send json.

Copy link
Contributor

@mmcfarland mmcfarland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, thanks for working through those changes. This is a great baseline to build on now.

@mmcfarland mmcfarland removed their assignment Aug 31, 2017
Alice Rottersman added 5 commits August 31, 2017 17:22
* We were prefixing almost all URLs with `/api`. We'd like a way to
  distinguish the public facing API urls from our internal only ones,
  so rename all non-geprocessing-api urls to have the prefix `/mmw`
* /api/bigcz --> /bigcz
* Now that geoprocessing is the lone api endpoint, make its urls just `/api/..`
* Remove `/start` from urls. It adds length with little benefit
* Add a Location header with the `/jobs/$job-id` url to poll
* Previously the analyze requests took a json body either like:
      { "area_of_interest": { "type": "MultiPolygon" ... more geojson }}
   or like:
      { "wkaoi": "mywkaoi__id" }

* Now the input should be either GeoJson in the request body:
      { "type": "MultiPolygon", ... more geojson }
   or take a query param with the wkaoi:
      `api/analyze/?wkaoi=mywkaoi__id`
* Update the RWD view to take a json object (same keys as form we were
  submitting)

eg
  {
      "location": [39.11112, -73.123],
      "snappingOn": true,
      "dataSource": "nhd"
  }
* Update client to send json instead of form
* `ms_pointsource_drb` was missing its closing tick mark
@arottersman
Copy link
Author

Thanks @mmcfarland (and @rajadain) for the multiple thorough reviews. I've squashed things down and will merge when this has built.

@arottersman arottersman merged commit e02fd99 into develop Aug 31, 2017
@arottersman arottersman deleted the arr/welcome-to-the-geop-api branch August 31, 2017 21:40
@rajadain rajadain mentioned this pull request Oct 16, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants