BiG-CZ: Cache CUAHSI Requests #2260

rajadain · 2017-09-15T16:02:10Z

Overview

CUAHSI searches use two endpoints: one for fetching services, and another for fetching the series within them. The services rarely change, so we cache them for a week. The series change more often, so we cache them for 5 minutes. Each cache key has the form `bigcz_{name of request method}_{sorted hashset of arguments}`, so a request with new arguments misses the cache and runs as usual, while a repeated request is served from the cache until its TTL expires.
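
The key scheme described above can be sketched roughly as follows. This is an illustration, not the code in this PR: `make_cache_key` is a hypothetical helper, and it swaps in a SHA-1 digest of the sorted arguments in place of the built-in `hash()`, because string hashes are randomized per process in Python 3 and would not match across workers.

```python
import hashlib
import json


def make_cache_key(method_name, **kwargs):
    """Build a key shaped like bigcz_<method>_<digest of sorted arguments>.

    Hypothetical helper for illustration only.
    """
    # sort_keys makes the digest independent of keyword order;
    # default=str covers values json cannot encode natively (e.g. dates).
    payload = json.dumps(kwargs, sort_keys=True, default=str)
    digest = hashlib.sha1(payload.encode('utf-8')).hexdigest()
    return 'bigcz_{}_{}'.format(method_name, digest)


key_a = make_cache_key('GetServicesInBox2', xmin=-75.3, ymin=39.8)
key_b = make_cache_key('GetServicesInBox2', ymin=39.8, xmin=-75.3)
key_c = make_cache_key('GetServicesInBox2', xmin=-75.3, ymin=39.9)
```

Here `key_a` and `key_b` are equal because only the keyword order differs, while `key_c` differs because one argument value changed.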

This work complements the filters PR #2258, which triggers searches on click. Caching should make enabling and disabling filters a lot quicker.

Connects #1932

Demo

Testing Instructions

Check out this branch, go to :8000/?bigcz
Run a WDC search. It'll take its time the first time.
Try to run the same search again. It should be noticeably faster.
Ensure the results are the same as before.

The suds library talks to the CUAHSI SOAP API by reading its WSDL and creating classes for requests and responses on the fly. Because these classes only exist at runtime, their instances are not serializable: pickle (or json, or anything else) cannot deserialize them without the class definitions. Before these values can be cached, they must be made serializable. To this end, we convert the results into Python dicts using the suds `asdict` function, which converts a suds Object to a dict. Arrays are converted to lists, literals are used as-is, and sub-Objects are converted recursively to dicts.
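
A minimal sketch of that recursive conversion, written so that it runs without suds installed. `FakeSudsObject` and `to_plain` are hypothetical names used only for this illustration; with the real library, the object-to-dict step would presumably go through `suds.sudsobject.asdict` instead of `vars`.

```python
import pickle


class FakeSudsObject(object):
    """Hypothetical stand-in for a suds Object whose fields are set at runtime."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


def to_plain(value):
    """Recursively convert objects to dicts and arrays to lists."""
    if isinstance(value, FakeSudsObject):
        # Sub-objects become dicts, converted field by field
        return {name: to_plain(field) for name, field in vars(value).items()}
    if isinstance(value, (list, tuple)):
        # Arrays become lists
        return [to_plain(item) for item in value]
    # Literals (str, int, float, None, ...) are used as-is
    return value


response = FakeSudsObject(
    title='example',
    series=[FakeSudsObject(id=1), FakeSudsObject(id=2)],
)
plain = to_plain(response)
# The plain structure survives a pickle round trip
restored = pickle.loads(pickle.dumps(plain))
```

After conversion, `plain` contains only dicts, lists, and literals, so any serializer the cache backend uses can handle it.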

A CUAHSI search request uses two endpoints: one for fetching services, and the other for fetching series. The services endpoint has values which rarely change, so we cache them for a week. The series endpoint has more frequently changing values, so those are cached for 5 minutes.
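
The two TTLs could be applied with a get-or-fetch wrapper along these lines. This is a sketch under assumptions: `TTLCache` is a hypothetical in-memory stand-in that mimics the `get(key)` / `set(key, value, timeout)` shape of Django's cache API, and `cached_call` is an invented helper name, not necessarily how the PR wires it up.

```python
import time

ONE_WEEK = 7 * 24 * 60 * 60  # services rarely change
FIVE_MINUTES = 5 * 60        # series change more often


class TTLCache(object):
    """Hypothetical in-memory stand-in for a cache backend with timeouts."""

    def __init__(self, clock=time.time):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has outlived its TTL; drop it and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key, value, timeout):
        self._store[key] = (value, self._clock() + timeout)


def cached_call(cache, key, timeout, fetch):
    """Return the cached value for key, calling fetch() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = fetch()
        cache.set(key, value, timeout)
    return value
```

A services lookup would pass `ONE_WEEK` as the timeout and a series lookup would pass `FIVE_MINUTES`. The injectable `clock` exists only to make expiry easy to test.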

arottersman

Tested, and seems to be working well.

	Initial	Cached
No filter	1.92s	297.97ms
Date filter	959.02s	662.26ms
Gridded services	2.93s	636.92ms

mmcfarland · 2017-09-19T16:08:37Z

src/mmw/apps/bigcz/clients/cuahsi/search.py

@@ -228,6 +237,7 @@ def get_series_catalog_in_box(box, from_date, to_date, networkIDs):
    to_date = to_date or DATE_MAX

    result = make_request(client.service.GetSeriesCatalogForBox2,


Can you speculate on the size of these responses so we have an idea of the impact to the cache storage we have available?

rajadain

vagrant@services:~$ redis-cli -n 1 GET ":1:bigcz_GetServicesInBox2_-536472219463548920" > GetServicesInBox2.txt
vagrant@services:~$ redis-cli -n 1 GET ":1:bigcz_GetSeriesCatalogForBox2_6025281192626754923" > GetSeriesCatalogForBox2.txt
vagrant@services:~$ du -sh *.txt
392K    GetSeriesCatalogForBox2.txt  # Cached for 5 minutes
144K    GetServicesInBox2.txt        # Cached for 1 week

This is for the Philadelphia HUC-12 for a search query of "water" with 214 results.
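
For a rough estimate from Python rather than from `redis-cli`, the serialized size of a value can be measured before it is stored. This is only a sketch: `serialized_size_kib` is a hypothetical helper, the sample payload is invented placeholder data, and the size actually stored depends on the serializer and any compression the cache backend applies.

```python
import pickle


def serialized_size_kib(value):
    """Approximate size of a value once pickled, in KiB."""
    return len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)) / 1024.0


# Invented placeholder records, sized to mirror the 214 results mentioned above
sample = [{'id': i, 'title': 'series {}'.format(i)} for i in range(214)]
sample_kib = serialized_size_kib(sample)
```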

It should be noted that the original issue only asked to cache GetServicesInBox2. I'm also caching GetSeriesCatalogForBox2 to make interacting with it faster.

mmcfarland

I was skeptical of this approach, since the bbox (i.e., the AOI shape) must be identical for the cache to take effect. However, tweaking the filters resulted in noticeably faster searches because the initial services were cached. It's too bad we can't cache the services based on a spatial index, which would allow us to cache things at a larger geographic level.
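
One way to approximate caching at a larger geographic level, short of a real spatial index, would be to snap each bbox outward to a coarse grid so that nearby AOIs share a cache key. This is purely a sketch of the idea and not something this PR does: `snap_bbox` and the 0.5 degree cell size are assumptions, and the coordinates are illustrative.

```python
import math


def snap_bbox(xmin, ymin, xmax, ymax, cell=0.5):
    """Expand a bbox outward to the enclosing grid of cell-sized squares."""
    return (
        math.floor(xmin / cell) * cell,
        math.floor(ymin / cell) * cell,
        math.ceil(xmax / cell) * cell,
        math.ceil(ymax / cell) * cell,
    )


# Two slightly different AOIs land on the same snapped bbox
snapped_a = snap_bbox(-75.28, 39.87, -74.96, 40.14)
snapped_b = snap_bbox(-75.20, 39.90, -74.90, 40.10)
```

The trade-off is that the cached response covers the larger snapped area, so results would still need to be filtered down to the actual AOI after a cache hit.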

rajadain · 2017-09-20T19:41:20Z

Thanks for taking a look!

rajadain added 2 commits September 15, 2017 11:56

rajadain added the BigCZ label Sep 15, 2017

rajadain assigned arottersman Sep 15, 2017

rajadain requested a review from arottersman September 15, 2017 16:02

rajadain added the in progress label Sep 15, 2017

arottersman approved these changes Sep 19, 2017

arottersman assigned rajadain and unassigned arottersman Sep 19, 2017

mmcfarland reviewed Sep 19, 2017

rajadain mentioned this pull request Sep 19, 2017

BiG-CZ: WDC / CUAHSI Details View #2259

Merged

rajadain assigned mmcfarland Sep 20, 2017

mmcfarland approved these changes Sep 20, 2017

mmcfarland removed their assignment Sep 20, 2017

rajadain merged commit fe10d23 into develop Sep 20, 2017

hectcastro removed the in progress label Sep 20, 2017

rajadain deleted the tt/bigcz-cache-cuahsi-requests branch September 20, 2017 19:41

rajadain mentioned this pull request Oct 16, 2017

Release 1.20.0 #2304

Closed

emiliom mentioned this pull request Mar 23, 2018

CUAHSI WDC catalog API search enhancements BiG-CZ/BiG-CZ-Portal#10

Open
