Filter on parameters.site #388
|
I don't have Add-Ons/events/runs, certainly not enough to test the filter, so pushing this to staging first is the right move for testing before opening another PR to merge into prod |
Indexing strategy review: B-tree vs GIN
TL;DR: the B-tree partial expression index is the right call here. GIN would not help this query and would cost more.
Why B-tree fits
I verified the SQL Django generates for WHERE (parameters -> 'site') = '"..."'::jsonb
The index is on exactly that expression (
Why GIN does not help this query
A GIN index on the whole
When to revisit
Switch to GIN if any of these become true:
Tradeoff summary
Minor notes (not index-related)
|
|
Here's what this query looks like on staging: https://api.staging.documentcloud.org/api/addon_runs/?addon=48&expand=~all&site=https://www.muckrock.com Note that if you take away the |
|
I will just note that we may want the Klaxon front end to normalize URLs: for example, muckrock.com doesn't return any results while www.muckrock.com does. Some domains don't have a www, so this may be nuanced. This is why icontains may be necessary in the API itself. |
muckrock.com and www.muckrock.com are technically different URLs. I don't think it should try to normalize them. Substring matching may still make sense though if that is the behavior we want. |
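To make the difference between the two behaviors concrete, here is a minimal plain-Python sketch, not the PR's actual filter code. The stored values and the helper names exact_match and icontains_match are hypothetical, and the lowercase comparison only approximates what a case-insensitive substring lookup such as Django's icontains would do in the database.

```python
# Hypothetical parameters.site values stored for three add-on runs.
stored_sites = [
    "https://www.muckrock.com",
    "https://www.muckrock.com/news/",
    "https://www.documentcloud.org",
]


def exact_match(sites, query):
    """Return the sites that equal the query string exactly."""
    return [site for site in sites if site == query]


def icontains_match(sites, query):
    """Return the sites that contain the query as a substring, ignoring case."""
    needle = query.lower()
    return [site for site in sites if needle in site.lower()]


if __name__ == "__main__":
    # Exact matching treats the www and non-www forms as different URLs.
    print(exact_match(stored_sites, "https://www.muckrock.com"))
    print(exact_match(stored_sites, "https://muckrock.com"))
    # Substring matching on the bare domain finds both muckrock.com entries.
    print(icontains_match(stored_sites, "muckrock.com"))
```

Note that substring matching only bridges the www difference when the query omits the scheme: "https://muckrock.com" is not a substring of "https://www.muckrock.com", so that query still returns nothing under either approach.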
|
I think the place we're going to have trouble is URLs with query params, especially for stuff like analytics. If I mean to watch |
Closes #387
This will mostly apply to Klaxon and Scraper, but other add-ons that use
site can use it, too. The index is partial, so it only covers add-ons with site and ignores others.