-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Indexing on web_search? #140
Comments
That didn't work, so I've tried doing something like this instead: add_index :cases, "to_tsvector('english', case_number || ' ' || short_title)", using: :gin, name: 'cases_keywords_idx', algorithm: :concurrently But the search is still very, very slow (taking 10 seconds or longer). I've done some profiling and it's all down to the bottleneck created by the special two-table, four-column query I mentioned in my previous message. Here's an example query that lasts ~10 seconds: SELECT "cases".* FROM "cases" INNER JOIN "case_events" ON "case_events"."case_id" = "cases"."id" WHERE (to_tsvector('english', "cases"."case_number"::text) @@ websearch_to_tsquery('english', 'police'::text) OR to_tsvector('english', "cases"."short_title"::text) @@ websearch_to_tsquery('english', 'police'::text) OR to_tsvector('english', "case_events"."party_name"::text) @@ websearch_to_tsquery('english', 'police'::text) OR to_tsvector('english', "case_events"."short_title"::text) @@ websearch_to_tsquery('english', 'police'::text)) GROUP BY "cases"."id" ORDER BY max(COALESCE(ts_rank(to_tsvector('english', "cases"."case_number"::text), websearch_to_tsquery('english', 'police'::text)), 0) + COALESCE(ts_rank(to_tsvector('english', "cases"."short_title"::text), websearch_to_tsquery('english', 'police'::text)), 0) + COALESCE(ts_rank(to_tsvector('english', "case_events"."party_name"::text), websearch_to_tsquery('english', 'police'::text)), 0) + COALESCE(ts_rank(to_tsvector('english', "case_events"."short_title"::text), websearch_to_tsquery('english', 'police'::text)), 0)) LIMIT $1 OFFSET $2; I've tested both with and without that indexing, and it doesn't seem to really make a difference, which makes me wonder whether I need to index things some other way, or if it's not the |
@tomcardoso for some reason if you join and lookup into two tables GIN indexes are not used. I have managed it is possible to check for matching JOINed table in CTE and just pass ids to final query. You need to use WITH related_labels AS (
SELECT id FROM labels WHERE account_id = 80 AND to_tsvector('english_nostop', "labels"."name"::TEXT) @@ websearch_to_tsquery('english_nostop', 'Hello'::TEXT)
)
SELECT
"products" .*,
COALESCE(ts_rank(to_tsvector('english_nostop', "products"."title"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) + COALESCE(ts_rank(to_tsvector('english_nostop', "products"."catalog_number"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) + COALESCE(ts_rank(to_tsvector('english_nostop', "products"."version"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) + COALESCE(ts_rank(to_tsvector('english_nostop', "products"."artist"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) + COALESCE(ts_rank(to_tsvector('english_nostop', "products"."barcode"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) + COALESCE(ts_rank(to_tsvector('english_nostop', "products"."company_id"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) + COALESCE(ts_rank(to_tsvector('english_nostop', "labels"."name"::TEXT), websearch_to_tsquery('english_nostop', 'Hello'::TEXT)), 0) AS "rank20534463617833501"
FROM
"products"
LEFT OUTER JOIN "labels" ON
"labels"."id" = "products"."label_id"
WHERE
"products"."account_id" = 80
AND (to_tsvector('english_nostop', "products"."title"::TEXT) @@ websearch_to_tsquery('english_nostop', 'Hello'::TEXT)
OR to_tsvector('english_nostop', "products"."version"::TEXT) @@ websearch_to_tsquery('english_nostop', 'Hello'::TEXT)
OR to_tsvector('english_nostop', "products"."artist"::TEXT) @@ websearch_to_tsquery('english_nostop', 'Hello'::TEXT)
OR to_tsvector('english_nostop', "products"."barcode"::TEXT) @@ websearch_to_tsquery('english_nostop', 'Hello'::TEXT)
OR to_tsvector('english_nostop', "products"."company_id"::TEXT) @@ websearch_to_tsquery('english_nostop', 'Hello'::TEXT)
OR label_id = ANY(ARRAY(SELECT * FROM related_labels)))
ORDER BY
"rank20534463617833501" DESC
|
I'm a bit confused as to how I can set up indexing on web search, especially if I want to use the approach we talked about in #137. To recap, what I'm doing is:
Which searches a number of fields for a certain keyword and returns all results that match in any of those fields.
But I'm finding the search to be very slow for my database (~1.8m case event entries). I went in and added simple keyword indexes of this sort:
But I have a feeling I may need to do something more. Do you have any advice on what I should do? I know Textacular recommends using the
:sql
schema file format, but I'd prefer to keep it as aschema.rb
for compatibility reasons with my production database.The text was updated successfully, but these errors were encountered: