Skip to content

Conversation

@jonmol
Copy link
Contributor

@jonmol jonmol commented Feb 4, 2021

What does it do?

Gives the users the option to set skipSearch on each attribute, excluding column search should search in. Also sets a fair share of the files attributes to non-searchable in the upload-plugin.

Why is it needed?

When a user wants to add an image to an entity, or just want to search for it, Strapi builds an sql query doing LIKE on every numeric or string field inside the entity (or upload_file). That's problematic for several reasons, but the main two I see are that it doesn't scale and that the search results are of a low quality.

The low quality of the search results is due to fields that are irrelevant matches, in the case of file hash is a prime example. Another are numeric ones that are much more suitable for filtering or enums.

For scaling the default file search sql looks like this:
select distinct “upload_file”.*, “upload_file”.“updated_at” as “_strapi_tmp_upload_file_updated_at” from “upload_file” where (“upload_file”.“name”::text ILIKE ? or “upload_file”.“alternativeText”::text ILIKE ? or “upload_file”.“caption”::text ILIKE ? or “upload_file”.“hash”::text ILIKE ? or “upload_file”.“ext”::text ILIKE ? or “upload_file”.“mime”::text ILIKE ? or “upload_file”.“url”::text ILIKE ? or “upload_file”.“previewUrl”::text ILIKE ? or “upload_file”.“provider”::text ILIKE ? or “upload_file”.“id”::text ILIKE ?) order by “_strapi_tmp_upload_file_updated_at” desc limit ?

And for an entity with plenty of fields it may look like this:
select distinct “someEntity”.*, “someEntity”.“someField1” as “_strapi_tmp_someEntity_internalName” from “someEntity” where (“someEntity”.“someField1”::text ILIKE ? or “someEntity”.“someField2”::text ILIKE ? or “someEntity”.“someField3”::text ILIKE ? or “someEntity”.“someField4”::text ILIKE ? or “someEntity”.“someField5”::text ILIKE ? or “someEntity”.“someField6”::text ILIKE ? or “someEntity”.“someField7”::text ILIKE ? or “someEntity”.“someField8”::text ILIKE ? or “someEntity”.“someField9”::text ILIKE ? or “someEntity”.“someField10”::text ILIKE ? or “someEntity”.“someField11”::text ILIKE ? or “someEntity”.“someField12”::text ILIKE ? or “someEntity”.“someField13”::text ILIKE ? or “someEntity”.“someField14”::text ILIKE ? or “someEntity”.“someField15”::text ILIKE ? or “someEntity”.“someField16”::text ILIKE ? or “someEntity”.“someField17”::text ILIKE ?) order by “_strapi_tmp_someEntity_someField1” asc limit ?

Those are really expensive queries, and for larger databases they start taking seconds executing.

@derrickmehaffy
Copy link
Member

This pull request has been mentioned on Strapi Community Forum. There might be relevant details there:

https://forum.strapi.io/t/search-generates-horrible-sql-queries/2620/2

@codecov
Copy link

codecov bot commented Feb 4, 2021

Codecov Report

Merging #9316 (dfe1f5e) into master (afaf111) will increase coverage by 0.04%.
The diff coverage is 60.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9316      +/-   ##
==========================================
+ Coverage   35.07%   35.12%   +0.04%     
==========================================
  Files        1308     1310       +2     
  Lines       14465    14535      +70     
  Branches     1439     1453      +14     
==========================================
+ Hits         5074     5105      +31     
- Misses       8481     8514      +33     
- Partials      910      916       +6     
Flag Coverage Δ
front 26.65% <0.00%> (ø)
unit 54.71% <75.00%> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...strapi-admin/admin/src/containers/Admin/Wrapper.js 66.66% <ø> (ø)
...anager/admin/src/components/SelectWrapper/index.js 0.00% <ø> (ø)
...ssions/admin/src/components/Policies/Components.js 0.00% <0.00%> (ø)
...permissions/admin/src/components/Policies/index.js 0.00% <0.00%> (ø)
packages/strapi-utils/lib/sanitize-entity.js 97.10% <71.42%> (-2.90%) ⬇️
packages/strapi/lib/middlewares/session/index.js 39.28% <100.00%> (ø)
...rapi/lib/core/app-configuration/config-provider.js 70.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e29483b...ded5beb. Read the comment docs.

@jonmol
Copy link
Contributor Author

jonmol commented Feb 9, 2021

Is there anything I can do to speed up the accepting/discussion about this change? We're currently at around 60k uploaded files and searching for files to add to collection takes 10-30 seconds for our users.

Or should I just use a fork? (That's not intended as a threat, it's just that for us it's really crucial and we'll have to find some kind of solution as plenty of people are more or less blocked in their work)

@alexandrebodin
Copy link
Member

Hello @jonmol this change would need to be applied to mongo's connector too so the configuration is agnostic.

We would rather go with searchable: false to keep some consistency with other properties we have internally. We could even add that as a parameter in the CTB 🤔

@alexandrebodin alexandrebodin self-requested a review February 10, 2021 17:51
@jonmol
Copy link
Contributor Author

jonmol commented Feb 10, 2021

Hello @jonmol this change would need to be applied to mongo's connector too so the configuration is agnostic.

We would rather go with searchable: false to keep some consistency with other properties we have internally. We could even add that as a parameter in the CTB thinking

I agree that searchable: false is a nicer name for it, but then it wouldn't be backwards compatible? IE everyone would have to add searchable: true to all fields they want to search on. That's why I went for a double negation, I can change it but it would per default remove all fields from the search.

For mongoose, I'm sadly not very confident with it and we moved away from Mongo due to joins getting really slow with Strapi so I don't have a good setup for it and wouldn't be able to code "blindly" for it. Any chance you'd have the time to take over that part? I can rename the variable.

1 similar comment
@jonmol
Copy link
Contributor Author

jonmol commented Feb 11, 2021

Hello @jonmol this change would need to be applied to mongo's connector too so the configuration is agnostic.

We would rather go with searchable: false to keep some consistency with other properties we have internally. We could even add that as a parameter in the CTB thinking

I agree that searchable: false is a nicer name for it, but then it wouldn't be backwards compatible? IE everyone would have to add searchable: true to all fields they want to search on. That's why I went for a double negation, I can change it but it would per default remove all fields from the search.

For mongoose, I'm sadly not very confident with it and we moved away from Mongo due to joins getting really slow with Strapi so I don't have a good setup for it and wouldn't be able to code "blindly" for it. Any chance you'd have the time to take over that part? I can rename the variable.

@alexandrebodin
Copy link
Member

Hello @jonmol this change would need to be applied to mongo's connector too so the configuration is agnostic.
We would rather go with searchable: false to keep some consistency with other properties we have internally. We could even add that as a parameter in the CTB thinking

I agree that searchable: false is a nicer name for it, but then it wouldn't be backwards compatible? IE everyone would have to add searchable: true to all fields they want to search on. That's why I went for a double negation, I can change it but it would per default remove all fields from the search.

the default value for searchable can be true if not defined on the attribute ;)

For mongoose, I'm sadly not very confident with it and we moved away from Mongo due to joins getting really slow with Strapi so I don't have a good setup for it and wouldn't be able to code "blindly" for it. Any chance you'd have the time to take over that part? I can rename the variable.

Sadly the core team won't be able to help anytime soon to finish this as we still aren't enough for that. You might find help to finish your PR with some of our star contributors @derrickmehaffy might help you find the help when he is back from a few days off

@alexandrebodin alexandrebodin added source: core:content-type-builder Source is core/content-type-builder package issue: enhancement Issue suggesting an enhancement to an existing feature labels Feb 15, 2021
@alexandrebodin
Copy link
Member

Hi @jonmol looks like @ldelriof added the support for mongo with a PR on your repo :) we should be able to move forward with reviews & testing once you have merged their PR :)

skip non searchable fields on mongoose query
@strapi-cla
Copy link

strapi-cla commented Feb 16, 2021

CLA assistant check
All committers have signed the CLA.

@jonmol
Copy link
Contributor Author

jonmol commented Feb 16, 2021

Hi @jonmol looks like @ldelriof added the support for mongo with a PR on your repo :) we should be able to move forward with reviews & testing once you have merged their PR :)

Thanks for the reminder @alexandrebodin, colleague helping out there. I've asked him to sign the CLA as well.

Copy link
Member

@alexandrebodin alexandrebodin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you keep the current models unchanged so we do not change any current behaviour. You can extend the models on your project directly once the option exists.

Would you be willing to make a PR on our documentation too to document this new option ?
@derrickmehaffy can help you with that :)

@jonmol
Copy link
Contributor Author

jonmol commented Feb 16, 2021

I've reverted my changes to strapi-plugin-upload/models now.

I can try to update the documentation as well @derrickmehaffy, in what section would it make sense to have the? https://github.com/strapi/documentation/blob/main/developer-docs/latest/concepts/models.md?

(Please note that English isn't my native language so there might be a need for some corrections)

@alexandrebodin
Copy link
Member

Hi @ldelriof do you mind signing the CLA so we can eventually accept this PR :)

@ldelriof
Copy link
Contributor

Hey @alexandrebodin, CLA signed

Copy link
Member

@alexandrebodin alexandrebodin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM works just fine !

@alexandrebodin alexandrebodin added this to the 3.4.7 milestone Feb 16, 2021
@alexandrebodin alexandrebodin merged commit 7a6432d into strapi:master Feb 16, 2021
@jonmol jonmol deleted the search-optimization branch February 17, 2021 12:13
@jonmol
Copy link
Contributor Author

jonmol commented Feb 17, 2021

@alexandrebodin when do you think the next version will be released? (and in general, is there some kind of schedule or is it more that when you feel there's a fair amount of change you create a new release?)

@alexandrebodin
Copy link
Member

We will make a release tomorrow ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

issue: enhancement Issue suggesting an enhancement to an existing feature source: core:content-type-builder Source is core/content-type-builder package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants