Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performing the has:plural lookup is quite slow #8719

Closed
2 tasks done
Geeyun-JY3 opened this issue Feb 1, 2023 · 7 comments · Fixed by #8840
Closed
2 tasks done

Performing the has:plural lookup is quite slow #8719

Geeyun-JY3 opened this issue Feb 1, 2023 · 7 comments · Fixed by #8840
Assignees
Labels
bug Something is broken.
Milestone

Comments

@Geeyun-JY3
Copy link
Collaborator

Geeyun-JY3 commented Feb 1, 2023

Describe the issue

It takes quite a long time to perform the has:plural lookup at the language of a project level.
I cannot even load the page while performing the has:plural lookup at the project level and the language of an instance level.

I already tried

  • I've read and searched the documentation.
  • I've searched for similar issues in this repository.

Steps to reproduce the behavior

  1. Go to search tab of a language of a project / a project / a language of an instance
  2. Enter has:plural
  3. Click on the Search button
  4. Cannot load the page

Expected behavior

Everyone can open the corresponding page.

Screenshots

No response

Exception traceback

No response

How do you run Weblate?

weblate.org service

Weblate versions

Weblate 4.16-dev (weblate-4.15.2-209-gde41682274)

Weblate deploy checks

No response

Additional context

Example:

@Geeyun-JY3 Geeyun-JY3 changed the title Performing has:plural lookup is quite slow Performing the has:plural lookup is quite slow Feb 1, 2023
@nijel
Copy link
Member

nijel commented Feb 2, 2023

Confirmed on my test instance as well. Most likely this was caused by 50c5258 - the pg_trgm is great for text search, but performs poorly on whitespace search (what has:plural translates to) as it has to recheck every string. We need to have a fallback operator for whitespace only searches here...

Related: https://stackoverflow.com/questions/57683799/postgresql-using-pg-trgm-slower-then-full-scan

@nijel nijel added this to the 4.16 milestone Feb 2, 2023
@nijel nijel added the bug Something is broken. label Feb 2, 2023
@nijel nijel self-assigned this Feb 2, 2023
@nijel nijel closed this as completed in f190eef Feb 3, 2023
@github-actions
Copy link

github-actions bot commented Feb 3, 2023

Thank you for your report; the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.

@Geeyun-JY3
Copy link
Collaborator Author

@nijel Performing a source:r"(^|\W)ID(\W|$)" lookup in the same level is quite slow too. Is it a similar bug?

@nijel
Copy link
Member

nijel commented Feb 11, 2023

Probably yes, though this one should not be that bad...

@Geeyun-JY3
Copy link
Collaborator Author

Geeyun-JY3 commented Feb 11, 2023

My primary use case is searching at the language of a project level. It still takes me several minutes to load the results listing page. I think this is unacceptable.
I think all the searching results listing page should be loaded in less than 1 minute.
While using this regular expression, it seems that if the specific word I want to search is less than three letters, performing this lookup would be quite slow.
Should I open an issue for it?

@nijel
Copy link
Member

nijel commented Feb 27, 2023

Yes, the string length is the key here (or actually the length of the alphanumeric part of the string). Unfortunately, getting that info from regular expression is not that straightforward, but I will look into options…

Weblate uses PostgreSQL trigram index to optimize these. They work great for 3+ alphanumeric chars, but horribly for shorter ones. Unfortunately, the PostgreSQL optimizer doesn't deal with this, and we work around this in Weblate for the strings by adding string concatenation to the field when the string is too short and would perform badly. Adding operation on the field stops PostgreSQL from using index, and resorts to full table scan, which is faster in this case.

nijel added a commit to nijel/weblate that referenced this issue Feb 27, 2023
This should improve performance of the short regex expression lookups.

Fixes WeblateOrg#8719
@nijel
Copy link
Member

nijel commented Feb 27, 2023

#8840 should address this.

nijel added a commit that referenced this issue Feb 27, 2023
This should improve performance of the short regex expression lookups.

Fixes #8719
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is broken.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants