Freezing in certain strings #8668
Comments
Just tried importing the referenced PO file to my test instance, and it works just fine.
This typically indicates repository corruption. Was the server forcibly restarted? Maybe the disk is faulty? That could break some other things as well.
This issue looks more like a support question than an issue. We strive to answer these reasonably fast, but purchasing the support subscription is not only more responsible and faster for your business but also makes Weblate stronger. In case your question is already answered, making a donation is the right way to say thank you!
I just updated the source files as usual: I update this way because it takes many hours to finish; there are too many components. When updating through the web interface, I received a few timeouts in the past. Some updates/merges also got stuck in the middle of the process, but strangely, those strings had been working until now. I restarted everything and accessed this string again, and I didn't receive any error, but Postgres keeps running at 100% CPU.
SELECT pid, now() - pg_stat_activity.query_start AS duration, query, state
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
82 | 00:17:39.792607 | SELECT "trans_unit"."id", "trans_unit"."translation_id", "trans_unit"."id_hash", "trans_unit"."location", "trans_unit"."context", "trans_unit"."note", "trans_unit"."flags", "trans_unit"."source", "trans_unit"."previous_source", "trans_unit"."target", "trans_unit"."state", "trans_unit"."original_state", "trans_unit"."details", "trans_unit"."position", "trans_unit"."num_words", "trans_unit"."priority", "trans_unit"."pending", "trans_unit"."timestamp", "trans_unit"."extra_flags", "trans_unit"."explanation", "trans_unit"."variant_id", "trans_unit"."source_unit_id", CASE WHEN ("trans_unit"."source" = '''''''' AND "trans_unit"."context" = '') THEN 1 ELSE 0 END AS "matches_current", "trans_translation"."id", "trans_translation"."component_id", "trans_translation"."language_id", "trans_translation"."plural_id", "trans_translation"."revision", "trans_translation"."filename", "trans_translation"."language_code", "trans_translation"."check_flags", "trans_component"."id", "trans_component"."name", "trans_component"."slu | active
(1 row)
explain SELECT "trans_unit"."id", "trans_unit"."translation_id", "trans_unit"."id_hash", "trans_unit"."location", "trans_unit"."context", "trans_unit"."note", "trans_unit"."flags", "trans_unit"."source", "trans_unit"."previous_source", "trans_unit"."target", "trans_unit"."state", "trans_unit"."original_state", "trans_unit"."details", "trans_unit"."position", "trans_unit"."num_words", "trans_unit"."priority", "trans_unit"."pending", "trans_unit"."timestamp", "trans_unit"."extra_flags", "trans_unit"."explanation", "trans_unit"."variant_id", "trans_unit"."source_unit_id", CASE WHEN ("trans_unit"."source" = '''''''' AND "trans_unit"."context" = '') THEN 1 ELSE 0 END AS "matches_current", "trans_translation"."id", "trans_translation"."component_id", "trans_translation"."language_id", "trans_translation"."plural_id", "trans_translation"."revision", "trans_translation"."filename", "trans_translation"."language_code", "trans_translation"."check_flags", "trans_component"."id", "trans_component"."name", "trans_component"."slug", "trans_component"."project_id", "trans_component"."vcs", "trans_component"."repo", "trans_component"."linked_component_id", "trans_component"."push", "trans_component"."repoweb", "trans_component"."git_export", "trans_component"."report_source_bugs", "trans_component"."branch", "trans_component"."push_branch", "trans_component"."filemask", "trans_component"."template", "trans_component"."edit_template", "trans_component"."intermediate", "trans_component"."new_base", "trans_component"."file_format", "trans_component"."locked", "trans_component"."allow_translation_propagation", "trans_component"."enable_suggestions", "trans_component"."suggestion_voting", "trans_component"."suggestion_autoaccept", "trans_component"."check_flags", "trans_component"."enforced_checks", "trans_component"."license", "trans_component"."agreement", "trans_component"."new_lang", "trans_component"."language_code_style", "trans_component"."manage_units", 
"trans_component"."merge_style", "trans_component"."commit_message", "trans_component"."add_message", "trans_component"."delete_message", "trans_component"."merge_message", "trans_component"."addon_message", "trans_component"."pull_message", "trans_component"."push_on_commit", "trans_component"."commit_pending_age", "trans_component"."auto_lock_error", "trans_component"."source_language_id", "trans_component"."language_regex", "trans_component"."variant_regex", "trans_component"."priority", "trans_component"."restricted", "trans_component"."is_glossary", "trans_component"."glossary_color", "trans_component"."remote_revision", "trans_component"."local_revision", "trans_project"."id", "trans_project"."name", "trans_project"."slug", "trans_project"."web", "trans_project"."instructions", "trans_project"."set_language_team", "trans_project"."use_shared_tm", "trans_project"."contribute_shared_tm", "trans_project"."access_control", "trans_project"."translation_review", "trans_project"."source_review", "trans_project"."enable_hooks", "trans_project"."language_aliases", "trans_project"."machinery_settings", T6."id", T6."code", T6."name", T6."direction", T6."population", "lang_language"."id", "lang_language"."code", "lang_language"."name", "lang_language"."direction", "lang_language"."population", "lang_plural"."id", "lang_plural"."source", "lang_plural"."number", "lang_plural"."formula", "lang_plural"."type", "lang_plural"."language_id" FROM "trans_unit" INNER JOIN "trans_translation" ON ("trans_unit"."translation_id" = "trans_translation"."id") INNER JOIN "trans_component" ON ("trans_translation"."component_id" = "trans_component"."id") INNER JOIN "trans_project" ON ("trans_component"."project_id" = "trans_project"."id") INNER JOIN "lang_language" ON ("trans_translation"."language_id" = "lang_language"."id") INNER JOIN "lang_language" T6 ON ("trans_component"."source_language_id" = T6."id") INNER JOIN "lang_plural" ON ("trans_translation"."plural_id" = "lang_plural"."id") 
WHERE ("trans_unit"."source" ILIKE '''''''' AND "trans_component"."project_id" = 5 AND "trans_translation"."language_id" = 246) ORDER BY "matches_current" DESC LIMIT 20 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=29024.20..29024.25 rows=20 width=2601)
-> Sort (cost=29024.20..29024.29 rows=33 width=2601)
Sort Key: (CASE WHEN ((trans_unit.source = ''''''''::text) AND (trans_unit.context = ''::text)) THEN 1 ELSE 0 END) DESC
-> Nested Loop (cost=1070.34..29023.37 rows=33 width=2601)
-> Nested Loop (cost=1070.07..29011.35 rows=33 width=2549)
-> Bitmap Heap Scan on lang_language (cost=8.41..12.42 rows=1 width=30)
Recheck Cond: (id = 246)
-> Bitmap Index Scan on lang_language_pkey (cost=0.00..8.41 rows=1 width=0)
Index Cond: (id = 246)
-> Nested Loop (cost=1061.66..28998.60 rows=33 width=2519)
-> Index Scan using trans_project_pkey on trans_project (cost=0.14..8.16 rows=1 width=804)
Index Cond: (id = 5)
-> Nested Loop (cost=1061.52..28990.11 rows=33 width=1715)
-> Hash Join (cost=900.27..1059.09 rows=169 width=1421)
Hash Cond: (trans_translation.component_id = trans_component.id)
-> Seq Scan on trans_translation (cost=0.00..154.89 rows=598 width=103)
Filter: (language_id = 246)
-> Hash (cost=898.14..898.14 rows=170 width=1318)
-> Merge Join (cost=889.08..898.14 rows=170 width=1318)
Merge Cond: (t6.id = trans_component.source_language_id)
-> Index Scan using lang_language_pkey on lang_language t6 (cost=0.40..4024.35 rows=618 width=30)
-> Sort (cost=406.66..407.09 rows=170 width=1288)
Sort Key: trans_component.source_language_id
-> Bitmap Heap Scan on trans_component (cost=13.59..400.36 rows=170 width=1288)
Recheck Cond: (project_id = 5)
-> Bitmap Index Scan on trans_component_project_id_04a8b52c (cost=0.00..13.55 rows=170 width=0)
Index Cond: (project_id = 5)
-> Bitmap Heap Scan on trans_unit (cost=161.25..165.26 rows=1 width=294)
Recheck Cond: ((source ~~* ''''''''::text) AND (translation_id = trans_translation.id))
-> Bitmap Index Scan on unit_source_fulltext (cost=0.00..161.25 rows=1 width=0)
Index Cond: ((source ~~* ''''''''::text) AND (translation_id = trans_translation.id))
-> Index Scan using lang_plural_pkey on lang_plural (cost=0.28..0.36 rows=1 width=48)
Index Cond: (id = trans_translation.plural_id)
(33 rows)
Thanks for the analysis. Which PostgreSQL version do you use?
still using PostgreSQL
Okay, we've benchmarked this on 13, where it typically showed a significant gain. When I try it now, the query planner doesn't use the trigram index, so the performance will be about the same as with the original lookups. The difference is that besides that we're only on the index scans:
Maybe using ILIKE in this case is not as reasonable as we thought...
I've just upgraded PostgreSQL to 15.1, and it is still taking a long time.
explain SELECT "trans_unit"."id", "trans_unit"."translation_id", "trans_unit"."id_hash", "trans_unit"."location", "trans_unit"."context", "trans_unit"."note", "trans_unit"."flags", "trans_unit"."source", "trans_unit"."previous_source", "trans_unit"."target", "trans_unit"."state", "trans_unit"."original_state", "trans_unit"."details", "trans_unit"."position", "trans_unit"."num_words", "trans_unit"."priority", "trans_unit"."pending", "trans_unit"."timestamp", "trans_unit"."extra_flags", "trans_unit"."explanation", "trans_unit"."variant_id", "trans_unit"."source_unit_id", CASE WHEN ("trans_unit"."source" = '''''''' AND "trans_unit"."context" = '') THEN 1 ELSE 0 END AS "matches_current", "trans_translation"."id", "trans_translation"."component_id", "trans_translation"."language_id", "trans_translation"."plural_id", "trans_translation"."revision", "trans_translation"."filename", "trans_translation"."language_code", "trans_translation"."check_flags", "trans_component"."id", "trans_component"."name", "trans_component"."slug", "trans_component"."project_id", "trans_component"."vcs", "trans_component"."repo", "trans_component"."linked_component_id", "trans_component"."push", "trans_component"."repoweb", "trans_component"."git_export", "trans_component"."report_source_bugs", "trans_component"."branch", "trans_component"."push_branch", "trans_component"."filemask", "trans_component"."template", "trans_component"."edit_template", "trans_component"."intermediate", "trans_component"."new_base", "trans_component"."file_format", "trans_component"."locked", "trans_component"."allow_translation_propagation", "trans_component"."enable_suggestions", "trans_component"."suggestion_voting", "trans_component"."suggestion_autoaccept", "trans_component"."check_flags", "trans_component"."enforced_checks", "trans_component"."license", "trans_component"."agreement", "trans_component"."new_lang", 
"trans_component"."language_code_style", "trans_component"."manage_units", "trans_component"."merge_style", "trans_component"."commit_message", "trans_component"."add_message", "trans_component"."delete_message", "trans_component"."merge_message", "trans_component"."addon_message", "trans_component"."pull_message", "trans_component"."push_on_commit", "trans_component"."commit_pending_age", "trans_component"."auto_lock_error", "trans_component"."source_language_id", "trans_component"."language_regex", "trans_component"."variant_regex", "trans_component"."priority", "trans_component"."restricted", "trans_component"."is_glossary", "trans_component"."glossary_color", "trans_component"."remote_revision", "trans_component"."local_revision", "trans_project"."id", "trans_project"."name", "trans_project"."slug", "trans_project"."web", "trans_project"."instructions", "trans_project"."set_language_team", "trans_project"."use_shared_tm", "trans_project"."contribute_shared_tm", "trans_project"."access_control", "trans_project"."translation_review", "trans_project"."source_review", "trans_project"."enable_hooks", "trans_project"."language_aliases", "trans_project"."machinery_settings", T6."id", T6."code", T6."name", T6."direction", T6."population", "lang_language"."id", "lang_language"."code", "lang_language"."name", "lang_language"."direction", "lang_language"."population", "lang_plural"."id", "lang_plural"."source", "lang_plural"."number", "lang_plural"."formula", "lang_plural"."type", "lang_plural"."language_id" FROM "trans_unit" INNER JOIN "trans_translation" ON ("trans_unit"."translation_id" = "trans_translation"."id") INNER JOIN "trans_component" ON ("trans_translation"."component_id" = "trans_component"."id") INNER JOIN "trans_project" ON ("trans_component"."project_id" = "trans_project"."id") INNER JOIN "lang_language" ON ("trans_translation"."language_id" = "lang_language"."id") INNER JOIN "lang_language" T6 ON ("trans_component"."source_language_id" = T6."id") INNER 
JOIN "lang_plural" ON ("trans_translation"."plural_id" = "lang_plural"."id") WHERE ("trans_unit"."source" ILIKE '''''''' AND "trans_component"."project_id" = 5 AND "trans_translation"."language_id" = 246) ORDER BY "matches_current" DESC LIMIT 20 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=20438.91..20438.96 rows=20 width=2586)
-> Sort (cost=20438.91..20438.97 rows=22 width=2586)
Sort Key: (CASE WHEN ((trans_unit.source = ''''''''::text) AND (trans_unit.context = ''::text)) THEN 1 ELSE 0 END) DESC
-> Nested Loop (cost=208.21..20438.42 rows=22 width=2586)
-> Nested Loop (cost=207.94..20430.41 rows=22 width=2551)
-> Nested Loop (cost=207.66..20418.79 rows=22 width=2520)
-> Index Scan using lang_language_pkey on lang_language (cost=0.28..8.29 rows=1 width=31)
Index Cond: (id = 246)
-> Nested Loop (cost=207.39..20410.28 rows=22 width=2489)
-> Seq Scan on trans_project (cost=0.00..1.04 rows=1 width=804)
Filter: (id = 5)
-> Nested Loop (cost=207.39..20409.02 rows=22 width=1685)
-> Hash Join (cost=91.86..205.82 rows=169 width=1391)
Hash Cond: (trans_component.id = trans_translation.component_id)
-> Seq Scan on trans_component (cost=0.00..113.51 rows=170 width=1288)
Filter: (project_id = 5)
-> Hash (cost=84.39..84.39 rows=598 width=103)
-> Bitmap Heap Scan on trans_translation (cost=8.91..84.39 rows=598 width=103)
Recheck Cond: (language_id = 246)
-> Bitmap Index Scan on trans_translation_language_id_030f0b30 (cost=0.00..8.76 rows=598 width=0)
Index Cond: (language_id = 246)
-> Bitmap Heap Scan on trans_unit (cost=115.52..119.54 rows=1 width=294)
Recheck Cond: ((source ~~* ''''''''::text) AND (translation_id = trans_translation.id))
-> Bitmap Index Scan on unit_source_fulltext (cost=0.00..115.52 rows=1 width=0)
Index Cond: ((source ~~* ''''''''::text) AND (translation_id = trans_translation.id))
-> Index Scan using lang_language_pkey on lang_language t6 (cost=0.28..0.53 rows=1 width=31)
Index Cond: (id = trans_component.source_language_id)
-> Index Scan using lang_plural_pkey on lang_plural (cost=0.28..0.36 rows=1 width=31)
Index Cond: (id = trans_translation.plural_id)
(29 rows)
Thanks!
Okay, so it ends up using the index, but is still slow? Does reverting 8170d1d fix the performance? Is searching slow for you as well? It should utilize the same index.
Yes, reverting that commit fixes the performance.
Okay, the problem is that pg_trgm ignores non-word characters (non-alphanumerics) when extracting trigrams from a string. That leads to
My guess is that searching for such strings will be problematic as well.
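To illustrate the point about non-word characters, here is a simplified Python approximation of trigram extraction. It is only a sketch and not pg_trgm's actual implementation: the function name `extract_trigrams` is made up for this example, and the padding rule (two spaces in front of each word, one behind) follows the pg_trgm documentation. The SQL literal `''''''''` in the queries above denotes a string of three single quotes.

```python
import re


def extract_trigrams(text: str) -> set[str]:
    """Approximate pg_trgm trigram extraction for a plain string (sketch)."""
    trigrams: set[str] = set()
    # Keep only runs of letters and digits; everything else acts as a separator.
    for word in re.findall(r"[^\W_]+", text.lower()):
        # Documented padding: two spaces in front of the word, one behind it.
        padded = f"  {word} "
        trigrams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return trigrams


print(sorted(extract_trigrams("cat")))  # ['  c', ' ca', 'at ', 'cat']
print(extract_trigrams("'''"))          # set()
```

A string with no alphanumeric characters yields no trigrams at all, so under this model the index has nothing to filter on and every candidate row has to be rechecked against the pattern.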
Use iexact and icontains for strings that pg_trgm doesn't handle well. pg_trgm looks only at alphanumeric characters, and if there are none, all strings match the index, causing a huge penalty during the recheck at the next step. Fixes WeblateOrg#8668
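The selection logic this commit message describes could look roughly like the sketch below. This is an assumption-laden illustration, not Weblate's actual code: `TRIGRAM_EXACT` and `TRIGRAM_CONTAINS` are hypothetical placeholder names for trigram-backed lookups, and `choose_lookups` is a made-up helper. Only `iexact` and `icontains` are taken from the message itself (they are standard Django lookups).

```python
# Hypothetical placeholder names for trigram-index-backed lookups (assumption).
TRIGRAM_EXACT = "trigram_exact"
TRIGRAM_CONTAINS = "trigram_contains"


def choose_lookups(text: str) -> tuple[str, str]:
    """Return (exact_lookup, substring_lookup) names for a query string."""
    # str.isalnum() is False for quotes, punctuation, underscores and spaces.
    if any(ch.isalnum() for ch in text):
        # The string contains something a trigram index can work with.
        return (TRIGRAM_EXACT, TRIGRAM_CONTAINS)
    # Nothing alphanumeric: fall back to the plain Django lookups.
    return ("iexact", "icontains")


print(choose_lookups("'''"))     # ('iexact', 'icontains')
print(choose_lookups("serial"))  # ('trigram_exact', 'trigram_contains')
```

The chosen name would then be combined with a field name into a filter keyword such as `source__iexact`.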
Can you please check whether #8675 fixes the issue for you?
Yes, I've tested it, and it works perfectly, both in translation/edit and when searching.
Thank you for your report; the issue you have reported has just been fixed.
Describe the issue
Hi.
We just updated Weblate to 4.15.1, and our instance hangs when we access certain strings: weird strings (that should be ignored when creating the PO files, I know).
Instance logs sometimes show this:
This is an example of a string that causes the issue:
https://github.com/freebsd/freebsd-doc-translate/blob/main/documentation/content/es/articles/serial-uart/_index.po#L38-L52
Postgres gets stuck in SELECTs.
Do you know if there is something we can do here?
Regards.
I already tried
Steps to reproduce the behavior
Go to any string like this:
Expected behavior
No response
Screenshots
No response
Exception traceback
How do you run Weblate?
weblate.org service
Weblate versions
4.15.1
We have updated Docker containers from 4.10.1.
Weblate deploy checks
No response
Additional context
No response