Skip to content

[Fixes #14097] Label thesaurus is not reloaded on thesaurus changes#14098

Open
etj wants to merge 3 commits intomasterfrom
14097_i18n_reload
Open

[Fixes #14097] Label thesaurus is not reloaded on thesaurus changes#14098
etj wants to merge 3 commits intomasterfrom
14097_i18n_reload

Conversation

@etj
Copy link
Copy Markdown
Contributor

@etj etj commented Apr 3, 2026

Updates GeoNode's i18n caching logic so that changes to the labels-i18n Thesaurus reliably invalidate previously cached localized label data (and related i18n-derived caches) instead of continuing to serve stale values.

Changes Made

  • Cache invalidation: When the Thesaurus date changes, the entire in-memory lang_cache is cleared (all languages) rather than only invalidating the current (lang, data_key) lookup, ensuring no stale data is served for any language.
  • Thread safety: All lang_cache mutations (set(), clear(), force_check()) are now guarded with _lock, making cache invalidation atomic across concurrent readers and writers. The DB query in set() is kept outside the lock to avoid holding it during I/O.
  • Regression test: Added test_stale_cache_invalidated_on_thesaurus_update in geonode/metadata/tests/test_i18n.py to reproduce the stale-cache scenario — it warms up the cache for multiple languages (en and it), bumps the Thesaurus date, and asserts that all language caches are rebuilt with fresh values.

@etj etj added this to the 5.1.0 milestone Apr 3, 2026
@etj etj requested a review from mattiagiupponi April 3, 2026 16:29
@etj etj self-assigned this Apr 3, 2026
Copilot AI review requested due to automatic review settings April 3, 2026 16:29
@etj etj added the bug label Apr 3, 2026
@cla-bot cla-bot bot added the cla-signed CLA Bot: community license agreement signed label Apr 3, 2026
@etj etj linked an issue Apr 3, 2026 that may be closed by this pull request
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates GeoNode’s i18n caching logic so that changes to the labels-i18n Thesaurus reliably invalidate previously cached localized label data (and related i18n-derived caches) instead of continuing to serve stale values.

Changes:

  • When the Thesaurus date changes, invalidate cached i18n data by clearing the in-memory cache rather than only invalidating the current (lang, data_key) lookup.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 117 to 120
if cached_entry and cached_entry.date != thesaurus_date:
logger.info(f"Cache for {lang}:{data_key} needs to be recreated")
logger.info(f"Cache for {lang}:{data_key} dirty, clearing all caches")
self.lang_cache.clear()
return thesaurus_date, None
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_entry() now clears self.lang_cache under _lock, but set() and clear() mutate self.lang_cache without acquiring the same lock. This creates a real race where one thread can be in setdefault()/write while another thread clears the dict, leading to lost updates and inconsistent cache state. Consider guarding all lang_cache mutations (set(), clear(), force_check()) with _lock (or switching to a thread-safe strategy) so invalidation is atomic across readers/writers.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied in commit a2745f9. All lang_cache mutations (set(), clear(), force_check()) are now guarded with _lock. The DB query in set() is intentionally kept outside the lock to avoid holding it during I/O, with only the actual cache mutations inside the lock context.

Comment on lines 117 to 120
if cached_entry and cached_entry.date != thesaurus_date:
logger.info(f"Cache for {lang}:{data_key} needs to be recreated")
logger.info(f"Cache for {lang}:{data_key} dirty, clearing all caches")
self.lang_cache.clear()
return thesaurus_date, None
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new behavior invalidates the entire i18n cache when the Thesaurus date changes. Given this is a regression fix (and i18nCache is also used for other cached data like metadata schema), it would be good to add a unit test that reproduces the stale-cache scenario and asserts that previously cached values (e.g., schema and/or labels for multiple languages) are rebuilt after a Thesaurus update.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in commit d6add25. The new test test_stale_cache_invalidated_on_thesaurus_update in geonode/metadata/tests/test_i18n.py:

  1. Warms up the cache with labels for both en and it
  2. Updates the label values in the DB and bumps the Thesaurus date (simulating a real thesaurus update)
  3. Forces a freshness check (bypassing CHECK_INTERVAL)
  4. Asserts that both language caches are rebuilt with the new values, not the stale ones

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the cache invalidation logic in geonode/base/i18n.py by clearing the entire language cache when a dirty cache entry is detected. I have no further feedback to provide.

@cla-bot
Copy link
Copy Markdown

cla-bot bot commented Apr 3, 2026

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have the users @Copilot on file. In order for us to review and merge your code, please contact the project maintainers to get yourself added.

@cla-bot cla-bot bot removed the cla-signed CLA Bot: community license agreement signed label Apr 3, 2026
Copilot finished work on behalf of etj April 3, 2026 16:47
@cla-bot
Copy link
Copy Markdown

cla-bot bot commented Apr 3, 2026

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have the users @Copilot on file. In order for us to review and merge your code, please contact the project maintainers to get yourself added.

Copilot finished work on behalf of etj April 3, 2026 16:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Label thesaurus is not reloaded on thesaurus changes

3 participants