Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: add option to warm-up country import cache #1514

Closed
eliasdorneles opened this issue Oct 13, 2023 · 4 comments
Closed

Feature request: add option to warm-up country import cache #1514

eliasdorneles opened this issue Oct 13, 2023 · 4 comments
Assignees

Comments

@eliasdorneles
Copy link

Hello,

Thanks for this library, nice work!

I'm using it in a web application that receives lots of requests, and recently had to debug an issue of thread contention caused by the importlib.import_module here:

self.entity = getattr(importlib.import_module(self.module_name), self.entity_name)

The threads were getting stuck trying to acquire the import lock, the call stack (obtained with py-spy) looked like this:

Thread 1525 (active): "AnyIO worker thread"
    _get_module_lock (<frozen importlib._bootstrap>:185)
    __enter__ (<frozen importlib._bootstrap>:170)
    _find_and_load (<frozen importlib._bootstrap>:1173)
    _gcd_import (<frozen importlib._bootstrap>:1204)
    import_module (importlib/__init__.py:126)
    get_entity (holidays/registry.py:211)
    __call__ (holidays/registry.py:193)
    country_holidays (holidays/utils.py:187)

The fix for it was to call country_holidays in my code at import time for every country i care about, thus avoiding the importlib.import_module from being called from inside the request lifecycle as the entity attribute will be already loaded.

What do you think of adding an option to do that by default?

For example, something like:

if os.getenv('PYTHON_HOLIDAYS_CACHE_WARMUP_ENABLED') == 'true':
    # preload it here
@arkid15r
Copy link
Collaborator

Hi @eliasdorneles,
I know we don't have an issue template yet so I blame myself only for the fact I can't figure out what exactly caused the issue and what the preferred fix would be. In general, I'd avoid introducing env variable dependency but the PH code is totally open for contributions that fix existing users headaches.

Having said that, I'd like to request some clarification on this (multi-threading?) issue. It'd be ideal to have a PR (draft is okay) to understand the problem and assess the fix implementation approach.

Thank you!

@arkid15r arkid15r self-assigned this Oct 23, 2023
@eliasdorneles
Copy link
Author

Hello,

Right, I understand, I will try to get a minimal example that shows the problem.
I believe it's an issue related to importlib.import_module not being thread-safe.

The web application I mentioned (which I unfortunately cannot share the code as-is) is a FastAPI app, with endpoints defined as "sync" (using regular def and not async def), so the requests are served by anyio worker threads.

I will come back with more details later!

Izzette added a commit to Izzette/python-holidays that referenced this issue May 16, 2024
@Izzette
Copy link
Contributor

Izzette commented May 16, 2024

@arkid15r I believe that this PR should fix the problem, #1794. I'll leave it in draft until I can confirm it works and characterize the potential perf impact.

Izzette added a commit to Izzette/python-holidays that referenced this issue May 22, 2024
@arkid15r
Copy link
Collaborator

Fixed by #1794

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants