@effigies
Contributor

@effigies effigies commented Oct 8, 2025

TemplateFlow depends a lot on import-time initialization, which can slow down any tool that imports it. It also makes the tests extremely contorted as they have to monkey patch global variables and then either reload modules or rerun module import-time behavior.

This PR creates the following classes:

  • templateflow.conf.cache.CacheConfig: a dataclass that will auto-populate in the way the TF_* global config variables did, if no alternatives are passed.
  • templateflow.conf.cache.TemplateFlowCache: A wrapper around a specific cache location that implements ensure (install if missing), update and wipe methods and holds a persistent Layout that is built on first access and deleted after an update or wipe so the next access triggers a rebuild.
  • templateflow.client.TemplateFlowClient: This provides the templateflow.api functions. It simply holds a TemplateFlowCache and uses its layout instead of a global TF_LAYOUT.
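The "built on first access, deleted after an update or wipe" behaviour described for TemplateFlowCache can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `SketchCache`, its `builds` counter, and the dictionary standing in for the layout are all hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SketchCache:
    """Hypothetical stand-in for a cache wrapper (not the TemplateFlow class)."""

    root: str
    # The layout is not built at construction time, only on first access.
    _layout: Optional[dict] = field(default=None, init=False, repr=False)
    # Counter used only to make the lazy behaviour observable in this sketch.
    builds: int = field(default=0, init=False)

    @property
    def layout(self) -> dict:
        # Build on first access, then reuse the same object.
        if self._layout is None:
            self._layout = self._build_layout()
        return self._layout

    def _build_layout(self) -> dict:
        # Placeholder for an expensive filesystem index.
        self.builds += 1
        return {"root": self.root, "build": self.builds}

    def update(self) -> None:
        # A real implementation would fetch new data here (assumption).
        self._layout = None  # drop the layout so the next access rebuilds it

    def wipe(self) -> None:
        # A real implementation would delete cached files here (assumption).
        self._layout = None
```

Constructing the object does no indexing work, which is the property that moves cost out of import time in this kind of design.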

templateflow.cache.__init__ now creates a default TemplateFlowCache and templateflow.api now wraps that in a TemplateFlowClient, and provides its old methods through a module __getattr__. I've left the tests in place to demonstrate that the previously expected behavior remains.
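For readers unfamiliar with module-level `__getattr__` (PEP 562), the delegation pattern mentioned above can be sketched as follows. The module object is built with `types.ModuleType` only so the example fits in one file; `SketchClient`, `sketch_api`, and the path format are hypothetical and do not reflect the actual TemplateFlow code.

```python
import types


class SketchClient:
    """Hypothetical client; stands in for an object holding its own cache."""

    def __init__(self, root: str) -> None:
        self.root = root

    def get(self, template: str) -> str:
        return f"{self.root}/tpl-{template}"


# Stand-in for an `api` module that owns one default client.
api = types.ModuleType("sketch_api")
api._client = SketchClient("/tmp/sketch")


def _module_getattr(name: str):
    # Called only when normal module attribute lookup fails (PEP 562).
    if name.startswith("_"):
        raise AttributeError(name)
    # Forward the lookup to the default client; getattr raises
    # AttributeError for unknown names, as a module lookup would.
    return getattr(api._client, name)


# In a real module this would simply be `def __getattr__(name): ...`
# at the top level of the file.
api.__getattr__ = _module_getattr
```

With this in place, `api.get("X")` resolves to the bound method of the default client, so callers of the old function-style interface need no changes.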

Following @mgxd's review, I've adjusted TemplateFlowClient.__init__ to be the only interface we advertise. A cache and a config object can still be passed as expert options, but there's no good reason for people to use them.


Open questions:

  1. How do you feel about this API and separation of concerns?
  2. What about the TemplateFlowClient.__init__ method? I feel like __init__(self, *, cache=None, **config_kwargs) might be a neater way of doing it, so nobody needs to construct a CacheConfig directly and can instead write TemplateFlowClient(root='/tmp/templateflow', timeout=2).
  3. Should we have a module-level __setattr__ to update templateflow.cache.TF_* constants? I don't think that was a well-supported feature, since many were imported into templateflow.api, so changing them in one place wouldn't change them in the other.
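The concern in question 3, that a constant imported into another module does not follow later changes, can be reproduced in a few lines. This is a sketch with made-up module objects; `sketch_conf`, `sketch_api`, and the paths are illustrative only.

```python
import types

# Stand-in for a config module that defines a global constant.
conf = types.ModuleType("sketch_conf")
conf.TF_HOME = "/data/a"

# Equivalent to `from sketch_conf import TF_HOME` inside another module:
# the name is bound to the current value, not linked to the source module.
api = types.ModuleType("sketch_api")
api.TF_HOME = conf.TF_HOME

# Rebinding the constant in the config module afterwards...
conf.TF_HOME = "/data/b"

# ...leaves the copy in the importing module unchanged.
stale_value = api.TF_HOME
current_value = conf.TF_HOME
```

Here `stale_value` is still the original path while `current_value` is the new one, which is why a module-level `__setattr__` on one module would not by itself keep the two in sync.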

Any other thoughts would be welcome. I want to copy the test_api test battery to test the client directly (using only the new classes), and add docstrings before finalizing this.

@effigies effigies force-pushed the rf/client branch 3 times, most recently from 4ab1bfa to d5f3685 Compare October 8, 2025 18:42
Contributor

@mgxd mgxd left a comment

TemplateFlow depends a lot on import-time initialization, which can slow down any tool that imports it. It also makes the tests extremely contorted as they have to monkey patch global variables and then either reload modules or rerun module import-time behavior.

Agreed, this is a welcome change IMO 👍

How do you feel about this API and separation of concerns?

On board with TemplateFlowCache and TemplateFlowClient, but not sure CacheConfig needs to be public - especially with the suggestions in 2.

What about the TemplateFlowClient.__init__ method?

Yes please, though WDYT about TemplateFlowClient.__init__(self, root: str | None, *, config=None, **config_kwargs) - inspired by BIDSLayout. Then users will only concern themselves with a single import / API. Which parameters should take precedence here? I would assume, from first to last:

  1. config is loaded
  2. kwargs overwrite
  3. root overwrite
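The precedence order proposed above (config first, then keyword overrides, then `root`) could look roughly like this. It is a sketch of the suggestion only: `SketchConfig`, its fields and defaults, and `resolve_config` are hypothetical and are not the merged implementation.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical config; field names and defaults are assumptions."""

    root: str = "/default/templateflow"
    timeout: int = 10
    use_datalad: bool = False


def resolve_config(
    root: Optional[str] = None,
    *,
    config: Optional[SketchConfig] = None,
    **config_kwargs,
) -> SketchConfig:
    # 1. Start from the supplied config, or defaults if none was given.
    resolved = config if config is not None else SketchConfig()
    # 2. Keyword arguments overwrite individual fields
    #    (unknown field names raise TypeError).
    resolved = replace(resolved, **config_kwargs)
    # 3. An explicit root overwrites whatever the earlier steps chose.
    if root is not None:
        resolved = replace(resolved, root=root)
    return resolved
```

Because the config is frozen and `replace` returns a new object, the caller's original config is never mutated, which keeps independent clients from affecting one another.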

Should we have a module-level __setattr__ to update templateflow.cache.TF_* constants? I don't think that was a well-supported feature, since many were imported into templateflow.api, so changing them in one place wouldn't change them in the other.

I don't think so - in fact, isn't this what CacheConfig is essentially addressing?

@effigies effigies force-pushed the rf/client branch 2 times, most recently from 850632c to 8e4eb33 Compare October 19, 2025 16:14
@effigies effigies requested review from mgxd and oesteban October 19, 2025 16:43
@effigies effigies changed the title rf: Factor TemplateFlow into Config, Cache, and Client classes rf: Factor TemplateFlow into Cache and Client classes Oct 19, 2025
Contributor

@mgxd mgxd left a comment

This looks good to me, left a suggestion for the pytest error, though not sure about the docs...

Member

@oesteban oesteban left a comment

Outstanding work, Chris, thanks for this. Thanks also for your effort to keep things all caught up with best practices, new tools and the general landscape.

@oesteban
Member

  • How do you feel about this API and separation of concerns?

I think this work is very much needed, thanks!

  • What about the TemplateFlowClient.__init__ method? I feel like __init__(self, *, cache=None, **config_kwargs) might be a neater way of doing it, so nobody needs to construct a CacheConfig directly and can instead write TemplateFlowClient(root='/tmp/templateflow', timeout=2).

It feels more natural. I wonder whether we should just break backwards compatibility and recommend people to try to create the client with the new API and then fall back to the old api.stuff() model.

  • Should we have a module-level __setattr__ to update templateflow.cache.TF_* constants? I don't think that was a well-supported feature, since many were imported into templateflow.api, so changing them in one place wouldn't change them in the other.

I think it could be useful for people trying to do weird things (e.g., multiple atlases, playing with versions of the Archive, etc.). However, I would leave that for a future PR if people request it.

effigies and others added 2 commits October 20, 2025 17:08
@effigies effigies merged commit 7718e6a into templateflow:master Oct 20, 2025
14 checks passed
@effigies effigies deleted the rf/client branch October 20, 2025 22:21
effigies added a commit that referenced this pull request Oct 21, 2025
25.1.0 (October 21, 2025)

New feature release in the 25.1 series.

This release introduces a new ``TemplateFlowClient`` class that provides the
functionality previously exposed in ``templateflow.api``.
``templateflow.api`` is now a thin wrapper around a global instance of ``TemplateFlowClient``,
so existing code using ``templateflow.api`` should continue to work as before.

These changes allow multiple independent clients to coexist in the same Python process,
as well as defer loading of data from the filesystem until it is requested,
significantly improving import time.

* RF: Factor TemplateFlow into Cache and Client classes (#149)
* FIX: Error on missing S3 files, do not write error data to disk (#148)