Fix problematic behavior of optimizer/scheduler in FeatureInversionTask by ganow · Pull Request #101 · KamitaniLab/bdpy

ganow · 2025-02-18T09:44:30Z

Problem

Current implementation of FeatureInversionTask has several limitations/problems in the use of optimizer/scheduler. Here are the concrete examples:

Initialization using param_groups works only one time

optimizer = optim.SGD([
  {"params": latent.parameters(), "lr": latent_lr},
  {"params": generator.parameters(), "lr" generator_lr},
], lr=base_lr)
task = FeatureInversionTask(encoder, generator, latent, critic, optimizer, num_iterations=100)
reconstructed = task(target_features)  # uses latent_lr for a latent module and generator_lr for a generator module
task.reset_states()
reconstructed_2 = task(target_features)  # uses base_lr for both latent module and generator module

Cannot use a learning rate scheduler

optimizer = optim.SGD(latent.parameters(), lr=0.01)
scheduler = ExponentialLR(optimizer, gamma=0.9)
task = FeatureInversionTask(encoder, generator, latent, critic, optimizer, scheduler=scheduler, num_iterations=100)
task.reset_states()
reconstructed = task(target_features)  # learning rate scheduler does not work properly

Cause

The cause of the problem is in the implementation of reset_states():

bdpy/bdpy/recon/torch/task/inversion.py

Lines 216 to 226 in 9ffe7bc

    
           def reset_states(self) -> None: 
        
               """Reset the state of the task.""" 
        
               self._generator.reset_states() 
        
               self._latent.reset_states() 
        
               self._optimizer = self._optimizer.__class__( 
        
                   chain( 
        
                       self._generator.parameters(), 
        
                       self._latent.parameters(), 
        
                   ), 
        
                   **self._optimizer.defaults, 
        
               )

Originally this method was implemented based on the following assumptions:

We can dynamically provide a consistent way of re-initializing optimizers only from their instances (L220-226)
Learning rate schedulers are not needed to be re-initialized

In reality, neither of these assumptions were true. In addition, since optimizers generally have dependencies on generator and latent instances, and learning rate schedulers have dependencies on optimizer instances, when any of these dependencies are re-instantiated, the references need to be replaced accordingly.

Solution

Instead of receiving the instances of the optimizer and learning rate scheduler themselves, FeatureInversionTask receives the factory method for creating instances. Following is the example use of the newly designed API:

encoder = build_encoder(...)
generator = build_generator(...)
latent = ArbitraryLatent(...)
critic = TargetNormalizedMSE(...)

optimizer_factory = build_optimizer_factory(
    SGD,
    get_params_fn=lambda generator, latent: [
        {"params": latent.parameters(), "lr": latent_lr},
        {"params": generator.parameters(), "lr" generator_lr},
    ],
    lr=base_lr, momentum=0.9
)
scheduler_factory = build_scheduler_factory(ExponentialLR, gamma=0.9)

task = FeatureInversionTask(
    encoder, generator, latent, critic,
    optimizer_factory, scheduler_factory,
    num_iterations=100,
)
reconstructed = task(target_features)

...
## In reset_states() of FeatureInversionTask
optimizer = optimizer_factory(generator, latent)
scheduler = scheduler_factory(optimizer)

Breaking changes in API

FeatureInversionTask takes optimizer_factory: (BaseGenerator, BaseLatent) -> Optimizer instead of optimizer: Optimizer as an input argument
FeatureInversionTask takes scheduler_factory: Optimizer -> LRScheduler instead of scheduler: LRScheduler as an input argument

…er factory

github-actions · 2025-02-18T09:46:28Z

Tests	Skipped	Failures	Errors	Time
109	0 💤	0 ❌	1 🔥	11.401s ⏱️

KenyaOtsuka

Looks good to me.

…d clarity

ganow · 2025-02-20T09:34:48Z

Based on Otsuka-san's suggestion, I have revised the type definitions as follows.

   build_optimizer_factory: (type[Optimizer], _GetParamsFnType) -> _OptimizerFactoryType
-  _GetParamsFnType: TypeAlias = (BaseGenerator, BaseLatent) -> Iterator[Parameter]
+  _GetParamsFnType: TypeAlias = (BaseGenerator, BaseLatent) -> _ParamsT
   _OptimizerFactoryType: TypeAlias = (BaseGenerator, BaseLatent) -> Optimizer
+  _ParamsT: TypeAlias = Iterable[Tensor] | Iterable[Dict[str, Any]] | Iterable[Tuple[str, Tensor]]

Reasons behind this modification

Previous type annotations were not compatible with the use of the build_optimizer_factory like the following:

optimizer_factory = build_optimizer_factory(
    SGD,
    get_params_fn=lambda generator, latent: [
        {"params": latent.parameters(), "lr": latent_lr},
        {"params": generator.parameters(), "lr" generator_lr},
    ],
    lr=base_lr, momentum=0.9
)

The get_params_fn will return the list of the dictionary while previous our type annotation assumes the return type should be the iterable of torch.nn.Parameter. So we aligned the type annotation with the one defined in pytorch.

Why we redefined the same concept in our codebase instead of just importing it from PyTorch?

I decided to define the _ParamsT in our codebase instead of importing the same concept in PyTorch.

bdpy/bdpy/recon/torch/modules/optimizer.py

Lines 14 to 18 in c23afe7

    
           # NOTE: The definition of `_ParamsT` is the same as in `torch.optim.optimizer` 
        
           #       in torch>=2.2.0. We define it here for compatibility with older versions. 
        
           _ParamsT: TypeAlias = Union[ 
        
               Iterable[Tensor], Iterable[Dict[str, Any]], Iterable[Tuple[str, Tensor]] 
        
           ]

The name of this type was _params_t before PyTorch v2.2.0, but the name of this concept has been changed as ParamsT in v2.2.0. So I decided to redefine this because I did not find a consistent way of importing this concept independent from the PyTorch version.

Note on the type definition of `_ParamsT`

In PyTorch, the optimizer can now accept named parameters from v2.6.0. In line with this, the definition of ParamsT has been changed in PyTorch as follows in v2.6.0 and later.

-  ParamsT: TypeAlias = Iterable[Tensor] | Iterable[Dict[str, Any]]]
+  ParamsT: TypeAlias = Iterable[Tensor] | Iterable[Dict[str, Any]] | Iterable[Tuple[str, Tensor]]

This PR uses a type definition that reflects this latest change. Therefore, such an example code will pass the type checking but will raise an exception at runtime if PyTorch earlier than v2.6.0 is installed in the environment:

def get_params_fn(generator: BaseGenerator, latent: BaseLatent):
    return chain(generator.named_parameters(), latent.named_parameters())

optimizer_factory = build_optimizer_factory(SGD, get_params_fn=get_params_fn, lr=0.1)

Therefore, if we use the code base associated with this change in an environment where PyTorch earlier than v2.6.0 is installed. As I couldn't think of a consistent type definition idea that appropriately works on different PyTroch versions, I adopted this implementation as a tentative decision.

ganow added 2 commits February 18, 2025 18:37

Add optimizer and scheduler factory functions to the optimizer module

64e075f

Refactor optimizer initialization in test_inversion.py to use optimiz…

0fb8f38

…er factory

ganow added the bug label Feb 18, 2025

ganow requested review from KenyaOtsuka and ShuntaroAoki February 18, 2025 10:21

KenyaOtsuka approved these changes Feb 19, 2025

View reviewed changes

ganow marked this pull request as ready for review February 19, 2025 04:19

Refactor type annotations in optimizer.py for better compatibility an…

c23afe7

…d clarity

ganow requested a review from kencan7749 February 27, 2025 05:10

ShuntaroAoki merged commit d86199f into dev Mar 27, 2025
0 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix problematic behavior of optimizer/scheduler in FeatureInversionTask#101

Fix problematic behavior of optimizer/scheduler in FeatureInversionTask#101
ShuntaroAoki merged 3 commits intodevfrom
fix-scheduler-behavior

ganow commented Feb 18, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Feb 18, 2025 •

edited

Loading

Uh oh!

KenyaOtsuka left a comment

Uh oh!

ganow commented Feb 20, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	def reset_states(self) -> None:
	"""Reset the state of the task."""
	self._generator.reset_states()
	self._latent.reset_states()
	self._optimizer = self._optimizer.__class__(
	chain(
	self._generator.parameters(),
	self._latent.parameters(),
	),
	**self._optimizer.defaults,
	)

Conversation

ganow commented Feb 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Initialization using param_groups works only one time

Cannot use a learning rate scheduler

Cause

Solution

Breaking changes in API

Uh oh!

github-actions bot commented Feb 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

KenyaOtsuka left a comment

Choose a reason for hiding this comment

Uh oh!

ganow commented Feb 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reasons behind this modification

Why we redefined the same concept in our codebase instead of just importing it from PyTorch?

Note on the type definition of _ParamsT

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ganow commented Feb 18, 2025 •

edited

Loading

github-actions bot commented Feb 18, 2025 •

edited

Loading

ganow commented Feb 20, 2025 •

edited

Loading

Note on the type definition of `_ParamsT`