
Conversation

@stmcgovern
Contributor

@stmcgovern stmcgovern commented Aug 29, 2025

Fixes #141884

This fixes the issue for all optimizers and parameter options.
A member function overwrite_from is added to the optimizer base class. Each optimizer then implements this function to compare its accepted parameters against its defaults. An SFINAE approach that handles the different optimizer parameters generically (in optimizer.h only) was evaluated, but I think this version is easier to review and maintain.

This mirrors the Python API up to one edge case. An example of the edge case is provided below.

Python can distinguish between (1) a key not present in the dict, meaning "not specified", and (2) a key present in the dict, meaning "explicitly set". The C++ implementation cannot.
The issue hinges on whether to track if a particular parameter was explicitly set by the user (the discrepancy arises when a constructor default is explicitly passed in).

Tracking this seems like it would take more intervention than it is worth (modifying TORCH_ARG to keep track, using std::optional for the parameter types, or bitset tracking) and was not pursued in the current PR. I'm happy to alter the design if appropriate.
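For illustration, here is a minimal sketch of the comparison-based merge idea, written as a free function over AdamOptions (the function name and exact logic are assumptions for illustration, not the PR's actual implementation):

```cpp
#include <torch/torch.h>

// Sketch only: a field still equal to its AdamOptions() constructor default is
// treated as "not specified by the user" and inherits the optimizer-level
// default; any other value is kept as the user's explicit choice.
void overwrite_from_defaults(torch::optim::AdamOptions& target,
                             const torch::optim::AdamOptions& optimizer_defaults) {
  const torch::optim::AdamOptions ctor_defaults{};  // constructor defaults
  if (target.lr() == ctor_defaults.lr()) {
    target.lr(optimizer_defaults.lr());
  }
  if (target.weight_decay() == ctor_defaults.weight_decay()) {
    target.weight_decay(optimizer_defaults.weight_decay());
  }
  if (target.eps() == ctor_defaults.eps()) {
    target.eps(optimizer_defaults.eps());
  }
}
```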

Example of edge case hinging on CONSTRUCTOR DEFAULTS vs OPTIMIZER DEFAULTS

  1. CONSTRUCTOR DEFAULTS:
    These are the values you get when calling AdamOptions()
    AdamOptions().lr() = 0.001
    AdamOptions().weight_decay() = 0
    AdamOptions().eps() = 1e-08

  2. OPTIMIZER DEFAULTS:
    These are the values the user chose when creating the optimizer
    User's optimizer defaults:
    optimizer.lr() = 0.005
    optimizer.weight_decay() = 0.1
    optimizer.eps() = 1e-07

  3. THE PROBLEM SCENARIO:
    User wants to add a parameter group with explicit weight_decay=0.0
    User sets: weight_decay(0)

  4. THE CONFUSION:
    Constructor default weight_decay: 0
    User's explicit weight_decay: 0
    Are they equal? YES

    Since they're equal, our overwrite_from() logic thinks:
    "User didn't set weight_decay explicitly, use optimizer default"

  5. CURRENT BEHAVIOR:
    Final weight_decay: 0.1
    User expected: 0
    Match? ❌ NO

=== KEY INSIGHT ===
Constructor defaults are built into the C++ class definition.
Optimizer defaults are chosen by the user at runtime. We want to respect the user's intention.
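For concreteness, the scenario above can be reproduced with the C++ API roughly as follows (a sketch; tensor shapes and hyperparameter values are placeholders):

```cpp
#include <torch/torch.h>
#include <memory>
#include <vector>

int main() {
  auto p1 = torch::randn({2, 2}, torch::requires_grad());
  auto p2 = torch::randn({2, 2}, torch::requires_grad());

  // Optimizer defaults chosen by the user: lr = 0.005, weight_decay = 0.1, eps = 1e-7.
  std::vector<torch::Tensor> params1{p1};
  torch::optim::Adam optim(
      params1, torch::optim::AdamOptions(0.005).weight_decay(0.1).eps(1e-7));

  // New param group with weight_decay explicitly set to 0, which happens to equal
  // the AdamOptions() constructor default, so a comparison-based merge cannot tell
  // "explicitly 0" apart from "not specified".
  optim.add_param_group(torch::optim::OptimizerParamGroup(
      {p2},
      std::make_unique<torch::optim::AdamOptions>(
          torch::optim::AdamOptions().weight_decay(0.0))));

  // Per the scenario above, the new group ends up with weight_decay = 0.1
  // (the optimizer default), while the Python API would keep the explicit 0.
  return 0;
}
```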

@pytorch-bot

pytorch-bot bot commented Aug 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161825

Note: Links to docs will display an error until the docs builds have been completed.

⏳ No Failures, 12 Pending

As of commit f5c263c with merge base 322091d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@stmcgovern stmcgovern changed the title from "141884 C++ API handle optimizer defaults" to "#141884 C++ API handle optimizer defaults" Sep 3, 2025
@stmcgovern stmcgovern changed the title from "#141884 C++ API handle optimizer defaults" to "C++ API handle optimizer defaults" Sep 3, 2025
@stmcgovern
Contributor Author

Hi @janeyx99, this is the PR I mentioned in issue #141884. How can I link the PR to the issue? I thought "Fixes #number" does that...

Contributor

@janeyx99 janeyx99 left a comment

Thanks for working on the fix, this is an interesting bug indeed. I left a comment on the overall approach. Furthermore, since these tests don't run on CI, could you post a paste of the C++ test results from a local run?

ASSERT_NEAR(group1_opts.lr(), 0.002, 1e-6); // Inherited
ASSERT_EQ(group1_opts.betas(), std::make_tuple(0.8, 0.88)); // Inherited
ASSERT_NEAR(group1_opts.eps(), 1e-12, 1e-15); // Inherited
ASSERT_NEAR(group1_opts.weight_decay(), 0.11, 1e-6); // Preserved
Contributor

How come these can't be ASSERT_EQ?

Contributor Author

Changed; the remaining ASSERT_NEAR usage is left in the serialization tests.

}

TEST(OptimTest, MergeWithDefaultOptions_AdamW) {
torch::manual_seed(0);
Contributor

Is this important to the test? The actual params won't matter, right?

Contributor Author

Right, removed.

"You must override it in your subclass of torch::optim::OptimizerCloneableOptions<YourOptimizerOptions>.");
}

void OptimizerOptions::overwrite_from(const OptimizerOptions& source) {
Contributor

Hi! Some high-level questions:
How come we need a whole new overwrite_from API?
From the top level, I would expect us to fix the base class so that the user-specified defaults override the original defaults and are then used in add_param_group, without the need for adding a new API.

Contributor Author

@stmcgovern stmcgovern Sep 20, 2025


Maybe we don't :) I've tried to provide a fix in the base class without adding a new API, but I don't see a way to do it without at least one new virtual function call.

@janeyx99 janeyx99 added the triaged label Sep 11, 2025
@stmcgovern stmcgovern force-pushed the 141884-optimizer-defaults-clean branch from a041b6f to d205fa6 Compare September 20, 2025 18:31
@stmcgovern
Contributor Author

stmcgovern commented Sep 20, 2025

Hi @janeyx99, thanks very much for your feedback. I've taken each one of your comments into consideration.
I've tried to find a complete fix in the base class without adding a new API. The core problem is that we need a way to track whether a parameter field was explicitly set or not (to mimic the Python dict behavior). I started with a function-pointer registry and string-based checking in the merge. Trying to move as much as possible to compile time led to (1) the CRTP pattern for static dispatch and (2) bitset tracking instead of string-hash checks, adopted for performance and ease of serialization.
This uses a new macro (which I also don't like), but it seems to solve the problem efficiently and completely (Python API parity) without introducing a new API or much runtime overhead.
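As a rough illustration of the bitset-tracking idea (struct and field names here are hypothetical, not the PR's implementation), each setter records that its field was touched:

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical sketch: track which fields the user explicitly set, so the
// merge can tell "explicitly set to the default value" apart from "never set".
struct TrackedAdamOptions {
  enum Field : std::size_t { kLr = 0, kWeightDecay, kEps, kNumFields };

  double lr_ = 1e-3;
  double weight_decay_ = 0.0;
  double eps_ = 1e-8;
  std::bitset<kNumFields> set_fields_;

  TrackedAdamOptions& lr(double v) { lr_ = v; set_fields_.set(kLr); return *this; }
  TrackedAdamOptions& weight_decay(double v) { weight_decay_ = v; set_fields_.set(kWeightDecay); return *this; }
  TrackedAdamOptions& eps(double v) { eps_ = v; set_fields_.set(kEps); return *this; }

  // Any field the user never set inherits the optimizer-level default.
  void merge_unset_from(const TrackedAdamOptions& optimizer_defaults) {
    if (!set_fields_.test(kLr)) lr_ = optimizer_defaults.lr_;
    if (!set_fields_.test(kWeightDecay)) weight_decay_ = optimizer_defaults.weight_decay_;
    if (!set_fields_.test(kEps)) eps_ = optimizer_defaults.eps_;
  }
};
```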

Here are the local optimizer tests (including the new ones). Is there a better way with gtest to see what each test actually ran (not just pass/fail)? Dropping the filter and running all 1020 tests passes too.

(base) [root@49023e5b8e19 pytorch]# ./build/bin/test_api --gtest_filter="*Optim*" -v
CUDA not available. Disabling CUDA and MultiCUDA tests
Note: Google Test filter = *Optim*-*_CUDA:*_MultiCUDA
[==========] Running 51 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 44 tests from OptimTest
[ RUN      ] OptimTest.OptimizerAccessors
[       OK ] OptimTest.OptimizerAccessors (1 ms)
[ RUN      ] OptimTest.OldInterface
[       OK ] OptimTest.OldInterface (0 ms)
[ RUN      ] OptimTest.XORConvergence_SGD
[       OK ] OptimTest.XORConvergence_SGD (655 ms)
[ RUN      ] OptimTest.XORConvergence_LBFGS
[       OK ] OptimTest.XORConvergence_LBFGS (434 ms)
[ RUN      ] OptimTest.XORConvergence_Adagrad
[       OK ] OptimTest.XORConvergence_Adagrad (250 ms)
[ RUN      ] OptimTest.XORConvergence_RMSprop
[       OK ] OptimTest.XORConvergence_RMSprop (236 ms)
[ RUN      ] OptimTest.XORConvergence_RMSpropWithMomentum
[       OK ] OptimTest.XORConvergence_RMSpropWithMomentum (703 ms)
[ RUN      ] OptimTest.XORConvergence_Adam
[       OK ] OptimTest.XORConvergence_Adam (264 ms)
[ RUN      ] OptimTest.XORConvergence_AdamWithAmsgrad
[       OK ] OptimTest.XORConvergence_AdamWithAmsgrad (278 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_Adam
[       OK ] OptimTest.ProducesPyTorchValues_Adam (92 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdamWithWeightDecay
[       OK ] OptimTest.ProducesPyTorchValues_AdamWithWeightDecay (96 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdamWithWeightDecayAndAMSGrad
[       OK ] OptimTest.ProducesPyTorchValues_AdamWithWeightDecayAndAMSGrad (99 ms)
[ RUN      ] OptimTest.XORConvergence_AdamW
[       OK ] OptimTest.XORConvergence_AdamW (267 ms)
[ RUN      ] OptimTest.XORConvergence_AdamWWithAmsgrad
[       OK ] OptimTest.XORConvergence_AdamWWithAmsgrad (266 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdamW
[       OK ] OptimTest.ProducesPyTorchValues_AdamW (94 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdamWWithoutWeightDecay
[       OK ] OptimTest.ProducesPyTorchValues_AdamWWithoutWeightDecay (95 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdamWWithAMSGrad
[       OK ] OptimTest.ProducesPyTorchValues_AdamWWithAMSGrad (99 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_Adagrad
[       OK ] OptimTest.ProducesPyTorchValues_Adagrad (78 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdagradWithWeightDecay
[       OK ] OptimTest.ProducesPyTorchValues_AdagradWithWeightDecay (83 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_AdagradWithWeightDecayAndLRDecay
[       OK ] OptimTest.ProducesPyTorchValues_AdagradWithWeightDecayAndLRDecay (83 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_RMSprop
[       OK ] OptimTest.ProducesPyTorchValues_RMSprop (82 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_RMSpropWithWeightDecay
[       OK ] OptimTest.ProducesPyTorchValues_RMSpropWithWeightDecay (86 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_RMSpropWithWeightDecayAndCentered
[       OK ] OptimTest.ProducesPyTorchValues_RMSpropWithWeightDecayAndCentered (94 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_RMSpropWithWeightDecayAndCenteredAndMomentum
[       OK ] OptimTest.ProducesPyTorchValues_RMSpropWithWeightDecayAndCenteredAndMomentum (99 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_SGD
[       OK ] OptimTest.ProducesPyTorchValues_SGD (69 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_SGDWithWeightDecay
[       OK ] OptimTest.ProducesPyTorchValues_SGDWithWeightDecay (71 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_SGDWithWeightDecayAndMomentum
[       OK ] OptimTest.ProducesPyTorchValues_SGDWithWeightDecayAndMomentum (72 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_SGDWithWeightDecayAndNesterovMomentum
[       OK ] OptimTest.ProducesPyTorchValues_SGDWithWeightDecayAndNesterovMomentum (80 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_LBFGS
[       OK ] OptimTest.ProducesPyTorchValues_LBFGS (68 ms)
[ RUN      ] OptimTest.ProducesPyTorchValues_LBFGS_with_line_search
[       OK ] OptimTest.ProducesPyTorchValues_LBFGS_with_line_search (298 ms)
[ RUN      ] OptimTest.ZeroGrad
[       OK ] OptimTest.ZeroGrad (0 ms)
[ RUN      ] OptimTest.ExternalVectorOfParameters
[       OK ] OptimTest.ExternalVectorOfParameters (0 ms)
[ RUN      ] OptimTest.AddParameter_LBFGS
[       OK ] OptimTest.AddParameter_LBFGS (0 ms)
[ RUN      ] OptimTest.CheckLRChange_StepLR_Adam
[       OK ] OptimTest.CheckLRChange_StepLR_Adam (0 ms)
[ RUN      ] OptimTest.CheckLRChange_ReduceLROnPlateau_Adam
[       OK ] OptimTest.CheckLRChange_ReduceLROnPlateau_Adam (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_Adam
[       OK ] OptimTest.MergeWithDefaultOptions_Adam (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_SGD
[       OK ] OptimTest.MergeWithDefaultOptions_SGD (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_AdamW
[       OK ] OptimTest.MergeWithDefaultOptions_AdamW (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_Adagrad
[       OK ] OptimTest.MergeWithDefaultOptions_Adagrad (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_RMSprop
[       OK ] OptimTest.MergeWithDefaultOptions_RMSprop (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_LBFGS
[       OK ] OptimTest.MergeWithDefaultOptions_LBFGS (0 ms)
[ RUN      ] OptimTest.MergeWithDefaultOptions_NoOptionsInheritance
[       OK ] OptimTest.MergeWithDefaultOptions_NoOptionsInheritance (0 ms)
[ RUN      ] OptimTest.SerializationPreservesFieldTracking_Adam
[       OK ] OptimTest.SerializationPreservesFieldTracking_Adam (9 ms)
[ RUN      ] OptimTest.SerializationPreservesFieldTracking_SGD
[       OK ] OptimTest.SerializationPreservesFieldTracking_SGD (0 ms)
[----------] 44 tests from OptimTest (5217 ms total)

[----------] 7 tests from SerializeTest
[ RUN      ] SerializeTest.Optim
[       OK ] SerializeTest.Optim (1 ms)
[ RUN      ] SerializeTest.Optim_Adagrad
[       OK ] SerializeTest.Optim_Adagrad (1 ms)
[ RUN      ] SerializeTest.Optim_SGD
[       OK ] SerializeTest.Optim_SGD (1 ms)
[ RUN      ] SerializeTest.Optim_Adam
[       OK ] SerializeTest.Optim_Adam (1 ms)
[ RUN      ] SerializeTest.Optim_AdamW
[       OK ] SerializeTest.Optim_AdamW (1 ms)
[ RUN      ] SerializeTest.Optim_RMSprop
[       OK ] SerializeTest.Optim_RMSprop (1 ms)
[ RUN      ] SerializeTest.Optim_LBFGS
[       OK ] SerializeTest.Optim_LBFGS (1 ms)
[----------] 7 tests from SerializeTest (12 ms total)

[----------] Global test environment tear-down
[==========] 51 tests from 2 test suites ran. (5230 ms total)
[  PASSED  ] 51 tests.
(base) [root@49023e5b8e19 pytorch]# 

@janeyx99
Contributor

Hmm, the reason I was hesitant about the first approach was that it required modifying every optimizer, which this new approach unfortunately still requires. If that is unavoidable, I think it is okay to have as simple a solution as possible, but keep it an internal detail rather than something users can see.

To that effect, I would prefer the solution with the fewest additions to the public API surface and the lowest complexity. If the original approach was cleaner, then we can have a private _override_defaults API that the constructors call and that users shouldn't have to care about, plus maybe some documentation for why the helper is necessary. What do you think?

@stmcgovern
Contributor Author

Thanks for your comments @janeyx99. That makes sense! I certainly agree that users should not have to be aware of these implementation details. I wasn't sure how much the runtime performance cost impacted your review; I will revisit and simplify.

@stmcgovern stmcgovern force-pushed the 141884-optimizer-defaults-clean branch from d205fa6 to 0c5cf7f Compare October 1, 2025 19:16
@stmcgovern
Contributor Author

Hi @janeyx99, here is my preferred approach. It does everything in optimizer.h/cpp, doesn't introduce any new API, and does most of the work at compile time. I've tried to follow the template metaprogramming style used in c10. C++20 concepts could smooth out some of the boilerplate once they become available in PyTorch. Local tests are passing. Please let me know what you think. Thanks!

Contributor

@janeyx99 janeyx99 left a comment

Much nicer! Can we privatize all the helpers?

@stmcgovern stmcgovern force-pushed the 141884-optimizer-defaults-clean branch from 0c5cf7f to 1f7aed3 Compare October 2, 2025 21:06
@stmcgovern
Contributor Author

Hi @janeyx99, I've changed the helpers. I hope I've answered your questions. The proposed approach is to use constructor defaults as a comparison baseline to detect user intent, then inherit from the optimizer defaults for unspecified fields.
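As an illustrative sketch of that comparison-baseline idea combined with SFINAE field detection (simplified, hypothetical helpers; not the PR's actual code):

```cpp
#include <type_traits>
#include <utility>

// Detect whether an options type exposes a weight_decay() accessor.
template <class T, class = void>
struct has_weight_decay : std::false_type {};

template <class T>
struct has_weight_decay<
    T,
    std::void_t<decltype(std::declval<T&>().weight_decay())>>
    : std::true_type {};

// If the field still equals its constructor default, treat it as "unspecified"
// and inherit the optimizer-level default; otherwise keep the user's value.
template <class Options>
void merge_weight_decay(Options& target, const Options& optimizer_defaults) {
  if constexpr (has_weight_decay<Options>::value) {
    const Options ctor_defaults{};
    if (target.weight_decay() == ctor_defaults.weight_decay()) {
      target.weight_decay(optimizer_defaults.weight_decay());
    }
  }
}
```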

};

template <typename Derived>
// Forward declarations for optimizer option types
Contributor

Can we make the following classes and structs private as well?

Contributor Author

Since these are forward declarations, I'm inclined not to change the style to use the prefix (it would require touching all optimizer files). I do have to change them to struct, which resolves the class/struct inconsistency causing the clang build failure.

janeyx99
janeyx99 previously approved these changes Oct 6, 2025
Contributor

@janeyx99 janeyx99 left a comment

Please do privatize as much as possible so we are not inadvertently growing our API surface.


// SFINAE field detection - detects optimizer fields using public accessor methods
template <class T, class Enable = void>
struct has_lr : std::false_type {};
Contributor

These structs too

Contributor Author

These helper structs are in the private part of the class OptimizerCloneableOptions.
Do you just want me to prefix all the implementation details I've added with an underscore? Is that just a style convention (happy to follow), or is there some codegen- or Python-binding-specific transformation?

@janeyx99
Contributor

janeyx99 commented Oct 6, 2025

Thank you so much for following through this change! We are very close to the end!

@stmcgovern stmcgovern force-pushed the 141884-optimizer-defaults-clean branch from 1f7aed3 to 41731f4 Compare October 7, 2025 23:00
@stmcgovern
Contributor Author

OK I think this is ready. Thanks for your feedback and help @janeyx99 !

janeyx99
janeyx99 previously approved these changes Oct 8, 2025
@janeyx99
Contributor

janeyx99 commented Oct 8, 2025

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Oct 8, 2025
@pytorchmergebot
Collaborator

PR targets viable/strict rather than main, refusing merge request

@amjames
Collaborator

amjames commented Oct 8, 2025

@pytorchbot merge -r main

@pytorchmergebot
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here

@pytorchmergebot
Collaborator

Successfully rebased 141884-optimizer-defaults-clean onto refs/remotes/origin/main, please pull locally before adding more changes (for example, via git checkout 141884-optimizer-defaults-clean && git pull --rebase)

@pytorch-bot pytorch-bot bot added the ciflow/trunk label Oct 8, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@izaitsevfb
Contributor

Hey, apologies for the revert, but your PR is causing undefined symbol errors when linking the cross-platform build targets internally at Meta:

ld.lld: error: undefined symbol: typeinfo for torch::optim::AdamOptions
ld.lld: error: undefined symbol: torch::optim::AdamOptions::AdamOptions(double)
ld.lld: error: undefined symbol: vtable for torch::optim::LBFGSOptions

(Similar errors for AdamW, Adagrad, RMSprop, LBFGS)

You probably need to move the _merge_by_comparison() implementation from the header to optimizer.cpp, where all optimizer option types are fully defined.
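For reference, a generic single-file sketch of the pattern being suggested (hypothetical names; not the actual PyTorch sources): declare the helper in the header, and define it where the concrete option types are complete.

```cpp
// --- conceptually: optimizer.h ---
struct OptionsBase {
  virtual ~OptionsBase() = default;
};
struct ConcreteOptions;  // concrete option types need not be complete in the header

// Only the declaration lives in the header...
void merge_by_comparison(OptionsBase& target, const OptionsBase& defaults);

// --- conceptually: optimizer.cpp ---
struct ConcreteOptions : OptionsBase {
  double weight_decay = 0.0;
};

void merge_by_comparison(OptionsBase& target, const OptionsBase& defaults) {
  // The dynamic_casts and the ConcreteOptions{} construction below need
  // ConcreteOptions' typeinfo, vtable, and constructor; defining this function
  // in the .cpp keeps those references in a translation unit that provides them.
  auto* t = dynamic_cast<ConcreteOptions*>(&target);
  const auto* d = dynamic_cast<const ConcreteOptions*>(&defaults);
  if (t != nullptr && d != nullptr &&
      t->weight_decay == ConcreteOptions{}.weight_decay) {
    t->weight_decay = d->weight_decay;  // unchanged field inherits the optimizer default
  }
}
```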

@facebook-github-bot
Contributor

@pytorchbot revert -m="Diff reverted internally" -c="ghfirst"

This Pull Request has been reverted by a revert inside Meta. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert, and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).

@pytorchmergebot
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

pytorchmergebot added a commit that referenced this pull request Oct 10, 2025
This reverts commit f332017.

Reverted #161825 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally
@pytorchmergebot
Collaborator

@stmcgovern your PR has been successfully reverted.

@pytorchmergebot pytorchmergebot added the Reverted and ci-no-td labels Oct 10, 2025
@pytorch-bot pytorch-bot bot dismissed stale reviews from janeyx99 October 10, 2025 17:56

This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.

@stmcgovern
Contributor Author

Hey, apologies for the revert, but your PR is causing undefined symbol errors when linking the cross-platform build targets internally at Meta:

ld.lld: error: undefined symbol: typeinfo for torch::optim::AdamOptions
ld.lld: error: undefined symbol: torch::optim::AdamOptions::AdamOptions(double)
ld.lld: error: undefined symbol: vtable for torch::optim::LBFGSOptions

(Similar errors for AdamW, Adagrad, RMSprop, LBFGS)

You probably need to move the _merge_by_comparison() implementation from the header to optimizer.cpp, where all optimizer option types are fully defined.

Thanks for the information @izaitsevfb. I'll move the function as you suggest and open another PR.

Addresses PyTorch issue pytorch#141884 by implementing automatic parameter group
inheritance that achieves Python/C++ API parity without breaking changes.

- Uses comparison-based merging to infer user intent vs. default inheritance
- Uses C++17 SFINAE patterns following PyTorch conventions (matches c10/util/TypeTraits.h)
- Adds comprehensive tests for optimizer parameter group inheritance
@stmcgovern stmcgovern force-pushed the 141884-optimizer-defaults-clean branch from f5c263c to 5ae6650 Compare October 10, 2025 20:32
@pytorch-bot pytorch-bot bot removed the ciflow/trunk label Oct 10, 2025
@stmcgovern stmcgovern closed this Oct 10, 2025
@stmcgovern
Contributor Author

Follow-on PR is #165182

Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025

Labels

ci-no-td, Merged, open source, release notes: optim, Reverted, triaged
