@aliafzal
Contributor

Summary:
internal

General Context: We are transitioning to a unified DeltaTracker, and this is diff 6/n of the changes towards that transition.

Context: The MRS DeltaTracker is initialized right before training. To give the OSS DeltaTracker similar behavior, this diff adds a post-DMP-init function that initializes ModelDeltaTracker if it has not already been initialized.
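
The "initialize if not initialized" behavior can be sketched roughly as below. This is a minimal illustration only: `ToyDeltaTracker`, `ToyModelWrapper`, and `init_tracker_if_needed` are hypothetical names and do not come from torchrec or from this diff.

```python
from typing import Optional


class ToyDeltaTracker:
    """Hypothetical stand-in for ModelDeltaTracker; not the torchrec class."""


class ToyModelWrapper:
    """Hypothetical stand-in for a DMP-wrapped model."""

    def __init__(self) -> None:
        # The tracker is deliberately not created at construction time.
        self._tracker: Optional[ToyDeltaTracker] = None

    def init_tracker_if_needed(self) -> ToyDeltaTracker:
        # Hypothetical method name: create the tracker only on the first call.
        if self._tracker is None:
            self._tracker = ToyDeltaTracker()
        return self._tracker


model = ToyModelWrapper()
first = model.init_tracker_if_needed()   # creates the tracker
second = model.init_tracker_if_needed()  # reuses the existing tracker
```

Calling the function again right before training is then harmless, since the existing tracker is returned unchanged.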

Differential Revision: D80615308

@meta-cla meta-cla bot added the CLA Signed label Oct 20, 2025
@meta-codesync
Contributor

meta-codesync bot commented Oct 20, 2025

@aliafzal has exported this pull request. If you are a Meta employee, you can view the originating Diff in D80615308.

Ali Afzal added 7 commits October 20, 2025 20:43
Summary:
### Overview

This diff adds support for tracking optimizer states in the Model Delta Tracker system. It introduces a new tracking mode, `MOMENTUM_LAST`, that tracks momentum values from optimizers to support approximate top-k delta-row selection.

### Key Changes

#### 1. Optimizer State Tracking Support

*   To support tracking of optimizer states, added an `optim_state_tracker_fn` attribute to the `GroupedEmbeddingsLookup` and `GroupedPooledEmbeddingsLookup` classes, which are responsible for traversing the BatchedFused modules.
*   Implemented a `register_optim_state_tracker_fn()` method in both classes to register the tracking callable.
*   Tracking calls are invoked after each lookup operation.
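
The register-then-invoke pattern described in this list might look roughly like the sketch below. `ToyLookup` is a hypothetical stand-in for the lookup classes, and the callback signature (a list of row ids) is an assumption; the real callable in torchrec may take different arguments.

```python
from typing import Callable, List, Optional

# Assumed signature: the callback receives the row ids that were looked up.
TrackerFn = Callable[[List[int]], None]


class ToyLookup:
    """Hypothetical stand-in for a grouped embedding lookup module."""

    def __init__(self) -> None:
        self.optim_state_tracker_fn: Optional[TrackerFn] = None

    def register_optim_state_tracker_fn(self, fn: TrackerFn) -> None:
        self.optim_state_tracker_fn = fn

    def forward(self, ids: List[int]) -> List[float]:
        out = [float(i) for i in ids]  # placeholder for the real lookup
        # Invoke the tracker after the lookup, if one is registered.
        if self.optim_state_tracker_fn is not None:
            self.optim_state_tracker_fn(ids)
        return out


seen: List[List[int]] = []
lookup = ToyLookup()
lookup.forward([1, 2])  # no tracker registered yet, nothing is recorded
lookup.register_optim_state_tracker_fn(seen.append)
lookup.forward([3, 4])  # tracker fires after this lookup
```

Keeping the callable optional means lookups behave as before when no tracker is registered.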

#### 2. Model Delta Tracker Changes

*   Added a `record_momentum()` method to track momentum values from optimizer states, and added support for it in the `record_lookup` function.
*   Added validation and optimizer tracker function logic to support the new `MOMENTUM_LAST` mode.
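
As a rough illustration of recording momentum with "last value wins" semantics and then using it for approximate top-k row selection, consider the sketch below. `ToyMomentumTracker`, its storage layout, and the argument types of `record_momentum` are assumptions made for the example, not the `ModelDeltaTracker` implementation.

```python
from typing import Dict, List


class ToyMomentumTracker:
    """Hypothetical sketch; not the ModelDeltaTracker implementation."""

    def __init__(self) -> None:
        self.momentum_by_row: Dict[int, float] = {}

    def record_momentum(self, ids: List[int], momentum: List[float]) -> None:
        # "Last" semantics: a later value for the same row id overwrites the earlier one.
        for row_id, value in zip(ids, momentum):
            self.momentum_by_row[row_id] = value


tracker = ToyMomentumTracker()
tracker.record_momentum([0, 1], [0.5, 0.1])
tracker.record_momentum([1, 2], [0.9, 0.3])  # row 1 is overwritten with 0.9

# Rank rows by momentum magnitude, a rough proxy for top-k delta-row selection.
top2 = sorted(
    tracker.momentum_by_row,
    key=lambda row: abs(tracker.momentum_by_row[row]),
    reverse=True,
)[:2]
```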

#### 3. New Tracking Mode

*   Added `TrackingMode.MOMENTUM_LAST` to `types.py` (`/fbcode/torchrec/distributed/model_tracker/types.py`)
*   Maps to `EmbdUpdateMode.LAST` to capture the most recent momentum values
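
The mode-to-update-mode mapping could be expressed along these lines. Only the member names `MOMENTUM_LAST` and `LAST` come from this PR; the enum values, the omission of other members, and the `UPDATE_MODE_FOR` table name are assumptions for illustration.

```python
from enum import Enum


class TrackingMode(Enum):
    # Only the member named in this PR; other members are omitted.
    MOMENTUM_LAST = "momentum_last"  # assumed value


class EmbdUpdateMode(Enum):
    # Only the member named in this PR; other members are omitted.
    LAST = "last"  # assumed value


# Hypothetical lookup table from tracking mode to embedding update mode.
UPDATE_MODE_FOR = {TrackingMode.MOMENTUM_LAST: EmbdUpdateMode.LAST}

mode = UPDATE_MODE_FOR[TrackingMode.MOMENTUM_LAST]
```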

Differential Revision: D76868111
Differential Revision: D76918891
Differential Revision: D80614364
Differential Revision: D80614586
Differential Revision: D80614689
Differential Revision: D80615183
aliafzal added a commit to aliafzal/torchrec that referenced this pull request Oct 21, 2025
…h#3472)

Summary:
Pull Request resolved: meta-pytorch#3472

internal

General Context: We are transitioning to a unified DeltaTracker, and this is diff 6/n of the changes towards that transition.

Context: The MRS DeltaTracker is initialized right before training. To give the OSS DeltaTracker similar behavior, this diff adds a post-DMP-init function that initializes ModelDeltaTracker if it has not already been initialized.

Differential Revision: D80615308
Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported meta-exported