@aliafzal commented Jun 7, 2025

Summary

Introduces the DeltaStore class, which efficiently manages embedding table updates, with the following features:

  • Tracks embedding table updates by table FQN with batch indexing
  • Supports multiple embedding update modes (NONE, FIRST, LAST)
  • Provides compaction functionality for calculating unique IDs
  • Allows retrieval of unique/delta IDs per table with optional embedding values
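
The summary names the three update modes but does not spell out their semantics. The sketch below shows one plausible reading: NONE tracks IDs only, FIRST keeps the embedding from the first time an ID is seen, and LAST keeps the most recent one. The enum name `UpdateMode`, the helper `select_embeddings`, and the use of plain Python values in place of tensors are all assumptions made for illustration, not the names or behavior defined in types.py.

```python
from enum import Enum
from typing import Dict, Hashable, Iterable, Optional


class UpdateMode(Enum):  # hypothetical name, not taken from types.py
    NONE = "none"    # assumed: track IDs only, store no embeddings
    FIRST = "first"  # assumed: keep the embedding from the first occurrence
    LAST = "last"    # assumed: keep the embedding from the last occurrence


def select_embeddings(
    mode: UpdateMode, ids: Iterable[int], embeddings: Iterable[Hashable]
) -> Dict[int, Optional[Hashable]]:
    """Map each unique ID to the embedding chosen under the given mode.

    IDs and embeddings are consumed in arrival order.
    """
    chosen: Dict[int, Optional[Hashable]] = {}
    for row_id, emb in zip(ids, embeddings):
        if mode is UpdateMode.NONE:
            chosen.setdefault(row_id, None)
        elif mode is UpdateMode.FIRST:
            chosen.setdefault(row_id, emb)  # later duplicates are ignored
        else:
            chosen[row_id] = emb  # later duplicates overwrite earlier ones
    return chosen


first = select_embeddings(UpdateMode.FIRST, [1, 2, 1], ["a", "b", "c"])
last = select_embeddings(UpdateMode.LAST, [1, 2, 1], ["a", "b", "c"])
ids_only = select_embeddings(UpdateMode.NONE, [1, 2, 1], ["a", "b", "c"])
```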

How are lookups preserved and fetched?

In DeltaStore, lookups are preserved in the per_fqn_lookups dictionary, which maps table FQNs to lists of IndexedLookup objects. Each IndexedLookup contains:

  1. idx: The batch index
  2. ids: Tensor of embedding IDs
  3. embeddings: Optional tensor of embedding values
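
As a rough illustration of this layout, the sketch below models `IndexedLookup` as a dataclass and `per_fqn_lookups` as a dictionary keyed by table FQN. Plain Python lists stand in for tensors so the example runs without extra dependencies; the field types and the table name "model.table_0" are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IndexedLookup:
    idx: int                                        # batch index
    ids: List[int]                                  # stand-in for a tensor of embedding IDs
    embeddings: Optional[List[List[float]]] = None  # stand-in for an optional tensor of values


# Maps each table FQN to the lookups recorded for it, in arrival order.
per_fqn_lookups: Dict[str, List[IndexedLookup]] = {}

# "model.table_0" is a made-up FQN used only for this example.
per_fqn_lookups.setdefault("model.table_0", []).append(
    IndexedLookup(idx=0, ids=[3, 7])
)
per_fqn_lookups["model.table_0"].append(
    IndexedLookup(idx=1, ids=[7, 9], embeddings=[[0.1], [0.2]])
)
```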

Lookups are added via the append method and can be:

  • Deleted with the delete method (up to a specific index)
  • Compacted with the compact method (merges lookups within a range)
  • Retrieved as unique/delta rows with the get_delta method
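
The lifecycle above can be sketched as a small, IDs-only store. This is a simplified stand-in, not the code in delta_store.py: the class name `DeltaStoreSketch`, the method signatures, the half-open range used by `compact`, and the exclusive bound used by `delete` are all assumptions, and lists replace tensors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IndexedLookup:
    idx: int
    ids: List[int]


class DeltaStoreSketch:  # hypothetical simplification of DeltaStore
    def __init__(self) -> None:
        self.per_fqn_lookups: Dict[str, List[IndexedLookup]] = {}

    def append(self, batch_idx: int, table_fqn: str, ids: List[int]) -> None:
        """Record the IDs looked up for one table in one batch."""
        self.per_fqn_lookups.setdefault(table_fqn, []).append(
            IndexedLookup(batch_idx, list(ids))
        )

    def delete(self, up_to_idx: Optional[int] = None) -> None:
        """Assumed semantics: drop lookups with idx < up_to_idx; None clears all."""
        if up_to_idx is None:
            self.per_fqn_lookups.clear()
            return
        for fqn in list(self.per_fqn_lookups):
            self.per_fqn_lookups[fqn] = [
                lk for lk in self.per_fqn_lookups[fqn] if lk.idx >= up_to_idx
            ]

    def compact(self, start_idx: int, end_idx: int) -> None:
        """Assumed semantics: merge lookups with start_idx <= idx < end_idx
        into a single lookup at start_idx holding the sorted unique IDs."""
        for fqn in list(self.per_fqn_lookups):
            lookups = self.per_fqn_lookups[fqn]
            in_range = [lk for lk in lookups if start_idx <= lk.idx < end_idx]
            if not in_range:
                continue
            merged = sorted({i for lk in in_range for i in lk.ids})
            rest = [lk for lk in lookups if not (start_idx <= lk.idx < end_idx)]
            self.per_fqn_lookups[fqn] = sorted(
                rest + [IndexedLookup(start_idx, merged)], key=lambda lk: lk.idx
            )

    def get_delta(self) -> Dict[str, List[int]]:
        """Return the sorted unique IDs currently tracked for each table."""
        return {
            fqn: sorted({i for lk in lookups for i in lk.ids})
            for fqn, lookups in self.per_fqn_lookups.items()
        }


store = DeltaStoreSketch()
store.append(0, "t0", [1, 2])
store.append(1, "t0", [2, 3])
store.append(2, "t0", [5])
store.compact(0, 2)            # batches 0 and 1 collapse into one lookup
delta = store.get_delta()      # unique IDs across all remaining lookups
store.delete(up_to_idx=2)      # only the batch-2 lookup survives
```

Compacting before reading keeps the per-table list short when many batches touch the same rows, which is presumably why the store exposes it as a separate step.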

This diff:

  1. delta_store.py includes all main logic to preserve, fetch, compact and delete lookups
  2. types.py includes the required datatypes and enums
  3. test_delta_store.py includes test cases for the compute, delete and compact methods

Reviewed By: TroyGarden

Differential Revision: D71130002

@facebook-github-bot added the CLA Signed label Jun 7, 2025
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D71130002

@aliafzal force-pushed the export-D71130002 branch from 0480a9b to f4530f1 on June 7, 2025 18:56

Labels: CLA Signed, fb-exported