Test case for EBC key-order change #2388

Closed
wants to merge 2 commits

Conversation

TroyGarden
Contributor

Summary:

context

  • post: https://fb.workplace.com/groups/1028545332188949/permalink/1042204770823005/
  • this test case mimics the EBC key-order change after sharding (see the attached graph {F1864056306})

details

  • it's a very simple model: EBC ---> KTRegroupAsDict
  • we generate two EBCs: ebc1 and ebc2, such that the table orders are different:
```
        ebc1 = EmbeddingBagCollection(
            tables=[tb1_config, tb2_config, tb3_config],
            is_weighted=False,
        )
        ebc2 = EmbeddingBagCollection(
            tables=[tb1_config, tb3_config, tb2_config],
            is_weighted=False,
        )
```
  • we export the model with ebc1, unflatten it, and then swap in ebc2 (you can think of this as the sharding process producing a sharded EBC), so that we can mimic the key-order change shown in the graph above
  • the test checks that the final results after KTRegroupAsDict are consistent with those of the original eager model (a minimal eager-mode sketch of this invariant follows after this list)
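A minimal eager-mode sketch of the invariant the test checks: two EBCs that share weights but order their tables differently must yield the same regrouped dict. The import paths, table configs, input batch, and the KTRegroupAsDict grouping below are illustrative assumptions (verify them against your torchrec version and the actual test in this PR); the real test additionally goes through the export/unflatten/swap flow described above.

```
import torch
from torchrec.modules.embedding_configs import EmbeddingBagConfig
from torchrec.modules.embedding_modules import EmbeddingBagCollection
from torchrec.modules.regroup import KTRegroupAsDict
from torchrec.sparse.jagged_tensor import KeyedJaggedTensor

# three single-feature tables; names, dims, and sizes are made up for illustration
tb1_config = EmbeddingBagConfig(name="tb1", embedding_dim=4, num_embeddings=10, feature_names=["f1"])
tb2_config = EmbeddingBagConfig(name="tb2", embedding_dim=4, num_embeddings=10, feature_names=["f2"])
tb3_config = EmbeddingBagConfig(name="tb3", embedding_dim=4, num_embeddings=10, feature_names=["f3"])

ebc1 = EmbeddingBagCollection(tables=[tb1_config, tb2_config, tb3_config], is_weighted=False)
ebc2 = EmbeddingBagCollection(tables=[tb1_config, tb3_config, tb2_config], is_weighted=False)
ebc2.load_state_dict(ebc1.state_dict())  # same weights, only the table/key order differs

# regroup the EBC outputs (KeyedTensors) into a Dict[str, Tensor]; the (groups, keys) choice is illustrative
regroup1 = KTRegroupAsDict([["f1"], ["f2", "f3"]], ["group_a", "group_b"])
regroup2 = KTRegroupAsDict([["f1"], ["f2", "f3"]], ["group_a", "group_b"])

features = KeyedJaggedTensor.from_lengths_sync(
    keys=["f1", "f2", "f3"],
    values=torch.tensor([0, 1, 2, 3, 4, 5]),
    lengths=torch.tensor([2, 2, 2]),
)

kt1 = ebc1(features)  # KeyedTensor with keys ["f1", "f2", "f3"]
kt2 = ebc2(features)  # KeyedTensor with keys ["f1", "f3", "f2"]

out1 = regroup1([kt1])
out2 = regroup2([kt2])

# despite the different key order, the regrouped results must match
for key in out1:
    torch.testing.assert_close(out1[key], out2[key])
```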

Reviewed By: PaulZhang12

Differential Revision: D62604419

Summary:
# context

current error (a Graph.lint() diagnostic sketch follows after this message):
```
  1) torchrec.fb.ir.tests.test_serializer.TestSerializer: test_deserialized_device_vle
    1) RuntimeError: Node ir_dynamic_batch_emb_lookup_default referenced nonexistent value id_list_features__values! Run Graph.lint() to diagnose such issues

    While executing %ir_dynamic_batch_emb_lookup_default : [num_users=1] = call_function[target=torch.ops.torchrec.ir_dynamic_batch_emb_lookup.default](args = ([%id_list_features__values, None, %id_list_features__lengths, None], %floordiv, [4, 5]), kwargs = {})
    Original traceback:
    File "/data/users/hhy/fbsource/buck-out/v2/gen/fbcode/009ebbab256a7e75/torchrec/fb/ir/tests/__test_serializer__/test_serializer#link-tree/torchrec/fb/ir/tests/test_serializer.py", line 142, in forward
        return self.sparse_arch(id_list_features)
      File "torchrec/fb/ir/tests/test_serializer.py", line 446, in test_deserialized_device_vle
        output = deserialized_model(features_batch_3.to(device))
      File "torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "torch/nn/modules/module.py", line 1747, in _call_impl
        return forward_call(*args, **kwargs)
      File "torch/export/unflatten.py", line 482, in forward
        tree_out = torch.fx.Interpreter(self, graph=self.graph).run(
      File "torch/fx/interpreter.py", line 146, in run
        self.env[node] = self.run_node(node)
      File "torch/fx/interpreter.py", line 200, in run_node
        args, kwargs = self.fetch_args_kwargs_from_env(n)
      File "torch/fx/interpreter.py", line 372, in fetch_args_kwargs_from_env
        args = self.map_nodes_to_values(n.args, n)
      File "torch/fx/interpreter.py", line 394, in map_nodes_to_values
        return map_arg(args, load_arg)
      File "torch/fx/node.py", line 760, in map_arg
        return map_aggregate(a, lambda x: fn(x) if isinstance(x, Node) else x)
      File "torch/fx/node.py", line 768, in map_aggregate
        t = tuple(map_aggregate(elem, fn) for elem in a)
      File "torch/fx/node.py", line 768, in <genexpr>
        t = tuple(map_aggregate(elem, fn) for elem in a)
      File "torch/fx/node.py", line 772, in map_aggregate
        return immutable_list(map_aggregate(elem, fn) for elem in a)
      File "torch/fx/node.py", line 772, in <genexpr>
        return immutable_list(map_aggregate(elem, fn) for elem in a)
      File "torch/fx/node.py", line 778, in map_aggregate
        return fn(a)
      File "torch/fx/node.py", line 760, in <lambda>
        return map_aggregate(a, lambda x: fn(x) if isinstance(x, Node) else x)
      File "torch/fx/interpreter.py", line 391, in load_arg
        raise RuntimeError(f'Node {n} referenced nonexistent value {n_arg}! Run Graph.lint() '
```

Differential Revision: D59238744
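As the error message above suggests, Graph.lint() can be used to localize the dangling value reference. A hypothetical diagnostic helper is sketched below; `deserialized_model` is a placeholder taken from the traceback and stands in for whatever unflattened module is under test.

```
import torch
import torch.fx


def lint_all_graphs(model: torch.nn.Module) -> None:
    """Run Graph.lint() on every FX graph nested inside `model`."""
    for name, module in model.named_modules():
        graph = getattr(module, "graph", None)
        if isinstance(graph, torch.fx.Graph):
            graph.lint()  # raises if a node references a value not defined in the graph
            print(f"{name or '<root>'}: graph is well-formed")


# usage (deserialized_model is the unflattened module from the traceback above):
# lint_all_graphs(deserialized_model)
```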
facebook-github-bot added the CLA Signed label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Sep 12, 2024
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D62604419


Labels: CLA Signed, fb-exported