
Collection Support [In Progress] #802

Closed · wants to merge 22 commits

Conversation

@inocsin (Contributor) commented Jan 13, 2022

TODO list and current progress

  • Make torch_tensorrt::core::ir::Input and torch_tensorrt::Input compatible with IValue.
  • Support simple case of model with tuple inputs
  • Support simple case of model with list inputs
  • Support output of list type
  • Support output of tuple type
  • Add unit test for tuple input and output
  • Add unit test for list input and output
  • Test compatibility of traditional compile spec
  • Support user defined datatype in list/tuple inputs (fp16, fp32, etc.)
  • Python api
  • Add more unit test of different models
  • Handle prim::ListConstruct without falling it back (currently we have to fall back these two operators to support list inputs and outputs)
  • Handle aten::__getitem__ without falling it back

Description

Fixes #629 #428
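
To make the target concrete, here is a minimal, hypothetical sketch of the intended C++ usage, with one Input spec per leaf tensor wrapped to mirror the nesting of the module's signature. The CompileSpec constructor taking a torch::jit::IValue signature and the model path are assumptions (this surface only stabilized later, in #1201):

#include <torch/script.h>
#include "torch_tensorrt/torch_tensorrt.h"

int main() {
  // Hypothetical module whose forward() takes a tuple of two tensors.
  auto mod = torch::jit::load("tuple_input.ts");

  // One spec per leaf tensor; the tuple mirrors the input nesting.
  auto leaf = c10::make_intrusive<torch_tensorrt::Input>(
      std::vector<int64_t>{1, 3, 16, 16});
  torch::jit::IValue signature(
      std::make_tuple(torch::jit::IValue(leaf), torch::jit::IValue(leaf)));

  // Assumed constructor: CompileSpec built from an IValue signature.
  auto spec = torch_tensorrt::ts::CompileSpec(signature);
  auto trt_mod = torch_tensorrt::ts::compile(mod, spec);
  return 0;
}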

Type of change

  • New feature: collection support

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

@inocsin (Contributor, Author) commented Mar 17, 2022

@narendasan @peri044, the first version of the collection feature is ready; can you review the code?
There are some differences from the method mentioned in #629:
(1) This implementation can handle inputs and outputs of list/tuple type for the whole graph, while the TRT segments don't have to handle list/tuple-typed inputs and outputs, because ListConstruct and TupleConstruct are included and handled in the Torch segments.
(2) I didn't find any API to calculate the list size of a Value whose type satisfies in->type()->kind() == torch::jit::TypeKind::ListType, so instead of flattening the input IValue of ir::Input into a vector, I use vector<vector> as the flattened spec, because it's easier to index an Input spec without knowing the size of a list. For example, given ([a, b], (c, d)) as input, we can get the Input spec of c with input_spec[1][0] without knowing the size of the list [a, b].
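
To illustrate (2), a minimal self-contained sketch of the two-level indexing, where Input is just a stand-in for torch_tensorrt::core::ir::Input:

#include <vector>

// Stand-in for torch_tensorrt::core::ir::Input (shape, dtype, ...).
struct Input {
  int id;
};

int main() {
  // Flattened spec for the example input ([a, b], (c, d)): the outer index
  // selects the collection argument, the inner index selects the element.
  std::vector<std::vector<Input>> input_spec = {
      {{0}, {1}},  // specs for [a, b]
      {{2}, {3}},  // specs for (c, d)
  };

  // Spec for c: reachable without knowing the length of [a, b].
  Input c_spec = input_spec[1][0];
  (void)c_spec;
  return 0;
}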

@narendasan (Collaborator) left a comment

Took a high-level pass. I think the logic seems mostly fine; there are a lot of usability issues, however, regarding the messaging.

core/ir/ir.cpp Outdated
InputSpecMap pair_input_vals_with_specs(std::vector<const torch::jit::Value*> vals, std::vector<Input> specs) {
LOG_DEBUG("pair_input_vals_with_specs");
@narendasan (Collaborator): Could we have a better message here, or just remove it if it's not useful? Perhaps dump what the pairings are?

core/ir/ir.cpp Outdated
@@ -27,19 +36,59 @@ InputSpecMap pair_input_vals_with_specs(std::vector<const torch::jit::Value*> va
return a;
}

CollectionInputSpecMap pair_input_vals_with_specs_collection(std::vector<const torch::jit::Value*> vals, std::vector<std::vector<Input>>& specs) {
LOG_DEBUG("pair_input_vals_with_specs collection");
@narendasan (Collaborator): Same thing here.


CollectionInputSpecMap a;
for (size_t i = 0; i < vals.size(); i++) {
LOG_DEBUG("Paring " << i << ": " << vals[i]->debugName() << " : " << specs[i]);
@narendasan (Collaborator): It would be good to format this slightly more so it stands out in the log.
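
One possible shape for such a message (a sketch only, assuming the project's LOG_DEBUG macro and the operator<< for Input used in the snippet above):

// Bracket and label the pairing so it stands out and is easy to grep:
LOG_DEBUG("Paired input " << i << ": [" << vals[i]->debugName()
                          << " -> " << specs[i] << "]");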

core/ir/ir.cpp Outdated
std::vector<const torch::jit::Value*> get_tensor_inputs(
std::shared_ptr<torch::jit::Graph>& g,
StaticParams& static_params) {
std::vector<const torch::jit::Value*> input_tensors;
auto inputs = g->inputs();
LOG_DEBUG("Inputs size " << inputs.size());
@narendasan (Collaborator): Not sure we need to keep this; if we do, it needs to be more descriptive.

core/ir/ir.cpp Outdated
for (auto in : inputs) {
LOG_DEBUG("input debug name: " << in->debugName());
@narendasan (Collaborator): Same here.


TEST(CppAPITests, TestCollectionListInput) {

std::string path = "/root/Torch-TensorRT/list_input.ts";
@narendasan (Collaborator): And here.

// ir::TypeMap& first_use_type_map) {
// Associate input specs with inputs
// cfg.convert_info.inputs = std::move(ir::associate_specs_with_inputs(g, cfg.inputs, static_params));
cfg.convert_info.collection_inputs = std::move(ir::associate_specs_with_collection_inputs(g, cfg.graph_inputs, static_params));
@narendasan (Collaborator): Why do we call this twice with different APIs?

@inocsin (Author): This will get the input spec map. (collection_inputs may be confused with GraphInput.collection_input; it should be renamed to collection_input_spec_map.)

cfg.convert_info.collection_inputs = std::move(ir::associate_specs_with_collection_inputs(g, cfg.graph_inputs, static_params));

auto collection_inputs = ir::get_collection_inputs(g, static_params);
LOG_DEBUG("In MapInputsAndDetermineDTypes " << "g->inputs() size " << g->inputs().size() << ", collection_inputs size " << collection_inputs.size());
@narendasan (Collaborator): Better logging messages.

LOG_INFO(
"Since input type is not explicitly defined, infering using first tensor calculation\n Found input "
<< in->debugName() << " has type " << est_type_opt[i].value()
<< ". If this is incorrect explicitly set dtype for input and file a bug");
@narendasan (Collaborator): I don't think we should tell them to file a bug here, since we just use a heuristic; it's not always going to be right.

// If we can calculate the type from the graph and the type was not defined by the user then use the calculated
// type
LOG_INFO(
"Since input type is not explicitly defined, infering using first tensor calculation\n Found input "
@narendasan (Collaborator): "Inferred" instead of "Found".

@narendasan (Collaborator): @inocsin can you rebase quickly?

@inocsin (Author) commented Apr 5, 2022

> @inocsin can you rebase quickly?

@narendasan rebased to master

@inocsin (Author) commented Apr 8, 2022

@naren, I think we should remove the aten::__getitem__ and prim::ListConstruct converters: since TensorRT doesn't support a TensorList as output, the ListConstruct should be done in the Torch subgraph. And since we cannot evaluate an item in a TensorList, we should remove aten::__getitem__. What do you think?

# info.inputs = [i._to_internal() for i in inputs]
info.graph_inputs.inputs = [i._to_internal() for i in inputs]
else:
info.graph_inputs.input_signature = _parse_collection_input(compile_spec["inputs"])
@inocsin (Author): Still cannot pass a Python value (torch.classes.tensorrt._Input()) to info.graph_inputs.input_signature (torch::jit::IValue).

@inocsin (Author): @narendasan do you have any suggestions?

@narendasan (Collaborator): We need prim::ListConstruct because some operations just group and ungroup tensors but don't actually need to be executed at runtime. By forcing it to run in PyTorch we would incur a performance cost for switching between TensorRT and PyTorch repeatedly.
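
As an illustration (a sketch; the graph below is hypothetical and built with torch::jit::parseIR), prim::ListConstruct and aten::__getitem__ here only regroup tensor handles around a single aten::add, so falling them back would force a TensorRT -> Torch -> TensorRT round trip around one real operation:

#include <memory>
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/ir/irparser.h>

int main() {
  auto g = std::make_shared<torch::jit::Graph>();
  // The list ops only move tensor handles around; only aten::add computes.
  torch::jit::parseIR(R"IR(
    graph(%x : Tensor, %y : Tensor):
      %one : int = prim::Constant[value=1]()
      %zero : int = prim::Constant[value=0]()
      %xs : Tensor[] = prim::ListConstruct(%x, %y)
      %x0 : Tensor = aten::__getitem__(%xs, %zero)
      %out : Tensor = aten::add(%x0, %y, %one)
      return (%out))IR",
      g.get());
  g->dump();
  return 0;
}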

…sorrt::Input compatible with IValue. Support simple case of tuple input model. Add unit test.
… elements. Using two level vector to store ir::Input
…sionInfo.collection_input_spec_map
…ually
@inocsin (Author) commented Apr 14, 2022

Now we can keep prim::ListConstruct in prim.cpp and don't have to fall it back manually. We only fall back ListConstruct to Torch when the output of the TensorRT subgraph is a ListConstruct.
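
A minimal sketch of that condition (a hypothetical helper, not the actual partitioning code): fall prim::ListConstruct back to Torch only when its output feeds the graph return, since TensorRT cannot produce a TensorList:

#include <torch/csrc/jit/ir/ir.h>

// Hypothetical helper: true only when the ListConstruct output is consumed
// by the graph's return node, the one case where it must run in Torch.
bool must_fallback_list_construct(const torch::jit::Node* n) {
  if (n->kind() != torch::jit::prim::ListConstruct) {
    return false;
  }
  for (const torch::jit::Value* out : n->outputs()) {
    for (const torch::jit::Use& use : out->uses()) {
      if (use.user == n->owningGraph()->return_node()) {
        return true;  // list is a graph output; TensorRT can't emit it
      }
    }
  }
  return false;  // purely internal regrouping; keep it in the TRT segment
}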

…and all the nodes can be converted
# raise KeyError("Input specs should be either torch_tensorrt.Input or torch.Tensor, found types: {}".format(
# [type(i) for i in compile_spec["inputs"]]))

if isinstance(compile_spec["inputs"], list) and all([isinstance(i, torch.Tensor) or isinstance(i, Input) for i in compile_spec["inputs"]]):
@narendasan (Collaborator): Can these be reduced into a common code path?

@ncomly-nvidia (Contributor): Fixes #469

@facebook-github-bot (Contributor): Hi @inocsin! Thank you for your pull request and welcome to our community. In order to merge any pull request, we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you. Please sign at https://code.facebook.com/cla (if you are contributing on behalf of your employer, a corporate CLA may be needed). Once the CLA is signed, the pull request will be tagged "CLA signed"; tagging may take up to 1 hour. Questions: cla@fb.com.

@facebook-github-bot (Contributor): Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@ncomly-nvidia (Contributor): Should fix #798

@ntakouris: When this is done, it will probably bring many (if not all) torchvision models much closer to automatic conversion to TRT: Relevant Issue

@narendasan (Collaborator): Closing in favor of #1201

@narendasan closed this Aug 4, 2022
@JXQI commented Nov 1, 2023

After I merged these changes, importing torch_tensorrt raises an error:

File "/media/tx-deepocean/Data/poetry_cache/virtualenvs/chest-ct-engine-jmgQEMqM-py3.7/lib/python3.7/site-packages/torch_tensorrt/_enums.py", line 1, in <module>
    from torch_tensorrt._C import dtype, DeviceType, EngineCapability, TensorFormat
ImportError: /media/tx-deepocean/Data/poetry_cache/virtualenvs/chest-ct-engine-jmgQEMqM-py3.7/lib/python3.7/site-packages/torch_tensorrt/_C.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZN14torch_tensorrt4core2ir11GraphInputsC1ERN3c106IValueE

Can you help me? Thanks.

Labels
cla signed · component: core (Issues re: The core compiler) · release: v1.2 (Tagged to be included in v1.2)

Projects
None yet

6 participants