
AttributeError: 'tuple' object has no attribute 'size' #195

Closed
wants to merge 4 commits

Conversation

fabiofumarola

Fixed #141. I have tested it only on one Hugging Face model, but it should work for every model.
The only problem is that I could not fix the verification of the out file.
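
For context, a minimal sketch of the kind of model that hits the error in the title: a forward pass that returns a nested tuple instead of a plain tensor. This is a hypothetical repro, not the test added in this PR, and whether it actually fails depends on the torchinfo version.

import torch
import torchinfo

class NestedTupleOutput(torch.nn.Module):
    # Hypothetical module whose forward returns a nested tuple of tensors,
    # the output structure that made torchinfo call .size() on a tuple.
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 4)

    def forward(self, x):
        y = self.linear(x)
        return (y, (y.detach(), y * 2))

# On affected torchinfo versions this raised:
#   AttributeError: 'tuple' object has no attribute 'size'
torchinfo.summary(NestedTupleOutput(), input_size=(1, 8))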

@mert-kurttutan
Contributor

I think the fix proposed here is a good start, but it is too specific: it seems to work only for outputs that are nested tuples. It would be better to take an approach where we traverse the data recursively and capture the sizes of all individual tensors.

An example of such a nested traversal is the traverse_input_data function in torchinfo.py. With an appropriate action_fn, we can get the sizes of all individual tensors (and collect them inside a list, for example); a standalone sketch of the idea follows.
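
For illustration, a minimal standalone sketch of that kind of traversal (this is not torchinfo's actual traverse_input_data; the helper name and the recursion scheme here are hypothetical):

import torch

def collect_tensor_sizes(data, sizes=None):
    # Recursively walk dicts, lists and tuples and record the size of every tensor found.
    if sizes is None:
        sizes = []
    if isinstance(data, torch.Tensor):
        sizes.append(list(data.size()))
    elif isinstance(data, dict):
        for value in data.values():
            collect_tensor_sizes(value, sizes)
    elif isinstance(data, (list, tuple)):
        for item in data:
            collect_tensor_sizes(item, sizes)
    return sizes

# Example with a "complicated enough" nested output: dict inside list inside tuple.
out = (torch.zeros(2, 3), [{"hidden": torch.zeros(4, 5)}, torch.zeros(1)])
print(collect_tensor_sizes(out))  # [[2, 3], [4, 5], [1]]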

I also propose adding a test case where some of the hidden layers have nested outputs and inputs, with the nesting complicated enough. By complicated enough, I mean a dict inside a list inside a tuple, and so on.

This approach seems more robust to me.

@fabiofumarola
Author

OK, two questions:

  1. Can I add a test that pulls in the transformers dependency only for testing?
  2. I have tried to run the tests with --overwrite, but the test still fails.

I'll update the pull request as you suggested.

@mert-kurttutan
Contributor

  1. Do you mean installing the transformers package? I think we should add a test case where the hidden layers have the nested output/input structure that I mentioned. Maybe you can create a custom model that has these properties and add it to the fixture/models file.
    I think the transformers models are important enough that we should test them. In that case, you would need to modify the GitHub Actions workflow files to install transformers. This might introduce additional version-compatibility issues; for instance, it might not be compatible with past versions of torch.

  2. See the reason for the error in the details of the checks. You have to modify the test.yml file to install the transformers package.

@snimu
Contributor

snimu commented Jan 12, 2023

The summary output has a problem: Total params is 76,961,152, but that's not the sum of the Param # column, which is 128,746,176.
Does this have anything to do with the proposed changes, or is it a problem somewhere else in torchinfo?

@snimu
Contributor

snimu commented Jan 13, 2023

> The summary output has a problem: Total params is 76,961,152, but that's not the sum of the Param # column, which is 128,746,176. Does this have anything to do with the proposed changes, or is it a problem somewhere else in torchinfo?

Got it: in the summary_list, some items have negative num_params.

I tested this by putting the following code into ModelStatistics.__init__(...):

import warnings  # warnings.warn instead of print so that pytest doesn't suppress the output

warnstr = "\n"
for layer_info in summary_list:
    if layer_info.is_recursive:
        continue
    # Leaf layers report their own parameters; containers report only leftovers.
    num_params = layer_info.num_params if layer_info.is_leaf_layer else layer_info.leftover_params()
    warnstr += f"{layer_info.class_name}: {num_params}\n"
warnings.warn(warnstr)

I applied the fix from this pull-request in LayerInfo.calculate_size(...) to make summary work.

Then I ran the test provided in this pull request, which builds the model AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small"). Running it yielded this output:

tests/torchinfo_xl_test.py::test_huggingface
  /Users/sebastianmuller/Documents/Programmieren/torchinfo/torchinfo/model_statistics.py:35: UserWarning: 
  T5ForConditionalGeneration: -51782336
  T5Stack: 35332800
  Embedding: 16449536
  Dropout: 0
  ModuleList: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Embedding: 192
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5LayerNorm: 512
  T5Stack: 16449536
  Dropout: 0
  ModuleList: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Embedding: 192
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5Block: 0
  ModuleList: 0
  T5LayerSelfAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerCrossAttention: 0
  T5LayerNorm: 512
  T5Attention: 0
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Linear: 196608
  Dropout: 0
  T5LayerFF: 0
  T5LayerNorm: 512
  T5DenseGatedActDense: 0
  Linear: 524288
  NewGELUActivation: 0
  Linear: 524288
  Dropout: 0
  Linear: 524288
  Dropout: 0
  T5LayerNorm: 512
  Linear: 16449536

Summing them all up gives 76,961,152. Ignoring the negative number (-51,782,336) for num_params gives 128,743,488, the correct result (of course, negative numbers also have to be ignored for trainable_params, leftover_params(), and leftover_trainable_params()). Since I don't quite understand where the negative number comes from, I can't quite vouch for the correctness of the results, only for their consistency.
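
For concreteness, a rough sketch of the kind of guard I mean by "ignoring" the negative numbers (hypothetical; the real change would live in ModelStatistics, and the attribute names follow the warning snippet above):

def total_params_skipping_negatives(summary_list):
    # Sum per-layer parameter counts, skipping recursive entries and the
    # spurious negative counts discussed above.
    total = 0
    for layer_info in summary_list:
        if layer_info.is_recursive:
            continue
        num_params = (
            layer_info.num_params
            if layer_info.is_leaf_layer
            else layer_info.leftover_params()
        )
        if num_params > 0:
            total += num_params
    return total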

I will (probably) open a pull request addressing this problem and issue #141 soon; I just wanted to answer my own question so that nobody is left hanging when they read this thread.

@TylerYep
Owner

TylerYep commented Feb 5, 2023

Thanks for the work here, all. Closing in favor of #212.

TylerYep closed this on Feb 5, 2023