# Notes
When performing the lowering transformation, the process gets killed. You can verify this yourself by running the script below. Running the same code in an interactive terminal gives the same result.

In [1]:
from finn.util.visualization import showInNetron
import onnx
from finn.core.modelwrapper import ModelWrapper
from finn.transformation.lower_convs_to_matmul import LowerConvsToMatMul

file_name = "quartznet_modified_4d_inferred_shapes.onnx"
model = ModelWrapper(file_name)
#import pdb; pdb.set_trace()

model = model.transform(LowerConvsToMatMul(), make_deepcopy=True, cleanup=False, fix_float64=True)


I think it is because the model is too large. When I added some print statements to the lower_convs_to_matmul.py file, I noticed that the process stopped at the line model = model.transform(InferShapes()). Each of the 171 Conv nodes gets replaced by 3-4 other nodes, so roughly 171*3.5≈600 nodes are added while the 171 Conv nodes themselves are removed. The exact number of nodes after the transformation is 1350 (originally it was 931), a net increase of 419. \
The InferShapes transformation does run on the original model (931 nodes). Since only 419 nodes are added by the lowering, the question is whether we can reduce the model size, for example by calling cleanup transformations such as RemoveUnusedTensors.

In [4]:
import onnx
from finn.core.modelwrapper import ModelWrapper
from finn.transformation.infer_shapes import InferShapes

file_name = "quartznet_modified_4d_inferred_shapes.onnx"
model = ModelWrapper(file_name)

model = model.transform(InferShapes())
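
The node counts quoted above can be checked with a small helper. This is only a sketch: it assumes that 171 is the number of Conv nodes in the original graph and that every Conv is replaced by the same average number of nodes. Both function names are hypothetical and not part of FINN.

```python
from collections import Counter


def count_op_types(nodes):
    """Return a Counter mapping op_type -> number of nodes.

    `nodes` is any iterable of objects with an `op_type` attribute,
    for example `model.graph.node` of a loaded model.
    """
    return Counter(node.op_type for node in nodes)


def estimate_nodes_after_lowering(total_nodes, conv_nodes, nodes_per_conv):
    """Estimate the node count if each Conv is replaced by `nodes_per_conv` nodes.

    The Conv nodes are removed, so they are subtracted before the
    replacement nodes are added.
    """
    return total_nodes - conv_nodes + conv_nodes * nodes_per_conv


# Numbers from these notes: 931 nodes, assumed 171 Conv nodes, ~3.5 nodes per Conv.
print(estimate_nodes_after_lowering(931, 171, 3.5))
```

With these inputs the estimate lands close to the 1350 nodes observed after the transformation. On the real model, `count_op_types(model.graph.node)["Conv"]` would confirm the assumed Conv count.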

Some cleanup before lowering:

In [2]:
import onnx
from finn.core.modelwrapper import ModelWrapper
from finn.transformation.infer_shapes import InferShapes
from finn.transformation.general import RemoveUnusedTensors
from finn.transformation.double_to_single_float import DoubleToSingleFloat
from finn.transformation.lower_convs_to_matmul import LowerConvsToMatMul

file_name = "quartznet_modified_4d_inferred_shapes.onnx"
model = ModelWrapper(file_name)

model = model.transform(RemoveUnusedTensors())
model = model.transform(DoubleToSingleFloat())

# Important:
# Comment out line 248 (model = model.transform(InferShapes())) in lower_convs_to_matmul.py
# before running the line below. Save the file and restart the kernel.
model = model.transform(LowerConvsToMatMul())

model = model.transform(RemoveUnusedTensors())
model = model.transform(DoubleToSingleFloat())

Now I can neither save the model nor perform the InferShapes transformation without the process getting killed. You can verify this by running the cells below. Another transformation, MoveAddPastMul, also seems to fail.

In [3]:
# save() does not return the model, so do not reassign `model` here
model.save("modified_" + file_name)

ValueError: Message ONNX_REL_1_7.ModelProto exceeds maximum protobuf size of 2GB: 3381363977

In [None]:
model = model.transform(InferShapes())

In [None]:
from finn.transformation.streamline.reorder import MoveAddPastMul
model = model.transform(MoveAddPastMul())
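
The ValueError above reports a serialized size of 3381363977 bytes (about 3.15 GiB), well over the 2GB limit named in the message. To see which tensors account for that, one option is to rank the initializers by their estimated size. The sketch below is an assumption-laden helper, not FINN code: the function names are hypothetical, and it assumes every tensor is dense float32 (4 bytes per element), which seems plausible after DoubleToSingleFloat but is not guaranteed.

```python
from functools import reduce
from operator import mul

# 2 GiB, the limit named in the protobuf error message above
PROTOBUF_LIMIT_BYTES = 2 * 1024**3


def tensor_nbytes(dims, bytes_per_element=4):
    """Estimated size of a dense tensor with shape `dims` (float32 assumed)."""
    return reduce(mul, dims, 1) * bytes_per_element


def largest_tensors(initializers, top=5, bytes_per_element=4):
    """Return the `top` largest (name, nbytes) pairs, biggest first.

    `initializers` is any iterable of objects with `name` and `dims`
    attributes, for example `model.graph.initializer`.
    """
    sizes = [(t.name, tensor_nbytes(t.dims, bytes_per_element)) for t in initializers]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


def total_nbytes(initializers, bytes_per_element=4):
    """Estimated total size of all initializers in bytes."""
    return sum(tensor_nbytes(t.dims, bytes_per_element) for t in initializers)
```

Calling `largest_tensors(model.graph.initializer)` on the lowered model should show whether a few very large weight tensors or many medium-sized ones push the total past `PROTOBUF_LIMIT_BYTES`.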

# How to fix this?
A suggestion is made here: https://github.com/microsoft/onnxruntime/issues/4707 .\
Unfortunately, specifying a higher version of onnxruntime and onnx in the requirements.txt file does not solve the problem, because the newer versions of onnxruntime are not found. I think this should be relatively easy to solve, but I do not know how.

Another problem would then remain: the model is too big for the InferShapes() transform. At least, that is what I identify as the problem. Perhaps I am missing something here. \
For now, the quickest solution is to extract a subgraph from the original graph, perform the lowering transformation, and verify whether the modified graph gives the same result as before the transformation. \
Since other transformations might not run either, they can likewise be performed on subgraphs (on each repetitive structure in turn) instead of on the complete model.
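
The verification step for a lowered subgraph could look roughly like the sketch below. It only covers the comparison: it assumes the subgraph has already been executed before and after the transformation on the same input, and that the outputs are available as dictionaries mapping output names to flat sequences of floats. The function name and the tolerances are hypothetical choices, not part of FINN.

```python
import math


def outputs_match(reference, candidate, rel_tol=1e-5, abs_tol=1e-6):
    """Check that two output dicts agree within a tolerance.

    Both arguments map an output tensor name to a flat sequence of floats
    (for a NumPy array, pass `array.flatten()`).
    """
    if reference.keys() != candidate.keys():
        return False
    for name, ref_values in reference.items():
        new_values = candidate[name]
        if len(ref_values) != len(new_values):
            return False
        for ref, new in zip(ref_values, new_values):
            if not math.isclose(ref, new, rel_tol=rel_tol, abs_tol=abs_tol):
                return False
    return True


# Toy example: the second output differs only by floating-point noise.
before = {"out": [0.5, 1.0, -2.0]}
after = {"out": [0.5, 1.0000001, -2.0]}
print(outputs_match(before, after))
```

If NumPy is available, `np.allclose` on the unflattened arrays does the same job. A tolerance-based comparison is used here because the lowered graph computes the same result with different operations, so bit-exact equality is not guaranteed.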