[Bug] Slice indexing in ONNX #94

Closed
soodoshll opened this issue Feb 12, 2023 · 4 comments · Fixed by #106


@soodoshll
Collaborator

soodoshll commented Feb 12, 2023

Please refer to pytorch/pytorch#24251

Basically, ONNX uses extremely large numbers (e.g. INT64_MAX) to represent slicing until the end of a dimension, which is rejected by the defensive condition in

```python
if not (-n <= i <= n and -n <= j <= n):
```
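For concreteness, Python/NumPy slicing (which the exporter's semantics lean on) silently clamps an out-of-range end, so the sentinel works there:

```python
import numpy as np

x = np.arange(100)
# The exporter encodes "slice to the end" as 2**63 - 1; NumPy clamps
# this to len(x) instead of rejecting it.
print(x[2:9223372036854775807].shape)  # (98,)
```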

@yaoyaoding
Member

The indexing of hidet tensors follows the Array API standard. Thus, we need to deal with the difference between ONNX semantics and the Array API standard when importing an ONNX model.

Could you please provide an example that triggers this error? If not, we could leave it for the future, when an actual model triggers it.
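For illustration, a minimal sketch of how the importer could normalize such sentinel bounds before the defensive check; the helper name and the clamping rule are assumptions, not hidet's actual fix:

```python
def normalize_slice_bounds(start: int, end: int, n: int) -> tuple:
    """Clamp ONNX-style slice bounds into the safe range [-n, n]."""
    start = max(-n, min(start, n))
    end = max(-n, min(end, n))
    return start, end

# e.g. normalize_slice_bounds(2, 9223372036854775807, 100) -> (2, 100)
```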

@soodoshll
Collaborator Author

Yes, a very simple snippet (torch->onnx->hidet) can trigger this error:

```python
import torch
import hidet
import onnx

class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        # Open-ended slice: exported to ONNX as a Slice op whose `ends`
        # is a huge sentinel value.
        return x[2:]

device = 'cuda'

model = Foo()
model.to(device)

x = torch.ones([100], dtype=torch.int32, device=device)
z = model(x)

# Export the model to ONNX and load it back.
torch.onnx.export(model, (x,), 'tmp.onnx', input_names=['x'],
                  output_names=['z'])
model = onnx.load('tmp.onnx')

hidet.torch.dynamo_config.search_space(1)

# Import the ONNX model into hidet and build an optimized CUDA graph;
# the Slice op's sentinel `ends` value trips the defensive bounds check.
x = hidet.from_torch(x)
symbol_data = [hidet.symbol_like(x)]
hidet_onnx_module = hidet.graph.frontend.from_onnx(model)
symbol_output = hidet_onnx_module(*symbol_data)
graph: hidet.FlowGraph = hidet.trace_from(symbol_output, inputs=symbol_data)
with hidet.graph.PassContext() as ctx:
    graph_opt: hidet.FlowGraph = hidet.graph.optimize(graph)
cuda_graph = graph_opt.cuda_graph()
outputs = cuda_graph.run([x])
print(outputs[0])
```

@yaoyaoding
Member

Thanks @soodoshll, working on it.

@yaoyaoding
Member

Thanks @soodoshll, this error should be fixed in #106.

yaoyaoding pushed a commit that referenced this issue Apr 3, 2024
Add graph module for using flash attention and clarify some differences
in flash attention vs torch sdpa.

**Attention: (pun intended)**

Softmax has a temperature-scaling option: it divides the inputs by a scalar. A good explanation of the numerical effects is [here](https://medium.com/@harshit158/softmax-temperature-5492e4007f71).

It is used when the softmax inputs QK are too big for float16 (absolute value > 65504). This usually means the numbers are so large that dividing by a small (< 4) scalar has little effect.
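For reference, a minimal sketch of temperature-scaled softmax (a hypothetical helper, not the module added in this commit; the temperature value and shapes are illustrative):

```python
import torch

def softmax_with_temperature(scores: torch.Tensor,
                             temperature: float = 2.0) -> torch.Tensor:
    # Dividing by a temperature > 1 shrinks the logits before exp(),
    # reducing the risk of float16 overflow (max finite value: 65504).
    return torch.softmax(scores / temperature, dim=-1)

# Illustrative attention scores: scores = q @ k.transpose(-1, -2) / sqrt(d)
scores = torch.randn(8, 128, 128)
probs = softmax_with_temperature(scores)
```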

Stable Diffusion does not use this, as torch sdpa supports float32 (or somehow avoids NaNs from large values). No visual or significant numeric differences were noticed in this output layer.

Towards #57.