
[ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for MemcpyToHost(1) node with name 'Memcpy_token_167' #17837

Closed
harsh-deepchecks opened this issue Oct 8, 2023 · 2 comments
Labels
ep:CUDA issues related to the CUDA execution provider

Comments

@harsh-deepchecks

Describe the issue

I am using the "nightdessert/WeCheck" model from Hugging Face. I am trying to apply ONNX Runtime optimizations to the exported ORT model, but I get the error below when using O4 or other GPU-specific optimizations. I also tried the O1, O2, and O3 optimization levels, but I don't see much performance difference between the original and optimized models.
I get the following error when loading the O4-optimized model:

NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for MemcpyToHost(1) node with name 'Memcpy_token_167'

The code below reproduces the error in any environment with a CUDA GPU.

To reproduce

import torch
from optimum.onnxruntime import AutoOptimizationConfig, ORTModelForSequenceClassification, ORTOptimizer

device = torch.device("cuda")

# Export the Hugging Face model to ONNX and load it as an ORT model
onnx_wecheck_model = ORTModelForSequenceClassification.from_pretrained("nightdessert/WeCheck", export=True)

# Apply the O4 (GPU-only, fp16) optimization level and save the result
optimizer = ORTOptimizer.from_pretrained(onnx_wecheck_model)
optimization_config = AutoOptimizationConfig.O4()
optimizer.optimize(save_dir='onnx-optimized-wecheck-model', optimization_config=optimization_config)

# Loading the optimized model fails with the NOT_IMPLEMENTED error above
opt_model = ORTModelForSequenceClassification.from_pretrained('onnx-optimized-wecheck-model', file_name="model_optimized.onnx", device=device)
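For completeness, inference on the loaded model would look roughly like the sketch below; the tokenizer usage follows the standard transformers pattern, and the premise/claim input pair is illustrative, not taken from the original report.

from transformers import AutoTokenizer

# Hypothetical smoke test: run once the optimized model loads successfully
tokenizer = AutoTokenizer.from_pretrained("nightdessert/WeCheck")
inputs = tokenizer("An example premise.", "An example claim.", return_tensors="pt").to(device)
logits = opt_model(**inputs).logits
print(logits)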

Urgency

No response

Platform

Linux

OS Version

20.04.5 LTS (Focal Fossa)

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

onnxruntime-gpu-1.16.0

ONNX Runtime API

Python

Architecture

X64

Execution Provider

CUDA

Execution Provider Library Version

No response

@github-actions github-actions bot added the ep:CUDA issues related to the CUDA execution provider label Oct 8, 2023
@tianleiwu
Contributor

@harsh-deepchecks,
Try adding provider="CUDAExecutionProvider", like the following:

import torch
from optimum.onnxruntime import ORTModelForSequenceClassification

opt_model = ORTModelForSequenceClassification.from_pretrained(
    'onnx-optimized-wecheck-model',
    file_name="model_optimized.onnx",
    device=torch.device("cuda"),
    provider="CUDAExecutionProvider",
)

The default provider is the CPU EP; for CUDA, you have to specify it explicitly.
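Not from the original thread, but a quick sanity check under the same setup: get_available_providers() is a standard onnxruntime API, while opt_model.model exposing the underlying InferenceSession is an assumption about optimum's internals.

import onnxruntime as ort

# CUDAExecutionProvider must be listed here, otherwise onnxruntime-gpu is not installed correctly
print(ort.get_available_providers())

# Assumption: optimum's ORTModel keeps the underlying onnxruntime.InferenceSession in `.model`
print(opt_model.model.get_providers())  # expect ['CUDAExecutionProvider', 'CPUExecutionProvider']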

@harsh-deepchecks
Author

Thanks a lot @tianleiwu, it worked perfectly.
