if use_gpu:
    backend = "cuda"
    backend_config = "cuda"
    args = ["--iree-cuda-llvm-target-arch=sm_80", "--iree-hal-cuda-disable-loop-nounroll-wa"]
    ireert.flags.FUNCTION_INPUT_VALIDATION = False
    ireert.flags.parse_flags("--cuda_allow_inline_execution")
...
# Setting up inputs on the host and moving them to the device.
host_inputs = [encoded_input["input_ids"], encoded_input["attention_mask"], encoded_input["token_type_ids"]]
if use_gpu:
    device_inputs = [ireert.asdevicearray(config.device, a) for a in host_inputs]
else:
    device_inputs = host_inputs
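As a rough illustration of the branch above (a self-contained sketch, not the actual IREE runtime: `to_device`, the fake `"cuda"` tag, and the hypothetical `encoded_input` values are stand-ins), the conditional host-to-device transfer looks like:

```python
import numpy as np

def to_device(array, device):
    # Stand-in for ireert.asdevicearray: tags the array with a device
    # name so the dispatch logic can be exercised without a GPU.
    return {"device": device, "data": np.asarray(array)}

# Hypothetical tokenizer output: three integer arrays, matching the
# keys used in the snippet above.
encoded_input = {
    "input_ids": np.array([[101, 2023, 102]]),
    "attention_mask": np.array([[1, 1, 1]]),
    "token_type_ids": np.array([[0, 0, 0]]),
}

use_gpu = True
host_inputs = [
    encoded_input["input_ids"],
    encoded_input["attention_mask"],
    encoded_input["token_type_ids"],
]

if use_gpu:
    # Mirrors: [ireert.asdevicearray(config.device, a) for a in host_inputs]
    device_inputs = [to_device(a, "cuda") for a in host_inputs]
else:
    device_inputs = host_inputs
```

The key point is that the module invocation downstream receives the same list shape either way; only the array wrapper differs between the CPU and GPU paths.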
We need to model the CUDA backend in SHARK to be similar to:
https://github.com/nod-ai/transformer-benchmarks/blob/435984a420a2f285f717aa4752c14c0cabfd8c96/benchmark.py#L397-L437