(tutorial-relay-quick-start)=

# Quick Start Tutorial for Compiling Deep Learning Models
**Authors**: [Yao Wang](https://github.com/kevinthesun), [Truman Tian](https://github.com/SiNZeRo)

This example shows how to build a neural network with the Relay Python frontend and generate a runtime library for Nvidia GPU with TVM. Note that you need to build TVM with cuda and llvm enabled.

## Overview of Hardware Backends Supported by TVM

The image below shows the hardware backends currently supported by TVM:

![](images/tvm_support_list.png)

In this tutorial, we will choose cuda and llvm as the target backends. To begin with, let's import Relay and TVM.

In [1]:
from env import tvm

In [2]:
import numpy as np

from tvm import relay
from tvm.relay import testing
import tvm
from tvm import te
from tvm.contrib import graph_executor
import tvm.testing

## Define a Neural Network in Relay

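First, define a neural network with the Relay Python frontend. For simplicity, we use the resnet-18 network predefined in Relay. Parameters are initialized with the Xavier initializer. Relay also supports other model formats such as MXNet, CoreML, ONNX and Tensorflow.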
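
To make the Xavier initializer concrete, here is a minimal standard-library sketch of the common Glorot/Xavier uniform rule, which draws weights from a uniform range whose bound is `sqrt(6 / (fan_in + fan_out))`. The helper name `xavier_uniform` and the conv-style shape layout `(out_channels, in_channels, kh, kw)` are assumptions made for illustration; this is not necessarily the exact code path Relay uses.

```python
import math
import random


def xavier_uniform(shape, seed=0):
    """Return (flat_weights, bound) for a weight tensor of the given shape.

    Assumes a conv-style layout (out_channels, in_channels, kh, kw);
    a dense weight (out_features, in_features) works as well.
    """
    out_channels, in_channels, *kernel = shape
    receptive_field = math.prod(kernel)  # equals 1 when there are no kernel dims
    fan_in = in_channels * receptive_field
    fan_out = out_channels * receptive_field
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    rng = random.Random(seed)
    count = out_channels * in_channels * receptive_field
    return [rng.uniform(-bound, bound) for _ in range(count)], bound


# Shape of a 7x7 convolution with 3 input and 64 output channels.
weights, bound = xavier_uniform((64, 3, 7, 7))
print(len(weights), bound)
```

The bound shrinks as the layer gets wider, which keeps the scale of activations roughly stable from layer to layer.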

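In this tutorial, we assume that inference runs on our device and that the batch size is set to 1. Input images are RGB color images of size 224 * 224. Call {py:meth}`tvm.relay.expr.TupleWrapper.astext` to display the network structure.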

In [3]:
batch_size = 1
num_class = 1000
image_shape = (3, 224, 224)
data_shape = (batch_size,) + image_shape
out_shape = (batch_size, num_class)

mod, params = relay.testing.resnet.get_workload(
    num_layers=18, batch_size=batch_size, image_shape=image_shape
)

# set show_meta_data=True if you want to show meta data
print(mod.astext(show_meta_data=False))

#[version = "0.0.5"]
def @main(%data: Tensor[(1, 3, 224, 224), float32], %bn_data_gamma: Tensor[(3), float32], %bn_data_beta: Tensor[(3), float32], %bn_data_moving_mean: Tensor[(3), float32], %bn_data_moving_var: Tensor[(3), float32], %conv0_weight: Tensor[(64, 3, 7, 7), float32], %bn0_gamma: Tensor[(64), float32], %bn0_beta: Tensor[(64), float32], %bn0_moving_mean: Tensor[(64), float32], %bn0_moving_var: Tensor[(64), float32], %stage1_unit1_bn1_gamma: Tensor[(64), float32], %stage1_unit1_bn1_beta: Tensor[(64), float32], %stage1_unit1_bn1_moving_mean: Tensor[(64), float32], %stage1_unit1_bn1_moving_var: Tensor[(64), float32], %stage1_unit1_conv1_weight: Tensor[(64, 64, 3, 3), float32], %stage1_unit1_bn2_gamma: Tensor[(64), float32], %stage1_unit1_bn2_beta: Tensor[(64), float32], %stage1_unit1_bn2_moving_mean: Tensor[(64), float32], %stage1_unit1_bn2_moving_var: Tensor[(64), float32], %stage1_unit1_conv2_weight: Tensor[(64, 64, 3, 3), float32], %stage1_unit1_sc_weight: Tensor[(64, 64, 1

## Compilation

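The next step is to compile the model using the Relay/TVM pipeline. Users can specify the optimization level of the compilation. Currently this value can be 0 to 3. The optimization passes include operator fusion, pre-computation, layout transformation and so on.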
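
As an illustration of what pre-computation can buy, consider inference-time batch normalization. Its per-channel statistics are constants, so the whole operation collapses into one multiply and one add whose coefficients can be computed once ahead of time. The sketch below is plain Python with hypothetical helper names (`fold_batch_norm`, `batch_norm`); it shows the idea only and is not a description of how TVM's passes are implemented.

```python
import math


def fold_batch_norm(gamma, beta, mean, var, eps=1e-5):
    """Pre-compute per-channel (scale, shift) so that BN(x) == scale * x + shift."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    shift = [b - s * m for b, s, m in zip(beta, scale, mean)]
    return scale, shift


def batch_norm(x, gamma, beta, mean, var, eps=1e-5):
    """Reference inference-time batch norm, one value per channel."""
    return [
        g * (xi - m) / math.sqrt(v + eps) + b
        for xi, g, b, m, v in zip(x, gamma, beta, mean, var)
    ]


# Made-up per-channel parameters and inputs for three channels.
gamma, beta = [1.0, 0.5, 2.0], [0.0, 0.1, -0.3]
mean, var = [0.2, -0.1, 0.0], [1.0, 0.25, 4.0]
x = [0.7, -1.2, 3.0]

scale, shift = fold_batch_norm(gamma, beta, mean, var)  # done once, at "compile time"
folded = [s * xi + t for s, xi, t in zip(scale, x, shift)]  # done per input
reference = batch_norm(x, gamma, beta, mean, var)
print(max(abs(a - b) for a, b in zip(folded, reference)))
```

The folded form gives the same result up to floating-point rounding while removing the square root and division from the per-input work.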

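{py:func}`relay.build` returns three components: the execution graph in JSON format, the TVM module library of functions compiled specifically for this graph on the target hardware, and the parameter blobs of the model. During compilation, Relay does the graph-level optimization while TVM does the tensor-level optimization, resulting in an optimized runtime module for model serving.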
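
The split into a graph, a library of compiled functions and a set of parameters can be pictured with a toy executor. Everything in the sketch below is hypothetical: the JSON field names, `toy_lib`, `toy_params` and `run` are invented for illustration and do not reflect TVM's actual graph format or runtime API. It only shows how the three pieces cooperate: the graph says what to call and in which order, the library supplies the callables, and the parameters supply the constant inputs.

```python
import json

# Hypothetical graph description; the field names are made up for illustration.
graph_json = json.dumps({
    "nodes": [
        {"name": "data", "op": "input"},
        {"name": "w", "op": "param"},
        {"name": "scaled", "op": "multiply", "inputs": ["data", "w"]},
        {"name": "out", "op": "relu", "inputs": ["scaled"]},
    ],
    "output": "out",
})

# Stand-in for the compiled operator library: operator name -> callable.
toy_lib = {
    "multiply": lambda a, b: [x * y for x, y in zip(a, b)],
    "relu": lambda a: [max(x, 0.0) for x in a],
}

# Stand-in for the parameter blobs: parameter name -> values.
toy_params = {"w": [2.0, -1.0, 0.5]}


def run(graph_json, lib, params, inputs):
    """Evaluate the nodes in their listed (topological) order and return the output."""
    graph = json.loads(graph_json)
    values = {}
    for node in graph["nodes"]:
        if node["op"] == "input":
            values[node["name"]] = inputs[node["name"]]
        elif node["op"] == "param":
            values[node["name"]] = params[node["name"]]
        else:
            args = [values[name] for name in node["inputs"]]
            values[node["name"]] = lib[node["op"]](*args)
    return values[graph["output"]]


result = run(graph_json, toy_lib, toy_params, {"data": [1.0, 3.0, -4.0]})
print(result)  # [2.0, 0.0, 0.0]
```

Keeping the three parts separate means the same graph and library can be reused with different parameter values.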

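We first compile for Nvidia GPU. Behind the scenes, {py:func}`relay.build` first does a number of graph-level optimizations, e.g. pruning and fusing, then registers the operators (i.e. the nodes of the optimized graph) to TVM implementations to generate a `tvm.module`. To generate the module library, TVM first transfers the high-level IR into the lower intrinsic IR of the specified target backend, which is CUDA in this example. Then the machine code is generated as the module library.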

In [4]:
opt_level = 3
target = tvm.target.cuda()
with tvm.transform.PassContext(opt_level=opt_level):
    lib = relay.build(mod, target, params=params)


## Run the Generated Library

Now we can create the graph executor and run the module on Nvidia GPU.

In [5]:
# create random input
dev = tvm.cuda()
data = np.random.uniform(-1, 1, size=data_shape).astype("float32")
# create module
module = graph_executor.GraphModule(lib["default"](dev))
# set input and parameters
module.set_input("data", data)
# run
module.run()
# get output
out = module.get_output(0, tvm.nd.empty(out_shape)).numpy()

# Print first 10 elements of output
print(out.flatten()[0:10])

[0.00089283 0.00103331 0.0009094  0.00102275 0.00108751 0.00106737
 0.00106262 0.00095838 0.00110792 0.00113151]
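
A common next step is to rank the classes by score. The values printed above are all close to 1/1000, which is what a near-uniform distribution over 1000 classes would look like with randomly initialized weights. The helper below is a minimal plain-Python sketch; the name `top_k` and the sample scores are made up for illustration and are not output from the model above.

```python
def top_k(probabilities, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up scores for a 6-class problem.
scores = [0.05, 0.40, 0.10, 0.25, 0.15, 0.05]
best = top_k(scores, k=3)
print(best)  # [(1, 0.4), (3, 0.25), (4, 0.15)]
```

With the array from this tutorial, the same helper could be applied to a single row of the output, for example `top_k(out[0].tolist())`.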


## Save and Load Compiled Module

We can also save the graph, lib and parameters into files and load them back in the deployment environment.

In [6]:
# save the compiled module (graph, lib and params) into a single tar file
from tvm.contrib import utils

temp = utils.tempdir()
path_lib = temp.relpath("deploy_lib.tar")
lib.export_library(path_lib)
print(temp.listdir())

['deploy_lib.tar']


In [7]:
# load the module back.
loaded_lib = tvm.runtime.load_module(path_lib)
input_data = tvm.nd.array(data)

module = graph_executor.GraphModule(loaded_lib["default"](dev))
module.run(data=input_data)
out_deploy = module.get_output(0).numpy()

# Print first 10 elements of output
print(out_deploy.flatten()[0:10])

# check whether the output from deployed module is consistent with original one
tvm.testing.assert_allclose(out_deploy, out, atol=1e-5)

[0.00089283 0.00103331 0.0009094  0.00102275 0.00108751 0.00106737
 0.00106262 0.00095838 0.00110792 0.00113151]
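
The consistency check above passes `atol=1e-5`, an absolute tolerance. As a rough sketch of what such a check means, the plain-Python function below applies the criterion used by NumPy's `assert_allclose`, namely `|actual - desired| <= atol + rtol * |desired|` for every element. The function name `all_close`, its default tolerances and the sample values are assumptions for illustration; we assume `tvm.testing.assert_allclose` follows the same rule.

```python
def all_close(actual, desired, rtol=1e-7, atol=1e-7):
    """True if |actual - desired| <= atol + rtol * |desired| holds for every pair."""
    return all(abs(a - d) <= atol + rtol * abs(d) for a, d in zip(actual, desired))


# Two made-up output vectors that differ by 2e-6 in the first element.
original = [0.00089283, 0.00103331]
reloaded = [0.00089283 + 2e-6, 0.00103331]

print(all_close(reloaded, original, atol=1e-5))  # True: 2e-6 is within 1e-5
print(all_close(reloaded, original))             # False: 2e-6 exceeds the tight default
```

Because the outputs here are on the order of 1e-3, an absolute tolerance of 1e-5 corresponds to roughly a 1% relative difference, so a tighter tolerance may be appropriate when exact reproducibility matters.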
