Complete the translation of "prepare model" #71

Merged: 3 commits, May 27, 2018

edvardHua
Contributor

Corresponding issue

@edvardHua edvardHua mentioned this pull request May 8, 2018
@leviding
Member

leviding commented May 8, 2018

resolve #28

@jasonxia23

@leviding I'd like to claim the proofreading.

@leviding
Member

leviding commented May 9, 2018

@jasonxia23 ok

@luochen1992

@leviding I'd like to claim the proofreading.

@leviding
Member

leviding commented May 9, 2018

@luochen1992 ok

@luochen1992 left a comment

Would it be better to translate "Operations" in the text as 运算 ("computation") rather than 操作?


- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto): Defines a single operation in a model. It has a unique name, a list of the names of other nodes it pulls inputs from, the operation type it implements (for example `Add`, or `Mul`), and any attributes that are needed to control that operation. This is the basic unit of computation for TensorFlow, and all work is done by iterating through a network of these nodes, applying each one in turn. One particular operation type that’s worth knowing about is `Const`, since this holds information about a constant. This may be a single, scalar number or string, but it can also hold an entire multi-dimensional tensor array. The values for a `Const` are stored inside the `NodeDef`, and so large constants can take up a lot of room when serialized.
- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto): 定义了模型中一个单独的操作。它有唯一的名字,列表储存着输入到该节点名字(来源于其他节点),实现的操作类型(譬如 `Add`,或者 `Mul`),以及控制该操作所需要的属性值。它是 TensorFlow 计算中的基础单元,所有的任务都是通过逐个迭代网络中的这些节点来完成的。有一个特别的操作是我们需要知道的,那就是 `Const`,它包含的信息是一个常量。`Const` 操作可以是一个数值或者字符串,甚至它可以保存一个多位的张量数组。`Const` 的值是储存在 `NodeDef` 里面的,所以一个大的常量会在序列化后会占据较大的空间。

『一个多位的张量数组』=>『一个多维的张量数组』 (typo: 多位, "multi-digit", should be 多维, "multi-dimensional")
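
For readers checking the translation against the NodeDef description quoted above, the idea can be sketched in plain Python. This is only an illustration under stated assumptions: the `Node` class and the `run` function are hypothetical stand-ins, not the TensorFlow API. Each node carries a unique name, an op type, the names of the nodes it pulls inputs from, and attributes; a `Const` keeps its value inside the node itself.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """Hypothetical stand-in for a NodeDef: name, op type, inputs, attributes."""
    name: str
    op: str
    inputs: List[str] = field(default_factory=list)
    attrs: Dict[str, Any] = field(default_factory=dict)


def run(nodes: List[Node], target: str) -> Any:
    """Evaluate `target` by applying the nodes it depends on, one at a time."""
    by_name = {n.name: n for n in nodes}
    cache: Dict[str, Any] = {}

    def evaluate(name: str) -> Any:
        if name in cache:
            return cache[name]
        node = by_name[name]
        args = [evaluate(i) for i in node.inputs]
        if node.op == "Const":
            result = node.attrs["value"]  # the value is stored inside the node
        elif node.op == "Add":
            result = args[0] + args[1]
        elif node.op == "Mul":
            result = args[0] * args[1]
        else:
            raise ValueError(f"unsupported op: {node.op}")
        cache[name] = result
        return result

    return evaluate(target)


# A made-up four-node graph: out = (a + b) * b
graph = [
    Node("a", "Const", attrs={"value": 2.0}),
    Node("b", "Const", attrs={"value": 3.0}),
    Node("sum", "Add", inputs=["a", "b"]),
    Node("out", "Mul", inputs=["sum", "b"]),
]
result = run(graph, "out")
```

Because the `Const` value sits in `attrs`, serializing such a node also serializes the value, which is why large constants make the serialized graph large.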

The `input_checkpoint` should be the most recent saved checkpoint. As mentioned in the checkpoint section, you need to give the common prefix to the set of checkpoints here, rather than a full filename.
`input_graph` 参数指向 `GraphDef` 文件,它包含了模型的框架。如果 `GraphDef` 文件是以文本的格式保存,也就是后缀是 `.pbtxt` 而不是 `.pb` 的话,你需要给命令添加额外的参数 `--input_binary=false`。

`input_checkpoint` 应该是最近一次保存的检查点文件。如上所属,你需要传递一个通用的前缀来引用它,而不是完成的文件名。

『如上所属』=>『如上所述』 (typo in "as mentioned above")
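
To make the "common prefix rather than a full filename" point concrete, here is a small Python sketch. The file names are made up, and the suffix pattern (`.index`, `.meta`, `.data-NNNNN-of-NNNNN`) is an assumption about how a checkpoint's files are named; `checkpoint_prefixes` is a hypothetical helper, not part of TensorFlow.

```python
import re
from typing import Iterable, Set

# Assumed suffixes of the files that make up one checkpoint.
_SUFFIX = re.compile(r"\.(index|meta|data-\d{5}-of-\d{5})$")


def checkpoint_prefixes(filenames: Iterable[str]) -> Set[str]:
    """Strip the per-file suffix to recover the shared checkpoint prefix."""
    prefixes = set()
    for name in filenames:
        stripped, count = _SUFFIX.subn("", name)
        if count:  # ignore files that are not checkpoint parts
            prefixes.add(stripped)
    return prefixes


# Hypothetical directory listing.
files = [
    "model.ckpt-1000.index",
    "model.ckpt-1000.meta",
    "model.ckpt-1000.data-00000-of-00001",
    "checkpoint",
]
prefixes = checkpoint_prefixes(files)
```

Under these assumptions, the value to pass as `input_checkpoint` would be the recovered prefix, not any one of the individual files.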


There are hundreds of operations available in TensorFlow, and each one has multiple implementations for different data types. On mobile platforms, the size of the executable binary that’s produced after compilation is important, because app download bundles need to be as small as possible for the best user experience. If all of the ops and data types are compiled into the TensorFlow library then the total size of the compiled library can be tens of megabytes, so by default only a subset of ops and data types are included.
TensorFlow 支持上百中不同的操作,而且针对不同的数据类型还有多种不同的实现。在移动平台上,为了能够获得最好的用户体验,通常情况下都会要求编译好的二进制可执行文件尽量的小。如果我们将所有的操作和数据类型都集成到 TensorFlow 库中的话,将会占据好几兆的空间,所以我们的依赖库只会包含一部分的操作和数据类型。

『支持上百中不同的操作』=>『支持上百种不同的操作』 (typo: 中 should be the measure word 种)

@luochen1992

@edvardHua @leviding Proofreading complete.

@leviding
Member

@edvardHua You can make the revisions now.

@jasonxia23 left a comment

@leviding Proofreading complete.
@edvardHua The grammar of the source was misread in many places, so the translation diverges considerably from the original.


You may find yourself getting very confused by all the different ways that TensorFlow can save out graphs. To help, here’s a rundown of some of the different components, and what they are used for. The objects are mostly defined and serialized as protocol buffers:
有时候我们会困惑为什么 TensorFlow 保存模型的方法会有这么多,它们之间的区别是什么。
为了帮助理解,下面简单的介绍了一部分不同组件用处。这些对象大多数以 Protocol Buffers 机制的形式序列化保存到文件里。

There is an extra line break here.


You may find yourself getting very confused by all the different ways that TensorFlow can save out graphs. To help, here’s a rundown of some of the different components, and what they are used for. The objects are mostly defined and serialized as protocol buffers:
有时候我们会困惑为什么 TensorFlow 保存模型的方法会有这么多,它们之间的区别是什么。
为了帮助理解,下面简单的介绍了一部分不同组件用处。这些对象大多数以 Protocol Buffers 机制的形式序列化保存到文件里。

简单的介绍
=>
简单地介绍
(the adverbial particle 地 is needed before the verb, not 的)


You may find yourself getting very confused by all the different ways that TensorFlow can save out graphs. To help, here’s a rundown of some of the different components, and what they are used for. The objects are mostly defined and serialized as protocol buffers:
有时候我们会困惑为什么 TensorFlow 保存模型的方法会有这么多,它们之间的区别是什么。
为了帮助理解,下面简单的介绍了一部分不同组件用处。这些对象大多数以 Protocol Buffers 机制的形式序列化保存到文件里。

"protocol buffers" should be translated, as “协议缓冲区”.


- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto): Defines a single operation in a model. It has a unique name, a list of the names of other nodes it pulls inputs from, the operation type it implements (for example `Add`, or `Mul`), and any attributes that are needed to control that operation. This is the basic unit of computation for TensorFlow, and all work is done by iterating through a network of these nodes, applying each one in turn. One particular operation type that’s worth knowing about is `Const`, since this holds information about a constant. This may be a single, scalar number or string, but it can also hold an entire multi-dimensional tensor array. The values for a `Const` are stored inside the `NodeDef`, and so large constants can take up a lot of room when serialized.
- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto): 定义了模型中一个单独的操作。它有唯一的名字,列表储存着输入到该节点名字(来源于其他节点),实现的操作类型(譬如 `Add`,或者 `Mul`),以及控制该操作所需要的属性值。它是 TensorFlow 计算中的基础单元,所有的任务都是通过逐个迭代网络中的这些节点来完成的。有一个特别的操作是我们需要知道的,那就是 `Const`,它包含的信息是一个常量。`Const` 操作可以是一个数值或者字符串,甚至它可以保存一个多位的张量数组。`Const` 的值是储存在 `NodeDef` 里面的,所以一个大的常量会在序列化后会占据较大的空间。

列表储存着输入到该节点名字(来源于其他节点)
=>
一个作为它拉取输入来源的其他节点的名称列表
(that is, "a list of the names of other nodes it pulls inputs from")

- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto): Defines a single operation in a model. It has a unique name, a list of the names of other nodes it pulls inputs from, the operation type it implements (for example `Add`, or `Mul`), and any attributes that are needed to control that operation. This is the basic unit of computation for TensorFlow, and all work is done by iterating through a network of these nodes, applying each one in turn. One particular operation type that’s worth knowing about is `Const`, since this holds information about a constant. This may be a single, scalar number or string, but it can also hold an entire multi-dimensional tensor array. The values for a `Const` are stored inside the `NodeDef`, and so large constants can take up a lot of room when serialized.
- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto): 定义了模型中一个单独的操作。它有唯一的名字,列表储存着输入到该节点名字(来源于其他节点),实现的操作类型(譬如 `Add`,或者 `Mul`),以及控制该操作所需要的属性值。它是 TensorFlow 计算中的基础单元,所有的任务都是通过逐个迭代网络中的这些节点来完成的。有一个特别的操作是我们需要知道的,那就是 `Const`,它包含的信息是一个常量。`Const` 操作可以是一个数值或者字符串,甚至它可以保存一个多位的张量数组。`Const` 的值是储存在 `NodeDef` 里面的,所以一个大的常量会在序列化后会占据较大的空间。

- [Checkpoint](https://www.tensorflow.org/code/tensorflow/core/util/tensor_bundle/tensor_bundle.h):通过使用 `Variable` 操作,我们也可以保存模型中的值。与 `Const` 操作不同的是,它不需要以 `NodeDef` 的形式保存内容,所以只占用 `GraphDef` 文件中很少的空间。与在训练中周期性保存内存中的值到 Checkpoint (后文检测检查点) 文件中不同的是,`Variable` 操作一般在训练中更新模型权重的时候用到。所以它是一个时序要求严格的操作,当使用分布式架构来训练模型的时候,多个 worker 会在不同的时间点执行该操作,因此储存模型的文件格式必须能够被快速读取且具备一定的扩展性。模型将会保存成多个检查点文件,包括用来描述检查点都保存了什么信息的元文件。当你在 API 中引用检查点文件时(譬如说当你将检查点文件名当参数传递给命令行),你将会用到文件的前缀来引用相关联的文件,例如:

"Checkpoint" should be translated, as 检查点.


- Are they only useful in back-propagation, for gradients? Since mobile is focused on inference, we don’t include these.
- 移动端只专注推断,因此在后向传播中计算梯度用到的操作和类型是不需要包含。

后向传播
=>
反向传播
(both render "back-propagation")

Contributor Author

Both terms should actually be fine, but 反向传播 is the more common one.


Operations are broken into two parts. The first is the op definition, which declares the signature of the operation, which inputs, outputs, and attributes it has. These take up very little space, and so all are included by default. The implementations of the op computations are done in kernels, which live in the `tensorflow/core/kernels` folder. You need to compile the C++ file containing the kernel implementation of the op you need into the library. To figure out which file that is, you can search for the operation name in the source files.
操作将会被分为两部分。第一部分是操作的定义,里面声明了操作的签名,譬如输入,输出以及属性。这些只占据很小的空间,而且都是默认包含的。操作的计算和实现都是在内核中实现的,它在源码的路径是 `tensorflow/core/kernels`,通过添加 C++ 操作的实现,您可以编译自己需要的操作到库中。通过在源文件中搜索操作的名字,您可以找到您需要的文件。

[Here’s an example search in github](https://github.com/search?utf8=%E2%9C%93&q=repo%3Atensorflow%2Ftensorflow+extension%3Acc+path%3Atensorflow%2Fcore%2Fkernels+REGISTER+Mul&type=Code&ref=searchresults).

This paragraph needs to be deleted.

### Add the implementation to the build
### 在构建中添加实现

如果您在使用 Bazel 构建安卓应用,那么需要添加 [`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) 或 [`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) 作为构建目标。同时也需要包含里面所有的 .cc 文件。如果在构建中抛出没有头文件的异常,那么您可以添加[`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525)作为构建目标。

Spaces are needed on both sides of [android_extended_ops].


If you’re using Bazel, and building for Android, you’ll want to add the files you’ve found to the [`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) or [`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) targets. You may also need to include any .cc files they depend on in there. If the build complains about missing header files, add the .h’s that are needed into the [`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525) target.
如果您使用 makefile 为 IOS 或 Raspberry Pi 等设备构建应用,那么请到[`tensorflow/contrib/makefile/tf_op_files.txt`](https://www.tensorflow.org/code/tensorflow/contrib/makefile/tf_op_files.txt)添加相关的实现文件。

IOS
=>
iOS


If you’re using Bazel, and building for Android, you’ll want to add the files you’ve found to the [`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) or [`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) targets. You may also need to include any .cc files they depend on in there. If the build complains about missing header files, add the .h’s that are needed into the [`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525) target.
如果您使用 makefile 为 IOS 或 Raspberry Pi 等设备构建应用,那么请到[`tensorflow/contrib/makefile/tf_op_files.txt`](https://www.tensorflow.org/code/tensorflow/contrib/makefile/tf_op_files.txt)添加相关的实现文件。

Spaces are needed on both sides of [tensorflow/contrib/makefile/tf_op_files.txt].

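@leviding
Member

@jasonxia23 Have all the places where the translation diverges considerably from the original been flagged? It would help if you could flag all of them; the points can be adjusted.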

@jasonxia23

@leviding Let's have the author check it over on their own first.

@leviding
Member

@edvardHua Please do a round of revisions.

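@edvardHua
Contributor Author

@leviding I've gone through and revised it. Thanks to @jasonxia23 and @luochen1992 for proofreading.

Also, I've recently been tinkering with single-person pose estimation on mobile using TensorFlow Lite. The code is open source as PoseEstimationForMobile; stars and PRs are welcome.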

@leviding
Member

@jasonxia23

@leviding
Member

@jasonxia23 Are you still going to proofread?

@jasonxia23

@leviding I'll take another look later.

@leviding
Member

@jasonxia23 As soon as you can, please.

@jasonxia23 left a comment

@edvardHua Please revise once more.
@leviding Done. Sorry, this week was too busy.


- [GraphDef](https://www.tensorflow.org/code/tensorflow/core/framework/graph.proto): Has a list of `NodeDefs`, which together define the computational graph to execute. During training, some of these nodes will be `Variables`, and so if you want to have a complete graph you can run, including the weights, you’ll need to call a restore operation to pull those values from checkpoints. Because checkpoint loading has to be flexible to deal with all of the training requirements, this can be tricky to implement on mobile and embedded devices, especially those with no proper file system available like iOS. This is where the [`freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py) script comes in handy. As mentioned above, `Const` ops store their values as part of the `NodeDef`, so if all the `Variable` weights are converted to `Const` nodes, then we only need a single `GraphDef` file to hold the model architecture and the weights. Freezing the graph handles the process of loading the checkpoints, and then converts all Variables to Consts. You can then load the resulting file in a single call, without having to restore variable values from checkpoints. One thing to watch out for with `GraphDef` files is that sometimes they’re stored in text format for easy inspection. These versions usually have a ‘.pbtxt’ filename suffix, whereas the binary files end with ‘.pb’.
- [GraphDef](https://www.tensorflow.org/code/tensorflow/core/framework/graph.proto):保存着 `NodeDefs` 列表,定义着计算图是如何被运行的。在训练中,有一些节点可能是 `Variables`,所以如果你想要一个完整的可运行的图,也即包含权重的,您需要调用恢复操作从检查点文件中提取这些值。检查点文件的格式设计的很灵活以至于能够满足我们训练的所有要求,通过一些技巧来移植模型到手机或其他嵌入设备内,尤其是像 IOS 设备那种具备特殊文件系统的。脚本 [`freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py) 就是用来生成一个完整的可运行的图的。上面我们讲解过,`Const` 操作是作为 `NodeDef` 中的值储存的,因此如果将所有的 `Variable` 转换成 `Const` 节点的话,那么一个单独的 `GraphDef` 文件就已经包含了模型的结构和权重了。冻结网络的流程包含加载检查点文件,转换 `Variables` 为 `Consts` 这两个过程。然后您便可以抛弃检查点文件,单独调用 GraphDef 文件来加载模型了。需要注意的是有时候 `GraphDef` 文件会被保存为文本的格式以便我们查看里面的值,这种情况下文件后缀为 `.pbtxt`,否则后缀为 `.pb`。

Please change IOS to iOS.
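
As a reading aid for the freezing step described in the GraphDef paragraph above, here is a toy Python sketch. It is not `freeze_graph.py` and does not use TensorFlow: nodes are plain dicts, the checkpoint is a plain dict from variable name to value, and `freeze` is a hypothetical helper. It only shows the core idea of replacing each `Variable` node with a `Const` that carries the restored weight.

```python
from typing import Any, Dict, List

NodeDict = Dict[str, Any]


def freeze(nodes: List[NodeDict], checkpoint: Dict[str, Any]) -> List[NodeDict]:
    """Return a copy of `nodes` with every Variable replaced by a Const."""
    frozen = []
    for node in nodes:
        if node["op"] == "Variable":
            frozen.append({
                "name": node["name"],
                "op": "Const",
                "inputs": [],
                # The weight is pulled from the checkpoint and stored in the node.
                "value": checkpoint[node["name"]],
            })
        else:
            frozen.append(dict(node))  # copy other nodes unchanged
    return frozen


# A made-up three-node graph and its saved weights.
graph = [
    {"name": "x", "op": "Placeholder", "inputs": []},
    {"name": "w", "op": "Variable", "inputs": []},
    {"name": "y", "op": "Mul", "inputs": ["x", "w"]},
]
weights = {"w": [0.5, 1.5]}
frozen_graph = freeze(graph, weights)
```

After this step the node list alone holds both the structure and the weights, which matches the point that a single `GraphDef` file is then enough to load the model.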


The trickiest part of this process is figuring out the names of the nodes you want to use as inputs and outputs during inference. You'll need these anyway once you start to run inference, but you also need them here so that the transform can calculate which nodes are not needed on the inference-only path. These may not be obvious from the training code. The easiest way to determine the node name is to explore the graph with TensorBoard.
这个过程中最棘手的部分就是要弄清楚在推断过程中那些节点对应的名字是作为输入和输出的。输入输出节点的名字不仅在运行推断过程中会被用到,而且在转换过程中也需要根据它来判断推断的路径从而得知那些节点是不需要的。在 Tensorboard 中视察图结构是最容易得知这些节点的方法。

那些
=>
哪些
(the sentence asks "which" nodes, so 哪些, not 那些 "those")
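
The sentence under review says the transform uses the input and output node names to work out which nodes are not needed on the inference-only path. A minimal Python sketch of that calculation follows, assuming the graph is given as a mapping from each node name to the names of its inputs. The node names and the `needed_nodes` helper are made up for illustration and are not a TensorFlow API.

```python
from typing import Dict, List, Set


def needed_nodes(
    inputs_of: Dict[str, List[str]],
    input_names: List[str],
    output_names: List[str],
) -> Set[str]:
    """Walk backwards from the outputs, stopping at the chosen input nodes."""
    keep: Set[str] = set()
    stack = list(output_names)
    while stack:
        name = stack.pop()
        if name in keep:
            continue
        keep.add(name)
        if name in input_names:
            continue  # values are fed here at run time, so nothing upstream is needed
        stack.extend(inputs_of.get(name, []))
    return keep


# Hypothetical graph: node name -> names of the nodes it reads from.
inputs_of = {
    "DecodeJpeg": [],
    "Cast": ["DecodeJpeg"],
    "Mul": ["Cast"],
    "conv": ["Mul"],
    "softmax": ["conv"],
    "save": ["conv"],
}
keep = needed_nodes(inputs_of, ["Mul"], ["softmax"])
unneeded = sorted(set(inputs_of) - keep)
```

With `Mul` as the input and `softmax` as the output, the decoding nodes upstream of `Mul` and the training-only `save` node fall outside the kept set.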


Remember that mobile applications typically gather their data from sensors and have it as arrays in memory, whereas training typically involves loading and decoding representations of the data stored on disk. In the case of Inception v3 for example, there’s a `DecodeJpeg` op at the start of the graph that’s designed to take JPEG-encoded data from a file retrieved from disk and turn it into an arbitrary-sized image. After that there’s a `BilinearResize` op to scale it to the expected size, followed by a couple of other ops that convert the byte data into float and scale the value magnitudes in the way the rest of the graph expects. A typical mobile app will skip most of these steps because it’s getting its input directly from a live camera, so the input node you will actually supply will be the output of the `Mul` node in this case.
请记住,移动应用程序通常从传感器收集数据,并将其作为内存中的数组,但是训练过程通常涉及对储存在磁盘上的数据进行加载和解码。例如,在 Inception V3 的情况下,图的开始部分有一个 `DecodeJpeg` 操作,它设计的目的是将从磁盘检索到的文件中的 jpeg 编码数据转换成任意大小的图像。在此之后,有一个`双线性调整`操作将其扩展到预期的大小,然后是其他一些操作,它们将字节数据转换为浮点数,并按图中其余部分所期望的方式缩放数值。一个典型的移动应用程序会跳过这些的步骤,因为它直接从摄像头中实时获得输入,所以你将提供的输入节点将是 `Mul` 节点的输出。

双线性调整
=>
BilinearResize


- Are they useful mainly for other training needs, such as checkpoint saving? These we leave out.
- 移动端只专注推断,因此在后向传播中计算梯度用到的操作和类型是不需要包含。

是不需要包含
=>
不需要包含


If you’re using Bazel, and building for Android, you’ll want to add the files you’ve found to the [`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) or [`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) targets. You may also need to include any .cc files they depend on in there. If the build complains about missing header files, add the .h’s that are needed into the [`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525) target.
如果您使用 makefile 为 IOS 或 Raspberry Pi 等设备构建应用,那么请到 [`tensorflow/contrib/makefile/tf_op_files.txt`](https://www.tensorflow.org/code/tensorflow/contrib/makefile/tf_op_files.txt) 添加相关的实现文件。

IOS
=>
iOS

@leviding
Member

@edvardHua You can make the revisions now.

@leviding leviding merged commit d6ef421 into xitu:zh-hans May 27, 2018
@leviding
Member

I've fixed the issues myself.
