
add rnn en doc #9809

Merged
merged 5 commits into from
Apr 24, 2018

Contributor

@Superjomn Superjomn commented Apr 10, 2018

fix #9574

@shanyi15 shanyi15 self-requested a review April 11, 2018 08:11
Collaborator

@shanyi15 shanyi15 left a comment

I only reviewed the grammar; please check my suggestions.
Please reconfirm the technical wording.
Thanks.

The existing RNN implementations of the PaddlePaddle is `RecurrentLayerGroup`,
which supports the variable length sequences without padding.
This doc will design fluid's RNN based on this idea.

Collaborator

The Background section was not translated. Is that part no longer needed?

Contributor Author

It can be dropped; the background has changed a lot since the design was written.

This doc will design fluid's RNN based on this idea.

## Multi-layer sequence data format `LODTensor`
At present, Paddle will store data in one mini-batch in one-dimensional array.
Collaborator

"At present" and "will" conflict in tense; you could use "Paddle stores data....."


`Argument.sequenceStartPositions` is used to store information for each sentence.

In Paddle, `Argument.subSequenceStartPositions` is used to store 2 levels of sequence information, while higher dimensional sequences can not be supported.
Collaborator

The Chinese original says “两层”. I am not sure whether translating it as "2 levels" is accurate (or should it be "2 layers"?).

Contributor Author

"2 levels" here means two hierarchical levels.
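
To make the "two levels" in this thread concrete, here is a minimal sketch of a two-level offset layout over a flat one-dimensional array. All names (`lengths_to_offsets`, `sentence_offsets`, `paragraph_offsets`) are hypothetical and chosen for illustration. The layout is one possible convention, in which the outer level indexes into the inner level; it is not a statement of how `Argument.subSequenceStartPositions` lays out its data.

```python
from typing import List


def lengths_to_offsets(lengths: List[int]) -> List[int]:
    """Turn per-item lengths into start offsets, with a trailing end offset."""
    offsets = [0]
    for n in lengths:
        offsets.append(offsets[-1] + n)
    return offsets


# Assumed toy batch: 2 paragraphs; the first has 2 sentences, the second has 1.
# The sentences contain 3, 2 and 4 words, stored back to back without padding.
sentence_lengths = [3, 2, 4]
sentences_per_paragraph = [2, 1]

# Inner level: word offsets of each sentence in the flat array.
sentence_offsets = lengths_to_offsets(sentence_lengths)
# Outer level: paragraph boundaries, expressed as indices into sentence_offsets.
paragraph_offsets = lengths_to_offsets(sentences_per_paragraph)


def sentence_slice(flat: List[int], i: int) -> List[int]:
    """Return the words of sentence i."""
    return flat[sentence_offsets[i]:sentence_offsets[i + 1]]


def paragraph_sentences(flat: List[int], p: int) -> List[List[int]]:
    """Return the sentences of paragraph p, each as its own list of words."""
    first, last = paragraph_offsets[p], paragraph_offsets[p + 1]
    return [sentence_slice(flat, i) for i in range(first, last)]


flat_words = list(range(9))  # 9 word ids in one flat array
print(paragraph_sentences(flat_words, 0))
print(paragraph_sentences(flat_words, 1))
```

Each level is just a list of offsets, so a third level could be added in the same way, which is the limitation of a fixed two-field design that the quoted line points out.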

```
Each `level_t` here stores a level of offset information consistent with paddle's current practice.

In order to transmit sequence information more transparently, we have introduced a new tensor called `LODTensor`[4].
Collaborator

The Background section also contains references [1][2][3]; if it is not translated, [4] will appear out of nowhere.

};
```
Among them, `lod_start_pos_` uses `shared_ptr` to reduce the cost of storage and replication.
Think of `LODTensor` as an extension of `Tensor`, which is almost completely compatible with the original `Tensor`.
Collaborator

LODTensor can be thought of as an extension of Tensor
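
As a rough illustration of the "extension of `Tensor`" idea and of sharing the offset table instead of copying it, here is a small Python sketch. `SimpleTensor`, `SimpleLoDTensor` and `with_data` are hypothetical names, not part of PaddlePaddle. Python object references stand in for the `shared_ptr` mentioned in the quoted text, and the use of immutable tuples is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One level of offsets; a tuple so that consumers cannot modify it in place.
Level = Tuple[int, ...]


@dataclass
class SimpleTensor:
    """Stand-in for a plain tensor: just a flat list of values."""
    data: List[float]


@dataclass
class SimpleLoDTensor(SimpleTensor):
    """A tensor plus optional sequence offsets, usable wherever a SimpleTensor is."""
    lod: Tuple[Level, ...] = ()

    def num_levels(self) -> int:
        return len(self.lod)

    def with_data(self, data: List[float]) -> "SimpleLoDTensor":
        # The new tensor refers to the same offsets object; nothing is copied.
        return SimpleLoDTensor(data=data, lod=self.lod)


# Two sequences of lengths 2 and 3 stored in one flat array.
x = SimpleLoDTensor(data=[1.0, 2.0, 3.0, 4.0, 5.0], lod=((0, 2, 5),))
# An element-wise operation keeps the sequence layout, so it reuses x.lod.
y = x.with_data([v * 2 for v in x.data])

print(y.lod is x.lod)               # both tensors share one offsets object
print(isinstance(y, SimpleTensor))  # still usable as a plain tensor
```

Keeping the offsets read-only is what makes this kind of sharing safe: no consumer can change the layout seen by the other holders.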


## How to support the framework
### Replace `Tensor` with `LoDTensor`
To implement the passing of `LODTensor`, many `Tensor` in the framework need to be `LODTensor`.
Collaborator

Earlier in the doc, 传递 was translated as "transmit". For consistency, change "passing" here to "transmit" as well.


Collaborator

Translating “变成” as "become" would fit the context better.

Simple implementation, directly ** replace all previous `Tensor` with `LODTensor`, where you can directly modify the `Tensor` interface created in `pybind.cc`.
Collaborator

This was probably meant to be bold, but the closing ** is missing, so it does not render correctly in the preview.


In addition, the user may need to perceive the existence of a sequence (such as the sequence of the visualization needs to parse the output sequence in the model), so some of the serial operation APIs also need to be exposed to the python layer.

### Pass `lod_start_pos` along with the Op call chain
Collaborator

"pass" vs. "transmit": please use one term consistently.

The framework needs to support the following features to implement the passing of `lod_start_pos`:

1. Implement the transfer as `shared_ptr`
     - Do not modify the contents of `lod_start_pos` as a consumer
Collaborator

The formatting here may be wrong; please check it in the preview.

Contributor Author

Superjomn commented Apr 16, 2018

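A considerable part of this design was never implemented, but I do not have time for a major rewrite, so I removed some of the parts that differ from the implementation. @shanyi15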

Collaborator

@shanyi15 shanyi15 left a comment

LGTM!

@shanyi15 shanyi15 merged commit 576b9fd into PaddlePaddle:develop Apr 24, 2018

Successfully merging this pull request may close these issues.

Translate RNN 变长输入设计 (RNN variable-length input design) to English
2 participants