Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/beam search op #5052

Merged
merged 14 commits into from
Nov 15, 2017
Merged

Conversation

Superjomn
Copy link
Contributor

@Superjomn Superjomn commented Oct 24, 2017

the beam search operator for source decoder.

fixes: #5018

leave the python test in another PR, after source decoder is ready.

Copy link
Contributor

@lcy-seso lcy-seso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • My concern is that comments in this PR are hard to follow for users.
  • I am not sure it can work well with other dependent operators, because it lacks test and modifications of some merged operators may be required, so are we going to merge this PR right now?

AddAttr<int>("beam_size", "beam size for beam search");

AddComment(
"This is a beam search operator that help to generate sequences.");
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems oversimple, making it hard to understand how this operator works for most users (I guess it is also hard to understand for reviewers).
For example:

  • what is the format of its inputs/outputs?
  • what are the operators it should be co-work with (this operator itself cannot complete the entire beam search process)

and so on ...

framework::OpAttrChecker *op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
// inputs and outputs stored in proto
AddInput("ids", "the candidate ids");
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I prefer each comment is a complete sentence. I think it is better to use the upper casing for the first letter in a sentence, and add period at the end sentences, and make sure the comment is a meaningful sentence(easy to understand) rather than a self-defined phrase.
  • I prefer to use the unabbreviated word index instead of ids in the comment. Because I think the latter one is not a commonly used abbreviation for everyone.

distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. */
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation of the license has some problem. It should be like that in the accuracy_op.h. (Confirmed this with qingqing.)

return true;
}

void BeamSearchAlgorithm::CollectSelectedResult(
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I prefer to use BeamSearch instead of BeamSearchAlgorithm. But if using BeamSearch, the name seems to be confusing with the operator's name.
My concern is that:

  • In most references I have read, people usually user "beam search", and "beam search algorithm" seems a little strange to me (This is a personal feeling, I think maybe we can listen to other reviewer's advice).
  • This operator has only been involved in two steps in the entire beam search process: "the beam expansion" and "beam shrink". Itself cannot complete the entire generation process. It has to co-work with other operators and the framework. So for me, BeamSearchAlgorithm seems to be a "too large word".

auto *scores_data = selected_scores->data<score_t>();

for (size_t high_id = 0; high_id < ids_->NumElements(lod_level_);
high_id++) { // source sentence level
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Personally, I like to use i++ instead of ++i, but admittedly, there is little difference when i is an integer. (The google style also says that there is no reason to give preference when i is an integer, so I think there is no strong reason to change this.)
  • The comment // source sentence level here is an incomplete sentence. We have had an online discussion, so I can understand it, but I am worried it may be hard to follow for many other users.

* This is an implementation of beam search algorithm, input the candidate ids
* and candidate scores, it will sort and select the top beam_size candidates,
* return the corresponding ids and scores.
*/
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kind of information can also be added into the operator's comments to let users know some details of this operators: the input of this operator expects and what it outputs.

As a user, I certainly also want to know (at least) the below questions:

  • Where are the candidate ids and the candidate scores from? Which operator generates the candidate ids and the candidate scores?
  • Who will use the returned results of this operator?

using id_t = size_t;
using score_t = float;
/*
* Input the arguments that needed by this class.
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comments lack of useful information. This makes me think what are the meanings of these inputs? and who produces these inputs(this determines how I can use this operator).

* It stores the corresponding scores of candidate ids in selected_ids.
*
* Return false if all the input tensor is empty, in machine translation task
* that means no candidates is provided, and the task will stop running.
Copy link
Contributor

@lcy-seso lcy-seso Oct 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • a small grammatical mistake: Return false if all the input tensors are empty.
  • Is it necessary to make the comments/this operator highly depends on the machine translation task?

const std::vector<std::vector<Item>>& inputs);

/*
* For each source, select top beam_size records.
Copy link
Contributor

@lcy-seso lcy-seso Oct 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each unfinished prefix.

I think we can avoid using "source" because it is not a common concept in standard beam search. The source is used in the seq-to-seq model, but beam search can be used more than that.

auto& scores = scores_var->Get<framework::LoDTensor>();
size_t level = Attr<int>("level");
size_t beam_size = Attr<int>("beam_size");
BeamSearchAlgorithm alg(ids, scores, level, beam_size);
Copy link
Contributor

@lcy-seso lcy-seso Oct 29, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think BeamSearchAlgorithm is not a good name. I wrote my reasons in the above comments.

* ids:
* LoD (should have 2 levels)
* first level: [0, 1, 3]
* second level: [0, 1, 2, 3, 4]
Copy link
Member

@jacquesqiao jacquesqiao Nov 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[0, 1, 2, 3, 4] should be [0, 3, 6, 9]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first level should be [0, 1, 4], the second level is right.

Copy link
Member

@jacquesqiao jacquesqiao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Superjomn Superjomn merged commit 09866fb into PaddlePaddle:develop Nov 15, 2017
@Superjomn Superjomn deleted the feature/beam_search_op branch November 15, 2017 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Beam search operator
4 participants