Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bilistm_crf模型中的为啥要sort_by_lengths(word_lists, tag_lists),作用是啥。 #37

Open
tianke0711 opened this issue Jun 6, 2021 · 1 comment

Comments

@tianke0711
Copy link

tianke0711 commented Jun 6, 2021

bilistm_crf模型中的为啥要sort_by_lengths(word_lists, tag_lists),作用是啥。在训练中有啥好处。 如果不排序是否也没事呢。


def sort_by_lengths(word_lists, tag_lists):
    pairs = list(zip(word_lists, tag_lists))
    indices = sorted(range(len(pairs)),
                     key=lambda k: len(pairs[k][0]),
                     reverse=True)
    pairs = [pairs[i] for i in indices]
    # pairs.sort(key=lambda pair: len(pair[0]), reverse=True)

    word_lists, tag_lists = list(zip(*pairs))

    return word_lists, tag_lists, indices
@qukequke
Copy link

前面没对batch中的句子做补零,在取到batch之后,用batch中的第一个句子的长度作为最大长度 对batch内的剩余句子进行补零,所以排序是有用的

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants