[TN] speed up compilation #35

Merged
merged 1 commit into from
Oct 9, 2022

pengzhendong
Member

No description provided.

@xingchensong xingchensong merged commit 5f44a5a into master Oct 9, 2022
@xingchensong xingchensong deleted the zhendong-tn branch October 9, 2022 09:07
| add_weight(math, 1.08)
| add_weight(char, 100))
# insert space between tokens, and remove the last space
tagger = self.build_rule(tagger + insert(' '))
Member Author

build_rule calls the cdrewrite function, and cdrewrite(tagger, '', '', self.VSIGMA) achieves the same effect as tagger.star. However, cdrewrite internally composes tagger with self.VSIGMA, which makes it very slow.

Member

Great Catch!

@xingchensong
Member

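From a recognition perspective, this is equivalent to:

  1. Before: TL + Union(G_1, G_2). The union is taken over the G components, and the result then has to be composed with TL. Here TL corresponds to VSIGMA, and the different G components correspond to the different taggers.
  2. Now: Union(TLG_1, TLG_2). The union is taken over the complete TLG components, so no compose is needed.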
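
The equivalence of the two orderings can be checked on a toy model. The sketch below is not the project's pynini code: it assumes each transducer can be treated as a finite set of (input, output) pairs, and the contents of `TL`, `G_1`, and `G_2` are made up for illustration. It only shows that composing with a union yields the same relation as taking the union of the already-composed pieces, which is why the union can be moved outside without changing the result.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


# Hypothetical toy relations; the names mirror the comment above.
TL = {("x", "p"), ("y", "q"), ("z", "p")}
G_1 = {("p", "one")}
G_2 = {("q", "two"), ("p", "uno")}

# Before: union the G components first, then compose the whole union with TL.
before = compose(TL, G_1 | G_2)

# Now: each piece is already complete (TLG_i), and the pieces are only unioned.
TLG_1 = compose(TL, G_1)
TLG_2 = compose(TL, G_2)
after = TLG_1 | TLG_2

assert before == after
print(sorted(after))
```

In this toy version the per-piece compositions are still computed explicitly; the point is only that both orderings describe the same relation.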
