LexRank: Graph-based Lexical Centrality as Salience in Text Summarization, Erkan+, Journal of Artificial Intelligence Research, 2004 #215

AkihikoWatanabe opened this issue Jan 1, 2018 · 1 comment


http://www.jair.org/media/1523/live-1523-2354-jair.pdf


AkihikoWatanabe commented Jan 17, 2018

A representative graph-based (multi-)document summarization method.
Almost the same method as #214.

Two variants are proposed:

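  • [LexRank] Build a bag-of-words vector for each sentence from tf-idf scores, compute cosine similarity, and add an edge only between sentences whose similarity is at or above a threshold (edge weights are normalized into probabilities). Then run PageRank with the power method.
  • [ContinuousLexRank] Build a bag-of-words vector for each sentence from tf-idf scores, compute an affinity graph from the cosine similarities, and apply PageRank (power method).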
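
The two variants above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`tfidf_vectors`, `cosine`, `lexrank`), whitespace tokenization, the smoothed idf formula, the default `threshold=0.1`, and the PageRank-style `damping=0.85` are all assumptions made here. Passing `threshold=None` keeps the raw cosine weights, which corresponds to the continuous variant.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse tf-idf vector (dict: word -> weight) per sentence."""
    tokenized = [s.lower().split() for s in sentences]  # naive tokenizer (assumption)
    n = len(tokenized)
    # Document frequency: number of sentences that contain each word.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    # Smoothed idf (assumption) so that no word gets a zero weight.
    idf = {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}
    return [{w: tf * idf[w] for w, tf in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def lexrank(sentences, threshold=0.1, damping=0.85, tol=1e-8, max_iter=1000):
    """Score a non-empty list of sentences; threshold=None gives the continuous variant."""
    vectors = tfidf_vectors(sentences)
    n = len(vectors)
    sim = [[cosine(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]
    if threshold is not None:
        # Thresholded variant: keep an unweighted edge only where sim >= threshold.
        sim = [[1.0 if s >= threshold else 0.0 for s in row] for row in sim]
    # Normalize each row into a probability distribution (row-stochastic matrix).
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([x / total for x in row] if total > 0 else [1.0 / n] * n)
    # Power method: iterate until the score vector stops changing.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new = [(1 - damping) / n
               + damping * sum(scores[i] * trans[i][j] for i in range(n))
               for j in range(n)]
        delta = sum(abs(a - b) for a, b in zip(new, scores))
        scores = new
        if delta < tol:
            break
    return scores
```

A summary would then be formed by taking the highest-scoring sentences, e.g. `sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]` for some hypothetical budget `k`.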

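Evaluated on DUC2003 and DUC2004 (MDS).
Outperforms centroid-based methods in terms of ROUGE-1.
They also run experiments in which 17% of each document cluster is noisy data, and show that performance degrades only slightly when the noisy data is added.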
