tanajp/SIFRank_ja

SIFRank

Original paper: SIFRank: A New Baseline for Unsupervised Keyphrase Extraction Based on Pre-trained Language Model

Requirements

allennlp==0.8.4
nltk==3.4.3
torch==1.2.0
stanza==1.0.0

Download

  • Download the ELMo (Japanese model) weights.hdf5 from here, and save it to the auxiliary_data/ directory.
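
Before loading the model, it can help to confirm that the weights file actually landed in auxiliary_data/. The following is a minimal sketch of such a check; `find_weights` and its `repo_root` argument are hypothetical names introduced here for illustration and are not part of this repository.

```python
from pathlib import Path


def find_weights(repo_root, filename="weights.hdf5"):
    """Return the path to the ELMo weights under auxiliary_data/.

    `find_weights` is a hypothetical helper, not part of SIFRank_ja.
    Raises FileNotFoundError if the file has not been downloaded yet.
    """
    weights = Path(repo_root) / "auxiliary_data" / filename
    if not weights.is_file():
        raise FileNotFoundError(
            f"{weights} not found; download the Japanese ELMo weights first"
        )
    return weights
```

For instance, `str(find_weights('/content/drive/My Drive/SIFRank_ja'))` could be used as the value of `weight_file` in the sample usage, assuming the repository lives at that path.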

Sample usage

import sys

# Make the repository and its embeddings package importable.
# These paths assume the repository is stored on Google Drive (e.g. in Colab).
sys.path.append('/content/drive/My Drive/SIFRank_ja')
sys.path.append('/content/drive/My Drive/SIFRank_ja/embeddings')

import stanza
import sent_emb_sif, word_emb_elmo
from model.method import SIFRank, SIFRank_plus

# Download from https://allennlp.org/elmo
options_file = "https://exawizardsallenlp.blob.core.windows.net/data/options.json"
weight_file = "/content/drive/My Drive/SIFRank_ja/auxiliary_data/weights.hdf5"

# cuda_device=0 runs ELMo on the first GPU.
ELMO = word_emb_elmo.WordEmbeddings(options_file, weight_file, cuda_device=0)
SIF = sent_emb_sif.SentEmbeddings(ELMO, lamda=1.0)
ja_model = stanza.Pipeline(
    lang="ja", processors={}, use_gpu=True
)
# One weight per ELMo layer; here only the middle layer is used.
elmo_layers_weight = [0.0, 1.0, 0.0]

# Placeholder meaning "Please enter your text here."; replace it with Japanese text.
text = "ここにテキストを入力してください。"
# Extract the top N=5 keyphrases with SIFRank and with SIFRank+.
keyphrases = SIFRank(text, SIF, ja_model, N=5, elmo_layers_weight=elmo_layers_weight)
keyphrases_ = SIFRank_plus(text, SIF, ja_model, N=5, elmo_layers_weight=elmo_layers_weight)

print(keyphrases)
print(keyphrases_)
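
To give some intuition for `elmo_layers_weight`, here is a small sketch of one way per-layer weights can enter the ranking score: as a weighted sum of per-layer cosine similarities between a document embedding and a candidate-phrase embedding. This is an illustration under that assumption, not the code in model/method.py; `cosine` and `layer_weighted_similarity` are hypothetical names, and the toy two-dimensional vectors stand in for real ELMo layer outputs.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def layer_weighted_similarity(doc_layers, phrase_layers, layers_weight):
    """Sum the per-layer cosine similarities, scaled by the layer weights.

    doc_layers / phrase_layers: one vector per ELMo layer (three in total).
    layers_weight: one weight per layer, e.g. [0.0, 1.0, 0.0].
    Hypothetical helper for illustration only.
    """
    return sum(
        w * cosine(d, p)
        for w, d, p in zip(layers_weight, doc_layers, phrase_layers)
    )


# Toy embeddings: the two texts agree only on the middle layer.
doc = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
phrase = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
score = layer_weighted_similarity(doc, phrase, [0.0, 1.0, 0.0])
```

With the weights `[0.0, 1.0, 0.0]` from the sample usage, only the middle layer contributes to `score`; shifting weight to the other layers would change which candidates rank highest.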

Cite

If you use this code, please cite this paper:

@article{DBLP:journals/access/SunQZWZ20,
  author    = {Yi Sun and
               Hangping Qiu and
               Yu Zheng and
               Zhongwei Wang and
               Chaoran Zhang},
  title     = {SIFRank: {A} New Baseline for Unsupervised Keyphrase Extraction Based
               on Pre-Trained Language Model},
  journal   = {{IEEE} Access},
  volume    = {8},
  pages     = {10896--10906},
  year      = {2020},
  url       = {https://doi.org/10.1109/ACCESS.2020.2965087},
  doi       = {10.1109/ACCESS.2020.2965087},
  timestamp = {Fri, 07 Feb 2020 12:04:22 +0100},
  biburl    = {https://dblp.org/rec/journals/access/SunQZWZ20.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
