Graph-based Biomedical Text Summarizer
-
Download the source code of the graph-based biomedical text summarizer.
-
Extract the zip file.
-
Download the BERT repository from https://github.com/google-research/bert, and copy its files into the BERT directory that ships with the summarizer.
-
Download a pretrained BioBERT model from https://github.com/naver/biobert-pretrained, and copy its files into the BERT directory that ships with the summarizer.
-
Copy your input document (preferably a .txt file) into the INPUT directory that ships with the summarizer.
-
Run the summarizer with the following command:
- python Summarizer.py -i INPUT_FILE_NAME -o OUTPUT_FILE_NAME -c COMPRESSION_RATE -k TOP_K_SIMILARITY -r RANKING_ALGORITHM
-
Five parameters must be specified when running the script:
- INPUT_FILE_NAME is the name of the input file already copied to the INPUT directory.
- OUTPUT_FILE_NAME is the name of the output file, created in the OUTPUT directory, that will contain the summary.
- COMPRESSION_RATE specifies the size of the summary and takes a value in the range (0, 1).
- TOP_K_SIMILARITY specifies the top K percent of similarity values between sentences that are used to construct the edges of the graph.
- RANKING_ALGORITHM specifies the graph ranking algorithm and takes one of the values pr, hits, or ppf.
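The five flags above could be parsed and validated along the following lines. This is only a minimal sketch using Python's standard argparse module, not the actual code in Summarizer.py. The function names, the destination names, and the sample values (paper.txt, summary.txt, a TOP_K_SIMILARITY of 10 read as 10 percent) are assumptions made for illustration.

```python
import argparse


def compression_rate(text):
    """Convert the -c value to a float and require 0 < value < 1."""
    value = float(text)
    if not 0.0 < value < 1.0:
        raise argparse.ArgumentTypeError(
            "COMPRESSION_RATE must lie strictly between 0 and 1")
    return value


def build_parser():
    """Build a parser for the five required flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(prog="Summarizer.py")
    parser.add_argument("-i", dest="input_file", required=True,
                        help="file name inside the INPUT directory")
    parser.add_argument("-o", dest="output_file", required=True,
                        help="file name to create in the OUTPUT directory")
    parser.add_argument("-c", dest="compression_rate", required=True,
                        type=compression_rate,
                        help="summary size, a value in (0, 1)")
    parser.add_argument("-k", dest="top_k", required=True, type=float,
                        help="top K percent of similarity values kept as edges")
    parser.add_argument("-r", dest="ranking", required=True,
                        choices=("pr", "hits", "ppf"),
                        help="graph ranking algorithm")
    return parser


# Sample invocation with made-up values; a real run would call parse_args()
# with no arguments so that the command line is read instead.
args = build_parser().parse_args(
    ["-i", "paper.txt", "-o", "summary.txt", "-c", "0.3", "-k", "10", "-r", "pr"])
print(args.input_file, args.compression_rate, args.ranking)
```

Validating the values up front means that a mistyped algorithm name or an out-of-range compression rate is rejected before any model is loaded.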
-
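To illustrate how COMPRESSION_RATE and TOP_K_SIMILARITY might interact, here is a rough, self-contained sketch of a graph-based ranking pipeline. It is not the summarizer's own implementation. It rests on several assumptions: the tiny 2-D vectors stand in for BioBERT sentence embeddings, cosine similarity is the similarity measure, an unweighted PageRank stands in for the ranking step (whether that matches the pr option is not confirmed here), and COMPRESSION_RATE is read as the fraction of sentences kept. All function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_edges(vectors, k_percent):
    """Keep only the top k_percent most similar sentence pairs as edges."""
    n = len(vectors)
    pairs = [(cosine(vectors[i], vectors[j]), i, j)
             for i in range(n) for j in range(i + 1, n)]
    pairs.sort(reverse=True)
    keep = max(1, math.ceil(len(pairs) * k_percent / 100))
    return [(i, j, sim) for sim, i, j in pairs[:keep]]


def pagerank(n, edges, damping=0.85, iterations=50):
    """Unweighted PageRank by power iteration on an undirected graph."""
    neighbors = {i: [] for i in range(n)}
    for i, j, _ in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            if neighbors[i]:
                share = damping * scores[i] / len(neighbors[i])
                for j in neighbors[i]:
                    new[j] += share
            else:
                # A sentence with no edges spreads its score evenly.
                for j in range(n):
                    new[j] += damping * scores[i] / n
        scores = new
    return scores


def select_sentences(scores, rate):
    """Return indices of the highest-ranked sentences, in document order."""
    size = max(1, round(len(scores) * rate))
    best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(best[:size])


# Toy "embeddings" for four sentences; sentence 2 is unlike the others.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
edges = top_k_edges(vectors, k_percent=50)
scores = pagerank(len(vectors), edges)
print(select_sentences(scores, rate=0.5))
```

In this sketch a smaller K yields a sparser graph, and a smaller compression rate yields a shorter summary.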
When summarization finishes, the summary can be found in the OUTPUT directory that ships with the summarizer.
Note: A newer version of the summarizer that works with Word2vec and GloVe embeddings will be uploaded by the end of November 2019.