sume

The sume module is an automatic summarization library written in Python.

Description

sume contains the following extraction algorithms:

Concept-based ILP model for summarization (Gillick & Favre, 2009)
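For orientation, the concept-based model is usually written as the integer linear program below. The notation is the conventional one from Gillick & Favre (2009), not something defined in this README: c_i indicates whether concept i is covered, w_i is its weight, s_j indicates whether sentence j is selected, l_j is its length, Occ_ij is 1 when concept i occurs in sentence j, and L is the summary length budget.

```latex
\begin{align}
\max \quad & \sum_i w_i c_i \\
\text{s.t.} \quad & \sum_j l_j s_j \le L \\
& s_j \, \mathrm{Occ}_{ij} \le c_i \quad \forall i, j \\
& \sum_j s_j \, \mathrm{Occ}_{ij} \ge c_i \quad \forall i \\
& c_i \in \{0, 1\} \; \forall i, \qquad s_j \in \{0, 1\} \; \forall j
\end{align}
```

The third constraint forces a concept to count as covered whenever a sentence containing it is selected; the fourth prevents a concept from counting unless at least one such sentence is selected.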

A typical usage of this module is:

import sume

# directory containing the text documents to summarize; each input file
# must contain one tokenized sentence per line
dir_path = "/tmp/"

# create a summarizer, here a concept-based ILP model
s = sume.models.ConceptBasedILPSummarizer(dir_path)

# load documents with extension 'txt'
s.read_documents(file_extension="txt")

# compute the parameters needed by the model
# extract bigrams as concepts
s.extract_ngrams()

# compute document frequency as concept weights
s.compute_document_frequency()

# prune sentences that are shorter than 10 words, identical sentences and
# those that begin and end with a quotation mark
s.prune_sentences(mininum_sentence_length=10,
                  remove_citations=True,
                  remove_redundancy=True)

# solve the ILP model
value, subset = s.solve_ilp_problem()

# output the summary
print('\n'.join([s.sentences[j].untokenized_form for j in subset]))
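To make the steps above concrete without installing sume, here is a minimal pure-Python sketch of the same idea: bigrams as concepts, document frequency as weights, and selection of the sentence subset that maximizes the total weight of covered concepts under a word budget. It does not use or reproduce sume's code. The toy documents, the `bigrams` and `best_summary` helpers, and the budget of 12 words are all assumptions made for illustration, and an exhaustive search stands in for an ILP solver, which is only practical for a handful of sentences.

```python
from itertools import combinations

# Toy input (hypothetical): each document is a list of tokenized sentences.
documents = [
    ["the cat sat on the mat", "a dog barked at the cat"],
    ["the cat sat on the sofa", "the dog slept all day"],
]


def bigrams(sentence):
    """Return the set of word bigrams in a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


# Flatten the sentences, remembering which document each one came from.
sentences = [(doc_id, s) for doc_id, doc in enumerate(documents) for s in doc]
concepts = [bigrams(s) for _, s in sentences]
lengths = [len(s.split()) for _, s in sentences]

# Document frequency: the number of documents in which each bigram occurs.
doc_freq = {}
for doc_id in range(len(documents)):
    seen = set()
    for (d, _), c in zip(sentences, concepts):
        if d == doc_id:
            seen |= c
    for b in seen:
        doc_freq[b] = doc_freq.get(b, 0) + 1


def best_summary(budget):
    """Exhaustively search for the sentence subset with maximal concept weight.

    Each covered concept is counted once, and the total length of the
    selected sentences may not exceed `budget` words.
    """
    best_value, best_subset = 0, ()
    indices = range(len(sentences))
    for size in range(1, len(sentences) + 1):
        for subset in combinations(indices, size):
            if sum(lengths[j] for j in subset) > budget:
                continue
            covered = set().union(*(concepts[j] for j in subset))
            value = sum(doc_freq[b] for b in covered)
            if value > best_value:
                best_value, best_subset = value, subset
    return best_value, best_subset


value, subset = best_summary(budget=12)
print("\n".join(sentences[j][1] for j in subset))
```

Because each concept is counted once no matter how many selected sentences contain it, the search favors sentences that add new bigrams over near-duplicates, which is the behavior the concept-based model is designed to produce.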

Citing the sume module

If you use sume, please cite the following paper:

Florian Boudin, Hugo Mougard and Benoît Favre, Concept-based Summarization using Integer Linear Programming: From Concept Pruning to Multiple Optimal Solutions, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP).

Contributors

Florian Boudin
Hugo Mougard
