Tools for extractive summarization based on lexical and prosodic features
R Shell Python
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
bashscripts
ineventwrappers
nxtscripts
praatscripts
pyscripts
rscripts
sentence_breaks
sgescripts
.gitignore
LICENSE
README.md

README.md

inEvent Summarization Code

Tools for extractive summarization based on lexical and prosodic features for the inEvent Project. Most of the code is written in R with bash shell script wrappers.

Requires:

  • praat (for prosodic features)
  • R (3.1.1 or higher)
  • python: Theano (for MLP)

The main wrapper is ineventwrappers/inevent-summarization.sh which at the moment takes as input:

  • ASR output in JSON format (Augmented ASR in the inEvent API)
  • A .wav file
  • A media file (e.g. .mp4)
  • The working directory It returns a JSON file of time aligned quotes from with extractive summary scores (probabilities).

Note a large amount of extra feature files and debug information is currently produced. More documentation will appear in the future.

Notes:

  • Although the relevant function is called extract-spurt-feats.sh, we could be using any sort of segment really with a start and endtime. For this system, this is output after processing ASR json files.