keywords.py is a Python script that generates keywords from a text. It was used during the SPINDLE project to generate keywords from automatic transcriptions.

How to use it

Usage:

python keywords.py text.txt

or

>>> from keywords import keywords_and_ngrams

Output:

A list containing two lists of tuples. Each tuple in the first list pairs a keyword with its log-likelihood value. Each tuple in the second list pairs a bigram with the number of times it appears.

keyword-0 ll-0
keyword-1 ll-1
keyword-2 ll-2

bigram-0 n-appearances-bigram-0
bigram-1 n-appearances-bigram-1
bigram-2 n-appearances-bigram-2

Example

Output generated from the text of the blog post "Automatic Keyword Generation from Automatic Speech-to-Text Transcriptions":

[[["automatic", 154.36391852338383], 
["keywords", 100.22612939881635], 
["transcriptions", 71.04632660561263], 
["corpus", 54.20602606031698], 
["generated", 52.54525739261641], 
["word", 43.869201333759946], 
["keyword", 38.434091570196095], 
["reference", 27.60386703890638], 
["accuracy", 26.693961750667555], 
["frequency", 26.58439010818277], 
...
], [[["automatic", "transcriptions"], 3]]]
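To make the nesting easier to follow, here is a minimal sketch that unpacks a result of this shape. The `result` literal is a shortened copy of the example above (first three keywords only), and `format_rows` is a hypothetical helper written for this illustration, not part of keywords.py.

```python
# Shortened copy of the example output: [keyword list, bigram list].
result = [
    [["automatic", 154.36391852338383],
     ["keywords", 100.22612939881635],
     ["transcriptions", 71.04632660561263]],
    [[["automatic", "transcriptions"], 3]],
]

# The outer list always holds exactly two inner lists.
keywords, bigrams = result


def format_rows(keywords, bigrams):
    """Hypothetical helper: one printable row per keyword and per bigram."""
    rows = []
    for word, ll in keywords:
        # Each keyword entry is a (word, log-likelihood) pair.
        rows.append("%s %.2f" % (word, ll))
    for (first, second), count in bigrams:
        # Each bigram entry is a ((word, word), appearances) pair.
        rows.append("%s %s %d" % (first, second, count))
    return rows


for row in format_rows(keywords, bigrams):
    print(row)
```

The rows follow the same layout as the Output section: keyword rows first, then bigram rows.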

Parameters

  • nKeywords: number of keywords generated by the script (default 100)
  • thresholdLL: log-likelihood value threshold (default 19)
  • nBigrams: number of bigrams generated by the script (default 25)
  • thresholdBigrams: minimum number of appearances of a bigram (default 2)
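The sketch below illustrates how four parameters like these could interact. It is not the code in keywords.py: it assumes the common log-likelihood keyness formula that compares word frequencies in the text against a reference corpus, assumes that thresholds are inclusive, and assumes that only words overused in the text are kept. The function names `log_likelihood`, `select_keywords` and `select_bigrams` are hypothetical; only the parameter names and defaults come from the list above.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness (assumed formula, not taken from keywords.py).

    a: word frequency in the text, b: word frequency in the reference,
    c: number of tokens in the text, d: number of tokens in the reference.
    """
    e1 = c * (a + b) / float(c + d)  # expected frequency in the text
    e2 = d * (a + b) / float(c + d)  # expected frequency in the reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2.0 * ll


def select_keywords(text_tokens, ref_tokens, nKeywords=100, thresholdLL=19):
    """Keep at most nKeywords words whose score reaches thresholdLL."""
    if not text_tokens or not ref_tokens:
        return []
    text_freq, ref_freq = Counter(text_tokens), Counter(ref_tokens)
    c, d = len(text_tokens), len(ref_tokens)
    scored = []
    for word, a in text_freq.items():
        b = ref_freq.get(word, 0)
        # Assumption: ignore words that are not overused in the text.
        if a / float(c) <= b / float(d):
            continue
        ll = log_likelihood(a, b, c, d)
        if ll >= thresholdLL:  # assumption: the threshold is inclusive
            scored.append((word, ll))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:nKeywords]


def select_bigrams(tokens, nBigrams=25, thresholdBigrams=2):
    """Keep at most nBigrams adjacent word pairs seen thresholdBigrams+ times."""
    counts = Counter(zip(tokens, tokens[1:]))
    kept = [(bg, n) for bg, n in counts.items() if n >= thresholdBigrams]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:nBigrams]
```

Under these assumptions, raising thresholdLL or thresholdBigrams filters out weaker candidates, while nKeywords and nBigrams only cap how many of the surviving candidates are returned.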