keywords.py is a Python script that generates keywords from a text. It was used during the SPINDLE project to generate keywords from automatic transcriptions.

How to use it

Usage:

python keywords.py text.txt

or

>>> from keywords import keywords_and_ngrams

Output:

A list containing two lists of tuples. Each tuple in the first list pairs a keyword with its log-likelihood value. Each tuple in the second list pairs a bigram with its number of appearances.

keyword-0 ll-0
keyword-1 ll-1
keyword-2 ll-2

bigram-0 n-appearances-bigram-0
bigram-1 n-appearances-bigram-1
bigram-2 n-appearances-bigram-2
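
As a minimal sketch of working with this structure, the snippet below unpacks a result and prints it in the layout shown above. The `result` value is hypothetical sample data written by hand in the documented shape, not real output of the script, and `format_result` is an illustrative helper that is not part of keywords.py.

```python
# Hypothetical sample data in the documented shape: [keywords, bigrams].
# These values are illustrative, not produced by keywords.py.
result = [
    [("automatic", 154.36), ("keywords", 100.23)],
    [(("automatic", "transcriptions"), 3)],
]

# The outer list always has two elements, so it can be unpacked directly.
keywords, bigrams = result


def format_result(keywords, bigrams):
    """Render keywords and bigrams one per line, separated by a blank line."""
    lines = [f"{word} {ll:.2f}" for word, ll in keywords]
    lines.append("")
    lines += [f"{' '.join(bigram)} {count}" for bigram, count in bigrams]
    return "\n".join(lines)


print(format_result(keywords, bigrams))
```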

Example

From the "Automatic Keyword Generation from Automatic Speech-to-Text Transcriptions" blog post:

[[["automatic", 154.36391852338383], 
["keywords", 100.22612939881635], 
["transcriptions", 71.04632660561263], 
["corpus", 54.20602606031698], 
["generated", 52.54525739261641], 
["word", 43.869201333759946], 
["keyword", 38.434091570196095], 
["reference", 27.60386703890638], 
["accuracy", 26.693961750667555], 
["frequency", 26.58439010818277], 
...
], [[["automatic", "transcriptions"], 3]]]
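
The example above looks like JSON, where tuples are serialized as lists. Assuming the output has been saved in that form, a sketch like the following restores the tuple structure. The `raw` string is a shortened excerpt of the example with the elided entries left out.

```python
import json

# Shortened excerpt of the example output; the elided entries are omitted.
raw = (
    '[[["automatic", 154.36391852338383], ["keywords", 100.22612939881635]], '
    '[[["automatic", "transcriptions"], 3]]]'
)

keyword_rows, bigram_rows = json.loads(raw)

# JSON has no tuple type, so convert the inner lists back to tuples.
keywords = [(word, ll) for word, ll in keyword_rows]
bigrams = [(tuple(words), count) for words, count in bigram_rows]
```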

Parameters

  • nKeywords: number of keywords generated by the script (default 100)
  • thresholdLL: log-likelihood value threshold (default 19)
  • nBigrams: number of bigrams generated by the script (default 25)
  • thresholdBigrams: minimum number of appearances of a bigram (default 2)
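
To illustrate how these four parameters might interact, here is a rough sketch, not the script's actual code. It assumes the common corpus-comparison log-likelihood statistic (observed versus expected frequencies in the text and in a reference corpus), inclusive thresholds, and that the count limits are applied after thresholding. The functions `log_likelihood`, `select_keywords`, and `select_bigrams` are hypothetical names; only the parameter names and defaults come from the list above.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood for a word seen a times in a text of c words
    and b times in a reference corpus of d words (assumed formula)."""
    e1 = c * (a + b) / (c + d)  # expected count in the text
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def select_keywords(text_counts, ref_counts, nKeywords=100, thresholdLL=19):
    """Keep over-represented words scoring at least thresholdLL, best first."""
    c = sum(text_counts.values())
    d = sum(ref_counts.values())
    scored = []
    for word, a in text_counts.items():
        b = ref_counts.get(word, 0)
        if a / c <= b / d:
            continue  # assumption: skip words not over-represented in the text
        ll = log_likelihood(a, b, c, d)
        if ll >= thresholdLL:
            scored.append((word, ll))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:nKeywords]


def select_bigrams(tokens, nBigrams=25, thresholdBigrams=2):
    """Keep the most frequent bigrams seen at least thresholdBigrams times."""
    counts = Counter(zip(tokens, tokens[1:]))
    frequent = [(bg, n) for bg, n in counts.most_common() if n >= thresholdBigrams]
    return frequent[:nBigrams]
```

Under these assumptions, lowering thresholdLL or thresholdBigrams admits more candidates, while nKeywords and nBigrams only cap how many of the surviving candidates are returned.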