README.md

keywords.py is a Python script that generates keywords from a text. It was used during the SPINDLE project to generate keywords from automatic transcriptions.

How to use it

Usage:

python keywords.py text.txt

or

>>> from keywords import keywords_and_ngrams

Output:

List object containing two lists of tuples. The first list pairs each keyword with its log-likelihood value. The second list pairs each bigram with its number of appearances.

keyword-0 ll-0
keyword-1 ll-1
keyword-2 ll-2

bigram-0 n-appearances-bigram-0
bigram-1 n-appearances-bigram-1
bigram-2 n-appearances-bigram-2
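For readers unfamiliar with log-likelihood scores, the sketch below shows one common way such a value can be computed for a word, by comparing its frequency in the text with its frequency in a reference corpus. This is an assumption about the general technique, not a copy of what keywords.py does: the function name `log_likelihood` and its four arguments are hypothetical, and the script's actual formula and reference corpus may differ.

```python
import math


def log_likelihood(a, b, c, d):
    """Log-likelihood score of a word (illustrative sketch, not the script's code).

    a: occurrences of the word in the text
    b: occurrences of the word in the reference corpus
    c: total number of words in the text
    d: total number of words in the reference corpus
    """
    total = float(c + d)
    # Expected counts if the word were equally frequent in both corpora.
    e1 = c * (a + b) / total
    e2 = d * (a + b) / total
    ll = 0.0
    # A zero count contributes nothing (and log(0) is undefined).
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll
```

Under this formulation a word that is equally frequent in both corpora scores 0, and the score grows as the two relative frequencies diverge.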

Example

From the Automatic Keyword Generation from Automatic Speech-to-Text Transcriptions blog post:

[[["automatic", 154.36391852338383], 
["keywords", 100.22612939881635], 
["transcriptions", 71.04632660561263], 
["corpus", 54.20602606031698], 
["generated", 52.54525739261641], 
["word", 43.869201333759946], 
["keyword", 38.434091570196095], 
["reference", 27.60386703890638], 
["accuracy", 26.693961750667555], 
["frequency", 26.58439010818277], 
...
], [[["automatic", "transcriptions"], 3]]]
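The nested structure above can be unpacked as in the following minimal sketch. It uses the first three keywords and the bigram from the example as literal data rather than calling the script, and `format_results` is a hypothetical helper that is not part of keywords.py.

```python
# Literal data copied from the example above, truncated to three keywords.
result = [
    [
        ["automatic", 154.36391852338383],
        ["keywords", 100.22612939881635],
        ["transcriptions", 71.04632660561263],
    ],
    [[["automatic", "transcriptions"], 3]],
]


def format_results(result):
    """Return printable lines for a [keywords, bigrams] result (hypothetical helper)."""
    keywords, bigrams = result
    lines = []
    # Each keyword entry is a (word, log-likelihood) pair.
    for word, ll in keywords:
        lines.append("%s %.2f" % (word, ll))
    # Each bigram entry is a ((word1, word2), count) pair.
    for bigram, count in bigrams:
        lines.append("%s %d" % (" ".join(bigram), count))
    return lines


for line in format_results(result):
    print(line)
```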

Parameters

  • nKeywords: number of keywords generated by the script (default 100)
  • thresholdLL: log-likelihood value threshold (default 19)
  • nBigrams: number of bigrams generated by the script (default 25)
  • thresholdBigrams: minimum number of appearances of a bigram (default 2)
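
As an illustration of how a threshold and a maximum count can interact, here is a minimal sketch. The helper `apply_limits` is hypothetical, the input scores are rounded or made up for the example, and whether keywords.py treats the thresholds as inclusive is an assumption.

```python
def apply_limits(scored, threshold, limit):
    """Keep items scoring at least `threshold`, best first, at most `limit` of them.

    Hypothetical helper; the inclusive comparison (>=) is an assumption.
    """
    kept = [(item, score) for item, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Illustrative scores only; "the" is a made-up low-scoring entry.
keywords = [("automatic", 154.4), ("the", 3.2), ("corpus", 54.2), ("word", 43.9)]
bigrams = [(("automatic", "transcriptions"), 3), (("of", "the"), 1)]

# Mirrors the defaults listed above: thresholdLL=19, nKeywords=100,
# thresholdBigrams=2, nBigrams=25.
print(apply_limits(keywords, threshold=19, limit=100))
print(apply_limits(bigrams, threshold=2, limit=25))
```

With these inputs, "the" falls below the log-likelihood threshold and the bigram that appears once falls below the bigram threshold, so both are dropped.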