NicolasKieffer/tdm-skeeft

tdm-skeeft

tdm-skeeft is a tdm module for term extraction from structured text. It can be used to get the keywords (or a summary) of a document.

Installation

Using npm :

$ npm i -g tdm-skeeft
$ npm i --save tdm-skeeft

Using Node :

/* require of Skeeft module */
const Skeeft = require('tdm-skeeft');

/* Build new Instance of Matrix */
let matrix = new Skeeft.Matrix();

/* Build new Instance of Indexator */
let indexator = new Skeeft.Indexator();

Launch tests

$ npm run test

Build documentation

$ npm run docs

API Documentation

Classes

Indexator

Functions

Matrix(options)

Constructor

Indexator

Kind: global class

new Indexator([options])

Returns: Indexator - An instance of Indexator

| Param | Type | Description |
| --- | --- | --- |
| [options] | Object | Options of constructor |
| [options.filters] | Object | Filter options given to the title & fulltext extractors |
| [options.filters.title] | Filter | Options given to the title extractor |
| [options.filters.fulltext] | Filter | Options given to the fulltext extractor |
| [options.stopwords] | Object | Stopwords |
| [options.dictionary] | Object | Dictionary |

Example (Example usage of 'constructor' (with parameters))

let options = {
    'filters': {
      'title': customTitleFilter, // Assuming customTitleFilter contains your custom settings
      'fulltext': customFulltextFilter // Assuming customFulltextFilter contains your custom settings
    },
    'dictionary': myDictionary, // Assuming myDictionary contains your custom settings
    'stopwords': myStopwords // Assuming myStopwords contains your custom settings
  },
  indexator = new Indexator(options);
// returns an instance of Indexator with custom options

Example (Example usage of 'constructor' (with default values))

let indexator = new Indexator();
// returns an instance of Indexator with default options

indexator.summarize(xmlString, selectors, indexation, delimiter) ⇒ Array

Summarize a fulltext

Kind: instance method of Indexator
Returns: Array - List of extracted sentences (representative summary)

| Param | Type | Description |
| --- | --- | --- |
| xmlString | String | Fulltext (XML formatted string) |
| selectors | Object | Used selectors |
| selectors.title | String | Selector of the title |
| selectors.segments | Array | Selectors of the segments |
| indexation | Object | Indexation of xmlString |
| delimiter | RegExp | Delimiter used to split text into sentences |

Example (Example usage of 'summarize' function)

let indexator = new Indexator();
indexator.summarize(xmlString, {'title': 'title', 'segments': ['paragraph1', 'paragraph2']}, indexator.index(xmlString)); // returns an ordered Array of Objects [{...}, {...}]
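
The delimiter parameter is a RegExp used to cut the text into sentences before they are ranked. The snippet below is only a sketch of that splitting step in plain JavaScript: the sample text and the pattern are made up for illustration, and it does not show the default delimiter of tdm-skeeft or how sentences are scored.

```javascript
// Hypothetical input text and delimiter (not the defaults of tdm-skeeft).
const text = 'Skeeft extracts keywords. It can also build a summary! Does it need XML?';

// One or more sentence-ending marks, followed by optional whitespace.
const delimiter = /[.?!]+\s*/;

// String.prototype.split accepts a RegExp; the trailing empty string left
// after the last delimiter is filtered out.
const sentences = text.split(delimiter).filter((s) => s.length > 0);

console.log(sentences);
```

A stricter or looser pattern (for example one that also splits on semicolons) changes the granularity of the summary, which is presumably why the delimiter is exposed as a parameter.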

indexator.index(xmlString, selectors, criterion) ⇒ Array

Index a fulltext

Kind: instance method of Indexator
Returns: Array - List of extracted keywords

| Param | Type | Description |
| --- | --- | --- |
| xmlString | String | Fulltext (XML formatted string) |
| selectors | Object | Used selectors |
| criterion | String | Criterion used (sort) |

Example (Example usage of 'index' function)

let indexator = new Indexator();
indexator.index(xmlString, {'title': 'title', 'segments': ['paragraph1', 'paragraph2']}); // returns an ordered Array of Objects [{...}, {...}]

Matrix(options)

Constructor

Kind: global function
Returns: this

| Param | Type | Description |
| --- | --- | --- |
| options | Object | Options of constructor |

Example (Example usage of 'constructor' (with parameters))

let options = {
    'boost': 20
  },
  matrix = new Matrix(options);
// returns an instance of Matrix with custom options

Example (Example usage of 'constructor' (with default values))

let matrix = new Matrix();
// returns an instance of Matrix with default options

matrix.init(indexations, selectors)

Initialize each value of this object

Kind: instance method of Matrix
Returns: object - The 'this' reference

| Param | Type | Description |
| --- | --- | --- |
| indexations | Array | Array filled with the indexation of each segment |
| selectors | Array | Array filled with the name of each segment |

matrix.fill(criterion)

Fill a matrix with the values of the chosen criterion

Kind: instance method of Matrix
Returns: matrix - A mathjs matrix filled with values (and )

| Param | Type | Description |
| --- | --- | --- |
| criterion | string | Key of term object |
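
To make the roles of init and fill more concrete, here is a rough sketch in plain JavaScript (no mathjs) of building a segments-by-terms table from per-segment indexations. Everything about the data shape is an assumption: the term objects with 'term', 'frequency' and 'specificity' keys, the orientation (one row per segment, one column per term) and the helper name fill are illustrative only and may differ from what tdm-skeeft does internally.

```javascript
// Hypothetical indexations: one array of term objects per segment.
const indexations = [
  [{ term: 'keyword', frequency: 3, specificity: 0.8 }],
  [
    { term: 'keyword', frequency: 1, specificity: 0.2 },
    { term: 'corpus', frequency: 2, specificity: 0.5 }
  ]
];
// Segment names, in the same order as the indexations.
const selectors = ['paragraph1', 'paragraph2'];

// Illustrative helper (not the library's method): builds one row per segment
// and one column per distinct term, using term[criterion] as the cell value.
function fill(indexations, selectors, criterion) {
  const terms = [...new Set(indexations.flatMap((seg) => seg.map((t) => t.term)))];
  const values = selectors.map((_, i) => {
    const row = new Array(terms.length).fill(0);
    for (const t of indexations[i]) {
      row[terms.indexOf(t.term)] = t[criterion];
    }
    return row;
  });
  return { terms, segments: selectors, values };
}

const filled = fill(indexations, selectors, 'frequency');
console.log(filled.terms, filled.values);
```

Passing 'specificity' instead of 'frequency' fills the same table with the other criterion, which mirrors the "Key of term object" description of the criterion parameter.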

matrix.stats(m)

Compute some statistics

Kind: instance method of Matrix
Returns: object - An object with some statistics:

  • FR (labeling recall),
  • FP (labeling precision),
  • FF (labeling F-measure),
  • rowsFF (number of terms for each row of the FF matrix),
  • colsFF (number of terms for each column of the FF matrix),
  • mFF (mean of FF)

| Param | Type | Description |
| --- | --- | --- |
| m | Matrix | Matrix (mathjs.matrix()) |
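
The names FR, FP and FF suggest a recall / precision / F-measure triple computed cell by cell. The following is a minimal sketch of one common way to compute such values on a plain array of arrays. It assumes rows are segments and columns are terms, recall is the cell divided by its column total, precision is the cell divided by its row total, and mFF is the mean over all cells. None of these formulas are confirmed by the documentation above, and rowsFF / colsFF are left out.

```javascript
// Hypothetical weights: 2 segments (rows) x 3 terms (columns).
const m = [
  [2, 0, 1],
  [1, 3, 0]
];

// Illustrative helper, not the library's implementation.
function stats(m) {
  const rowSums = m.map((row) => row.reduce((sum, v) => sum + v, 0));
  const colSums = m[0].map((_, j) => m.reduce((sum, row) => sum + row[j], 0));

  // Recall (assumed): share of a term's total weight found in this segment.
  const FR = m.map((row) => row.map((v, j) => (colSums[j] ? v / colSums[j] : 0)));
  // Precision (assumed): share of a segment's total weight carried by this term.
  const FP = m.map((row, i) => row.map((v) => (rowSums[i] ? v / rowSums[i] : 0)));
  // F-measure: harmonic mean of recall and precision, 0 when both are 0.
  const FF = FR.map((row, i) => row.map((r, j) => {
    const p = FP[i][j];
    return r + p ? (2 * r * p) / (r + p) : 0;
  }));

  // Mean of FF over every cell (assumed; zeros included).
  const cells = FF.flat();
  const mFF = cells.reduce((sum, v) => sum + v, 0) / cells.length;

  return { FR, FP, FF, mFF };
}

const result = stats(m);
console.log(result.FF, result.mFF);
```

With these definitions, a term that appears in a single segment gets a recall of 1 there, so its F-measure is driven by how much of the segment it accounts for.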

matrix.select(stats, boost, criterion)

Select terms

Kind: instance method of Matrix
Returns: object - An object with the selected terms

| Param | Type | Description |
| --- | --- | --- |
| stats | Object | Statistics of Text (result of Matrix.stats()) |
| boost | Object | List of boosted terms (Object with key = term) |
| criterion | String | Criterion used by skeeft (frequency or specificity) |

Matrix.sort(terms, compare)

Sort all terms with the 'compare' function

Kind: static method of Matrix
Returns: array - The array of sorted terms

| Param | Type | Description |
| --- | --- | --- |
| terms | Object | List of terms |
| compare | function | Compare function |

Matrix.compare(a, b)

Compare two elements according to their factor

Kind: static method of Matrix
Returns: integer - 1, -1 or 0

| Param | Type | Description |
| --- | --- | --- |
| a | Object | First object |
| b | Object | Second object |
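
As an illustration of how a compare function returning 1, -1 or 0 plugs into a sort over a list of terms, here is a small standalone sketch. It does not call tdm-skeeft: the term objects, the 'term' property and the sort direction (higher factor first) are assumptions, and only the 'factor' property comes from the description above.

```javascript
// Hypothetical terms, keyed by term as in the 'boost' description.
const terms = {
  corpus: { term: 'corpus', factor: 0.4 },
  keyword: { term: 'keyword', factor: 0.9 },
  matrix: { term: 'matrix', factor: 0.4 }
};

// Returns -1, 1 or 0. Assumption: a higher factor sorts first.
function compare(a, b) {
  if (a.factor > b.factor) return -1;
  if (a.factor < b.factor) return 1;
  return 0;
}

// Turns the object of terms into an array and sorts it with 'compare'.
function sort(terms, compare) {
  return Object.values(terms).sort(compare);
}

const sorted = sort(terms, compare);
console.log(sorted.map((t) => t.term));
```

Because the comparison is passed in as an argument, a caller can swap in another ordering (for example ascending factor) without touching the sorting code.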
