Chris McKenzie edited this page Jun 17, 2015 · 2 revisions

This page discusses methods for classifying short phrases, largely without context, along the dimension of appropriateness and tact.

The first approach consisted of:

  • stop word removal & input cleansing
  • n-gram analysis with manual weights assigned to key words and phrases.
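
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the code behind this page: the stop-word list, the weight table, and the function names (`clean`, `ngrams`, `score`) are all hypothetical, and the weights are made-up values standing in for the manually assigned ones.

```python
import re

# Hypothetical stop words and manual weights, chosen only for illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "you", "your"}
WEIGHTS = {
    ("idiot",): 0.8,       # single keyword
    ("shut", "up"): 0.9,   # key phrase (bigram)
    ("thanks",): -0.3,     # polite words pull the score down
}

def clean(text):
    """Lowercase the input, keep only word-like tokens, drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def score(text, max_n=2):
    """Sum the manual weights of all n-grams (n = 1..max_n) found in text."""
    tokens = clean(text)
    total = 0.0
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            total += WEIGHTS.get(gram, 0.0)
    return total
```

A phrase such as "Shut up, you idiot" matches both the `("shut", "up")` phrase and the `("idiot",)` keyword, while a phrase with no listed n-grams scores zero regardless of its tone.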

In a vocabulary-based approach, the nuances of semantic abrasiveness are lost.


A rough sketch of what's needed.

Classification should be multi-pronged, aggregating the results of several approaches. Some of the keyword systems that are used here are Viterbi and perhaps this paper. More generally, we are talking about ensemble learning.
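
As a rough sketch of the ensemble idea, several independent scorers can each rate a phrase between 0 and 1 and be combined by a weighted average. Everything here is an assumption made for illustration: the two toy classifiers, their weights, the threshold, and the names `ensemble_score` and `is_inappropriate` are hypothetical, and this is only one simple way to aggregate.

```python
def keyword_classifier(text):
    """Toy scorer: 1.0 if any flagged word appears, else 0.0."""
    flagged = {"idiot", "stupid"}  # hypothetical word list
    words = [w.strip(".,!?") for w in text.lower().split()]
    return 1.0 if any(w in flagged for w in words) else 0.0

def shouting_classifier(text):
    """Toy scorer: fraction of letters that are upper case."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)

# (classifier, weight) pairs; the weights are illustrative guesses.
CLASSIFIERS = [(keyword_classifier, 2.0), (shouting_classifier, 1.0)]

def ensemble_score(text, classifiers):
    """Weighted average of the individual classifier scores."""
    total_weight = sum(weight for _, weight in classifiers)
    return sum(weight * clf(text) for clf, weight in classifiers) / total_weight

def is_inappropriate(text, threshold=0.5):
    """Flag the phrase when the aggregate score reaches the threshold."""
    return ensemble_score(text, CLASSIFIERS) >= threshold
```

The point of the structure is that each prong can be replaced or reweighted independently, so a sequence-based model could sit alongside the keyword scorer without changing the aggregation step.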
