UniStack

An experimental duplicate question search algorithm for Stack Exchange.

Current Algorithm

As this is an experimental project, everything is subject to change.

Extract the question sentence and context words, then model the remaining words.

Question sentence: the words from an MD- or Wxx-tagged word through the end of the sentence.
Context words: any NN tagged word.
Model: PoS tagged bag-of-words (unigram).
Weighting: TF-IDF.
Similarity function: cosine.
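
The steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's implementation, and it rests on several assumptions: the input is already PoS tagged with Penn Treebank tags as (word, tag) pairs, each list is a single sentence, the question span starts at the first MD or W* tag, "remaining words" means tokens that are neither in the question span nor tagged NN, and the IDF is a smoothed log(1 + N/df) variant. All function names are hypothetical. How the extracted question sentence and context words feed into the final score is not stated here, so the sketch leaves that part out.

```python
import math
from collections import Counter

# Assumed input: one sentence as a list of (word, Penn Treebank tag) pairs.

def question_span(tagged):
    """Tokens from the first MD or W* tagged word to the end of the sentence."""
    for i, (_, tag) in enumerate(tagged):
        if tag == "MD" or tag.startswith("W"):
            return tagged[i:]
    return []

def context_words(tagged):
    """Tokens tagged exactly NN."""
    return [pair for pair in tagged if pair[1] == "NN"]

def remaining(tagged):
    """Tokens outside the question span that are not NN (an assumed reading)."""
    head = tagged[:len(tagged) - len(question_span(tagged))]
    return [pair for pair in head if pair[1] != "NN"]

def features(tagged):
    """PoS tagged unigrams, e.g. 'crashes/VBZ'."""
    return [f"{word.lower()}/{tag}" for word, tag in tagged]

def tfidf_vectors(docs):
    """Sparse TF-IDF vectors; idf = log(1 + N/df) is an assumed smoothing."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {term: count * math.log(1 + n / df[term])
         for term, count in Counter(doc).items()}
        for doc in docs
    ]

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hand-tagged toy questions (made up for illustration).
a = [("The", "DT"), ("parser", "NN"), ("crashes", "VBZ"), ("randomly", "RB"),
     (",", ","), ("how", "WRB"), ("can", "MD"), ("I", "PRP"), ("fix", "VB"),
     ("it", "PRP")]
b = [("The", "DT"), ("parser", "NN"), ("crashes", "VBZ"), ("often", "RB"),
     (",", ","), ("what", "WP"), ("should", "MD"), ("I", "PRP"), ("do", "VB")]
c = [("Python", "NNP"), ("is", "VBZ"), ("slow", "JJ"), (",", ","),
     ("why", "WRB"), ("is", "VBZ"), ("that", "DT")]

vectors = tfidf_vectors([features(remaining(q)) for q in (a, b, c)])
sim_ab = cosine(vectors[0], vectors[1])
sim_ac = cosine(vectors[0], vectors[2])
```

With these toy inputs, `sim_ab` comes out higher than `sim_ac`, since the first two questions share more of their remaining tagged unigrams.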
