STS via Jaccard Similarity of Tsetlin Machine Optimized Topic Sets

What's in this codebase?

Word embeddings are clustered into K groups with K-means++ using cosine similarity (K-means++ seeding is used to get better initial centroids than random initialization, although it does not guarantee an optimal clustering)
Each word is modeled as a bitmap of length K in which only the bit corresponding to its cluster is set
Sentences are modeled as an OR over all of the word-bitmaps therein
A Tsetlin Machine is trained to learn which bits should be flipped so that the Jaccard similarity between sentence bitmaps matches the human-labeled similarity.
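
The representation above can be sketched in a few lines of Python. This is only an illustration, not the code in this repository: the function names, the use of Python integers as bitmaps, the toy word-to-cluster table, and the choice to treat two empty bitmaps as identical are all assumptions made for the example.

```python
from typing import Dict, Iterable


def word_bitmap(cluster_id: int) -> int:
    """Bitmap with only the bit for the word's cluster set."""
    return 1 << cluster_id


def sentence_bitmap(words: Iterable[str], word_to_cluster: Dict[str, int]) -> int:
    """OR together the bitmaps of every known word in the sentence."""
    bits = 0
    for word in words:
        if word in word_to_cluster:  # assumption: unknown words are skipped
            bits |= word_bitmap(word_to_cluster[word])
    return bits


def jaccard(a: int, b: int) -> float:
    """Jaccard similarity of two bitmaps: |a AND b| / |a OR b|."""
    union = a | b
    if union == 0:
        return 1.0  # assumption: two empty bitmaps count as identical
    return bin(a & b).count("1") / bin(union).count("1")


# Hypothetical cluster assignments for K = 4 (made up for this example).
clusters = {"cat": 0, "dog": 0, "sat": 1, "ran": 1, "mat": 2, "park": 3}

s1 = sentence_bitmap("cat sat mat".split(), clusters)   # bits 0, 1, 2
s2 = sentence_bitmap("dog ran park".split(), clusters)  # bits 0, 1, 3
similarity = jaccard(s1, s2)  # 2 shared bits out of 4 set bits in the union
```

In this toy case the two sentences share the clusters for "cat"/"dog" and "sat"/"ran", so the similarity is 2/4 even though they have no words in common.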

Given the sentence representation described above, use one Tsetlin Machine per bit. Each machine classifies whether or not its bit was set correctly, so that the Jaccard similarity of sentences-as-bitmaps aligns with the human-labeled data
A Tsetlin Machine returning 0 means "do not change this bit," whereas 1 means "flip the bit"
Initialize all Tsetlin Machines with random internal states
Calculate the initial total loss by iterating over each training pair and summing the differences between the Jaccard similarity of the pair's bitmap representations and the human-labeled similarity.
Calculate the next total loss by iterating over each training pair as above, but first pass the bitmap representations through the TMs and adjust their bits accordingly, then take the Jaccard similarities and compare them to the human-labeled data
For each TM, if changing its individual decision would have resulted in a lower overall loss, penalize it; otherwise, reward it.
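
The loss and counterfactual steps above can be sketched as follows. This is a rough illustration under several assumptions, not this repository's implementation: each per-bit Tsetlin Machine is replaced by a plain Python callable (`Decider`) that sees the original bitmap and returns 0 or 1, the per-pair difference is taken as an absolute value, and "changing its individual decision" is read as inverting that one machine's output for every sentence. All names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Pair = Tuple[int, int, float]   # (bitmap_a, bitmap_b, human-labeled similarity)
Decider = Callable[[int], int]  # stand-in for one per-bit Tsetlin Machine


def jaccard(a: int, b: int) -> float:
    union = a | b
    if union == 0:
        return 1.0  # assumption: two empty bitmaps count as identical
    return bin(a & b).count("1") / bin(union).count("1")


def apply_deciders(bitmap: int, deciders: Sequence[Decider], inverted: int = -1) -> int:
    """Flip every bit whose decider returns 1; optionally invert one decider."""
    out = bitmap
    for bit, decide in enumerate(deciders):
        flip = decide(bitmap)  # each decider sees the unmodified bitmap
        if bit == inverted:
            flip = 1 - flip    # counterfactual: the opposite decision
        if flip:
            out ^= 1 << bit
    return out


def total_loss(pairs: Sequence[Pair],
               deciders: Optional[Sequence[Decider]] = None,
               inverted: int = -1) -> float:
    """Sum of |Jaccard - label| over all pairs (absolute value is an assumption)."""
    loss = 0.0
    for a, b, label in pairs:
        if deciders is not None:
            a = apply_deciders(a, deciders, inverted)
            b = apply_deciders(b, deciders, inverted)
        loss += abs(jaccard(a, b) - label)
    return loss


def counterfactual_gains(pairs: Sequence[Pair], deciders: Sequence[Decider]) -> List[float]:
    """For each bit, how much the loss would drop if that decider were inverted."""
    current = total_loss(pairs, deciders)
    return [current - total_loss(pairs, deciders, inverted=bit)
            for bit in range(len(deciders))]


# Toy data with K = 2: one pair labeled similar, one labeled dissimilar.
pairs = [(0b01, 0b11, 1.0), (0b01, 0b10, 0.0)]
deciders = [
    lambda bm: 0,                    # bit 0: never flip
    lambda bm: 0 if bm & 0b10 else 1,  # bit 1: flip only when unset
]
gains = counterfactual_gains(pairs, deciders)
```

A positive gain means the loss would have been lower had that machine made the opposite decision; the feedback step then uses this comparison to decide between reward and penalty. Recomputing the full loss once per bit costs K passes over the training pairs, which is fine for a sketch but would likely need batching or caching at scale.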
