Skip to content

pufe/truffle_hog

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TruffleHog

Provides a method to search for matches within a list of documents using TF-IDF.

There are two main use cases: finding which documents are the most similar within the list; finding which document is the most related to a search query.

How to use

Convert each document into a tuple where the first item is an identifier, and the second is a list of tokens. Tokenizer is not included, because you may want to write your own.

Example of document_list:

[{1, ~w(this is a a sample)},
 {2, ~w(this example is another example)}]

Create an index using the function index_documents.

index = list_documents |> TruffleHog.index_documents()

Use find_matches to find the matches on the index.

matches = index |> TruffleHog.find_matches(["search", "items"], quantity)

Installation

If available in Hex, the package can be installed by adding truffle_hog to your list of dependencies in mix.exs:

def deps do
  [
    {:truffle_hog, "~> 0.1.0"}
  ]
end

Documentation can be generated with ExDoc and published on HexDocs. Once published, the docs can be found at https://hexdocs.pm/truffle_hog.

About

Provides a method to search for matches within a list of documents using TF-IDF.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages