gifts

Searching for elements that share features with the query.

query = ['A', 'B']

elements = [
    ['N', 'A', 'M'],  # common features: 'A'
    ['C', 'B', 'A'],  # common features: 'A', 'B'  
    ['X', 'Y']  # no common features
]

In this case, the search will return ['C', 'B', 'A'] and ['N', 'A', 'M'], in that order: the element with the most features in common with the query comes first.
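The idea can be illustrated without the library. The following is only a minimal sketch that ranks elements by the raw number of shared features; the function name `rank_by_overlap` is hypothetical, and the real classes use more refined weighting than a plain count.

```python
def rank_by_overlap(query, elements):
    """Return the elements that share at least one feature with the query,
    ordered from the most shared features to the fewest."""
    query_set = set(query)
    # Pair every element with the number of features it shares with the query.
    scored = [(len(query_set & set(element)), element) for element in elements]
    # Drop elements that have nothing in common with the query.
    matches = [(score, element) for score, element in scored if score > 0]
    # Highest overlap first; the sort is stable, so ties keep their input order.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [element for _, element in matches]


query = ['A', 'B']

elements = [
    ['N', 'A', 'M'],
    ['C', 'B', 'A'],
    ['X', 'Y'],
]

print(rank_by_overlap(query, elements))
# [['C', 'B', 'A'], ['N', 'A', 'M']]
```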

Use for full-text search

Finding documents that contain words from the query.

from gifts import SmoothFts

fts = SmoothFts()

fts.add(["wait", "mister", "postman"],
        doc_id="doc1")

fts.add(["please", "mister", "postman", "look", "and", "see"],
        doc_id="doc2")

fts.add(["oh", "yes", "wait", "a", "minute", "mister", "postman"],
        doc_id="doc3")

# print IDs of documents in which at least one word of the query occurs, 
# starting with the most relevant matches
for doc_id in fts.search(['postman', 'wait']):
    print(doc_id)

Use for abstract data mining

In the examples above, the words were strings. But a word can be any object that is suitable as a dict key, that is, any hashable object.

from gifts import SmoothFts

fts = SmoothFts()

fts.add([3, 1, 4, 1, 5, 9, 2], doc_id="doc1")
fts.add([6, 5, 3, 5], doc_id="doc2")
fts.add([8, 9, 7, 9, 3, 2], doc_id="doc3")

for doc_id in fts.search([5, 3, 7]):
    print(doc_id)
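What "suitable as dict keys" means in practice can be shown with plain Python. The sketch below does not use the library; `build_index` is a hypothetical helper that maps each word to the documents containing it, and it only demonstrates which kinds of objects can serve as words.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word (any hashable object) to the set of doc IDs containing it."""
    index = defaultdict(set)
    for doc_id, words in docs.items():
        for word in words:
            index[word].add(doc_id)
    return index


# Tuples, floats, strings and frozensets are all hashable, so all can be words.
docs = {
    "doc1": [(0, 1), 3.5, "red"],
    "doc2": [(0, 1), frozenset({"a", "b"})],
}

index = build_index(docs)
print(sorted(index[(0, 1)]))  # ['doc1', 'doc2']

# A list is mutable and therefore not hashable, so it cannot be a word.
try:
    build_index({"doc3": [[1, 2]]})
except TypeError as err:
    print("not usable as a word:", err)
```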

Implementation details

When ranking the results, the algorithm takes into account:

the number of matching words
the rarity of such words in the database
the frequency of occurrence of words in the document
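To make the three factors concrete, here is a rough sketch that computes each of them separately for the documents from the full-text example. The function `ranking_signals` and the exact formulas (log of inverse document frequency for rarity, relative frequency for occurrence) are assumptions chosen for illustration, not the library's internals.

```python
import math
from collections import Counter

docs = {
    "doc1": ["wait", "mister", "postman"],
    "doc2": ["please", "mister", "postman", "look", "and", "see"],
    "doc3": ["oh", "yes", "wait", "a", "minute", "mister", "postman"],
}


def ranking_signals(query, docs):
    """Compute three illustrative ranking signals for every matching document."""
    n_docs = len(docs)
    # In how many documents does each word occur at least once?
    doc_freq = Counter(w for words in docs.values() for w in set(words))
    signals = {}
    for doc_id, words in docs.items():
        counts = Counter(words)
        matched = [w for w in set(query) if w in counts]
        if not matched:
            continue
        signals[doc_id] = {
            # 1. the number of matching words
            "matches": len(matched),
            # 2. the rarity of those words in the database (assumed: log N/df)
            "rarity": sum(math.log(n_docs / doc_freq[w]) for w in matched),
            # 3. how often those words occur in the document, relative to its length
            "frequency": sum(counts[w] / len(words) for w in matched),
        }
    return signals


for doc_id, values in ranking_signals(["postman", "wait"], docs).items():
    print(doc_id, values)
```

In this sketch "postman" contributes nothing to rarity because it occurs in every document, while "wait" does, which is the intuition behind preferring rare words.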

SmoothFts

from gifts import SmoothFts

SmoothFts uses logarithmic tf-idf to weight the words and cosine similarity to score the matches.
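For readers unfamiliar with the technique, the following is a self-contained sketch of logarithmic tf-idf weighting combined with cosine similarity. It is not the SmoothFts source: the helper names and the particular variants (tf as 1 + log(count), idf as log(1 + N/df)) are assumptions, and the library may use different formulas.

```python
import math
from collections import Counter


def tfidf_vector(words, doc_freq, n_docs):
    """Build a sparse {word: weight} vector with logarithmic tf-idf weights.
    Words unknown to the database are skipped."""
    counts = Counter(words)
    return {
        # Assumed variants: tf = 1 + log(count), idf = log(1 + N / df).
        w: (1 + math.log(c)) * math.log(1 + n_docs / doc_freq[w])
        for w, c in counts.items()
        if doc_freq.get(w)
    }


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b[w] for w, weight in a.items() if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if not norm_a or not norm_b:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    """Return IDs of documents matching the query, most similar first."""
    n_docs = len(docs)
    doc_freq = Counter(w for words in docs.values() for w in set(words))
    query_vec = tfidf_vector(query, doc_freq, n_docs)
    scores = {
        doc_id: cosine(query_vec, tfidf_vector(words, doc_freq, n_docs))
        for doc_id, words in docs.items()
    }
    matching = [doc_id for doc_id, score in scores.items() if score > 0]
    return sorted(matching, key=scores.get, reverse=True)


docs = {
    "doc1": ["wait", "mister", "postman"],
    "doc2": ["please", "mister", "postman", "look", "and", "see"],
    "doc3": ["oh", "yes", "wait", "a", "minute", "mister", "postman"],
}

print(search(["postman", "wait"], docs))
```

Under these assumptions the sketch ranks doc1 first, then doc3, then doc2: doc1 and doc3 both contain both query words, but doc1 is shorter, so the query words make up a larger share of its vector. The ordering produced by the library itself may differ.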

SimpleFts

from gifts import SimpleFts

Minimalistic approach: weigh, multiply, compare. This object is noticeably faster than SmoothFts.

Install

pip

pip3 install git+https://github.com/rtmigo/gifts_py#egg=gifts

setup.py

install_requires = [
    "gifts@ git+https://github.com/rtmigo/gifts_py"
]
