Refactor TextSearch classes for improved clarity and semantic accuracy #2416

annuaicoder · 2025-06-02T04:24:50Z

Summary

This PR refactors the TextSearch and IndexedTextSearch classes to improve maintainability, accuracy, and readability. Key improvements include:

✅ Introduced a shared BaseTextSearch class to remove duplicate logic
✅ Replaced LevenshteinDistance with SpacySimilarity (if available) for better semantic comparisons
✅ Added support for returning top-N best matches instead of yielding a single result
✅ Improved logging and added detailed inline comments for clarity
✅ Preserved full backward compatibility with ChatterBot’s storage filtering and APIs
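For reviewers, the shared-base and top-N ideas above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class names `BaseTextSearchSketch` and `IndexedTextSearchSketch`, the `candidates` hook, and the `top_n` parameter are hypothetical, and `difflib.SequenceMatcher` stands in for whichever comparator is configured.

```python
import heapq
from difflib import SequenceMatcher


def ratio_similarity(a: str, b: str) -> float:
    """Stand-in comparator: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


class BaseTextSearchSketch:
    """Hypothetical shared base: owns the comparator and the ranking logic."""

    def __init__(self, compare=ratio_similarity):
        self.compare = compare

    def candidates(self, text, statements):
        # Subclasses may narrow the candidate set; the base keeps everything.
        return statements

    def search(self, text, statements, top_n=1):
        # Score every candidate, then keep the top_n highest-scoring pairs.
        scored = (
            (self.compare(text, s), s)
            for s in self.candidates(text, statements)
        )
        return heapq.nlargest(top_n, scored, key=lambda pair: pair[0])


class IndexedTextSearchSketch(BaseTextSearchSketch):
    """Hypothetical subclass: pre-filters to statements sharing a word."""

    def candidates(self, text, statements):
        words = set(text.lower().split())
        return [s for s in statements if words & set(s.lower().split())]


statements = ["hello there", "how are you", "goodbye"]
best_two = BaseTextSearchSketch().search("hello", statements, top_n=2)
indexed = IndexedTextSearchSketch().search("hello", statements, top_n=5)
```

Only `candidates` differs between the two classes here, which is the kind of duplication a shared base is meant to remove. Returning `(score, statement)` pairs lets callers apply their own confidence threshold.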

Why

The previous implementation had redundant code across classes and relied solely on Levenshtein distance, which does not capture meaning. This update leverages built-in ChatterBot capabilities to improve results while making the code more maintainable.
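To make the limitation concrete, here is a small self-contained sketch of classic edit distance. It is written from scratch for illustration and is not ChatterBot's comparator; the example sentences are made up.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


query = "I am happy"
near_spelling = levenshtein(query, "I am sappy")   # one substituted letter
near_meaning = levenshtein(query, "I feel glad")   # similar meaning, different characters
```

A pure edit-distance ranking would prefer "I am sappy" over "I feel glad" for this query, even though the second is closer in meaning. That gap is what a vector-based comparator is intended to close.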

Notes

Falls back gracefully if in_response_to is missing
Still works without external dependencies like sentence-transformers

Let me know if you’d like this to support a fallback to LevenshteinDistance if SpacySimilarity isn’t available.
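One possible shape for such a fallback is sketched below. It is an assumption about how this could be wired, not what the PR currently does: the factory functions and `pick_comparator` are hypothetical names, the model name `en_core_web_md` is just an example of a spaCy model that ships word vectors, and `difflib.SequenceMatcher` is used as a dependency-free character-level stand-in for the edit-distance comparator.

```python
from difflib import SequenceMatcher


def make_spacy_comparator(model_name="en_core_web_md"):
    """Build a spaCy-backed comparator.

    Raises ImportError if spaCy is not installed and OSError if the
    model has not been downloaded.
    """
    import spacy  # deferred so this module imports without spaCy

    nlp = spacy.load(model_name)
    return lambda a, b: nlp(a).similarity(nlp(b))


def make_character_comparator():
    """Build a dependency-free, character-level comparator in [0, 1]."""
    return lambda a, b: SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pick_comparator(factories):
    """Return the first comparator whose factory succeeds."""
    for factory in factories:
        try:
            return factory()
        except (ImportError, OSError):
            continue  # dependency or model missing: try the next option
    raise RuntimeError("no comparator could be constructed")


def unavailable():
    # Simulates an environment where the preferred backend is missing.
    raise ImportError("spacy is not installed")


compare = pick_comparator([unavailable, make_character_comparator])
```

In real use the list would be `[make_spacy_comparator, make_character_comparator]`, so the semantic comparator is preferred and the character-level one is used only when spaCy or its model is missing.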

Improve search algorithms

7d9f41f
