v.0.0.4
What's Changed
The previous SBERT model, all-MiniLM-L6-v2,
was not giving great results, so this release makes a few large changes to improve search quality:
- Using asymmetric semantic search, much closer to how a search engine works, with msmarco-distilroberta-base-v3
- Including more training data: the book author and book review fields from the original Goodreads dataset, to pull more semantic meaning into the search results
- Reconfiguring the Redis indexer to write each record as a single hash instead of requiring multiple hash lookups
- Removing Grafana for now; it was more noise than it's worth
- Including the author in the result set
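As a rough illustration of the asymmetric pattern above (a short query scored against longer documents), the ranking step looks like the sketch below. The project embeds text with SentenceTransformer("msmarco-distilroberta-base-v3"); the `embed` function here is a hypothetical bag-of-words stand-in so the sketch runs without downloading model weights, and the example documents are made up.

```python
import numpy as np

# Toy stand-in for SentenceTransformer("msmarco-distilroberta-base-v3").encode:
# the real pipeline embeds text with that model; a hashed bag-of-words vector
# lets the ranking logic run here without fetching model weights.
def embed(text, dim=256):
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

def cos_sim(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Asymmetric search: a short keyword-style query against longer documents.
query = "cozy small town mystery"
docs = [
    "A cozy mystery following a bookshop owner in a small coastal town",
    "A dense textbook on linear algebra and matrix decompositions",
]

q = embed(query)
ranked = sorted(docs, key=lambda d: cos_sim(q, embed(d)), reverse=True)
print(ranked[0])
```

The msmarco-* family is trained on query/passage pairs, which is why it fits this query-versus-document setup better than a symmetric sentence-similarity model like all-MiniLM-L6-v2.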
- A number of fixes by @veekaybee in #31
- Use PyTorch CPU only dependencies by @veekaybee in #33
- Retrain embeddings using asymmetric semantic search by @veekaybee in #39
- Fixing Docker Compose for prod by @veekaybee in #41
- Indexing speed by @veekaybee in #42
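The single-hash Redis layout described above (one hash per record, read back with one HGETALL) can be sketched like this. In production this would be redis-py's `r.hset(key, mapping=...)` and `r.hgetall(key)`; a plain dict stands in for Redis here so the sketch runs without a server, and the `book:<id>` key scheme and field names are illustrative, not the project's actual schema.

```python
store = {}  # stands in for a Redis instance

def hset(key, mapping):
    # Mirrors redis-py r.hset(key, mapping=...): all fields written in one call.
    store.setdefault(key, {}).update(mapping)

def hgetall(key):
    # Mirrors redis-py r.hgetall(key): the whole record comes back in one lookup.
    return dict(store.get(key, {}))

# Everything for one book lives under one key, so a read is a single
# HGETALL rather than several round trips across multiple hashes.
hset("book:1", {
    "title": "The Dispossessed",
    "author": "Ursula K. Le Guin",
    "review": "A thoughtful novel about two very different worlds",
})
print(hgetall("book:1")["author"])
```

Collapsing a record into one hash trades a little denormalization for fewer round trips per query, which is where the indexing-speed win comes from.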
Full Changelog: v.0.0.3...v.0.0.4