Refactor embedding operations and improve configuration #6
Fixes #2
Refactor code to use the `EmbeddingKey` class for embedding operations.

Introduce `EmbeddingKey` class:
- Add an `EmbeddingKey` class to encapsulate `text` and `model` in `src/operations.py`.
- Update `src/operations.py` to use `EmbeddingKey` instead of tuples for keys.
- Update `write_embedding_to_table`, `is_key_in_table`, `list_keys_in_table`, and `get_embedding_from_table` to use `EmbeddingKey`.

Refactor embedding operations:
- Add an `EmbeddingOperations` class in `src/embedding_operations.py` to encapsulate embedding-related operations.
- Move `pickle_embeddings`, `duckdb_embeddings`, and `get_similarity` into the `EmbeddingOperations` class.

Update embedding functions:
- Update the `pickle_embeddings` and `duckdb_embeddings` functions in `src/embedding.py` to use `EmbeddingKey`.

Add error handling:
- Add error handling in `src/openai_client.py`.
- Add error handling in `src/connection.py`.

Add configuration file:
- Add `config.yaml` with database name, model name, paths to files, OpenAI API key, and number of top documents.

Add tests:
- Add tests for the `EmbeddingOperations` class in `tests/test_embedding_operations.py`.
- Add tests for the `config.yaml` file in `tests/test_config.py`.
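
For reviewers, here is a minimal sketch of the tuple-to-`EmbeddingKey` change described above. It is illustrative only: the description says just that the class encapsulates `text` and `model`, so the frozen-dataclass form, the in-memory `table` dict, and the sample values are assumptions, not the code in `src/operations.py`.

```python
from dataclasses import dataclass


# Assumed shape: the PR only states that the class wraps `text` and `model`.
@dataclass(frozen=True)
class EmbeddingKey:
    text: str
    model: str


# A plain dict stands in for the real table (hypothetical, for illustration).
table = {}

key = EmbeddingKey(text="hello world", model="example-model")
table[key] = [0.1, 0.2, 0.3]

# A frozen dataclass is hashable and compares by field values, so a key
# rebuilt from the same text and model finds the same entry, just as the
# old (text, model) tuples did.
same_key = EmbeddingKey(text="hello world", model="example-model")
found = same_key in table
embedding = table[same_key]
```

Compared with a bare tuple, the named fields make call sites harder to get wrong (no swapping `text` and `model` by position), while keeping the value semantics needed for lookups.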