Minor tweaks to retrieval skill and file ingestion #9093

dusvyat · 2024-04-18T10:46:16Z

Description

The default chunk size and overlap values in file_handler have been increased for better performance.
In langchain_handler, checks have been added to use the same provider and model for embeddings if 'embedding_model_args' are not provided in input params.
Skill_tool's retriever_config has been renamed to 'config' for clarity.
LLM output parser errors are now caught: the response is returned if relevant, and the error is raised if it is genuine.
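
The embedding fallback described above could look roughly like the sketch below. It is only an illustration under assumptions: the function name `resolve_embedding_args` and the keys `provider` and `model_name` are hypothetical and not taken from the handler code; only `embedding_model_args` appears in this PR.

```python
def resolve_embedding_args(args):
    """Return embedding settings, falling back to the LLM's own provider and model.

    `provider` and `model_name` are assumed key names used for illustration only.
    """
    embedding_args = args.get("embedding_model_args")
    if embedding_args:
        # Explicit embedding settings win; copy so the caller's dict is not mutated.
        return dict(embedding_args)

    # No explicit settings: reuse whatever provider/model the LLM was given.
    fallback = {}
    for key in ("provider", "model_name"):
        if key in args:
            fallback[key] = args[key]
    return fallback


# With no 'embedding_model_args', the LLM settings are reused.
print(resolve_embedding_args({"provider": "openai", "model_name": "gpt-4"}))
```

Returning a copy keeps the input params untouched, so later code that reads `embedding_model_args` still sees exactly what the user passed.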

Fixes #issue_number


This update changes the langchain handler to handle errors raised during agent invocation. The call to the agent_executor's invoke method is now wrapped in exception handling: if the error message matches a specific format, the handler returns that message as the response instead of failing; otherwise the exception is still raised.
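
A minimal sketch of this kind of exception handling follows. It is not the handler's actual code. It assumes the "specific format" is a message beginning with ``Could not parse LLM output: ` `` (an assumption about how the parser error is worded), and `safe_invoke` and `FakeExecutor` are hypothetical names introduced only for the example.

```python
# Assumed prefix of the parser error message; not confirmed by this PR.
PARSE_ERROR_PREFIX = "Could not parse LLM output: `"


def safe_invoke(agent_executor, prompt):
    """Invoke the agent; recover the LLM text from a parser error, re-raise anything else."""
    try:
        return agent_executor.invoke(prompt)
    except Exception as e:
        message = str(e)
        if not message.startswith(PARSE_ERROR_PREFIX):
            # Not a parser error in the expected format: treat it as genuine.
            raise
        # Strip the prefix and the closing backtick to get the raw LLM text.
        text = message[len(PARSE_ERROR_PREFIX):]
        if text.endswith("`"):
            text = text[:-1]
        return {"output": text}


class FakeExecutor:
    """Hypothetical stand-in for an agent executor, used only in this example."""

    def __init__(self, error=None):
        self.error = error

    def invoke(self, prompt):
        if self.error is not None:
            raise self.error
        return {"output": "ok: " + prompt}


failing = FakeExecutor(ValueError("Could not parse LLM output: `The answer is 4`"))
print(safe_invoke(failing, "What is 2 + 2?"))
```

Matching on the message text is brittle if the wording changes, which is why anything that does not match is re-raised rather than swallowed.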

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

dusvyat · 2024-04-18T11:31:33Z

cc @ZoranPandovski

mindsdb/integrations/handlers/langchain_handler/langchain_handler.py

This change introduces logging for situations where 'vector_store_config' is not present in the `langchain_handler` tool configuration. It also modifies the condition that checks for the absence of this property to add a persisting directory. Furthermore, it refactors the code in `langchain_handler.py` related to the absence of 'embedding_model_args'.

The default vector store setup was removed from the langchain handler. The 'vector_store_config' is no longer automatically assigned, and an alert for this missing parameter is no longer logged. The code was adjusted to operate without the previously used 'mindsdb_path'.

tmichaeldb

Nice changes, let's get this merged for the hackathon

dusvyat added 2 commits April 18, 2024 13:42

dusvyat commented Apr 18, 2024

dusvyat requested review from tmichaeldb and ea-rus April 18, 2024 11:30

Update file_handler.py

63ca0e6

ea-rus reviewed Apr 18, 2024

dusvyat requested a review from ea-rus April 19, 2024 06:11

This was referenced Apr 19, 2024

Fixes for RAG Integration to Python SDK #9098

Merged

Add File Retrieval to Agents mindsdb/mindsdb_python_sdk#102

Merged

tmichaeldb approved these changes Apr 19, 2024

tmichaeldb merged commit 8abf1c1 into staging Apr 19, 2024
13 checks passed

ZoranPandovski mentioned this pull request May 21, 2024

Release v24.5.4.0 #9234

Closed

hamishfagg deleted the retrieval-skill-minor-tweaks branch June 10, 2024 21:29
