Refactor Langchain Handler #8998

Merged
merged 10 commits into staging on Mar 28, 2024

@tmichaeldb tmichaeldb commented Mar 26, 2024

Description

Fixes RAG-2.

This PR overhauls langchain_handler:

  • Make as many arguments as possible optional (e.g. user_column, assistant_column)
  • Remove mode argument as it was untested & unused
  • Add test suite for different providers
  • Clean up LLM config into util methods
  • Clean up overall implementation
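
As a rough illustration of the first and fourth bullets, the sketch below shows one way optional arguments could be filled in from defaults inside a small utility function. It is not the handler's actual code: `resolve_args`, `DEFAULT_USER_COLUMN`, `DEFAULT_ASSISTANT_COLUMN`, and the default values are hypothetical names chosen for this example. Only the two column arguments themselves are taken from the description.

```python
from typing import Any, Dict

# Hypothetical defaults, assumed for illustration only.
DEFAULT_USER_COLUMN = "question"
DEFAULT_ASSISTANT_COLUMN = "answer"


def resolve_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of args with missing optional keys filled from defaults."""
    resolved = dict(args)  # copy so the caller's dict is not mutated
    resolved.setdefault("user_column", DEFAULT_USER_COLUMN)
    resolved.setdefault("assistant_column", DEFAULT_ASSISTANT_COLUMN)
    return resolved
```

Keeping the defaults in one helper means callers can omit either column, and any value they do pass is left untouched.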

Type of change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ⚡ New feature (non-breaking change which adds functionality)
  • 📄 This change requires a documentation update

Verification Process

To ensure the changes are working as expected:

  • Test Location: ./tests/unit/ml_handlers/test_langchain.py
  • Verification Steps: Follow steps in README.md to create and query a model using the langchain handler.

Additional Media:

  • I have attached a brief loom video or screenshots showcasing the new functionality or change.

Checklist:

  • My code follows the style guidelines (PEP 8) of MindsDB.
  • I have appropriately commented on my code, especially in complex areas.
  • Necessary documentation updates are either made or tracked in issues.
  • Relevant unit and integration tests are updated or added.

@dusvyat dusvyat left a comment

Nice work, LGTM

dusvyat commented Mar 27, 2024

@paxcema - any major comments? Or am I fine to merge this one?

@paxcema paxcema left a comment

Slick set of changes, looks great 👍

@dusvyat dusvyat merged commit 3849238 into staging Mar 28, 2024
12 checks passed
@hamishfagg hamishfagg mentioned this pull request Apr 7, 2024
hamishfagg added a commit that referenced this pull request Apr 10, 2024
* safe extract for all tar files

* fix

* tests for:
- create empty table
- interval function

* added module and skeleton for GoogleServiceAccountOauth2Utilities

* added the NoCredentialsException

* raised NoCredentialsException if no creds provided

* added method of getting credentials: file and JSON

* added the method for download creds file

* refactored method for getting credentials: URL

* renamed class to GoogleServiceAccountOAuth2Manager

* added error handling

* imported the class to the main utils package

* added parsing for JSON credentials

* separated parsing credentials JSON to new function

* updated Vertex handler to use new credentials params

* updated Vertex client to use GoogleServiceAccountOAuth2Manager

* imported GoogleServiceAccountOAuth2Manager to main auth package

* removing redundant package lists and sentence-transformers library (#8971)

* removing redundant package lists and sentence-transformers library

* Add input column to RAG tests and update requirements

Added 'input_column' to RAG tests in `test_rag.py` to accommodate for changes in the function signatures. Furthermore, the RAG handler requirements have been updated to include 'sentence-transformers', required for HuggingFaceEmbeddings from the langchain-community package.

* Add allow_dangerous_deserialization in FAISS loading and fix typo in unit tests.

A new allow_dangerous_deserialization parameter was added to the FAISS loading in RAG handler settings, providing additional flexibility for deserialization. Additionally, a typo in the input column name within unit tests was fixed, which was causing the tests to fail.

---------

Co-authored-by: dusvyat <13661123+dusvyat@users.noreply.github.com>

* Fix tarfile safe extract

* sql dep

* Refactor Langchain Handler (#8998)

* Refactor langchain_handler

* Added more tests

* Add back mdb_read tool by default

* Remove prints

* Use updated import path

* Update test_llm_utils

* Add pydantic to reqs

* Remove pydantic from handler reqs now that it is in main reqs

* Remove explicit pydantic versioning to avoid conflict

* added method to get creds and used it in create and predict

* updated credentials parsing logic

* renamed GoogleOAuth2Manager to GoogleUserOAuth2Manager

* renamed google_oauth_utilities to google_user_oauth_utilities

* updated the README with new parameters

* removed unused json import

* Update query format in quickstart-tutorial.mdx

* fix: delete from table

* fix url in huggingface_inference_api.mdx

* test

* Fix hardcoded API base, read it from conn args too

* added a raise_for_status() check when downloading creds

* raised a ValueError if parsing JSON creds fail

* updated the CREATE ML_ENGINE examples in README

* refactored the error message for when parsing fails

* added a check for import errors during model creation

* added comment to explain change

* split req file paths to flag and path

* Add pred_args

* updated hostname of sample dbs

* fixed slack chatbot tutorial

* added default values for creds params in client

* updated the order of params passed to client

* clean up handler install log messages (#8720)

* Docker build improvements (#8999)

* build once and push cache later

* Separate cache job

* add db seed and mssql disconnect (#8742)

* fixed the init for the EventStoreDB handler

* removed unused imports from the Rockset handler

* removed unused deps from the Rockset handler

* fixed types in connection args

* refactored req files to be installed from abs path

* removed unnecessary call to set the success of result

* updated Snowflake docs with tip to install git

* fixed grammar in last sentence

* added space between the two lines

* updated email integration docs

* fix: dn.query could not support projection

* added a check for standalone comments

* added a check for inline comments

* make alembic use mindsdb log level (#9028)

* Bump version (#9041)

* Updated LightFM Integration Docs with Tip to Install Linux Dev Packages (#9042)

* added a tip for installing the Linux dev packages

* fixed a couple of grammatical errors

* updated ollama docs (#8995)

* updated ollama docs

* updated note

* added tip for docker

* updates

* updates

---------

Co-authored-by: Max Stepanov <stpmax@yandex.ru>
Co-authored-by: andrew <elkin.andr@gmail.com>
Co-authored-by: Minura Punchihewa <minurapunchihewa17@gmail.com>
Co-authored-by: QuantumPlumber <44450703+QuantumPlumber@users.noreply.github.com>
Co-authored-by: dusvyat <13661123+dusvyat@users.noreply.github.com>
Co-authored-by: Ty <124617566+tmichaeldb@users.noreply.github.com>
Co-authored-by: Andrey <andrey@mindsdb.com>
Co-authored-by: Vlad Romanenko <vlad.romanenko@hotmail.com>
Co-authored-by: Arnav K <arnavkaushal09@gmail.com>
Co-authored-by: Zoran Pandovski <zoran.pandovski@gmail.com>
Co-authored-by: Martyna <martyna@mindsdb.com>
Co-authored-by: Minura Punchihewa <49385643+MinuraPunchihewa@users.noreply.github.com>
Co-authored-by: martyna-mindsdb <109554435+martyna-mindsdb@users.noreply.github.com>