Feature/gsk 2334 talk to my model mvp #1831

AbSsEnT · 2024-03-04T15:31:37Z

Description

Adds a new feature called "Talk to my ML model". It lets users query the prediction results, explanations, and performance issues of a Giskard Model in natural language.
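
For readers unfamiliar with the tool-calling pattern that the commits in this PR refer to (a base tool class, a per-tool specification, a `__call__` method, and error handling around tool calls), here is a minimal, self-contained sketch. It is not the code from this PR: the class `PredictTool`, the function `dispatch`, the `row_index` parameter, and the shape of the `tool_call` dictionary are all hypothetical, and no LLM client is involved.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class BaseTool(ABC):
    """Hypothetical base class holding what every tool has in common."""

    name: str = ""
    description: str = ""

    @property
    @abstractmethod
    def parameters(self) -> Dict[str, Any]:
        """JSON-schema description of the arguments the tool accepts."""

    @property
    def specification(self) -> Dict[str, Any]:
        # The description an LLM would receive in order to decide when to call the tool.
        return {
            "type": "function",
            "function": {
                "name": self.name,
                "description": self.description,
                "parameters": self.parameters,
            },
        }

    @abstractmethod
    def __call__(self, **kwargs: Any) -> str:
        """Run the tool and return a textual result."""


class PredictTool(BaseTool):
    """Hypothetical tool that returns the prediction for one dataset row."""

    name = "predict"
    description = "Return the model prediction for one row of the dataset."

    def __init__(self, predict_fn: Callable[[Dict[str, Any]], Any], rows: List[Dict[str, Any]]):
        self._predict_fn = predict_fn
        self._rows = rows

    @property
    def parameters(self) -> Dict[str, Any]:
        return {
            "type": "object",
            "properties": {"row_index": {"type": "integer"}},
            "required": ["row_index"],
        }

    def __call__(self, row_index: int) -> str:
        return str(self._predict_fn(self._rows[row_index]))


def dispatch(tools: Dict[str, BaseTool], tool_call: Dict[str, str]) -> str:
    """Run the requested tool; on failure, return the error as text instead of raising."""
    try:
        tool = tools[tool_call["name"]]
        return tool(**json.loads(tool_call["arguments"]))
    except Exception as error:  # the message can be passed back to the LLM
        return f"Tool call failed: {error!r}"


# Usage with a stand-in "model" (a plain function) and two rows.
rows = [{"age": 25}, {"age": 70}]
tools = {"predict": PredictTool(lambda row: "senior" if row["age"] >= 65 else "adult", rows)}
print(dispatch(tools, {"name": "predict", "arguments": '{"row_index": 1}'}))
```

Returning the error as a string rather than raising is one possible design choice: it lets the conversation continue and gives the LLM a chance to retry with corrected arguments.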

Type of Change

📚 Examples / docs / tutorials / dependencies update
🔧 Bug fix (non-breaking change which fixes an issue)
🥂 Improvement (non-breaking change which improves an existing feature)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
🔐 Security fix

Checklist

I've read the CODE_OF_CONDUCT.md document.
I've read the CONTRIBUTING.md guide.
I've updated the code style using make codestyle.
I've written tests for all new methods and classes that I created.
I've written the docstring in Google format for all the methods and classes that I used.

luca-martial

Docs are fine for v1, let's adjust later

rabah-khalek

LGTM, good job @AbSsEnT

sonarcloud · 2024-04-11T13:38:59Z

Quality Gate passed

Issues
4 New issues
0 Accepted issues

Measures
0 Security Hotspots
86.9% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

* Initial commit for the MVP of "Talk to my model" functionality.
* Defined the basic pipeline of the 'talk' function.
* Defined the Tool interface and the boilerplate for the first tool, which returns a prediction from the row of the dataset.
* small addition
* Added method to initialise tools objects, each time the method 'talk' is called. Replaced placeholders and dummy variables to the real objects.
* Initial implementation of the "__call__" method.
* Bug fixes. Adapted flow to currently use legacy 'functions' instead of 'tools' API.
* Debugged "predict from dataset" tool workflow. Debugged the tool workflow. Performed prompt engineering for the tool description and LLM instruction.
* Initial implementation of the 'SHAPExplanationTool'.
* Added handling an errors, while calling tools.
* Moved more attributes and properties to the BaseTool, since they are common for all child Tools classes.
* Changed PredictFromDataset tool's specification.
* Adapting model.py to the use of tools API.
* Fully changed 'talk' method workflow to use tools API.
* Added multiple toll calling for the SHAP explanation tool
* Code refactoring.
* Initial implementation of the IssuesScannerTool, which gives user an info about model's performance issues.
* Refactoring.
* Removed __futures__ import.
* Started implementing prediction from user input tool.
* Implemented the final PredictUserInputTool.
* Put the shap explanations calculation logic into separate module.
* Explicitly set target to the 'None', when creating Dataset, to omit warnings.
* Distributed the tools across separate dedicated modules for easier maintenance. Code refactoring.
* Implemented history (context) persistence to enable dialogue regime between LLM and the User. Formatted the output of the model.talk() method.
* Small refactoring.
* Executed pre-commit hooks on all files.
* Update regarding new LLMClient API.
* Updated `pdm.lock`
* Finalised adaptation to the new LLMClient API for the 'talk' functionality.
* Removed "_form_tool_calls" method.
* Small fixes.
* Updated pdm.
* Updated pdm.
* United the PredictDatasetInput and PredictUserInput tools into single tool. It improves taxonomy of tools, making it more distinct. Also, and what is important, it allowed to calculate SHAP values, when an input is built from the user input.
* If we already see, that filtered dataset is of length 0, stop further potential filtering. Implemented fuzzy string features matching.
* Created the new tool to calculate model's performance metrics.
* Talk architecture polishing.
* Improved the system prompt to: 1) Avoid providing generic answers; 2) Refuse to answer on a harmful questions.
* System prompt improvement.
* 1) Updated to the latest gpt-4-turbo version; 2) Fixed the bug with metrics calculation, when there is a need to filter dataset;
* Bug fix.
* Added better spacing to the instruct prompt.
* Improved instruction to not provide generic answers.
* Added docstrings.
* Added docstrings.
* Added docstrings.
* Added docstrings.
* Added docstrings.
* Added docstrings.
* Updated typing with the respect to not using the __futures__.
* Replaced thefuzz.ratio() by the native difflib.SequenceMatcher().ratio()
* Removed optional list casting.
* Refactored the dataset filtering logic. Added comments.
* Removed useless casting to list.
* Simplified assignment expression.
* Small fix.
* Replaced by the object's method call.
* Replaced the __str__ by the __repr__
* Moved fuzzy similarity threshold to the config.
* Small fix.
* Removed import BaseOpenAIClient from model.py
* 1) The 'dataset' argument of the 'talk' is mandatory now. 2) An exception will appear, if the user will call the "IssuesScannerTool" through the 'talk', without providing the "scan_report" argument.
* Added clarifying comments, on why to use non-top-level imports, as well as on background sample calculation.
* Added the possibility of configuring Talk LLM model through the env variable.
* Returned the from __future__ import annotations, since we accept such protocol.
* Documented the reason, why to import functions not from the top-level.
* Improved typing and docstrings.
* [RESTORING] dataset is not mandatory parameter.
* Created the new group 'talk' for the 'talk-to-my-ml' feature dependencies.
* Regenerating pdm.lock
* - Fixed ambiguity in calling for 'model performance'. Now, the metric calculation tool is called, instead of scan issues tool.
  - Fixed seed and temperature of the LLM client.
  - Put LLM client.complete() parameters into separate dict.
  - Now the scan tool is supplied with scan_report.to_markdown(template="hugging_face"), thus having more info on scan report, preventing hallucinations.
* Regenerating pdm.lock
* Created unit-tests for the 'talk' feature.
* Small fix.
* Regenerating pdm.lock
* Committing missing pytest file with unit-tests for the 'talk' feature.
* Update giskard/llm/talk/config.py Co-authored-by: Rabah Abdul Khalek <rabah.khalek@gmail.com>
* Update giskard/llm/talk/config.py Co-authored-by: Rabah Abdul Khalek <rabah.khalek@gmail.com>
* Update giskard/llm/talk/config.py
* Update giskard/llm/talk/config.py
* Update giskard/llm/talk/config.py Co-authored-by: Rabah Abdul Khalek <rabah.khalek@gmail.com>
* Update giskard/llm/talk/config.py
* Fixed typos with GPT.
* Better exception raising logic.
* 1) Specified, that model and dataset are mandatory parameters of tools. 2) Improved the logic of mapping pandas dtypes to json dtypes.
* Update giskard/llm/talk/tools/metric.py Co-authored-by: Rabah Abdul Khalek <rabah.khalek@gmail.com>
* Removed comments.
* Made features_json_type as a property.
* Added `features_dict` validation logic.
* Replaced metrics calculation functions from sklearn to giskard
* Fixed unit-tests by escaping regex-sensitive characters.
* Re-made unit-tests. Mocked LLM responses to avoid dependence on OpenAI API calls.
* Fixed CI/CD errors: 1) Added 'tabulate' package to the 'talk' dependency group'; 2) Improved error matching criteria in the 'talk' unit-tests.
* Regenerating pdm.lock
* Fixed CI/CD errors: Improved error matching criteria in the 'talk' unit-tests to make it compatible with python 3.9.
* Delete pdm.lock
* Regenerating pdm.lock
* Created the docs page for the AI Quality Copilot.
* Regenerating pdm.lock
* Regenerating pdm.lock
* Small docs fix.
* Removed instruction because of redundancy.
* Rewrote the initialization of all tools. Now only mandatory tool parameters can be passed. Also, improved docstrings.
* Introduced PredictionMixin class to abstract away common prediction necessary methods of the Predict and Metric tools. Reduces code duplication.
* Small docstring fix.
* Added doc page for the AI Quality Copilot.
* Returned old page.
* Returned old page.
* Once again, I added the doc page for the AI Quality Copilot.
* Delete pdm.lock
* Regenerating pdm.lock
* Delete pdm.lock
* Regenerating pdm.lock
* Update talk_result.py

---------

Co-authored-by: Hartorn <bazire@giskard.ai>
Co-authored-by: BotLocker <bot.locker@users.noreply.github.com>
Co-authored-by: Rabah Abdul Khalek <rabah.khalek@gmail.com>
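
The commit message above mentions fuzzy matching of string features via `difflib.SequenceMatcher().ratio()`, a similarity threshold kept in the config, and stopping further filtering once the filtered dataset is empty. The sketch below illustrates that idea on plain Python dictionaries. It is an assumption-laden illustration, not the PR's implementation: the names `FUZZY_SIMILARITY_THRESHOLD`, `fuzzy_match`, and `filter_rows`, the threshold value 0.8, and the case-insensitive comparison are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Any, Dict, List

# Hypothetical threshold; the PR keeps its own value in a config module.
FUZZY_SIMILARITY_THRESHOLD = 0.8


def fuzzy_match(left: str, right: str, threshold: float = FUZZY_SIMILARITY_THRESHOLD) -> bool:
    """Return True if the two strings are similar enough (case-insensitive here)."""
    return SequenceMatcher(None, left.lower(), right.lower()).ratio() >= threshold


def filter_rows(
    rows: List[Dict[str, Any]],
    feature_values: Dict[str, str],
    threshold: float = FUZZY_SIMILARITY_THRESHOLD,
) -> List[Dict[str, Any]]:
    """Keep rows whose string features fuzzily match every requested value."""
    filtered = list(rows)
    for feature, value in feature_values.items():
        filtered = [row for row in filtered if fuzzy_match(str(row[feature]), value, threshold)]
        if not filtered:
            # Nothing left to filter, so the remaining features need not be checked.
            break
    return filtered


# Usage: a misspelled city name still selects the intended row.
rows = [
    {"city": "New York", "job": "Engineer"},
    {"city": "Boston", "job": "Teacher"},
]
print(filter_rows(rows, {"city": "new yrok"}))
```

Using the standard library here avoids an extra dependency, which matches the stated motivation for replacing `thefuzz`.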

AbSsEnT added 30 commits December 12, 2023 16:55

Initial commit for the MVP of "Talk to my model" functionality.

bfc87e5

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

e8a3b6e

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

3316bd4

Defined the basic pipeline of the 'talk' function.

4452fd2

Defined the Tool interface and the boilerplate for the first tool, which returns a prediction from the row of the dataset.

847e8b8

small addition

7ca2416

Added method to initialise tools objects, each time the method 'talk' is called. Replaced placeholders and dummy variables to the real objects.

b44386b

Initial implementation of the "__call__" method.

b727b77

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

84318d1

Merge branch 'feature/gsk-2334-talk-to-my-model-mvp' of github.com:Giskard-AI/giskard into feature/gsk-2335-query-prediction-for-the-row-from-the-dataset

71a08b5

Bug fixes. Adapted flow to currently use legacy 'functions' instead of 'tools' API.

549c916

Debugged "predict from dataset" tool workflow. Debugged the tool workflow. Performed prompt engineering for the tool description and LLM instruction.

66673a3

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

4deb3fc

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2335-query-prediction-for-the-row-from-the-dataset

0e98572

Merge branch 'feature/gsk-2334-talk-to-my-model-mvp' of github.com:Giskard-AI/giskard into feature/gsk-2335-query-prediction-for-the-row-from-the-dataset

e2781aa

Merge pull request #1687 from Giskard-AI/feature/gsk-2335-query-prediction-for-the-row-from-the-dataset Implementation of the "PredictFromDatasetTool"

66f7429

Initial implementation of the 'SHAPExplanationTool'.

b46fd57

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

a836a1e

Merge branch 'feature/gsk-2334-talk-to-my-model-mvp' of github.com:Giskard-AI/giskard into feature/gsk-2336-query-shap-prediction-explanation

4dcd490

Added handling an errors, while calling tools.

777011a

Merge branch 'feature/gsk-2367-migrate-openai-api-call-from-functions-to-tools' of github.com:Giskard-AI/giskard into feature/gsk-2336-query-shap-prediction-explanation

19488bd

Merge branch 'feature/gsk-2367-migrate-openai-api-call-from-functions-to-tools' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

cd2bdf3

Merge branch 'feature/gsk-2334-talk-to-my-model-mvp' of github.com:Giskard-AI/giskard into feature/gsk-2336-query-shap-prediction-explanation

0a9315a

Merge pull request #1696 from Giskard-AI/feature/gsk-2336-query-shap-prediction-explanation Feature/gsk 2336 query shap prediction explanation

4db4482

Moved more attributes and properties to the BaseTool, since they are common for all child Tools classes.

6b86d48

Changed PredictFromDataset tool's specification.

bbcf347

Adapting model.py to the use of tools API.

e248aab

Merge branch 'feature/gsk-2367-migrate-openai-api-call-from-functions-to-tools' of github.com:Giskard-AI/giskard into feature/gsk-2419-adapt-workflow-to-the-tools-api

394b2bb

Fully changed 'talk' method workflow to use tools API.

dc155ee

Added multiple toll calling for the SHAP explanation tool

9a70228

AbSsEnT added 4 commits April 9, 2024 17:54

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

b2c64f1

Returned old page.

f0d2818

Returned old page.

66764e4

Once again, I added the doc page for the AI Quality Copilot.

ad2be7f

AbSsEnT requested a review from rabah-khalek April 9, 2024 16:52

rabah-khalek requested a review from luca-martial April 9, 2024 17:34

luca-martial approved these changes Apr 10, 2024

Delete pdm.lock

5cf2beb

rabah-khalek approved these changes Apr 10, 2024

BotLocker and others added 3 commits April 10, 2024 14:24

Regenerating pdm.lock

f003572

Merge branch 'main' into feature/gsk-2334-talk-to-my-model-mvp

8d2af8b

Delete pdm.lock

c746770

andreybavt added the Lockfile label (Temporary label to update pdm.lock) Apr 10, 2024

Regenerating pdm.lock

05d2c82

rabah-khalek removed the Lockfile label (Temporary label to update pdm.lock) Apr 10, 2024

Update talk_result.py

b0d4e7a

rabah-khalek enabled auto-merge (squash) April 10, 2024 15:11

AbSsEnT added 4 commits April 10, 2024 19:33

Returned the logic of tool calling to the LLMClient.

479d91d

Merge branch 'feature/gsk-2334-talk-to-my-model-mvp' into openai-client-tool-calling-return

d195484

Modified the LLMClient to support tool calling functionality as it was before.

2982c02

Merge branch 'main' of github.com:Giskard-AI/giskard into feature/gsk-2334-talk-to-my-model-mvp

fe2f650

AbSsEnT requested a review from andreybavt April 11, 2024 13:09

rabah-khalek merged commit 5d1894d into main Apr 11, 2024
16 checks passed

rabah-khalek deleted the feature/gsk-2334-talk-to-my-model-mvp branch April 11, 2024 13:44

rabah-khalek restored the feature/gsk-2334-talk-to-my-model-mvp branch April 11, 2024 14:26

rabah-khalek added a commit that referenced this pull request Apr 11, 2024

Revert "Feature/gsk 2334 talk to my model mvp (#1831)"

451a3a0

This reverts commit 5d1894d.

rabah-khalek mentioned this pull request Apr 11, 2024

Revert "Feature/gsk 2334 talk to my model mvp" #1887

Merged

pierlj pushed a commit that referenced this pull request Apr 15, 2024

Revert "Feature/gsk 2334 talk to my model mvp (#1831)"

657b386

This reverts commit 5d1894d.
