-
Notifications
You must be signed in to change notification settings - Fork 0
feat: Add RetrySqlQueryCreatorTool for handling failed SQL query generation #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: base-sha/44e9c005f114a3b74ad3ceb87698d4044c012875
Are you sure you want to change the base?
Conversation
|
This is a benchmark review for experiment This pull request was cloned from Experiment configurationreview_config:
# User configuration for the review
# - benchmark - use the user config from the benchmark reviews
# - <value> - use the value directly
user_review_config:
enable_ai_review: true
enable_rule_comments: false
enable_complexity_comments: benchmark
enable_security_comments: benchmark
enable_tests_comments: benchmark
enable_comment_suggestions: benchmark
enable_functionality_review: benchmark
enable_pull_request_summary: benchmark
enable_review_guide: benchmark
enable_approvals: true
ai_review_config:
# The model responses to use for the experiment
# - benchmark - use the model responses from the benchmark reviews
# - llm - call the language model to generate responses
model_responses:
comments_model: benchmark
comment_area_model: benchmark
comment_validation_model: benchmark
comment_suggestion_model: benchmark
complexity_model: benchmark
docstrings_model: benchmark
functionality_model: benchmark
security_model: benchmark
tests_model: benchmark
pull_request_summary_model: benchmark
review_guide_model: benchmark
# The pull request dataset to run the experiment on
pull_request_dataset:
- https://github.com/ghostbsd/ghostbsd-src/pull/328
- https://github.com/dan5e3s6ares/a-real-mock-api/pull/3
- https://github.com/unknowIfGuestInDream/document/pull/117
- https://github.com/code-Harsh247/yt_playlist_exporter/pull/13
- https://github.com/Fenigor/align-game/pull/21
- https://github.com/lehuygiang28/vnpay/pull/16
- https://github.com/nuxeo/nuxeo-drive/pull/5053
- https://github.com/skypointcloud/skypoint-langchain/pull/15
- https://github.com/4DNucleome/PartSeg/pull/1114
- https://github.com/4DNucleome/PartSeg/pull/1115
- https://github.com/4DNucleome/PartSeg/pull/1116
- https://github.com/dreamerminsk/tasked/pull/77
- https://github.com/dreamerminsk/tasked/pull/78
- https://github.com/dreamerminsk/tasked/pull/79
- https://github.com/dreamerminsk/tasked/pull/80
- https://github.com/medulla-tech/medulla/pull/619
- https://github.com/medulla-tech/medulla/pull/620
- https://github.com/medulla-tech/medulla/pull/621
- https://github.com/mraniki/MyLLM/pull/574
- https://github.com/alexsoyes/ai-driven-dev-community/pull/5
- https://github.com/alexsoyes/ai-driven-dev-community/pull/6
- https://github.com/cpp-lln-lab/CPP_HPC/pull/34
- https://github.com/cpp-lln-lab/CPP_HPC/pull/35
- https://github.com/Eliver-Salazar/PED/pull/4
- https://github.com/Eliver-Salazar/PED/pull/6
- https://github.com/Eliver-Salazar/PED/pull/7
- https://github.com/usama-maxenius/image-editor/pull/129
- https://github.com/usama-maxenius/image-editor/pull/125
- https://github.com/usama-maxenius/image-editor/pull/126
- https://github.com/usama-maxenius/image-editor/pull/127
- https://github.com/usama-maxenius/image-editor/pull/128
- https://github.com/elixir-cloud-aai/tus-storagehandler/pull/3
- https://github.com/iptux-src/iptux/pull/617
- https://github.com/jhanley634/dojo-2024-06-18-geocode/pull/8
- https://github.com/phenobarbital/asyncdb/pull/1155
- https://github.com/bengosney/cerberus/pull/962
- https://github.com/gdsfactory/klive/pull/11
- https://github.com/pozapas/awesome-crowdynamics/pull/3
- https://github.com/flet-dev/flet/pull/3582
- https://github.com/jackdewinter/pymarkdown/pull/1118
- https://github.com/erxes/erxes/pull/5496
- https://github.com/erxes/erxes/pull/5497
- https://github.com/erxes/erxes/pull/5499
- https://github.com/erxes/erxes/pull/5500
- https://github.com/erxes/erxes/pull/5503
- https://github.com/erxes/erxes/pull/5504
- https://github.com/erxes/erxes/pull/5501
- https://github.com/erxes/erxes/pull/5502
- https://github.com/alanrenouf/ECSExample/pull/1
- https://github.com/ICRAR/shark/pull/17
review_comment_labels:
- label: correct
question: Is this comment correct?
- label: helpful
question: Is this comment helpful?
- label: comment-type
question: Is the comment type correct?
- label: comment-area
question: Is the comment area correct?
- label: llm-test
question: |
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
# Benchmark reviews generated by running
# python -m scripts.experiment benchmark <experiment_name>
benchmark_reviews: []
|
Reviewer's Guide by SourceryThis pull request introduces a new tool, RetrySqlQueryCreatorTool, to handle failed SQL query generation by retrying the creation process. The SQL query creation workflow has been updated to integrate this new tool, and the prompt used for retrying SQL queries has been enhanced to provide more detailed instructions. File-Level Changes
Tips
|
SourceryAI
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @sourcery-ai-experiments-bot - I've reviewed your changes and they look great!
Here's what I looked at during the review
- 🟡 General issues: 8 issues found
- 🟡 Security: 1 issue found
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.
| from langchain_core.tools import StateTool | ||
| import re | ||
|
|
||
| ERROR = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Unused global variable
The ERROR variable is defined but never used in the code. Consider removing it if it's not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| ) | ||
| executable_query = executable_query.strip('\"') | ||
| executable_query = re.sub('\\n```', '',executable_query) | ||
| self.db.run_no_throw(executable_query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Duplicate database query execution
The self.db.run_no_throw(executable_query) is called twice consecutively. This seems redundant and could be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
|
|
||
| def _extract_sql_query(self): | ||
| for value in self.state: | ||
| for value in reversed(self.state): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question (performance): Reversed iteration over state
Reversing the state list might have performance implications if the list is large. Ensure this is necessary for the logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| for value in reversed(self.state): | ||
| for key, input_string in value.items(): | ||
| if "sql_db_query_creator" in key: | ||
| if "tool='retry_sql_db_query_creator'" in key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Hardcoded tool name
Consider defining the tool names as constants to avoid hardcoding strings multiple times.
| if "tool='retry_sql_db_query_creator'" in key: | |
| RETRY_SQL_DB_QUERY_CREATOR = "tool='retry_sql_db_query_creator'" | |
| SQL_DB_QUERY_CREATOR = "tool='sql_db_query_creator'" | |
| for value in reversed(self.state): | |
| for key, input_string in value.items(): | |
| if RETRY_SQL_DB_QUERY_CREATOR in key: | |
| return input_string | |
| elif SQL_DB_QUERY_CREATOR in key: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| ) | ||
| query_creator_chain = LLMChain(llm=self.sqlcreatorllm, prompt=prompt_input) | ||
|
|
||
| sql_query = query_creator_chain.run( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Error handling for query creation
There is no error handling for the query_creator_chain.run method. Consider adding try-except blocks to handle potential exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| SQL_QUERY_CREATOR_RETRY = """ | ||
| You have failed in the first attempt to generate correct sql query. Please try again to rewrite correct sql query. | ||
| """ | ||
| Your task is convert an incorrect query resulting from user question to a correct query which is databricks sql compatible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick (typo): Grammar issue in prompt
The sentence should be 'Your task is to convert an incorrect query resulting from a user question to a correct query which is Databricks SQL compatible.'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| ) | ||
| ) | ||
| sql_query = sql_query.replace("```","") | ||
| sql_query = sql_query.replace("sql","") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚨 issue (security): Potentially unsafe string replacement
Replacing 'sql' in the query string might lead to unintended consequences if 'sql' appears in the actual query. Consider a more targeted approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| return input_string | ||
| return None | ||
|
|
||
| def _extract_error_message(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue: Error message extraction logic
The method _extract_error_message assumes that the error message will always contain 'Error'. This might not be robust for all error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of LLM test could this comment become?
- 👍 - this comment is really good/important and we should always make it
- 👎 - this comment is really bad and we should never make it
- no reaction - don't turn this comment into an LLM test
| run_manager: Optional[CallbackManagerForToolRun] = None, | ||
| ) -> str: | ||
| """Get the SQL query for the incorrect query.""" | ||
| return self._create_sql_query(user_input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Missing type hint for method parameter
Consider adding type hints for the user_input parameter in the _run method for better code clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this comment helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment type correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the comment area correct?
Add RetrySqlQueryCreatorTool for handling failed SQL query generation
Thank you for contributing to LangChain!
If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
Summary by Sourcery
This pull request adds a new tool, RetrySqlQueryCreatorTool, to handle failed SQL query generation by retrying the creation of SQL queries based on incorrect queries and error messages. It also updates the SQL query creation process to utilize this new tool and enhances the prompt template for retrying SQL queries.