refactor: api/services/tools/api_tools_manage_service.py to session#32159
refactor: api/services/tools/api_tools_manage_service.py to session#32159asukaminato0721 wants to merge 21 commits intolanggenius:mainfrom
Conversation
Summary of ChangesHello @asukaminato0721, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a significant refactoring to the API tool management service by standardizing its database interaction layer. The core change involves migrating all database operations from direct Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors api_tools_manage_service.py to use a session factory for database operations, which is a good step towards better session management. However, the refactoring is incomplete and introduces several issues. Multiple methods use fetched database objects outside the session's scope, which can lead to DetachedInstanceError or inconsistent session usage. One update operation is missing transactional support, which is a critical bug. Furthermore, calls to other services that still rely on the global db.session break transactional atomicity. These issues need to be addressed to ensure the refactoring is robust and correct.
There was a problem hiding this comment.
Pull request overview
Refactors ApiToolManageService to migrate database access from Flask-SQLAlchemy’s db.session to the project’s session_factory (pure SQLAlchemy sessions), aligning with the ongoing session migration effort.
Changes:
- Replaced
db.sessionusage withsession_factory.create_session()across CRUD/query paths. - Wrapped create/delete flows in explicit transactional blocks (
session.begin()). - Updated read-only queries to use short-lived sessions.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request successfully refactors the database session management across api/core/tools/tool_label_manager.py and api/services/tools/api_tools_manage_service.py. By replacing the global db object with session_factory and explicit Session passing, the changes enhance modularity, testability, and better control over database transactions. The implementation correctly utilizes with session_factory.create_session() as session and session.begin() where appropriate, ensuring proper session lifecycle management. This is a positive and well-executed refactoring.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| if labels is not None: | ||
| ToolLabelManager.update_tool_labels( | ||
| ToolTransformService.workflow_provider_to_controller(workflow_tool_provider), labels, session | ||
| ) | ||
| return {"result": "success"} |
There was a problem hiding this comment.
Same issue as the API provider path: if workflow_tool_provider.id is assigned by the DB on flush/insert, calling update_tool_labels before a session.flush() can write bindings without a valid tool_id. Add session.flush() after session.add(workflow_tool_provider) and before updating labels.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """ | ||
| if db_tool is None: | ||
| raise ValueError("Tool not found") | ||
| with session_factory.create_session() as session, session.begin(): |
There was a problem hiding this comment.
Using session.begin() for a read-only operation is unnecessary. The _get_workflow_tool method only reads data. Based on the codebase convention, read-only operations should use with session_factory.create_session() as session: without the session.begin() call. This will avoid unnecessary transaction overhead.
| with session_factory.create_session() as session, session.begin(): | |
| with session_factory.create_session() as session: |
| .where(WorkflowToolProvider.tenant_id == tenant_id, WorkflowToolProvider.id == workflow_tool_id) | ||
| .first() | ||
| ) | ||
| with session_factory.create_session() as session, session.begin(): |
There was a problem hiding this comment.
Using session.begin() for a read-only operation is unnecessary. The list_single_workflow_tools method only reads data. Based on the codebase convention, read-only operations should use with session_factory.create_session() as session: without the session.begin() call. This will avoid unnecessary transaction overhead.
| with session_factory.create_session() as session, session.begin(): | |
| with session_factory.create_session() as session: |
| with session_factory.create_session() as session, session.begin(): | ||
| workflow_app = ( | ||
| session.query(App).where(App.id == db_tool.app_id, App.tenant_id == db_tool.tenant_id).first() | ||
| ) | ||
|
|
||
| workflow_app: App | None = ( | ||
| db.session.query(App).where(App.id == db_tool.app_id, App.tenant_id == db_tool.tenant_id).first() | ||
| ) | ||
|
|
||
| if workflow_app is None: | ||
| raise ValueError(f"App {db_tool.app_id} not found") | ||
|
|
||
| workflow = workflow_app.workflow | ||
| if not workflow: | ||
| raise ValueError("Workflow not found") | ||
|
|
||
| tool = ToolTransformService.workflow_provider_to_controller(db_tool) | ||
| workflow_tools: list[WorkflowTool] = tool.get_tools(tenant_id) | ||
| if len(workflow_tools) == 0: | ||
| raise ValueError(f"Tool {db_tool.id} not found") | ||
|
|
||
| tool_entity = workflow_tools[0].entity | ||
| # get output schema from workflow tool entity | ||
| output_schema = tool_entity.output_schema | ||
|
|
||
| return { | ||
| "name": db_tool.name, | ||
| "label": db_tool.label, | ||
| "workflow_tool_id": db_tool.id, | ||
| "workflow_app_id": db_tool.app_id, | ||
| "icon": json.loads(db_tool.icon), | ||
| "description": db_tool.description, | ||
| "parameters": jsonable_encoder(db_tool.parameter_configurations), | ||
| "output_schema": output_schema, | ||
| "tool": ToolTransformService.convert_tool_entity_to_api_entity( | ||
| tool=tool.get_tools(db_tool.tenant_id)[0], | ||
| labels=ToolLabelManager.get_tool_labels(tool), | ||
| tenant_id=tenant_id, | ||
| ), | ||
| "synced": workflow.version == db_tool.version, | ||
| "privacy_policy": db_tool.privacy_policy, | ||
| } | ||
| if workflow_app is None: | ||
| raise ValueError(f"App {db_tool.app_id} not found") | ||
|
|
||
| workflow = workflow_app.workflow | ||
| if not workflow: | ||
| raise ValueError("Workflow not found") | ||
|
|
||
| tool = ToolTransformService.workflow_provider_to_controller(db_tool) | ||
| workflow_tools: list[WorkflowTool] = tool.get_tools(tenant_id) | ||
| if len(workflow_tools) == 0: | ||
| raise ValueError(f"Tool {db_tool.id} not found") | ||
|
|
||
| tool_entity = workflow_tools[0].entity | ||
| # get output schema from workflow tool entity | ||
| output_schema = tool_entity.output_schema | ||
|
|
||
| return { | ||
| "name": db_tool.name, | ||
| "label": db_tool.label, | ||
| "workflow_tool_id": db_tool.id, | ||
| "workflow_app_id": db_tool.app_id, | ||
| "icon": json.loads(db_tool.icon), | ||
| "description": db_tool.description, | ||
| "parameters": jsonable_encoder(db_tool.parameter_configurations), | ||
| "output_schema": output_schema, | ||
| "tool": ToolTransformService.convert_tool_entity_to_api_entity( | ||
| tool=tool.get_tools(db_tool.tenant_id)[0], | ||
| labels=ToolLabelManager.get_tool_labels(tool), | ||
| tenant_id=tenant_id, | ||
| ), | ||
| "synced": workflow.version == db_tool.version, | ||
| "privacy_policy": db_tool.privacy_policy, | ||
| } |
There was a problem hiding this comment.
Creating a new session within _get_workflow_tool while using a db_tool object from a previous session will cause detached object issues. The db_tool object is loaded in the calling methods (lines 252-258 and 269-275) within one session, but then accessed (e.g., db_tool.app_id, db_tool.tenant_id, db_tool.name, etc.) within a new session created at line 286.
SQLAlchemy objects are bound to their originating session. When accessed from a different session context, lazy-loaded relationships and attributes may fail. Instead of creating a new session here, the session should be passed as a parameter from the calling methods, or the entire query logic should be moved into the calling methods' session context.
| user_tool_provider.tools = [ | ||
| ToolTransformService.convert_tool_entity_to_api_entity( | ||
| tool=tool.get_tools(tenant_id)[0], | ||
| with session_factory.create_session() as session, session.begin(): |
There was a problem hiding this comment.
Using session.begin() for a read-only operation is unnecessary. The list_tenant_workflow_tools method only reads data and doesn't perform any writes. Based on the codebase convention, read-only operations should use with session_factory.create_session() as session: without the session.begin() call, as seen in other similar methods like get_tool_labels and get_tools_labels in tool_label_manager.py. This will avoid unnecessary transaction overhead.
| with session_factory.create_session() as session, session.begin(): | |
| with session_factory.create_session() as session: |
| .first() | ||
| ) | ||
| return cls._get_workflow_tool(tenant_id, db_tool) | ||
| with session_factory.create_session() as session, session.begin(): |
There was a problem hiding this comment.
Using session.begin() for a read-only operation is unnecessary. The get_workflow_tool_by_tool_id method only reads data. Based on the codebase convention, read-only operations should use with session_factory.create_session() as session: without the session.begin() call. This will avoid unnecessary transaction overhead.
| .first() | ||
| ) | ||
| return cls._get_workflow_tool(tenant_id, db_tool) | ||
| with session_factory.create_session() as session, session.begin(): |
There was a problem hiding this comment.
Using session.begin() for a read-only operation is unnecessary. The get_workflow_tool_by_app_id method only reads data. Based on the codebase convention, read-only operations should use with session_factory.create_session() as session: without the session.begin() call. This will avoid unnecessary transaction overhead.
| with session_factory.create_session() as session, session.begin(): | |
| with session_factory.create_session() as session: |
Pyrefly DiffNo changes detected. |
Pyrefly DiffNo changes detected. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| session.add(provider) | ||
|
|
||
| # delete cache | ||
| cache.delete() | ||
|
|
||
| # update labels | ||
| ToolLabelManager.update_tool_labels(provider_controller, labels, session) |
There was a problem hiding this comment.
cache.delete() now runs inside the DB transaction. If the transaction later rolls back/raises, the cache will already be invalidated, causing cache/DB inconsistency. Consider moving cache invalidation to after the session.begin() block successfully commits (or use an after-commit hook) so cache is only cleared on success.
| with session_factory.create_session() as session: | ||
| stmt = select(ToolLabelBinding.label_name).where( | ||
| ToolLabelBinding.tool_id == provider_id, | ||
| ToolLabelBinding.tool_type == controller.provider_type.value, | ||
| ) | ||
| labels = session.scalars(stmt).all() |
There was a problem hiding this comment.
get_tool_labels always creates its own SQLAlchemy session. Callers like ApiToolManageService.list_api_tools already run inside a create_session() context, so this causes extra connections and breaks transactional consistency (labels may be read from a different transaction than the provider query). Consider accepting an optional session: Session | None parameter (creating one only when not provided) and have callers pass their existing session.
| with session_factory.create_session() as session: | ||
| labels = session.scalars(select(ToolLabelBinding).where(ToolLabelBinding.tool_id.in_(provider_ids))).all() | ||
|
|
There was a problem hiding this comment.
get_tools_labels opens a new session even when the caller is already within an existing session/transaction (e.g., WorkflowToolManageService.list_tenant_workflow_tools). This adds avoidable overhead and can return labels from a different transaction snapshot than the tool provider rows. Consider adding an optional session parameter (similar to update_tool_labels) so callers can reuse their session.
| with session_factory.create_session() as session, session.begin(): | ||
| workflow_app = ( | ||
| session.query(App).where(App.id == db_tool.app_id, App.tenant_id == db_tool.tenant_id).first() | ||
| ) |
There was a problem hiding this comment.
_get_workflow_tool creates a new session/transaction even though its callers (get_workflow_tool_by_tool_id / get_workflow_tool_by_app_id) already run inside create_session() + session.begin(). This nested-session pattern adds overhead and can lead to inconsistent reads across different transactions. Consider passing the caller's session into _get_workflow_tool (or removing the inner create_session() block) so all queries run in a single session.
Pyrefly DiffNo changes detected. |
Important
Fixes #<issue number>.Summary
part of #24115
Screenshots
Checklist
make lintandmake type-check(backend) andcd web && npx lint-staged(frontend) to appease the lint gods