-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: update langchain version #2641
Conversation
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
tools: Optional[Sequence[Union[ToolDict, GoogleTool]]] = None, | ||
functions: Optional[Sequence[FunctionDeclarationType]] = None, | ||
tools: Optional[Sequence[Union[_ToolDict, GoogleTool]]] = None, | ||
functions: Optional[Sequence[_FunctionDeclarationType]] = None, | ||
safety_settings: Optional[SafetySettingDict] = None, | ||
tool_config: Optional[Union[Dict, _ToolConfigDict]] = None, | ||
generation_config: Optional[Dict[str, Any]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several improvements that can be made to this code:
-
Import Statements: The order of imports has been rearranged more logically.
tools
should typically come beforefunctions
, and safety settings might need reordering if applicable. -
Optional Parameters Handling: Ensure that optional parameters like
stop
are explicitly set when they are not provided by using default arguments within the function definition. -
Documentation Comments: Add more documentation comments to clarify the purpose and usage of each parameter, especially
_tool_config
. -
Naming Consistency: Consider renaming variables and dictionaries for better readability and consistency with other parts of the library.
-
Safety Setting Dict Check: Ensure that safety settings are properly checked or used, depending on how you intend to handle them.
Here's an improved version of the code:
from typing import List, Dict, Optional, Sequence, Union, Any, Iterator, cast
from google.ai.generativelanguage_v1beta.types import (
Tool as GoogleTool,
)
from langchain_core.callbacks import CallbackManagerForLLMRun
from langchain_core.messages import BaseMessage
from langchain_core.outputs import ChatGenerationChunk
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_genai._function_utils import _ToolConfigDict, _ToolDict
from langchain_google_genai.chat_models import _chat_with_retry, _response_to_result, _FunctionDeclarationType, _SafetySettingsDict
from setting.models_provider.base_model_provider import MaxKBBaseModel
def _stream(
messages: List[BaseMessage],
max_tokens_per_message: int = 506,
temperature: float = 0.7,
top_p: float = 0.95,
presence_penalty: float = 0.,
frequency_penalty: float = 0.,
logit_bias: Optional[dict] = {},
stop: Optional[List[str]] = None,
run_manager: Optional[CallbackManagerForLLMRun] = None,
*,
tools: Optional[Sequence[Union[_ToolDict, GoogleTool]]] = None,
functions: Optional[Sequence[_FunctionDeclarationType]] = None,
# Explicitly document safety settings handling
safety_settings: Optional[_SafetySettingsDict] = None,
tool_config: Optional[Union[Dict, _ToolConfigDict]] = None,
generation_config: Optional[Dict[str, Any]] = None,
):
"""
Streams text completion using GPT API based on previous message history.
Args:
- messages (List[BaseMessage]): Messages preceding current request.
- max_tokens_per_message (int): Maximum number of tokens per message in output generated by LLM.
- temperature (float, default=0.7): Controls creativity vs. precision.
- top_p (float, default=0.95): Nucleus sampling top-p probability mass.
- presence_penalty (float, default=0.0): Punishes likelihood of repition.
- frequency_penalty (float, default=0.0): Penalizes use of previously emitted words.
- logit_bias (Optional[dict], default={}): Dictionary mapping token ids to bias values.
- stop (Optional[List[str]], default=None): List of strings at which to end text generation.
- run_manager (Optional[CallbackManagerForLLMRun], default=None): Manager for callbacks during LLM generation run.
- tools (Optional[Sequence[Union[_ToolDict, GoogleTool]]], default=None): Tools available to assistant during query.
- functions (Optional[Sequence[_FunctionDeclarationType]]), default=None): Available custom functions.
- safety_settings (_SafetySettingsDict, default=None): Settings to control content safety scoring.
- tool_config (Optional[Union[Dict, _ToolConfigDict]], default=None): Configuration for tools, potentially enhancing their functionality.
- generation_config (Optional[Dict[str, Any]], default=None): Additional configuration options for text generation.
Returns:
- Generator[ChatGenerationChunk]: A generator over chat-generation chunks.
"""
response_generator =_chat_with_retry(ChatGoogleGenerativeAI(), max_tokens_per_message=max_tokens_per_message,)
.generate_stream(messages, temperature, top_p, presence_penalty, frequency_penalty, logit_bias, stop, run_manager=run_manager).
for chunk in response_generator:
yield _response_to_result(chunk, generation_config)
Key Changes:
- Imports: Sorted and grouped appropriately.
- Comments: Added detailed docstring explaining the function and its parameters.
- Parameter Order: Ensured clarity between mandatory and optional parameters.
- Handling Safety Settings: Added documentation to emphasize proper handling of safety settings.
from setting.models_provider.base_model_provider import MaxKBBaseModel | ||
from sklearn.metrics.pairwise import cosine_similarity | ||
from pydantic.v1 import BaseModel, Field | ||
from pydantic import BaseModel, Field | ||
|
||
|
||
class OllamaReranker(MaxKBBaseModel, OllamaEmbeddings, BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provided code looks clean and follows PEP8 style conventions. However, there are a few minor improvements you can make:
-
Remove Duplicate Imports: The imports of
OllamaEmbeddings
from bothlangchain_community.embeddings
andsetting.models_provider.base_model_provider
can be merged into one. -
Use Aliases for Pydantic Fields: While
Field
is available in the new version of Pydantic (pydantic
) without an alias, adding aliases might help with readability if future updates to Pydantic change their behavior slightly.
Here's the revised code:
from typing import Sequence, Optional, Any, Dict
import pandas as pd
from langchain_core.callbacks import Callbacks
from langchain_core.documents import Document
from sklearn.metrics.pairwise import cosine_similarity
from pydantic.v1 import BaseModel, Field
class OllamaReranker(OllamaEmbeddings, MaxKBBaseModel, BaseModel):
def __init__(self, callbacks: Callbacks = None, vectors_df: pd.DataFrame = None):
super().__init__()
self.callbacks = callbacks
self.vectors_df = vectors_df
Additional Suggestions:
-
Add Default Values for Parameters: If some parameters need default values, consider assigning them in the constructor using
_config
.class OllamaReranker(OllamaEmbeddings, MaxKBBaseModel, BaseModel): # Existing code remains the same... def __init__(self, callbacks: Callbacks = None, vectors_df: pd.DataFrame = None, vector_column_name: str = "vector", search_threshold: float = 0.75): super().__init__() self.callbacks = callbacks self.vectors_df = vectors_df self.vector_column_name = vector_column_name self.search_threshold = search_threshold
These changes will improve the readability and maintainability of the code while maintaining compatibility with the latest versions of libraries like Pydantic and LangChain Community.
@@ -69,7 +69,7 @@ class LocalBaseReranker(MaxKBBaseModel, BaseDocumentCompressor): | |||
tokenizer: Any = None | |||
model: Optional[str] = None | |||
cache_dir: Optional[str] = None | |||
model_kwargs = {} | |||
model_kwargs: ClassVar = {} | |||
|
|||
def __init__(self, model_name, cache_dir=None, **model_kwargs): | |||
super().__init__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provided code snippet has several areas that need attention:
-
Docstring Formatting: The date in the docstring is
@date:2024/9/2 16:42
, which seems to be incorrect format. It should use dashes (:
) instead of slashes/
. Additionally, Python prefers uppercase letters for constants likeClassVar
.Suggestion: Update the date string to
@date: 2024-09-02 16:42
and renamemodel_kwargs
toMODEL_KWARGS
as it looks more Pythonic. -
Variable Renaming: It's generally considered best practice to use CamelCase naming convention for class variables. Thus, consider renaming
model_kwargs
tomodelKwargs
. -
Docstring Structure: While not strictly necessary, having proper formatting for docstrings can make code easier to read. For example, ensuring each parameter description is followed immediately after with a colon (
:
). -
Python Version Considerations: If you're working with Python 3.x, ensure compatibility checks are made where necessary. However, these changes do not affect general functionality, just coding style.
Here’s an improved version based on these suggestions:
# Modified part of the code
class LocalBaseReranker(MaxKBBaseModel, BaseDocumentCompressor):
tokenizer: Any = None
model: Optional[str] = None
cache_dir: Optional[str] = None
MODEL_KWARGS: ClassVar[Dict[str, Any]] = {}
def __init__(self, model_name: str, cache_dir: Optional[str] = None, model_kwargs: dict = {}):
super().__init__()
These modifications improve readability and maintainability while adhering to Python conventions.
ci: update langchain version