Add langchain component for Vertex Standalone Ranking API #167

Merged: 13 commits merged on Apr 25, 2024

Abhishekbhagwat
Contributor

The Vertex DIY RAG APIs help build complex RAG systems with more granular control, and are suited to custom use cases.
The ranking API takes a query and a list of documents and reranks the documents by how relevant each one is to the query. Compared to embeddings, which look purely at the semantic similarity of a document and a query, the ranking API can give a more precise score for how well a document answers a given query.
Reference: https://cloud.google.com/generative-ai-app-builder/docs/ranking

We implement this similarly to other ranking integrations that LangChain already has, such as the Cohere reranker.
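
For readers unfamiliar with the rerank step, here is a minimal offline sketch of the pattern described above: score each document against the query, sort by score, keep the top N. It is not the code in this PR. `Doc`, `toy_relevance`, and `rerank` are hypothetical names, and the term-overlap scorer is only a placeholder for the call to the ranking service.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Doc:
    """Minimal stand-in for a document object (hypothetical, not a LangChain type)."""

    content: str
    score: Optional[float] = None


def toy_relevance(query: str, text: str) -> float:
    """Placeholder scorer: fraction of query terms that appear in the text.

    A real integration would send the query and documents to the ranking
    service and use the scores it returns instead.
    """
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms)


def rerank(
    query: str,
    docs: Sequence[Doc],
    score_fn: Callable[[str, str], float] = toy_relevance,
    top_n: Optional[int] = None,
) -> List[Doc]:
    """Return new Doc objects sorted by descending relevance to the query."""
    scored = [Doc(d.content, score_fn(query, d.content)) for d in docs]
    scored.sort(key=lambda d: d.score, reverse=True)
    return scored if top_n is None else scored[:top_n]


docs = [
    Doc("Bananas are rich in potassium"),
    Doc("The sky is blue because of Rayleigh scattering"),
    Doc("The sky at night"),
]
top = rerank("why is the sky blue", docs, top_n=2)
for d in top:
    print(f"{d.score:.2f}  {d.content}")
```

The input documents are left unmodified and scored copies are returned, which keeps the step easy to slot in after any retriever.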

lkuligin marked this pull request as ready for review on April 25, 2024, 03:12
@@ -17,6 +17,8 @@ langchain-community = ">=0.0.28"
google-api-core = "^2.17.1"
google-api-python-client = "^2.122.0"
grpcio = "^1.62.0"
google-cloud-discoveryengine = "^0.11.11"
Collaborator

Could you add it as a Poetry group, please?

from google.api_core import exceptions as core_exceptions
from google.auth.credentials import Credentials
from google.cloud import discoveryengine_v1alpha
from langchain.retrievers.document_compressors.base import BaseDocumentCompressor
Collaborator

please, import it from langchain_core

@@ -17,6 +17,8 @@ langchain-community = ">=0.0.28"
google-api-core = "^2.17.1"
google-api-python-client = "^2.122.0"
grpcio = "^1.62.0"
google-cloud-discoveryengine = "^0.11.11"
langchain = "^0.1.16"
Collaborator

we shouldn't depend on langchain

@@ -103,4 +104,4 @@ markers = [
"asyncio: mark tests as requiring asyncio",
"compile: mark placeholder test used to compile integration tests without running them",
]
asyncio_mode = "auto"
asyncio_mode = "auto"
Collaborator

nit: missing newline at the end of the file

from google.cloud import discoveryengine_v1alpha


class VertexRankSDKManager:
Collaborator

why do we need this as a public class?

else None
)

def get_rank_service_client(self) -> discoveryengine_v1alpha.RankServiceClient:
Collaborator

can we make it a private function instead of a class, please?

title_field: Optional[str] = Field(default=None)
credentials: Optional[Credentials] = Field(default=None)
credentials_path: Optional[str] = Field(default=None)
sdk_manager: VertexRankSDKManager = Field(default=None)
Collaborator

I don't think we need a separate sdk_manager field. Can we just reuse the project_id / location_id attributes from this class instead?

from langchain_google_community.rank._sdk_manager import VertexRankSDKManager


class VertexAIRank(BaseDocumentCompressor):
Collaborator

please, add to the __init__.py on the module's level for an easy import

lkuligin merged commit 4d55460 into langchain-ai:main on Apr 25, 2024
15 checks passed
Abhishekbhagwat deleted the ranking branch on April 26, 2024, 03:20
2 participants