Add langchain component for Vertex Standalone Ranking API #167

Abhishekbhagwat · 2024-04-19T07:55:35Z

Vertex DIY RAG APIs help build complex RAG systems with more granular control, and are suited to custom use cases.
The ranking API takes a list of documents and reranks them by how relevant each one is to a given query. Compared to embeddings, which look purely at the semantic similarity of a document and a query, the ranking API can give you a more precise score for how well a document answers a given query.
Reference : https://cloud.google.com/generative-ai-app-builder/docs/ranking

We implement this similarly to other ranking APIs that langchain already integrates with, such as the Cohere ranker.
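
For readers unfamiliar with the reranking pattern, here is a minimal, self-contained sketch of the shape of the operation described above: a query and a list of documents go in, and the top-n documents come back ordered by a relevance score. The names `Doc`, `toy_score`, and `rerank` are hypothetical and exist only for illustration. The word-overlap score is a stand-in, not what the Vertex ranking API does; the real service scores each document with a model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a LangChain Document."""

    page_content: str
    metadata: Dict[str, object] = field(default_factory=dict)


def toy_score(query: str, text: str) -> float:
    """Toy relevance: fraction of query words that appear in the text.

    Stand-in only; the ranking API computes its score with a model.
    """
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank(query: str, docs: List[Doc], top_n: int = 3) -> List[Doc]:
    """Return the top_n docs by descending score, with the score in metadata."""
    scored = [
        Doc(
            d.page_content,
            {**d.metadata, "relevance_score": toy_score(query, d.page_content)},
        )
        for d in docs
    ]
    scored.sort(key=lambda d: d.metadata["relevance_score"], reverse=True)
    return scored[:top_n]
```

The input documents are left unmodified and new `Doc` objects carry the score, which mirrors how a document compressor typically returns a filtered, reordered copy.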

lkuligin · 2024-04-25T03:19:19Z

libs/community/pyproject.toml

@@ -17,6 +17,8 @@ langchain-community = ">=0.0.28"
 google-api-core = "^2.17.1"
 google-api-python-client = "^2.122.0"
 grpcio = "^1.62.0"
+google-cloud-discoveryengine = "^0.11.11"


could you add it as a poetry group, please?

lkuligin · 2024-04-25T03:20:21Z

libs/community/langchain_google_community/ranker/rank.py

+from google.api_core import exceptions as core_exceptions
+from google.auth.credentials import Credentials
+from google.cloud import discoveryengine_v1alpha
+from langchain.retrievers.document_compressors.base import BaseDocumentCompressor


please, import it from langchain_core

lkuligin · 2024-04-25T03:20:33Z

libs/community/pyproject.toml

@@ -17,6 +17,8 @@ langchain-community = ">=0.0.28"
 google-api-core = "^2.17.1"
 google-api-python-client = "^2.122.0"
 grpcio = "^1.62.0"
+google-cloud-discoveryengine = "^0.11.11"
+langchain = "^0.1.16"


we shouldn't depend on langchain
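
One way to drop the `langchain` dependency is for the package to carry its own small retriever that chains a base retriever with the ranker, which is roughly the role `ContextualCompressionRetriever` plays in `langchain`. The sketch below shows that composition with plain Python protocols. `Retriever`, `Compressor`, and `RankingRetriever` are hypothetical names, not the classes added in this PR, and real code would presumably build on the `langchain_core` base classes rather than these protocols.

```python
from typing import List, Protocol


class Retriever(Protocol):
    """Anything that can fetch candidate documents for a query."""

    def retrieve(self, query: str) -> List[str]: ...


class Compressor(Protocol):
    """Anything that can filter or reorder documents for a query."""

    def compress(self, docs: List[str], query: str) -> List[str]: ...


class RankingRetriever:
    """Hypothetical wrapper: retrieve candidates, then rerank them."""

    def __init__(self, base: Retriever, compressor: Compressor) -> None:
        self.base = base
        self.compressor = compressor

    def retrieve(self, query: str) -> List[str]:
        docs = self.base.retrieve(query)
        if not docs:
            # Nothing to rank, so skip the call to the compressor.
            return []
        return self.compressor.compress(docs, query)
```

Documents are plain strings here to keep the example dependency-free.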

lkuligin · 2024-04-25T07:00:58Z

libs/community/pyproject.toml

@@ -103,4 +104,4 @@ markers = [
  "asyncio: mark tests as requiring asyncio",
  "compile: mark placeholder test used to compile integration tests without running them",
 ]
-asyncio_mode = "auto"
+asyncio_mode = "auto"


nits: new line

lkuligin · 2024-04-25T07:13:48Z

libs/community/langchain_google_community/rank/_sdk_manager.py

+from google.cloud import discoveryengine_v1alpha
+
+
+class VertexRankSDKManager:


why do we need this as a public class?

lkuligin · 2024-04-25T07:14:06Z

libs/community/langchain_google_community/rank/_sdk_manager.py

+            else None
+        )
+
+    def get_rank_service_client(self) -> discoveryengine_v1alpha.RankServiceClient:


can we make it a private function instead of a class, please?

lkuligin · 2024-04-25T07:14:46Z

libs/community/langchain_google_community/rank/rank.py

+    title_field: Optional[str] = Field(default=None)
+    credentials: Optional[Credentials] = Field(default=None)
+    credentials_path: Optional[str] = Field(default=None)
+    sdk_manager: VertexRankSDKManager = Field(default=None)


I don't think we need an additional sdk_manager provided. can we just re-use project_id / location_id attributes from this class instead?
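
To illustrate the suggestion, the connection details can live directly on the ranker, which then derives whatever it needs from them. The sketch below is a simplified stand-in using a dataclass, not the pydantic-based class in this PR. `RankSettings` is a hypothetical name, and the field names and the `default_ranking_config` default are assumptions based on the ranking API's documented resource path.

```python
from dataclasses import dataclass


@dataclass
class RankSettings:
    """Hypothetical holder for the attributes the ranker already has."""

    project_id: str
    location_id: str = "global"
    ranking_config: str = "default_ranking_config"  # assumed default name

    def ranking_config_path(self) -> str:
        """Resource name passed with each rank request."""
        return (
            f"projects/{self.project_id}/locations/{self.location_id}"
            f"/rankingConfigs/{self.ranking_config}"
        )
```

With the path derived from `project_id` and `location_id`, no separate manager object has to be passed in.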

lkuligin · 2024-04-25T07:16:32Z

libs/community/langchain_google_community/rank/rank.py

+from langchain_google_community.rank._sdk_manager import VertexRankSDKManager
+
+
+class VertexAIRank(BaseDocumentCompressor):


please, add to the __init__.py on the module's level for an easy import

Abhishekbhagwat and others added 4 commits April 19, 2024 12:05

move ranker to community, update class inheritance and fix integration tests

27746b0

update dependencies

227fa9c

Merge branch 'main' into ranking

fed2110

updated naming convention to standardize as Rank

9b4837a

lkuligin marked this pull request as ready for review April 25, 2024 03:12

lkuligin approved these changes Apr 25, 2024

lkuligin reviewed Apr 25, 2024

Abhishekbhagwat added 3 commits April 25, 2024 12:11

fix lint errors

46257d0

update pyproject.toml to remove dependency on langchain

fd85414

add CustomRankingRetriever to remove dependency on langchain - ContextualCompressionRetriever

931c4fb

lkuligin reviewed Apr 25, 2024

Merge branch 'main' into ranking

28336f0

lkuligin reviewed Apr 25, 2024

Abhishekbhagwat added 2 commits April 25, 2024 17:14

fix: rebase with main, resolve merge confict, fix lint, remove VertexRankSDKManager

a2c386c

add VertexAIRank at module level

94ff950

lkuligin approved these changes Apr 25, 2024

Abhishekbhagwat added 3 commits April 25, 2024 18:01

update poetry dependency group for failing tests

aefce2e

added ignore type hint to fix failing tests

ebd29ae

remove explict mock patch from test

c831125

lkuligin merged commit 4d55460 into langchain-ai:main Apr 25, 2024
15 checks passed

Abhishekbhagwat deleted the ranking branch April 26, 2024 03:20
