# LangChain Query Analysis: Handle Multiple Retrievers Example
## Author: AISchool ( http://aischool.ai/%ec%98%a8%eb%9d%bc%ec%9d%b8-%ea%b0%95%ec%9d%98-%ec%b9%b4%ed%85%8c%ea%b3%a0%eb%a6%ac/ )
## Reference : https://python.langchain.com/docs/use_cases/query_analysis/how_to/multiple_retrievers/

Query analysis lets you **choose which retriever to use. To do this, you need to add logic that selects the retriever.** Below is a simple example (using mock data).
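Before bringing in an LLM, the selection logic can be sketched with plain Python. This is only an illustration of the idea: `search_harrison`, `search_ankush`, `analyze`, and `route` are hypothetical stand-ins, and the keyword match in `analyze` replaces the LLM-based query analysis used later in this notebook.

```python
from typing import Callable, Dict, List


# Hypothetical stand-ins for real retrievers: each maps a query to a list of texts.
def search_harrison(query: str) -> List[str]:
    return ["Harrison worked at Kensho"]


def search_ankush(query: str) -> List[str]:
    return ["Ankush worked at Facebook"]


# The routing table: one retriever per person.
RETRIEVERS: Dict[str, Callable[[str], List[str]]] = {
    "HARRISON": search_harrison,
    "ANKUSH": search_ankush,
}


def analyze(question: str) -> Dict[str, str]:
    """Toy query analysis: pick the person by a case-insensitive keyword match."""
    lowered = question.lower()
    for name in RETRIEVERS:
        if name.lower() in lowered:
            return {"query": question, "person": name}
    raise ValueError(f"No known person in question: {question!r}")


def route(question: str) -> List[str]:
    """Analyze the question, then call the retriever chosen by the analysis."""
    analysis = analyze(question)
    return RETRIEVERS[analysis["person"]](analysis["query"])
```

Calling `route("where did Harrison work")` dispatches to `search_harrison`. The rest of the notebook follows the same shape, with an LLM producing the analysis and Chroma retrievers doing the search.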

# Install Libraries

In [None]:
!pip install -qU langchain langchain-community langchain-openai langchain-chroma


# Set the OpenAI API Key

In [None]:
OPENAI_KEY = "YOUR_OPENAI_API_KEY"
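Hardcoding a key in a notebook makes it easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads the key from the `OPENAI_API_KEY` environment variable. The helper name `load_openai_key` and its `env` parameter are assumptions made for this illustration, not part of the original notebook.

```python
import os
from typing import Mapping, Optional


def load_openai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the key stored under OPENAI_API_KEY, failing early if it is missing.

    `env` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first.")
    return key
```

With this helper, the cell above could become `OPENAI_KEY = load_openai_key()`, assuming the variable was exported before the notebook was started.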

# Create Index

We will create vectorstores over some dummy data, one collection per person.

In [None]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

# One embedding model is shared by both collections.
embeddings = OpenAIEmbeddings(model="text-embedding-3-small", openai_api_key=OPENAI_KEY)

# Index and retriever for Harrison (k=1 returns only the top match).
texts = ["Harrison worked at Kensho"]
vectorstore = Chroma.from_texts(texts, embeddings, collection_name="harrison")
retriever_harrison = vectorstore.as_retriever(search_kwargs={"k": 1})

# Index and retriever for Ankush, kept in a separate collection.
texts = ["Ankush worked at Facebook"]
vectorstore = Chroma.from_texts(texts, embeddings, collection_name="ankush")
retriever_ankush = vectorstore.as_retriever(search_kwargs={"k": 1})

# Query analysis

We will **use function calling** to structure the output. The model returns a search query together with the person it concerns, and that person determines which retriever is used.

In [None]:
from typing import List, Optional

from langchain_core.pydantic_v1 import BaseModel, Field


class Search(BaseModel):
    """Search for information about a person."""

    query: str = Field(
        ...,
        description="Query to look up",
    )
    person: str = Field(
        ...,
        description="Person to look things up for. Should be `HARRISON` or `ANKUSH`.",
    )
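The `person` description asks the model for `HARRISON` or `ANKUSH`, but the schema itself accepts any string. The following standard-library sketch shows one way such a constraint could be enforced after the model responds. `ValidatedSearch` and `ALLOWED_PEOPLE` are hypothetical names, and a dataclass is used here as a stand-in rather than the Pydantic model from the notebook.

```python
from dataclasses import dataclass

# Assumed set of valid routing keys, matching the names in the field description.
ALLOWED_PEOPLE = ("HARRISON", "ANKUSH")


@dataclass(frozen=True)
class ValidatedSearch:
    """A search request whose `person` must be one of ALLOWED_PEOPLE."""

    query: str
    person: str

    def __post_init__(self) -> None:
        # Normalize case and whitespace so "harrison " is accepted as "HARRISON".
        normalized = self.person.strip().upper()
        if normalized not in ALLOWED_PEOPLE:
            raise ValueError(
                f"person must be one of {ALLOWED_PEOPLE}, got {self.person!r}"
            )
        # The dataclass is frozen, so store the normalized value via object.__setattr__.
        object.__setattr__(self, "person", normalized)
```

Rejecting unknown names at this point keeps the failure close to the model output instead of surfacing later as a missing dictionary key.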

In [None]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

system = """You have the ability to issue search queries to get information to help answer user information."""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system),
        ("human", "{question}"),
    ]
)
llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, openai_api_key=OPENAI_KEY)
structured_llm = llm.with_structured_output(Search)
query_analyzer = {"question": RunnablePassthrough()} | prompt | structured_llm

Invoking the analyzer shows that it can route between retrievers.




In [None]:
query_analyzer.invoke("where did Harrison Work")

Search(query='work', person='HARRISON')

In [None]:
query_analyzer.invoke("where did ankush Work")

Search(query='workplace', person='ANKUSH')

# Retrieval with query analysis


So how do we include this in a chain? We only need some simple logic to **select the retriever** and pass in the search query.

In [None]:
from langchain_core.runnables import chain

In [None]:
retrievers = {
    "HARRISON": retriever_harrison,
    "ANKUSH": retriever_ankush,
}

In [None]:
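@chain
def custom_chain(question):
    # Ask the LLM which person the question is about and what to search for.
    response = query_analyzer.invoke(question)
    # Pick the matching retriever; an unexpected name raises KeyError.
    retriever = retrievers[response.person]
    return retriever.invoke(response.query)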

In [None]:
custom_chain.invoke("where did Harrison Work")

[Document(page_content='Harrison worked at Kensho')]

In [None]:
custom_chain.invoke("where did ankush Work")

[Document(page_content='Ankush worked at Facebook')]
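The lookup `retrievers[response.person]` raises `KeyError` if the model returns a name outside the dictionary. One possible way to handle that, sketched here with plain callables in place of real retrievers, is to fall back to querying every retriever. The function `retrieve`, the `Retriever` alias, and the `demo` table are assumptions for illustration only.

```python
from typing import Callable, Dict, List

# A retriever is modeled as any callable from a query string to a list of texts.
Retriever = Callable[[str], List[str]]


def retrieve(person: str, query: str, retrievers: Dict[str, Retriever]) -> List[str]:
    """Use the named retriever, or fall back to all retrievers if the name is unknown."""
    chosen = retrievers.get(person.strip().upper())
    if chosen is not None:
        return chosen(query)
    # Unknown person: gather results from every retriever in insertion order.
    results: List[str] = []
    for fallback in retrievers.values():
        results.extend(fallback(query))
    return results


# Hypothetical in-memory retrievers mirroring the dummy data in this notebook.
demo: Dict[str, Retriever] = {
    "HARRISON": lambda q: ["Harrison worked at Kensho"],
    "ANKUSH": lambda q: ["Ankush worked at Facebook"],
}
```

Whether to fall back, return an empty list, or raise an error depends on the application. Querying everything trades precision for recall when the routing decision is unreliable.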