##### Query construction is the process of transforming a natural language query into the query language of the database or data source we're interacting with. This matters because a lot of data is structured and stored in relational databases.

### Text-to-Metadata filter

Most vector stores let you restrict a vector search based on metadata. During the embedding process, we can attach metadata key-value pairs to the vectors in an index, and later specify filter expressions when we query that index.
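To make the idea concrete, here is a minimal, dependency-free sketch of a metadata-filtered vector search. It is a toy stand-in and not how PGVector or any other store is implemented: the `INDEX` contents, the 3-dimensional vectors, the document ids, and the `filtered_search` helper are all invented for illustration.

```python
import math

# Toy index: each entry pairs an embedding with its metadata.
# Ids, vectors, and metadata values are made up for this example.
INDEX = [
    {"id": "doc-a", "vector": [0.9, 0.1, 0.0],
     "metadata": {"genre": "science fiction", "rating": 8.7}},
    {"id": "doc-b", "vector": [0.8, 0.2, 0.1],
     "metadata": {"genre": "science fiction", "rating": 3.3}},
    {"id": "doc-c", "vector": [0.1, 0.9, 0.2],
     "metadata": {"genre": "crime", "rating": 9.2}},
]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def filtered_search(query_vector, predicate, k=2):
    """Keep only entries whose metadata passes `predicate`, then rank by similarity."""
    candidates = [entry for entry in INDEX if predicate(entry["metadata"])]
    ranked = sorted(
        candidates,
        key=lambda entry: cosine_similarity(query_vector, entry["vector"]),
        reverse=True,
    )
    return [entry["id"] for entry in ranked[:k]]


query_vector = [1.0, 0.0, 0.0]

# Without a filter, the two nearest vectors win regardless of rating.
unfiltered = filtered_search(query_vector, lambda metadata: True)

# With a filter, low-rated documents are excluded before ranking.
results = filtered_search(query_vector, lambda metadata: metadata["rating"] > 8.5)

print(unfiltered)
print(results)
```

Real vector stores usually push the filter down into the database query instead of filtering in Python, but the effect on the result set is the same.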

In [1]:
from langchain_classic.chains.query_constructor.schema import AttributeInfo
from langchain_classic.retrievers.self_query.base import SelfQueryRetriever
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_postgres.vectorstores import PGVector

In [2]:
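# Connection string for a local Postgres instance with the pgvector extension
connection = "postgresql+psycopg://langchain:langchain@localhost:6024/langchain"
collection_name = "Harry_Potter_Complete"
# Embedding model used to vectorize both the documents and the queries
embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")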

db = PGVector(
    embeddings=embedding_model,
    connection=connection,
    collection_name=collection_name
)

In [3]:
# Describe each metadata field so the LLM knows which attributes it can filter on
fields = [
    AttributeInfo(
        name="genre",
        description="The genre of the movie",
        type="string or list[string]",
    ),
    AttributeInfo(
        name="year",
        description="The year the movie was released",
        type="integer",
    ),
    AttributeInfo(
        name="director",
        description="The name of the movie director",
        type="string",
    ),
    AttributeInfo(
        name="rating",
        description="A 1-10 rating for the movie",
        type="float",
    ),
]
# Short description of what each stored document contains
description = "Brief summary of a movie"

In [4]:
# temperature=0 keeps the generated structured query as repeatable as possible
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

retriever = SelfQueryRetriever.from_llm(
    llm=llm,
    vectorstore=db,
    document_contents=description,
    metadata_field_info=fields
)

In [5]:
retriever.invoke("What's a highly rated (above 8.5) science fiction film?")

[]

`SelfQueryRetriever.from_llm` gives us a retriever that takes a user query and splits it into two parts:

• A filter to apply to the metadata of each document first

• A query to use for semantic search over the documents
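The split can be sketched in plain Python. The decomposition below is written by hand as one plausible result for the question asked above; it is not captured LLM output. The `Comparison` and `StructuredQuery` classes, the `matches` helper, and the sample documents are hypothetical simplifications, not LangChain's own types.

```python
import operator
from dataclasses import dataclass, field


@dataclass
class Comparison:
    """One condition on a metadata attribute, e.g. rating > 8.5."""
    attribute: str
    comparator: str  # one of the keys in COMPARATORS
    value: object


@dataclass
class StructuredQuery:
    """A semantic query string plus metadata conditions (combined with AND)."""
    query: str
    filters: list = field(default_factory=list)


COMPARATORS = {
    "eq": operator.eq,
    "gt": operator.gt,
    "gte": operator.ge,
    "lt": operator.lt,
    "lte": operator.le,
}


def matches(metadata, filters):
    """Return True only if the metadata satisfies every comparison."""
    for condition in filters:
        if condition.attribute not in metadata:
            return False
        compare = COMPARATORS[condition.comparator]
        if not compare(metadata[condition.attribute], condition.value):
            return False
    return True


# Hand-written decomposition of:
# "What's a highly rated (above 8.5) science fiction film?"
structured = StructuredQuery(
    query="science fiction",
    filters=[
        Comparison("rating", "gt", 8.5),
        Comparison("genre", "eq", "science fiction"),
    ],
)

# Invented sample documents with metadata.
documents = [
    {"title": "doc-a", "metadata": {"genre": "science fiction", "rating": 8.7}},
    {"title": "doc-b", "metadata": {"genre": "science fiction", "rating": 3.3}},
    {"title": "doc-c", "metadata": {"genre": "crime", "rating": 9.2}},
]

# Step 1: apply the metadata filter. Step 2 (not shown) would run the
# semantic search for structured.query over the surviving documents only.
survivors = [d["title"] for d in documents if matches(d["metadata"], structured.filters)]
print(survivors)
```

If no stored document carries the attributes named in the filter, the first step removes everything and the retriever returns an empty list.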