# Sip & Search: Crafting Your Personal Wine Expert with LangChain

Indulge in the world of wine as you build your very own wine expert with LangChain. This tutorial walks you through building a self-querying retriever for wine recommendations, right at your fingertips. Here's what awaits you:

1. **Library Setup:** Dive into the installation of essential libraries like `langchain`, `huggingface_hub`, `google-search-results`, and more, laying the groundwork for your wine exploration adventure.

2. **Setting the Scene:** Get a short introduction to self-querying retrieval, adapted from a video on Sam's channel, a great source of insights into LLMs and beyond.

3. **Crafting the Retriever:** Learn the art of creating a self-querying retriever tailored specifically for wine recommendations, leveraging LangChain's powerful capabilities. Explore metadata setup, model initialization, and vectorstore integration for seamless retrieval.

4. **Example Data Delight:** Delve into a delectable assortment of example wine data, complete with tantalizing metadata, providing the perfect backdrop for your wine journey.

5. **Querying Magic:** Unleash the magic of self-querying as you explore various query types, from simple inquiries to composite filters, unlocking a treasure trove of wine recommendations with each question.

6. **Fine-Tuning with Filters:** Master the art of fine-tuning your wine search with filters, refining your results based on criteria like grape variety, country of origin, ratings, and more.

7. **Limiting Results:** Discover how to cap the number of documents fetched (`k`) directly from the wording of a query, ensuring a tailored wine recommendation experience.

8. **Expert Interaction:** Engage in expert-level wine discussions with a seamless Gradio interface, where you can ask questions and receive detailed recommendations with ease.

Craft your personal wine expert and elevate your wine journey to new heights with LangChain's self-querying retriever. Cheers to a world of wine exploration at your fingertips! 🍷

In [1]:
!pip install -q langchain huggingface_hub google-search-results tiktoken chromadb lark langchain-together sentence_transformers gradio

In [4]:
!pip install langchain-community

Collecting langchain-community
  Downloading langchain_community-0.2.4-py3-none-any.whl (2.2 MB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl (28 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading marshmallow-3.21.3-py3-none-any.whl (49 kB)
Collecting typing-inspect<1,>=0.4.0 (from dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading typing_inspect-0.9.0-py3-none-any.whl (8.8 kB)
Collecting mypy-extensions>=0.3.0 (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain-community)
  Downloading mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB)
Installing collected packages: mypy-extensio

In [2]:
from google.colab import userdata
import textwrap
import os

os.environ["TOGETHER_API_KEY"] = userdata.get('TOGETHER_API_KEY')

## Self-querying Retriever


![](https://i.imgur.com/Js698XPl.jpg)

This example is adapted from https://youtu.be/f4LeWlt3T8Y?si=MCTX2AsOVNXGR97q. Follow Sam's channel for excellent content on LLMs and more.


This Jupyter Notebook builds a self-querying retriever for wine recommendations using LangChain. It starts by importing the necessary libraries, attaching metadata to the wine documents, and initializing the OpenHermes-2p5-Mistral-7B model through Together. The notebook then creates a `SelfQueryRetriever` from the LLM and the vectorstore, so that a user's natural-language question is turned into a semantic search plus metadata filters. The retriever handles simple queries, queries with a single filter, and queries with composite filters. It also demonstrates how to limit the number of returned documents (`k`) using the `enable_limit` parameter. Finally, the notebook wires everything into a Gradio interface for user interaction.
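
To make the idea concrete before touching LangChain, here is a minimal sketch of the three pieces a self-querying step is expected to extract from a question: the text to search for, an optional metadata filter, and an optional limit. `StructuredQuerySketch` and `describe` are hypothetical names used only for illustration; they are not LangChain classes, and the filter dictionary is a made-up shape rather than the library's internal representation.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for the structured query the LLM is asked to produce.
@dataclass
class StructuredQuerySketch:
    query: str               # text used for the semantic (vector) search
    filter: Optional[dict]   # metadata condition, or None when no filter applies
    limit: Optional[int]     # maximum number of documents, or None for the default


def describe(sq: StructuredQuerySketch) -> str:
    """Render the parts of a structured query as one readable line."""
    parts = [f"search for {sq.query!r}"]
    if sq.filter is not None:
        parts.append(f"where {sq.filter}")
    if sq.limit is not None:
        parts.append(f"limit {sq.limit}")
    return ", ".join(parts)


# Hand-written decomposition of:
# "I want two wines with fruity notes and a rating above 97"
example = StructuredQuerySketch(
    query="fruity notes",
    filter={"field": "rating", "op": "gt", "value": 97},
    limit=2,
)
print(describe(example))
```

In the notebook itself the LLM produces this decomposition, so its quality depends on the model and on how well the metadata fields are described.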

In [5]:
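# Document and Chroma are imported from their current homes; the older
# langchain.schema / langchain.vectorstores paths are deprecated aliases.
from langchain_core.documents import Document
#from langchain.embeddings.openai import OpenAIEmbeddings
#from langchain_together.embeddings import TogetherEmbeddings
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma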

#embeddings = OpenAIEmbeddings()
#embeddings = TogetherEmbeddings(model="togethercomputer/m2-bert-80M-2k-retrieval")
embeddings = HuggingFaceEmbeddings(model_name="intfloat/multilingual-e5-small")

from operator import itemgetter

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough


## Example data with metadata attached

In [6]:
docs = [
    Document(
        page_content="Complex, layered, rich red with dark fruit flavors",
        metadata={"name": "Opus One", "year": 2018, "rating": 96, "grape": "Cabernet Sauvignon", "color": "red", "country": "USA", "price_eur": 350},
    ),
    Document(
        page_content="Luxurious, sweet wine with flavors of honey, apricot, and peach",
        metadata={"name": "Château d'Yquem", "year": 2015, "rating": 98, "grape": "Sémillon", "color": "white", "country": "France", "price_eur": 650},
    ),
    Document(
        page_content="Full-bodied red with notes of black fruit and spice",
        metadata={"name": "Penfolds Grange", "year": 2017, "rating": 97, "grape": "Shiraz", "color": "red", "country": "Australia", "price_eur": 500},
    ),
    Document(
        page_content="Elegant, balanced red with herbal and berry nuances",
        metadata={"name": "Sassicaia", "year": 2016, "rating": 95, "grape": "Cabernet Franc", "color": "red", "country": "Italy", "price_eur": 225},
    ),
    Document(
        page_content="Highly sought-after Pinot Noir with red fruit and earthy notes",
        metadata={"name": "Domaine de la Romanée-Conti", "year": 2018, "rating": 100, "grape": "Pinot Noir", "color": "red", "country": "France", "price_eur": 20000},
    ),
    Document(
        page_content="Crisp white with tropical fruit and citrus flavors",
        metadata={"name": "Cloudy Bay", "year": 2021, "rating": 92, "grape": "Sauvignon Blanc", "color": "white", "country": "New Zealand", "price_eur": 30},
    ),
    Document(
        page_content="Rich, complex Champagne with notes of brioche and citrus",
        metadata={"name": "Krug Grande Cuvée", "year": 2010, "rating": 93, "grape": "Chardonnay blend", "color": "sparkling", "country": "France", "price_eur": 180},
    ),
    Document(
        page_content="Intense, dark fruit flavors with hints of chocolate",
        metadata={"name": "Caymus Special Selection", "year": 2018, "rating": 96, "grape": "Cabernet Sauvignon", "color": "red", "country": "USA", "price_eur": 160},
    ),
    Document(
        page_content="Exotic, aromatic white with stone fruit and floral notes",
        metadata={"name": "Jermann Vintage Tunina", "year": 2020, "rating": 91, "grape": "Sauvignon Blanc blend", "color": "white", "country": "Italy", "price_eur": 60},
    ),
    Document(
        page_content="Vibrant and fresh with a fine blend of minerality and citrus flavors, showcasing the terroir's signature",
        metadata={"name": "Trimbach Riesling Cuvée Frédéric Emile", "year": 2015, "rating": 95, "grape": "Riesling", "color": "white", "country": "France", "region": "Alsace", "price_eur": 90},
    ),
    Document(
        page_content="Richly textured with lychee, spice, and floral notes, a true expression of Alsace's terroir",
        metadata={"name": "Hugel & Fils Gewurztraminer Classic", "year": 2018, "rating": 92, "grape": "Gewurztraminer", "color": "white", "country": "France", "region": "Alsace", "price_eur": 45},
    ),
    Document(
        page_content="Elegant and complex, offering a seamless blend of chalky minerality and crisp apple notes",
        metadata={"name": "Domaine Zind-Humbrecht Pinot Gris", "year": 2019, "rating": 94, "grape": "Pinot Gris", "color": "white", "country": "France", "region": "Alsace", "price_eur": 50},
    ),
    Document(
        page_content="A luxurious, full-bodied wine with deep berry flavors and a hint of earthiness, representing the pinnacle of Bordeaux excellence",
        metadata={"name": "Château Latour", "year": 2010, "rating": 100, "grape": "Cabernet Sauvignon blend", "color": "red", "country": "France", "region": "Bordeaux", "price_eur": 1200},
    ),
    Document(
        page_content="An iconic, powerful Syrah from the Northern Rhône, offering a complex array of blackberry, smoke, and pepper notes",
        metadata={"name": "Guigal La Mouline", "year": 2015, "rating": 97, "grape": "Syrah", "color": "red", "country": "France", "region": "Rhône", "price_eur": 250},
    ),
    Document(
        page_content="A standout Chardonnay from Burgundy, with a perfect balance of oak and vibrant acidity, featuring apple, pear, and mineral notes",
        metadata={"name": "Louis Latour Corton-Charlemagne", "year": 2018, "rating": 96, "grape": "Chardonnay", "color": "white", "country": "France", "region": "Burgundy", "price_eur": 150},
    )
]

In [7]:
# Embed the wine documents and index them in an in-memory Chroma vector store
vectorstore = Chroma.from_documents(docs, embeddings)
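
As a rough picture of what the vector store does at query time, the sketch below ranks a few descriptions by similarity to a query vector. The three-dimensional vectors are invented for illustration (real embeddings have hundreds of dimensions), and cosine similarity is chosen here only as an example; the distance metric Chroma actually uses depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made-up numbers, not produced by any model).
toy_vectors = {
    "Crisp white with tropical fruit": [0.9, 0.1, 0.0],
    "Full-bodied red with spice": [0.1, 0.9, 0.2],
    "Rich Champagne with brioche": [0.3, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]


def top_k(query, vectors, k):
    """Return the k texts whose vectors are most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda text: cosine_similarity(query, vectors[text]),
        reverse=True,
    )
    return ranked[:k]


print(top_k(query_vector, toy_vectors, k=2))
# ['Crisp white with tropical fruit', 'Rich Champagne with brioche']
```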

## Creating our self-querying retriever

In [8]:
from langchain_together import Together
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain.chains.query_constructor.base import AttributeInfo

metadata_field_info = [
    AttributeInfo(
        name="grape",
        description="The grape used to make the wine",
        type="string or list[string]",
    ),
    AttributeInfo(
        name="name",
        description="The name of the wine",
        type="string or list[string]",
    ),
    AttributeInfo(
        name="color",
        description="The color of the wine",
        type="string or list[string]",
    ),
    AttributeInfo(
        name="year",
        description="The year the wine was released",
        type="integer",
    ),
    AttributeInfo(
        name="country",
        description="The name of the country the wine comes from",
        type="string",
    ),
    AttributeInfo(
        name="rating",
        description="The Robert Parker rating for the wine 0-100",
        type="integer",  # could also be "float"
    ),
    AttributeInfo(
        name="price_eur",
        description="Average price in EUR",
        type="integer",  # could also be "float"
    ),
]
document_content_description = "Brief description of the wine"
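
The LLM can only build filters on fields it has been told about, so it can be worth checking that every metadata key in the data is declared. The sketch below does this with plain dictionaries standing in for two of the documents and for the declared field names; `undeclared_fields` is a hypothetical helper, not part of LangChain. It highlights that `region` appears in some wine metadata but is not listed in `metadata_field_info`.

```python
# Plain-dict stand-ins (assumptions for illustration) for two of the documents
# and for the field names declared in metadata_field_info.
sample_metadata = [
    {"name": "Opus One", "year": 2018, "rating": 96, "grape": "Cabernet Sauvignon",
     "color": "red", "country": "USA", "price_eur": 350},
    {"name": "Guigal La Mouline", "year": 2015, "rating": 97, "grape": "Syrah",
     "color": "red", "country": "France", "region": "Rhône", "price_eur": 250},
]
declared_fields = {"grape", "name", "color", "year", "country", "rating", "price_eur"}


def undeclared_fields(metadata_dicts, declared):
    """Return metadata keys that appear in the data but not in the schema."""
    seen = set()
    for metadata in metadata_dicts:
        seen.update(metadata.keys())
    return sorted(seen - declared)


print(undeclared_fields(sample_metadata, declared_fields))  # ['region']
```

In the notebook, the same check could be run with `[d.metadata for d in docs]` and `{a.name for a in metadata_field_info}`. Adding an `AttributeInfo` for `region` would let queries such as "white wines from Alsace" filter on it.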



In [9]:
llm = Together(
    model="teknium/OpenHermes-2p5-Mistral-7B",
    temperature=0.3,
    max_tokens=256,
    top_k=50,
    # together_api_key="..."
)

retriever = SelfQueryRetriever.from_llm(
    llm,
    vectorstore,
    document_content_description,
    metadata_field_info,
    verbose=True
)

In [10]:
# This example only specifies a relevant query
retriever.get_relevant_documents("What are some red wines")


[Document(page_content='Complex, layered, rich red with dark fruit flavors', metadata={'color': 'red', 'country': 'USA', 'grape': 'Cabernet Sauvignon', 'name': 'Opus One', 'price_eur': 350, 'rating': 96, 'year': 2018}),
 Document(page_content='Highly sought-after Pinot Noir with red fruit and earthy notes', metadata={'color': 'red', 'country': 'France', 'grape': 'Pinot Noir', 'name': 'Domaine de la Romanée-Conti', 'price_eur': 20000, 'rating': 100, 'year': 2018}),
 Document(page_content='Elegant, balanced red with herbal and berry nuances', metadata={'color': 'red', 'country': 'Italy', 'grape': 'Cabernet Franc', 'name': 'Sassicaia', 'price_eur': 225, 'rating': 95, 'year': 2016}),
 Document(page_content='Full-bodied red with notes of black fruit and spice', metadata={'color': 'red', 'country': 'Australia', 'grape': 'Shiraz', 'name': 'Penfolds Grange', 'price_eur': 500, 'rating': 97, 'year': 2017})]

In [11]:
retriever.get_relevant_documents("I want a wine that has fruity notes")

[Document(page_content='Crisp white with tropical fruit and citrus flavors', metadata={'color': 'white', 'country': 'New Zealand', 'grape': 'Sauvignon Blanc', 'name': 'Cloudy Bay', 'price_eur': 30, 'rating': 92, 'year': 2021}),
 Document(page_content='Intense, dark fruit flavors with hints of chocolate', metadata={'color': 'red', 'country': 'USA', 'grape': 'Cabernet Sauvignon', 'name': 'Caymus Special Selection', 'price_eur': 160, 'rating': 96, 'year': 2018}),
 Document(page_content='Complex, layered, rich red with dark fruit flavors', metadata={'color': 'red', 'country': 'USA', 'grape': 'Cabernet Sauvignon', 'name': 'Opus One', 'price_eur': 350, 'rating': 96, 'year': 2018}),
 Document(page_content='Exotic, aromatic white with stone fruit and floral notes', metadata={'color': 'white', 'country': 'Italy', 'grape': 'Sauvignon Blanc blend', 'name': 'Jermann Vintage Tunina', 'price_eur': 60, 'rating': 91, 'year': 2020})]

In [12]:
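# This example specifies a query and a filter.
# Note: the wines returned below are rated exactly 97, so the filter generated
# by the LLM in this run appears to have been inclusive (>= 97) rather than strictly above 97.
retriever.get_relevant_documents("I want a wine that has fruity notes and has a rating above 97")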

[Document(page_content='Full-bodied red with notes of black fruit and spice', metadata={'color': 'red', 'country': 'Australia', 'grape': 'Shiraz', 'name': 'Penfolds Grange', 'price_eur': 500, 'rating': 97, 'year': 2017}),
 Document(page_content='An iconic, powerful Syrah from the Northern Rhône, offering a complex array of blackberry, smoke, and pepper notes', metadata={'color': 'red', 'country': 'France', 'grape': 'Syrah', 'name': 'Guigal La Mouline', 'price_eur': 250, 'rating': 97, 'region': 'Rhône', 'year': 2015})]

In [13]:
retriever.get_relevant_documents(
    "What wines come from Italy?"
)

[Document(page_content='Exotic, aromatic white with stone fruit and floral notes', metadata={'color': 'white', 'country': 'Italy', 'grape': 'Sauvignon Blanc blend', 'name': 'Jermann Vintage Tunina', 'price_eur': 60, 'rating': 91, 'year': 2020}),
 Document(page_content='Elegant, balanced red with herbal and berry nuances', metadata={'color': 'red', 'country': 'Italy', 'grape': 'Cabernet Franc', 'name': 'Sassicaia', 'price_eur': 225, 'rating': 95, 'year': 2016})]

In [14]:
# This example specifies a query and composite filter
retriever.get_relevant_documents(
    "What's a wine after 2015 but before 2020 that's all earthy"
)

[Document(page_content='Highly sought-after Pinot Noir with red fruit and earthy notes', metadata={'color': 'red', 'country': 'France', 'grape': 'Pinot Noir', 'name': 'Domaine de la Romanée-Conti', 'price_eur': 20000, 'rating': 100, 'year': 2018}),
 Document(page_content='Elegant, balanced red with herbal and berry nuances', metadata={'color': 'red', 'country': 'Italy', 'grape': 'Cabernet Franc', 'name': 'Sassicaia', 'price_eur': 225, 'rating': 95, 'year': 2016}),
 Document(page_content='Complex, layered, rich red with dark fruit flavors', metadata={'color': 'red', 'country': 'USA', 'grape': 'Cabernet Sauvignon', 'name': 'Opus One', 'price_eur': 350, 'rating': 96, 'year': 2018}),
 Document(page_content='Elegant and complex, offering a seamless blend of chalky minerality and crisp apple notes', metadata={'color': 'white', 'country': 'France', 'grape': 'Pinot Gris', 'name': 'Domaine Zind-Humbrecht Pinot Gris', 'price_eur': 50, 'rating': 94, 'region': 'Alsace', 'year': 2019})]
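
For the composite query above, the retriever needs a filter that combines two conditions on `year`. The sketch below shows roughly what such a filter looks like in the style of Chroma's `where` syntax, together with a small, simplified evaluator so the logic can be checked by hand. `matches` is a hypothetical helper written for this illustration; it only supports operator dictionaries and is not how Chroma evaluates filters internally.

```python
import operator

# Comparison operators in the style of Chroma's "where" filters.
COMPARATORS = {
    "$eq": operator.eq, "$ne": operator.ne,
    "$gt": operator.gt, "$gte": operator.ge,
    "$lt": operator.lt, "$lte": operator.le,
}


def matches(metadata: dict, where: dict) -> bool:
    """Evaluate a simplified Chroma-style filter against one metadata dict."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        else:
            if key not in metadata:
                return False
            for op_name, expected in condition.items():
                if not COMPARATORS[op_name](metadata[key], expected):
                    return False
    return True


wines = [
    {"name": "Sassicaia", "year": 2016},
    {"name": "Château d'Yquem", "year": 2015},
    {"name": "Cloudy Bay", "year": 2021},
    {"name": "Opus One", "year": 2018},
]
# "after 2015 but before 2020"
year_filter = {"$and": [{"year": {"$gt": 2015}}, {"year": {"$lt": 2020}}]}
selected = [w["name"] for w in wines if matches(w, year_filter)]
print(selected)  # ['Sassicaia', 'Opus One']
```

The semantic part of the question ("earthy") is handled separately by the vector search; the filter only narrows the candidate set by metadata.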

## Filter K

We can also let the self-querying retriever infer `k`, the number of documents to fetch, from the wording of the query.

We do this by passing `enable_limit=True` to `SelfQueryRetriever.from_llm`.

In [15]:
retriever = SelfQueryRetriever.from_llm(
    llm,
    vectorstore,
    document_content_description,
    metadata_field_info,
    enable_limit=True,
    verbose=True,
)

In [16]:
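# This example specifies a filter and a limit (k=2).
# Note: the two wines returned below are rated 96, so the filter generated by the
# LLM in this run did not match the request "above 97"; LLM-built filters are worth checking.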
retriever.get_relevant_documents("what are two that have a rating above 97")

[Document(page_content='Complex, layered, rich red with dark fruit flavors', metadata={'color': 'red', 'country': 'USA', 'grape': 'Cabernet Sauvignon', 'name': 'Opus One', 'price_eur': 350, 'rating': 96, 'year': 2018}),
 Document(page_content='Intense, dark fruit flavors with hints of chocolate', metadata={'color': 'red', 'country': 'USA', 'grape': 'Cabernet Sauvignon', 'name': 'Caymus Special Selection', 'price_eur': 160, 'rating': 96, 'year': 2018})]

In [17]:
retriever.get_relevant_documents("what are two wines that come from australia or New zealand")

[Document(page_content='Full-bodied red with notes of black fruit and spice', metadata={'color': 'red', 'country': 'Australia', 'grape': 'Shiraz', 'name': 'Penfolds Grange', 'price_eur': 500, 'rating': 97, 'year': 2017}),
 Document(page_content='Crisp white with tropical fruit and citrus flavors', metadata={'color': 'white', 'country': 'New Zealand', 'grape': 'Sauvignon Blanc', 'name': 'Cloudy Bay', 'price_eur': 30, 'rating': 92, 'year': 2021})]

In [18]:
# Prompt template: ask the LLM to answer as a wine expert, using only the retrieved context.
template = """You are a wine expert. Answer the question in detail based only on the following context.
Come up with some over-the-top wine expertise:
{context}

Question: {question}
Do not start with "Answer:"
"""
prompt = ChatPromptTemplate.from_template(template)

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

input_query = "what are two wines that come from australia or New zealand?"
output = chain.invoke(input_query)

In [19]:
# word wrap the output
print(textwrap.fill(output))

 Answer: One wine that comes from Australia is the full-bodied red
Penfolds Grange, which boasts rich notes of black fruit and spice.
This exquisite wine is a product of the renowned Shiraz grape, and the
2017 vintage is highly rated at 97 points. With a price tag of 500
EUR, it is a true testament to the exceptional quality of Australian
winemaking.  Another wine that hails from New Zealand is the crisp
white Cloudy Bay, characterized by its delightful tropical fruit and
citrus flavors. This wine is crafted from the Sauvignon Blanc grape
and is a shining example of the exceptional wines produced in New
Zealand. The 2021 vintage has earned a respectable rating of 92 points
and is available for a more affordable 30 EUR, making it an excellent
choice for wine enthusiasts seeking a refreshing and flavorful white
wine.
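
In the chain above, the retrieved documents are inserted into the prompt in their default string form. One possible refinement, shown here as a sketch, is to render each document as a compact line before it reaches the prompt. `Doc` is a minimal stand-in for LangChain's `Document` so the example runs on its own, and `format_docs` is a hypothetical helper name.

```python
from dataclasses import dataclass, field


# Minimal stand-in for LangChain's Document (an assumption, so the sketch is self-contained).
@dataclass
class Doc:
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs) -> str:
    """Render each document as 'name (year, country, rating, price): description'."""
    lines = []
    for doc in docs:
        m = doc.metadata
        lines.append(
            f"{m.get('name', 'Unknown')} ({m.get('year', 'n/a')}, {m.get('country', 'n/a')}, "
            f"rating {m.get('rating', 'n/a')}, {m.get('price_eur', 'n/a')} EUR): {doc.page_content}"
        )
    return "\n".join(lines)


sample = [
    Doc(
        "Crisp white with tropical fruit and citrus flavors",
        {"name": "Cloudy Bay", "year": 2021, "country": "New Zealand", "rating": 92, "price_eur": 30},
    )
]
print(format_docs(sample))
# Cloudy Bay (2021, New Zealand, rating 92, 30 EUR): Crisp white with tropical fruit and citrus flavors
```

In the notebook, such a function could be placed between the retriever and the prompt, for example `{"context": retriever | format_docs, "question": RunnablePassthrough()}`. Whether this improves the answers depends on the model, so treat it as an experiment.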


In [20]:
import gradio as gr

# Assumes the retriever, prompt, llm, and chain from the cells above are already defined

def recommend_wines(question):
    # Run the full chain: retrieve matching wines, then let the LLM answer
    return chain.invoke(question)

interface = gr.Interface(
    fn=recommend_wines,
    inputs="text",
    outputs="text",
    title="Wine Recommendation Expert",
    description="Ask me for wine recommendations or information about wines!"
)

interface.launch()


Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://8a836cae36e5c54e26.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
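
Because the interface is reachable through a public link, it may help to guard the handler against empty input and upstream failures. The sketch below wraps any chain-like callable; `make_safe_handler` and the two stub functions are hypothetical and stand in for `chain.invoke`, so nothing here calls Gradio or the LLM.

```python
def make_safe_handler(invoke):
    """Wrap a chain-like callable so empty input and errors yield friendly text."""
    def handler(question: str) -> str:
        if not question or not question.strip():
            return "Please enter a question about wine."
        try:
            return str(invoke(question.strip())).strip()
        except Exception as exc:  # broad on purpose: this is the UI boundary
            return f"Sorry, something went wrong: {exc}"
    return handler


# Stub callables standing in for chain.invoke (assumptions for illustration).
def fake_invoke(question):
    return f" Answer for: {question}"


def failing_invoke(question):
    raise RuntimeError("model unavailable")


safe = make_safe_handler(fake_invoke)
print(safe("  "))
print(safe("Any Italian reds?"))
print(make_safe_handler(failing_invoke)("Any Italian reds?"))
```

In the notebook, `fn=make_safe_handler(chain.invoke)` could replace `fn=recommend_wines` when building the `gr.Interface`.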


