# Agentic GFLs with Cohere Web Agents and Weaviate

This notebook illustrates how to build Agentic Generative Feedback Loops (GFLs) with Cohere Web Agents and the Weaviate database. **Cohere Web Agents** can access the internet, which lets us ask them questions about specific wines, such as the Espumante Moscatel Sparkling Rose from Brazil.

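In this notebook we will load the original xWines dataset, connect to Cohere Web Agents, and then run a GFL, that is, a loop that sends data to a generative model and writes the result back to the database. Here the loop adds a web description of each wine to the xWines dataset!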

# Load Original `xWines` Dataset

In [2]:
import pandas as pd

xwines_df = pd.read_csv("XWines_Slim_1K_wines.csv")
print(f"\033[92m{len(xwines_df)}\033[0m wines are in the dataset.\n")
xwines_df.head(3)

1007 wines are in the dataset.



Unnamed: 0,WineID,WineName,Type,Elaborate,Grapes,Harmonize,ABV,Body,Acidity,Code,Country,RegionID,RegionName,WineryID,WineryName,Website,Vintages
0,100001,Espumante Moscatel,Sparkling,Varietal/100%,['Muscat/Moscato'],"['Pork', 'Rich Fish', 'Shellfish']",7.5,Medium-bodied,High,BR,Brazil,1001,Serra Gaúcha,10001,Casa Perini,http://www.vinicolaperini.com.br,"[2020, 2019, 2018, 2017, 2016, 2015, 2014, 201..."
1,100002,Ancellotta,Red,Varietal/100%,['Ancellotta'],"['Beef', 'Barbecue', 'Codfish', 'Pasta', 'Pizz...",12.0,Medium-bodied,Medium,BR,Brazil,1001,Serra Gaúcha,10001,Casa Perini,http://www.vinicolaperini.com.br,"[2016, 2015, 2014, 2013, 2012, 2011, 2010, 200..."
2,100003,Cabernet Sauvignon,Red,Varietal/100%,['Cabernet Sauvignon'],"['Beef', 'Lamb', 'Poultry']",12.0,Full-bodied,High,BR,Brazil,1001,Serra Gaúcha,10002,Castellamare,https://www.emporiocastellamare.com.br,"[2021, 2020, 2019, 2018, 2017, 2016, 2015, 201..."


In [3]:
xwines = xwines_df.to_dict(orient='records')

# Cohere Web Agent

In [4]:
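import os
import cohere

# Assumption: the API key is stored in the COHERE_API_KEY environment variable.
cohere_api_key = os.environ["COHERE_API_KEY"]
co = cohere.Client(api_key=cohere_api_key)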

def query_web_agent(wine_name: str) -> str:
    response = co.chat(
        message=f"An overview of {wine_name} wine",
        connectors=[{"id": "web-search"}],
    )
    return response.text

query_web_agent(wine_name="Espumante Moscatel")


'Espumante Moscatel is a sparkling wine made from the Moscato family of grapes, which is known for its sweeter taste. It was first created in Italy in 1850 by Carlo Gancia, the creator of the first Italian sparkling wine. The Moscatel variety was first produced in Brazil in 1978 by Martini and Rossi, who had recently set up a factory in Rio Grande do Sul.\n\nThe wine is known for its light, refreshing, and sweet taste, and is popular in Brazil, where it is mostly produced in the Serra Gaúcha region. It is also produced in other regions of Brazil, such as the Serra Catarinense and the Vale do Rio São Francisco.\n\nThe wine has a low alcohol content, typically between 7% and 11.5%. It is often served chilled, between 4°C and 6°C. It is commonly paired with desserts such as mousses, rabanadas, bolos, sorvetes, and biscoitos doces, but can also be served with savoury dishes such as pork, salmon, and shellfish.'
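The prompt above passes only `WineName`, so a generic name can produce a generic answer. As a hedged sketch that is not part of the original notebook, the prompt could also draw on other columns of the dataset. `build_wine_prompt` is a hypothetical helper, and the choice of `Type`, `WineryName`, and `Country`, as well as the wording, is an assumption.

```python
from typing import Any, Mapping


def build_wine_prompt(row: Mapping[str, Any]) -> str:
    """Build a more specific prompt from one xWines record (hypothetical helper)."""
    # Only the column names come from the dataset; the wording is an assumption.
    return (
        f"An overview of the {row['Type']} wine {row['WineName']} "
        f"from the winery {row['WineryName']} in {row['Country']}"
    )


example_row = {
    "WineName": "Espumante Moscatel",
    "Type": "Sparkling",
    "WineryName": "Casa Perini",
    "Country": "Brazil",
}
prompt = build_wine_prompt(example_row)
print(prompt)
```

The resulting string could be passed as `message` to `co.chat` in place of the name-only prompt. Whether it yields better descriptions has not been tested here.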

# Connect to Weaviate

In [5]:
import weaviate

weaviate_client = weaviate.connect_to_local()

# Create xWines Schema

In [7]:
import weaviate.classes.config as wvcc

xWines_collection = weaviate_client.collections.create(
    name="xWines",
    vectorizer_config=wvcc.Configure.Vectorizer.text2vec_cohere(
        model="embed-english-v3.0"
    ),
    properties=[
        wvcc.Property(name="WineID", data_type=wvcc.DataType.INT),
        wvcc.Property(name="WineName", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Type", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Elaborate", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Grapes", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="ABV", data_type=wvcc.DataType.NUMBER),
        wvcc.Property(name="Body", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Acidity", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Code", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Country", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="RegionID", data_type=wvcc.DataType.INT),
        wvcc.Property(name="RegionName", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="WineryID", data_type=wvcc.DataType.INT),
        wvcc.Property(name="WineryName", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Website", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="Vintages", data_type=wvcc.DataType.TEXT),
        wvcc.Property(name="WebDescription", data_type=wvcc.DataType.TEXT)
    ]
)

# Run Agentic GFL

In [8]:
from weaviate.util import get_valid_uuid
from uuid import uuid4
import time

start = time.time()
failed_wines = []
for idx, row in enumerate(xwines):
    try:
        temp_properties = row
        web_description = query_web_agent(wine_name=row["WineName"])
        temp_properties["WebDescription"] = web_description
        uuid = get_valid_uuid(uuid4())
        xWines_collection.data.insert(
            properties=temp_properties,
            uuid=uuid
        )
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        failed_wines.append(row)
        print(f"\033[91mError with Wine {idx}!\033[0m\n")
    if idx % 100 == 1:
        print(f"Processed \033[92m{idx}\033[0m wines.\n")
        print(f"The GFL has been running for \033[92m{time.time() - start}\033[0m seconds.\n")
        print(f"The most recent WebDescription was:\n\n\t\033[92m{web_description}\033[0m\n")
    time.sleep(2) # wait 2 seconds to avoid rate limits

Processed 1 wines.

The GFL has been running for 43.243147134780884 seconds.

The most recent WebDescription was:

	Ancellotta is a dark-coloured wine grape variety that originated in Italy's Emilia-Romagna region. It is used as a minority blending component in sparkling red Lambrusco wines, but varietal examples can be found in Brazil, Argentina, and Switzerland. Ancellotta is used to add colour to otherwise pale red wines.

The Ancellotta grape is a vigorous red grape known for small, very dark berries. The high level of anthocyanins in the skins means that this grape is often used, in small amounts, to add a punch of colour to otherwise lightly hued red wines. The grape is sometimes used in concentrated musts used for coloration. Ancellotta grapes are known for ripe red fruit flavours and aromas, such as plum, blackberry, and blueberry; as well as a spiciness characterised as "sweet spice" or "baking spices".

Error with Wine 55!

Error with

In [14]:
response = xWines_collection.aggregate.over_all(total_count=True)

print(f"{response.total_count} objects in the Weaviate \033[92mxWines\033[0m collection.")

896 objects in the Weaviate xWines collection.
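The loop above waits a fixed 2 seconds between calls to avoid rate limits. If some failures turn out to be transient, a retry with growing waits is one option. The following is a minimal sketch under that assumption, not the notebook's method: `call_with_backoff` is a hypothetical helper, and `flaky_agent` is a stand-in for the real agent call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 2s, then 4s, then 8s, ...
    raise RuntimeError("max_attempts must be at least 1")


# Stand-in for a flaky agent call: fails twice, then succeeds.
calls = {"count": 0}


def flaky_agent() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated rate limit")
    return "description"


waits: list[float] = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky_agent, sleep=waits.append)
```

In the notebook, `fn` could be `lambda: query_web_agent(wine_name=row["WineName"])`. Retries only help when the underlying error is temporary.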


In [11]:
import json
from typing import Any

# TODO: replace Any with the Weaviate collection type

def _save_to_collection(collection: Any, save_path: str) -> None:
    """Write every object in `collection` (uuid plus properties) to a JSON file."""
    data = []
    for item in collection.iterator():
        # Flatten each object into one dictionary: its uuid and all of its properties
        values_dictionary = {"uuid": str(item.uuid), **item.properties}
        data.append(values_dictionary)

    with open(save_path, "w") as json_file:
        json.dump(data, json_file, indent=4)

_save_to_collection(xWines_collection, "xWines_with_web_descriptions.json")

# Investigating Failed GFLs

In [10]:
failed_wines[0]

{'WineID': 101567,
 'WineName': 'Tinto',
 'Type': 'Red',
 'Elaborate': 'Assemblage/Blend',
 'Grapes': "['Castelão', 'Tinta Miúda', 'Camarate', 'Touriga Nacional', 'Alfrocheiro Preto']",
 'Harmonize': "['Beef', 'Pasta', 'Veal', 'Poultry']",
 'ABV': 13.0,
 'Body': 'Full-bodied',
 'Acidity': 'High',
 'Code': 'PT',
 'Country': 'Portugal',
 'RegionID': 1035,
 'RegionName': 'Lisboa',
 'WineryID': 11644,
 'WineryName': 'Quinta de Bons-Ventos',
 'Website': nan,
 'Vintages': '[2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2004, 2003, 2002, 2001, 2000, 1999, 1992, 1991, 1990, 1988, 1986, 1981, 1980, 1961]',
 'WebDescription': 'Tinto wine, or "vino tinto", is a term used in Spain and Portugal to refer to any kind of red wine. The term originates from the Latin word "tinctus", which means dyed, stained, or tinted. This refers to the process of making red wine, where the skins of red grapes tint the white must until it turns red. \n\nTinto win

In [12]:
len(failed_wines)

111

In [15]:
query_web_agent(wine_name=failed_wines[0]["WineName"])

'Tinto is a term used to refer to red wine in Spain and Portugal. The term originates from the Latin word "tinctus", which means dyed, stained, or tinted. Tinto is used to describe the process of making red wine, where the skins of red grapes tint the white must until it turns red.\n\nSome popular red wines from Spain include:\n- Acustic Celler 2020 Acustic Tinto Montsant\n- Adega Algueira 2019 "Carravel" Mencia, Ribeira Sacra\n- Altavins 2020 Jorn Nou Garnacha Negra, Terre Alta\n- Alto Moncayo 2019 Garnacha, Campo de Borja\n- Alvaro Palacios 2019 Les Terrasses Priorat\n\nTinto is also the name of a wine brand, with a highly aromatic blend of red berries, dark chocolate, and spices.'
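The agent call succeeds for this wine, so the failure presumably happened elsewhere in the loop. The original loop catches every exception without recording it, so the cause is lost. A minimal sketch of a variant that keeps the exception text next to each failed row is shown below. `run_gfl_with_error_log`, `fake_generate`, and `fake_store` are hypothetical names: the last two stand in for `query_web_agent` and `xWines_collection.data.insert`.

```python
from typing import Any, Callable


def run_gfl_with_error_log(
    records: list[dict[str, Any]],
    generate: Callable[[str], str],
    store: Callable[[dict[str, Any]], None],
) -> list[dict[str, Any]]:
    """Run generate-then-store over records; return one entry per failure."""
    failures = []
    for idx, row in enumerate(records):
        try:
            properties = dict(row)  # copy so the source row is not mutated
            properties["WebDescription"] = generate(row["WineName"])
            store(properties)
        except Exception as error:
            # Keep the error text so the cause can be inspected later
            failures.append({"index": idx, "row": row, "error": repr(error)})
    return failures


# Stand-ins for the agent call and the Weaviate insert.
def fake_generate(wine_name: str) -> str:
    return f"About {wine_name}"


stored: list[dict[str, Any]] = []


def fake_store(properties: dict[str, Any]) -> None:
    if properties["WineName"] == "Tinto":
        raise ValueError("simulated insert error")
    stored.append(properties)


failures = run_gfl_with_error_log(
    [{"WineName": "Ancellotta"}, {"WineName": "Tinto"}], fake_generate, fake_store
)
print(failures[0]["error"])
```

Inspecting `failures[0]["error"]` would then show whether the agent call or the insert raised, and with what message.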

In [18]:
# Hmm, not clear what's wrong.
# Let's try simply retrying these failed wines

from weaviate.util import get_valid_uuid
from uuid import uuid4
import time

start = time.time()
second_round_failed_wines = []
for idx, row in enumerate(failed_wines):
    if idx > 2:
        break
    try:
        temp_properties = row
        web_description = query_web_agent(wine_name=row["WineName"])
        #print(f"Latest generated web_description:\n\t{web_description}\n")
        temp_properties["WebDescription"] = web_description
        uuid = get_valid_uuid(uuid4())
        xWines_collection.data.insert(
            properties=temp_properties,
            uuid=uuid
        )
    except Exception:  # a bare except would also swallow KeyboardInterrupt
        second_round_failed_wines.append(row)
        print(f"\033[91mError with Wine {idx}!\033[0m\n")
    if idx % 100 == 1:
        print(f"Processed \033[92m{idx}\033[0m wines.\n")
        print(f"The GFL has been running for \033[92m{time.time() - start}\033[0m seconds.\n")
        print(f"The most recent WebDescription was:\n\n\t\033[92m{web_description}\033[0m\n")
    time.sleep(2) # wait 2 seconds to avoid rate limits

Error with Wine 0!

Error with Wine 1!

Processed 1 wines.

The GFL has been running for 44.407567262649536 seconds.

The most recent WebDescription was:

	Red Blend wines refer to any red wine that is created by blending two or more grape varieties together. Red blends are made all over the world and are known for their complexity, depth of flavour, and the balance of tannins, acidity, and fruit flavours. The specific combination of grapes and winemaking techniques used can vary greatly depending on the region and the winemaker's style.

Here is a breakdown of some popular red blend regions:
- Bordeaux, France: The Bordeaux region is considered the birthplace of the red blend, with the most famous being the "Claret" or "Bordeaux Blend." The blend typically consists of Cabernet Sauvignon, Merlot, Cabernet Franc, and Petit Verdot. Bordeaux wines are known for their rich and full-bodied flavour, deep ruby colour, and long, elegant finish.
- Califo
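One detail worth checking, although the notebook does not confirm it as the cause: `failed_wines[0]` has `'Website': nan`, and a float NaN in a text property may be rejected on insert. As a hedged sketch, the missing values could be dropped from each row before inserting. `drop_nan_properties` is a hypothetical helper, and its effect on the Weaviate insert has not been verified here.

```python
import math
from typing import Any


def drop_nan_properties(properties: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `properties` without keys whose value is a float NaN."""
    return {
        key: value
        for key, value in properties.items()
        if not (isinstance(value, float) and math.isnan(value))
    }


# pandas represents a missing CSV cell as float NaN, as in failed_wines[0].
example_row = {"WineName": "Tinto", "ABV": 13.0, "Website": float("nan")}
cleaned_row = drop_nan_properties(example_row)
print(cleaned_row)  # {'WineName': 'Tinto', 'ABV': 13.0}
```

Calling this on `temp_properties` just before `xWines_collection.data.insert` would test the hypothesis on the failed wines.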

# Another Agentic GFL Idea for Future Work

In [23]:
import cohere
co = cohere.Client(api_key=cohere_api_key)

def query_web_agent(wine_name: str) -> str:
    response = co.chat(
        message=f"What is the current market price of the wine: {wine_name}?",
        connectors=[{"id": "web-search"}],
    )
    return response.text

query_web_agent(wine_name="Barolo")

'The price of Barolo wine varies depending on the quality of the wine, the prominence of the winery, and whether the wine belongs to an important cru (MGA). The price of a bottle of Barolo can range from $25 to $400 or even more. \n\nHere is a breakdown of the three tiers of Barolo wine prices:\n\n- High-volume "supermarket Barolo" produced by farmer cooperatives and merchants who buy grapes from growers: £15-30/ $25-50\n- Solid quality, small-volume Barolo produced by family wineries: £30-75/ $50-120\n- High-end "connoisseur Barolo" or Barolo Riserva: £50-220/ $120-400'

# Thanks for reading!

## - The modified XWines dataset can be found [here](https://huggingface.co/datasets/weaviate/xWines_GFL) on HuggingFace!

## - Learn more about [Generative Feedback Loops](https://github.com/weaviate/recipes/tree/main/weaviate-features/generative-feedback-loops) with Weaviate!

## - Why xWines? Learn more about the Weaviate Recommender Service [here](https://weaviate.io/workbench/recommender)!