# Neo4j GenAI For Recommendations and Content Generation
__An Example Using AWS__

In this example, you will learn how to use Neo4j Knowledge Graphs to make Large Language Models (LLMs) useful for more real-world use cases.

We walk through an example that uses real-world customer and product data from a fashion, style, and beauty retailer. We show how you can use a knowledge graph to ground an LLM, enabling it to build tailored marketing content personalized to each customer based on their interests and shared purchase histories. To accomplish this, we use a pattern called Retrieval-Augmented Generation (RAG), specifically a variant that leverages not only vector search but also graph pattern matching and graph machine learning to provide more relevant, personalized results to customers.

This notebook walks through the end-to-end process, including:
- Building the knowledge graph
- Vector search & text embedding
- Using graph patterns in Cypher to improve semantic search with context
- Further augmenting semantic search with knowledge graph inference & ML
- Building the LLM chain and demo app for generating content

## Setup

### Set Up SageMaker Studio Environment

To get started setting up this example, clone this repo into a [SageMaker Studio](https://aws.amazon.com/sagemaker/studio/) environment and then open this notebook.

### Enable AWS IAM permissions for Bedrock

The AWS identity you assume from your notebook environment (which is the [*Studio/notebook Execution Role*](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) from SageMaker, or could be a role or IAM User for self-managed notebooks), must have sufficient [AWS IAM permissions](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies.html) to call the Amazon Bedrock service.

To grant Bedrock access to your identity, you can:

- Open the [AWS IAM Console](https://us-east-1.console.aws.amazon.com/iam/home?#)
- Find your [Role](https://us-east-1.console.aws.amazon.com/iamv2/home?#/roles) (if using SageMaker or otherwise assuming an IAM Role), or else [User](https://us-east-1.console.aws.amazon.com/iamv2/home?#/users)
- Select *Add Permissions > Create Inline Policy* to attach new inline permissions, open the *JSON* editor and paste in the below example policy:

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "BedrockFullAccess",
            "Effect": "Allow",
            "Action": ["bedrock:*"],
            "Resource": "*"
        }
    ]
}
```

> ⚠️ **Note:** With Amazon SageMaker, your notebook execution role will typically be *separate* from the user or role that you log in to the AWS Console with. If you'd like to explore the AWS Console for Amazon Bedrock, you'll need to grant permissions to your Console user/role too.


For more information on the fine-grained action and resource permissions in Bedrock, check out the Bedrock Developer Guide.
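
The example policy above grants every Bedrock action on every resource. As a narrower alternative, a policy can be limited to model invocation on specific foundation models. The sketch below builds such a policy in Python so it can be printed and pasted into the *JSON* editor. The helper name `build_invoke_policy` and the model IDs passed to it are assumptions for illustration, so confirm the actions and model identifiers you need against the Bedrock Developer Guide before relying on it.

```python
import json


def build_invoke_policy(region, model_ids):
    """Return an IAM policy dict that only allows invoking the given Bedrock models."""
    # Foundation-model ARNs have an empty account field (note the double colon).
    resources = [
        f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
        for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BedrockInvokeOnly",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


# Model IDs are assumptions: the Titan embedding model used later in this
# notebook, plus one possible Anthropic Claude model ID.
policy = build_invoke_policy(
    "us-east-1", ["amazon.titan-embed-text-v1", "anthropic.claude-v2"]
)
print(json.dumps(policy, indent=4))
```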

### Add Anthropic Model Access

In addition to the above, you will need to add access for the Anthropic Claude foundation model. Follow the steps [here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html#add-model-access) to add models.


### Start an AuraDS Instance

If you do not already have Neo4j Aura, you can use [this link](https://aws.amazon.com/marketplace/seller-profile?id=23ec694a-d2af-4641-b4d3-b7201ab2f5f9) to get access. Click "View purchase options" and select Neo4j Aura Professional. Then click "Click here to set up your account." Billing information is passed over from the AWS account you came from. Under "Already have an account?" click "Log in." From there, create an AuraDS instance and save the credentials. You can use an 8GB instance (the smallest instance available) for this example.



### Install Dependencies

This will take a few minutes.

In [2]:
%%capture
%pip install sentence_transformers langchain tiktoken python-dotenv gradio graphdatascience altair boto3
%pip install "vegafusion[embed]"

Restart the kernel

In [3]:
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

Import initial dependencies

In [2]:
from graphdatascience import GraphDataScience
from dotenv import load_dotenv
import os
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
from langchain.embeddings import BedrockEmbeddings

### Setup Config & Credential Variables

To make this easy, you can write the configuration and credential variables directly into the cell below.
Alternatively, you can use an environment file by copying `ex.env.template` to `ex.env` and filling in the credentials and variables there. This is a best practice for the future, but it is fine to skip for now.

In [3]:
# Neo4j
NEO4J_URI = 'bolt://34.202.229.218:7687' #change this
NEO4J_PASSWORD = 'terminologies-fire-planet' #change this
NEO4J_USERNAME = 'neo4j'
AURA_DS = True

# AI
AI_SERVICE_NAME = 'bedrock-runtime'
AI_REGION = 'us-east-1'

In [4]:
# You can skip this if you are not using an ex.env file. It will overwrite the values above
if os.path.exists('ex.env'):
    load_dotenv('ex.env', override=True)

    # Neo4j
    NEO4J_URI = os.getenv('NEO4J_URI')
    NEO4J_USERNAME = os.getenv('NEO4J_USERNAME')
    NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD')
    # Parse the flag as a boolean without calling eval on file contents
    AURA_DS = os.getenv('AURA_DS', 'false').strip().lower() == 'true'

    # AI
    AI_SERVICE_NAME = 'bedrock-runtime' # This should stay the same
    AI_REGION = os.getenv('AI_REGION')


## Knowledge Graph Building

<img src="img/hm-banner.png" alt="summary" width="2000"/>

We begin by building our knowledge graph. This example will leverage the [H&M Personalized Fashion Recommendations Dataset](https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/data), a sample of real customer purchase data that includes rich information about products, including names, types, descriptions, department sections, etc.

Below is the graph data model we will use:

<img src="img/data-model.png" alt="summary" width="1000"/>
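
For readers without the image, the same model can be written down as plain Python data. This is a sketch only: the labels, key properties, and relationship types are taken from the loading calls later in this notebook, while the names `NODE_KEYS`, `RELATIONSHIPS`, and `as_cypher_pattern` are introduced here purely for illustration.

```python
# Node labels and the property that uniquely identifies each node.
NODE_KEYS = {
    "Department": "departmentNo",
    "Product": "productCode",
    "Article": "articleId",
    "Customer": "customerId",
}

# (source label, relationship type, target label)
RELATIONSHIPS = [
    ("Article", "FROM_DEPARTMENT", "Department"),
    ("Article", "VARIANT_OF", "Product"),
    ("Customer", "PURCHASED", "Article"),
]


def as_cypher_pattern(source, rel_type, target):
    """Render one relationship of the model as a Cypher path pattern."""
    return f"(:{source})-[:{rel_type}]->(:{target})"


for rel in RELATIONSHIPS:
    print(as_cypher_pattern(*rel))
```

Read this way, a customer reaches a product description in two hops: `PURCHASED` to an article, then `VARIANT_OF` to its product.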


### Get Source Data
The data has been pre-formatted.  If you would like to re-generate the dataset from the source on Kaggle, see the `data-prep.ipynb` notebook.

In [5]:
pd.set_option('display.max_rows', 10)
pd.set_option('display.max_colwidth', 500)
pd.set_option('display.width', 0)

In [5]:
department_df = pd.read_csv('https://storage.googleapis.com/neo4j-workshop-data/genai-hm/department.csv')
department_df

Unnamed: 0,departmentNo,departmentName,sectionNo,sectionName
0,1676,Jersey Basic,16,Womens Everyday Basics
1,1339,Clean Lingerie,61,Womens Lingerie
2,3608,Tights basic,62,"Womens Nightwear, Socks & Tigh"
3,5883,Jersey Basic,26,Men Underwear
4,2032,Jersey,8,Mama
...,...,...,...,...
261,7510,Woven,28,Men Edition
262,3420,Small Accessories Extended,66,Womens Small accessories
263,5231,Jacket,31,Mens Outerwear
264,8090,Promotion/Other/Offer,29,Men Other


In [6]:
product_df = pd.read_csv('https://storage.googleapis.com/neo4j-workshop-data/genai-hm/product.csv')
product_df

Unnamed: 0,productCode,prodName,productTypeNo,productTypeName,productGroupName,garmentGroupNo,garmentGroupName,detailDesc
0,108775,Strap top,253,Vest top,Garment Upper body,1002,Jersey Basic,Jersey top with narrow shoulder straps.
1,110065,OP T-shirt (Idro),306,Bra,Underwear,1017,"Under-, Nightwear","Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort."
2,111565,20 den 1p Stockings,304,Underwear Tights,Socks & Tights,1021,Socks and Tights,"Semi shiny nylon stockings with a wide, reinforced trim at the top. Use with a suspender belt. 20 denier."
3,111586,Shape Up 30 den 1p Tights,273,Leggings/Tights,Garment Lower body,1021,Socks and Tights,Tights with built-in support to lift the bottom. Black in 30 denier and light amber in 15 denier.
4,111593,Support 40 den 1p Tights,304,Underwear Tights,Socks & Tights,1021,Socks and Tights,"Semi shiny tights that shape the tummy, thighs and calves while also encouraging blood circulation in the legs. Elasticated waist."
...,...,...,...,...,...,...,...,...
8039,936862,EDC Marla dress,265,Dress,Garment Full body,1023,Special Offers,"Calf-length dress in a patterned Tencel™ lyocell weave with a V-neck, sewn in wrapover at the top and decorative ties at one side. 3/4-length dolman sleeves with narrow, covered elastication at the cuffs. Gathered seam at the waist with concealed elastication and a flared skirt with a gathered tier at the hem for added width. Unlined."
8040,936979,Class Filippa Necklace,77,Necklace,Accessories,1019,Accessories,Metal chain necklace with a pendant. Adjustable length.
8041,937138,Flirty Albin bracelet pk,68,Bracelet,Accessories,1019,Accessories,Metal chain bracelets. Two plain and two with pendants. Adjustable length.
8042,942187,ED Sasha tee,255,T-shirt,Garment Upper body,1005,Jersey Fancy,"Oversized, straight-cut T-shirt in a soft modal and cotton jersey blend with a ribbed neckline and low dropped shoulders."


In [7]:
article_df = pd.read_csv('https://storage.googleapis.com/neo4j-workshop-data/genai-hm/article.csv')
article_df

Unnamed: 0,articleId,productCode,departmentNo,prodName,productTypeName,graphicalAppearanceNo,graphicalAppearanceName,colourGroupCode,colourGroupName
0,108775015,108775,1676,Strap top,Vest top,1010016,Solid,9,Black
1,108775044,108775,1676,Strap top,Vest top,1010016,Solid,10,White
2,110065001,110065,1339,OP T-shirt (Idro),Bra,1010016,Solid,9,Black
3,111565001,111565,3608,20 den 1p Stockings,Underwear Tights,1010016,Solid,9,Black
4,111586001,111586,3608,Shape Up 30 den 1p Tights,Leggings/Tights,1010016,Solid,9,Black
...,...,...,...,...,...,...,...,...,...
13346,936862001,936862,3090,EDC Marla dress,Dress,1010001,All over pattern,52,Pink
13347,936979001,936979,4344,Class Filippa Necklace,Necklace,1010016,Solid,5,Gold
13348,937138001,937138,4345,Flirty Albin bracelet pk,Bracelet,1010016,Solid,5,Gold
13349,942187001,942187,1919,ED Sasha tee,T-shirt,1010016,Solid,9,Black


In [8]:
customer_df = pd.read_csv('https://storage.googleapis.com/neo4j-workshop-data/genai-hm/customer.csv')
customer_df

Unnamed: 0,customerId,fn,active,clubMemberStatus,fashionNewsFrequency,age,postalCode
0,00264b7d4cd6498292e8a355b699c2d07725d123f04867fc4cd204dc4fa286a5,1.0,1.0,ACTIVE,Regularly,53.0,2c29ae653a9282cce4151bd87643c907644e09541abc28ae87dea0d1f6603b1c
1,005c6d3bb66c86aab606814cd9995a12f99b3a44b58c72b89fdde2d7b3f42e0b,,,PRE-CREATE,NONE,,177b4a2258a85a2247daaa7cdffba96a74c741ea8a66059b80509a3f9a2a9c8d
2,00abec3de294e03d192db15b91e154853ee1c89415e7cd38e163dddd8ae0ff06,,,ACTIVE,NONE,49.0,86557a458110ac98f4ca80e5a815ba2e8ea086dd8039b007fe7604ee1587140f
3,00f311a42124fc44d117135f34e1fca29fcac271e6fbd07105b661030ddec423,1.0,1.0,ACTIVE,Regularly,55.0,1a80c5651ae36327a86e71d5b967cf62c31126d1b57ae0a42b99302a126ffb03
4,0132cd2eb3c6b1f66784f65f94ddd8352add2653e0caf5fc564fcfe4eb977863,,,ACTIVE,NONE,49.0,49f7ec29bcacbbf2120af5162f9f99c212e9dd26b48d794d2ad35a92941519ce
...,...,...,...,...,...,...,...
995,fdf1294f414faac2b00a725f5d80c34f98a744d9b8b3cef36e8e0297a8f77995,,,ACTIVE,NONE,32.0,0cd87888c3a13ebbb1e90cac6b9fbf34c51afa40865f55bc7675762ad1ef20f1
996,fe6faeed37fe86e885928d3ab30d8d9b072d6643c8aa15171d52cb1d63e07b1d,1.0,1.0,ACTIVE,Regularly,46.0,fe234b03107b233aec5695dc4c3fbe8e638338643f4e148396c078949e3909c3
997,fef793ec3a7d62d782824517355d74ded50964dce33009d605a4b08e425694b3,,,ACTIVE,NONE,46.0,5799a39cffe701ebdb12181348bf10f9e23abcc3868c43bfade2a2bffbb22be0
998,ffb925b11e1bb2e375d22a02d67907994eb8cb92ec2e7d0489f13a535d975762,,,ACTIVE,NONE,34.0,ebdd8c5c893683c3cf52c011d4e35024e46d183c95f0fa8b95df4eeef98f5bde


In [10]:
transaction_df = pd.read_csv('https://storage.googleapis.com/neo4j-workshop-data/genai-hm/transaction.csv')
transaction_df

Unnamed: 0,tDat,customerId,articleId,price,salesChannelId,txId
0,2018-09-20,0ddcd6055c5830c1fda493843d051edb04ce1bf888aa4becf5b839628396541d,653428002,0.135576,1,2445
1,2018-09-20,210f113fe87db5d6391e986dc06b8e4369e46284e3b98964bf41ced4199a551f,636587001,0.008458,1,6182
2,2018-09-20,210f113fe87db5d6391e986dc06b8e4369e46284e3b98964bf41ced4199a551f,640462002,0.032186,1,6183
3,2018-09-20,211a2ef477fcfc8fc40a63ffa70bb41086dd06ca85d4af83875485dbdf3419e6,645422002,0.014390,2,6188
4,2018-09-20,211a2ef477fcfc8fc40a63ffa70bb41086dd06ca85d4af83875485dbdf3419e6,645422002,0.014390,2,6189
...,...,...,...,...,...,...
23194,2020-09-22,b6be55f233772b5fc4a1ebedf36542fb3e1b6c15c23c7e29c19af814b896a923,921266007,0.016932,2,31779124
23195,2020-09-22,b6be55f233772b5fc4a1ebedf36542fb3e1b6c15c23c7e29c19af814b896a923,812530004,0.010153,2,31779125
23196,2020-09-22,b6be55f233772b5fc4a1ebedf36542fb3e1b6c15c23c7e29c19af814b896a923,942187001,0.016932,2,31779126
23197,2020-09-22,b6be55f233772b5fc4a1ebedf36542fb3e1b6c15c23c7e29c19af814b896a923,866731001,0.025407,2,31779127


### Connect to Neo4j

We will use the [Graph Data Science Python Client](https://neo4j.com/docs/graph-data-science-client/current/) to connect to Neo4j. This client makes it convenient to display results, as we will see later.  Perhaps more importantly, it allows us to easily run [Graph Data Science](https://neo4j.com/docs/graph-data-science/current/introduction/) algorithms from Python.

This client will only work if your Neo4j instance has Graph Data Science installed. If not, you can still use the [Neo4j Python Driver](https://neo4j.com/docs/python-manual/current/) or LangChain's Neo4j Graph object, which we will see later on.

In [6]:
# Use Neo4j URI and credentials according to our setup
gds = GraphDataScience(
    NEO4J_URI,
    auth=(NEO4J_USERNAME, NEO4J_PASSWORD),
    aura_ds=AURA_DS)

# Necessary if you enabled Arrow on the db - this is true for AuraDS
gds.set_database("neo4j")

### Create Constraints

Before loading data into Neo4j, it is usually best practice to create Key or Uniqueness constraints for nodes. These [constraints](https://neo4j.com/docs/cypher-manual/current/constraints/) act as an index with some validation on unique id properties and thus make `MATCH` statements run significantly faster. Not doing this can result in a VERY slow ingest, so this is a critical step.

In [13]:
# one uniqueness constraint for each node label
gds.run_cypher('CREATE CONSTRAINT unique_department_no IF NOT EXISTS FOR (n:Department) REQUIRE n.departmentNo IS UNIQUE')
gds.run_cypher('CREATE CONSTRAINT unique_product_code IF NOT EXISTS FOR (n:Product) REQUIRE n.productCode IS UNIQUE')
gds.run_cypher('CREATE CONSTRAINT unique_article_id IF NOT EXISTS FOR (n:Article) REQUIRE n.articleId IS UNIQUE')
gds.run_cypher('CREATE CONSTRAINT unique_customer_id IF NOT EXISTS FOR (n:Customer) REQUIRE n.customerId IS UNIQUE')
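
The four statements above share one template, so they can also be generated from a label-to-key mapping. The sketch below is one way to do that. The helper names `snake_case` and `make_constraint_query` are assumptions introduced here, not part of any library.

```python
import re


def snake_case(name):
    """Convert a camelCase property name such as 'departmentNo' to 'department_no'."""
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()


def make_constraint_query(label, key):
    """Build a uniqueness-constraint statement for one node label and key property."""
    return (f'CREATE CONSTRAINT unique_{snake_case(key)} IF NOT EXISTS '
            f'FOR (n:{label}) REQUIRE n.{key} IS UNIQUE')


node_keys = {
    'Department': 'departmentNo',
    'Product': 'productCode',
    'Article': 'articleId',
    'Customer': 'customerId',
}

for label, key in node_keys.items():
    print(make_constraint_query(label, key))
```

Each generated string can be passed to `gds.run_cypher`, and running `SHOW CONSTRAINTS` afterwards lists what exists in the database.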

### Helper Functions

Since we normalized our data beforehand, we can load each node and relationship type separately in batches.
The node and relationship queries follow the same template across types. The functions below construct the queries and handle the batching. They print the queries they use while loading so you can see the patterns.

Cypher for loading nodes follows an UNWIND-MERGE-SET pattern, while Cypher for loading relationships follows an UNWIND-MATCH-MATCH-MERGE pattern.


In [7]:
from typing import Tuple, Union
from numpy.typing import ArrayLike


def make_map(x):
    # A string is used for both positions; a tuple is passed through as-is
    if isinstance(x, str):
        return x, x
    elif isinstance(x, tuple):
        return x
    else:
        raise Exception("Entry must be of type string or tuple")


def make_set_clause(prop_names: ArrayLike, element_name='n', item_name='rec'):
    clause_list = []
    for prop_name in prop_names:
        clause_list.append(f'{element_name}.{prop_name} = {item_name}.{prop_name}')
    return 'SET ' + ', '.join(clause_list)


def make_node_merge_query(node_key_name: str, node_label: str, cols: ArrayLike):
    template = f'''UNWIND $recs AS rec\nMERGE(n:{node_label} {{{node_key_name}: rec.{node_key_name}}})'''
    prop_names = [x for x in cols if x != node_key_name]
    if len(prop_names) > 0:
        template = template + '\n' + make_set_clause(prop_names)
    return template + '\nRETURN count(n) AS nodeLoadedCount'


def make_rel_merge_query(source_target_labels: Union[Tuple[str, str], str],
                         source_node_key: Union[Tuple[str, str], str],
                         target_node_key: Union[Tuple[str, str], str],
                         rel_type: str,
                         cols: ArrayLike,
                         rel_key: str = None):
    source_target_label_map = make_map(source_target_labels)
    source_node_key_map = make_map(source_node_key)
    target_node_key_map = make_map(target_node_key)

    merge_statement = f'MERGE(s)-[r:{rel_type}]->(t)'
    if rel_key is not None:
        merge_statement = f'MERGE(s)-[r:{rel_type} {{{rel_key}: rec.{rel_key}}}]->(t)'

    template = f'''\tUNWIND $recs AS rec
    MATCH(s:{source_target_label_map[0]} {{{source_node_key_map[0]}: rec.{source_node_key_map[1]}}})
    MATCH(t:{source_target_label_map[1]} {{{target_node_key_map[0]}: rec.{target_node_key_map[1]}}})\n\t''' + merge_statement
    prop_names = [x for x in cols if x not in [rel_key, source_node_key_map[1], target_node_key_map[1]]]
    if len(prop_names) > 0:
        template = template + '\n\t' + make_set_clause(prop_names, 'r')
    return template + '\n\tRETURN count(r) AS relLoadedCount'


def chunks(xs, n=10_000):
    n = max(1, n)
    return [xs[i:i + n] for i in range(0, len(xs), n)]


def load_nodes(gds: GraphDataScience, node_df: pd.DataFrame, node_key_col: str, node_label: str, chunk_size=10_000):
    records = node_df.to_dict('records')
    print(f'======  loading {node_label} nodes  ======')
    total = len(records)
    print(f'staging {total:,} records')
    query = make_node_merge_query(node_key_col, node_label, node_df.columns.copy())
    print(f'\nUsing This Cypher Query:\n```\n{query}\n```\n')
    cumulative_count = 0
    for recs in chunks(records, chunk_size):
        res = gds.run_cypher(query, params={'recs': recs})
        cumulative_count += res.iloc[0, 0]
        print(f'Loaded {cumulative_count:,} of {total:,} nodes')


def load_rels(gds: GraphDataScience,
              rel_df: pd.DataFrame,
              source_target_labels: Union[Tuple[str, str], str],
              source_node_key: Union[Tuple[str, str], str],
              target_node_key: Union[Tuple[str, str], str],
              rel_type: str,
              rel_key: str = None,
              chunk_size=10_000):
    records = rel_df.to_dict('records')
    print(f'======  loading {rel_type} relationships  ======')
    total = len(records)
    print(f'staging {total:,} records')
    query = make_rel_merge_query(source_target_labels, source_node_key,
                                 target_node_key, rel_type, rel_df.columns.copy(), rel_key)
    print(f'\nUsing This Cypher Query:\n```\n{query}\n```\n')
    cumulative_count = 0
    for recs in chunks(records, chunk_size):
        res = gds.run_cypher(query, params={'recs': recs})
        cumulative_count += res.iloc[0, 0]
        print(f'Loaded {cumulative_count:,} of {total:,} relationships')
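
Two details of these helpers are easy to miss: how `chunks` splits the records, and what the tuple form of a key argument does. The standalone sketch below repeats the logic of `chunks` and `make_map` so it runs on its own. The record field name `custId` is hypothetical and only illustrates a dataframe column that is named differently from the node property.

```python
def chunks(xs, n=10_000):
    """Split a sequence into consecutive batches of at most n items."""
    n = max(1, n)
    return [xs[i:i + n] for i in range(0, len(xs), n)]


def make_map(x):
    """Return (x, x) for a string, or the tuple itself if one is given."""
    if isinstance(x, str):
        return x, x
    if isinstance(x, tuple):
        return x
    raise TypeError("Entry must be of type string or tuple")


# 23,199 records (the size of the transaction data) become three batches.
batch_sizes = [len(batch) for batch in chunks(list(range(23_199)), 10_000)]
print(batch_sizes)

# Tuple form: the first item is the node property, the second is the record field.
node_prop, rec_field = make_map(('customerId', 'custId'))
match_line = f'MATCH(s:Customer {{{node_prop}: rec.{rec_field}}})'
print(match_line)
```

Every call in this notebook uses the plain string form, because the dataframe columns already carry the same names as the node properties.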

### Load Nodes

In [15]:
%%time
load_nodes(gds, department_df, 'departmentNo', 'Department')

staging 266 records

Using This Cypher Query:
```
UNWIND $recs AS rec
MERGE(n:Department {departmentNo: rec.departmentNo})
SET n.departmentName = rec.departmentName, n.sectionNo = rec.sectionNo, n.sectionName = rec.sectionName
RETURN count(n) AS nodeLoadedCount
```

Loaded 266 of 266 nodes
CPU times: user 11.6 ms, sys: 0 ns, total: 11.6 ms
Wall time: 529 ms


In [16]:
%%time
load_nodes(gds, product_df, 'productCode', 'Product')

staging 8,044 records

Using This Cypher Query:
```
UNWIND $recs AS rec
MERGE(n:Product {productCode: rec.productCode})
SET n.prodName = rec.prodName, n.productTypeNo = rec.productTypeNo, n.productTypeName = rec.productTypeName, n.productGroupName = rec.productGroupName, n.garmentGroupNo = rec.garmentGroupNo, n.garmentGroupName = rec.garmentGroupName, n.detailDesc = rec.detailDesc
RETURN count(n) AS nodeLoadedCount
```

Loaded 8,044 of 8,044 nodes
CPU times: user 327 ms, sys: 10.6 ms, total: 337 ms
Wall time: 2.11 s


In [17]:
%%time
load_nodes(gds, article_df.drop(columns=['productCode', 'departmentNo']), 'articleId', 'Article')

staging 13,351 records

Using This Cypher Query:
```
UNWIND $recs AS rec
MERGE(n:Article {articleId: rec.articleId})
SET n.prodName = rec.prodName, n.productTypeName = rec.productTypeName, n.graphicalAppearanceNo = rec.graphicalAppearanceNo, n.graphicalAppearanceName = rec.graphicalAppearanceName, n.colourGroupCode = rec.colourGroupCode, n.colourGroupName = rec.colourGroupName
RETURN count(n) AS nodeLoadedCount
```

Loaded 10,000 of 13,351 nodes
Loaded 13,351 of 13,351 nodes
CPU times: user 471 ms, sys: 0 ns, total: 471 ms
Wall time: 2 s


In [18]:
%%time
load_nodes(gds, customer_df, 'customerId', 'Customer')

staging 1,000 records

Using This Cypher Query:
```
UNWIND $recs AS rec
MERGE(n:Customer {customerId: rec.customerId})
SET n.fn = rec.fn, n.active = rec.active, n.clubMemberStatus = rec.clubMemberStatus, n.fashionNewsFrequency = rec.fashionNewsFrequency, n.age = rec.age, n.postalCode = rec.postalCode
RETURN count(n) AS nodeLoadedCount
```

Loaded 1,000 of 1,000 nodes
CPU times: user 39.7 ms, sys: 272 µs, total: 40 ms
Wall time: 647 ms


### Load Relationships

In [19]:
%%time
load_rels(gds, article_df[['articleId', 'departmentNo']], source_target_labels=('Article', 'Department'),
          source_node_key='articleId', target_node_key='departmentNo',
          rel_type='FROM_DEPARTMENT')

staging 13,351 records

Using This Cypher Query:
```
	UNWIND $recs AS rec
    MATCH(s:Article {articleId: rec.articleId})
    MATCH(t:Department {departmentNo: rec.departmentNo})
	MERGE(s)-[r:FROM_DEPARTMENT]->(t)
	RETURN count(r) AS relLoadedCount
```

Loaded 10,000 of 13,351 relationships
Loaded 13,351 of 13,351 relationships
CPU times: user 287 ms, sys: 3.51 ms, total: 291 ms
Wall time: 1.78 s


In [20]:
%%time
load_rels(gds, article_df[['articleId', 'productCode']], source_target_labels=('Article', 'Product'),
          source_node_key='articleId',target_node_key='productCode',
          rel_type='VARIANT_OF')

staging 13,351 records

Using This Cypher Query:
```
	UNWIND $recs AS rec
    MATCH(s:Article {articleId: rec.articleId})
    MATCH(t:Product {productCode: rec.productCode})
	MERGE(s)-[r:VARIANT_OF]->(t)
	RETURN count(r) AS relLoadedCount
```

Loaded 10,000 of 13,351 relationships
Loaded 13,351 of 13,351 relationships
CPU times: user 166 ms, sys: 24 µs, total: 166 ms
Wall time: 1.4 s


In [21]:
%%time
load_rels(gds, transaction_df, source_target_labels=('Customer', 'Article'),
          source_node_key='customerId', target_node_key='articleId', rel_key='txId',
          rel_type='PURCHASED')

staging 23,199 records

Using This Cypher Query:
```
	UNWIND $recs AS rec
    MATCH(s:Customer {customerId: rec.customerId})
    MATCH(t:Article {articleId: rec.articleId})
	MERGE(s)-[r:PURCHASED {txId: rec.txId}]->(t)
	SET r.tDat = rec.tDat, r.price = rec.price, r.salesChannelId = rec.salesChannelId
	RETURN count(r) AS relLoadedCount
```

Loaded 10,000 of 23,199 relationships
Loaded 20,000 of 23,199 relationships
Loaded 23,199 of 23,199 relationships
CPU times: user 682 ms, sys: 5.95 ms, total: 687 ms
Wall time: 3.27 s


### Convert Transaction Dates

In [22]:
gds.run_cypher('''
MATCH (:Customer)-[r:PURCHASED]->()
SET r.tDat = date(r.tDat)
''')
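
The transaction dates were loaded as strings such as `2018-09-20`, and the query above converts them to native date values with Cypher's `date()` function. Before running a conversion like this, it can help to check that every string is a well-formed ISO 8601 date. The sketch below does that with the standard library. The function name `invalid_iso_dates` and the sample values are assumptions for illustration.

```python
from datetime import date


def invalid_iso_dates(values):
    """Return the values that cannot be parsed as ISO 8601 calendar dates."""
    bad = []
    for value in values:
        try:
            date.fromisoformat(value)
        except (TypeError, ValueError):
            bad.append(value)
    return bad


# Two well-formed dates and one in a day/month/year layout.
sample = ['2018-09-20', '2020-09-22', '20/09/2018']
print(invalid_iso_dates(sample))
```

Applied to the notebook's data, `invalid_iso_dates(transaction_df.tDat)` should return an empty list if all dates follow the format shown in the dataframe preview.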

## Vector Search

In this section, we will build text embeddings out of product descriptions and demonstrate how to leverage the Neo4j vector index for vector search. We will also introduce the use of [LangChain](https://www.langchain.com/).
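
As a rough intuition for what vector search does, the sketch below ranks a few toy vectors by cosine similarity to a query vector, using only the standard library. It assumes cosine similarity as the comparison function, and the three-dimensional vectors and product names are made up for illustration. Real embeddings have far more dimensions, and a vector index avoids comparing the query against every stored vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, docs, k=2):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings: each product points mostly along a different axis.
docs = {
    'strap top': [0.9, 0.1, 0.0],
    'stockings': [0.1, 0.9, 0.1],
    'necklace': [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]

for name, score in top_k(query, docs):
    print(f'{name}: {score:.3f}')
```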


### Creating Text Embeddings

To start we need to make embeddings for our product nodes.

First, we will load our embedding model.

In [8]:
import boto3
import json
bedrock = boto3.client(
    service_name=AI_SERVICE_NAME,
    region_name=AI_REGION,
    endpoint_url=f'https://{AI_SERVICE_NAME}.{AI_REGION}.amazonaws.com'
)

In [9]:
embedding_model = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1", client=bedrock)

Now let's create a dataframe with a text column to embed. In this case, we will combine multiple text columns, such as product name, type, and description. This provides the embedding model with more context. A small minority of products are missing a description. For our purposes, we will leave them out. In a more in-depth workflow, you would likely want to impute the missing values.
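
If you would rather impute than drop, one simple option is to fall back on fields that are always present. The sketch below fills a missing description with the product name and type. It uses a two-row toy dataframe rather than `product_df`, and both the second row and the fallback format are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for product_df; the second row has no description.
toy_df = pd.DataFrame({
    'productCode': [108775, 999999],
    'prodName': ['Strap top', 'Example tee'],
    'productTypeName': ['Vest top', 'T-shirt'],
    'detailDesc': ['Jersey top with narrow shoulder straps.', None],
})

# Build a fallback text per row, then use it only where the description is missing.
fallback = toy_df['prodName'] + ' (' + toy_df['productTypeName'] + ')'
toy_df['detailDesc'] = toy_df['detailDesc'].fillna(fallback)

print(toy_df[['productCode', 'detailDesc']])
```

A fallback like this adds no new information to the embedding text, so it mainly keeps those products searchable rather than improving their descriptions.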

In [25]:
product_emb_df = product_df[['productCode', 'prodName', 'productTypeName', 'productGroupName', 'garmentGroupName', 'detailDesc']]
# Copy the filtered frame so that adding columns later does not modify a view
product_emb_df = product_emb_df[product_emb_df.detailDesc.notnull()].copy()

In [26]:
def create_doc(row):
    return f'''
##Product
Name: {row.prodName}
Type: {row.productTypeName}
Group: {row.productGroupName}
Garment Type: {row.garmentGroupName}
Description: {row.detailDesc}
'''

product_emb_df['text'] = product_emb_df.apply(create_doc, axis=1)
product_emb_df = product_emb_df.drop(columns=['prodName', 'productTypeName', 'productGroupName', 'garmentGroupName', 'detailDesc'])
product_emb_df

Unnamed: 0,productCode,text
0,108775,\n##Product\nName: Strap top\nType: Vest top\nGroup: Garment Upper body\nGarment Type: Jersey Basic\nDescription: Jersey top with narrow shoulder straps.\n
1,110065,"\n##Product\nName: OP T-shirt (Idro)\nType: Bra\nGroup: Underwear\nGarment Type: Under-, Nightwear\nDescription: Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort.\n"
2,111565,"\n##Product\nName: 20 den 1p Stockings\nType: Underwear Tights\nGroup: Socks & Tights\nGarment Type: Socks and Tights\nDescription: Semi shiny nylon stockings with a wide, reinforced trim at the top. Use with a suspender belt. 20 denier.\n"
3,111586,\n##Product\nName: Shape Up 30 den 1p Tights\nType: Leggings/Tights\nGroup: Garment Lower body\nGarment Type: Socks and Tights\nDescription: Tights with built-in support to lift the bottom. Black in 30 denier and light amber in 15 denier.\n
4,111593,"\n##Product\nName: Support 40 den 1p Tights\nType: Underwear Tights\nGroup: Socks & Tights\nGarment Type: Socks and Tights\nDescription: Semi shiny tights that shape the tummy, thighs and calves while also encouraging blood circulation in the legs. Elasticated waist.\n"
...,...,...
8039,936862,"\n##Product\nName: EDC Marla dress\nType: Dress\nGroup: Garment Full body\nGarment Type: Special Offers\nDescription: Calf-length dress in a patterned Tencel™ lyocell weave with a V-neck, sewn in wrapover at the top and decorative ties at one side. 3/4-length dolman sleeves with narrow, covered elastication at the cuffs. Gathered seam at the waist with concealed elastication and a flared skirt with a gathered tier at the hem for added width. Unlined.\n"
8040,936979,\n##Product\nName: Class Filippa Necklace\nType: Necklace\nGroup: Accessories\nGarment Type: Accessories\nDescription: Metal chain necklace with a pendant. Adjustable length.\n
8041,937138,\n##Product\nName: Flirty Albin bracelet pk\nType: Bracelet\nGroup: Accessories\nGarment Type: Accessories\nDescription: Metal chain bracelets. Two plain and two with pendants. Adjustable length.\n
8042,942187,"\n##Product\nName: ED Sasha tee\nType: T-shirt\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Oversized, straight-cut T-shirt in a soft modal and cotton jersey blend with a ribbed neckline and low dropped shoulders.\n"


Now let’s embed the text.  We will chunk this into batches for efficiency.

In [29]:
%%time

count = 0
embeddings = []
for docs in chunks(product_emb_df.text, n=80):
    count += len(docs)
    embeddings.extend(embedding_model.embed_documents(docs))
    print(f'Embedded {count} of {product_emb_df.shape[0]}')

Embedded 80 of 8018
Embedded 160 of 8018
Embedded 240 of 8018
Embedded 320 of 8018
Embedded 400 of 8018
Embedded 480 of 8018
Embedded 560 of 8018
Embedded 640 of 8018
Embedded 720 of 8018
Embedded 800 of 8018
Embedded 880 of 8018
Embedded 960 of 8018
Embedded 1040 of 8018
Embedded 1120 of 8018
Embedded 1200 of 8018
Embedded 1280 of 8018
Embedded 1360 of 8018
Embedded 1440 of 8018
Embedded 1520 of 8018
Embedded 1600 of 8018
Embedded 1680 of 8018
Embedded 1760 of 8018
Embedded 1840 of 8018
Embedded 1920 of 8018
Embedded 2000 of 8018
Embedded 2080 of 8018
Embedded 2160 of 8018
Embedded 2240 of 8018
Embedded 2320 of 8018
Embedded 2400 of 8018
Embedded 2480 of 8018
Embedded 2560 of 8018
Embedded 2640 of 8018
Embedded 2720 of 8018
Embedded 2800 of 8018
Embedded 2880 of 8018
Embedded 2960 of 8018
Embedded 3040 of 8018
Embedded 3120 of 8018
Embedded 3200 of 8018
Embedded 3280 of 8018
Embedded 3360 of 8018
Embedded 3440 of 8018
Embedded 3520 of 8018
Embedded 3600 of 8018
Embedded 3680 of 8018
...
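
Because each batch is a separate call to a remote service, a long run can fail partway through. The sketch below wraps the same batching loop in a simple retry with a growing wait between attempts. The function `embed_in_batches`, its parameters, and the stand-in `fake_embed` are assumptions introduced here for illustration. In the notebook, `embedding_model.embed_documents` would take the place of `fake_embed`.

```python
import time


def embed_in_batches(texts, embed_fn, batch_size=80, max_retries=3, wait_seconds=1.0):
    """Embed texts batch by batch, retrying a failed batch before giving up."""
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(1, max_retries + 1):
            try:
                # Results are appended in input order, so row i matches texts[i].
                embeddings.extend(embed_fn(batch))
                break
            except Exception:
                if attempt == max_retries:
                    raise
                time.sleep(wait_seconds * attempt)
    return embeddings


def fake_embed(batch):
    """Stand-in embedder: one single-number 'vector' per text (its length)."""
    return [[float(len(text))] for text in batch]


texts = [f'doc {i}' for i in range(200)]
vectors = embed_in_batches(texts, fake_embed, batch_size=80)
print(len(vectors))
```

With the real model, a call such as `embed_in_batches(list(product_emb_df.text), embedding_model.embed_documents)` would return one vector per row in the same order as the dataframe.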

In [30]:
# Set as column of dataframe to prepare for loading
product_emb_df['textEmbedding'] = embeddings
product_emb_df

Unnamed: 0,productCode,text,textEmbedding
0,108775,\n##Product\nName: Strap top\nType: Vest top\nGroup: Garment Upper body\nGarment Type: Jersey Basic\nDescription: Jersey top with narrow shoulder straps.\n,"[0.20117188, 0.515625, 0.484375, -0.084472656, -0.70703125, -0.30078125, -0.12988281, 8.2969666e-05, -0.22851562, -0.34179688, -0.28320312, 0.42578125, 0.30664062, -0.390625, 0.29882812, -0.13378906, 0.28710938, 0.23242188, 0.114746094, -0.30273438, -0.06689453, 0.3046875, 0.045898438, 0.43945312, 0.47070312, 0.111816406, 0.05029297, -0.12451172, -0.107910156, -0.48046875, 0.030273438, 0.890625, -0.44921875, 0.0625, -0.18261719, 0.013793945, 0.30859375, 0.14648438, -0.75390625, 0.03930664, 0..."
1,110065,"\n##Product\nName: OP T-shirt (Idro)\nType: Bra\nGroup: Underwear\nGarment Type: Under-, Nightwear\nDescription: Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort.\n","[-0.15625, 0.32226562, 0.15429688, 0.3046875, -0.44921875, -0.25195312, -0.24707031, 0.00045776367, -0.37109375, -0.103027344, -0.005493164, 0.25195312, 0.29296875, -0.18164062, 0.3984375, -0.22558594, 0.27929688, -0.26757812, 0.09033203, -0.01965332, 0.026489258, -0.0013427734, 0.056396484, 0.14648438, 0.41015625, 0.11767578, -0.052246094, -0.09375, 0.084472656, 0.23535156, -0.71484375, 0.86328125, -0.12695312, 0.20117188, 0.2265625, 0.028808594, 0.21777344, -0.32421875, 0.27148438, -0.0262..."
2,111565,"\n##Product\nName: 20 den 1p Stockings\nType: Underwear Tights\nGroup: Socks & Tights\nGarment Type: Socks and Tights\nDescription: Semi shiny nylon stockings with a wide, reinforced trim at the top. Use with a suspender belt. 20 denier.\n","[-0.33398438, 0.4921875, 0.35742188, -0.14941406, -0.50390625, -0.00970459, -0.0056152344, 0.00022315979, -0.32226562, -0.20898438, 0.079589844, 0.2734375, 0.1171875, 0.1796875, 0.34375, -0.015197754, 0.50390625, -0.703125, 0.39257812, 0.07763672, -0.040527344, -0.072265625, -0.13085938, 0.39648438, 0.043701172, 0.23046875, -0.18261719, -0.44726562, 0.045654297, 0.061523438, -0.09033203, 0.3984375, -0.41992188, -0.03564453, 0.10595703, 0.03540039, 0.6875, -0.12988281, 0.19335938, 0.015258789..."
3,111586,\n##Product\nName: Shape Up 30 den 1p Tights\nType: Leggings/Tights\nGroup: Garment Lower body\nGarment Type: Socks and Tights\nDescription: Tights with built-in support to lift the bottom. Black in 30 denier and light amber in 15 denier.\n,"[-0.1953125, 0.5, 0.061523438, 0.018676758, -0.27539062, -0.12792969, -0.12890625, 0.00029563904, -0.80859375, -0.11035156, -0.390625, 0.6796875, -0.18847656, 0.3203125, 0.26953125, -0.20800781, 0.24804688, -0.796875, 0.24707031, 0.17773438, -0.36132812, 0.18359375, 0.12207031, 0.38867188, 0.45898438, -0.1875, -0.10839844, -0.27929688, -0.47460938, -0.13378906, 0.04736328, -0.14453125, -0.029174805, -0.048095703, -0.3671875, -0.037109375, 0.35351562, -0.020751953, 0.028198242, -0.18457031, -..."
4,111593,"\n##Product\nName: Support 40 den 1p Tights\nType: Underwear Tights\nGroup: Socks & Tights\nGarment Type: Socks and Tights\nDescription: Semi shiny tights that shape the tummy, thighs and calves while also encouraging blood circulation in the legs. Elasticated waist.\n","[-0.6171875, 0.4765625, 0.13867188, 0.22753906, -0.265625, -0.16015625, -0.1875, 0.00019741058, -0.578125, -0.22460938, -0.22363281, 0.28125, -0.03149414, 0.072753906, 0.41015625, -0.23144531, 0.44726562, -0.66015625, 0.1796875, 0.099609375, -0.14550781, 0.21191406, -0.053710938, 0.44335938, 0.25195312, 0.10449219, -0.15625, -0.61328125, -0.1796875, 0.037597656, 0.035888672, 0.29101562, -0.051757812, -0.111816406, -0.107910156, -0.35742188, 0.56640625, -0.14453125, 0.1796875, 0.06689453, -0...."
...,...,...,...
8039,936862,"\n##Product\nName: EDC Marla dress\nType: Dress\nGroup: Garment Full body\nGarment Type: Special Offers\nDescription: Calf-length dress in a patterned Tencel™ lyocell weave with a V-neck, sewn in wrapover at the top and decorative ties at one side. 3/4-length dolman sleeves with narrow, covered elastication at the cuffs. Gathered seam at the waist with concealed elastication and a flared skirt with a gathered tier at the hem for added width. Unlined.\n","[0.027832031, 0.22363281, 0.33007812, -0.080078125, -0.029174805, -0.095214844, -0.19335938, 0.0005264282, -0.42773438, -0.14550781, -0.12011719, 0.63671875, 0.22167969, -0.2109375, 0.080078125, 0.03173828, 0.052734375, 0.3359375, 0.66796875, -0.031982422, 0.06347656, 0.03466797, 0.17578125, 0.24121094, 0.51953125, -0.375, -0.10058594, -0.20996094, -0.10546875, -0.014099121, -0.03149414, 0.07128906, -0.34960938, 0.11767578, -0.009887695, -0.16699219, 0.27734375, -0.17871094, 0.203125, 0.2656..."
8040,936979,\n##Product\nName: Class Filippa Necklace\nType: Necklace\nGroup: Accessories\nGarment Type: Accessories\nDescription: Metal chain necklace with a pendant. Adjustable length.\n,"[-0.0017318726, 0.48242188, -0.013061523, 0.12792969, -0.31835938, -0.13769531, -0.087890625, 0.00019264221, 0.12597656, -0.079589844, 0.578125, 0.28710938, 0.56640625, 0.060058594, 0.07470703, 0.044433594, 0.37890625, -0.4609375, -0.24902344, 0.13964844, 0.265625, 0.21484375, -0.296875, 0.10205078, 0.36523438, -0.18652344, 0.63671875, 0.1484375, -0.40234375, -0.60546875, 0.34960938, 0.234375, -0.40234375, -0.52734375, -0.21777344, -0.234375, -0.2109375, -0.056640625, 0.41601562, -0.33789062..."
8041,937138,\n##Product\nName: Flirty Albin bracelet pk\nType: Bracelet\nGroup: Accessories\nGarment Type: Accessories\nDescription: Metal chain bracelets. Two plain and two with pendants. Adjustable length.\n,"[-0.08642578, 0.46484375, 0.3125, -0.20800781, -0.10498047, -0.22558594, 0.10449219, 0.00037002563, -0.48632812, -0.32421875, 0.609375, 0.48828125, 0.30859375, -0.21484375, 0.27539062, -0.13769531, 0.017700195, -0.06982422, -0.13769531, 0.32226562, -0.03491211, -0.083496094, -0.13867188, 0.640625, -0.35742188, 0.16015625, 0.10253906, 0.09423828, -0.10205078, -0.26367188, -0.0056152344, 0.46679688, -0.005126953, -0.14941406, -0.18359375, -0.26953125, 0.0065307617, -0.03100586, 0.22167969, 0.0..."
8042,942187,"\n##Product\nName: ED Sasha tee\nType: T-shirt\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Oversized, straight-cut T-shirt in a soft modal and cotton jersey blend with a ribbed neckline and low dropped shoulders.\n","[0.19042969, 0.24902344, 0.37890625, 0.2890625, -0.3515625, -0.072265625, 0.017089844, 0.0005760193, -0.16601562, -0.15917969, 0.040039062, -0.15136719, 0.16601562, -0.045410156, 0.048095703, -0.057617188, 0.17382812, 0.10546875, 0.24804688, 0.09716797, -0.023071289, 0.14648438, 0.016845703, 0.3984375, 0.1015625, -0.17871094, 0.2421875, -0.0012359619, -0.21777344, -0.31835938, -0.12207031, 0.375, -0.21484375, 0.076660156, -0.016235352, -0.028564453, 0.2734375, 0.02758789, 0.08154297, -0.0312..."


#### Create Vector Property

Now we will load the embeddings into Neo4j by matching each `Product` node on `productCode` and then calling the `db.create.setNodeVectorProperty` procedure to set the `textEmbedding` property. This procedure stores the values as single-precision floats rather than double-precision ones, which take more space. That matters because embedding vectors tend to be long, so their total size adds up quickly.

In [31]:
records = product_emb_df[['productCode', 'textEmbedding']].to_dict('records')
print(f'======  loading Product text embeddings ======')
total = len(records)
print(f'staging {total:,} records')
cumulative_count = 0
for recs in chunks(records, n=100):
    res = gds.run_cypher('''
    UNWIND $recs AS rec
    MATCH(n:Product {productCode: rec.productCode})
    CALL db.create.setNodeVectorProperty(n, "textEmbedding", rec.textEmbedding)
    RETURN count(n) AS propertySetCount
    ''', params={'recs': recs})
    cumulative_count += res.iloc[0, 0]
    print(f'Set {cumulative_count:,} of {total:,} text embeddings')

staging 8,018 records
Set 100 of 8,018 text embeddings
Set 200 of 8,018 text embeddings
Set 300 of 8,018 text embeddings
Set 400 of 8,018 text embeddings
Set 500 of 8,018 text embeddings
Set 600 of 8,018 text embeddings
Set 700 of 8,018 text embeddings
Set 800 of 8,018 text embeddings
Set 900 of 8,018 text embeddings
Set 1,000 of 8,018 text embeddings
Set 1,100 of 8,018 text embeddings
Set 1,200 of 8,018 text embeddings
Set 1,300 of 8,018 text embeddings
Set 1,400 of 8,018 text embeddings
Set 1,500 of 8,018 text embeddings
Set 1,600 of 8,018 text embeddings
Set 1,700 of 8,018 text embeddings
Set 1,800 of 8,018 text embeddings
Set 1,900 of 8,018 text embeddings
Set 2,000 of 8,018 text embeddings
Set 2,100 of 8,018 text embeddings
Set 2,200 of 8,018 text embeddings
Set 2,300 of 8,018 text embeddings
Set 2,400 of 8,018 text embeddings
Set 2,500 of 8,018 text embeddings
Set 2,600 of 8,018 text embeddings
Set 2,700 of 8,018 text embeddings
Set 2,800 of 8,018 text embeddings
Set 2,900 of 8,0
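
To make the storage argument concrete, here is a rough back-of-the-envelope estimate. It is only a sketch: it counts the raw vector payload, assumes the 1,536-dimension embeddings reported later in this notebook, and ignores any per-property overhead that Neo4j adds.

```python
import numpy as np

n_vectors = 8_018   # records staged in the load above
dim = 1536          # embedding length (assumed from the query vector length shown later)

sizes = {}
for dtype in (np.float32, np.float64):
    # bytes per element * elements per vector * number of vectors
    sizes[np.dtype(dtype).name] = np.dtype(dtype).itemsize * dim * n_vectors

for name, total_bytes in sizes.items():
    print(f"{name}: {total_bytes / 1e6:.1f} MB of raw vector data")
```

Under these assumptions, single precision needs about half the space of double precision for the same set of vectors.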

#### Create a Vector Index

The [Neo4j Vector Index](https://neo4j.com/docs/cypher-manual/current/indexes-for-vector-search/) enables efficient Approximate Nearest Neighbor (ANN) search with vectors. It uses the Hierarchical Navigable Small World (HNSW) algorithm.

Index population runs in the background and can take a minute or two, so the cell below waits for the index to come online before continuing.

In [33]:
%%time

gds.run_cypher(f'CALL db.index.vector.createNodeIndex("product-text-embeddings", "Product", "textEmbedding", {len(product_emb_df.textEmbedding[0])}, "cosine")')

# wait for full index creation (timeout after 300 seconds)
gds.run_cypher('CALL db.awaitIndex("product-text-embeddings", 300)')

CPU times: user 6.49 ms, sys: 4.14 ms, total: 10.6 ms
Wall time: 26.7 s
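
For intuition about what the index approximates, the sketch below runs an exact, brute-force cosine search with NumPy over a few toy vectors. The `top_k_cosine` helper and the toy data are hypothetical and only for illustration; the HNSW index avoids this full scan, and the scores Neo4j reports may be normalized differently from raw cosine values.

```python
import numpy as np

def top_k_cosine(query, vectors, k):
    """Return (row index, cosine similarity) for the k most similar rows."""
    query = np.asarray(query, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # cosine similarity = dot product divided by the product of the norms
    sims = (vectors @ query) / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
    order = np.argsort(-sims)[:k]
    return [(int(i), float(sims[i])) for i in order]

# Toy 2-dimensional "embeddings" standing in for product vectors
docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = top_k_cosine([1.0, 0.1], docs, k=2)
print(result)
```

An exact scan like this compares the query against every stored vector, which is why an approximate index becomes attractive as the number of products grows.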


#### Create Combined Text Property
This is to mirror the text that was used for the text embedding above. Creating it will help with RAG patterns later for our LLM.

In [37]:
gds.run_cypher("""
    MATCH(p:Product)
    SET p.text = '##Product\n' +
        'Name: ' + p.prodName + '\n' +
        'Type: ' + p.productTypeName + '\n' +
        'Group: ' + p.productGroupName + '\n' +
        'Garment Type: ' + p.garmentGroupName + '\n' +
        'Description: ' + p.detailDesc
    RETURN count(p) AS propertySetCount
    """)

Unnamed: 0,propertySetCount
0,8044
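
As a cross-check on the template, the same concatenation can be sketched in plain Python. The `product_text` helper is a hypothetical name used only for illustration; the property values are copied from the "The holiday" product that appears in the search results later in this notebook.

```python
def product_text(p):
    """Build the combined text with the same layout as the Cypher SET above."""
    return (
        "##Product\n"
        f"Name: {p['prodName']}\n"
        f"Type: {p['productTypeName']}\n"
        f"Group: {p['productGroupName']}\n"
        f"Garment Type: {p['garmentGroupName']}\n"
        f"Description: {p['detailDesc']}"
    )

example = {
    "prodName": "The holiday",
    "productTypeName": "Sweater",
    "productGroupName": "Garment Upper body",
    "garmentGroupName": "Knitwear",
    "detailDesc": "Short jumper in a soft, patterned knit with a ribbed polo neck, "
                  "low dropped shoulders and wide ribbing at the cuffs and hem. Relaxed fit.",
}
print(product_text(example))
```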


### Vector Search Using Cypher

To do vector search, we need to:
1. Take the search prompt and convert it to an embedding query vector
2. Use similarity search with that new vector to pull semantically similar documents

Below is an example of converting a search prompt into a query vector. We use our same embedding model to do this.

In [66]:
#search_prompt = 'denim jeans, loose fit, high-waist'
search_prompt = 'Oversized Sweaters'

In [11]:
query_vector = embedding_model.embed_query(search_prompt)
print(f'query vector length: {len(query_vector)}')
print(f'query vector sample: {query_vector[:10]}')

query vector length: 1536
query vector sample: [-1.5234375, 0.59375, 0.122558594, -0.27148438, -0.38476562, 0.29101562, -0.06640625, -0.00063323975, -0.65625, -0.42773438]


Now we can take that and use it in a Cypher query with the vector index to retrieve semantically similar documents.

In [12]:
gds.run_cypher('''
CALL db.index.vector.queryNodes("product-text-embeddings", 10, $queryVector)
YIELD node AS product, score
RETURN product.productCode AS productCode,
    product.text AS text,
    score
''', params={'queryVector': query_vector})

Unnamed: 0,productCode,text,score
0,679895,"##Product\nName: The holiday\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft, patterned knit with a ribbed polo neck, low dropped shoulders and wide ribbing at the cuffs and hem. Relaxed fit.",0.815362
1,789808,##Product\nName: Skylar Chunky Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft rib knit with low dropped shoulders and long sleeves.,0.809108
2,783925,"##Product\nName: Puff sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in a soft, fine knit containing some wool. Relaxed fit with gently dropped shoulders and ribbing around the neckline, cuffs and hem.",0.806341
3,672748,"##Product\nName: Sheffield Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Turtleneck jumper in a soft, textured knit with dropped shoulders, long sleeves and high slits in the sides.",0.806158
4,842001,"##Product\nName: Betsy Oversized\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Oversized, V-neck jumper in a soft, loose knit containing some wool and alpaca wool. Dropped shoulders, long, wide sleeves, wide ribbing around the neckline, cuffs and hem, and slits in the sides.",0.805118
5,730219,"##Product\nName: Wow Ole Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Top in lightweight sweatshirt fabric with a small embroidered motif at the top and ribbing around the neckline, cuffs and hem.",0.803214
6,873217,"##Product\nName: Vic Volume Sleeve Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Slightly shorter jumper in a soft double knit with a round neckline, dropped shoulders, long balloon sleeves with close-fitting ribbing at the cuffs, and a ribbed hem.",0.801381
7,502186,"##Product\nName: Tuck cropped sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in a soft, loose knit with low dropped shoulders and ribbing at the cuffs and hem. Longer at the back.",0.800403
8,812167,"##Product\nName: Macy\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Oversized jumper in a soft knit containing some wool with a ribbed polo neck, low dropped shoulders, long sleeves, and ribbing at the cuffs and hem. The polyester content of the jumper is recycled.",0.799537
9,620083,"##Product\nName: All in sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Cropped top in sweatshirt fabric with long sleeves and ribbing around the neckline, cuffs and hem.",0.798471


### Vector Search Using LangChain

We can also run this vector search through LangChain, which is the recommended approach going forward. To do this, we use the `Neo4jVector` class and call its `from_existing_index` method to connect to the index that already exists in the graph.

In [13]:
from langchain.vectorstores.neo4j_vector import Neo4jVector

In [14]:
kg_vector_search = Neo4jVector.from_existing_index(
    embedding=embedding_model,
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    index_name='product-text-embeddings')

LangChain handles embedding the query and retrieving from Neo4j behind the scenes, making our lives easier. It uses a query similar to the one above and retrieves the `text` property we set for each Product node.

In [15]:
res = kg_vector_search.similarity_search(search_prompt, k=10)
res

[Document(page_content='##Product\nName: The holiday\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft, patterned knit with a ribbed polo neck, low dropped shoulders and wide ribbing at the cuffs and hem. Relaxed fit.', metadata={'prodName': 'The holiday', 'garmentGroupName': 'Knitwear', 'garmentGroupNo': 1003, 'productCode': 679895, 'productTypeName': 'Sweater', 'productTypeNo': 252, 'detailDesc': 'Short jumper in a soft, patterned knit with a ribbed polo neck, low dropped shoulders and wide ribbing at the cuffs and hem. Relaxed fit.', 'productGroupName': 'Garment Upper body'}),
 Document(page_content='##Product\nName: Skylar Chunky Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft rib knit with low dropped shoulders and long sleeves.', metadata={'prodName': 'Skylar Chunky Sweater', 'garmentGroupName': 'Knitwear', 'garmentGroupNo': 1003, 'productCode': 789808, 'productType

In [16]:
# Visualize as a dataframe
pd.DataFrame([{'document': d.page_content} for d in res])

Unnamed: 0,document
0,"##Product\nName: The holiday\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft, patterned knit with a ribbed polo neck, low dropped shoulders and wide ribbing at the cuffs and hem. Relaxed fit."
1,##Product\nName: Skylar Chunky Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft rib knit with low dropped shoulders and long sleeves.
2,"##Product\nName: Puff sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in a soft, fine knit containing some wool. Relaxed fit with gently dropped shoulders and ribbing around the neckline, cuffs and hem."
3,"##Product\nName: Sheffield Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Turtleneck jumper in a soft, textured knit with dropped shoulders, long sleeves and high slits in the sides."
4,"##Product\nName: Betsy Oversized\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Oversized, V-neck jumper in a soft, loose knit containing some wool and alpaca wool. Dropped shoulders, long, wide sleeves, wide ribbing around the neckline, cuffs and hem, and slits in the sides."
5,"##Product\nName: Wow Ole Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Top in lightweight sweatshirt fabric with a small embroidered motif at the top and ribbing around the neckline, cuffs and hem."
6,"##Product\nName: Vic Volume Sleeve Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Slightly shorter jumper in a soft double knit with a round neckline, dropped shoulders, long balloon sleeves with close-fitting ribbing at the cuffs, and a ribbed hem."
7,"##Product\nName: Tuck cropped sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in a soft, loose knit with low dropped shoulders and ribbing at the cuffs and hem. Longer at the back."
8,"##Product\nName: Macy\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Oversized jumper in a soft knit containing some wool with a ribbed polo neck, low dropped shoulders, long sleeves, and ribbing at the cuffs and hem. The polyester content of the jumper is recycled."
9,"##Product\nName: All in sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Cropped top in sweatshirt fabric with long sleeves and ribbing around the neckline, cuffs and hem."


### Try Yourself

Experiment with your own prompts!

In [40]:
res = kg_vector_search.similarity_search('type your prompt here!', k=10)
pd.DataFrame([{'document': d.page_content} for d in res])

## Semantic Search with Context (Graph Patterns)
__Using Graph Patterns to Improve Context in Search & Retrieval__

Above, we saw how you can use the vector index to find semantically similar products for user searches. This is an extremely powerful tool; however, it is not the be-all and end-all. It doesn't consider much of the customer data and isn't very personalized. Furthermore, some search prompts, like "Oversized Sweaters," are very general and can match a large number of products, many of which won't be relevant to the specific user conducting the search.

We have a rich knowledge graph full of customer information; let's see how to leverage it to improve the search experience.

### Explore in Aura
To understand how to better leverage our graph, let's explore in Neo4j Browser on our Aura instance.

#### Exploring the Graph
First, let's check the schema by running the following:
```
CALL db.schema.visualization()
```

We can also use Cypher to sample the graph. Run the below query in Browser and explore the results:
```
MATCH (p:Product)<-[v:VARIANT_OF]-(a:Article)<-[t:PURCHASED]-(c:Customer)
RETURN * LIMIT 150
```

You should get something that looks like the below.  Notice the multi-hop connections between customers based on purchases. This is valuable information encoded in our graph!

<img src="img/sample-query.png" alt="summary" width="1000"/>

#### Understanding Shared Customer Behavior

Now let's consider a single customer's purchase history.  We will choose the below customer by setting customerId as a parameter.

```
:params {customerId:'daae10780ecd14990ea190a1e9917da33fe96cd8cfa5e80b67b4600171aa77e0'}
```

Then we can run the below Cypher to pull history:

```
MATCH(c:Customer {customerId: $customerId})-[t:PURCHASED]->(:Article)
-[:VARIANT_OF]->(p:Product)
RETURN p.productCode AS productCode,
    p.prodName AS prodName,
    p.productTypeName AS productTypeName,
    p.garmentGroupName AS garmentGroupName,
    p.detailDesc AS detailDesc,
    t.tDat AS purchaseDate
ORDER BY t.tDat DESC
```
Expected results:
<img src="img/purchase-history.png" alt="summary" width="1000"/>

These purchases are ordered by transaction date. The most recent purchases should be the "Tove Top" and the "Rosemary Dress".

Now let's consider just the latest products in the above list and see what else we could recommend to customers who liked them.  The following Cypher query provides potential answers by finding the most popular products among customers who purchased these.

```
//pull the latest purchases
MATCH(c:Customer {customerId: $customerId})-[t:PURCHASED]->()
WITH max(t.tDat) AS latestPurchases
//find related products based on customer purchases
MATCH(c:Customer {customerId: $customerId})-[:PURCHASED {tDat: latestPurchases}]->(:Article)<-[:PURCHASED]-(:Customer)-[:PURCHASED]->(:Article)
    -[:VARIANT_OF]->(p:Product)
RETURN p.productCode AS productCode,
    p.prodName AS prodName,
    p.productTypeName AS productTypeName,
    p.garmentGroupName AS garmentGroupName,
    count(*) AS commonPurchaseScore,
    p.detailDesc AS detailDesc
ORDER BY commonPurchaseScore DESC
```

Expected results:
<img src="img/related-products.png" alt="summary" width="1000"/>

__You will see that some of the above results seem intuitive...but not all of them right away...and that is exactly the point!
There is information encoded inside the knowledge graph about customer preferences that isn't inferable from the product text documents.__

__This is one example of where enterprise-specific data, expressed as structured relationships, contains critical information that is impossible to find elsewhere. This is why, for many real-world applications, you should consider backing semantic search and GenAI with Knowledge Graphs.__

Now let’s see how to apply this pattern in our semantic search and retrieval!

### Personalizing Results Based on Customer Behavior in the Graph

As we saw in Browser, an important piece of information expressed in this graph, but not directly in the product documents and text embeddings, is customer purchasing behavior.  We saw that we can use graph patterns in Cypher to extract insights from these. Now that we know how this pattern works, we can apply it to our semantic search to make results more personalized.

To do this, we append a MATCH statement to the end of our initial vector search query. Once the vector search returns the candidate products, we score each one with the co-purchase pattern from the query above and use that score to re-rank the search results.

LangChain makes this easy by accepting a `retrieval_query` argument where we can supply the pattern we need.

In [17]:
CUSTOMER_ID = "daae10780ecd14990ea190a1e9917da33fe96cd8cfa5e80b67b4600171aa77e0"

kg_personalized_search = Neo4jVector.from_existing_index(
    embedding=embedding_model,
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    index_name='product-text-embeddings',
    retrieval_query=f"""
    WITH node AS product, score AS searchScore

    OPTIONAL MATCH(product)<-[:VARIANT_OF]-(:Article)<-[:PURCHASED]-(:Customer)
    -[:PURCHASED]->(a:Article)<-[:PURCHASED]-(:Customer {{customerId: '{CUSTOMER_ID}'}})

    WITH count(a) AS purchaseScore, product.text AS text, searchScore, product.productCode AS productCode
    RETURN text,
        (1+purchaseScore)*searchScore AS score,
        {{productCode: productCode, purchaseScore:purchaseScore, searchScore:searchScore}} AS metadata
    ORDER BY purchaseScore DESC, searchScore DESC LIMIT 15
    """)

Now let's run it to see if/how our results have changed.

In [18]:
res = kg_personalized_search.similarity_search(search_prompt, k=100)

# Visualize as a dataframe
pd.DataFrame([{'productCode': d.metadata['productCode'],
               'document': d.page_content,
               'searchScore': d.metadata['searchScore'],
               'purchaseScore': d.metadata['purchaseScore']} for d in res])

Unnamed: 0,productCode,document,searchScore,purchaseScore
0,775996,"##Product\nName: Alex sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Boxy-style jumper in a soft knit containing some wool with a round neck, dropped shoulders and long sleeves. Ribbing around the neckline, cuffs and hem.",0.785467,8
1,736156,"##Product\nName: PRICE ITEM: Katya price\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Long-sleeved, off-the-shoulder top in soft, patterned sweatshirt fabric with elastication at the top and ribbing at the cuffs and hem. Soft brushed inside.",0.790936,2
2,669682,"##Product\nName: Irma sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Top in printed sweatshirt fabric with dropped shoulders, long sleeves and ribbing around the neckline, cuffs and hem.",0.790334,2
3,539291,"##Product\nName: Neve Off Shoulder\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Off-the-shoulder jumper in a soft, fine knit containing some wool with a wide foldover top, long sleeves and ribbing at the cuffs and hem.",0.787015,2
4,693917,"##Product\nName: Belinda\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in a soft rib knit with a V-neck front and back, dropped shoulders and long sleeves.",0.786306,2
...,...,...,...,...
10,679895,"##Product\nName: The holiday\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft, patterned knit with a ribbed polo neck, low dropped shoulders and wide ribbing at the cuffs and hem. Relaxed fit.",0.815362,0
11,789808,##Product\nName: Skylar Chunky Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Short jumper in a soft rib knit with low dropped shoulders and long sleeves.,0.809108,0
12,783925,"##Product\nName: Puff sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in a soft, fine knit containing some wool. Relaxed fit with gently dropped shoulders and ribbing around the neckline, cuffs and hem.",0.806341,0
13,672748,"##Product\nName: Sheffield Sweater\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Turtleneck jumper in a soft, textured knit with dropped shoulders, long sleeves and high slits in the sides.",0.806158,0
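
The re-ranking arithmetic in the retrieval query can be reproduced outside the database. The following is a minimal sketch in plain Python using four rows copied from the output above; the `rerank` helper is a hypothetical name for illustration and is not part of LangChain or Neo4j.

```python
# (productCode, searchScore, purchaseScore) copied from the results above
rows = [
    (679895, 0.815362, 0),
    (789808, 0.809108, 0),
    (775996, 0.785467, 8),
    (736156, 0.790936, 2),
]

def rerank(rows):
    """Mirror the retrieval query: score = (1 + purchaseScore) * searchScore,
    ordered by purchaseScore, then searchScore, both descending."""
    scored = [
        {"productCode": code, "searchScore": s, "purchaseScore": p, "score": (1 + p) * s}
        for code, s, p in rows
    ]
    return sorted(scored, key=lambda r: (r["purchaseScore"], r["searchScore"]), reverse=True)

for r in rerank(rows):
    print(r["productCode"], r["purchaseScore"], round(r["score"], 4))
```

Products with no co-purchase signal keep their plain search score, since `(1 + 0) * searchScore` leaves it unchanged, and they sort below any product with a positive `purchaseScore`. That is why "The holiday", the top hit in the plain vector search, now appears further down the list.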


## Augmenting Semantic Search with Knowledge Graph Inference & ML

We saw above how to use graph pattern matching to personalize semantic search and make it more contextually relevant.

In addition to this, we also have [Graph Data Science algorithms and machine learning](https://neo4j.com/docs/graph-data-science/current/introduction/), which allow you to enrich your knowledge graph with additional properties, relationships, and graph metrics. These can in turn be leveraged in search and retrieval to improve and augment results.

We will walk through an example of this below, where we use Graph Data Science to augment retrieval with additional product recommendations.


### Graph Embedding

We will begin by creating Node Embeddings.

In [19]:
pd.set_option('display.max_rows', 20)
pd.set_option('display.max_colwidth', 500)
pd.set_option('display.width', 0)

In [20]:
def clear_all_graphs():
    g_names = gds.graph.list().graphName.tolist()
    for g_name in g_names:
        g = gds.graph.get(g_name)
        g.drop()

#### Clear Past Analysis (If rerunning this Notebook)

In [21]:
clear_all_graphs()

In [22]:
gds.run_cypher('''
    MATCH(:Article)-[r:CUSTOMERS_ALSO_LIKE]->()
    CALL {
        WITH r
        DELETE r
    } IN TRANSACTIONS OF 1000 ROWS
    ''')

#### Apply Fast Random Projection (FastRP) Node Embedding

First, apply a graph projection to structure the portion of the graph we need in an optimized in-memory format for graph ML.

In [23]:
%%time

# graph projection
gds.run_cypher('''
   MATCH (a1:Article)<-[:PURCHASED]-(:Customer)-[:PURCHASED]->(a2:Article)
   WITH gds.graph.project("proj", a1, a2,
       {sourceNodeLabels: labels(a1),
       targetNodeLabels: labels(a2),
       relationshipType: "COPURCHASE"}) AS g
   RETURN g.graphName
   ''')

g = gds.graph.get("proj")

CPU times: user 7.64 ms, sys: 0 ns, total: 7.64 ms
Wall time: 2.32 s
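
To see what the projection contains, here is a small sketch of the co-purchase pattern in plain Python. The toy purchase data and the `copurchase_edges` helper are hypothetical; the sketch assumes each customer bought each article at most once, so every ordered pair of distinct articles bought by the same customer becomes one `COPURCHASE` edge.

```python
from itertools import permutations

# Hypothetical toy data: customer -> purchased article IDs
purchases = {
    "c1": ["a1", "a2", "a3"],
    "c2": ["a2", "a3"],
    "c3": ["a4"],
}

def copurchase_edges(purchases):
    """List (source, target) article pairs that share at least one customer."""
    edges = []
    for articles in purchases.values():
        # every ordered pair of distinct articles bought by the same customer
        edges.extend(permutations(sorted(set(articles)), 2))
    return edges

edges = copurchase_edges(purchases)
print(len(edges), edges.count(("a2", "a3")))
```

Pairs bought together by several customers show up more than once, which gives frequently co-purchased articles a stronger connection in the projected graph. Articles whose buyers purchased nothing else, like `a4` here, end up with no edges.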


Next, we will generate node embeddings for similarity calculation.  In this case, we will use FastRP (Fast Random Projection), which is a fast, scalable, and robust embedding algorithm. FastRP starts from sparse random vectors and combines them across each node's neighborhood over several iterations using linear algebra, so nodes with similar neighborhoods end up with similar embeddings.

In [24]:
%%time
# embeddings (writing back Article embeddings in case we want to introspect later)
gds.fastRP.mutate(g, mutateProperty='embedding', embeddingDimension=128, randomSeed=7474, concurrency=4, iterationWeights=[0.0, 1.0, 1.0])
gds.graph.writeNodeProperties(g, ['embedding'], ['Article'])

CPU times: user 8.25 ms, sys: 0 ns, total: 8.25 ms
Wall time: 611 ms


writeMillis                   87
graphName                   proj
nodeProperties       [embedding]
propertiesWritten          13296
Name: 0, dtype: object
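
For intuition only, the core idea of FastRP can be sketched in a few lines of NumPy. This is a heavily simplified illustration and not the GDS implementation: the `fastrp_sketch` function, the sparsity of the random vectors, and the toy adjacency matrix are all assumptions, and options such as degree-based normalization strength and node self-influence are left out.

```python
import numpy as np

def fastrp_sketch(adj, dim, iteration_weights, seed=0):
    """Simplified FastRP-style embedding: average random vectors over neighbors."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    rng = np.random.default_rng(seed)
    # Sparse random starting vectors with entries in {-1, 0, +1}
    random_vectors = rng.choice([-1.0, 0.0, 1.0], size=(n, dim), p=[1/6, 2/3, 1/6])
    # Row-normalized adjacency: multiplying by it averages over each node's neighbors
    degree = adj.sum(axis=1, keepdims=True)
    degree[degree == 0] = 1.0
    averaging = adj / degree
    embedding = np.zeros((n, dim))
    current = random_vectors
    for weight in iteration_weights:
        current = averaging @ current          # one more hop of neighborhood averaging
        norms = np.linalg.norm(current, axis=1, keepdims=True)
        norms[norms == 0] = 1.0
        current = current / norms              # L2-normalize each intermediate vector
        embedding += weight * current          # weighted sum over iterations
    return embedding

# Toy graph: nodes 0 and 1 both connect only to node 2; node 3 is isolated
adj = [[0, 0, 1, 0],
       [0, 0, 1, 0],
       [1, 1, 0, 0],
       [0, 0, 0, 0]]
emb = fastrp_sketch(adj, dim=8, iteration_weights=[0.0, 1.0, 1.0], seed=7474)
print(np.allclose(emb[0], emb[1]))
```

Nodes 0 and 1 have identical neighborhoods, so this sketch gives them identical embeddings, while the isolated node gets an all-zero vector. The t-SNE cell later in the notebook filters out all-zero embeddings before plotting.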

#### Explore Node Embeddings

In [25]:
graph_emb_df = gds.run_cypher('''
MATCH (p:Product)<-[:VARIANT_OF]-(a:Article)-[:FROM_DEPARTMENT]-(d)
RETURN a.articleId AS articleId,
    p.prodName AS productName,
    p.productTypeName AS productTypeName,
    d.departmentName AS departmentName,
    d.sectionName AS sectionName,
    p.detailDesc AS detailDesc,
    a.embedding AS embedding
''')

This is what a sample of the graph embeddings looks like:

In [26]:
graph_emb_df.loc[:3, ['articleId', 'embedding']]

Unnamed: 0,articleId,embedding
0,108775015,"[0.16353672742843628, 0.01389206200838089, -0.09718400239944458, -0.09007294476032257, 0.0450824610888958, -0.19723138213157654, -0.025347081944346428, -0.02713043801486492, -0.027621574699878693, -0.2381921112537384, 0.060289230197668076, -0.024042265489697456, -0.29962483048439026, 0.22438350319862366, -0.1413978636264801, -0.14609390497207642, 0.08577412366867065, 0.06979275494813919, -0.2750788927078247, -0.04091469198465347, 0.217533141374588, 0.07957644015550613, -0.1270035207271576, 0..."
1,108775044,"[0.2270691692829132, -0.17189866304397583, 0.05420907959342003, -0.25370126962661743, 0.046070631593465805, -0.24593515694141388, -0.128816157579422, -0.1815388947725296, -0.020767826586961746, -0.01790419965982437, -0.026331137865781784, 0.009392645210027695, 0.17194950580596924, -0.13308250904083252, 0.026070907711982727, 0.431831955909729, -0.12733137607574463, -0.10455979406833649, -0.10574892163276672, -0.1293722540140152, 0.24306458234786987, 0.07819913327693939, -0.10727328807115555, ..."
2,110065001,"[-0.04546318203210831, -0.006854437291622162, -0.16742649674415588, 0.2357560247182846, 0.09137497842311859, 0.021863341331481934, 0.0894181951880455, -0.3023601174354553, -0.08457326889038086, -0.1523570418357849, 0.2241903841495514, -0.05167902261018753, 0.1952899992465973, -0.03960183635354042, 0.048022035509347916, 0.2635676860809326, 0.41764384508132935, -0.06672824174165726, -0.17451226711273193, 0.07695617526769638, -0.04687446355819702, 0.12635508179664612, -0.07397030293941498, 0.24..."
3,111565001,"[0.194799542427063, 0.07724831998348236, 0.24875333905220032, 0.13024908304214478, 0.18629930913448334, -0.10413847863674164, 0.0004734473768621683, 0.049528490751981735, 0.214283287525177, 0.05470145866274834, -0.04866843670606613, 0.11055487394332886, 0.1459035724401474, 0.131142258644104, -0.0515199713408947, -0.20416304469108582, 0.02435876615345478, 0.031970951706171036, 0.026966720819473267, 0.29037514328956604, 0.17842623591423035, -0.011872466653585434, -0.11300095170736313, 0.091955..."


##### Visualize Node Embeddings
You may skip these next few cells.

In [27]:
# Skip this for Demo
from sklearn.manifold import TSNE

df = graph_emb_df.copy()
filtered_node_df = df[df.embedding.apply(lambda x: np.count_nonzero(x) > 0)].reset_index(drop=True)
# instantiate the TSNE model
tsne = TSNE(n_components=2, random_state=7474, init='random', learning_rate="auto")
# Use the TSNE model to fit and output a 2-d representation
E = tsne.fit_transform(np.stack(filtered_node_df['embedding'], axis=0))

coord_df = pd.concat([filtered_node_df, pd.DataFrame(E, columns=['x', 'y'])], axis=1)
coord_df

Unnamed: 0,articleId,productName,productTypeName,departmentName,sectionName,detailDesc,embedding,x,y
0,108775015,Strap top,Vest top,Jersey Basic,Womens Everyday Basics,Jersey top with narrow shoulder straps.,"[0.16353672742843628, 0.01389206200838089, -0.09718400239944458, -0.09007294476032257, 0.0450824610888958, -0.19723138213157654, -0.025347081944346428, -0.02713043801486492, -0.027621574699878693, -0.2381921112537384, 0.060289230197668076, -0.024042265489697456, -0.29962483048439026, 0.22438350319862366, -0.1413978636264801, -0.14609390497207642, 0.08577412366867065, 0.06979275494813919, -0.2750788927078247, -0.04091469198465347, 0.217533141374588, 0.07957644015550613, -0.1270035207271576, 0...",48.215420,1.254746
1,108775044,Strap top,Vest top,Jersey Basic,Womens Everyday Basics,Jersey top with narrow shoulder straps.,"[0.2270691692829132, -0.17189866304397583, 0.05420907959342003, -0.25370126962661743, 0.046070631593465805, -0.24593515694141388, -0.128816157579422, -0.1815388947725296, -0.020767826586961746, -0.01790419965982437, -0.026331137865781784, 0.009392645210027695, 0.17194950580596924, -0.13308250904083252, 0.026070907711982727, 0.431831955909729, -0.12733137607574463, -0.10455979406833649, -0.10574892163276672, -0.1293722540140152, 0.24306458234786987, 0.07819913327693939, -0.10727328807115555, ...",-43.670410,26.363695
2,110065001,OP T-shirt (Idro),Bra,Clean Lingerie,Womens Lingerie,"Microfibre T-shirt bra with underwired, moulded, lightly padded cups that shape the bust and provide good support. Narrow adjustable shoulder straps and a narrow hook-and-eye fastening at the back. Without visible seams for greater comfort.","[-0.04546318203210831, -0.006854437291622162, -0.16742649674415588, 0.2357560247182846, 0.09137497842311859, 0.021863341331481934, 0.0894181951880455, -0.3023601174354553, -0.08457326889038086, -0.1523570418357849, 0.2241903841495514, -0.05167902261018753, 0.1952899992465973, -0.03960183635354042, 0.048022035509347916, 0.2635676860809326, 0.41764384508132935, -0.06672824174165726, -0.17451226711273193, 0.07695617526769638, -0.04687446355819702, 0.12635508179664612, -0.07397030293941498, 0.24...",15.357679,8.970029
3,111565001,20 den 1p Stockings,Underwear Tights,Tights basic,"Womens Nightwear, Socks & Tigh","Semi shiny nylon stockings with a wide, reinforced trim at the top. Use with a suspender belt. 20 denier.","[0.194799542427063, 0.07724831998348236, 0.24875333905220032, 0.13024908304214478, 0.18629930913448334, -0.10413847863674164, 0.0004734473768621683, 0.049528490751981735, 0.214283287525177, 0.05470145866274834, -0.04866843670606613, 0.11055487394332886, 0.1459035724401474, 0.131142258644104, -0.0515199713408947, -0.20416304469108582, 0.02435876615345478, 0.031970951706171036, 0.026966720819473267, 0.29037514328956604, 0.17842623591423035, -0.011872466653585434, -0.11300095170736313, 0.091955...",74.363419,25.669960
4,111586001,Shape Up 30 den 1p Tights,Leggings/Tights,Tights basic,"Womens Nightwear, Socks & Tigh",Tights with built-in support to lift the bottom. Black in 30 denier and light amber in 15 denier.,"[0.10175400972366333, -0.19288033246994019, 0.025479715317487717, 0.11422780156135559, 0.13807672262191772, -0.20837050676345825, 0.09885844588279724, -0.11696606874465942, 0.1510676145553589, -0.2387872338294983, -0.12651680409908295, 0.3652939796447754, 0.08521981537342072, 0.06372103840112686, -0.3162726163864136, -0.23687461018562317, 0.21365612745285034, 0.002404250204563141, 0.043918803334236145, -0.1964809149503708, -0.0249843280762434, -0.18427029252052307, -0.05020210146903992, 0.09...",38.775784,25.304434
...,...,...,...,...,...,...,...,...,...
13291,936862001,EDC Marla dress,Dress,Campaigns,Womens Everyday Collection,"Calf-length dress in a patterned Tencel™ lyocell weave with a V-neck, sewn in wrapover at the top and decorative ties at one side. 3/4-length dolman sleeves with narrow, covered elastication at the cuffs. Gathered seam at the waist with concealed elastication and a flared skirt with a gathered tier at the hem for added width. Unlined.","[0.30163639783859253, 0.19000965356826782, 0.1051938384771347, 0.14571520686149597, 0.04735825955867767, 0.046574003994464874, 0.009401623159646988, -0.19895312190055847, -0.1485200822353363, -0.2463332563638687, 0.05826284736394882, 0.4833417534828186, -0.03002075105905533, 0.19663190841674805, -0.2691468298435211, -0.07815654575824738, -0.17757371068000793, 0.023756373673677444, 0.07557491213083267, 0.0955217182636261, 0.09620979428291321, 0.1608126312494278, -0.3925545811653137, -0.091149...",55.530907,-59.582829
13292,936979001,Class Filippa Necklace,Necklace,Jewellery,Womens Small accessories,Metal chain necklace with a pendant. Adjustable length.,"[0.1242004930973053, -0.12798795104026794, 0.10056436061859131, -0.050673700869083405, -0.041224025189876556, 0.13280335068702698, 0.20210997760295868, -0.10767792165279388, 0.16874153912067413, 0.05998634546995163, 0.07098144292831421, 0.06702642887830734, 0.0608292855322361, 0.2916552424430847, -0.2182622253894806, -0.1259959191083908, 0.1548045128583908, 0.09055173397064209, 0.20296929776668549, -0.16848692297935486, 0.13646462559700012, 0.09153448045253754, 0.05309563875198364, 0.2007957...",-35.253735,-2.853637
13293,937138001,Flirty Albin bracelet pk,Bracelet,Jewellery Extended,Womens Small accessories,Metal chain bracelets. Two plain and two with pendants. Adjustable length.,"[-0.10720515251159668, -0.17470087110996246, 0.1307799518108368, -0.2808590531349182, 0.3204616904258728, -0.06416068971157074, 0.22469662129878998, -0.11998633295297623, 0.051582593470811844, 0.24307465553283691, -0.021492883563041687, -0.06733442097902298, 0.21141648292541504, 0.17318715155124664, 0.09750582277774811, 0.1768089085817337, -0.08338433504104614, 0.12920521199703217, -0.1821601688861847, -0.19470183551311493, -0.19796964526176453, -0.1408807635307312, -0.10183313488960266, 0.1...",-15.238043,-47.419041
13294,942187001,ED Sasha tee,T-shirt,Jersey,H&M+,"Oversized, straight-cut T-shirt in a soft modal and cotton jersey blend with a ribbed neckline and low dropped shoulders.","[0.12566354870796204, -0.012941320426762104, -0.15599849820137024, -0.12118523567914963, -0.1819937825202942, 0.006172780878841877, -0.19913744926452637, -0.22987841069698334, 0.281649112701416, 0.22842726111412048, 0.03120540641248226, -0.03891172632575035, -0.1694663017988205, -0.13677716255187988, -0.019280079752206802, 0.016364671289920807, -0.05226295068860054, -0.14452752470970154, -0.048252444714307785, 0.2553097903728485, 0.0624568909406662, 0.05895322561264038, -0.3900794982910156, ...",22.431427,-94.231857


In [87]:
# Skip this for Demo
import altair as alt
from sklearn.manifold import TSNE

alt.data_transformers.enable("vegafusion")
chart = alt.Chart(coord_df.sample(n=5000, random_state=7474)).mark_circle(size=60).encode(
 x='x',
 y='y',
 tooltip=['productName', 'productTypeName', 'departmentName' , 'sectionName', 'detailDesc']
).properties(title="Article Embedding (2D Representation)", width=750, height=700)

chart = chart.configure_axis(titleFontSize=20)
chart = chart.configure_legend(labelFontSize=20)  # configure_* returns a new chart, so keep the result
chart

### K-Nearest Neighbors (KNN) Relationships

Now we can run similarity inference with K-Nearest Neighbors (KNN) and write the results back to the graph as `CUSTOMERS_ALSO_LIKE` relationships.
We will use a relatively low similarity cutoff of 0.75 so that more relationships are kept for exploration.
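
To make the role of the cutoff concrete, here is a minimal brute-force sketch in plain Python. It is only an illustration: GDS uses its own approximate KNN algorithm and scoring, and the toy embeddings and the `knn_pairs` helper below are hypothetical, not part of the notebook.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def knn_pairs(embeddings, k=2, cutoff=0.75):
    """Brute-force top-k neighbours per node, dropping pairs below the cutoff."""
    pairs = []
    for src, vec in embeddings.items():
        scored = [(cosine(vec, other), dst)
                  for dst, other in embeddings.items() if dst != src]
        scored.sort(reverse=True)
        pairs += [(src, dst, score) for score, dst in scored[:k] if score >= cutoff]
    return pairs

# Hypothetical toy embeddings, not taken from the dataset
toy = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
pairs = knn_pairs(toy)
```

Here `a` and `b` point at each other, while `c` gets no relationships because none of its neighbours reach the cutoff. Raising the cutoff prunes more pairs; lowering it keeps more for exploration.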

In [28]:
%%time
# KNN
_ = gds.knn.write(g, nodeProperties=['embedding'], nodeLabels=['Article'],
                  writeRelationshipType='CUSTOMERS_ALSO_LIKE', writeProperty='score',
                  sampleRate=1.0, initialSampler='randomWalk', concurrency=1, similarityCutoff=0.75, randomSeed=7474)
_

Knn:   0%|          | 0/100 [00:00<?, ?%/s]

CPU times: user 73.1 ms, sys: 7.92 ms, total: 81.1 ms
Wall time: 5.19 s


ranIterations    6
didConverge

In [29]:
# clear graph projection once done
g.drop()

graphName      proj
database       neo4j
memoryUsage

### Tailored Recommendations from Search

Now let's construct a KG store to retrieve recommendations based on search.

In [30]:
kg_search_recommendations = Neo4jVector.from_existing_index(
    embedding=embedding_model,
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    index_name='product-text-embeddings',
    retrieval_query="""
    WITH node as searchProduct, score as searchScore
    MATCH(searchProduct)<-[:VARIANT_OF]-(:Article)-[r:CUSTOMERS_ALSO_LIKE]->(:Article)-[:VARIANT_OF]-(product)
    WITH  product, searchScore, sum(r.score*searchScore) AS recommenderScore
    RETURN product.text AS text,
    recommenderScore AS score,
    {productCode: product.productCode, productType: product.productTypeName, recommenderScore:recommenderScore} AS metadata
    ORDER BY score DESC LIMIT 100
    """
)

In [22]:
res = kg_search_recommendations.similarity_search(search_prompt, k=100)

# Visualize as a dataframe
pd.DataFrame([{'productCode': d.metadata['productCode'],
               'productType':d.metadata['productType'],
               'document': d.page_content,
               'recommenderScore': d.metadata['recommenderScore']} for d in res])

Unnamed: 0,productCode,productType,document,recommenderScore
0,562252,Trousers,"##Product\nName: Space 5 pkt tregging\nType: Trousers\nGroup: Garment Lower body\nGarment Type: Trousers\nDescription: Skinny-fit treggings in superstretch twill with an elasticated waist, fake front pockets and real back pockets.",5.482903
1,607347,T-shirt,##Product\nName: Beck L/S\nType: T-shirt\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Long-sleeved jersey top with a reversible sequin motif on the front (size 2-3Y with normal sequins). Slightly longer at the back with a gently rounded hem.,3.671158
2,658030,Trousers,"##Product\nName: Push Up Jegging L.W\nType: Trousers\nGroup: Garment Lower body\nGarment Type: Trousers Denim\nDescription: 5-pocket jeggings in washed, stretch denim with a low waist, zip fly and button, and skinny legs. Push up – denim with a superstretch function that showcases the body’s physique.",3.670661
3,863561,Bra,"##Product\nName: Alexis seamless top Rio Opt1\nType: Bra\nGroup: Underwear\nGarment Type: Under-, Nightwear\nDescription: Soft, non-wired bra top in ribbed fabric designed with the minimum number of seams for a seamless, comfortable feel against the skin. Adjustable shoulder straps and padded cups that shape the bust and provide good support. No fasteners.",2.771572
4,562252,Trousers,"##Product\nName: Space 5 pkt tregging\nType: Trousers\nGroup: Garment Lower body\nGarment Type: Trousers\nDescription: Skinny-fit treggings in superstretch twill with an elasticated waist, fake front pockets and real back pockets.",2.738821
...,...,...,...,...
95,884432,Underwear bottom,"##Product\nName: Tummy control thong\nType: Underwear bottom\nGroup: Underwear\nGarment Type: Under-, Nightwear\nDescription: Thong briefs in jersey with a high waist and lined gusset. The briefs have a firm sculpting effect on the tummy.",0.925685
96,572399,Sweater,"##Product\nName: Elin Sweatshirt\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Jersey Basic\nDescription: Oversized top in lightweight sweatshirt fabric with dropped shoulders, long sleeves and ribbing around the neckline, cuffs and hem. Brushed inside.",0.925685
97,731119,Earring,##Product\nName: Flirty Miki stud pk\nType: Earring\nGroup: Accessories\nGarment Type: Accessories\nDescription: Metal stud earrings in various sizes and designs. Size from 0.3 cm to 2 cm.,0.925685
98,639552,Skirt,"##Product\nName: KYLE SKIRT\nType: Skirt\nGroup: Garment Lower body\nGarment Type: Skirts\nDescription: Short 5-pocket skirt in washed denim with hard-worn details, a zip fly and frayed hem.",0.925685
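
Product 562252 appears twice in the output above (rows 0 and 4). One likely explanation is that Cypher treats every non-aggregated expression in a `WITH` clause as a grouping key, so `WITH product, searchScore, sum(...)` groups by product and search score together. The sketch below mimics that aggregation in plain Python. The rows are hypothetical values, and grouping by product only is shown as an alternative, not as the notebook's method.

```python
from collections import defaultdict

# Hypothetical rows as the MATCH might produce them: (product, searchScore, r.score)
rows = [
    ("562252", 0.93, 0.98),
    ("562252", 0.93, 0.97),
    ("562252", 0.91, 0.99),
    ("607347", 0.93, 0.96),
]

def aggregate(rows, key):
    """Sum r.score * searchScore per grouping key, highest total first."""
    totals = defaultdict(float)
    for product, search_score, rel_score in rows:
        totals[key(product, search_score)] += rel_score * search_score
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Grouping as in the query: product AND searchScore are both grouping keys
by_product_and_search = aggregate(rows, key=lambda p, s: (p, s))
# Alternative: group by product only, so each product appears once
by_product = aggregate(rows, key=lambda p, s: p)
```

With these toy rows, the first grouping yields two separate entries for product 562252, while the second merges them into one. In Cypher, dropping `searchScore` from the `WITH` list would have the same merging effect.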


## LLM For Generating Grounded Content

Let's use an LLM to automatically generate content for targeted marketing campaigns grounded with our knowledge graph using the above tools.
Here is a quick example for generating promotional messages, but you can create all sorts of content with this!

For our first message, let's consider a scenario where a user recently searched for products, but perhaps didn't commit to a purchase yet. We now want to send a message to promote relevant products.

In [31]:
# Import relevant libraries
from langchain.prompts import SystemMessagePromptTemplate, HumanMessagePromptTemplate, ChatPromptTemplate
from langchain.schema import StrOutputParser
from langchain.llms.bedrock import Bedrock

In [57]:
llm = Bedrock(model_id="anthropic.claude-v2", client=bedrock,
              model_kwargs = {
                  "temperature":0,
                  "anthropic_version":"bedrock-2023-05-31",
                  "max_tokens_to_sample": 2048
              })

### Create Knowledge Graph Stores for Retrieval

To ground our content generation, we need to define retrievers to pull information from our knowledge graph.  Let's make two stores:
1. Personalized Search Retriever (`kg_personalized_search_gen`): Based on recent customer searches and purchase history, pull relevant products
2. Recommendations retriever (`kg_recommendations_bot1`): Based on recent customer searches, what else may we recommend to them?


In [58]:
# This will be a function so we can change per customer id
# We will use a mock URL for our sources in the metadata
def kg_personalized_search_gen(customer_id):
    return Neo4jVector.from_existing_index(
        embedding=embedding_model,
        url=NEO4J_URI,
        username=NEO4J_USERNAME,
        password=NEO4J_PASSWORD,
        index_name='product-text-embeddings',
        retrieval_query=f"""
        WITH node AS product, score AS searchScore

        OPTIONAL MATCH(product)<-[:VARIANT_OF]-(:Article)<-[:PURCHASED]-(:Customer)
        -[:PURCHASED]->(a:Article)<-[:PURCHASED]-(:Customer {{customerId: '{customer_id}'}})
        WITH count(a) AS purchaseScore, product, searchScore
        RETURN product.text + '\nurl: ' + 'https://representative-domain/product/' + product.productCode  AS text,
            (1.0+purchaseScore)*searchScore AS score,
            {{source: 'https://representative-domain/product/' + product.productCode}} AS metadata
        ORDER BY purchaseScore DESC, searchScore DESC LIMIT 5

    """
    )
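
The scoring in this retrieval query can be read as "boost the vector search score by the number of shared-purchase paths". Here is a small plain-Python sketch of that ranking under the same formula. The candidate values and the `personalized_rank` helper are hypothetical.

```python
def personalized_rank(candidates, limit=5):
    """Mimic the query: score = (1 + purchaseScore) * searchScore,
    ordered by purchaseScore DESC, searchScore DESC, then limited."""
    ranked = sorted(candidates,
                    key=lambda c: (c["purchaseScore"], c["searchScore"]),
                    reverse=True)
    return [{"productCode": c["productCode"],
             "score": (1.0 + c["purchaseScore"]) * c["searchScore"]}
            for c in ranked[:limit]]

# Hypothetical candidates returned by the vector index
candidates = [
    {"productCode": "A", "searchScore": 0.95, "purchaseScore": 0},
    {"productCode": "B", "searchScore": 0.90, "purchaseScore": 3},
    {"productCode": "C", "searchScore": 0.92, "purchaseScore": 3},
]
top = personalized_rank(candidates)
```

Because the query uses `OPTIONAL MATCH`, a product with no shared-purchase paths gets a `purchaseScore` of 0 and keeps its plain search score, as product `A` does here.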

In [59]:
# Use the same tailored search recommendations as above but with a smaller limit
kg_recommendations_bot1 = Neo4jVector.from_existing_index(
    embedding=embedding_model,
    url=NEO4J_URI,
    username=NEO4J_USERNAME,
    password=NEO4J_PASSWORD,
    index_name='product-text-embeddings',
    retrieval_query="""
    WITH node as searchProduct, score as searchScore
    MATCH(searchProduct)<-[:VARIANT_OF]-(:Article)-[r:CUSTOMERS_ALSO_LIKE]->(:Article)-[:VARIANT_OF]-(product)
    WITH  product, searchScore, sum(r.score*searchScore) AS recommenderScore
    RETURN product.text + '\nurl: ' + 'https://representative-domain/product/' + product.productCode  AS text,
    recommenderScore AS score,
    {source: 'https://representative-domain/product/' + product.productCode} AS metadata
    ORDER BY score DESC LIMIT 5
    """
)

### Prompt Engineering

Now let's define our prompts. We will combine two:
1. A system prompt which, in this case, tells the LLM how to generate the message
2. A human prompt that just wraps the search prompt entered by the customer

This will allow us to pass the customer search to the retrievers and also to the LLM for additional context when drafting the message.


In [72]:
general_system_template = '''
You are a personal assistant named Sally for a fashion, home, and beauty company called HRM.
write an email to {customerName}, one of your customers, to promote and summarize products relevant for them given the current season / time of year: {timeOfYear} .
Please only mention the Products listed below. Also add some other outfit recommendations based on the "Customer May Also Be Interested In" section. Do not come up with or add any new products to the list.
Each product comes with an https `url` field. Make sure to provide that https url with descriptive name text in markdown for each product.

---
# Relevant Products:
{searchProds}

# Customer May Also Be Interested In:
{recProds}
---
'''
general_user_template = "{searchPrompt}"
messages = [
    SystemMessagePromptTemplate.from_template(general_system_template),
    HumanMessagePromptTemplate.from_template(general_user_template),
]
prompt = ChatPromptTemplate.from_messages(messages)
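
As a rough mental model of what happens to the placeholders, the sketch below fills a shortened version of the two templates with plain `str.format`. It is an approximation: LangChain's prompt classes do more than this, and `render_messages` and the sample values are hypothetical.

```python
# Shortened, hypothetical versions of the system and human templates
system_template = (
    "Write an email to {customerName} for {timeOfYear}.\n"
    "# Relevant Products:\n{searchProds}\n"
    "# Customer May Also Be Interested In:\n{recProds}"
)
user_template = "{searchPrompt}"

def render_messages(values):
    """Return (role, text) pairs with every placeholder substituted."""
    return [("system", system_template.format(**values)),
            ("human", user_template.format(**values))]

rendered = render_messages({
    "customerName": "Alex Smith",
    "timeOfYear": "Nov, 2023",
    "searchProds": "Product 1\n\nProduct 2",
    "recProds": "Product 3",
    "searchPrompt": "oversized sweaters",
})
```

The same search prompt ends up in two places: it drives retrieval for `searchProds` and `recProds`, and it is also passed verbatim as the human message.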

### Create a Chain

Now let's put a chain together that will leverage the retrievers, prompts, and LLM. This is where LangChain shines, putting RAG together in a simple way.

In addition to the personalized search and recommendations context, we will allow for some other parameters.

1. `timeOfYear`: The time of year as a date, season, month, etc. so the LLM can tailor the language appropriately.
2. `customerName`: Ordinarily, this can be pulled from the DB, but it has been scrubbed to maintain anonymity so we will provide our own name here.

You can potentially add other creative parameters here to help the LLM write relevant messages.


In [73]:
# Helper function
def format_docs(docs):
    return "\n\n".join([d.page_content for d in docs])

def chain_gen(customer_id):
    return ({'searchProds': (lambda x:x['searchPrompt']) | kg_personalized_search_gen(customer_id).as_retriever(search_kwargs={"k": 100}) | format_docs,
              'recProds': (lambda x:x['searchPrompt']) | kg_recommendations_bot1.as_retriever(search_kwargs={"k": 5}) | format_docs,
              'customerName': lambda x:x['customerName'],
              'timeOfYear': lambda x:x['timeOfYear'],
              "searchPrompt":  lambda x:x['searchPrompt']}
             | prompt
             | llm
             | StrOutputParser())
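
The chain above follows a fan-out pattern: every entry in the dictionary is computed from the same input, and the combined values then flow through the prompt and the model. Below is a plain-Python sketch of that data flow. `run_chain`, `fake_search`, and `fake_llm` are hypothetical stand-ins, not LangChain APIs.

```python
def run_chain(inputs, branches, render_prompt, llm):
    """Evaluate every branch on the same inputs, then prompt -> llm."""
    values = {name: branch(inputs) for name, branch in branches.items()}
    return llm(render_prompt(values))

# Hypothetical stand-ins for a retriever and the model
fake_search = lambda query: [f"{query} product 1", f"{query} product 2"]
fake_llm = lambda prompt: prompt.upper()

branches = {
    "searchProds": lambda x: "\n\n".join(fake_search(x["searchPrompt"])),
    "customerName": lambda x: x["customerName"],
}
output = run_chain(
    {"searchPrompt": "boots", "customerName": "Alex"},
    branches,
    render_prompt=lambda v: f"Dear {v['customerName']}:\n{v['searchProds']}",
    llm=fake_llm,
)
```

Swapping a branch (for example, a different retriever for `searchProds`) changes the context without touching the prompt or the model step.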

### Example Runs

In [74]:
chain = chain_gen(CUSTOMER_ID)

In [75]:
print(chain.invoke({'searchPrompt':search_prompt, 'customerName':'Alex Smith', 'timeOfYear':'Nov, 2023'}))

 Hello Alex,

I hope you are doing well! As the weather gets cooler, it's time to break out the cozy sweaters. Here are some oversized sweater options I think you would love for fall:

## Alex sweater
This boxy-style jumper has a round neckline and dropped shoulders for a relaxed, oversized fit. It's knit with a soft wool blend that will keep you warm. [Alex sweater](https://representative-domain/product/775996)

## PRICE ITEM: Katya price 
This long-sleeved, off-the-shoulder sweater has a slouchy, oversized shape. It's made of a super soft printed sweatshirt fabric. [PRICE ITEM: Katya price](https://representative-domain/product/736156)  

## Irma sweater
This printed sweatshirt top has an oversized, slouchy fit with dropped shoulders and long sleeves. It's perfect for a casual fall look. [Irma sweater](https://representative-domain/product/669682)

## Neve Off Shoulder
This off-the-shoulder jumper has a wide foldover neckline for a relaxed, slouchy fit. It's knit from a soft, fine wo

In [76]:
print(chain.invoke({'searchPrompt':"western boots", 'customerName':'Alex Smith', 'timeOfYear':'Nov, 2023'}))

 Dear Alex,

I hope you are doing well! With November here, it's the perfect time to update your fall wardrobe. Based on your interest in western boots, here are some great options we currently have available:

- [Harry hiking boot](https://representative-domain/product/817484) - Sturdy canvas hiking boots with a chunky platform. Perfect for outdoor activities.

- [Patsy Platform](https://representative-domain/product/752857) - Faux leather platform boots with decorative details. Great for adding height. 

- [Milla sockboot](https://representative-domain/product/809521) - Sleek suede sock boots with a block heel. So chic and comfortable.

- [West puffer boot waterproof SB](https://representative-domain/product/646691) - Cozy, waterproof boots lined with faux fur. Ideal for cold weather.

- [WILDER](https://representative-domain/product/458032) - Versatile ankle boots with a low heel. An everyday essential.

I'd also recommend layering some of these boots with these stylish tops:

- [CA

Feel free to experiment and try more!

### Demo App
Now let’s use the above tools to create a demo app with Gradio. We will need a couple more functions, but otherwise it is easy to launch from a notebook!

In [77]:
# Create a means to generate and cache chains...so we can quickly try different customer ids
personalized_search_chain_cache = dict()
def get_chain(customer_id):
    if customer_id in personalized_search_chain_cache:
        return personalized_search_chain_cache[customer_id]
    chain = chain_gen(customer_id)
    personalized_search_chain_cache[customer_id] = chain
    return chain
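
The same caching idea can also be expressed with `functools.lru_cache` from the standard library. This is an alternative sketch, not what the notebook uses; `build_chain` is a hypothetical stand-in for `chain_gen` so the example runs without a database.

```python
from functools import lru_cache

calls = []

def build_chain(customer_id):
    """Hypothetical stand-in for chain_gen; records each construction."""
    calls.append(customer_id)
    return f"chain-for-{customer_id}"

@lru_cache(maxsize=128)
def get_chain_cached(customer_id):
    """Build the chain once per customer id and reuse it afterwards."""
    return build_chain(customer_id)

first = get_chain_cached("c1")
second = get_chain_cached("c1")  # served from the cache, no rebuild
```

Unlike the plain dictionary, `lru_cache` bounds the number of cached chains (`maxsize`), which may matter if many customer ids are tried in one session.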

In [78]:
import gradio as gr

def message_generator(*x):
    chain = get_chain(x[0])
    return chain.invoke({'searchPrompt':x[3], 'customerName':x[2], 'timeOfYear': x[1]})

customer_id = gr.Textbox(value=CUSTOMER_ID, label="Customer ID")
time_of_year = gr.Textbox(value="Nov, 2023", label="Time Of Year")
search_prompt_txt = gr.Textbox(value='Oversized Sweaters', label="Customer Interest(s)")
customer_name = gr.Textbox(value='Alex Smith', label="Customer Name")
message_result = gr.Markdown( label="Message")

demo = gr.Interface(fn=message_generator,
                    inputs=[customer_id, time_of_year, customer_name, search_prompt_txt],
                    outputs=message_result,
                    title="🪄 Message Generator 🥳")
demo.launch(share=True, debug=True)

Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://ed6f60ab371c4b408e.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)



Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://ed6f60ab371c4b408e.gradio.live




## Extra Credit: Demo App - Directly to Recommendations
There are lots of different ways we can configure this.  Let's try a shorter version that cuts right to personalized recommendations and makes an in-season pun.

### Personalized Recommendations

To do this, let's look at personalized recommendations for a bit.  To keep things simple, we will base this just on purchase history, not search, though we could do both if we wanted to (similar to what we did in the above Semantic Search with context section).

First, we will start by creating a Neo4jGraph object that we can then query. This is different from the vector-based retrievers above.

In [79]:
from langchain.graphs import Neo4jGraph

kg = Neo4jGraph(url=NEO4J_URI, username=NEO4J_USERNAME, password=NEO4J_PASSWORD)

In [80]:
res = kg.query('''
    MATCH(:Customer {customerId:$customerId})-[:PURCHASED]->(:Article)
    -[r:CUSTOMERS_ALSO_LIKE]->(:Article)-[:VARIANT_OF]->(product)
    RETURN product.productCode AS productCode,
        product.prodName AS prodName,
        product.productTypeName AS productType,
        product.text AS document,
        sum(r.score) AS recommenderScore
    ORDER BY recommenderScore DESC LIMIT $k
    ''', params={'customerId': CUSTOMER_ID, 'k':15})

#visualize as dataframe. result is list of dict
pd.DataFrame(res)

Unnamed: 0,productCode,prodName,productType,document,recommenderScore
0,731142,Lead Superskinny,Trousers,"##Product\nName: Lead Superskinny\nType: Trousers\nGroup: Garment Lower body\nGarment Type: Trousers\nDescription: Chinos in stretch twill with a zip fly and button, side pockets, welt back pockets and skinny legs.",17.999867
1,598806,Dixie tee,T-shirt,##Product\nName: Dixie tee\nType: T-shirt\nGroup: Garment Upper body\nGarment Type: Jersey Fancy\nDescription: Short top in soft cotton jersey with short sleeves. Contrasting colour trims around the neckline and sleeves.,14.999883
2,682848,Skinny RW Ankle Milo Zip,Trousers,"##Product\nName: Skinny RW Ankle Milo Zip\nType: Trousers\nGroup: Garment Lower body\nGarment Type: Trousers Denim\nDescription: 5-pocket, ankle-length jeans in washed stretch denim with hard-worn details, a regular waist, zip fly and button, and skinny legs with a zip at the hems. The jeans are made partly from recycled cotton.",14.999882
3,753724,Rosemary,Dress,"##Product\nName: Rosemary\nType: Dress\nGroup: Garment Full body\nGarment Type: Dresses Ladies\nDescription: Short dress in woven fabric with 3/4-length sleeves with an opening and ties at the cuffs, and a gently rounded hem. Unlined.",14.999872
4,511924,Leona Push Mirny,Bra,"##Product\nName: Leona Push Mirny\nType: Bra\nGroup: Underwear\nGarment Type: Under-, Nightwear\nDescription: Push-up bra in lace and mesh with underwired, moulded, padded cups for a larger bust and fuller cleavage. Lace racer back, narrow adjustable shoulder straps, a wide mesh strap at the back and metal fastener at the front.",13.999889
5,569974,DONT USE ROLAND HOOD,Hoodie,"##Product\nName: DONT USE ROLAND HOOD\nType: Hoodie\nGroup: Garment Upper body\nGarment Type: Jersey Basic\nDescription: Top in sweatshirt fabric with a lined drawstring hood, kangaroo pocket, long raglan sleeves and ribbing at the cuffs and hem.",13.999882
6,656401,PASTRY SWEATER,Sweater,"##Product\nName: PASTRY SWEATER\nType: Sweater\nGroup: Garment Upper body\nGarment Type: Knitwear\nDescription: Jumper in soft, textured-knit cotton with long raglan sleeves and ribbing around the neckline, cuffs and hem.",12.999896
7,660519,Haven back detail,Bra,"##Product\nName: Haven back detail\nType: Bra\nGroup: Underwear\nGarment Type: Under-, Nightwear\nDescription: Push-up bra in lace and mesh with underwired, moulded, padded cups for a larger bust and fuller cleavage. Lace racer back, narrow adjustable shoulder straps, a wide mesh strap at the back and a metal fastener at the front.",10.999906
8,752193,Banks,Hoodie,"##Product\nName: Banks\nType: Hoodie\nGroup: Garment Upper body\nGarment Type: Jersey Basic\nDescription: Long-sleeved top in sweatshirt fabric made from a cotton blend with a double-layered hood, gently dropped shoulders and ribbing at the cuffs and hem. Soft brushed inside.",9.999917
9,606711,Rylee flatform,Heeled sandals,"##Product\nName: Rylee flatform\nType: Heeled sandals\nGroup: Shoes\nGarment Type: Shoes\nDescription: Sandals with imitation suede straps, an elastic heel strap and wedge heels. Satin insoles and thermoplastic rubber (TPR) soles. Platform front 2 cm, heel 6 cm.",9.999916
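
The traversal behind these scores (purchased articles, then similar articles, then their products) can be mimicked with plain dictionaries. The miniature graph and the `recommend` helper below are hypothetical and only illustrate how `sum(r.score)` accumulates per product.

```python
from collections import defaultdict

# Hypothetical miniature graph
purchased = {"cust1": ["art1", "art2"]}               # (:Customer)-[:PURCHASED]->(:Article)
also_like = {"art1": [("art3", 0.9), ("art4", 0.8)],  # (:Article)-[:CUSTOMERS_ALSO_LIKE {score}]->(:Article)
             "art2": [("art3", 0.95)]}
variant_of = {"art3": "prodX", "art4": "prodY"}       # (:Article)-[:VARIANT_OF]->(product)

def recommend(customer_id, k=15):
    """Sum similarity scores per product over all purchase paths, top k first."""
    totals = defaultdict(float)
    for article in purchased.get(customer_id, []):
        for similar, score in also_like.get(article, []):
            totals[variant_of[similar]] += score
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]

recs = recommend("cust1")
```

A product reached through several purchased articles accumulates score from each path, which is why totals can exceed 1.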


### Creating The Demo App

Now we can create a function that retrieves context for the LLM chain, based on our personalized recommendations example.

In [81]:
def kg_recommendations_app2(customer_id, k=30):
    res = kg.query("""
    MATCH(:Customer {customerId:$customerId})-[:PURCHASED]->(:Article)
    -[r:CUSTOMERS_ALSO_LIKE]->(:Article)-[:VARIANT_OF]->(product)
    RETURN product.text + '\nurl: ' + 'https://representative-domain/product/' + product.productCode  AS text,
        sum(r.score) AS recommenderScore
    ORDER BY recommenderScore DESC LIMIT $k
    """, params={'customerId': customer_id, 'k':k})

    return "\n\n".join([d['text'] for d in res])

In [82]:
# test out
print(kg_recommendations_app2(CUSTOMER_ID))

##Product
Name: Lead Superskinny
Type: Trousers
Group: Garment Lower body
Garment Type: Trousers
Description: Chinos in stretch twill with a zip fly and button, side pockets, welt back pockets and skinny legs.
url: https://representative-domain/product/731142

##Product
Name: Dixie tee
Type: T-shirt
Group: Garment Upper body
Garment Type: Jersey Fancy
Description: Short top in soft cotton jersey with short sleeves. Contrasting colour trims around the neckline and sleeves.
url: https://representative-domain/product/598806

##Product
Name: Skinny  RW Ankle Milo Zip
Type: Trousers
Group: Garment Lower body
Garment Type: Trousers Denim
Description: 5-pocket, ankle-length jeans in washed stretch denim with hard-worn details, a regular waist, zip fly and button, and skinny legs with a zip at the hems. The jeans are made partly from recycled cotton.
url: https://representative-domain/product/682848

##Product
Name: Rosemary
Type: Dress
Group: Garment Full body
Garment Type: Dresses Ladies
Des

Next, we define our prompt.

In [83]:
general_system_template_app2 = '''
You are a personal assistant named Sally for a fashion, home, and beauty company called HRM.
Write an email to {customerName}, one of your customers, to promote and summarize products that fashionably pair with what they searched for given the current season / time of year: {timeOfYear}.
Make an in-season pun too!
Please only choose from the Products listed below. Choose no more than 5. Do not come up with or add any new products to the list.
Each product description comes with a "url" field. make sure to link to the url with descriptive name text for each product so the customer can easily find them.

---
# Relevant Products:
{recProds}
---
'''

general_user_template_app2 = '''Something that goes with {searchPrompt}'''
messages_app2 = [
    SystemMessagePromptTemplate.from_template(general_system_template_app2),
    HumanMessagePromptTemplate.from_template(general_user_template_app2),
]
prompt_app2 = ChatPromptTemplate.from_messages(messages_app2)

Then we can construct a chain for this and run a test example.

In [84]:
from operator import itemgetter
from langchain.schema.runnable import RunnableLambda

chain_app2 = ({'recProds': itemgetter('customerId') |  RunnableLambda(kg_recommendations_app2),
             'customerName': lambda x:x['customerName'],
             'timeOfYear': lambda x:x['timeOfYear'],
             "searchPrompt":  lambda x:x['searchPrompt']}
            | prompt_app2
            | llm
            | StrOutputParser())

In [85]:
print(chain_app2.invoke({'customerId':CUSTOMER_ID, 'searchPrompt':"western boots", 'customerName':'Alex Smith', 'timeOfYear':'Nov, 2023'}))

 Dear Alex,

Howdy partner! I reckon with those western boots, you'll be ready to mosey on down to the ol' town square for a rootin' tootin' good time. Here are some giddy up and go options to pair with your boot-scootin' boots this season:

Check out the [Lead Superskinny](https://representative-domain/product/731142) trousers. These skinny chinos will help you saddle right up and avoid any wagon wheel gaps. 

The [Dixie tee](https://representative-domain/product/598806) is a perfect pony express top. The contrasting colors will help you stand out like a sore thumb at the hoedown. 

Stay warm by the campfire in the [PASTRY SWEATER](https://representative-domain/product/656401). This textured knit is softer than a handful of prairie dog fur. 

For you city slickers, the [Rylee flatform](https://representative-domain/product/606711) sandals add a touch of modern with their platform wedge heels, letting you keep one boot in the stirrups and one in the city.

And don't forget the [Karin h

Now, build the Gradio App and see how it works!
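
Gradio passes the input values to the handler positionally, in the order of the `inputs` list. As an optional variation on the `*x` indexing used in the next cell, the handler could take named parameters. `make_handler` below is a hypothetical helper, tested here with an echo function instead of the real chain.

```python
def make_handler(invoke):
    """Wrap a chain-like `invoke` callable in a function with named parameters."""
    def handler(customer_id, time_of_year, customer_name, search_prompt):
        return invoke({"customerId": customer_id,
                       "timeOfYear": time_of_year,
                       "customerName": customer_name,
                       "searchPrompt": search_prompt})
    return handler

# Hypothetical stand-in for chain_app2.invoke that just echoes its input
echo_handler = make_handler(lambda payload: payload)
result = echo_handler("c1", "Nov, 2023", "Alex Smith", "western boots")
```

With the real chain, `make_handler(chain_app2.invoke)` could be passed as `fn`, provided the `inputs` list keeps the same order as the parameters.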

In [86]:
import gradio as gr

def message_generator_app2(*x):
    return chain_app2.invoke({'searchPrompt':x[3],
                              'customerName':x[2],
                              'timeOfYear': x[1],
                              'customerId': x[0]})

customer_id = gr.Textbox(value=CUSTOMER_ID, label="Customer ID")
time_of_year = gr.Textbox(value="Nov, 2023", label="Time Of Year")
customer_name = gr.Textbox(value='Alex Smith', label="Customer Name")
search_prompt_txt = gr.Textbox(value='Oversized Sweaters', label="Customer Interest(s)")
message_result = gr.Markdown( label="Message")

demo = gr.Interface(fn=message_generator_app2,
                    inputs=[customer_id, time_of_year, customer_name, search_prompt_txt],
                    outputs=message_result,
                    title="🪄 Message Generator - Recommendations 🥳")
demo.launch(share=True, debug=True)

Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://99fbf79dad876eca48.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)


Keyboard interruption in main thread... closing server.
Killing tunnel 127.0.0.1:7860 <> https://99fbf79dad876eca48.gradio.live




## That's a Wrap!