## Website
https://medium.com/microsoftazure/recommendation-systems-enhanced-by-llms-fe1fc8e23a58

In [28]:
import os
import pandas as pd
import lancedb
from langchain_openai import AzureOpenAIEmbeddings
from langchain.vectorstores import LanceDB
from langchain.chains import RetrievalQA
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAI

from dotenv import load_dotenv
load_dotenv(verbose=True, override=True)

embeddings_api = AzureOpenAIEmbeddings(azure_deployment=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME"),
                                    openai_api_version=os.getenv("AZURE_OPENAI_API_VERSION")
                                    )

llm_chat_api = AzureChatOpenAI(openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                            azure_deployment=os.environ["AZURE_OPENAI_CHAT_DEPLOYMENT_NAME"],
                            max_tokens = 1500
                            )

llm_instruct_api = AzureOpenAI(openai_api_version=os.environ["AZURE_OPENAI_API_VERSION"],
                            azure_deployment=os.environ["AZURE_OPENAI_INSTRUCT_DEPLOYMENT_NAME"],
                            max_tokens = 1500
                            )



In [29]:
card_rewards = pd.read_pickle(os.path.join(os.getcwd(), "data","card_rewards.pkl"))  
card_rewards.head(2)

Unnamed: 0,CardName,Discount,Description,text,n_tokens,vector
0,American Express Gold Card,€30,Get a €30 statement credit by spending €150 or...,CardName: American Express Gold Card. Discount...,42,"[-0.028317386284470558, -0.01587647572159767, ..."
1,American Express Gold Card,€25,Earn a €25 statement credit with a €99 purchas...,CardName: American Express Gold Card. Discount...,43,"[-0.021910404786467552, -0.013033478520810604,..."


we want to store our embeddings into a VectorDB, so that we can perform similarity search with the embedded query of the user. LangChain offers many integrations with 3rd party vector stores and, in this case, we are going to use LanceDB:

In [30]:
uri = "dataset/sample-card-lancedb"
db = lancedb.connect(uri)
table = db.create_table("card", card_rewards, mode="overwrite")

embeddings = embeddings_api

docsearch = LanceDB(connection = db, embedding = embeddings, table_name="card")

To test our vector store, let’s start with a simple similarity computation which returns the first most similar result using cosine similarity as distance metric:

In [31]:
query = "I'm looking for an grocery shop credit cad promotion. What card could you suggest to me?"
docs = docsearch.similarity_search(query, k=1)
docs

[Document(page_content='CardName: Mastercard Platinum. Discount: 5% cashback Description: Earn 5% cashback on all grocery store purchases. Offer valid until 11/30/2023.', metadata={'CardName': 'Mastercard Platinum', 'Discount': '5% cashback', 'Description': 'Earn 5% cashback on all grocery store purchases. Offer valid until 11/30/2023.', 'n_tokens': 39, 'vector': array([-0.01746309, -0.01555755, -0.00399054, ...,  0.00378824,
        -0.0067738 , -0.03033201], dtype=float32), '_distance': 0.30936431884765625})]

Use LangChain RetrievalQA chain, alongside the LanceDB as retriever.

In [32]:
# Import Azure OpenAI
query = "I'm looking for an grocery shop credit cad promotion. What card could you suggest to me?"

# qa = RetrievalQA.from_chain_type(llm=llm_chat_api, 
#                                 chain_type="stuff", 
#                                 retriever=docsearch.as_retriever(), 
#                                 return_source_documents=True
#                                 )

# result = qa({"query": query})
# print(f"Chat: {result['result']}" )

qa = RetrievalQA.from_chain_type(llm=llm_instruct_api, 
                                chain_type="stuff", 
                                retriever=docsearch.as_retriever(), 
                                return_source_documents=True
                                )

result = qa({"query": query})
print(f"Result: {result['result']}" )

Result:  The Mastercard Platinum or the Capital One SavorOne Cash Rewards Credit Card both offer cashback on grocery store purchases. The Revolut Standard Card also has a limited time offer for 2% cashback on grocery store purchases.


Since we used the return_source_documents=True parameter, we can also retrieve the source documents from our result variable:

In [33]:
result['source_documents'][0]

Document(page_content='CardName: Mastercard Platinum. Discount: 5% cashback Description: Earn 5% cashback on all grocery store purchases. Offer valid until 11/30/2023.', metadata={'CardName': 'Mastercard Platinum', 'Discount': '5% cashback', 'Description': 'Earn 5% cashback on all grocery store purchases. Offer valid until 11/30/2023.', 'n_tokens': 39, 'vector': array([-0.01746309, -0.01555755, -0.00399054, ...,  0.00378824,
       -0.0067738 , -0.03033201], dtype=float32), '_distance': 0.30936431884765625})

With RetrievalQA, you can pass a custom prompt that you can easily define using LangChain prompt templates. Let’s start with a simple prompt as follows:

In [35]:
from langchain.prompts import PromptTemplate

template = """You are a credit card recommender system that help me to decide which card I should use in my next payment. 
Use the following pieces of context to answer the question at the end. 
For each question, suggest three cards, with a short description of the plot and the reason why the user migth like it.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Your response:"""


PROMPT = PromptTemplate(
    template=template, input_variables=["context", "question"])

chain_type_kwargs = {"prompt": PROMPT}

query = "I'm looking for an grocery shop credit cad promotion. What card could you suggest to me?"

qa = RetrievalQA.from_chain_type(llm=llm_instruct_api, 
    chain_type="stuff", 
    retriever=docsearch.as_retriever(),
    return_source_documents=True, 
    chain_type_kwargs=chain_type_kwargs)


result = qa({'query':query})

print(f"Result: {result['result']}" )


Result:  I would suggest the Mastercard Platinum, Capital One SavorOne Cash Rewards Credit Card, and Capital One Walmart Rewards Card. The Mastercard Platinum offers a 5% cashback on all grocery store purchases, while the Capital One SavorOne Cash Rewards Credit Card has unlimited 3% cashback on dining and entertainment, popular streaming services, and grocery stores. The Capital One Walmart Rewards Card also offers 5% cashback on Walmart.com purchases, which includes pickup and delivery. These cards can provide significant savings on your grocery purchases.


In [45]:
from langchain.prompts import PromptTemplate

template_prefix = """You are a credit card recommender system that help me to decide which card the user should use in my next payment. 
Use the following pieces of context to answer the question at the end. 
For each question, take into account the context and the personal information provided by the user.
Take special attetion to the store were the user currently is and only suggest him/her to use the credit card that has the most benefits for the specific store.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}"""

user_info = """This is what we know about the user, and you can use this information to better tune your research:
PreviousPayments: {listPreviousPayments}
Store:{store}
"""

template_suffix= """Question: {question}
Your response:"""

user_info = user_info.format(age = 18, 
                            gender = 'female', 
                            listPreviousPayments="['SPORTING GOODS STORES', 'COMPUTERS, COMPUTER PERIPHERAL EQUIPMENT, SOFTWARE', 'TELECOMMUNICATION EQUIPMENT AND TELEPHONE SALES', 'BOOK STORES']",
                            store='STAPLES'
)

COMBINED_PROMPT = template_prefix +'\n'+ user_info +'\n'+ template_suffix
print(COMBINED_PROMPT)

You are a credit card recommender system that help me to decide which card the user should use in my next payment. 
Use the following pieces of context to answer the question at the end. 
For each question, take into account the context and the personal information provided by the user.
Take special attetion to the store were the user currently is and only suggest him/her to use the credit card that has the most benefits for the specific store.
If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}
This is what we know about the user, and you can use this information to better tune your research:
PreviousPayments: ['SPORTING GOODS STORES', 'COMPUTERS, COMPUTER PERIPHERAL EQUIPMENT, SOFTWARE', 'TELECOMMUNICATION EQUIPMENT AND TELEPHONE SALES', 'BOOK STORES']
Store:Travel Agency

Question: {question}
Your response:


In [46]:
PROMPT = PromptTemplate(
    template=COMBINED_PROMPT, input_variables=["context", "question"])

chain_type_kwargs = {"prompt": PROMPT}

query = "Can you suggest me which credit card should I use?"

qa = RetrievalQA.from_chain_type(llm=llm_instruct_api, 
    chain_type="stuff", 
    retriever=docsearch.as_retriever(),
    return_source_documents=True, 
    chain_type_kwargs=chain_type_kwargs)


result = qa({'query':query})
result['result']


print(f"Result: {result['result']}" )

Result:  Based on the information provided, I would recommend using the Capital One SavorOne Cash Rewards Credit Card. This card offers 3% cash back on dining, entertainment, popular streaming services, and grocery stores, which may be beneficial for your purchase at the travel agency. Additionally, there is no annual fee and you may be able to access a higher credit line after making your first 5 monthly payments on time.
