## FetchMatchAI Prompt Engineering Demo


In this notebook, I showcase the following functionality using GPT-3.5 (16K-token context limit) and prompt engineering:

- Keyword search: given a keyword as input, retrieve all offers associated with that keyword
- "Ask me anything": answer arbitrary questions about the dataset on the fly

In [1]:
#######################################################################
   # Importing Required Libraries
#######################################################################

# Import libraries and modules used in all sections
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate  # required by the column-search prompt
import pandas as pd

# Import libraries for the pandas DataFrame agent
from langchain.agents import create_pandas_dataframe_agent
from langchain.agents.agent_types import AgentType
from langchain.llms import OpenAI
import openai
import contextlib
import io
from IPython.display import display, HTML

pd.options.display.float_format = '{:.2f}'.format

In [2]:
#######################################################################
   # Import and Pre-Process Dataset
#######################################################################

# Read the dataset as input
brand_category = pd.read_csv('brand_category.csv')
categories = pd.read_csv('categories.csv')
offer_retailer = pd.read_csv('offer_retailer.csv')
FetchAI_metadata = pd.read_csv('FetchAI_metadata.csv')

# Join the dataframes into one final table
offers_brand_w_category = pd.merge(offer_retailer, brand_category, on="BRAND", how='left')
offers_brand_w_category_w_Deets = pd.merge(offers_brand_w_category, categories, left_on="BRAND_BELONGS_TO_CATEGORY", right_on="IS_CHILD_CATEGORY_TO", how='left')

# Convert all string values to lower case; non-string values are left unchanged
offers_brand_w_category_w_Deets = offers_brand_w_category_w_Deets.applymap(lambda s: s.lower() if isinstance(s, str) else s)

# Dropping certain columns and re-naming a few other columns
offers_brand_w_category_w_Deets = offers_brand_w_category_w_Deets.drop(columns=['IS_CHILD_CATEGORY_TO'])
offers_brand_w_category_w_Deets.rename(columns = {'BRAND_BELONGS_TO_CATEGORY':'Category',
                                                  'PRODUCT_CATEGORY':'Product'},
                                       inplace = True)
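
To make the preprocessing steps concrete, here is a minimal sketch on toy data. The rows are invented for illustration and are not taken from the CSV files, and `BRAND_BELONGS_TO_CATEGORY` is assumed to come from `brand_category.csv`. It shows that a left join keeps offers whose brand has no category row, and that lower-casing leaves missing values untouched.

```python
import pandas as pd

# Hypothetical toy rows standing in for offer_retailer.csv and brand_category.csv.
offers = pd.DataFrame({
    "OFFER": ["Spend $20 at KFC", "Dove Hand Wash, buy 2"],
    "RETAILER": ["KFC", "Target"],
    "BRAND": ["KFC", "Dove"],
})
brands = pd.DataFrame({
    "BRAND": ["Dove"],
    "BRAND_BELONGS_TO_CATEGORY": ["Hand Soap"],
})

# A left join keeps every offer, even when the brand has no category row.
merged = pd.merge(offers, brands, on="BRAND", how="left")

# Lower-case string cells only; missing values pass through unchanged.
merged = merged.apply(
    lambda col: col.map(lambda s: s.lower() if isinstance(s, str) else s)
)

print(merged)
```

The KFC row survives the join with a missing category, which is why many rows in the agent output further down show `NaN` in the category columns.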



In [4]:
#######################################################################
   # Create Prompts, Python DF Agent and associated LLMChain Generators
#######################################################################


# The following prompt asks the model which column most likely contains the search keyword


offer_search_prompt_string_Search = PromptTemplate(
    input_variables = ['Combined_Search_String','offer_retailer'],
    template = '''
    
    -------------------------------------------------------------
    # Role and Responsibility
    -------------------------------------------------------------
    
    GOAL: Find the column in {offer_retailer} that contains the string or sub-string of {Combined_Search_String}
    
    -------------------------------------------------------------
    # Process
    -------------------------------------------------------------
    
    Step 1: Perform a similar, case insensitive search match for {Combined_Search_String} in {offer_retailer}
    Step 2: Return the column name from {offer_retailer}
    
    -------------------------------------------------------------
    # Data To Be Used
    -------------------------------------------------------------
    
    Dataset to Use:
    **********************
    {offer_retailer}
    
    Columns to Use:
    **********************
    1) Retailer
    2) Brand
    3) Category
    4) Product
    

    -------------------------------------------------------------
    # Conditions
    ------------------------------------------------------------- 
    - Only return ONE FINAL COLUMN NAME
    
   
    -------------------------------------------------------------
    # IMPORTANT INSTRUCTIONS
    -------------------------------------------------------------
    
    RETURN THE FINAL OUTPUT AS THE NAME OF THE COLUMN FROM {offer_retailer}
    FINAL RESULT SHOULD BE ONLY ONE WORD
    
    EXAMPLE 1:
    ^^^^^^^^^^^
    If the right column to search {Combined_Search_String} in is "Brand", then:
    Your Answer: "Brand"
    
    
    EXAMPLE 2:
    ^^^^^^^^^^^^
    If the right column to search {Combined_Search_String} in is "Product", then:
    Your Answer: "Product"
        
    EXAMPLE 3:
    ^^^^^^^^^^^^
    If the right column to search {Combined_Search_String} in is "Category", then:
    Your Answer: "Category"
    
    
    Column Name:
    ....
    
   
    ''')

#Create LLM Generators that will use the above prompt
search_query_generator_string = LLMChain(
    llm=ChatOpenAI(model_name='gpt-3.5-turbo-16k', temperature=0.2, max_tokens=10000), 
    prompt=offer_search_prompt_string_Search
)


# Create a pandas DataFrame agent on the given dataframe
def create_python_agent(df):
    """Return a pandas DataFrame agent over `df`, backed by an OpenAI LLM."""
    return create_pandas_dataframe_agent(OpenAI(temperature=0.2), df, verbose=True)

## The agent writes and runs pandas code against the dataframe, then summarizes what it observes

# Create the agent on the dataset whose string values are all lower case
py_df_agent_analyzer = create_python_agent(offers_brand_w_category_w_Deets)
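
As a cross-check on the LLM's answer, the column lookup that the prompt describes can also be approximated deterministically in pandas. This is a sketch rather than part of the original workflow: `find_matching_column` and the toy frame are hypothetical, and plain case-insensitive substring matching stands in for the "similar" match the prompt asks for.

```python
import pandas as pd


def find_matching_column(df, keyword, candidate_columns):
    """Return the first candidate column with a cell containing `keyword`, else None."""
    keyword = keyword.lower()
    for column in candidate_columns:
        if column not in df.columns:
            continue
        # Drop missing values first so they never match, then compare as lower-case text.
        values = df[column].dropna().astype(str).str.lower()
        if values.str.contains(keyword, regex=False).any():
            return column
    return None


# Hypothetical toy data; the column names mirror those in the agent output.
toy = pd.DataFrame({
    "RETAILER": ["kfc", "target"],
    "BRAND": ["kfc", "dove"],
    "Category": [None, "hand soap"],
})

print(find_matching_column(toy, "soap", ["RETAILER", "BRAND", "Category"]))
```

Because candidates are checked in order, a keyword that appears in several columns resolves to the first one listed, which may or may not match what the LLM would pick.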



In [13]:
#######################################################################
   # Find relevant offers for an input keyword using prompt engineering
#######################################################################


## Part 1: Perform a keyword search for user input in the offer field 

user_input = input("Enter your search keyword: ")

# Convert user_input to lower case to match the lower-cased dataset
user_input = user_input.lower()

# Create the agent on the dataset whose string values are all lower case
py_df_agent_analyzer = create_python_agent(offers_brand_w_category_w_Deets)

# Run the agent directly on the user's keyword
Offer_pull_list = py_df_agent_analyzer.run("GOAL: To find a list of offers "
                                           + "that contain the keyword " + user_input
                                          )
    
    
print("Offers Pulled: ")

non_existant_string_check = "no offers that contain"
# Example agent reply when nothing matches: "There are no offers that contain the keyword cake."
if non_existant_string_check not in Offer_pull_list:
    print(Offer_pull_list)
else:
    
    #Part 2: Search the keyword in a particular column, then use that column as a filter to search for all the offers
    Right_Column_to_Search = search_query_generator_string.run({
        
        'Combined_Search_String': user_input,
        'offer_retailer': offers_brand_w_category_w_Deets
    })
    
    print("Column to Search In: " + Right_Column_to_Search)

    # Run the agent again, restricting the search to the column returned by the prompt
    Offer_pull_list = py_df_agent_analyzer.run("GOAL: To find a list of offers. \n"
                                               + "Step 1: Find a fuzzy match for the string: \n"
                                               + user_input
                                               + "\n in the column : " + Right_Column_to_Search + "\n"
                                               + "The offer should contain: " + user_input + "\n"
                                               + "Step 2: Return a list of offers.\n"
                                               + "NOTE: Ignore Null values."
                                              )
    
    if non_existant_string_check not in Offer_pull_list:
        print(Offer_pull_list)
    else:
        print("Apologies! No offers found")
        
        


Enter your search keyword: KFC


> Entering new AgentExecutor chain...
Thought: I need to filter the dataframe to find the offers that contain the keyword kfc
Action: python_repl_ast
Action Input: df[df['OFFER'].str.contains('kfc')]
Observation:                 OFFER RETAILER BRAND Category  RECEIPTS CATEGORY_ID Product
71   spend $20 at kfc      kfc   kfc      NaN       NaN         NaN     NaN
80   spend $25 at kfc      kfc   kfc      NaN       NaN         NaN     NaN
501  spend $15 at kfc      kfc   kfc      NaN       NaN         NaN     NaN
672  spend $40 at kfc      kfc   kfc      NaN       NaN         NaN     NaN
858  spend $10 at kfc      kfc   kfc      NaN       NaN         NaN     NaN
Thought: I now know the final answer
Final Answer: The offers that contain the keyword kfc are: spend $20 at kfc, spend $25 at kfc, spend $15 at kfc, spend $40 at kfc, and spend $10 at kfc.

> Finished chain.
Offers Pulled: 
The of
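
The filter the agent generated in the trace above (`df[df['OFFER'].str.contains('kfc')]`) can also be run directly, without an LLM call. The sketch below is an illustration under assumptions: `offers_containing` is a hypothetical helper and the toy rows are invented, with only the `OFFER` and `RETAILER` column names taken from the output.

```python
import pandas as pd


def offers_containing(df, keyword, column="OFFER"):
    """Return rows whose `column` contains `keyword`, ignoring case and missing values."""
    # regex=False treats characters such as "$" literally; na=False drops missing cells.
    mask = df[column].str.contains(keyword, case=False, na=False, regex=False)
    return df[mask]


# Hypothetical toy data shaped like the merged offers table.
toy = pd.DataFrame({
    "OFFER": ["spend $20 at kfc", "dove hand wash, select varieties at target", None],
    "RETAILER": ["kfc", "target", None],
})

matches = offers_containing(toy, "KFC")
print(matches["OFFER"].tolist())  # ['spend $20 at kfc']
```

An empty result from a helper like this is an unambiguous "no offers" signal, whereas the notebook's check depends on the agent phrasing its reply with the exact words "no offers that contain".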

In [75]:
#---------------------------------------------------------------------------------------------------------------
# Ask me anything - This section dynamically takes user input and returns output from searching in the dataset
#---------------------------------------------------------------------------------------------------------------

Offer_pull_list = py_df_agent_analyzer.run("Give me all offers for receipt 13")  

    
print("Offers Pulled: ")
print(Offer_pull_list)



> Entering new AgentExecutor chain...
Thought: I need to filter the dataframe to get the offers
Action: python_repl_ast
Action Input: df[df['RECEIPTS'] == 13]
Observation:                                                  OFFER     RETAILER    BRAND   
62                       gorton's, at select retailers    shop rite  gortons  \
83     gorton's air fried butterfly shrimp, at walmart      walmart  gortons   
159  dove hand wash, select varieties, buy 2 at target       target     dove   
474                      gorton's, at select retailers  stop & shop  gortons   
559                       gorton's at select retailers  stop & shop  gortons   
782                       gorton's at select retailers      walmart  gortons   
828         dove hand wash, select varieties at target       target     dove   

         Category  RECEIPTS CATEGORY_ID Product  
62   frozen meals     13.00         NaN     NaN  
83   frozen meals     13.00         NaN     NaN