# Approach
The CSV file is loaded into a vector database. All descriptive columns are combined into a single text column, which becomes the page content; the article id is stored as metadata. User input is expected to fall into one of three types:


1.   The user knows what they are looking for: search the database for that item and return the best match.
2.   The user doesn't have a particular item in mind: ask some questions, summarise the responses and return the best-matching product.
3.   The user asks an unrelated question: the bot politely declines to answer.

Case 1 is fairly straightforward: the user's query is put through the document retrieval module and the best match is fetched.
For case 2, LangChain's `ConversationBufferMemory` keeps the conversation history and the system prompt asks the model to summarise the user's request at the end (`ConversationSummaryMemory` is imported but not used in the code below). The summary is rephrased as a query and run against the database. The matched result is then rewritten in a conversational tone and shown to the user.
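
The three-way routing described above can be sketched as a small dispatcher. This is only an illustration under assumptions, not the notebook's implementation: `route`, `search_catalogue` and `ask_preferences` are hypothetical names, and both handlers are stubs standing in for the retrieval chain and the conversation chain.

```python
from typing import Callable, Dict


def search_catalogue(query: str) -> str:
    """Stub for case 1: would run the query through document retrieval."""
    return f"[retrieval] best match for: {query}"


def ask_preferences(query: str) -> str:
    """Stub for case 2: would run the question-and-summary conversation."""
    return f"[conversation] gathering preferences after: {query}"


# Hypothetical mapping from classifier label to handler
HANDLERS: Dict[str, Callable[[str], str]] = {
    "specific": search_catalogue,
    "help": ask_preferences,
}


def route(label: str, query: str, decline_message: str) -> str:
    """Dispatch on the classifier label; any other label is declined (case 3)."""
    handler = HANDLERS.get(label)
    if handler is None:
        return decline_message
    return handler(query)
```

Keeping the label-to-handler mapping in one dictionary means a fourth query type could be added without touching the dispatch logic.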

In the interest of time, the implementation is really an MVP. A next step would be to have the bot ask whether there is anything else the user wants to add. Also, the user's turns for case 2 are currently hardcoded; I would like to replace them with `input()` calls inside a `while True:` loop.
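
A minimal sketch of what that `input()` / `while True:` loop might look like. The function name `run_dialogue`, the stop words and the `max_turns` guard are assumptions for illustration; `respond` stands in for whatever produces the bot's reply, and `ask` defaults to the built-in `input` so it can be swapped for a scripted source when testing.

```python
from typing import Callable, List, Tuple


def run_dialogue(
    respond: Callable[[str], str],
    ask: Callable[[str], str] = input,
    stop_words: Tuple[str, ...] = ("done", "quit"),
    max_turns: int = 10,
) -> List[Tuple[str, str]]:
    """Alternate user and bot turns until a stop word or the turn limit."""
    transcript: List[Tuple[str, str]] = []
    turns = 0
    while True:
        if turns >= max_turns:  # guard against an endless conversation
            break
        user_text = ask("You: ").strip()
        turns += 1
        if user_text.lower() in stop_words:
            break
        bot_text = respond(user_text)
        print(f"Bot: {bot_text}")
        transcript.append((user_text, bot_text))
    return transcript
```

In this notebook, `respond` could plausibly be something like `lambda text: conversation({"question": text})["text"]`, reusing the `LLMChain` built inside `HelptheUser`.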

# Packages

In [162]:
!pip -q install langchain openai chromadb

# Libraries

In [163]:
from langchain.chat_models import ChatOpenAI
from langchain.llms import OpenAI
from langchain.schema.messages import SystemMessage
from langchain.schema import HumanMessage
from langchain.chains import LLMChain,RetrievalQA
from langchain.memory import ConversationBufferMemory, ConversationSummaryMemory
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
from langchain.document_loaders import DataFrameLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
)

# Preparing DB

In [164]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [165]:
import pandas as pd
articles = pd.read_csv("/content/drive/MyDrive/Copy of articles.csv")
print(articles.columns)

def concat_columns(df, cols_to_concat, new_col_name, sep=","):
    df[new_col_name] = df[cols_to_concat[0]]
    for col in cols_to_concat[1:]:
        df[new_col_name] = df[new_col_name] + sep + df[col]
    return df

cols_to_concat = ['prod_name', 'product_type_name', 'product_group_name','graphical_appearance_name', 'colour_group_name', 'garment_group_name','detail_desc']
new_col_name = 'concatenated_desc'
articles = concat_columns(articles, cols_to_concat, new_col_name, sep=",")
# articles = articles[:10]
print(len(articles))
articles.head(n=2)

Index(['article_id', 'prod_name', 'product_type_name', 'product_group_name',
       'graphical_appearance_name', 'colour_group_name', 'garment_group_name',
       'detail_desc'],
      dtype='object')
1000


|   | article_id | prod_name | product_type_name | product_group_name | graphical_appearance_name | colour_group_name | garment_group_name | detail_desc | concatenated_desc |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 695255001 | Siv t-shirt | T-shirt | Garment Upper body | All over pattern | Dark Blue | Jersey Fancy | Short-sleeved top in soft viscose jersey with ... | Siv t-shirt,T-shirt,Garment Upper body,All ove... |
| 1 | 821115007 | RICHIE SKIRT | Skirt | Garment Lower body | Check | Pink | Skirts | Short, pleated skirt in woven fabric with a hi... | RICHIE SKIRT,Skirt,Garment Lower body,Check,Pi... |
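
One thing to watch in the concatenation step: joining string columns with `+` in pandas typically propagates a missing value, so a single empty field can turn the whole combined description into NaN. The sketch below shows one way to join fields while skipping missing ones. It is written against plain dictionaries so it runs without pandas; `join_fields` is a hypothetical helper, not part of the notebook.

```python
import math
from typing import Any, Mapping, Sequence


def join_fields(row: Mapping[str, Any], columns: Sequence[str], sep: str = ",") -> str:
    """Join the listed fields of one row, skipping missing or blank values."""
    parts = []
    for col in columns:
        value = row.get(col)
        # Treat None and float NaN (how pandas marks missing cells) as missing
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        text = str(value).strip()
        if text:
            parts.append(text)
    return sep.join(parts)
```

With the DataFrame above, something like `articles.apply(lambda row: join_fields(row, cols_to_concat), axis=1)` should give the same combined column for complete rows, without dangling separators for incomplete ones.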


In [166]:
loader = DataFrameLoader(articles[['article_id', 'concatenated_desc']],page_content_column="concatenated_desc")
data=loader.load()
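# CharacterTextSplitter splits on "\n\n" by default, so these single-line
# descriptions pass through unsplit (1000 documents in, 1000 out)
text_splitter = CharacterTextSplitter(chunk_size=250, chunk_overlap=10)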
texts = text_splitter.split_documents(data)
print(len(texts))
texts[5]

1000


Document(page_content='KELLY SHIRT S.0,Shirt,Garment Upper body,Stripe,Blue,Blouses,Shirt in airy cotton with a collar, buttons down the front, long sleeves with buttoned cuffs, and a rounded hem.', metadata={'article_id': 697564030})

In [167]:
# Embed and store the texts
# Supplying a persist_directory will store the embeddings on disk
persist_directory = 'db'

# OpenAI embeddings are used here; the plan is to swap in local embeddings later
embedding = OpenAIEmbeddings()

vectordb = Chroma.from_documents(documents=texts,
                                 embedding=embedding,
                                 persist_directory=persist_directory)

# Build a retriever that returns the 2 most similar documents per query
retriever = vectordb.as_retriever(search_kwargs = {"k":2})
print(retriever.search_type)
print(retriever.search_kwargs)

similarity
{'k': 2}
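
To make the `similarity` search type and `k` concrete, here is a toy top-k lookup over hand-written vectors. It is a sketch under assumptions: cosine similarity is used purely for illustration (the metric Chroma actually applies depends on its configuration), the vectors are made up rather than real embeddings, and `cosine` and `top_k` are hypothetical helpers.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]], k: int = 2) -> List[int]:
    """Return the indices of the k documents most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]
```

With `k=2` the retriever hands two candidate documents to the chain; the notebook then keeps only the first of them as the best match.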


# Functions to modulate AI responses

In [168]:
def AItoQuestion(input_txt):
    """Convert the AI's summary of the user's request into a question."""
    llm = ChatOpenAI()
    system_prompt_3 = """the input statement will mention an object description. convert that description from a
    statement to a question that looks for the nearest match to the description. for example, if the user says:
    you want a new red top, then rephrase it as: is there anything that nearly matches a red top ?"""

    messages = [
        SystemMessage(content=system_prompt_3),
        HumanMessage(content=input_txt),
    ]
    # Return the message text rather than the AIMessage object
    return llm(messages).content

# get_question = AItoQuestion(get_text)
# get_question

In [169]:
def ResultsToConversation(input_txt):
    """Convert the matched output to a conversational tone."""
    llm = ChatOpenAI()
    system_prompt_3 = """the input statement will contain the description of an object followed by a metadata that contains an article id.
    convert the input into a human-like message. For example, if the input is : 'fancy black pants. metadata = {'article_id' : 75657766}' then rephrase it as :
    'here is what i found ! : Fancy black pants, check out article id : 75657766'"""

    messages = [
        SystemMessage(content=system_prompt_3),
        HumanMessage(content=input_txt),
    ]
    # Return the message text rather than the AIMessage object
    return llm(messages).content

# get_response = ResultsToConversation(op_txt)
# get_response

In [170]:
def HelptheUser(firstquestion):
    """Run the preference-gathering conversation and return the AI's final summary."""
    llm = ChatOpenAI()
    system_prompt = """you work as an assistant at a garment store. your goal is to ask the user about the
    following:\n- the occasion they are shopping for\n- any particular type of garment they want\n- any preference of colors\n\n
    Ask for all these preferences from the user one by one and combine the responses in the end in one sentence.
    """

    # Prompt: system instructions, then the running chat history, then the latest user turn
    prompt = ChatPromptTemplate(
        messages=[
            SystemMessagePromptTemplate.from_template(system_prompt),
            MessagesPlaceholder(variable_name="chat_history"),
            HumanMessagePromptTemplate.from_template("{question}"),
        ]
    )

    # Buffer memory replays the full conversation into `chat_history` on every call
    memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
    conversation = LLMChain(llm=llm, prompt=prompt, verbose=True, memory=memory)
    # conversation({"question": "hi, i want to buy something, can you help me ?"})
    conversation({"question": firstquestion})
    # The user's replies below are hardcoded for now
    conversation({"question": "i would like to get something for the beach"})
    conversation({"question": "maybe a bikini"})
    conversation({"question": "shades of blue i like"})
    response = conversation({"question": "yes!"})
    get_text = response['text']
    return get_text


In [171]:
def usersays(firstlinefromuser):
    """Classify the user's request: a specific product, a request for advice, or unrelated."""
    llm = ChatOpenAI()
    system_prompt_3 = """you are a helpful chatbot with a garment company. if the input statement contains an inquiry about a specific garment then
    return 'specific'. if the inquiry is about seeking help with choosing garments then return 'help'. If the user asks anything other than about garments
    then reply politely that you do not know about this"""

    messages = [
        SystemMessage(content=system_prompt_3),
        HumanMessage(content=firstlinefromuser),
    ]
    # Returns the AIMessage; callers extract the text from it
    op_txt = llm(messages)
    return op_txt


def QueryType(get_response):
    """Route the query: document retrieval for a specific request, the
    conversational chain for general advice, otherwise pass the reply through."""
    # Note: the user's query is read from the global `firstlinefromuser`
    qa_chain = RetrievalQA.from_chain_type(llm=OpenAI(),
                                           chain_type="stuff",
                                           retriever=retriever,
                                           return_source_documents=True)
    if get_response == 'specific':
        # Pass the user's query straight into the retrieval chain to get the best match
        llm_response = qa_chain(firstlinefromuser)
        print("Best Match:")
        best_doc = llm_response['source_documents'][0]
        op_txt = f"{best_doc.page_content}. metadata = {best_doc.metadata}"
        final_response = str(ResultsToConversation(op_txt))

    elif get_response == 'help':
        # Gather preferences, turn the summary into a question, then retrieve
        op_txt = HelptheUser(firstlinefromuser)
        get_question = str(AItoQuestion(op_txt))
        llm_response = qa_chain(get_question)
        best_doc = llm_response['source_documents'][0]
        op_txt = f"{best_doc.page_content}. metadata = {best_doc.metadata}"
        final_response = str(ResultsToConversation(op_txt))

    else:
        # Unrelated question: show the classifier's polite refusal as-is
        final_response = get_response

    print(final_response)
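
The exact-match comparison against `'specific'` and `'help'` depends on the model returning precisely those strings. A small normalisation step, sketched below, could make the routing more tolerant of variations such as `'Specific'` or `"help."`. `normalise_label` is a hypothetical helper and the set of stripped characters is an assumption about likely model output.

```python
def normalise_label(raw: str) -> str:
    """Map a raw classifier reply to 'specific', 'help' or 'other'."""
    # Drop surrounding whitespace, quotes and a trailing full stop, ignore case
    cleaned = raw.strip().strip("'\".").lower()
    if cleaned in ("specific", "help"):
        return cleaned
    return "other"
```

The original reply should still be kept alongside the normalised label, since in the `'other'` case it carries the polite refusal that is shown to the user.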

# Putting it all together

In [172]:
# firstlinefromuser = "Do you have a red dress for a party ?"
firstlinefromuser = "Can you help me choose clothes ?"

get_response = usersays(firstlinefromuser).content.strip()
QueryType(get_response)



> Entering new LLMChain chain...
Prompt after formatting:
System: "you work as an assistant at a garment store. your goal is to ask the user about the
    following:
- the occasion they are shopping for
- any particular type of garment they want
- any preference of colors


    Ask for all these preferences from the user one by one and combine the responses in the end in one sentence.
    
Human: Can you help me choose clothes ?

> Finished chain.


> Entering new LLMChain chain...
Prompt after formatting:
System: "you work as an assistant at a garment store. your goal is to ask the user about the
    following:
- the occasion they are shopping for
- any particular type of garment they want
- any preference of colors


    Ask for all these preferences from the user one by one and combine the responses in the end in one sentence.
    
Human: Can you help me choose clothes ?
AI: Of course! I'd be happy to help you choose clothes. C

In [173]:
firstlinefromuser = "Do you have a red dress for a party ?"
get_response = usersays(firstlinefromuser).content.strip()
QueryType(get_response)

Best Match:
content="Here is what I found! A beautiful paint dress for ladies. This dress is full body and made of solid, light red chiffon. It features a deep V-neck decorated with a pleated flounce that extends over each shoulder and down the back, transforming into narrow ribbons that cross and tie. There is a seam under the bust, a concealed zip on one side, and a flared, pleated skirt. The dress also comes with a satin lining. Don't miss out on this stunning piece! Check out the article id: 854232002."


In [174]:
firstlinefromuser = "Which house should i live in?"
get_response = usersays(firstlinefromuser).content.strip()
QueryType(get_response)

I'm sorry, but I don't have information about which house you should live in. My expertise is in helping with garment-related inquiries. Is there anything specific about garments that I can assist you with?
