This is the prompt generation module, which builds the prompt and generates a reply to the user's initial input. A typical setup would call a hosted API with a key, such as OpenAI's, but I'm choosing to run this locally with a free model from Hugging Face.

In [4]:
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM

Load a pretrained model and begin making the pipeline

In [36]:
model_name = 'google/flan-t5-large'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

llm_pipeline = pipeline('text2text-generation', 
    model=model, 
    tokenizer=tokenizer, 
    max_length=512, 
    temperature=0.1,
    do_sample=True,
    repetition_penalty=1.2)

Device set to use mps:0
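
To see why `temperature=0.1` combined with `do_sample=True` behaves almost like greedy decoding, here is a small standalone sketch. It assumes the usual scheme of dividing the logits by the temperature before the softmax; the function name and the toy scores are made up for illustration and are not taken from the model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by the temperature first."""
    scaled = [x / temperature for x in logits]
    # Subtract the largest value before exponentiating for numerical stability
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens (not real model output)
toy_logits = [2.0, 1.5, 0.5]

for t in (1.0, 0.1):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}:", [round(p, 3) for p in probs])
```

With these toy scores, the top candidate takes more than 99% of the probability at a temperature of 0.1, so sampling rarely picks anything else.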


We need a generation function that turns the user input into an actual response.

In [37]:
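def generate_response(user_query, retrieved_context, sentiment, intent):
    # Note: sentiment and intent are accepted so the signature matches the rest
    # of the chatbot flow, but this prompt does not use them yet.
    # Flan-T5 tends to respond well to an instruction / context / question layout:
    prompt = f"""
    Answer the following question based on the context provided. Be polite and helpful.

    Context: {retrieved_context}
    
    Question: {user_query}
    
    Answer:
    """
    
    # Call the pipeline; it returns a list with one dict per generated sequence
    response = llm_pipeline(prompt)
    
    # Strip surrounding whitespace from the generated text
    return response[0]['generated_text'].strip()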
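
`generate_response` takes `sentiment` and `intent`, but the prompt never mentions them. One possible way to fold them in is sketched below. `build_prompt` and its tone rule are hypothetical, and I have not checked how well Flan-T5 follows the extra instruction, so treat it as a starting point rather than a fix.

```python
def build_prompt(user_query, retrieved_context, sentiment, intent):
    """Assemble a prompt that also mentions the detected sentiment and intent."""
    # Hypothetical tone rule: ask for an apology when the customer is upset
    if sentiment == "negative":
        tone = "Apologize for the inconvenience, then answer politely in a full sentence."
    else:
        tone = "Answer politely in a full sentence."

    lines = [
        f"You are a customer support assistant handling a '{intent}' request.",
        tone,
        f"Context: {retrieved_context}",
        f"Question: {user_query}",
        "Answer:",
    ]
    # Joining with newlines keeps stray indentation out of the prompt
    return "\n".join(lines)

example_prompt = build_prompt(
    "Where is my refund? I've been waiting forever!",
    "Refund Policy: Money is returned to the original payment method within 5-7 business days.",
    "negative",
    "cancellation and refund",
)
print(example_prompt)
```

The returned string could be passed to `llm_pipeline` in place of the inline f-string.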

Test the pipeline with a mock query to check what the end user would see.

In [38]:
mock_user_query = "Where is my refund? I've been waiting forever!"
mock_sentiment = "negative" 
mock_intent = "cancellation and refund"
mock_retrieved_policy = "Refund Policy: Money is returned to the original payment method within 5-7 business days."

print("\nGenerating Response locally...")
final_answer = generate_response(
    mock_user_query, 
    mock_retrieved_policy, 
    mock_sentiment, 
    mock_intent
)

print("\n--- FINAL CHATBOT OUTPUT ---")
print(final_answer)


Generating Response locally...

--- FINAL CHATBOT OUTPUT ---
5-7 business days


Key takeaways: this model answers very literally, so its replies can come across as curt rather than friendly. A hosted, pay-per-token model, such as ChatGPT through an API call, would be better suited to customer-facing replies like this one. For the purpose of demonstration, however, this model does just fine.
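
If staying with the local model, one low-cost option is to wrap the terse answer in a fixed template after generation. The sketch below is only an illustration: `soften_answer` and its wording are hypothetical and are not part of the pipeline above.

```python
def soften_answer(raw_answer, sentiment):
    """Wrap a terse model answer in a fixed, friendlier template."""
    # Drop surrounding whitespace and any trailing period so we do not double it
    answer = raw_answer.strip().rstrip(".")
    if not answer:
        return "I'm sorry, I could not find an answer to that."

    # Hypothetical rule: open with an apology when the customer is upset
    if sentiment == "negative":
        opener = "I'm sorry for the wait. "
    else:
        opener = "Thanks for reaching out. "
    return f"{opener}Here is what I found: {answer}."

print(soften_answer("5-7 business days", "negative"))
```

A template like this keeps the tone predictable, but every reply gets the same framing, so it is no substitute for a model that writes full sentences on its own.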