# Project Title:
**AI Guardian: An AI-Powered Consumer Protection Assistant**

## Project Overview:
The AI Guardian project aims to develop an AI-powered assistant designed to help consumers navigate the digital marketplace safely and confidently. The assistant will focus on providing reliable information, detecting scams and misinformation, and offering support for consumer rights and redress mechanisms.

## Key Features:
1. **Trustworthy Information Source**:
   - **Verified Answers**: Use natural language processing (NLP) to provide consumers with verified and citation-backed answers to their queries.
   - **Bias Detection**: Implement algorithms to detect and mitigate regional or cultural biases in information provided.

2. **Scam and Misinformation Detection**:
   - **Real-time Scam Alerts**: Use machine learning to identify and alert users about potential scams in real-time, whether in emails, advertisements, or online shopping platforms.
   - **Misinformation Filtering**: Analyze content from various sources and flag misinformation, providing users with accurate and verified alternatives.

3. **Consumer Rights Education**:
   - **Interactive Learning Modules**: Offer interactive and engaging modules to educate consumers about their rights, especially in relation to AI and digital services.
   - **Guided Redress Mechanisms**: Provide step-by-step guidance on how consumers can seek redress for issues with AI technologies or digital services.

4. **Personalized Recommendations and Alerts**:
   - **Custom Alerts**: Based on user preferences and past behavior, the assistant can offer personalized alerts about new scams or critical updates in consumer protection laws.
   - **Safe Shopping Recommendations**: Recommend safe and reliable online marketplaces and vendors based on AI analysis of consumer reviews and ratings.

## Import Libraries

In [1]:
import os
from typing import Dict

import dotenv
import streamlit as st
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.memory import ChatMessageHistory
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_core.messages import AIMessage, HumanMessage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnableBranch, RunnablePassthrough
from langchain_groq import ChatGroq
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

## Load API KEYS

In [2]:
dotenv.load_dotenv()
GROQ_API_KEY = os.getenv('GROQ_API_KEY')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
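
If either key is missing from the environment, os.getenv returns None and the problem only surfaces later, when a model is first called. A minimal sketch of an early check follows. The helper name require_env and the EXAMPLE_API_KEY variable are hypothetical and not part of dotenv or LangChain:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file."
        )
    return value


# Example with a placeholder variable set in-process (hypothetical name and value)
os.environ["EXAMPLE_API_KEY"] = "placeholder-value"
print(require_env("EXAMPLE_API_KEY"))
```

In the notebook, the same check could wrap the two os.getenv calls above so that a missing key fails immediately with a readable message.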

## Initializing Chat Model
Let's initialize the chat model, which will serve as the chatbot's brain:

In [3]:
Model = "gpt-3.5-turbo-1106"

chat = ChatOpenAI(model=Model, temperature=0.2)

## Message passing
The simplest form of memory is passing the chat history messages directly into a chain. Here's an example:

In [4]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)

chain = prompt | chat

chain.invoke(
    {
        "messages": [
            HumanMessage(
                content="Translate this sentence from English to French: I love programming."
            ),
            AIMessage(content="J'adore la programmation."),
            HumanMessage(content="What did you just say?"),
        ],
    }
)

AIMessage(content='I said "J\'adore la programmation," which means "I love programming" in French.', response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 61, 'total_tokens': 82}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d965bb4c-6e8f-4552-a59c-d742e5a0342c-0')

We can see that by passing the previous conversation into a chain, it can use it as context to answer questions. This is the basic concept underpinning chatbot memory - the rest of the guide will demonstrate convenient techniques for passing or reformatting messages.

## Chat history
It's perfectly fine to store and pass messages directly as an array, but we can also use LangChain's built-in message history class to store and load messages. Instances of this class are responsible for storing and loading chat messages from persistent storage. LangChain integrates with many providers (the LangChain documentation lists the available integrations), but for this demo we will use an ephemeral demo class.

Here's an example of the API:

In [5]:
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message(
    "Translate this sentence from English to French: I love programming."
)

demo_ephemeral_chat_history.add_ai_message("J'adore la programmation.")

demo_ephemeral_chat_history.messages

[HumanMessage(content='Translate this sentence from English to French: I love programming.'),
 AIMessage(content="J'adore la programmation.")]

We can use it directly to store conversation turns for our chain:

In [6]:
demo_ephemeral_chat_history = ChatMessageHistory()

input1 = "Translate this sentence from English to French: I love programming."

demo_ephemeral_chat_history.add_user_message(input1)

response = chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)

demo_ephemeral_chat_history.add_ai_message(response)

input2 = "What did I just ask you?"

demo_ephemeral_chat_history.add_user_message(input2)

chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
    }
)

AIMessage(content='You just asked me to translate the sentence "I love programming" from English to French.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 74, 'total_tokens': 92}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-de09fa13-c193-4edd-a5c4-5c52f7445aec-0')
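
Both invocations above follow the same three steps: store the user message, invoke the chain with the full history, then store the reply. Here is a minimal plain-Python sketch of that loop. The names chat_turn and echo_model are hypothetical, and echo_model is a stand-in that calls no LLM:

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def chat_turn(history: List[Message], user_input: str,
              model: Callable[[List[Message]], str]) -> str:
    """Run one conversation turn, updating history in place."""
    history.append(("human", user_input))  # 1. store the new user message
    reply = model(history)                 # 2. the model sees the whole history
    history.append(("ai", reply))          # 3. store the reply for later turns
    return reply


def echo_model(messages: List[Message]) -> str:
    """Stand-in model: reports how many messages it received (hypothetical)."""
    return f"I can see {len(messages)} message(s)."


history: List[Message] = []
print(chat_turn(history, "Hello", echo_model))
print(chat_turn(history, "What did I just ask you?", echo_model))
```

The second call receives three messages rather than one, which is why the real chain can answer questions about earlier turns.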

## Automatic history management
The previous examples pass messages to the chain explicitly. This is a completely acceptable approach, but it does require external management of new messages. LangChain also includes a wrapper for LCEL chains, called RunnableWithMessageHistory, that can handle this process automatically.

To show how it works, let's slightly modify the above prompt to take a final input variable that populates a HumanMessage template after the chat history. This means that we will expect a chat_history parameter that contains all messages BEFORE the current message, rather than all messages including it:

In [7]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

We'll pass the latest input to the conversation here and let the RunnableWithMessageHistory class wrap our chain and do the work of appending that input variable to the chat history.

Next, let's declare our wrapped chain:



In [8]:
from langchain_core.runnables.history import RunnableWithMessageHistory

demo_ephemeral_chat_history_for_chain = ChatMessageHistory()

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history_for_chain,
    input_messages_key="input",
    history_messages_key="chat_history",
)

This class takes a few parameters in addition to the chain that we want to wrap:

- A factory function that returns a message history for a given session id. This allows your chain to handle multiple users at once by loading different messages for different conversations.
- An input_messages_key that specifies which part of the input should be tracked and stored in the chat history. In this example, we want to track the string passed in as input.
- A history_messages_key that specifies what the previous messages should be injected into the prompt as. Our prompt has a MessagesPlaceholder named chat_history, so we specify this property to match.
- (For chains with multiple outputs) an output_messages_key which specifies which output to store as history. This is the inverse of input_messages_key.

We can invoke this new chain as normal, with an additional configurable field that specifies the particular session_id to pass to the factory function. This is unused for the demo, but in real-world chains, you'll want to return a chat history corresponding to the passed session:

In [9]:
chain_with_message_history.invoke(
    {"input": "Translate this sentence from English to French: I love programming."},
    {"configurable": {"session_id": "unused"}},
)

AIMessage(content='The translation of "I love programming" in French is "J\'adore la programmation."', response_metadata={'token_usage': {'completion_tokens': 20, 'prompt_tokens': 39, 'total_tokens': 59}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-c442431a-8216-4221-b103-fc6a94181a63-0')

In [10]:
chain_with_message_history.invoke(
    {"input": "What did I just ask you?"}, {"configurable": {"session_id": "unused"}}
)

AIMessage(content='You just asked me to translate the sentence "I love programming" from English to French.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 74, 'total_tokens': 92}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-b940ac30-a83a-4f0d-bf26-02bd0fe08369-0')
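
The demo factory above ignores session_id and always returns the same history. A factory that keeps one history per session might look like the sketch below. The name make_session_factory is hypothetical, and plain lists stand in for history objects so the example runs without LangChain installed:

```python
from typing import Callable, Dict, TypeVar

H = TypeVar("H")


def make_session_factory(history_cls: Callable[[], H]) -> Callable[[str], H]:
    """Build a function mapping session_id to a history, created on first use."""
    store: Dict[str, H] = {}

    def get_session_history(session_id: str) -> H:
        if session_id not in store:
            store[session_id] = history_cls()  # new, empty history for this session
        return store[session_id]

    return get_session_history


# Plain lists stand in for chat history objects in this illustration
get_history = make_session_factory(list)
get_history("alice").append("Hi, I'm Alice.")
get_history("bob").append("Hi, I'm Bob.")
print(get_history("alice"))
```

In the notebook, make_session_factory(ChatMessageHistory) could be passed to RunnableWithMessageHistory in place of the lambda, so that each session_id gets its own history. Note that this store lives in memory and is lost when the process exits.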

## Modifying chat history
Modifying stored chat messages can help your chatbot handle a variety of situations. Here are some examples:

### Trimming messages
LLMs and chat models have limited context windows, and even if you're not directly hitting limits, you may want to limit the amount of distraction the model has to deal with. One solution is to only load and store the most recent n messages. Let's use an example history with some preloaded messages:

In [11]:
demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages

[HumanMessage(content="Hey there! I'm Nemo."),
 AIMessage(content='Hello!'),
 HumanMessage(content='How are you today?'),
 AIMessage(content='Fine thanks!')]

Let's use this message history with the RunnableWithMessageHistory chain we declared above:



In [12]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("human", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

chain_with_message_history.invoke(
    {"input": "What's my name?"},
    {"configurable": {"session_id": "unused"}},
)

AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5761849f-a925-4be3-b5a2-efd6cbde3652-0')

We can see the chain remembers the preloaded name.

But let's say we have a very small context window, and we want to trim the number of messages passed to the chain to only the 2 most recent ones. We can use the clear method to remove messages and re-add them to the history. We don't have to, but let's put this method at the front of our chain to ensure it's always called:

In [13]:
from langchain_core.runnables import RunnablePassthrough


def trim_messages(chain_input):
    """Keep only the 2 most recent stored messages; return True if any were dropped."""
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) <= 2:
        return False

    demo_ephemeral_chat_history.clear()

    # Re-add only the two most recent messages
    for message in stored_messages[-2:]:
        demo_ephemeral_chat_history.add_message(message)

    return True


chain_with_trimming = (
    RunnablePassthrough.assign(messages_trimmed=trim_messages)
    | chain_with_message_history
)

In [14]:
chain_with_trimming.invoke(
    {"input": "Where does P. Sherman live?"},
    {"configurable": {"session_id": "unused"}},
)

AIMessage(content="P. Sherman's address is 42 Wallaby Way, Sydney.", response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 53, 'total_tokens': 67}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f11b215f-8f2f-4a2f-8a3e-6d18b86a5472-0')

In [15]:
demo_ephemeral_chat_history.messages

[HumanMessage(content="What's my name?"),
 AIMessage(content='Your name is Nemo.', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 66, 'total_tokens': 72}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-5761849f-a925-4be3-b5a2-efd6cbde3652-0'),
 HumanMessage(content='Where does P. Sherman live?'),
 AIMessage(content="P. Sherman's address is 42 Wallaby Way, Sydney.", response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 53, 'total_tokens': 67}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f11b215f-8f2f-4a2f-8a3e-6d18b86a5472-0')]

In [16]:
chain_with_trimming.invoke(
    {"input": "What is my name?"},
    {"configurable": {"session_id": "unused"}},
)

AIMessage(content="I'm sorry, I don't have access to your personal information. Therefore, I don't know your name.", response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 61, 'total_tokens': 84}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d184170c-1bb2-4d21-a96c-cedfd7fd7c06-0')

In [17]:
demo_ephemeral_chat_history.messages

[HumanMessage(content='Where does P. Sherman live?'),
 AIMessage(content="P. Sherman's address is 42 Wallaby Way, Sydney.", response_metadata={'token_usage': {'completion_tokens': 14, 'prompt_tokens': 53, 'total_tokens': 67}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-f11b215f-8f2f-4a2f-8a3e-6d18b86a5472-0'),
 HumanMessage(content='What is my name?'),
 AIMessage(content="I'm sorry, I don't have access to your personal information. Therefore, I don't know your name.", response_metadata={'token_usage': {'completion_tokens': 23, 'prompt_tokens': 61, 'total_tokens': 84}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-d184170c-1bb2-4d21-a96c-cedfd7fd7c06-0')]
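
The outputs above show why the name is forgotten: each call trims the stored history to two messages before the new turn is appended. The plain-Python sketch below replays that sequence on strings instead of message objects. The helper name keep_last and the placeholder reply are hypothetical:

```python
from typing import List


def keep_last(messages: List[str], n: int) -> List[str]:
    """Return a new list holding only the n most recent messages."""
    return list(messages[-n:]) if n > 0 else []


# State of the history after the "What's my name?" call
history = [
    "Hey there! I'm Nemo.", "Hello!",
    "How are you today?", "Fine thanks!",
    "What's my name?", "Your name is Nemo.",
]

# Second call: trim first, then the new turn is appended
history = keep_last(history, 2) + ["Where does P. Sherman live?", "<model reply>"]

# Third call: trimming now drops every message that mentions the name
history = keep_last(history, 2)
print(history)
print(any("Nemo" in message for message in history))
```

A fixed message count is the simplest policy. Trimming by token count instead would track the context window more closely, at the cost of needing a tokenizer.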

## Summary memory
We can use this same pattern in other ways too. For example, we could use an additional LLM call to generate a summary of the conversation before calling our chain. Let's recreate our chat history and chatbot chain:

In [18]:
demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("Hey there! I'm Nemo.")
demo_ephemeral_chat_history.add_ai_message("Hello!")
demo_ephemeral_chat_history.add_user_message("How are you today?")
demo_ephemeral_chat_history.add_ai_message("Fine thanks!")

demo_ephemeral_chat_history.messages

[HumanMessage(content="Hey there! I'm Nemo."),
 AIMessage(content='Hello!'),
 HumanMessage(content='How are you today?'),
 AIMessage(content='Fine thanks!')]

We'll slightly modify the prompt to make the LLM aware that it will receive a condensed summary instead of a chat history:

In [19]:
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a helpful assistant. Answer all questions to the best of your ability. The provided chat history includes facts about the user you are speaking with.",
        ),
        MessagesPlaceholder(variable_name="chat_history"),
        ("user", "{input}"),
    ]
)

chain = prompt | chat

chain_with_message_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: demo_ephemeral_chat_history,
    input_messages_key="input",
    history_messages_key="chat_history",
)

And now, let's create a function that will distill previous interactions into a summary. We can add this one to the front of the chain too:

In [20]:
def summarize_messages(chain_input):
    """Replace the stored history with a single LLM-written summary message."""
    stored_messages = demo_ephemeral_chat_history.messages
    if len(stored_messages) == 0:
        return False  # nothing to summarize yet
    summarization_prompt = ChatPromptTemplate.from_messages(
        [
            MessagesPlaceholder(variable_name="chat_history"),
            (
                "user",
                "Distill the above chat messages into a single summary message. Include as many specific details as you can.",
            ),
        ]
    )
    summarization_chain = summarization_prompt | chat

    # One extra model call per invocation to condense the history
    summary_message = summarization_chain.invoke({"chat_history": stored_messages})

    # Swap the full history for the summary
    demo_ephemeral_chat_history.clear()

    demo_ephemeral_chat_history.add_message(summary_message)

    return True


chain_with_summarization = (
    RunnablePassthrough.assign(messages_summarized=summarize_messages)
    | chain_with_message_history
)

Let's see if it remembers the name we gave it:

In [21]:
chain_with_summarization.invoke(
    {"input": "What did I say my name was?"},
    {"configurable": {"session_id": "unused"}},
)

AIMessage(content='You introduced yourself as Nemo.', response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 97, 'total_tokens': 104}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-757628ad-a63a-4650-b7db-19f0fcfb5b8a-0')

In [22]:
demo_ephemeral_chat_history.messages

[AIMessage(content='The conversation is between Nemo and an unnamed individual. Nemo introduces himself and the other person responds with a greeting. The other person asks Nemo how he is doing, and Nemo responds that he is fine.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 62, 'total_tokens': 106}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-bd5cb3d7-a413-4615-8d4d-23d6debd38fa-0'),
 HumanMessage(content='What did I say my name was?'),
 AIMessage(content='You introduced yourself as Nemo.', response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 97, 'total_tokens': 104}, 'model_name': 'gpt-3.5-turbo-1106', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-757628ad-a63a-4650-b7db-19f0fcfb5b8a-0')]
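
As written, summarize_messages makes an extra model call on every invocation once the history is non-empty. One possible variant, sketched below in plain Python, summarizes only when the history reaches a threshold. The names compact_history, join_summary and min_messages are hypothetical, and join_summary is a stand-in that concatenates text rather than calling an LLM:

```python
from typing import Callable, List


def compact_history(messages: List[str],
                    summarize: Callable[[List[str]], str],
                    min_messages: int = 1) -> List[str]:
    """Replace the history with one summary once it holds at least min_messages."""
    if len(messages) < min_messages:
        return list(messages)  # too short to be worth an extra model call
    return [summarize(messages)]


def join_summary(messages: List[str]) -> str:
    """Stand-in summarizer: concatenates messages instead of calling an LLM."""
    return "Summary: " + " | ".join(messages)


history = ["Hey there! I'm Nemo.", "Hello!", "How are you today?", "Fine thanks!"]
history = compact_history(history, join_summary, min_messages=4)
print(history)
```

With the default min_messages=1, this behaves like the notebook's function: any non-empty history is summarized. A higher threshold trades fewer summarization calls for longer prompts between summaries.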