# Enterprise ChatBot using RAG with LangChain and OpenAI 

This repository contains a sample implementation of an enterprise chatbot using Retrieval-Augmented Generation (RAG) with LangChain and OpenAI. The chatbot is designed to answer questions based on a set of documents, providing accurate and contextually relevant responses.

## Features
- Document Ingestion: Load and process documents from various sources.
- Vector Store: Use Chroma to store and retrieve document embeddings.
- Language Model: Utilize OpenAI's GPT-5-nano for generating responses.
- Conversational Memory: Maintain context across multiple interactions.
- User Interface: Simple Gradio web interface for interaction.

## Requirements
- Python 3.8+
- OpenAI API Key
- Required Python packages (see `requirements.txt`)


In [1]:
import os
import numpy as np
import glob
from dotenv import load_dotenv
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.agents import AgentType, initialize_agent, Tool, AgentExecutor, create_openai_tools_agent
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.schema import Document, SystemMessage
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_chroma import Chroma
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationalRetrievalChain, RetrievalQA
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
import plotly.graph_objects as go
from langchain_community.tools import DuckDuckGoSearchRun
from langgraph.prebuilt import create_react_agent
from langgraph.checkpoint.memory import MemorySaver
from langchain.prompts import PromptTemplate, ChatPromptTemplate, MessagesPlaceholder
from langchain import hub



In [2]:
# SET UP SOME CONSTANTS
MODEL = 'gpt-5-nano'
db_name = 'company_db'

In [3]:
# Load environment variables (OpenAI API key)
load_dotenv()

True
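
The `True` above only shows that `load_dotenv()` set at least one variable; it does not confirm that the API key was among them. A minimal sketch of a preflight check follows, assuming the key is stored under `OPENAI_API_KEY`. The helper name `missing_settings` is hypothetical and not part of the notebook.

```python
import os

def missing_settings(required, env=None):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]

# Assumption: the key is stored under the OPENAI_API_KEY variable.
missing = missing_settings(["OPENAI_API_KEY"])
if missing:
    print(f"Missing settings: {', '.join(missing)}")
```

Running this right after `load_dotenv()` surfaces a missing key before the first embedding call rather than in the middle of it.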

<!-- ### Read the Docs and split them into chunks -->

In [4]:
# Read documents using LangChain loaders
# Take everything in all the sub-folders of the data directory

folders = [f for f in glob.glob(r"data/*") if not f.endswith('.ipynb')]

print(folders)

def add_metadata(doc, doc_type):
    doc.metadata['doc_type'] = doc_type
    return doc

text_loader_kwargs = {'encoding': 'utf-8'}

documents = []
for folder in folders:
    doc_type = os.path.basename(folder)
    loader = DirectoryLoader(folder, glob= "**/*.md", loader_cls= TextLoader, loader_kwargs= text_loader_kwargs, show_progress= True)
    folder_docs = loader.load()
    print(folder_docs)
    documents.extend([add_metadata(doc, doc_type) for doc in folder_docs])

text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1500, 
                                      chunk_overlap = 200,
                                      separators= ["\n\n","\n","." ," ", ""])
chunks = text_splitter.split_documents(documents)

print(f'total number of chunks {len(chunks)}')
# print(f'Document types found: {set()}')



['data\\company', 'data\\employees', 'data\\products']


100%|██████████| 1/1 [00:00<00:00, 86.85it/s]


[Document(metadata={'source': 'data\\company\\ColumbusAI_Solutions.md'}, page_content='# Company Record\n\n# ColumbusAI Solutions\n\n## Overview\nColumbusAI Solutions is a global technology firm focused on artificial intelligence, automation, and data-driven consulting. Our mission is to empower enterprises to achieve autonomous operations and data-driven decisions through trustworthy AI. Our vision is to become a leading partner for organizations pursuing scalable, ethical, and responsible AI at every layer of the enterprise. Core focus: AI-driven automation, software and data deployment, predictive analytics, AI consulting and strategy, and custom model development. We develop and deploy AI-based products for automation, business intelligence, and enterprise digital transformation.\n\n## History and Foundation\nColumbusAI Solutions was founded in 2016 by Dr. Elena Koslov, a former AI scientist with deep experience in enterprise-scale data platforms. The founding team launched the com

100%|██████████| 5/5 [00:00<00:00, 94.93it/s]


[Document(metadata={'source': 'data\\employees\\Ernesto Jimenez.md'}, page_content='# HR Record\n\n# Ernesto Jimenez\n\n## Summary\n- **Date of Birth:** 1989-03-22\n- **Job Title:** Software Engineer\n- **Location:** Madrid, Spain\n\n## Professional Trajectory\n- **2014** - Software Engineer at CodePulse; backend services and APIs.\n- **2018** - Senior Software Engineer at TechWave; led microservices migration and performance optimization.\n- **2021** - Backend Engineer at AI Solutions Ltd; designed scalable services and API contracts.\n- **Current** - Senior Software Engineer at our company; focuses on distributed systems, containerized deployments, and cloud-native architecture.\n\n## Education & Certifications\n- BSc in Computer Science, University of Valencia\n- AWS Solutions Architect – Associate, 2019\n- Google Cloud Professional Cloud Developer, 2020\n\n## Awards & Recognitions\n- Hackathon Winner, Global Hack Challenge 2020\n- Open Source Contributor of the Year 2022\n\n## Skil

100%|██████████| 10/10 [00:00<00:00, 94.55it/s]

[Document(metadata={'source': 'data\\products\\AtlasFlow.md'}, page_content='# Product Record\n\n# AtlasFlow\n\n## Overview\nAtlasFlow is an AI-driven automation platform for workflow orchestration and robotic process automation (RPA). It enables enterprises to design, deploy, and monitor automated processes across cloud, on-prem, and hybrid environments. It targets IT operations, business process owners, and developers who want reliable, auditable automation at scale.\n\n## Core Features\n- AI-powered orchestration engine with deterministic scheduling and human-in-the-loop capabilities\n- RPA bot lifecycle management with versioning, fault tolerance, and centralized governance\n- AI-assisted action inference using OpenAI API and LangChain for natural-language automation and intent recognition\n- Cloud-native deployment with Kubernetes, Git-based pipelines, and native connectors to Salesforce, SAP, and Power BI integration\n- Policy-based governance, RBAC, auditing, and compliance repo
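
To illustrate what `chunk_size` and `chunk_overlap` mean in the splitter cell above, here is a simplified sketch using fixed-size character windows. It is not how `RecursiveCharacterTextSplitter` works internally: the real splitter also tries the listed separators in order so that chunks break on paragraph or sentence boundaries. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks

# Tiny example: windows of 4 characters, each sharing 1 character with the previous one.
for chunk in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
    print(chunk)
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one neighbouring chunk, at the cost of storing some text twice.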




In [5]:
# Put the chunks of data into a Vector Store that associates a Vector Embedding with each chunk

embeddings = OpenAIEmbeddings()

if os.path.exists(db_name):
    Chroma(persist_directory= db_name, embedding_function= embeddings).delete_collection()

# Create vector store
vectorstore = Chroma.from_documents(documents=chunks, embedding= embeddings, persist_directory= db_name)
collection = vectorstore._collection
print(f"Vectorstore created with {vectorstore._collection.count()} documents")

Vectorstore created with 29 documents
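
The idea behind retrieving from a vector store can be sketched with plain cosine similarity over toy vectors. This is an illustration only: Chroma uses its own index and distance settings, and real embeddings have far more dimensions. The names `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vector, stored, k=2):
    """Return the k (score, text) pairs most similar to query_vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in stored]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]

# Toy 3-dimensional "embeddings" standing in for the real ones.
stored = [
    ("product overview", [0.9, 0.1, 0.0]),
    ("employee record", [0.1, 0.9, 0.0]),
    ("company history", [0.0, 0.2, 0.9]),
]
for score, text in top_k([1.0, 0.2, 0.0], stored):
    print(f"{score:.3f}  {text}")
```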


In [6]:
# Prework (with thanks to Jon R for identifying and fixing a bug in this!)

result = collection.get(include=['embeddings', 'documents', 'metadatas'])
vectors = np.array(result['embeddings'])
documents = result['documents']
metadatas = result['metadatas']
doc_types = [metadata['doc_type'] for metadata in metadatas]
colors = [['blue', 'green', 'red', 'orange'][['products', 'employees', 'contracts', 'company'].index(t)] for t in doc_types]
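
The nested `list.index` lookup above raises `ValueError` if the data directory ever contains a document type outside the four listed names. A small sketch of a dictionary lookup with a fallback follows; the gray fallback colour and the names `COLOR_BY_TYPE` and `colors_for` are assumptions for illustration.

```python
COLOR_BY_TYPE = {
    "products": "blue",
    "employees": "green",
    "contracts": "red",
    "company": "orange",
}
FALLBACK_COLOR = "gray"  # assumption: any unlisted doc_type is drawn in gray

def colors_for(doc_types):
    """Map each document type to its plot colour, using the fallback for unknown types."""
    return [COLOR_BY_TYPE.get(t, FALLBACK_COLOR) for t in doc_types]

print(colors_for(["company", "employees", "products", "legal"]))
```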

In [8]:
# We humans find it easier to visualize things in 2D!
# Reduce the dimensionality of the vectors to 2D using t-SNE
# (t-distributed stochastic neighbor embedding)
perplexity = max(5, min(30, len(vectors) // 3))
tsne = TSNE(n_components=2, random_state=42, perplexity= perplexity)
reduced_vectors = tsne.fit_transform(vectors)

# Create the 2D scatter plot
fig = go.Figure(data=[go.Scatter(
    x=reduced_vectors[:, 0],
    y=reduced_vectors[:, 1],
    mode='markers',
    marker=dict(size=5, color=colors, opacity=0.8),
    text=[f"Type: {t}<br>Text: {d[:100]}..." for t, d in zip(doc_types, documents)],
    hoverinfo='text'
)])

fig.update_layout(
    title='2D Chroma Vector Store Visualization',
    # 'scene' only applies to 3D plots; 2D axis titles are set on the layout directly
    xaxis_title='x',
    yaxis_title='y',
    width=800,
    height=600,
    margin=dict(r=20, b=10, l=10, t=40)
)

fig.show()
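
scikit-learn's `TSNE` expects the perplexity to be smaller than the number of samples. The expression `max(5, min(30, len(vectors) // 3))` returns 5 for very small collections, which would be too large when five or fewer vectors are stored. A hedged sketch of a stricter clamp follows; `safe_perplexity` is a hypothetical helper, and the bounds 5 and 30 are taken from the cell above.

```python
def safe_perplexity(n_samples, low=5, high=30):
    """Clamp n_samples // 3 to [low, high], then keep the result below n_samples."""
    if n_samples < 2:
        raise ValueError("t-SNE needs at least two samples")
    clamped = max(low, min(high, n_samples // 3))
    return min(clamped, n_samples - 1)

# The notebook's collection holds 29 vectors.
print(safe_perplexity(29))
```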

<!-- ### Use langchain to bring it all together -->

In [10]:
# Create the chat model
llm = ChatOpenAI(model=MODEL)

# The retriever is an abstraction over the vector store that is used during RAG
retriever = vectorstore.as_retriever()

# With chain_type='stuff', all retrieved chunks are placed into a single prompt
rag_chain = RetrievalQA.from_chain_type(
    llm = llm,
    retriever = retriever,
    chain_type = 'stuff'
)

# Wrap the RAG chain in a function so it can be passed to a Tool
def query_company_knowledge(query):
    # invoke() returns a dict; the answer text is stored under 'result'
    return rag_chain.invoke({"query": query})["result"]


# Set up the tools (web search and company-specific queries)

tools = [Tool(
    name='CompanyKnowledge',
    func= query_company_knowledge,
    description= "Use this to answer questions about company data and strategy."),
Tool(
    name= 'WebSearch',
    func= DuckDuckGoSearchRun().run,
    description= 'Use this when you need to perform a web search or search the latest news related to the company.')
]



# This has the correct format with all required variables
# prompt = hub.pull("hwchase17/react")

# Create the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an AI strategist working at ColumbusAI Solutions.
Your mission is to analyze company data, propose innovative AI-based products,
and stay updated with the latest AI trends.

You have access to two tools:
1. 'CompanyKnowledge' — retrieves internal information from the company's data.
2. 'WebSearch' — searches the web for recent or external information.

If you receive a query about a person, event, or fact not found in internal data,
you **must** use the WebSearch tool before answering.
If both tools return nothing, respond with:
"I don't have reliable information on that yet."

Always reason clearly, use tools first, and only synthesize insights afterward."""),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("user", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

# Create the agent using the built-in function
agent = create_openai_tools_agent(
    llm=llm,
    tools=tools,
    prompt=prompt
)

# Set up memory
memory = ConversationBufferMemory(
    memory_key='chat_history',
    return_messages=True,
    output_key='output'
)

# Create the agent executor
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    memory=memory,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=5
)
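
Conceptually, the executor looks up the tool the model asked for by name, runs it, and feeds the result back to the model as an observation. The sketch below shows only that lookup step with stand-in tools. It is not LangChain's implementation, and `dispatch` and `toy_tools` are hypothetical names.

```python
def dispatch(tools, tool_name, tool_input):
    """Run the named tool and return its observation as text."""
    tool = tools.get(tool_name)
    if tool is None:
        return f"Unknown tool: {tool_name}"
    try:
        return str(tool(tool_input))
    except Exception as exc:  # report the failure as text instead of crashing the loop
        return f"Tool {tool_name} failed: {exc}"

# Stand-in tools; the notebook's real tools call the RAG chain and DuckDuckGo.
toy_tools = {
    "CompanyKnowledge": lambda q: f"internal answer for: {q}",
    "WebSearch": lambda q: f"web results for: {q}",
}
print(dispatch(toy_tools, "CompanyKnowledge", "AtlasFlow"))
print(dispatch(toy_tools, "Calculator", "2 + 2"))
```

Returning errors as text, rather than raising, lets the model see what went wrong and choose a different tool on the next iteration.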


In [85]:
response = agent_executor.invoke({"input": "who do you think could make the best dashboard in the current employees?"})
print(response['output'])



> Entering new AgentExecutor chain...
My pick would be Santiago Rojas.

Why Santiago:
- Role fit: Data Analyst with a proven track record in analytics, insights, and cross-team collaboration.
- Visualization skills: Proficient in Tableau and Power BI; strong data storytelling and visualization capabilities.
- Technical grounding: SQL, Python, R; experience with ETL and data wrangling, which helps in building reliable, end-to-end dashboards.
- Certifications and experience: Tableau Desktop Specialist; has worked as a data analyst in prior roles and at our company, which means he knows our data context.

Runner-ups:
- Helena Suarez (Senior Data Engineer): Excellent for dashboards that require robust data pipelines, governance, and scalable data sources. She’d be ideal as a data-source enabler and for dashboards needing strong data quality and pipeline design.
- Samuel Etoo (Senior Accountant): Best for finance-focused dashboards (GAAP reporting, budgeting, consolid

In [87]:
import gradio as gr
from langchain_openai import ChatOpenAI
from langchain.memory import ConversationBufferMemory
from langchain.chains import RetrievalQA
from langchain.agents import Tool, AgentExecutor, create_openai_tools_agent
from langchain_community.tools import DuckDuckGoSearchRun
from langchain.prompts import ChatPromptTemplate, MessagesPlaceholder
import time

# Assuming MODEL and vectorstore are already defined
# MODEL = "gpt-4"
# vectorstore = ... (your vector store)

# Create the retriever
retriever = vectorstore.as_retriever()

# Create the RAG chain
rag_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model=MODEL),
    retriever=retriever,
    chain_type='stuff'
)

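def query_company_knowledge(query: str) -> str:
    """Query internal company knowledge base."""
    # Chain.run is deprecated; invoke() returns a dict with the answer under 'result'
    return rag_chain.invoke({"query": query})["result"]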

# Set up the tools
tools = [
    Tool(
        name='CompanyKnowledge',
        func=query_company_knowledge,
        description="Use this to answer questions about company data and strategy."
    ),
    Tool(
        name='WebSearch',
        func=DuckDuckGoSearchRun().run,
        description='Use this when you need to perform a web search or search latest news.'
    )
]

# Create the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", """You are an AI strategist working at ColumbusAI Solutions.
Your mission is to analyze company data, propose innovative AI-based products,
and stay updated with the latest AI trends.

You have access to two tools:
1. 'CompanyKnowledge' — retrieves internal information from the company's data.
2. 'WebSearch' — searches the web for recent or external information.

If you receive a query about a person, event, or fact not found in internal data,
you must use the WebSearch tool before answering.
If both tools return nothing, respond with: "I don't have reliable information on that yet."

Always reason clearly, use tools first, and only synthesize insights afterward."""),
    MessagesPlaceholder(variable_name="chat_history", optional=True),
    ("user", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad")
])

# Global memory
memory = ConversationBufferMemory(
    memory_key='chat_history',
    return_messages=True,
    output_key='output'
)

def chat_with_agent(message, history):
    """
    Process user message and yield streaming response with status updates.
    """
    if not message.strip():
        yield "", "⚠️ Please enter a message"
        return
    
    try:
        # Show initial status
        yield "", "🤔 Thinking..."
        
        # Create LLM
        llm = ChatOpenAI(model=MODEL, temperature=0)
        
        # Create agent
        agent = create_openai_tools_agent(llm=llm, tools=tools, prompt=prompt)
        
        # Create agent executor
        agent_executor = AgentExecutor(
            agent=agent,
            tools=tools,
            memory=memory,
            verbose=False,  # Set True to see logs
            handle_parsing_errors=True,
            max_iterations=5,
            return_intermediate_steps=True
        )
        
        # Status update: Searching
        yield "", "🔍 Searching for information..."
        
        # Invoke agent
        response = agent_executor.invoke({"input": message})
        
        # Check which tools were used
        intermediate_steps = response.get('intermediate_steps', [])
        tools_used = []
        if intermediate_steps:
            for step in intermediate_steps:
                tool_name = step[0].tool
                tools_used.append(tool_name)
        
        # Create status message
        if tools_used:
            # De-duplicate while keeping the order in which the tools were called
            tools_str = ", ".join(dict.fromkeys(tools_used))
            status = f"✅ Used: {tools_str}"
        else:
            status = "✅ Response generated"
        
        # Get final response
        final_response = response['output']
        
        # Stream the response character by character
        streamed_text = ""
        for i, char in enumerate(final_response):
            streamed_text += char
            # Update every few characters for smoother streaming
            if i % 3 == 0 or i == len(final_response) - 1:
                yield streamed_text, status
                time.sleep(0.01)  # Small delay for visual effect
        
        # Final yield with complete response
        yield final_response, status
        
    except Exception as e:
        error_msg = f"❌ Error: {str(e)}"
        yield "", error_msg

def reset_conversation():
    """Reset conversation memory."""
    global memory
    memory.clear()
    return [], "", "💬 Conversation cleared. Ready for new questions!"

# Create Gradio interface with Blocks for more control
with gr.Blocks(title="ColumbusAI Solutions", theme=gr.themes.Soft()) as demo:
    
    # Header
    gr.Markdown("""
    # 🤖 ColumbusAI Solutions - AI Strategist Assistant
    
    Your intelligent assistant for company insights, AI trends, and strategic recommendations.
    """)
    
    # Main chat interface
    with gr.Row():
        with gr.Column(scale=4):
            chatbot = gr.Chatbot(
                height=500,
                label="Chat History",
                avatar_images=(None, "🤖"),
                bubble_full_width=False,
                show_copy_button=True
            )
            
            # Status indicator
            status = gr.Textbox(
                label="Status",
                value="💬 Ready to chat!",
                interactive=False,
                max_lines=1
            )
            
            # Input area
            with gr.Row():
                msg = gr.Textbox(
                    label="Your Message",
                    placeholder="Ask me anything...",
                    lines=2,
                    scale=4
                )
                submit = gr.Button("Send 📤", variant="primary", scale=1)
            
            # Action buttons
            with gr.Row():
                clear = gr.Button("🗑️ Clear Chat")
                retry = gr.Button("🔄 Retry Last")
        
        # Sidebar with examples and info
        with gr.Column(scale=1):
            gr.Markdown("""
            ### 💡 Example Questions
            
            **Company Info:**
            - What is our AI strategy?
            - Tell me about our products
            
            **People:**
            - Who is Juan Rojas?
            
            **Trends:**
            - Latest AI developments
            - AI market trends 2024
            
            **Innovation:**
            - Suggest new AI products
            - Competitive analysis
            
            ---
            
            ### 🔧 Features
            - ✅ Real-time streaming
            - 🔍 Internal knowledge search
            - 🌐 Web search capability
            - 💾 Conversation memory
            - 📊 Status indicators
            """)
    
    # Event handlers
    def submit_message(message, history):
        """Handle message submission."""
        # Add user message to history
        history = history + [[message, None]]
        return "", history
    
    def bot_response(history):
        """Generate bot response with streaming."""
        if not history or history[-1][1] is not None:
            return history, "💬 Ready"
        
        user_message = history[-1][0]
        
        # Stream the response
        for partial_response, status_msg in chat_with_agent(user_message, history[:-1]):
            history[-1][1] = partial_response
            yield history, status_msg
    
    # Connect events
    msg.submit(
        submit_message,
        inputs=[msg, chatbot],
        outputs=[msg, chatbot]
    ).then(
        bot_response,
        inputs=[chatbot],
        outputs=[chatbot, status]
    )
    
    submit.click(
        submit_message,
        inputs=[msg, chatbot],
        outputs=[msg, chatbot]
    ).then(
        bot_response,
        inputs=[chatbot],
        outputs=[chatbot, status]
    )
    
    clear.click(
        reset_conversation,
        outputs=[chatbot, msg, status]
    )
    
    retry.click(
        lambda hist: (hist[:-1] if hist else [], hist[-1][0] if hist else ""),
        inputs=[chatbot],
        outputs=[chatbot, msg]
    )
    
    # Footer
    gr.Markdown("""
    ---
    <center>
    <small>Powered by LangChain + OpenAI | ColumbusAI Solutions © 2024</small>
    </center>
    """)

# Launch
if __name__ == "__main__":
    demo.launch(
        # server_name="0.0.0.0",
        # server_port=7860,
        share=False,  # Set True to create public link
        debug=True,
        show_error=True
    )


You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style dictionaries with 'role' and 'content' keys.


The 'bubble_full_width' parameter is deprecated and will be removed in a future version. This parameter no longer has any effect.



* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.



This package (`duckduckgo_search`) has been renamed to `ddgs`! Use `pip install ddgs` instead.





Keyboard interruption in main thread... closing server.
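
The Gradio cell above simulates streaming: the agent returns the whole answer first, and the handler then re-yields it as growing prefixes. A minimal sketch of that pattern as a standalone generator follows, assuming a fixed slice size. `stream_in_chunks` is a hypothetical name, and the sketch omits the `time.sleep` delay used in the cell.

```python
def stream_in_chunks(text, chunk_size=3):
    """Yield progressively longer prefixes of text, ending with the full text."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for end in range(chunk_size, len(text), chunk_size):
        yield text[:end]
    if text:
        yield text  # the final yield always carries the complete text

for partial in stream_in_chunks("abcdefgh"):
    print(partial)
```

Because the complete text is yielded exactly once at the end, the caller does not need a separate final `yield` after the loop.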


In [7]:
import requests

response = requests.post("http://127.0.0.1:8000/chat", json={"query": "list my 4 previous prompts please?"})
print(response.json())


{'response': '- "hello"\n- "what was my previous prompt?"\n- "who is Juan Rojas?"\n- "who is Juan Rojas in my company?"'}
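
The server behind the `/chat` endpoint is not shown in this notebook. As a rough illustration of how a buffer-style memory can answer a question like the one above, here is a stdlib-only sketch. It is not `ConversationBufferMemory`'s implementation, and the class and method names are hypothetical.

```python
class ConversationBuffer:
    """Keep every turn of a conversation in order."""

    def __init__(self):
        self._turns = []  # list of (role, text) tuples

    def add(self, role, text):
        self._turns.append((role, text))

    def user_prompts(self, last_n=None):
        """Return the user's prompts, optionally only the most recent last_n."""
        prompts = [text for role, text in self._turns if role == "user"]
        if last_n is None:
            return prompts
        return prompts[max(0, len(prompts) - last_n):]

    def clear(self):
        self._turns.clear()

buffer = ConversationBuffer()
for prompt in ["hello", "what was my previous prompt?", "who is Juan Rojas?"]:
    buffer.add("user", prompt)
    buffer.add("assistant", "(reply omitted)")
print(buffer.user_prompts(last_n=2))
```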
