# Part 4: End-to-End & Monitoring

We have built ingestion, retrieval, and evaluation pipelines. Now, let's wrap them into a service that could (hypothetically) go to production.

## 1. The Production Pipeline
In production, you don't run notebook cells; clients call an API endpoint such as `POST /chat`.
We will simulate that endpoint with a `chat_endpoint(user_query)` function.

In [None]:
import os
import logging
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.chains import RetrievalQA

# Setup Logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

load_dotenv()
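llm = ChatOpenAI(model="gpt-4o")
embeddings = OpenAIEmbeddings()
# NOTE: without persist_directory, Chroma opens an in-memory collection.
# If ingestion persisted to disk, pass the same persist_directory here,
# otherwise the retriever may have no documents to search.
vectorstore = Chroma(collection_name="rag_training_v1", embedding_function=embeddings)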

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)

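def chat_endpoint(user_query: str) -> str:
    """Answer a query via the RAG chain; return a fallback message on failure."""
    logger.info("Received query: %s", user_query)
    try:
        # RetrievalQA reads its input from the "query" key and returns the answer under "result"
        response = qa_chain.invoke({"query": user_query})
        logger.info("Generated response successfully.")
        return response["result"]
    except Exception:
        # logger.exception records the full traceback, not just the message
        logger.exception("Error processing query")
        return "Sorry, something went wrong."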
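
The `POST /chat` endpoint mentioned above can be sketched with only the standard library. This is an illustration under stated assumptions, not a recommended deployment: `handle_chat_request`, `ChatHandler`, `make_server`, the `{"query": ...}` request shape, and the `{"answer": ...}` response shape are all hypothetical, and the stub `chat_endpoint` stands in for the real one so the block runs on its own. A real service would more likely sit behind a web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def chat_endpoint(user_query: str) -> str:
    # Stub (assumption): stands in for the notebook's chat_endpoint.
    return f"(stub answer for: {user_query})"


def handle_chat_request(raw_body: bytes) -> Tuple[int, dict]:
    """Validate a JSON request body and return (HTTP status, response payload)."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "body must be valid JSON"}
    query = payload.get("query") if isinstance(payload, dict) else None
    if not isinstance(query, str) or not query.strip():
        return 400, {"error": "expected a non-empty string field 'query'"}
    return 200, {"answer": chat_endpoint(query)}


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/chat":
            status, body = 404, {"error": "not found"}
        else:
            length = int(self.headers.get("Content-Length") or 0)
            status, body = handle_chat_request(self.rfile.read(length))
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port: int = 8000) -> HTTPServer:
    """Build (but do not start) a local server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), ChatHandler)
```

Calling `make_server().serve_forever()` starts the server locally. Keeping validation in `handle_chat_request` means the request logic can be tested without opening a socket.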

## 2. Interactive Loop
A simple loop to test our "endpoint".

In [None]:
print("Type 'exit' to quit.")
while True:
    q = input("User: ").strip()
    if q.lower() == "exit":
        break
    if not q:
        continue  # skip empty input instead of sending it to the chain
    
    answer = chat_endpoint(q)
    print(f"Bot: {answer}\n")

## 3. Monitoring (Concept)
In a real app, you would send traces to an observability tool such as **LangSmith** or **Arize Phoenix**.
For LangSmith, LangChain only needs two environment variables (Arize Phoenix needs its own instrumentation setup instead):
```bash
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=...
```
With these set, LangChain records a trace for each chain run, including the retriever call, the LLM call with its token usage, and the latency of each step.
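
A tracing backend records latency for you. As a rough local fallback when no backend is configured, the same number can be captured with a small decorator. This is a minimal sketch, not part of LangChain: `log_latency`, the `rag.monitoring` logger name, and the stub `chat_endpoint` are hypothetical names introduced for illustration.

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("rag.monitoring")  # hypothetical logger name


def log_latency(func):
    """Log how long each call to `func` takes, whether it returns or raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        status = "error"  # overwritten only if the call returns normally
        try:
            result = func(*args, **kwargs)
            status = "ok"
            return result
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s status=%s latency_ms=%.1f", func.__name__, status, elapsed_ms)
    return wrapper


@log_latency
def chat_endpoint(user_query: str) -> str:
    # Stub (assumption): stands in for the notebook's chat_endpoint.
    return f"(stub answer for: {user_query})"
```

Each call then emits one log line with the function name, outcome, and elapsed milliseconds, which is enough to spot slow queries before wiring up full tracing.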

## Final Words
You now have an end-to-end RAG setup:
1. **Ingestion**: Handling multiple file types.
2. **Retrieval**: Using advanced strategies only when needed.
3. **Evaluation**: Measuring quality with repeatable metrics.
4. **Production**: Logging and error handling.

**Next Steps:**
- Replace synthetic data with your real documents.
- Tune chunk sizes based on RAGAS metrics.
- Deploy this notebook as a Streamlit app in Deepnote.