
# Chapter 11: Project Walkthroughs and Capstone

This notebook walks through three small, hands-on Generative AI projects:
- A Retrieval-Augmented Generation (RAG) pipeline
- A conversational chatbot with LangChain
- A multimodal application combining text and vision

## Learning Objectives

- Combine components from previous chapters into complete applications
- Implement a functional GenAI-powered chatbot
- Build and test a RAG pipeline using LangChain + FAISS
- Explore multimodal inputs using CLIP or BLIP



## Project 1: Retrieval-Augmented Generation (RAG)

We'll build a simple RAG system with LangChain, using Hugging Face sentence embeddings, a FAISS vector store for retrieval, and an OpenAI model to generate the answer.


In [None]:

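# These import paths match older LangChain releases; newer releases expose
# the same classes from langchain_community and langchain_openai.
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI  # reads the OPENAI_API_KEY environment variable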

# Sample data
texts = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the tallest mountain in the world.",
    "The Great Wall of China is located in northern China."
]

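# Embed each text and index the vectors in an in-memory FAISS store
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = FAISS.from_texts(texts, embeddings)
retriever = vectorstore.as_retriever()

# The chain retrieves the most similar texts and passes them, with the question, to the LLM
qa_chain = RetrievalQA.from_chain_type(llm=OpenAI(), retriever=retriever)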
result = qa_chain.run("Where is the Eiffel Tower?")
print("RAG Output:", result)
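
To make the retrieval step less of a black box, here is a minimal sketch of what a retriever does conceptually: embed the query, score each document by similarity, and place the best matches into the prompt. Everything in it is an assumption for illustration. The `embed`, `cosine`, `retrieve`, and `build_prompt` helpers are hypothetical, and the word-count "embedding" is only a stand-in for a real sentence-embedding model and a FAISS index.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(documents, key=lambda doc: cosine(query_vec, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_docs):
    """Combine the retrieved passages and the question into one prompt for an LLM."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


docs = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the tallest mountain in the world.",
]
question = "Where is the Eiffel Tower?"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

A real pipeline replaces the word counts with dense vectors and the sorted scan with an approximate nearest-neighbour index, but the flow of query, ranked passages, and augmented prompt stays the same.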



## Project 2: LangChain Chatbot with Memory

This chatbot remembers previous conversation turns using `ConversationBufferMemory`, which stores the full conversation history and includes it in each new prompt.


In [None]:

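from langchain.chains import ConversationChain
from langchain.llms import OpenAI  # repeated from Project 1 so this cell runs on its own
from langchain.memory import ConversationBufferMemory

# Buffer memory keeps every turn and feeds the history back into each prompt
memory = ConversationBufferMemory()
chatbot = ConversationChain(llm=OpenAI(), memory=memory)

# Print both replies; a notebook cell would otherwise display only the last one
print(chatbot.predict(input="Hi, I am Alice."))
print(chatbot.predict(input="What is my name?"))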
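
The idea behind buffer memory can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not LangChain's implementation: `BufferMemory`, `chat`, and `fake_llm` are hypothetical names, and `fake_llm` merely reports how much prompt text it received so the growing history is visible without calling a real model.

```python
class BufferMemory:
    """Stores every conversation turn verbatim, in order."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples

    def add(self, speaker, text):
        self.turns.append((speaker, text))

    def as_text(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def chat(user_input, memory, llm):
    """Build a prompt from the full history plus the new input, then record both sides."""
    history = memory.as_text()
    new_turn = f"Human: {user_input}\nAI:"
    prompt = f"{history}\n{new_turn}" if history else new_turn
    reply = llm(prompt)
    memory.add("Human", user_input)
    memory.add("AI", reply)
    return reply


def fake_llm(prompt):
    """Hypothetical stand-in for a real model: reports how many prompt lines it saw."""
    return f"(saw {len(prompt.splitlines())} prompt lines)"


memory = BufferMemory()
chat("Hi, I am Alice.", memory, fake_llm)
last_reply = chat("What is my name?", memory, fake_llm)
print(memory.as_text())
```

Because the whole history is resent on every turn, the prompt grows with the conversation. That is why the model can answer "What is my name?", and also why long chats eventually need a summarizing or windowed memory instead.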



## Project 3: Multimodal Image Captioning with BLIP

We'll use Salesforce BLIP to caption an image.


In [None]:

from transformers import BlipProcessor, BlipForConditionalGeneration
from PIL import Image
import requests

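# Download the demo image and convert it to RGB, since the model expects three channels
url = "https://raw.githubusercontent.com/salesforce/BLIP/main/demo.jpg"
response = requests.get(url, stream=True)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(response.raw).convert("RGB")

# Load the BLIP processor (preprocessing) and the captioning model
processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

# Preprocess the image, generate caption token IDs, and decode them to text
inputs = processor(image, return_tensors="pt")
output = model.generate(**inputs)
caption = processor.decode(output[0], skip_special_tokens=True)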

print("Caption:", caption)



## Capstone Project Suggestions

- AI Tutor: Chat + Retrieval + Grading
- Legal Assistant: Document retrieval + summarization + sentiment analysis
- Medical Bot: Symptom checker + recommendation
- AR/VR: Visual captioning + prompt-based 3D narration

## Deployment Options

- Hugging Face Spaces
- Streamlit + FastAPI + Docker
- LangServe for serving chains as an API, or Gradio for a quick web frontend

Plan your capstone in stages: data → pipeline → test → deploy → monitor
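
For the "test" stage, one lightweight option is a keyword-based smoke test that runs a fixed set of questions through the pipeline and flags answers missing an expected term. The sketch below is only one possible approach. `run_smoke_tests` and `stub_pipeline` are hypothetical names, and the stub stands in for a real chain such as the RAG chain from Project 1.

```python
def run_smoke_tests(answer_fn, cases):
    """Run each (question, expected_keyword) case and return the list of failures."""
    failures = []
    for question, expected_keyword in cases:
        answer = answer_fn(question)
        if expected_keyword.lower() not in answer.lower():
            failures.append((question, expected_keyword, answer))
    return failures


def stub_pipeline(question):
    """Hypothetical stand-in for a real chain; swap in your own callable."""
    canned = {"Where is the Eiffel Tower?": "The Eiffel Tower is in Paris."}
    return canned.get(question, "I don't know.")


cases = [
    ("Where is the Eiffel Tower?", "Paris"),
    ("What is the tallest mountain?", "Everest"),
]
failures = run_smoke_tests(stub_pipeline, cases)
print(f"{len(cases) - len(failures)}/{len(cases)} passed")
for question, keyword, answer in failures:
    print(f"FAILED: {question!r} expected {keyword!r}, got {answer!r}")
```

Keyword checks are crude, but they are cheap to rerun after every change to the data or prompts, which makes them a reasonable first gate before deployment.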



## Exercises

1. Add more documents to the RAG system and test with new questions.
2. Customize the chatbot to include tool access or role-play personalities.
3. Extend the image-captioning project with visual question answering, for example using BLIP-2.
4. Deploy one complete project using Streamlit or Docker.

## References

- LangChain: https://docs.langchain.com
- Hugging Face: https://huggingface.co
- BLIP Model: https://github.com/salesforce/BLIP
