# How to build a simple RAG LLM app with LangChain

## Concepts included
* RAG

In [1]:
import os
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())
openai_api_key = os.environ["OPENAI_API_KEY"]
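Reading `os.environ["OPENAI_API_KEY"]` directly raises a bare `KeyError` when the key is missing from `.env`. A minimal sketch of a friendlier check follows; the helper name `require_env` is hypothetical and not part of LangChain or python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a readable error.

    Hypothetical helper for illustration; call it after load_dotenv().
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or shell environment."
        )
    return value
```

With this in place, `openai_api_key = require_env("OPENAI_API_KEY")` could replace the direct lookup.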

In [2]:
MODEL_GPT = 'gpt-4o-mini'

In [3]:
# Linux/macOS: !pip list | grep langchain
# Windows:
!pip list | findstr langchain

langchain                                0.3.13
langchain-chroma                         0.1.4
langchain-community                      0.3.13
langchain-core                           0.3.43
langchain-experimental                   0.3.4
langchain-google-community               2.0.7
langchain-groq                           0.2.5
langchain-openai                         0.2.14
langchain-text-splitters                 0.3.4
langchainhub                             0.1.21


## Connect to the LLM and start a conversation with it

In [4]:
from langchain_openai import ChatOpenAI

# llm = ChatOpenAI(model="gpt-3.5-turbo")
llm = ChatOpenAI(model=MODEL_GPT)

### Track operations in LangSmith
* [Open LangSmith here](https://smith.langchain.com/)

In [5]:
# !pip install langchain-chroma langchain-community langchainhub

In [6]:
from langchain import hub
from langchain_chroma import Chroma
from langchain_community.document_loaders import TextLoader
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Load the source article as a list of Document objects.
# loader = TextLoader("./data/be-good.txt")
loader = TextLoader("../../data/be-good.txt")

docs = loader.load()

# Split the text into chunks of up to 1000 characters, with 200 characters of overlap.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)

splits = text_splitter.split_documents(docs)

# Embed each chunk and store the vectors in a Chroma vector store.
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Expose the vector store as a retriever that returns the chunks most similar to a query.
retriever = vectorstore.as_retriever()

# Pull a ready-made RAG prompt from LangChain Hub.
prompt = hub.pull("rlm/rag-prompt")

def format_docs(docs):
    # Join the retrieved chunks into one context string.
    return "\n\n".join(doc.page_content for doc in docs)


# The question goes to the retriever (to build the context) and is also passed through unchanged.
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

rag_chain.invoke("What is this article about?")

'The article discusses principles for startups, emphasizing the importance of creating products that people want and not overly focusing on immediate profitability. It suggests that successful businesses can operate with a charitable mindset, drawing parallels with companies like Craigslist. Ultimately, it explores the idea that effective business practices can resemble charitable operations while still achieving success.'
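To make `chunk_size` and `chunk_overlap` from the cell above more concrete, here is a rough pure-Python sketch of fixed-width chunking with overlap. It is a simplification: `RecursiveCharacterTextSplitter` prefers to split on separators such as paragraphs and sentences, while this hypothetical `split_with_overlap` function cuts at fixed character offsets.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters.

    Illustrative only; not the algorithm LangChain's splitter uses.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

Each chunk repeats the last 4 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.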

In [7]:
rag_chain.invoke("What is this article about, answer in Czech language?")

'Tento článek se zabývá myšlenkou, že úspěšné podnikání může fungovat jako charita, přičemž zdůrazňuje, že je důležité vytvořit něco, co lidé chtějí. Autor diskutuje o příkladu Craigslistu, který sice není charitou, ale přistupuje k podnikání s benevolentním přístupem. Zároveň naznačuje, že dobré úmysly samy o sobě nezaručují pozitivní výsledky.'
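The first step of `rag_chain` turns a single question into a dict with `context` and `question` keys. The sketch below imitates that data flow with the standard library only. The bag-of-words "embedding", the function names, and the toy sentences are all assumptions made for illustration; they stand in for `OpenAIEmbeddings`, Chroma, and the real article text.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt_inputs(question: str, chunks: List[str]) -> Dict[str, str]:
    """Mirror the dict step: retrieved context plus the untouched question."""
    context = "\n\n".join(retrieve(question, chunks))
    return {"context": context, "question": question}


# Made-up sentences for the demo, not quotes from be-good.txt.
toy_chunks = [
    "Make something people want.",
    "Craigslist is run like a charity.",
    "Bread needs flour, water, and yeast.",
]
inputs = build_prompt_inputs("How is Craigslist run?", toy_chunks)
print(inputs)
```

In the real chain, the resulting dict is what gets piped into `prompt`, then `llm`, then `StrOutputParser()`.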

### The prompt we pulled from LangChain Hub

In [8]:
prompt

ChatPromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"), additional_kwargs={})])

In [9]:
prompt.input_variables

['context', 'question']
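The two input variables are simply the placeholders in the template text shown above. As a rough illustration, the same template can be inspected and filled with plain Python string formatting. The helpers `template_variables` and `fill` are hypothetical, and this sketch skips the chat-message wrapping that `ChatPromptTemplate` adds.

```python
from string import Formatter
from typing import List

# Template text copied from the prompt output above.
TEMPLATE = (
    "You are an assistant for question-answering tasks. Use the following pieces "
    "of retrieved context to answer the question. If you don't know the answer, "
    "just say that you don't know. Use three sentences maximum and keep the "
    "answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"
)


def template_variables(template: str) -> List[str]:
    """List the placeholder names in a format string, sorted alphabetically."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})


def fill(template: str, **values: str) -> str:
    """Fill the template, failing loudly if a placeholder has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


print(template_variables(TEMPLATE))
print(fill(TEMPLATE, question="What is this article about?", context="(retrieved chunks)"))
```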

In [10]:
prompt.messages

[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], input_types={}, partial_variables={}, template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"), additional_kwargs={})]