# Week 4 Assignment: RAG - Prompt Engineering

In [32]:
import os 


In [None]:
!pip install langchain_chroma

## Load dependencies

In [4]:
# Import the required libraries
import bs4
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain import hub
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

## Load document from web

In [5]:
# Load web doc via WebBaseLoader

bs4_strainer = bs4.SoupStrainer(class_=("post-title","post-header","post-content"))
loader = WebBaseLoader(
    web_path=("https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/"),
    bs_kwargs={"parse_only": bs4_strainer},
)
docs = loader.load()


In [11]:
print(len(docs[0].page_content))


29295


## Split doc

In [18]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200, add_start_index=True
)
# `docs` comes from the loader cell above
all_splits = text_splitter.split_documents(docs)


In [19]:
# check chunks
print(len(all_splits))
print(len( all_splits[0].page_content ))
print(all_splits[1].page_content)


43
103
Prompt Engineering, also known as In-Context Prompting, refers to methods for how to communicate with LLM to steer its behavior for desired outcomes without updating the model weights. It is an empirical science and the effect of prompt engineering methods can vary a lot among models, thus requiring heavy experimentation and heuristics.
This post only focuses on prompt engineering for autoregressive language models, so nothing with Cloze tests, image generation or multimodality models. At its core, the goal of prompt engineering is about alignment and model steerability. Check my previous post on controllable text generation.


In [20]:
# Print the chunk metadata

print(all_splits[1].metadata)


{'source': 'https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/', 'start_index': 114}
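
To make `chunk_size`, `chunk_overlap`, and `start_index` more concrete, here is a minimal sketch of fixed-window chunking in plain Python. It is a simplification, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraph breaks, which is why the first chunk above is only 103 characters long. The helper name `chunk_with_overlap` is hypothetical.

```python
def chunk_with_overlap(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows and record where each window starts."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(
            {"start_index": start, "page_content": text[start:start + chunk_size]}
        )
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in chunk_with_overlap(sample, chunk_size=10, chunk_overlap=4):
    print(chunk["start_index"], chunk["page_content"])
```

Each window repeats the last 4 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.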


## Embed and store
- using Chroma

In [22]:
from langchain_chroma import Chroma

# vector_store = Chroma(
#     collection_name="lili_collection",
#     embedding_function=embeddings,
#     persist_directory="./chroma_lili_collection_db",
# )

vector_store = Chroma.from_documents(
    documents=all_splits,
    embedding=OpenAIEmbeddings()
)

In [23]:
type(vector_store)

langchain_chroma.vectorstores.Chroma

## Vector search - Chroma


In [24]:
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={"k": 6})


In [25]:
type(retriever)

langchain_core.vectorstores.base.VectorStoreRetriever

In [26]:
retrieved_docs = retriever.invoke("What steps does prompt engineering include?")


In [27]:
print( len(retrieved_docs))

6


In [29]:
print( retrieved_docs[0].page_content)


Prompt Engineering, also known as In-Context Prompting, refers to methods for how to communicate with LLM to steer its behavior for desired outcomes without updating the model weights. It is an empirical science and the effect of prompt engineering methods can vary a lot among models, thus requiring heavy experimentation and heuristics.
This post only focuses on prompt engineering for autoregressive language models, so nothing with Cloze tests, image generation or multimodality models. At its core, the goal of prompt engineering is about alignment and model steerability. Check my previous post on controllable text generation.
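
The idea behind `search_type="similarity"` with `k` results can be sketched with toy vectors: score every stored vector against the query and keep the best `k`. This is only an illustration of the ranking step. Chroma uses its own index and a configurable distance metric, and real embeddings have far more dimensions. The names `cosine_similarity`, `top_k`, and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k):
    """Return the ids of the k vectors most similar to the query, best first."""
    scored = [
        (cosine_similarity(query_vec, vec), doc_id)
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional "embeddings" (made up for illustration)
toy_vectors = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.9, 0.1, 0.0],
    "chunk_c": [0.0, 1.0, 0.0],
    "chunk_d": [0.0, 0.0, 1.0],
}
query = [1.0, 0.2, 0.0]
print(top_k(query, toy_vectors, k=2))
```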


## Generate responses with GPT-4o-mini

In [30]:
# define LLM 
llm = ChatOpenAI(model="gpt-4o-mini")


In [33]:
## Make sure the LANGCHAIN_API_KEY environment variable is set to your LangSmith API key
prompt=hub.pull("rlm/rag-prompt")

In [35]:
print(prompt.messages)

[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))]


In [36]:
example_messages=prompt.invoke(
    {"context":"filler context", "question":"filler question"}
)
print(example_messages)

messages=[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: filler question \nContext: filler context \nAnswer:")]


In [37]:
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)
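
A quick way to check `format_docs` without calling the retriever is to feed it stand-in objects. `FakeDoc` below is a hypothetical class that only mimics the `page_content` attribute of a LangChain document.

```python
from dataclasses import dataclass


@dataclass
class FakeDoc:
    """Hypothetical stand-in exposing only the attribute format_docs reads."""
    page_content: str


def format_docs(docs):
    # Join chunk texts with a blank line so the model can tell them apart
    return "\n\n".join(doc.page_content for doc in docs)


context = format_docs([FakeDoc("first chunk"), FakeDoc("second chunk")])
print(context)
```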

In [39]:
## Define the RAG Chain via pipes
rag_chain=(
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm 
    | StrOutputParser()
)
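
What the piped chain does can be approximated with plain functions: build the `context` and `question` inputs, fill the prompt template, call the model, and return a string. The sketch below uses the `rlm/rag-prompt` template text printed above. `fake_retriever` and `fake_llm` are hypothetical stand-ins, so no API key or network access is needed and no real model is called.

```python
RAG_TEMPLATE = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, just say that you don't know. "
    "Use three sentences maximum and keep the answer concise."
    "\nQuestion: {question} \nContext: {context} \nAnswer:"
)


def fake_retriever(question):
    # Hypothetical stand-in: always returns the same two chunks
    return ["Chunk one about prompting.", "Chunk two about steering."]


def build_inputs(question):
    # Mirrors the dict step: the question goes to the retriever
    # and is also passed through unchanged
    chunks = fake_retriever(question)
    return {"context": "\n\n".join(chunks), "question": question}


def fill_prompt(inputs):
    return RAG_TEMPLATE.format(**inputs)


def fake_llm(prompt_text):
    # Hypothetical stand-in: reports the prompt size instead of calling a model
    return f"(model reply to a {len(prompt_text)}-character prompt)"


def run_chain(question):
    return fake_llm(fill_prompt(build_inputs(question)))


filled = fill_prompt(build_inputs("What is Prompt Engineering?"))
print(filled)
print(run_chain("What is Prompt Engineering?"))
```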


In [41]:
## Invoke rag_chain and stream the answer

for chunk in rag_chain.stream("What is Prompt Engineering?"):
    print(chunk, end="", flush=True)
    

Prompt Engineering, also known as In-Context Prompting, involves methods to communicate with language models (LLMs) to achieve desired outcomes without altering the model's weights. It focuses on aligning and steering the model's behavior through carefully designed prompts, requiring extensive experimentation to find effective strategies. The goal is to enhance model responsiveness and alignment with user intentions.

In [46]:
## Try a different prompt
prompt = hub.pull("empler-ai/rag-prompt")


In [47]:
print(prompt.messages)


[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know.\nQuestion: {question} \nContext: {context} \nAnswer:"))]


In [48]:
rag_chain=(
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm 
    | StrOutputParser()
)

for chunk in rag_chain.stream("What is Prompt Engineering?"):
    print(chunk, end="", flush=True)
    

Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with large language models (LLMs) to steer their behavior towards desired outcomes without updating the model weights. It is an empirical science that often requires extensive experimentation and heuristics since the effectiveness of prompt engineering methods can vary significantly among different models. The primary goal of prompt engineering is to achieve alignment and model steerability, particularly for autoregressive language models.

## Result comparison

### Prompt 1 

- rlm/rag-prompt

response:
```
Prompt Engineering, also known as In-Context Prompting, involves methods to communicate with language models (LLMs) to achieve desired outcomes without altering the model's weights. It focuses on aligning and steering the model's behavior through carefully designed prompts, requiring extensive experimentation to find effective strategies. The goal is to enhance model responsiveness and alignment with user intentions.
```

### Prompt 2

- empler-ai/rag-prompt

response:
```
Prompt Engineering, also known as In-Context Prompting, refers to methods for communicating with large language models (LLMs) to steer their behavior towards desired outcomes without updating the model weights. It is an empirical science that often requires extensive experimentation and heuristics since the effectiveness of prompt engineering methods can vary significantly among different models. The primary goal of prompt engineering is to achieve alignment and model steerability, particularly for autoregressive language models.
```
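
The two templates printed earlier differ only in the sentence "Use three sentences maximum and keep the answer concise.", so one rough way to compare the responses is to count sentences and words. The sketch below is a naive heuristic (it splits on `.`, `!`, `?` followed by whitespace, so abbreviations can throw it off). `response_stats` and the sample strings are hypothetical; paste the real responses in their place.

```python
import re


def response_stats(text):
    """Count sentences (naively) and whitespace-separated words."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return {"sentences": len(sentences), "words": len(text.split())}


# Made-up sample strings for illustration only
samples = {
    "rlm/rag-prompt": "Prompting steers a model. It needs no weight updates.",
    "empler-ai/rag-prompt": (
        "Prompting steers a model toward a desired outcome. "
        "It needs no weight updates. Results vary a lot between models."
    ),
}
for name, text in samples.items():
    print(name, response_stats(text))
```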