tocdepth:	1

RAG Q&A

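Retrieval-Augmented Generation (RAG) pairs a language model with an external knowledge source: documents relevant to a query are retrieved and passed to the model as context. This use case integrates RAG with FlexFlow Serve to answer questions about documents.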

Requirements

FlexFlow Serve setup.
Retriever setup for RAG.

Implementation

1. FlexFlow Initialization: Initialize and configure FlexFlow Serve.
2. Data Retrieval Setup: Set up a retriever for sourcing information relevant to user queries.
3. RAG Integration: Integrate the retriever with FlexFlow Serve.
4. Response Generation: Use the LLM with RAG to generate responses based on the model's knowledge and the retrieved information.
5. Shutdown: The FlexFlow server automatically shuts down after generating the response.
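
As a rough illustration of the retrieval and response-generation steps above, the sketch below uses a toy word-overlap retriever and a caller-supplied `generate` function in place of a real embedding model and FlexFlow Serve. All names here (`embed`, `cosine`, `retrieve`, `build_prompt`, `answer`) are hypothetical and are not part of FlexFlow or LangChain.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words 'embedding': lowercase token counts (assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, retrieved):
    """Place the retrieved text ahead of the question so the model can use it."""
    context = "\n".join(f"- {d}" for d in retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def answer(question, documents, generate, k=2):
    """Retrieve, build the prompt, then call the supplied generate(prompt) function."""
    return generate(build_prompt(question, retrieve(question, documents, k)))
```

In a real deployment, `generate` would be backed by the FlexFlow Serve LLM and `retrieve` by a vector store; the control flow stays the same.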

Example

A complete code example for a web-document Q&A using FlexFlow can be found here:

RAG Q&A example with incremental decoding
RAG Q&A example with speculative inference

Example Implementation:

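# Imports (these LangChain module paths assume a release that still exposes them)
import flexflow.serve as ff
from langchain.chains import LLMChain
from langchain.document_loaders import WebBaseLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# FlexFlowLLM and FF_LLM_wrapper are helper classes assumed to be defined
# as in the linked example scripts; they are not imported from a library.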

# compile and start server
ff_llm = FlexFlowLLM(...)
gen_config = ff.GenerationConfig(...)
ff_llm.compile_and_start(...)
ff_llm_wrapper = FF_LLM_wrapper(flexflow_llm=ff_llm)


# Load web page content
loader = WebBaseLoader("https://example.com/data")
data = loader.load()

# Split text
text_splitter = RecursiveCharacterTextSplitter(...)
all_splits = text_splitter.split_documents(data)

# Initialize embeddings
embeddings = OpenAIEmbeddings(...)

# Create VectorStore
vectorstore = Chroma.from_documents(all_splits, embeddings)

# Use VectorStore as a retriever
retriever = vectorstore.as_retriever()

# Apply similarity search
question = "Example Question"
docs = vectorstore.similarity_search(question)
max_chars_per_doc = 100
docs_text = ''.join(doc.page_content[:max_chars_per_doc] for doc in docs)

# Using a Prompt Template
prompt_rag = PromptTemplate.from_template(
   "Summarize the main themes in these retrieved docs: {docs_text}"
)

# Build Chain
llm_chain_rag = LLMChain(llm=ff_llm_wrapper, prompt=prompt_rag)

# Run
rag_result = llm_chain_rag(docs_text)

# Stop the server
ff_llm.stop_server()
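
The example above truncates each retrieved document and joins the pieces with no separator, then asks for a summary. For question answering, one possible variation is to keep the snippets separated and include the user's question in the prompt. The sketch below shows that idea in plain Python; `Doc` is a hypothetical stand-in for a retrieved document with a `page_content` attribute, and `build_context`, `build_qa_prompt`, and `QA_TEMPLATE` are illustrative names, not part of FlexFlow or LangChain.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str


# Illustrative template; the wording is an assumption, not taken from the example scripts.
QA_TEMPLATE = (
    "Use the following context to answer the question.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_context(docs, max_chars_per_doc=100, separator="\n\n"):
    """Truncate each document, drop empty snippets, and join with a visible separator."""
    snippets = [d.page_content[:max_chars_per_doc].strip() for d in docs]
    return separator.join(s for s in snippets if s)


def build_qa_prompt(question, docs, max_chars_per_doc=100):
    """Fill the template with the truncated context and the user's question."""
    context = build_context(docs, max_chars_per_doc)
    return QA_TEMPLATE.format(context=context, question=question)
```

The resulting string could be passed to the chain in place of `docs_text`, with the prompt template adjusted to accept it as a single input.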
