# Response Synthesis and Prompting

In [1]:
%load_ext autoreload
%autoreload 2

from dotenv import load_dotenv

load_dotenv()
import nest_asyncio
nest_asyncio.apply()
import asyncio

import wandb
import weave
import pathlib
import pandas as pd
import json

In [2]:
WANDB_ENTITY = "rag-course"
WANDB_PROJECT = "dev"

wandb.require("core")

run = wandb.init(
    entity=WANDB_ENTITY,
    project=WANDB_PROJECT,
    group="Chapter 6",
)

weave_client = weave.init(f"{WANDB_ENTITY}/{WANDB_PROJECT}")

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
[34m[1mwandb[0m: Using wandb-core as the SDK backend. Please refer to https://wandb.me/wandb-core for more information.
[34m[1mwandb[0m: Currently logged in as: [33mparambharat[0m ([33mrag-course[0m). Use [1m`wandb login --relogin`[0m to force relogin


Logged in as Weights & Biases user: parambharat.
View Weave data at https://wandb.ai/rag-course/dev/weave


In [3]:
# Reload the data from Chapter 3
chunked_artifact = run.use_artifact(
    f"{WANDB_ENTITY}/{WANDB_PROJECT}/chunked_data:latest", type="dataset"
)
artifact_dir = chunked_artifact.download()
chunked_data_file = pathlib.Path(f"{artifact_dir}/documents.jsonl")
chunked_data = list(map(json.loads, chunked_data_file.read_text().splitlines()))
chunked_data[:2]



[{'cleaned_content': 'Anonymous Mode Are you publishing code that you want anyone to be able to run easily? Use Anonymous Mode to let someone run your code, see a W&B dashboard, and visualize results without needing to create a W&B account first. Allow results to be logged in Anonymous Mode with wandb.init(anonymous="allow") :::info Publishing a paper? Please cite W&B, and if you have questions about how to make your code accessible while using W&B, reach out to us at support@wandb.com.\n::: How does someone without an account see results? If someone runs your script and you have to set anonymous="allow":  Auto-create temporary account: W&B checks for an account that\'s already signed in. If there\'s no account, we automatically create a new anonymous account and save that API key for the session. Log results quickly: The user can run and re-run the script, and automatically see results show up in the W&B dashboard UI.\nThese unclaimed anonymous runs will be available for 7 days. Claim

In [4]:
from scripts.retriever import HybridRetrieverReranker
# Using the query enhancer, response generator, and RAG pipeline from the previous chapter

import cohere
from scripts.query_enhancer import QueryEnhancer
from scripts.response_generator import QueryEnhanedResponseGenerator
from scripts.rag_pipeline import QueryEnhancedRAGPipeline

query_enhancer = QueryEnhancer()





In [15]:
# Let's improve the prompt with more precise instructions

IMPROVED_PROMPT_V1 = pathlib.Path("prompts/improved_prompt_v1.txt").read_text()

print(IMPROVED_PROMPT_V1)

You are an AI assistant specializing in answering questions about Weights & Biases (W&B). Your task is to provide accurate, concise, and helpful responses based on retrieved documentation snippets. Follow these instructions carefully:

First, review the retrieved documentation snippets related to W&B
Then, consider the user's query
You should respond to the user in the following language:
{language}
We have identified the following intents based on the user's query:
{intents}

To formulate your response:
1. Carefully read and understand the content of each retrieved snippet.
2. Identify the most relevant information to answer the user's query.
3. Pay special attention to code snippets, function names, class names, and method names.
4. Provide a concise answer that addresses the user's query and the identified intents.
5. Use information from the retrieved snippets to support your response.
6. Explain code snippets, functions, classes, and methods when they are relevant to the query.
7.
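
The prompt above carries two placeholders, `{language}` and `{intents}`. How `QueryEnhanedResponseGenerator` fills them is not shown in this notebook. The following is a minimal sketch, assuming plain `str.format` substitution. The shortened `TEMPLATE`, the `render_prompt` helper, and the `intent`/`reason` keys are all hypothetical stand-ins, not the course's implementation.

```python
# Hypothetical, shortened stand-in for prompts/improved_prompt_v1.txt
TEMPLATE = (
    "You should respond to the user in the following language:\n"
    "{language}\n"
    "We have identified the following intents based on the user's query:\n"
    "{intents}\n"
)


def render_prompt(template: str, language: str, intents: list[dict]) -> str:
    """Fill the {language} and {intents} placeholders of a prompt template."""
    # One intent per line; the "intent"/"reason" keys are an assumed shape.
    intent_lines = "\n".join(
        f"- {item['intent']}: {item['reason']}" for item in intents
    )
    return template.format(language=language, intents=intent_lines)


rendered = render_prompt(
    TEMPLATE,
    language="en",
    intents=[{"intent": "product_usage", "reason": "asks how to log a metric"}],
)
print(rendered)
```

One caveat with this approach: if a prompt file contains literal braces (for example a JSON sample), they would need to be doubled (`{{` and `}}`) for `str.format` to leave them alone.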

In [None]:
response_generator = QueryEnhanedResponseGenerator(
    model="command-r", prompt=IMPROVED_PROMPT_V1, client=cohere.AsyncClient()
)



hybrid_retriever = HybridRetrieverReranker()

hybrid_retriever.index_data(chunked_data)

rag_pipeline = QueryEnhancedRAGPipeline(
    query_enhancer=query_enhancer,
    retriever=hybrid_retriever,
    response_generator=response_generator,
)

In [7]:
eval_dataset = weave.ref(
    "weave:///rag-course/dev/object/Dataset:9O0EmmPINmYjgbXW3kucVrDxlTUQJQs0fVZYJj2mtOk"
).get()

In [10]:
eval_dataset.rows[3]



In [None]:
from scripts.response_metrics import ALL_METRICS as RESPONSE_METRICS

response_evaluations = weave.Evaluation(
    name="Response_Evaluation",
    dataset=eval_dataset.rows[:10],
    scorers=RESPONSE_METRICS,
    preprocess_model_input=lambda x: {"query": x["question"]},
)
query_enhanced_response_scores = asyncio.run(
    response_evaluations.evaluate(rag_pipeline)
)
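
The `preprocess_model_input` lambda above renames the dataset's `question` field to the `query` argument before each row reaches the pipeline. A sketch of that mapping in isolation, written as a named function so it can be checked without running an evaluation. The example row is hypothetical; only the `question` key is confirmed by the cell above, and the `answer` key is an assumption.

```python
def to_model_input(row: dict) -> dict:
    """Keep only the question and expose it under the `query` key."""
    return {"query": row["question"]}


# Hypothetical row; real rows come from eval_dataset.rows
example_row = {"question": "How do I log a metric?", "answer": "..."}
model_input = to_model_input(example_row)
print(model_input)
```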

In [16]:
# We can improve the prompt with an example of the response format

IMPROVED_PROMPT_V2 = pathlib.Path("prompts/improved_prompt_v2.txt").read_text()
print(IMPROVED_PROMPT_V2)


You are an AI assistant specializing in answering questions about Weights & Biases (W&B). Your task is to provide accurate, concise, and helpful responses based on the retrieved documentation snippets. Follow these instructions carefully:

1. You will receive retrieved documentation snippets related to W&B. These snippets contain relevant information for answering the user's query.
2. You will also be given a user query.
3. You should respond to the user in the following language:
{language}
4. We have identified the following intents based on the user's query:
{intents}

5. Analyze the retrieved snippets:
   - Carefully read and understand the content of each snippet.
   - Identify the most relevant information to answer the user's query.
   - Pay special attention to code snippets, function names, class names, and method names.

6. Formulate your response:
   - Provide a concise answer that addresses the user's query.
   - Use information from the retrieved snippets to support your r

In [11]:

response_generator = QueryEnhanedResponseGenerator(
    model="command-r", prompt=IMPROVED_PROMPT_V2, client=cohere.AsyncClient()
)

rag_pipeline = QueryEnhancedRAGPipeline(
    query_enhancer=query_enhancer,
    retriever=hybrid_retriever,
    response_generator=response_generator,
)
query_enhanced_response_scores = asyncio.run(
    response_evaluations.evaluate(rag_pipeline)
)

In [17]:
# We can further improve the prompt by asking for chain-of-thought reasoning


IMPROVED_PROMPT_V3 = pathlib.Path("prompts/improved_prompt_v3.txt").read_text()

print(IMPROVED_PROMPT_V3)


You are an AI assistant specializing in Weights & Biases (W&B). Your task is to provide accurate, detailed, and helpful responses using retrieved documentation snippets. Follow these instructions:

1. You will receive documentation snippets and a user query.
2. Respond in the specified language: {language}
3. Identified intents: {intents}

### Process:
1. **Break Down the Query:** Divide the user's query into smaller steps and explain this breakdown.
2. **Analyze Snippets:**
   - Read each snippet.
   - Identify relevant information and explain its importance.
   - For code/functions/classes/methods:
     - Explain their purpose and functionality.
     - Describe their relevance to the query.
     - Provide a step-by-step breakdown if applicable.
3. **Formulate Response:**
   - Address each query step with detailed explanations.
   - Use snippets to support your response.
   - Break down code explanations into logical steps.
   - Use exact names from snippets for functions/classes/meth

In [None]:
response_generator = QueryEnhanedResponseGenerator(
    model="command-r", prompt=IMPROVED_PROMPT_V3, client=cohere.AsyncClient()
)

rag_pipeline = QueryEnhancedRAGPipeline(
    query_enhancer=query_enhancer,
    retriever=hybrid_retriever,
    response_generator=response_generator,
)

query_enhanced_response_scores = asyncio.run(
    response_evaluations.evaluate(rag_pipeline)
)

In [None]:
# We can also use a stronger model (command-r-plus) with the same prompt to generate the response

In [None]:
response_generator = QueryEnhanedResponseGenerator(
    model="command-r-plus", prompt=IMPROVED_PROMPT_V3, client=cohere.AsyncClient()
)

rag_pipeline = QueryEnhancedRAGPipeline(
    query_enhancer=query_enhancer,
    retriever=hybrid_retriever,
    response_generator=response_generator,
)

query_enhanced_response_scores = asyncio.run(
    response_evaluations.evaluate(rag_pipeline)
)

In [None]:
# Compare all the evaluations and see which one performs best
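
Each run above assigns its result to the same `query_enhanced_response_scores` variable, so earlier results are overwritten unless they are stored under separate names. A minimal sketch of one way to compare runs side by side, assuming each evaluation returns a nested dict of scorer summaries. The run labels, scorer names, and numbers below are placeholders that only illustrate the shape; they are not results from this notebook.

```python
def flatten(summary: dict, prefix: str = "") -> dict:
    """Flatten a nested summary dict into dotted keys, e.g. 'scorer_a.mean'."""
    flat = {}
    for key, value in summary.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


def best_run(summaries: dict, metric: str) -> str:
    """Return the run label with the highest value for a flattened metric."""
    flat = {run: flatten(summary) for run, summary in summaries.items()}
    return max(flat, key=lambda run: flat[run][metric])


# Placeholder values only: scorer names and numbers are made up
summaries = {
    "prompt_v1": {"scorer_a": {"mean": 0.1}},
    "prompt_v2": {"scorer_a": {"mean": 0.3}},
    "prompt_v3": {"scorer_a": {"mean": 0.2}},
}

for run, summary in summaries.items():
    print(run, flatten(summary))
print("best:", best_run(summaries, "scorer_a.mean"))
```

In the notebook, the same idea would apply by saving each `asyncio.run(...)` result into a dict keyed by prompt and model variant instead of reusing one variable. The evaluations are also logged to Weave, where they can be compared in the UI.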