<p style="font-size:small; color:gray;"> Author: 鄭永誠, Year: 2024 </p>

# DSPy Framework Usage Examples
----------
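### Some common modules
- **dspy.Predict**: Processes the input and output fields, generates instructions, and creates a template for the specified `signature`.
- **dspy.ChainOfThought**: Inherits from the `Predict` module and adds chain-of-thought processing.
- **dspy.ChainOfThoughtWithHint**: Inherits from the `Predict` module and extends `ChainOfThought` with the option to supply a reasoning hint.
- **dspy.MultiChainComparison**: Inherits from the `Predict` module and adds multi-chain comparison.
- **dspy.Retrieve**: Retrieves passages from a retriever module.
- **dspy.ReAct**: Designed to interleave thought, action, and observation steps.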



In [1]:
%pip install -U dspy-ai -q

Defaulting to user installation because normal site-packages is not writeable
Collecting dspy-ai
  Downloading dspy_ai-2.4.13-py3-none-any.whl.metadata (39 kB)

In [2]:
import dspy

# Language Model (LM)

In [61]:
import os

# Read the Groq API key from the environment
api_key = os.getenv("GROQ_API_KEY")

# Create the Groq-backed language model client
groq_lm = dspy.GROQ(model="llama-3.1-70b-versatile", api_key=api_key, temperature=0.1)

# It is very important to configure DSPy with the LM
dspy.settings.configure(lm=groq_lm, max_tokens=4096)

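## DSPy examples
- Example 1. `Predict()` makes a prediction according to the given signature; here it classifies the sentiment of a given sentence.

- Example 2. `ChainOfThought()` (chain of thought); here it summarizes a given paragraph or sentence.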
https://dspy-docs.vercel.app/api/modules/ChainOfThought

In [62]:
# Example 1. Sentiment classification
sentence = "我真的很想知道LLM是什麼?"
classify = dspy.Predict('sentence -> sentiment')
classify(sentence=sentence)

Prediction(
    sentiment='Sentence: 我真的很想知道LLM是什麼?\nSentiment: Neutral/Curious'
)

In [41]:
# dspy.Predict("sentence -> summary")
# Note: DSPy has a length limit
sentence = """
你說的對，但是《原神》是由米哈游自主研發的一款全新開放世界冒險遊戲，遊戲發生在一個被稱作「提瓦特」的幻想世界
在這裡，被神選中的人將被授予「神之眼」，導引元素之力。
"""
classify = dspy.Predict("sentence -> summary")
classify(sentence=sentence)

Prediction(
    summary='Sentence: 你說的對，但是《原神》是由米哈游自主研發的一款全新開放世界冒險遊戲，遊戲發生在一個被稱作「提瓦特」的幻想世界 在這裡，被神選中的人將被授予「神之眼」，導引元素之力。\nSummary: 《原神》是一款開放世界冒險遊戲，發生在「提瓦特」幻想世界，被神選中的人將被授予「神之眼」。'
)

In [68]:
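# An alternative, class-based way to write a signature. It shows the format Predict expects,
# and lets you attach extra information to each field, which is close to the CoT idea below.
# This form is equivalent to dspy.Predict("context, question -> answer")
class GenerateAnswer(dspy.Signature):
    """Classify emotion among sadness, joy, love, anger, fear, surprise."""
    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

sentence = "回答非常完整，我想進一步了解LLM能用在甚麼地方?"  # from dair-ai/emotion

classify = dspy.Predict(GenerateAnswer)
# Note: `sentence` is not a field of GenerateAnswer, so `context` and `question` stay unfilled here
classify(sentence=sentence)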



Prediction(
    context='Please go ahead and provide the context.',
    question='',
    answer='Context: A person just received news that their favorite childhood pet had passed away after a long illness.\n\nQuestion: How would they likely feel?'
)
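
The string form `"context, question -> answer"` and the class form name the same fields. As a rough illustration of how such a string breaks down into input and output field names, here is a minimal sketch in plain Python. `split_signature` and `missing_inputs` are hypothetical helpers written for this note, not DSPy's parser, and they ignore type annotations and field descriptions.

```python
def split_signature(signature: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c']). Illustrative only."""
    if signature.count("->") != 1:
        raise ValueError("expected exactly one '->' in the signature")
    left, right = signature.split("->")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError("a signature needs at least one input and one output")
    return inputs, outputs


def missing_inputs(signature: str, **kwargs) -> list[str]:
    """Return the declared input fields that the caller did not supply."""
    inputs, _ = split_signature(signature)
    return [name for name in inputs if name not in kwargs]


print(split_signature("context, question -> answer"))
# prints: (['context', 'question'], ['answer'])
print(missing_inputs("context, question -> answer", sentence="..."))
# prints: ['context', 'question']
```

Calling a `context, question -> answer` signature with only `sentence=...` leaves both declared inputs unfilled, which is consistent with the empty `question` in the output above.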

In [44]:
# Example 2. Summarize a given sentence with CoT (ChainOfThought)
import textwrap

document = """
你說的對，但是《原神》是由米哈游自主研發的一款全新開放世界冒險遊戲，遊戲發生在一個被稱作「提瓦特」的幻想世界
在這裡，被神選中的人將被授予「神之眼」，導引元素之力。
"""
summarize = dspy.ChainOfThought('document -> summary')
response = summarize(document=document)
print(response)
#
# print(textwrap.fill(response.summary, width=50))

Prediction(
    rationale='produce the summary. We can start by identifying the main topic of the document, which is the game "原神" (Genshin Impact). Then, we can extract the key information about the game, such as its developer, genre, and setting. Finally, we can condense this information into a brief summary.',
    summary='《原神》是一款由米哈游自主研發的全新開放世界冒險遊戲，設定在一個被稱作「提瓦特」的幻想世界。'
)


# Modules

In [46]:
# 1) Declare with a signature.
classify = dspy.Predict('sentence -> sentiment')

# 2) Call with input argument(s). 
sentence = "你要不要吃哈密瓜"
response = classify(sentence=sentence)

# 3) Access the output.
print(response.sentiment)

Sentence: 你要不要吃哈密瓜
Sentiment: Neutral
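
In the output above, the model echoed the whole template, so `response.sentiment` holds `Sentence: ...\nSentiment: Neutral` rather than just the label. One possible post-processing step, sketched below in plain Python, keeps only the text after the last `Sentiment:` prefix. `extract_field` is a hypothetical helper, not part of DSPy, and it assumes the echoed prefix is the capitalized field name followed by a colon.

```python
def extract_field(raw: str, field_name: str) -> str:
    """Return the text after the last '<Field name>:' prefix, or raw if absent."""
    prefix = f"{field_name.capitalize()}:"
    # rpartition splits on the LAST occurrence; sep is empty when the prefix is missing.
    _, sep, tail = raw.rpartition(prefix)
    return tail.strip() if sep else raw.strip()


raw_output = "Sentence: 你要不要吃哈密瓜\nSentiment: Neutral"
print(extract_field(raw_output, "sentiment"))  # prints: Neutral
```

If the model returns only the label, the helper leaves it unchanged, so it should be safe to apply to every response.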


In [50]:
question = "你要不要吃哈密瓜"

# 1) Declare with a signature, and pass some config.
classify = dspy.ChainOfThought('question -> answer', n=1)

# 2) Call with input argument.
response = classify(question=question)

# 3) Access the outputs.
print(response)

Prediction(
    rationale="produce the answer. We need to consider the context and the speaker's intention. The question seems to be asking if the listener wants to eat a watermelon (哈密瓜). To answer this question, we need to consider the listener's preferences, dietary restrictions, and the situation.",
    answer='我要'
)


### Build a simple set of question-answer examples
* Three values: the inputs, the intermediate labels, and the final label.
* 10 to 500 samples; more is better.

In [51]:
train_set = [
    dspy.Example(
        question="原神!!!",
        answer="啟動!!!!"
    ),
    dspy.Example(
        question="那豈不成了跪著要飯的？",
        answer="那你要這麼說，買官當縣長還真就是跪著要飯的，就這，多少想跪還沒這門子呢"
    ),
    dspy.Example(
        question="那你是想站著，還是想掙錢啊？",
        answer="我是想站著，還把錢掙了！"
    ),        
    dspy.Example(
        question="這個能不能掙錢？",
        answer="掙不成。"
    ),                    
    dspy.Example(
        question="（從袖口中甩出一把槍來，拍案，捲袖）：這個能不能掙錢？",
        answer="能掙，山裡。"
    ),   
    dspy.Example(
        question="敢問九筒大哥何方神聖？",
        answer="能掙，山裡。"
    ),      
    dspy.Example(
        question="這個能不能掙錢？",
        answer="掙不成。"
    ),                                    
]

# Metrics

In [52]:
def validate_answer(example, pred, trace=None):
    # Case-insensitive exact match between the gold answer and the predicted answer
    return example.answer.lower() == pred.answer.lower()
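
Exact string equality is strict: an answer that differs only in surrounding whitespace or trailing punctuation counts as wrong. A minimal sketch of a more forgiving variant follows. `normalize`, `lenient_match`, and `accuracy` are hypothetical names, `SimpleNamespace` objects stand in for `dspy.Example` and `dspy.Prediction`, and the punctuation set is an assumption.

```python
from types import SimpleNamespace

# Characters to ignore at the ends of an answer (ASCII and full-width); an assumption.
EDGE_PUNCTUATION = " \t\n!?.,。！？，"


def normalize(text: str) -> str:
    """Lowercase and drop whitespace/punctuation from both ends."""
    return text.lower().strip(EDGE_PUNCTUATION)


def lenient_match(example, pred, trace=None) -> bool:
    """Metric with an (example, pred, trace) shape; compares normalized answers."""
    return normalize(example.answer) == normalize(pred.answer)


def accuracy(examples, predictions, metric) -> float:
    """Fraction of (example, prediction) pairs the metric accepts."""
    hits = sum(metric(ex, pr) for ex, pr in zip(examples, predictions))
    return hits / len(examples)


gold = [SimpleNamespace(answer="啟動!!!!"), SimpleNamespace(answer="掙不成。")]
pred = [SimpleNamespace(answer="啟動"), SimpleNamespace(answer="能掙")]
print(accuracy(gold, pred, lenient_match))  # prints: 0.5
```

Whether this leniency is desirable depends on the task; for short factoid answers it usually avoids penalizing purely cosmetic differences.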

# Optimizers

In [53]:
class qa_module(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
    
    def forward(self, question):
        prediction = self.generate_answer(question=question)
        return dspy.Prediction(answer=prediction.answer)

# A multi-hop QA Example (from DSPy docs)
We will go through the following steps:
1. Load the Language Model and Retrieval Model
2. Load the "question-answer pairs" `HotPotQA` dataset to compile (train) the DSPy program
3. Build the signatures
4. Define the pipeline (as a module)
5. Define the evaluation metric
6. Compile (train) the pipeline with an optimizer (let's use `BootstrapFewShot`)
7. Evaluate and compare the compiled and uncompiled pipelines
## Load the LM and RM models

In [54]:
import dspy
phi_ollama = dspy.OllamaLocal(model='phi')

In [55]:
colbertv2_wiki17_abstracts = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.settings.configure(lm=phi_ollama, rm=colbertv2_wiki17_abstracts, max_tokens=1024)

## Load the Dataset 

In [56]:
from dspy.datasets import HotPotQA

# Load the dataset.
dataset = HotPotQA(train_seed=1, 
                   train_size=20, 
                   eval_seed=2023, 
                   dev_size=10, 
                   test_size=0)

# Tell DSPy that the 'question' field is the input. Any other fields are labels and/or metadata.
trainset = [x.with_inputs('question') for x in dataset.train]
devset = [x.with_inputs('question') for x in dataset.dev]

len(trainset), len(devset)

(20, 10)

## Build Signature

In [22]:
class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""
    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

In [23]:
class GenerateSearchQuery(dspy.Signature):
    """Write a simple search query that will help answer a complex question."""
    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    query = dspy.OutputField()

## Define the pipeline

In [27]:
from dsp.utils import deduplicate

class SimplifiedBaleen(dspy.Module):
    def __init__(self, passages_per_hop=3, max_hops=2):
        super().__init__()
        self.generate_query = [dspy.ChainOfThought(GenerateSearchQuery) for _ in range(max_hops)]
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)
        self.max_hops = max_hops
    
    def forward(self, question):
        context = []
        for hop in range(self.max_hops):
            query = self.generate_query[hop](context=context, question=question).query
            passages = self.retrieve(query).passages
            context = deduplicate(context + passages)

        pred = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=pred.answer)
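
To make the control flow of `forward` easier to follow without an LM or a retriever, the sketch below replays the same loop in plain Python. `fake_generate_query`, `fake_retrieve`, and the tiny corpus are stand-ins invented for this example, and `dedupe_keep_order` is assumed to behave like `deduplicate` (drop repeats, keep first-seen order).

```python
def dedupe_keep_order(items):
    """Drop repeated items while keeping first-seen order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Hypothetical corpus: search query -> retrieved passages.
CORPUS = {
    "hop-0": ["passage A", "passage B"],
    "hop-1": ["passage B", "passage C"],
}


def fake_generate_query(context, question, hop):
    """Stand-in for the LM call; a real query would depend on context and question."""
    return f"hop-{hop}"


def fake_retrieve(query, k):
    """Stand-in for the retriever: return at most k passages for the query."""
    return CORPUS.get(query, [])[:k]


def multi_hop_context(question, max_hops=2, passages_per_hop=3):
    context = []
    for hop in range(max_hops):
        query = fake_generate_query(context, question, hop)
        passages = fake_retrieve(query, passages_per_hop)
        # Accumulate passages across hops; overlapping hits appear only once.
        context = dedupe_keep_order(context + passages)
    return context


print(multi_hop_context("any question"))
# prints: ['passage A', 'passage B', 'passage C']
```

The second hop returns one passage already seen in the first, and the accumulated context keeps a single copy of it.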

## Define the evaluation metric 

In [6]:
def validate_context_and_answer_and_hops(example, pred, trace=None):
    # The predicted answer matches the gold answer.
    if not dspy.evaluate.answer_exact_match(example, pred): 
        return False
    # The retrieved context contains the gold answer.
    if not dspy.evaluate.answer_passage_match(example, pred): 
        return False

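    # Collect the question plus every generated query; `trace` may be None (e.g. during plain evaluation).
    hops = [example.question] + [outputs.query for *_, outputs in (trace or []) if 'query' in outputs]
    # None of the generated queries exceeds 100 characters in length.
    if max([len(h) for h in hops]) > 100: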
        return False
        
    # None of the generated queries is roughly repeated 
    # (i.e., none is within 0.8 or higher F1 score of earlier queries).
    if any(dspy.evaluate.answer_exact_match_str(hops[idx], hops[:idx], frac=0.8) for idx in range(2, len(hops))): 
        return False
        
    return True
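
The last check rejects a hop whose query nearly repeats an earlier one, using an F1 threshold of 0.8. As an illustration of what a token-overlap F1 measures, here is a minimal sketch. `token_f1` and `is_near_duplicate` are hypothetical helpers that split on whitespace; DSPy's own normalization and scoring may differ in detail.

```python
from collections import Counter


def token_f1(candidate: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two strings."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection: each shared token counts at most min(count) times.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def is_near_duplicate(query: str, earlier_queries: list[str], frac: float = 0.8) -> bool:
    """True if the query reaches the F1 threshold against any earlier query."""
    return any(token_f1(query, earlier) >= frac for earlier in earlier_queries)


earlier = ["castle inherited by David Gregory"]
print(is_near_duplicate("castle inherited by David Gregory Scotland", earlier))  # prints: True
print(is_near_duplicate("how many storeys does the castle have", earlier))       # prints: False
```

In the first call five of six tokens overlap (F1 = 10/11), so the query counts as a repeat; in the second only `castle` is shared (F1 = 1/6).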

## Compile (train) the pipeline

In [None]:
%%time

from dspy.teleprompt import BootstrapFewShot

teleprompter = BootstrapFewShot(metric=validate_context_and_answer_and_hops)

compiled_baleen = teleprompter.compile(SimplifiedBaleen(), 
                                       teacher=SimplifiedBaleen(passages_per_hop=2), 
                                       trainset=trainset)
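
Conceptually, `BootstrapFewShot` runs a teacher program over the training set and keeps the runs that pass the metric as few-shot demonstrations for the student. The sketch below mimics that selection step in plain Python. It is a simplification under stated assumptions, not the library's implementation: `bootstrap_demos`, `toy_teacher`, `exact_match`, and the `max_demos` value are hypothetical, and the real optimizer works on full execution traces rather than bare question-answer pairs.

```python
def exact_match(example, prediction, trace=None):
    """Toy metric: the predicted answer must equal the gold answer."""
    return example["answer"] == prediction["answer"]


def toy_teacher(question):
    """Stand-in for a teacher program; it only knows two answers."""
    known = {"1 + 1?": "2", "2 + 2?": "4"}
    return {"answer": known.get(question, "unknown")}


def bootstrap_demos(trainset, teacher, metric, max_demos=4):
    """Keep teacher outputs that pass the metric, up to max_demos."""
    demos = []
    for example in trainset:
        if len(demos) >= max_demos:
            break
        prediction = teacher(example["question"])
        if metric(example, prediction):
            demos.append({"question": example["question"], **prediction})
    return demos


toy_trainset = [
    {"question": "1 + 1?", "answer": "2"},
    {"question": "3 + 3?", "answer": "6"},  # the teacher gets this one wrong
    {"question": "2 + 2?", "answer": "4"},
]
print(bootstrap_demos(toy_trainset, toy_teacher, exact_match))
# prints: [{'question': '1 + 1?', 'answer': '2'}, {'question': '2 + 2?', 'answer': '4'}]
```

Only examples the teacher answers correctly survive, which is why a strict metric tends to yield fewer but cleaner demonstrations.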

In [32]:
# Ask any question you like to this simple RAG program.
my_question = "How many storeys are in the castle that David Gregory inherited?"

# Get the prediction from uncompiled Baleen
uncompiled_baleen = SimplifiedBaleen()
pred = uncompiled_baleen(my_question)

# Print the contexts and the answer.
print(f"Question: {my_question}")
print(f"Predicted Answer: {pred.answer}")
print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in pred.context]}")

Question: How many storeys are in the castle that David Gregory inherited?
Predicted Answer: It is unclear how many storeys the castle that David Gregory inherited had.
Retrieved Contexts (truncated): ['David Gregory (physician) | David Gregory (20 December 1625 – 1720) was a Scottish physician and inventor. His surname is sometimes spelt as Gregorie, the original Scottish spelling. He inherited Kinn...', 'Delnadamph Lodge | Delnadamph Lodge is located on the Balmoral Estate about eight miles north of the castle. The lodge and its estate lands were bought by Queen Elizabeth II for a figure believed to b...', 'Aydon | Aydon Castle is a fortified manor house and is a Scheduled Ancient Monument and a Grade I listed building. The manor house was built by Robert de Reymes, a wealthy Suffolk merchant, starting i...', 'Floors Castle | Floors Castle, in Roxburghshire, south-east Scotland, is the seat of the Duke of Roxburghe. Despite its name it is a country house rather than a fortress. It wa

In [39]:
def gold_passages_retrieved(example, pred, trace=None):
    gold_titles = set(map(dspy.evaluate.normalize_text, example['gold_titles']))
    found_titles = set(map(dspy.evaluate.normalize_text, [c.split(' | ')[0] for c in pred.context]))

    return gold_titles.issubset(found_titles)
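
Each retrieved passage has the form `Title | text`, so the metric compares the titles before the first ` | ` with the gold titles. Below is a small worked example in plain Python. `simple_normalize` is a hypothetical stand-in for `dspy.evaluate.normalize_text` (assumed here to only lowercase and trim), and the passages are abbreviated from the retrieved contexts shown earlier.

```python
def simple_normalize(text: str) -> str:
    """Hypothetical stand-in for a text normalizer: lowercase and trim."""
    return text.lower().strip()


def titles_of(passages):
    """Extract the normalized title that precedes ' | ' in each passage."""
    return {simple_normalize(p.split(" | ")[0]) for p in passages}


def all_gold_titles_found(gold_titles, passages) -> bool:
    """True only if every gold title appears among the retrieved titles."""
    gold = {simple_normalize(t) for t in gold_titles}
    return gold.issubset(titles_of(passages))


retrieved = [
    "Floors Castle | Floors Castle, in Roxburghshire ...",
    "Aydon | Aydon Castle is a fortified manor house ...",
]
print(all_gold_titles_found(["floors castle"], retrieved))           # prints: True
print(all_gold_titles_found(["Floors Castle", "Other"], retrieved))  # prints: False
```

Because the check is a subset test, a single missing gold title makes the whole example count as a miss.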

## Evaluate the compiled and uncompiled Baleen pipeline

In [None]:
from dspy.evaluate.evaluate import Evaluate
# Set up the `evaluate_on_hotpotqa` function.
evaluate_on_hotpotqa = Evaluate(devset=devset, 
                                num_threads=1, 
                                display_progress=True, 
                                display_table=5)

In [41]:
# Evaluate the uncompiled Baleen pipeline
uncompiled_baleen_retrieval_score = evaluate_on_hotpotqa(uncompiled_baleen, 
                                                         metric=gold_passages_retrieved, 
                                                         display=False)
print(f"## Retrieval Score for uncompiled Baleen: {uncompiled_baleen_retrieval_score}")

## Retrieval Score for uncompiled Baleen: 40.0




In [43]:
# Evaluate the compiled Baleen pipeline
compiled_baleen_retrieval_score = evaluate_on_hotpotqa(compiled_baleen, 
                                                       metric=gold_passages_retrieved,
                                                       display=False)
print(f"## Retrieval Score for compiled Baleen: {compiled_baleen_retrieval_score}")

Error for example in dev set: 		 unsupported operand type(s) for +=: 'int' and 'NoneType'
## Retrieval Score for compiled Baleen: 30.0




## Save the compiled pipeline, if needed 

In [None]:
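compiled_baleen.save("./dspy_models")

# load() restores the saved state into an existing instance in place,
# so create the instance first and then load into it.
compiled_baleen = SimplifiedBaleen()
compiled_baleen.load("./dspy_models")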

https://github.com/ai-bites/generative-ai-course/blob/main/dspy_demo.ipynb