In [3]:
import os
os.chdir('../promptsmith')

### Setting Up DSPy

In [4]:
from promptsmith.dspy_init import get_dspy
dspy, lm = get_dspy()

In [5]:
# Define a module (ChainOfThought) and assign it a signature (return an answer, given a question).
qa = dspy.ChainOfThought('question -> answer')

response = qa(question="How many floors are in the castle David Gregory inherited?")
print(response.answer)

The number of floors in the castle David Gregory inherited is not specified in the information provided.


### Access the last call to the LLM, with all metadata

In [6]:
len(lm.history)  # number of LM calls recorded so far

1

In [7]:
lm.history[-1].keys()

dict_keys(['prompt', 'messages', 'kwargs', 'response', 'outputs', 'usage', 'cost', 'timestamp', 'uuid', 'model', 'response_model', 'model_type'])
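Each entry in `lm.history` is a plain dict with the keys listed above, so simple summaries need nothing beyond ordinary Python. The sketch below totals the `cost` field over a list of such dicts. It assumes `cost` can be missing or `None` for some calls, and `sample_history` is a hypothetical stand-in for `lm.history` with only the relevant keys filled in.

```python
def summarize_history(history):
    """Return (number of calls, total cost) for a list of history-style dicts."""
    total_cost = 0.0
    for entry in history:
        # Assumption: 'cost' may be absent or None for some calls, so count it as 0.
        total_cost += entry.get("cost") or 0.0
    return len(history), total_cost


# Hypothetical stand-in for lm.history; real entries carry many more keys.
sample_history = [
    {"model": "openai/gpt-4o-mini", "cost": 8.235e-05},
    {"model": "openai/gpt-4o-mini", "cost": None},
]

calls, cost = summarize_history(sample_history)
print(f"{calls} calls, total cost ${cost:.6f}")
```

In the notebook itself, the same function could be called as `summarize_history(lm.history)`.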

In [8]:
import pprint
pp = pprint.PrettyPrinter(indent=2)
pp.pprint(lm.history[-1])

{ 'cost': 8.235e-05,
  'kwargs': {},
  'messages': [ { 'content': 'Your input fields are:\n'
                             '1. `question` (str)\n'
                             'Your output fields are:\n'
                             '1. `reasoning` (str)\n'
                             '2. `answer` (str)\n'
                             'All interactions will be structured in the '
                             'following way, with the appropriate values '
                             'filled in.\n'
                             '\n'
                             '[[ ## question ## ]]\n'
                             '{question}\n'
                             '\n'
                             '[[ ## reasoning ## ]]\n'
                             '{reasoning}\n'
                             '\n'
                             '[[ ## answer ## ]]\n'
                             '{answer}\n'
                             '\n'
                             '[[ ## completed ## ]]\n'
               

### Request Multiple Variations

In [9]:
question = "What's something great about the ColBERT retrieval model?"

answer_a_question = dspy.ChainOfThought('question -> answer', n=5)

response = answer_a_question(question=question)

In [10]:
response.completions.answer

['A great aspect of the ColBERT retrieval model is its ability to effectively combine dense and sparse retrieval techniques, achieving high accuracy and efficiency through a late interaction mechanism.',
 'One great aspect of the ColBERT retrieval model is its use of late interaction, which enables efficient and scalable retrieval while maintaining high accuracy by combining the strengths of both dense and sparse retrieval methods.',
 "One great aspect of the ColBERT retrieval model is its efficient late interaction mechanism, which allows it to utilize BERT's powerful contextual embeddings while maintaining high retrieval speed, making it suitable for large-scale applications.",
 'One great thing about the ColBERT retrieval model is its efficient late interaction mechanism, which allows it to combine the powerful contextual embeddings of BERT with scalable retrieval, making it effective for large datasets while maintaining high accuracy.',
 'One great aspect of the ColBERT retrieval m

In [11]:
print(f"Reasoning: {response.reasoning}")
print(f"Answer: {response.answer}")

Reasoning: One of the great aspects of the ColBERT retrieval model is its ability to combine the advantages of both dense and sparse retrieval methods. ColBERT utilizes a two-step approach where it first creates dense representations of queries and documents, allowing for efficient similarity calculations, and then employs a late interaction mechanism to selectively compare these representations. This enables it to achieve high retrieval accuracy while maintaining efficiency, making it suitable for large-scale information retrieval tasks. Additionally, its architecture allows for flexible integration with existing dense retrieval systems, enhancing performance without extensive modifications.
Answer: A great aspect of the ColBERT retrieval model is its ability to effectively combine dense and sparse retrieval techniques, achieving high accuracy and efficiency through a late interaction mechanism.


### Check LLM Usage

In [12]:
response.get_lm_usage()


{'openai/gpt-4o-mini': {'completion_tokens': 800,
  'prompt_tokens': 173,
  'total_tokens': 973,
  'completion_tokens_details': {'accepted_prediction_tokens': 0,
   'audio_tokens': 0,
   'reasoning_tokens': 0,
   'rejected_prediction_tokens': 0,
   'text_tokens': None},
  'prompt_tokens_details': {'audio_tokens': 0,
   'cached_tokens': 0,
   'text_tokens': None,
   'image_tokens': None}}}
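The usage report is a nested dict keyed by model name, so several reports can be merged with plain Python. The sketch below sums only the top-level integer counters and skips the nested `*_details` dicts and `None` placeholders. The helper name `total_tokens_by_model` and the second usage dict are hypothetical and only there for illustration.

```python
from collections import defaultdict


def total_tokens_by_model(usages):
    """Sum the top-level integer token counters per model across usage dicts."""
    totals = defaultdict(lambda: defaultdict(int))
    for usage in usages:
        for model, counters in usage.items():
            for name, value in counters.items():
                # Keep integer counters only; nested detail dicts and None are skipped.
                if isinstance(value, int):
                    totals[model][name] += value
    return {model: dict(counters) for model, counters in totals.items()}


# usage_a copies the top-level counters shown above; usage_b is a hypothetical second call.
usage_a = {
    "openai/gpt-4o-mini": {
        "completion_tokens": 800,
        "prompt_tokens": 173,
        "total_tokens": 973,
        "completion_tokens_details": {"reasoning_tokens": 0},
    }
}
usage_b = {
    "openai/gpt-4o-mini": {
        "completion_tokens": 100,
        "prompt_tokens": 50,
        "total_tokens": 150,
    }
}

totals = total_tokens_by_model([usage_a, usage_b])
print(totals)
```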

### Example: Inferring a Feeling from a Sentence and Situation

In [13]:
feeling_analyzer = dspy.Predict('sentence, situation -> the_actual_feeling_of_the_person_in_the_sentence: str, reasoning: str')

sentence = "i went outside after a long time being in a dark room"
situation = "it's raining outside"
response = feeling_analyzer(sentence=sentence, situation=situation)

print(response.the_actual_feeling_of_the_person_in_the_sentence)
print(response.reasoning)

a mix of relief and disappointment
The person likely feels relief from finally being outside after being in a dark room for a long time, as it can be refreshing to experience natural light and fresh air. However, the disappointment comes from the fact that it is raining outside, which may dampen their mood and prevent them from fully enjoying the experience of being outdoors.


### Using a Judge

In [21]:
from promptsmith.judges.judge_meaning import JudgeMeaning

judge_meaning = dspy.Predict(JudgeMeaning)

input_text = (
    "Yesterday, I went to the grocery store to buy ingredients for dinner. "
    "I ended up buying fruits, vegetables, and pasta. When I got home, I realized I forgot the cheese."
)

output_text = (
    "I went shopping yesterday to get food. I bought some fruits, vegetables, and pasta."
)

result = judge_meaning(input_text=input_text, output_text=output_text)

print("Reasoning:", result.reasoning)
print("Score:", result.score)

Reasoning: The output text captures the general idea of going shopping and buying some items, but it omits several key details from the original text. Specifically, it does not mention that the shopping was for dinner ingredients, nor does it include the fact that cheese was forgotten, which is a significant detail that affects the overall meaning of the narrative. While the main actions (going shopping and buying certain items) are preserved, the context and completeness of the story are lost. Therefore, the changes are not acceptable as they alter the essential meaning of the original text.
Score: 0.5


### Using the Restructure Text Task

In [16]:
from promptsmith.tasks.restructure_text import RestructureText

text_to_restructure = (
    "I was trying to fix the kitchen sink. At first, I thought it was a clog, but it turned out to be a broken pipe. "
    "Water was everywhere, and I had no tools. I called my friend who had some plumbing experience, and he came over. "
    "Together we shut off the water and replaced the pipe, which took us the entire afternoon."
)

restructure = dspy.ChainOfThought(RestructureText)
restructured_text = restructure(input_text=text_to_restructure)


print("\n📝 Original Text:")
print("----------------------")
print(text_to_restructure)
print("----------------------")

print("\n📘 Restructured Text:")
print("----------------------")
print(restructured_text.output_text)
print("----------------------")

print("\n🧠 Reasoning:")
print(restructured_text.reasoning)

print("\n🤖 DSPy History:")
# inspect_history prints the last call itself and returns None, so it is not wrapped in print()
dspy.inspect_history(n=1)


📝 Original Text:
----------------------
I was trying to fix the kitchen sink. At first, I thought it was a clog, but it turned out to be a broken pipe. Water was everywhere, and I had no tools. I called my friend who had some plumbing experience, and he came over. Together we shut off the water and replaced the pipe, which took us the entire afternoon.
----------------------

📘 Restructured Text:
----------------------
### Fixing the Kitchen Sink: A Plumbing Adventure

Recently, I faced a challenge while trying to fix my kitchen sink. Initially, I suspected that a clog was the issue, but I soon discovered that the real problem was a broken pipe.

As water spilled everywhere, I realized I didn't have the necessary tools to handle the situation. In a moment of urgency, I called my friend, who has some plumbing experience. He quickly came over to help me.

Together, we worked to shut off the water supply and replace the broken pipe. This task took us the entire afternoon, but we managed 

#### Evaluating the Restructured Text Using the Ensemble Judge

In [17]:
from promptsmith.judges.ensemble_judge import EnsembleJudge
import os

judge_path = os.path.abspath("../promptsmith/judges/judge_restructure_text.yaml")

judge = EnsembleJudge(judge_path)
verdict = judge(input_text=text_to_restructure, output_text=restructured_text.output_text)

In [None]:
def display_verdict(verdict):
    """Print the overall score, then each judge's score, weight, and reasoning."""
    print("\n📊 Evaluation Results:")
    print("----------------------")

    # Note: this reads the private `_store` dict, which may change between versions
    store = verdict._store

    # Collect per-judge score fields and weights (weights keyed by judge name)
    score_fields = [k for k in store if k.endswith('_score') and k != 'combined_score']
    weight_fields = {k[:-len('_weight')]: store[k] for k in store if k.endswith('_weight')}

    # Display overall score
    overall = store.get('combined_score')
    if overall is None and score_fields:
        # Fallback: unweighted average of all scores
        overall = sum(store[k] for k in score_fields) / len(score_fields)
    if overall is None:
        # No combined score and no per-judge scores to average
        print("\n🌟 Overall Score: n/a\n")
    else:
        print(f"\n🌟 Overall Score: {overall:.3f}\n")

    # For each judge (sorted for consistent order), display name, score, weight, and reasoning
    for field in sorted(score_fields):
        judge_key = field[:-len('_score')]
        judge_name = judge_key.replace('_', ' ').title()
        score = store[field]
        reasoning = store.get(f"{judge_key}_reasoning", "")
        weight = weight_fields.get(judge_key)
        if weight is not None:
            print(f"### {judge_name} Analysis (score={score:.2f}, weight={weight})")
        else:
            print(f"### {judge_name} Analysis (score={score:.2f})")
        print(reasoning)
        print()  # Blank line between judges
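
The fallback in `display_verdict` is an unweighted average. If the `_weight` fields should count when `combined_score` is missing, a weighted mean is one option. This is only a sketch: how `EnsembleJudge` actually computes `combined_score` is not shown in this notebook, and the judge names, scores, and weights below are hypothetical.

```python
def weighted_mean(scores, weights):
    """Weighted mean of judge scores; judges without a weight default to 1.0."""
    total_weight = sum(weights.get(name, 1.0) for name in scores)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    weighted_sum = sum(score * weights.get(name, 1.0) for name, score in scores.items())
    return weighted_sum / total_weight


# Hypothetical judges, for illustration only.
scores = {"judge_a": 1.0, "judge_b": 0.5}
weights = {"judge_a": 0.75, "judge_b": 0.25}

combined = weighted_mean(scores, weights)
print(f"{combined:.3f}")
```

Dividing by the sum of the weights keeps the result on the same scale as the individual scores even when the weights do not add up to 1.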

In [18]:
display_verdict(verdict)


📊 Evaluation Results:
----------------------

🌟 Overall Score: 0.990

### Focus Relevance Analysis (score=1.00, weight=0.25)
The restructured text maintains a strong focus on the original message about fixing the kitchen sink. It closely follows the sequence of events, detailing the initial diagnosis of a clog, the discovery of a broken pipe, the lack of tools, and the assistance from a friend. Each sentence adds relevant information without drifting off-topic or introducing unrelated content. The narrative style enhances engagement while remaining true to the original experience. Overall, the rewrite effectively captures the essence of the original text without unnecessary filler or generalizations.

### Meaning Analysis (score=1.00, weight=0.25)
The restructured text maintains the essential meaning of the original input. Key ideas such as the initial assumption of a clog, the discovery of a broken pipe, the urgency of the situation due to water spilling everywhere, and the involveme