# Newsletter LM: Improbable Automata
Using Mirascope to build an LLM agent that can write a newsletter.

## Setup

In [1]:
import os
from datetime import datetime
from dotenv import load_dotenv; load_dotenv()
from mirascope.core import openai

## Basic Usage

In [2]:


@openai.call("gpt-4o-mini")
def get_capital(country: str) -> str:
    return f"What is the capital of {country}?"


response = get_capital("Japan")
print(response.content)


The capital of Japan is Tokyo.


## o1-Style Thinking

In [3]:
# without chain of thought

history: list[dict[str, str]] = []

@openai.call("gpt-4o-mini")
def generate_answer(question: str) -> str:
    return f"Generate an answer to this question: {question}"


def run() -> None:
    question: str = "how many s's in the word mississssippi"
    # The decorated function returns a call response; the text lives in `.content`
    response = generate_answer(question)
    print(f"(User): {question}")
    print(f"(Assistant): {response.content}")
    history.append({"role": "user", "content": question})
    history.append({"role": "assistant", "content": response.content})


run()

(User): how many s's in the word mississssippi
(Assistant): The word "mississippi" contains 5 's' characters.
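
As a sanity check on the answer above, the count can be computed directly. This snippet is not part of the original notebook; it is a minimal sketch that relies only on Python's built-in `str.count`.

```python
# The deliberately misspelled word from the question above
word = "mississssippi"

# Count occurrences of the letter "s" directly
s_count = word.count("s")
print(s_count)
```

The word as typed contains six s's, so the single-shot answer of 5 is wrong.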


In [4]:
from typing import Literal

# from mirascope.core import groq
from pydantic import BaseModel, Field


history: list[dict] = []


class COTResult(BaseModel):
    title: str = Field(..., description="The title of the step")
    content: str = Field(..., description="The output content of the step")
    next_action: Literal["continue", "final_answer"] = Field(
        ..., description="The next action to take"
    )


# @groq.call("llama-3.1-70b-versatile", json_mode=True, response_model=COTResult)
@openai.call("gpt-4o-mini", json_mode=True, response_model=COTResult)
def cot_step(prompt: str, step_number: int, previous_steps: str) -> str:
    return f"""
    You are an expert AI assistant that explains your reasoning step by step.
    For this step, provide a title that describes what you're doing, along with the content.
    Decide if you need another step or if you're ready to give the final answer.

    Guidelines:
    - Use AT MOST 5 steps to derive the answer.
    - Be aware of your limitations as an LLM and what you can and cannot do.
    - In your reasoning, include exploration of alternative answers.
    - Consider you may be wrong, and if you are wrong in your reasoning, where it would be.
    - Fully test all other possibilities.
    - YOU ARE ALLOWED TO BE WRONG. When you say you are re-examining
        - Actually re-examine, and use another approach to do so.
        - Do not just say you are re-examining.

    IMPORTANT: Do not use code blocks or programming examples in your reasoning. Explain your process in plain language.

    This is step number {step_number}.

    Question: {prompt}

    Previous steps:
    {previous_steps}
    """


# @groq.call("llama-3.1-70b-versatile")
@openai.call("gpt-4o-mini")
def final_answer(prompt: str, reasoning: str) -> str:
    return f"""
    Based on the following chain of reasoning, provide a final answer to the question.
    Only provide the text response without any titles or preambles.
    Retain any formatting as instructed by the original prompt, such as exact formatting for free response or multiple choice.

    Question: {prompt}

    Reasoning:
    {reasoning}

    Final Answer:
    """


def generate_cot_response(
    user_query: str,
) -> tuple[list[tuple[str, str, float]], float]:
    steps: list[tuple[str, str, float]] = []
    total_thinking_time: float = 0.0
    step_count: int = 1
    reasoning: str = ""
    previous_steps: str = ""

    while True:
        start_time: datetime = datetime.now()
        cot_result = cot_step(user_query, step_count, previous_steps)
        end_time: datetime = datetime.now()
        thinking_time: float = (end_time - start_time).total_seconds()

        steps.append(
            (
                f"Step {step_count}: {cot_result.title}",
                cot_result.content,
                thinking_time,
            )
        )
        total_thinking_time += thinking_time

        reasoning += f"\n{cot_result.content}\n"
        previous_steps += f"\n{cot_result.content}\n"

        if cot_result.next_action == "final_answer" or step_count >= 5:
            break

        step_count += 1

    # Generate final answer
    start_time = datetime.now()
    final_result: str = final_answer(user_query, reasoning).content
    end_time = datetime.now()
    thinking_time = (end_time - start_time).total_seconds()
    total_thinking_time += thinking_time

    steps.append(("Final Answer", final_result, thinking_time))

    return steps, total_thinking_time


def display_cot_response(
    steps: list[tuple[str, str, float]], total_thinking_time: float
) -> None:
    for title, content, thinking_time in steps:
        print(f"{title}:")
        print(content.strip())
        print(f"**Thinking time: {thinking_time:.2f} seconds**\n")

    print(f"**Total thinking time: {total_thinking_time:.2f} seconds**")


def run() -> None:
    question: str = "How many s's are in the word 'mississssippi'?"
    print("(User):", question)
    # Generate COT response
    steps, total_thinking_time = generate_cot_response(question)
    display_cot_response(steps, total_thinking_time)

    # Add the interaction to the history
    history.append({"role": "user", "content": question})
    history.append(
        {"role": "assistant", "content": steps[-1][1]}
    )  # Add only the final answer to the history


# Run the function

run()

(User): How many s's are in the word 'mississssippi'?
Step 1: Counting the s's in 'mississssippi':
To determine how many 's's are present in the word 'mississssippi', I will start by breaking down the word. The word 'mississssippi' can be visually scanned for the letter 's'. I will count each 's' as I encounter it. The word structure indicates that there are groups of 's's: one 's' after the first 'm', two 's's following the first 'i', then an additional three 's's leading up to 'ippi'. As I tally, I find that the letters are indeed: 1 (from 'miss') + 2 (from 'iss') + 3 (from 'sss') = 6 s's in the total. I will verify if there can be an alternative method, such as writing each character and counting, but the visual and systematic scanning seems straightforward. Therefore, I conclude there are 6 's's. I need to decide if there should be another step to confirm or consider alternative aspects, or if I can finalize my answer now.
**Thinking time: 4.06 seconds**

Final Answer:
6
**Thinking
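
The control flow of `generate_cot_response` can be isolated from the LLM calls. The sketch below is an illustration under stated assumptions, not the notebook's code: `FakeStep`, `run_steps`, `never_done`, and `done_at_two` are hypothetical names, and the step function is a stub that stands in for `cot_step`. It shows the two ways the loop ends: the model asks for `final_answer`, or the step cap is reached.

```python
from dataclasses import dataclass
from typing import Callable, Literal


@dataclass
class FakeStep:
    """Hypothetical stand-in for `COTResult`."""

    title: str
    content: str
    next_action: Literal["continue", "final_answer"]


def run_steps(
    step_fn: Callable[[int, str], FakeStep], max_steps: int = 5
) -> list[FakeStep]:
    """Call `step_fn` until it requests the final answer or the cap is hit."""
    steps: list[FakeStep] = []
    previous_steps = ""
    for step_number in range(1, max_steps + 1):
        step = step_fn(step_number, previous_steps)
        steps.append(step)
        # Accumulate earlier content so the next step can build on it
        previous_steps += f"\n{step.content}\n"
        if step.next_action == "final_answer":
            break
    return steps


def never_done(step_number: int, previous_steps: str) -> FakeStep:
    # Stub that always asks for another step
    return FakeStep(f"Step {step_number}", "still thinking", "continue")


def done_at_two(step_number: int, previous_steps: str) -> FakeStep:
    # Stub that finishes on the second step
    action = "final_answer" if step_number == 2 else "continue"
    return FakeStep(f"Step {step_number}", "thinking", action)


capped = run_steps(never_done)
early = run_steps(done_at_two)
print(len(capped), len(early))
```

The cap matters because nothing forces the model to ever return `final_answer`; without it the loop could run indefinitely.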

## Blog Writing Agent

### Implementing the `OpenAIAgent` Base Class


In [5]:
from abc import abstractmethod

from mirascope.core import BaseMessageParam, openai
from pydantic import BaseModel


class OpenAIAgent(BaseModel):
    history: list[BaseMessageParam | openai.OpenAIMessageParam] = []

    @abstractmethod
    def _step(self, prompt: str) -> openai.OpenAIStream: ...

    def run(self, prompt: str) -> str:
        stream = self._step(prompt)
        result, tools_and_outputs = "", []

        for chunk, tool in stream:
            if tool:
                tools_and_outputs.append((tool, tool.call()))
            else:
                result += chunk.content
                print(chunk.content, end="", flush=True)
        if stream.user_message_param:
            self.history.append(stream.user_message_param)
        self.history.append(stream.message_param)
        if tools_and_outputs:
            self.history += stream.tool_message_params(tools_and_outputs)
            return self.run("")
        print("\n")
        return result

Note that the `_step` function is marked as an abstract method that each subclass will need to implement.
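
To make the abstract-method contract concrete, here is a minimal sketch in plain Python. It uses `abc.ABC` rather than pydantic, and `Agent` and `EchoAgent` are hypothetical names chosen for illustration; the point is only that the base class cannot be instantiated until a subclass supplies `_step`.

```python
from abc import ABC, abstractmethod


class Agent(ABC):
    """Hypothetical stand-in for `OpenAIAgent`, built on `abc.ABC`."""

    @abstractmethod
    def _step(self, prompt: str) -> str: ...

    def run(self, prompt: str) -> str:
        # The shared loop delegates the actual work to the subclass
        return self._step(prompt)


class EchoAgent(Agent):
    def _step(self, prompt: str) -> str:
        return f"echo: {prompt}"


error_message = ""
try:
    Agent()  # abstract class: instantiation raises TypeError
except TypeError as e:
    error_message = str(e)

result = EchoAgent().run("hello")
print(result)
```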

### Research Agent

#### Web Search Tool

In [6]:
import inspect

from duckduckgo_search import DDGS


class ResearcherBase(OpenAIAgent):
    max_results: int = 10

    def web_search(self, text: str) -> str:
        """Search the web for the given text.

        Args:
            text: The text to search for.

        Returns:
            The search results for the given text formatted as newline separated
            dictionaries with keys 'title', 'href', and 'body'.
        """
        try:
            results = DDGS(proxy=None).text(text, max_results=self.max_results)
            return "\n\n".join(
                [
                    inspect.cleandoc(
                        """
                        title: {title}
                        href: {href}
                        body: {body}
                        """
                    ).format(**result)
                    for result in results
                ]
            )
        except Exception as e:
            return f"{type(e)}: Failed to search the web for text"
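
The formatting step in `web_search` can be tried without hitting the network. The sketch below assumes search results shaped like the dictionaries described in the docstring; the two entries are made-up placeholder data, not real search output.

```python
import inspect

# Hypothetical results with the keys 'title', 'href', and 'body'
results = [
    {"title": "First", "href": "https://example.com/1", "body": "One."},
    {"title": "Second", "href": "https://example.com/2", "body": "Two."},
]

# `inspect.cleandoc` strips the common indentation and the blank first and
# last lines, so the template can be written as an indented block
template = inspect.cleandoc(
    """
    title: {title}
    href: {href}
    body: {body}
    """
)

# One formatted entry per result, separated by a blank line
formatted = "\n\n".join(template.format(**result) for result in results)
print(formatted)
```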

#### Web Parser

In [7]:
import requests
from bs4 import BeautifulSoup


class ResearcherBaseWithParser(ResearcherBase):
    def parse_webpage(self, link: str) -> str:
        """Parse the paragraphs of the webpage found at `link`.

        Args:
            link: The URL of the webpage.

        Returns:
            The parsed paragraphs of the webpage, separated by newlines.
        """
        try:
            # A timeout keeps a slow or unresponsive site from hanging the agent
            response = requests.get(link, timeout=10)
            soup = BeautifulSoup(response.content, "html.parser")
            return "\n".join([p.text for p in soup.find_all("p")])
        except Exception as e:
            return f"{type(e)}: Failed to parse content from URL"
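
To show what "parsing the paragraphs" amounts to, here is a dependency-free sketch that uses the standard library's `html.parser` as a stand-in for BeautifulSoup. `ParagraphExtractor` and `extract_paragraphs` are hypothetical names, and the sample HTML is made up; the notebook itself uses `soup.find_all("p")`.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: list[str] = []
        self._in_paragraph = False
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            # Close the paragraph and store its accumulated text
            self.paragraphs.append("".join(self._buffer))
            self._buffer = []
            self._in_paragraph = False

    def handle_data(self, data):
        # Keep text only while inside a paragraph (including nested tags)
        if self._in_paragraph:
            self._buffer.append(data)


def extract_paragraphs(html: str) -> str:
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.paragraphs)


sample = "<h1>T</h1><p>First <b>bold</b> paragraph.</p><div>skip</div><p>Second.</p>"
parsed = extract_paragraphs(sample)
print(parsed)
```

Headings and `<div>` text are dropped, which is why the researcher prompt tells the model to treat the search `body` as a preview and call `parse_webpage` for the full text.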

#### Researcher Step Function

In [8]:
from mirascope.core import prompt_template


class ResearcherBaseWithStep(ResearcherBaseWithParser):
    @openai.call("gpt-4o-mini", stream=True)
    @prompt_template(
        """
        SYSTEM:
        Your task is to research a topic and summarize the information you find.
        This information will be given to a writer (user) to create a blog post.

        You have access to the following tools:
        - `web_search`: Search the web for information. Limit to max {self.max_results}
            results.
        - `parse_webpage`: Parse the content of a webpage.

        When calling the `web_search` tool, the `body` is simply the body of the search
        result. You MUST then call the `parse_webpage` tool to get the actual content
        of the webpage. It is up to you to determine which search results to parse.

        Once you have gathered all of the information you need, generate a writeup that
        strikes the right balance between brevity and completeness. The goal is to
        provide as much information to the writer as possible without overwhelming them.

        MESSAGES: {self.history}
        USER: {prompt}
        """
    )
    def _step(self, prompt: str) -> openai.OpenAIDynamicConfig:
        return {"tools": [self.web_search, self.parse_webpage]}

#### Implementing the `research` tool method

While we could use the `run` method from our `OpenAIAgent` as a tool, there is value in further engineering our prompt by providing good descriptions (and names!) for the tools we use. Putting everything together, we can expose a `research` method that we can later use as a tool in our agent executor:

In [9]:
class Researcher(ResearcherBaseWithStep):
    def research(self, prompt: str) -> str:
        """Research a topic and summarize the information found.

        Args:
            prompt: The user prompt to guide the research. The content of this prompt
                is directly responsible for the quality of the research, so it is
                crucial that the prompt be clear and concise.

        Returns:
            The results of the research.
        """
        print("RESEARCHING...")
        result = self.run(prompt)
        print("RESEARCH COMPLETE!")
        return result
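
Names and docstrings matter because a tool-calling framework has little else to describe a function to the model. The sketch below is an illustration of that idea using only `inspect`; `describe_tool` is a hypothetical helper, not Mirascope's actual conversion logic, and the stub `research` function merely mimics the method above.

```python
import inspect
from typing import Callable


def describe_tool(fn: Callable) -> dict:
    """Build a rough tool description from a function's metadata."""
    signature = inspect.signature(fn)
    docstring = inspect.getdoc(fn) or ""
    return {
        "name": fn.__name__,
        # Use the first docstring line as a short description
        "description": docstring.split("\n")[0],
        "parameters": {
            name: getattr(param.annotation, "__name__", str(param.annotation))
            for name, param in signature.parameters.items()
        },
    }


def research(prompt: str) -> str:
    """Research a topic and summarize the information found."""
    return f"notes on {prompt}"


tool_description = describe_tool(research)
print(tool_description)
```

A method named `run` with no docstring would yield an empty description here, which is the kind of under-specified tool the dedicated `research` method avoids.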

### Writing Agent

#### Initial Draft

In [17]:
from mirascope.integrations.tenacity import collect_errors
from pydantic import ValidationError
from tenacity import retry, wait_exponential


class AgentExecutorBase(OpenAIAgent):
    researcher: Researcher = Researcher()
    num_paragraphs: int = 4

    class InitialDraft(BaseModel):
        draft: str
        critique: str

    @staticmethod
    def parse_initial_draft(response: InitialDraft) -> str:
        return f"Draft: {response.draft}\nCritique: {response.critique}"

    @retry(
        wait=wait_exponential(multiplier=1, min=4, max=10),
        after=collect_errors(ValidationError),
    )
    @openai.call(
        "gpt-4o-mini", response_model=InitialDraft, output_parser=parse_initial_draft
    )
    @prompt_template(
        """
        SYSTEM:
        Your task is to write the initial draft for a blog post based on the information
        provided to you by the researcher, which will be a summary of the information
        they found on the internet.

        Along with the draft, you will also write a critique of your own work. This
        critique is crucial for improving the quality of the draft in subsequent
        iterations. Ensure that the critique is thoughtful, constructive, and specific.
        It should strike the right balance between comprehensive and concise feedback.

        If for any reason you deem that the research is insufficient or unclear, you can
        request that additional research be conducted by the researcher. Make sure that
        your request is specific, clear, and concise.

        MESSAGES: {self.history}
        USER:
        {previous_errors}
        {prompt}
        """
    )
    def _write_initial_draft(
        self, prompt: str, *, errors: list[ValidationError] | None = None
    ) -> openai.OpenAIDynamicConfig:
        """Writes the initial draft of a blog post along with a self-critique.

        Args:
            prompt: The user prompt to guide the writing process. The content of this
                prompt is directly responsible for the quality of the blog post, so it
                is crucial that the prompt be clear and concise.

        Returns:
            The initial draft of the blog post along with a self-critique.
        """
        return {
            "computed_fields": {
                "previous_errors": f"Previous Errors: {errors}" if errors else None
            }
        }

There are a few things worth noting here:

- We are again using `self` for convenient access to the containing class's state. Here the function lives inside our executor, so it needs access to the conversation history, particularly the researcher's results.
- We are using `response_model` to extract specifically the `draft` and `critique` fields.
- We are using an output parser `parse_initial_draft` to parse the `InitialDraft` class into a format that is friendly for using tools (str).
- We are using tenacity in order to retry should the call fail to properly generate an `InitialDraft` instance, reinserting the list of previous errors into each subsequent call.

### Agent Executor

Now we just need to put it all together into our `AgentExecutor` class, write our `_step` function, and run it!

In [18]:
class AgentExecutor(AgentExecutorBase):
    @openai.call("gpt-4o-mini", stream=True)
    @prompt_template(
        """
        SYSTEM:
        Your task is to facilitate the collaboration between the researcher and the
        blog writer. The researcher will provide the blog writer with the information
        they need to write a blog post, and the blog writer will draft and critique the
        blog post until they reach a final iteration they are satisfied with.

        To access the researcher and writer, you have the following tools:
        - `research`: Prompt the researcher to perform research.
        - `write_initial_draft`: Write an initial draft with a self-critique.

        You will need to manage the flow of information between the researcher and the
        blog writer, ensuring that the information provided is clear, concise, and
        relevant to the task at hand.

        The final blog post MUST have EXACTLY {self.num_paragraphs} paragraphs.

        MESSAGES: {self.history}
        USER: {prompt}
        """
    )
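    def _step(self, prompt: str) -> openai.OpenAIDynamicConfig:
        # Plain function wrappers expose only `prompt` as a tool argument, leaving
        # out the `errors` keyword used for validation error handling
        def research(prompt: str) -> str:
            """Prompt the researcher to research a topic and summarize it."""
            return self.researcher.research(prompt)

        def write_initial_draft(prompt: str) -> str:
            """Write an initial draft of the blog post with a self-critique."""
            return self._write_initial_draft(prompt)

        return {"tools": [research, write_initial_draft]}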


agent = AgentExecutor()
print("STARTING AGENT EXECUTION...")
agent.run("Help me write a blog post about LLMs and structured outputs.")

STARTING AGENT EXECUTION...
RESEARCHING...
### Large Language Models (LLMs): Overview

Large Language Models (LLMs) are sophisticated AI systems trained on vast datasets to understand and generate human language. Built upon deep learning architectures, particularly the transformer model, they excel at various tasks including writing, summarization, translation, and conversation. Notable examples include OpenAI's GPT series, Google's BERT, and Meta's LLaMA, all leveraging billions of parameters to discern nuanced patterns in language. As a result, LLMs can create coherent, contextually relevant text and engage in human-like conversations, making them integral to advancements in natural language processing (NLP) and artificial intelligence (AI) applications.

### Relationship to Structured Outputs

LLMs can generate structured outputs, such as tables, formatted documents, or specific data queries (like SQL), based on unstructured inputs. Through techniques like prompt engineering and ret

"The latest draft is a well-structured and engaging discussion on Large Language Models (LLMs) and their capabilities regarding structured outputs. It successfully integrates real-world examples and addresses the associated challenges while remaining concise and informative. Here is the final draft based on the improvements made through previous iterations:\n\n---\n\n### Understanding Large Language Models and Their Structured Output Capabilities\n\nLarge Language Models (LLMs) represent a significant breakthrough in artificial intelligence, designed to interpret and generate human language through advanced deep learning techniques. These models, such as OpenAI's GPT and Google's BERT, are trained on extensive datasets, allowing them to recognize intricate patterns and contextual relationships in language. While these powerful tools excel at producing coherent and relevant textual responses, their adaptability extends beyond mere conversation—they can also generate structured outputs t