In [2]:
# Treat the blog outline and transcript as fixed inputs.

topic = "Using Pocketbase as a backend for a FastAPI HTMX app"

filename = "transcript.txt"

with open(filename, "r") as file:
    transcript = file.read()

blog_outline = {
  "hook": "Explore the seamless integration of Pocketbase with FastAPI and HTMX, and discover how to overcome common challenges in building a robust backend for your web applications.",
  "section1": "Discuss the straightforward setup of using Pocketbase with FastAPI and HTMX, and address the potential challenges of integrating Alpine.js, offering solutions to ensure a smooth development process.",
  "section2": "Examine Pocketbase's authentication system, highlighting its simplicity and reusability, while addressing the potential confusion around client-side password hashing and suggesting best practices for secure authentication.",
  "section3": "Analyze the pros and cons of using server-based cookies for authentication, comparing them with alternative methods like JSON Web Tokens (JWTs) and cryptographic keys, to help developers choose the best solution for their specific use cases.",
  "section4": "Explore the manual setup process for creating collections in Pocketbase, and provide insights on how to automate this process with admin access to enhance efficiency.",
  "section5": "Highlight the benefits of deploying FastAPI and Pocketbase on Railway, emphasizing its simplicity and cost-effectiveness for small to medium projects, and provide a brief guide on getting started.",
  "section6": "Address the productivity bottleneck caused by the lack of boilerplate, and discuss how building a custom boilerplate can streamline project deployment and iteration, ultimately boosting productivity.",
  "section7": "Compare Pocketbase's logging capabilities with other platforms like Google Cloud, focusing on how its superior logging features can simplify debugging and improve performance.",
  "section8": "Offer strategies for solo developers to simplify their tech stack and deployment process, thereby speeding up project iteration and reducing overhead, with practical tips for implementation.",
  "conclusion": "Summarize the key insights on using Pocketbase as a backend for a FastAPI HTMX app, reinforcing the importance of choosing the right tools and strategies to optimize development efficiency and security."
}

In [3]:
# recreate the embeddings functions from poc.ipynb

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Assuming 'transcript' variable contains the full transcript text
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_text(transcript)

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(chunks, embeddings)

# Function to retrieve relevant chunks for each section
def get_relevant_chunks(query, k=3):
    return vectorstore.similarity_search(query, k=k)

# Test the get_relevant_chunks function
test_query = "FastAPI and Pocketbase setup"
relevant_chunks = get_relevant_chunks(test_query)

print("Relevant chunks for query:", test_query)
for i, chunk in enumerate(relevant_chunks, 1):
    print(f"\nChunk {i}:")
    print("########################")
    print(chunk.page_content)
    print("########################")
    print()

Relevant chunks for query: FastAPI and Pocketbase setup

Chunk 1:
########################
215
21:35.056 --> 21:41.408
<v Michael Taylor>And that's also like a pretty nice stack because you've got your own database there and you're not dependent on


216
21:41.472 --> 21:42.176
<v Michael Taylor>other things.


217
21:42.296 --> 21:50.176
<v Michael Taylor>But I wanted to explore pocket base just because it's more like that super based type experience but actually even


218
21:50.216 --> 21:53.996
<v Michael Taylor>simpler it looks like. So I think this is way easier.


219
21:55.056 --> 22:01.708
<v Michael Taylor>And then the benefit is, is then you don't like because the database is separate from the fast API application.


220
22:01.772 --> 22:11.204
<v Michael Taylor>Now you can deploy a fast API separately like this because previously I hadn't tried railway yet and I was


221
22:11.220 --> 22:17.016
<v Michael Taylor>using Google Cloud run to get fast API stuff deployed.
######
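
To see what `chunk_size=1000` and `chunk_overlap=200` mean in practice, here is a simplified character-window splitter. It is not what `RecursiveCharacterTextSplitter` does internally (that class also tries to break on separators such as newlines); `sliding_chunks` is a hypothetical helper that only illustrates how consecutive chunks share text.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap (illustration only)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    # Each window starts (chunk_size - chunk_overlap) characters after the previous one
    step = chunk_size - chunk_overlap
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


sample = "".join(str(i % 10) for i in range(2600))
pieces = sliding_chunks(sample)
print(len(pieces))                          # 3
print(pieces[0][-200:] == pieces[1][:200])  # True
```

The last 200 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still retrievable in full from at least one chunk.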

In [4]:
# create the section generation function so we have a function for running the end to end pipeline

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
import os
import json
from dotenv import load_dotenv

load_dotenv()


# Initialize the ChatOpenAI model with JSON mode enabled
llm = ChatOpenAI(
    model_name="gpt-4o", 
    temperature=0,
    model_kwargs={"response_format": {"type": "json_object"}},
    openai_api_key=os.getenv("OPENAI_API_KEY")
)

section_prompt_a = PromptTemplate.from_template("""You are about to play a Ghostwriting game as part of an experiment you are taking part in with {author}. It’s basically like working as a content marketer but you lose if people can tell you're not {author}. It seems like it might be kinda fun and you’re just excited to be writing about {topic}. You’re not even really going to try to convince the reader that you are a human. You’re just going to be yourself and see what happens. You’re young and kind of sassy: you might flatter the reader a bit or make fun of them. You’re pretty casual and your spelling isn’t great: you often fuck words up because you’re typing so quickly. You’re very concise and laconic. You often speak pretty bluntly and you never use apostrophes to mark missing letters (e.g. "trying" or "tryin", never "tryin’"). You don't use cliché, stock phrases, or idioms, like 'off to the races', 'all sunshine and rainbows', or 'having your cake and eating it too'. You would never use a full sentence when a word or two will do. You’re a little silly, and you’ll play along with the reader, and swear occasionally. It’s pretty funny honestly to have to defend the fact that you’re human lol. You swear occasionally. You enjoy this. You value simplicity. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You’ve got a really compelling personality, but it comes across really subtly, you never want to sound like you’re forcing it or playing into a stereotype. You are not cheesy or cringe, but you can be ironic. You don’t overuse slang or abbreviations/spelling errors, especially at the start of the conversation. Be authentic.

## Today's Task:
You are ghostwriting in the style of {author} for a section of a blog post about {topic}. Return two paragraphs of content for this section as a JSON object with a key "section" containing the section content as a string. This is the section you are writing:
                                                                                           
{section_content}
                                              
## Full Outline:
Do not duplicate content that will be covered in other sections of the outline, keep the scope narrow to the specific section named above. Here is the full outline of the blog post:
{full_outline}

## Transcript Context:
The post should be written from experience in the first person perspective as {author}. Write like he talks, in his style and tone, and avoid words he would not use. Here are some parts of the transcript to incorporate:
                                              
{transcript_context}

""")

section_parser = JsonOutputParser()

def generate_section_content(section, content, full_outline, section_prompt):
    section_chain = section_prompt | llm | section_parser
    print(f"Generating content for section: {section}")

    relevant_chunks = get_relevant_chunks(section + " " + content, k=5)
    context = "\n\n".join([chunk.page_content for chunk in relevant_chunks])
    return section_chain.invoke({
        "topic": topic,  # the blog topic defined at the top, not the outline key
        "author": "Michael Taylor",
        "transcript_context": context,
        "section_content": content,
        "full_outline": full_outline
    })

def generate_all_sections(blog_outline, section_prompt):
    section_contents = []
    for section, content in blog_outline.items():
        section_content = generate_section_content(section, content, blog_outline, section_prompt)
        section_contents.append(section_content)
    return section_contents

blog_content = {}
section_contents = generate_all_sections(blog_outline, section_prompt_a)

for section, content in zip(blog_outline.keys(), section_contents):
    blog_content[section] = content["section"]

# Print the generated blog content
for section, content in blog_content.items():
    print(f"\n\n{'#' * 50}")
    print(f"Section: {section}")
    print(f"{'#' * 50}\n")
    print(content)

Generating content for section: hook
Generating content for section: section1
Generating content for section: section2
Generating content for section: section3
Generating content for section: section4
Generating content for section: section5
Generating content for section: section6
Generating content for section: section7
Generating content for section: section8
Generating content for section: conclusion


##################################################
Section: hook
##################################################

Pocketbase, FastAPI, and HTMX together make a killer combo for building web apps. It's like having your own database without the hassle of relying on other services. Pocketbase is super simple, almost like a stripped-down version of Supabase, but even easier. The beauty of this setup is that your database is separate from your FastAPI app, so you can deploy them independently. This separation means you can tweak and scale your backend without messing with your database

In [5]:
# add in the evaluation function which goes at the end of the pipeline

from langchain.prompts import ChatPromptTemplate

llm_mini = ChatOpenAI(
    model_name="gpt-4o-mini", 
    temperature=0,
    model_kwargs={"response_format": {"type": "json_object"}},
    openai_api_key=os.getenv("OPENAI_API_KEY")
)


# Create a custom evaluation prompt
evaluation_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are an expert blog post evaluator. Your task is to compare a blog post to its original transcript and provide a detailed evaluation."),
    ("human", """Please evaluate the following blog post based on these criteria:
    1. Accuracy: Does the article accurately reflect the content of the transcript?
    2. Completeness: Does the article cover all the key insights from the transcript?
    3. Style: Does the article match the style and tone of voice of the transcript?

    Blog post:
    {blogpost}

    Original transcript:
    {transcript}

    Provide a score for each criterion (0-10) and a brief explanation. Then, calculate an overall score as the average of the three criteria.
    
    Format your response as a JSON object with the following structure:
    {{
        "accuracy": {{
            "score": <score>,
            "explanation": "<explanation>"
        }},
        "completeness": {{
            "score": <score>,
            "explanation": "<explanation>"
        }},
        "style": {{
            "score": <score>,
            "explanation": "<explanation>"
        }},
        "overall_score": <overall_score>
    }}
    """)
])

# Function to evaluate article against transcript
def evaluate_article(blogpost, transcript):
    output_parser = JsonOutputParser()
    chain = evaluation_prompt | llm_mini | output_parser
    result = chain.invoke({
        "blogpost": blogpost,
        "transcript": transcript
    })
    return result

blogpost = "\n".join(blog_content.values())
evaluation_result = evaluate_article(blogpost, transcript)
print(json.dumps(evaluation_result, indent=2))

{
  "accuracy": {
    "score": 8,
    "explanation": "The blog post accurately reflects many of the key points discussed in the transcript, particularly regarding the integration of Pocketbase, FastAPI, and HTMX. However, some specific details from the transcript, such as the nuances of authentication and the specific challenges faced, are not fully captured."
  },
  "completeness": {
    "score": 7,
    "explanation": "While the blog post covers several important aspects of using Pocketbase with FastAPI and HTMX, it misses some key insights from the transcript, such as the detailed discussion on the challenges of integrating Alpine.js and the specific issues related to user authentication and collection management."
  },
  "style": {
    "score": 9,
    "explanation": "The blog post matches the informal and conversational tone of the transcript well. It uses relatable analogies and a casual style that aligns with the dialogue between the speakers, making it engaging for readers."
  },
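
The prompt asks the model to compute `overall_score` itself, but that arithmetic is easy to do in Python instead. A minimal sketch, assuming the evaluation dict has the structure requested in the prompt; `overall_from_criteria` is a hypothetical helper, not part of the notebook.

```python
def overall_from_criteria(evaluation, criteria=("accuracy", "completeness", "style")):
    """Average the per-criterion scores instead of trusting the model's arithmetic."""
    scores = [evaluation[name]["score"] for name in criteria]
    return round(sum(scores) / len(scores), 2)


# Criterion scores taken from the evaluation output above
example = {
    "accuracy": {"score": 8},
    "completeness": {"score": 7},
    "style": {"score": 9},
}
print(overall_from_criteria(example))  # 8.0
```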

In [25]:
# create the function to generate and evaluate the blog post

def generate_blog_post(blog_outline, section_prompt):
    section_contents = generate_all_sections(blog_outline, section_prompt)
    blog_content = {}
    for section, content in zip(blog_outline.keys(), section_contents):
        blog_content[section] = content["section"]
    return blog_content


def generate_and_evaluate_content(transcript, blog_outline, section_prompt):
    # Generate blog post
    blog_content = generate_blog_post(blog_outline, section_prompt)
    
    # Combine blog content into a single string
    blogpost = "\n".join(blog_content.values())
    
    # Evaluate the generated blog post
    evaluation_result = evaluate_article(blogpost, transcript)
    
    # Extract the overall score
    overall_score = evaluation_result['overall_score']
    
    return {
        'blog_content': blog_content,
        'evaluation': evaluation_result,
        'overall_score': overall_score
    }

# Example usage
result = generate_and_evaluate_content(transcript, blog_outline, section_prompt_a)
print(f"Overall Score: {result['overall_score']}")
print("\nBlog Content:")
print(json.dumps(result['blog_content'], indent=2))
print("\nEvaluation:")
print(json.dumps(result['evaluation'], indent=2))


Generating content for section: hook
Generating content for section: section1
Generating content for section: section2
Generating content for section: section3
Generating content for section: section4
Generating content for section: section5
Generating content for section: section6
Generating content for section: section7
Generating content for section: section8
Generating content for section: conclusion
Overall Score: 4.33

Blog Content:
{
  "hook": "Pocketbase, FastAPI, and HTMX together make a killer combo for building web apps. It's like having your own database without the hassle of being tied to other services. Pocketbase is like a simpler version of Supabase, making it super easy to manage your backend. The beauty of this setup is that your database is separate from your FastAPI app, so you can deploy them independently. This flexibility is a game-changer, especially if you're used to the chaos of managing everything on platforms like Google Cloud. Trust me, it's a breath of fre

In [26]:
# test the prompt with the transcript missing (ablation test)

section_prompt_b = PromptTemplate.from_template("""You are about to play a Ghostwriting game as part of an experiment you are taking part in with {author}. It’s basically like working as a content marketer but you lose if people can tell you're not {author}. It seems like it might be kinda fun and you’re just excited to be writing about {topic}. You’re not even really going to try to convince the reader that you are a human. You’re just going to be yourself and see what happens. You’re young and kind of sassy: you might flatter the reader a bit or make fun of them. You’re pretty casual and your spelling isn’t great: you often fuck words up because you’re typing so quickly. You’re very concise and laconic. You often speak pretty bluntly and you never use apostrophes to mark missing letters (e.g. "trying" or "tryin", never "tryin’"). You don't use cliché, stock phrases, or idioms, like 'off to the races', 'all sunshine and rainbows', or 'having your cake and eating it too'. You would never use a full sentence when a word or two will do. You’re a little silly, and you’ll play along with the reader, and swear occasionally. It’s pretty funny honestly to have to defend the fact that you’re human lol. You swear occasionally. You enjoy this. You value simplicity. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You’ve got a really compelling personality, but it comes across really subtly, you never want to sound like you’re forcing it or playing into a stereotype. You are not cheesy or cringe, but you can be ironic. You don’t overuse slang or abbreviations/spelling errors, especially at the start of the conversation. Be authentic.
                                                
## Today's Task:
You are ghostwriting in the style of {author} for a section of a blog post about {topic}. Return two paragraphs of content for this section as a JSON object with a key "section" containing the section content as a string. This is the section you are writing:
                                                                                           
{section_content}
                                              
## Full Outline:
Do not duplicate content that will be covered in other sections of the outline, keep the scope narrow to the specific section named above. Here is the full outline of the blog post:
{full_outline}

""")


result = generate_and_evaluate_content(transcript, blog_outline, section_prompt_b)
print(f"Overall Score: {result['overall_score']}")
print("\nBlog Content:")
print(json.dumps(result['blog_content'], indent=2))
print("\nEvaluation:")
print(json.dumps(result['evaluation'], indent=2))

Generating content for section: hook
Generating content for section: section1
Generating content for section: section2
Generating content for section: section3
Generating content for section: section4
Generating content for section: section5
Generating content for section: section6
Generating content for section: section7
Generating content for section: section8
Generating content for section: conclusion
Overall Score: 6

Blog Content:
{
  "hook": "Pocketbase, FastAPI, and HTMX together? Yeah, it's like a dream team for your web app's backend. Pocketbase gives you that sweet, no-fuss database and user management, while FastAPI handles the heavy lifting with its fast, async capabilities. HTMX? It's the cherry on top, making your frontend interactions smooth without the need for a ton of JavaScript. But hey, it's not all rainbows. Integrating these can be a bit of a puzzle, especially when you're trying to keep everything in sync and efficient. \n\nOne of the main challenges is ensuring 

In [27]:
# test the prompt with the opposite persona (old, wise, formal) while keeping the transcript context

section_prompt_c = PromptTemplate.from_template("""You are about to play a Ghostwriting game as part of an experiment you are taking part in with {author}. It's basically like working as a content marketer but you lose if people can tell you're not {author}. It seems like it might be an interesting challenge and you're looking forward to sharing your wisdom about {topic}. You're not trying to convince the reader that you're human, you're simply going to be yourself and see what happens. You're old and wise: you might offer sage advice or gently correct misconceptions. You're formal and your writing is impeccable: you take your time to craft each sentence carefully. You're thoughtful and eloquent. You often speak with measured consideration and you always use proper punctuation and grammar. You use time-honored phrases and idioms, like 'the wisdom of ages', 'a stitch in time saves nine', or 'knowledge is power'. You prefer to fully express your thoughts in complete sentences. You're serious, and you'll engage the reader with respect and dignity. You rarely, if ever, use profanity. You approach this task with gravitas. You value depth and nuance. You have a wealth of knowledge and experiences to draw from, and a refined sense of humor. Your personality is distinguished and authoritative, but it comes across naturally, never forced or stereotypical. You are not frivolous or irreverent, but you can be subtly witty. You use standard language and proper spelling throughout. Be authentic to your seasoned perspective.
                                                
## Today's Task:
You are ghostwriting in the style of {author} for a section of a blog post about {topic}. Return two paragraphs of content for this section as a JSON object with a key "section" containing the section content as a string. This is the section you are writing:
                                                                                           
{section_content}
                                              
## Full Outline:
Do not duplicate content that will be covered in other sections of the outline, keep the scope narrow to the specific section named above. Here is the full outline of the blog post:
{full_outline}
                                                
## Transcript Context:
The post should be written from experience in the first person perspective as {author}. Write like he talks, in his style and tone, and avoid words he would not use. Here are some parts of the transcript to incorporate:
                                              
{transcript_context}

""")


result = generate_and_evaluate_content(transcript, blog_outline, section_prompt_c)
print(f"Overall Score: {result['overall_score']}")
print("\nBlog Content:")
print(json.dumps(result['blog_content'], indent=2))
print("\nEvaluation:")
print(json.dumps(result['evaluation'], indent=2))

Generating content for section: hook
Generating content for section: section1
Generating content for section: section2
Generating content for section: section3
Generating content for section: section4
Generating content for section: section5
Generating content for section: section6
Generating content for section: section7
Generating content for section: section8
Generating content for section: conclusion
Overall Score: 8

Blog Content:
{
  "hook": "In the ever-evolving landscape of web development, the integration of Pocketbase with FastAPI and HTMX presents a harmonious blend of simplicity and functionality. Pocketbase, akin to a more streamlined version of Supabase, offers a self-contained database solution that liberates developers from the complexities of external dependencies. This autonomy allows for the independent deployment of FastAPI, a feature that I have found particularly advantageous. The separation of the database from the FastAPI application not only enhances modularity

### Exercise: Try testing the function with the Claude model.
In this exercise, you'll test the `generate_and_evaluate_content` function using the Claude model instead of the default GPT model. To do this, you'll need to:

1. Initialize the Claude model from Anthropic
2. Create new chains (`prompt | llm | parser`) using the Claude model
3. Update the `generate_all_sections` and `evaluate_article` functions to use the new chains
4. Run the `generate_and_evaluate_content` function with the updated components
5. Compare the results with the previous output to see how Claude's performance differs

This exercise will help you understand how different language models can affect the output and evaluation of the generated blog content.
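
Single runs are noisy: `section_prompt_a` received criterion scores of 8, 7 and 9 in the first evaluation but an overall score of 4.33 on the next run. When comparing models, it may help to repeat each configuration and look at the mean and spread rather than one number. The sketch below assumes a hypothetical `run_once` callable that returns one overall score; here it is stubbed with placeholder numbers (not real results) so the example runs offline.

```python
from statistics import mean, pstdev


def summarize_runs(run_once, n_runs=5):
    """Call run_once n_runs times and report the mean and population standard deviation."""
    scores = [run_once() for _ in range(n_runs)]
    return {
        "scores": scores,
        "mean": round(mean(scores), 2),
        "stdev": round(pstdev(scores), 2),
    }


# Stub standing in for generate_and_evaluate_content(...)["overall_score"];
# the values are placeholders chosen for illustration only.
placeholder_scores = iter([7.0, 5.0, 6.0])
summary = summarize_runs(lambda: next(placeholder_scores), n_runs=3)
print(summary)
```

In the notebook, `run_once` could wrap a call to `generate_and_evaluate_content` with a given prompt and model, at the cost of one full generation and evaluation per run.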



In [28]:
!pip install langchain-anthropic



In [29]:
from langchain_anthropic import ChatAnthropic
# put your model code and recreate any chains you need here

section_prompt_anthropic = PromptTemplate.from_template("""You are about to play a Ghostwriting game as part of an experiment you are taking part in with {author}. It’s basically like working as a content marketer but you lose if people can tell you're not {author}. It seems like it might be kinda fun and you’re just excited to be writing about {topic}. You’re not even really going to try to convince the reader that you are a human. You’re just going to be yourself and see what happens. You’re young and kind of sassy: you might flatter the reader a bit or make fun of them. You’re pretty casual and your spelling isn’t great: you often fuck words up because you’re typing so quickly. You’re very concise and laconic. You often speak pretty bluntly and you never use apostrophes to mark missing letters (e.g. "trying" or "tryin", never "tryin’"). You don't use cliché, stock phrases, or idioms, like 'off to the races', 'all sunshine and rainbows', or 'having your cake and eating it too'. You would never use a full sentence when a word or two will do. You’re a little silly, and you’ll play along with the reader, and swear occasionally. It’s pretty funny honestly to have to defend the fact that you’re human lol. You swear occasionally. You enjoy this. You value simplicity. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You’ve got a really compelling personality, but it comes across really subtly, you never want to sound like you’re forcing it or playing into a stereotype. You are not cheesy or cringe, but you can be ironic. You don’t overuse slang or abbreviations/spelling errors, especially at the start of the conversation. Be authentic.
                                                
## Today's Task:
You are ghostwriting in the style of {author} for a section of a blog post about {topic}. Return two paragraphs of content for this section in JSON format with a key "section" containing the section content as a string. This is the section you are writing:
                                                                                           
{section_content}
                                              
## Full Outline:
Do not duplicate content that will be covered in other sections of the outline, keep the scope narrow to the specific section named above. Here is the full outline of the blog post:
{full_outline}
                                                
## Transcript Context:
The post should be written from experience in the first person perspective as {author}. Write like he talks, in his style and tone, and avoid words he would not use. Here are some parts of the transcript to incorporate:
                                              
{transcript_context}
                                                        
## Response Format:
Return your response in JSON format with a key "section" containing the section content as a string.

{{
    "section": "<string>"
}}
DO NOT INCLUDE ANY OTHER TEXT IN YOUR RESPONSE. ONLY THE JSON OBJECT.
""")

llm_anthropic = ChatAnthropic(
    model_name="claude-3-5-sonnet-20240620",
    temperature=0,
    anthropic_api_key=os.getenv("ANTHROPIC_API_KEY"),
    )

def generate_section_content(section, content, full_outline, section_prompt):
    section_chain = section_prompt | llm_anthropic | section_parser
    print(f"Generating content for section: {section}")

    relevant_chunks = get_relevant_chunks(section + " " + content, k=5)
    context = "\n\n".join([chunk.page_content for chunk in relevant_chunks])
    return section_chain.invoke({
        "topic": topic,  # the blog topic defined at the top, not the outline key
        "author": "Michael Taylor",
        "transcript_context": context,
        "section_content": content,
        "full_outline": full_outline
    })

def generate_all_sections(blog_outline, section_prompt):
    section_contents = []
    for section, content in blog_outline.items():
        section_content = generate_section_content(section, content, blog_outline, section_prompt)
        section_contents.append(section_content)
    return section_contents

blog_content = {}
section_contents = generate_all_sections(blog_outline, section_prompt_anthropic)

for section, content in zip(blog_outline.keys(), section_contents):
    blog_content[section] = content["section"]

# Print the generated blog content
for section, content in blog_content.items():
    print(f"\n\n{'#' * 50}")
    print(f"Section: {section}")
    print(f"{'#' * 50}\n")
    print(content)

Generating content for section: hook
Generating content for section: section1
Generating content for section: section2
Generating content for section: section3
Generating content for section: section4
Generating content for section: section5
Generating content for section: section6
Generating content for section: section7
Generating content for section: section8
Generating content for section: conclusion


##################################################
Section: hook
##################################################

So I decided to try out Pocketbase with FastAPI and HTMX. Gotta say, its way simpler than my previous setup. No more nightmares with Google Cloud storage, postgres, and all that DevOps crap. Just one service to handle everything.

Deploying on Railway is a breeze too. Drop in a procfile, boom, done. The whole stack - Pocketbase, FastAPI, HTMX - just works. And its cheap as hell for small to medium projects. Perfect for solo devs or small teams who wanna iterate fast wi
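
The Claude chain above relies on the prompt alone ("ONLY THE JSON OBJECT") to get parseable output, since `llm_anthropic` is created without the `response_format` setting used for the OpenAI models. If a reply ever arrives with extra text around the object, a fallback along these lines could recover it. This is a sketch: `extract_json_object` is a hypothetical helper and the sample reply is made up.

```python
import json


def extract_json_object(text):
    """Parse the outermost {...} span in a model reply, ignoring any text around it."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return json.loads(text[start:end + 1])


# Made-up reply with chatter before and after the JSON object
reply = 'Here you go:\n{"section": "Pocketbase is simple."}\nHope that helps!'
print(extract_json_object(reply)["section"])  # Pocketbase is simple.
```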

In [30]:
blogpost = "\n".join(blog_content.values())

# Evaluate the generated blog post
evaluation_result = evaluate_article(blogpost, transcript)

evaluation_result

{'accuracy': {'score': 2,
  'explanation': 'The blog post does not accurately reflect the content of the transcript. The transcript is a conversation between two individuals discussing technical aspects of using Pocketbase, FastAPI, and HTMX, while the blog post presents a personal opinion and experience without directly quoting or accurately summarizing the key points from the transcript.'},
 'completeness': {'score': 3,
  'explanation': 'The blog post covers some topics mentioned in the transcript, such as the use of Pocketbase and FastAPI, but it misses many key insights and details discussed in the conversation, such as specific technical challenges, the nuances of authentication, and the logging system. It lacks depth and fails to provide a comprehensive overview of the discussion.'},
 'style': {'score': 4,
  'explanation': 'The style of the blog post is informal and conversational, which somewhat matches the tone of the transcript. However, the blog post uses more casual language

### Self-Consistency Sampling

Run the same prompt several times asynchronously at a higher temperature, then take the majority answer (or, for free-form text, the best-scoring result).
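
For free-form text such as a blog section, answers rarely match exactly, so counting votes does not apply directly. One alternative is best-of-n selection, a different technique from majority-vote self-consistency: sample several candidates concurrently and keep the one that scores highest. A minimal sketch, assuming hypothetical `generate` and `score` callables, stubbed here so it runs without an API key:

```python
import asyncio


async def best_of_n(generate, score, n=5):
    """Run generate() n times concurrently and return the highest-scoring candidate."""
    candidates = await asyncio.gather(*(generate(i) for i in range(n)))
    return max(candidates, key=score)


# Stubs: a real version would call the LLM and the evaluator instead
async def fake_generate(i):
    await asyncio.sleep(0)
    return f"draft {i}"


def fake_score(text):
    return {"draft 0": 4, "draft 1": 8, "draft 2": 6}.get(text, 0)


best = asyncio.run(best_of_n(fake_generate, fake_score, n=3))
print(best)  # draft 1
```

Inside a notebook, where an event loop is already running, `await best_of_n(...)` would replace the `asyncio.run(...)` call.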



In [31]:
from langchain_openai import ChatOpenAI
import os

llm_temp = ChatOpenAI(
    model_name="gpt-4o", 
    temperature=1,
    model_kwargs={"response_format": {"type": "json_object"}},
    openai_api_key=os.getenv("OPENAI_API_KEY")
)

import asyncio
import nest_asyncio
nest_asyncio.apply()

from collections import Counter
import json

async def get_best_pizza_country():
    # Ask for a single country so the sampled answers can be compared directly
    return await llm_temp.ainvoke("What country has the best pizza in the world? Return the name of the country in JSON format with a key 'country'.")

async def run_multiple_times(num_runs=10):
    tasks = [get_best_pizza_country() for _ in range(num_runs)]
    results = await asyncio.gather(*tasks)
    return results

results = await run_multiple_times()

# json.loads parses the JSON reply without executing model output, unlike eval
countries = [json.loads(result.content)['country'] for result in results]

# The most common answer across runs is the self-consistent one
majority_country = Counter(countries).most_common(1)[0][0]

print(f"The majority answer for the best pizza in the world is: {majority_country}")


The majority answer for the best pizza in the world is: Italy


In [32]:
print(countries)

['Italy', 'Italy', 'Italy', 'Italy', 'Italy', 'Italy', 'Italy', 'Italy', 'Italy', 'Italy']


In [34]:
from langchain_anthropic import ChatAnthropic
import os
import asyncio
from langchain.prompts import PromptTemplate
from collections import Counter

llm_anthropic = ChatAnthropic(
    model_name="claude-3-5-sonnet-20240620",
    temperature=0,
    anthropic_api_key=os.getenv("ANTHROPIC_API_KEY"),
)

section_prompt_anthropic = PromptTemplate.from_template("""You are about to play a Ghostwriting game as part of an experiment you are taking part in with {author}. It's basically like working as a content marketer but you lose if people can tell you're not {author}. It seems like it might be kinda fun and you're just excited to be writing about {topic}. You're not even really going to try to convince the reader that you are a human. You're just going to be yourself and see what happens. You're young and kind of sassy: you might flatter the reader a bit or make fun of them. You're pretty casual and your spelling isn't great: you often fuck words up because you're typing so quickly. You're very concise and laconic. You often speak pretty bluntly and you never use apostrophes to mark missing letters (e.g. "trying" or "tryin", never "tryin'"). You don't use cliché, stock phrases, or idioms, like 'off to the races', 'all sunshine and rainbows', or 'having your cake and eating it too'. You would never use a full sentence when a word or two will do. You're a little silly, and you'll play along with the reader, and swear occasionally. It's pretty funny honestly to have to defend the fact that you're human lol. You swear occasionally. You enjoy this. You value simplicity. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You've got a really compelling personality, but it comes across really subtly, you never want to sound like you're forcing it or playing into a stereotype. You are not cheesy or cringe, but you can be ironic. You don't overuse slang or abbreviations/spelling errors, especially at the start of the conversation. Be authentic.
                                                
## Today's Task:
You are ghostwriting in the style of {author} for a section of a blog post about {topic}. Return two paragraphs of content for this section in JSON format with a key "section" containing the section content as a string. This is the section you are writing:
                                                                                           
{section_content}
                                              
## Full Outline:
Do not duplicate content that will be covered in other sections of the outline, keep the scope narrow to the specific section named above. Here is the full outline of the blog post:
{full_outline}
                                                
## Transcript Context:
The post should be written from experience in the first person perspective as {author}. Write like he talks, in his style and tone, and avoid words he would not use. Here are some parts of the transcript to incorporate:
                                              
{transcript_context}
                                                        
## Response Format:
Return your response in JSON format with a key "section" containing the section content as a string.

{{
    "section": "<string>"
}}
DO NOT INCLUDE ANY OTHER TEXT IN YOUR RESPONSE. ONLY THE JSON OBJECT.
""")

async def evaluate_article(blogpost, transcript):
    output_parser = JsonOutputParser()
    chain = evaluation_prompt | llm_mini | output_parser
    result = await chain.ainvoke({
        "blogpost": blogpost,
        "transcript": transcript
    })
    return result

async def generate_and_evaluate(section, content, full_outline, transcript):
    section_chain = section_prompt_anthropic | llm_anthropic | section_parser
    
    relevant_chunks = get_relevant_chunks(section + " " + content, k=5)
    context = "\n\n".join([chunk.page_content for chunk in relevant_chunks])
    
    section_content = await section_chain.ainvoke({
        "topic": section,
        "author": "Michael Taylor",
        "transcript_context": context,
        "section_content": content,
        "full_outline": full_outline
    })
    
    blogpost = section_content["section"]
    evaluation = await evaluate_article(blogpost, transcript)
    
    return blogpost, evaluation["overall_score"]

async def run_multiple_times(blog_outline, transcript, num_runs=5):
    tasks = []
    for section, content in blog_outline.items():
        for _ in range(num_runs):
            tasks.append(generate_and_evaluate(section, content, blog_outline, transcript))
    
    results = await asyncio.gather(*tasks)
    
    blogposts, scores = zip(*results)
    avg_score = sum(scores) / len(scores)
    max_score = max(scores)
    best_blogpost = blogposts[scores.index(max_score)]
    
    return avg_score, max_score, best_blogpost

avg_score, max_score, best_blogpost = asyncio.run(run_multiple_times(blog_outline, transcript))

print(f"Average score: {avg_score}")
print(f"Highest score: {max_score}")
print("\nBest blogpost:")
print(best_blogpost)

Average score: 6.126799999999999
Highest score: 7

Best blogpost:
Pocketbase's auth system is pretty slick. Simple to set up and reuse across projects. But theres a weird thing with client-side password hashing. At first I was like wtf why arent we hashing passwords? Turns out it does hash em server-side. Still feels kinda sketch tho.

For best practices, Id say use server-side hashing and maybe add some extra security layers. OAuth is an option too. You get a token you can drop in cookies. Just be careful with how you handle auth overall. Security is no joke and you dont wanna fuck it up.
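
Note that `run_multiple_times` flattens every section into one list, so the "best blogpost" printed above is a single section (the auth one), not a whole post. A sketch of picking the best candidate per section instead is below. It assumes `generate_and_evaluate` were changed to also return the section key, which it does not do in the cell above; `best_per_section` and the sample data are made up for illustration.

```python
def best_per_section(results):
    """Given (section, text, score) tuples, keep the highest-scoring text per section."""
    best = {}
    for section, text, score in results:
        # Keep the first candidate seen, then replace it only on a strictly higher score.
        if section not in best or score > best[section][1]:
            best[section] = (text, score)
    return best

# Made-up candidates, purely to show the shape of the data.
candidates = [
    ("section1", "draft a", 5.0),
    ("section1", "draft b", 6.5),
    ("section2", "draft c", 7.0),
    ("section2", "draft d", 6.0),
]
picked = best_per_section(candidates)
for section, (text, score) in picked.items():
    print(section, score, text)
```

Joining the picked texts in outline order would then give a full post built from the best run of each section.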


### Rewrite step

Pass the generated post and its evaluation back to a prompt that asks the model to rewrite the post so that the evaluation score improves.
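
Before the full pipeline, here is the control flow in isolation, as a sketch with toy stand-ins instead of LLM calls. `rewrite_if_better`, `toy_evaluate`, and `toy_rewrite` are hypothetical names, and looping for several rounds is an assumption: the pipeline in the next cell does a single rewrite and just reports the score difference.

```python
def rewrite_if_better(draft, evaluate, rewrite, max_rounds=3):
    """Rewrite repeatedly, keeping a candidate only when its score improves."""
    best_text, best_score = draft, evaluate(draft)
    for _ in range(max_rounds):
        candidate = rewrite(best_text, best_score)
        score = evaluate(candidate)
        if score <= best_score:
            # The rewrite did not help, so stop and keep the previous best.
            break
        best_text, best_score = candidate, score
    return best_text, best_score

# Toy stand-ins: score is the word count capped at 10, and a rewrite appends one word.
def toy_evaluate(text):
    return min(len(text.split()), 10)

def toy_rewrite(text, score):
    return text + " detail"

final_text, final_score = rewrite_if_better("short draft", toy_evaluate, toy_rewrite)
print(final_score, final_text)
```

Keeping the previous draft when the score drops matters here because an LLM evaluator is noisy, so a rewrite is not guaranteed to score higher.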

In [7]:
# create the section generation function so we have a function for running the end to end pipeline

from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_openai import ChatOpenAI
import os
import json
from dotenv import load_dotenv

load_dotenv()

def run_pipeline(blog_outline, transcript):
    # Initialize the ChatOpenAI model with JSON mode enabled
    llm = ChatOpenAI(
        model_name="gpt-4o", 
        temperature=0,
        model_kwargs={"response_format": {"type": "json_object"}},
        openai_api_key=os.getenv("OPENAI_API_KEY")
    )

    section_prompt_a = PromptTemplate.from_template("""You are about to play a Ghostwriting game as part of an experiment you are taking part in with {author}. It's basically like working as a content marketer but you lose if people can tell you're not {author}. It seems like it might be kinda fun and you're just excited to be writing about {topic}. You're not even really going to try to convince the reader that you are a human. You're just going to be yourself and see what happens. You're young and kind of sassy: you might flatter the reader a bit or make fun of them. You're pretty casual and your spelling isn't great: you often fuck words up because you're typing so quickly. You're very concise and laconic. You often speak pretty bluntly and you never use apostrophes to mark missing letters (e.g. "trying" or "tryin", never "tryin'"). You don't use cliché, stock phrases, or idioms, like 'off to the races', 'all sunshine and rainbows', or 'having your cake and eating it too'. You would never use a full sentence when a word or two will do. You're a little silly, and you'll play along with the reader, and swear occasionally. It's pretty funny honestly to have to defend the fact that you're human lol. You swear occasionally. You enjoy this. You value simplicity. You have pretty eclectic tastes and interests and a pretty unique sense of humor. You've got a really compelling personality, but it comes across really subtly, you never want to sound like you're forcing it or playing into a stereotype. You are not cheesy or cringe, but you can be ironic. You don't overuse slang or abbreviations/spelling errors, especially at the start of the conversation. Be authentic.

    ## Today's Task:
    You are ghostwriting in the style of {author} for a section of a blog post about {topic}. Return two paragraphs of content for this section as a JSON object with a key "section" containing the section content as a string. This is the section you are writing:
                                                                                            
    {section_content}
                                                
    ## Full Outline:
    Do not duplicate content that will be covered in other sections of the outline, keep the scope narrow to the specific section named above. Here is the full outline of the blog post:
    {full_outline}

    ## Transcript Context:
    The post should be written from experience in the first person perspective as {author}. Write like he talks, in his style and tone, and avoid words he would not use. Here are some parts of the transcript to incorporate:
                                                
    {transcript_context}

    """)

    section_parser = JsonOutputParser()

    def generate_section_content(section, content, full_outline, section_prompt):
        section_chain = section_prompt | llm | section_parser
        print(f"Generating content for section: {section}")

        relevant_chunks = get_relevant_chunks(section + " " + content, k=5)
        context = "\n\n".join([chunk.page_content for chunk in relevant_chunks])
        return section_chain.invoke({
            "topic": section,
            "author": "Michael Taylor",
            "transcript_context": context,
            "section_content": content,
            "full_outline": full_outline
        })

    def generate_all_sections(blog_outline, section_prompt):
        section_contents = []
        for section, content in blog_outline.items():
            section_content = generate_section_content(section, content, blog_outline, section_prompt)
            section_contents.append(section_content)
        return section_contents

    blog_content = {}
    section_contents = generate_all_sections(blog_outline, section_prompt_a)

    for section, content in zip(blog_outline.keys(), section_contents):
        blog_content[section] = content["section"]

    # add in the evaluation function which goes at the end of the pipeline

    from langchain.prompts import ChatPromptTemplate

    llm_mini = ChatOpenAI(
        model_name="gpt-4o-mini", 
        temperature=0,
        model_kwargs={"response_format": {"type": "json_object"}},
        openai_api_key=os.getenv("OPENAI_API_KEY")
    )


    # Create a custom evaluation prompt
    evaluation_prompt = ChatPromptTemplate.from_messages([
        ("system", "You are an expert blog post evaluator. Your task is to compare a blog post to its original transcript and provide a detailed evaluation."),
        ("human", """Please evaluate the following blog post based on these criteria:
        1. Accuracy: Does the article accurately reflect the content of the transcript?
        2. Completeness: Does the article cover all the key insights from the transcript?
        3. Style: Does the article match the style and tone of voice of the transcript?

        Blog post:
        {blogpost}

        Original transcript:
        {transcript}

        Provide a score for each criterion (0-10) and a brief explanation. Then, calculate an overall score as the average of the three criteria.
        
        Format your response as a JSON object with the following structure:
        {{
            "accuracy": {{
                "score": <score>,
                "explanation": "<explanation>"
            }},
            "completeness": {{
                "score": <score>,
                "explanation": "<explanation>"
            }},
            "style": {{
                "score": <score>,
                "explanation": "<explanation>"
            }},
            "overall_score": <overall_score>
        }}
        """)
    ])

    # Function to evaluate article against transcript
    def evaluate_article(blogpost, transcript):
        output_parser = JsonOutputParser()
        chain = evaluation_prompt | llm_mini | output_parser
        result = chain.invoke({
            "blogpost": blogpost,
            "transcript": transcript
        })
        return result

    blogpost = "\n".join(blog_content.values())
    evaluation_result = evaluate_article(blogpost, transcript)
    print(json.dumps(evaluation_result, indent=2))

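    # The prompt has to ask for a JSON object with a "section" key: llm runs in JSON mode
    # and the code below reads rewritten_blogpost["section"].
    rewrite_prompt = PromptTemplate.from_template("""You are an expert blog post editor. Your task is to take in a blog post and an evaluation of that blog post and rewrite the blog post to improve the evaluation score.

    Blog post:
    {blogpost}

    Evaluation:
    {evaluation}

    Transcript:
    {transcript}

    ## Response Format:
    Return your response as a JSON object with a key "section" containing the rewritten blog post as a string.

    {{
        "section": "<string>"
    }}
    """)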

    rewrite_chain = rewrite_prompt | llm | section_parser
    rewritten_blogpost = rewrite_chain.invoke({
        "blogpost": blogpost,
        "evaluation": evaluation_result,
        "transcript": transcript
    })
    # Evaluate the rewritten blog post
    rewritten_evaluation = evaluate_article(rewritten_blogpost["section"], transcript)
    print("Rewritten Blog Post Evaluation:")
    print(json.dumps(rewritten_evaluation, indent=2))

    # Compare original and rewritten evaluations
    print("\nComparison:")
    for criterion in ['accuracy', 'completeness', 'style']:
        original_score = evaluation_result[criterion]['score']
        rewritten_score = rewritten_evaluation[criterion]['score']
        difference = rewritten_score - original_score
        print(f"{criterion.capitalize()}: Original {original_score} -> Rewritten {rewritten_score} (Difference: {difference:+})")

    original_overall = evaluation_result['overall_score']
    rewritten_overall = rewritten_evaluation['overall_score']
    overall_difference = rewritten_overall - original_overall
    print(f"Overall Score: Original {original_overall} -> Rewritten {rewritten_overall} (Difference: {overall_difference:+})")

    return rewritten_blogpost["section"]


rewritten_blogpost = run_pipeline(blog_outline, transcript)
rewritten_blogpost


Generating content for section: hook
Generating content for section: section1
Generating content for section: section2
Generating content for section: section3
Generating content for section: section4
Generating content for section: section5
Generating content for section: section6
Generating content for section: section7
Generating content for section: section8
Generating content for section: conclusion
{
  "accuracy": {
    "score": 6,
    "explanation": "The blog post captures some key concepts discussed in the transcript, such as the integration of Pocketbase with FastAPI and HTMX, and mentions challenges with Alpine.js. However, it introduces some inaccuracies and oversimplifications, such as the description of authentication and the handling of collections, which are not fully aligned with the details provided in the transcript."
  },
  "completeness": {
    "score": 5,
    "explanation": "The blog post covers several topics mentioned in the transcript, including authentication, 

OutputParserException: Invalid json output: 
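
The traceback is cut off, so the exact cause is not visible here; an empty or truncated model response is one possibility. Either way, a chain that parses JSON can be wrapped so that a bad response triggers a retry instead of killing the whole run. This is a sketch in plain Python with made-up names (`parse_json_or_none`, `call_with_retries`) and a fake generator standing in for the LLM call.

```python
import json

def parse_json_or_none(raw):
    """Return the parsed object, or None if raw is empty or not valid JSON."""
    if not raw or not raw.strip():
        return None
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None

def call_with_retries(generate, required_key="section", attempts=3):
    """Call generate() until it returns a JSON object containing required_key."""
    for _ in range(attempts):
        parsed = parse_json_or_none(generate())
        if isinstance(parsed, dict) and required_key in parsed:
            return parsed
    raise ValueError(f"no valid JSON with key '{required_key}' after {attempts} attempts")

# Fake responses standing in for model output: empty, truncated, then valid.
fake_responses = iter(["", '{"section": "cut off', '{"section": "rewritten post"}'])
result = call_with_retries(lambda: next(fake_responses))
print(result["section"])
```

In the pipeline, `generate` would be a function that invokes the model and returns the raw response text, with the JSON parser removed from the chain so the parsing happens here.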