In [12]:
profiles = [
    r'''Matias Cattaneo
Position
Professor
Website
Matias Cattaneo's Site
Office Phone
(609) 258-8825
Email
cattaneo@princeton.edu
Office
230 - Sherrerd Hall
Bio/Description
Research Interests: Econometrics, statistics, machine learning, data science, causal inference, program evaluation, quantitative methods in the social, behavioral and biomedical sciences.''',
    r'''Jianqing Fan
Position
Frederick L. Moore Professor in Finance
Website
Jianqing Fan's Site
Office Phone
(609) 258-7924
Email
jqfan@princeton.edu
Office
205 - Sherrerd Hall
Bio/Description
Research Interests: High-dimensional statistics, Machine Learning, financial econometrics, computational biology, biostatistics, graphical and network modeling, portfolio theory, high-frequency finance, time series.''',
    r'''Jason Klusowski
Position
Assistant Professor
Website
Jason Klusowski's Site
Office Phone
(609) 258-5305
Email
jason.klusowski@princeton.edu
Office
327 - Sherrerd Hall
Bio/Description
Research Interests: Data science, statistical learning, deep learning, decision tree learning; high-dimensional statistics, information theory, statistical physics, network modeling'''
]

In [13]:
from openai import OpenAI
from dotenv import load_dotenv
import anthropic
import os

load_dotenv()

novita_client = OpenAI(
    api_key=os.getenv("novita_api_key"),
    base_url="https://api.novita.ai/openai"
)
openai_client = OpenAI(
    api_key=os.getenv("openai_api_key")
)

client = anthropic.Anthropic(api_key=os.getenv("claude_api_key"))


In [14]:
from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()

client = OpenAI(
    api_key=os.getenv("novita_api_key"),
    base_url="https://api.novita.ai/openai"
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, how are you?"}
    ],
    max_tokens=120000,
    temperature=0.7
)

print(response.choices[0].message.content)
print(response.usage)

Hello. I'm doing well, thanks for asking. I'm a large language model, so I don't have feelings or emotions like humans do, but I'm always happy to help with any questions or tasks you may have. How about you? How's your day going so far? Is there anything I can help you with or would you like to chat?
CompletionUsage(completion_tokens=74, prompt_tokens=47, total_tokens=121, completion_tokens_details=None, prompt_tokens_details=None)


In [15]:
# Step 1: Define the task prompt for analyzing professor profiles
task_prompt = """
You are an expert academic researcher. Please analyze the following professor profiles and provide:

1. A summary of each professor's research focus
2. Potential collaboration opportunities between them
3. Emerging research trends in their fields
4. Recommendations for interdisciplinary research projects

Here are the professor profiles:

{profiles}

Please provide a comprehensive analysis that would be valuable for academic planning and research strategy.
"""

# Format the prompt with the actual profiles
formatted_prompt = task_prompt.format(profiles="\n\n".join(profiles))
print("Task prompt created and formatted with profiles")


Task prompt created and formatted with profiles


In [16]:
# Step 2: Test the prompt with OpenAI
print("Testing with OpenAI GPT-4...")

openai_response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": formatted_prompt}
    ],
    max_tokens=2000,
    temperature=0.7
)

openai_output = openai_response.choices[0].message.content
openai_usage = openai_response.usage

print("OpenAI Response:")
print("=" * 50)
print(openai_output)
print("=" * 50)
print(f"Token usage: {openai_usage}")
print()


Testing with OpenAI GPT-4...
OpenAI Response:
1. Summary of each professor's research focus:

Matias Cattaneo's research primarily revolves around econometrics, statistics, machine learning, data science, causal inference, program evaluation, and quantitative methods in the social, behavioral and biomedical sciences. This suggests a comprehensive approach to data analysis and modeling, with a particular focus on developing methods that can help understand the impact of various programs or interventions.

Jianqing Fan's research interests lie in high-dimensional statistics, machine learning, financial econometrics, computational biology, biostatistics, graphical and network modeling, portfolio theory, high-frequency finance, and time series. This indicates a strong focus on applying statistical tools and techniques to finance and biology, developing models that can help understand complex systems and make predictions.

Jason Klusowski's research interests include data science, statistic

In [17]:
# Step 3: Test the same prompt with Llama via Novita
print("Testing with Llama 3.3 70B via Novita...")

llama_response = novita_client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[
        {"role": "user", "content": formatted_prompt}
    ],
    max_tokens=2000,
    temperature=0.7
)

llama_output = llama_response.choices[0].message.content
llama_usage = llama_response.usage

print("Llama Response:")
print("=" * 50)
print(llama_output)
print("=" * 50)
print(f"Token usage: {llama_usage}")
print()


Testing with Llama 3.3 70B via Novita...
Llama Response:
**Summary of Each Professor's Research Focus**

1. **Matias Cattaneo**: Professor Cattaneo's research focus is on econometrics, statistics, machine learning, and data science, with applications in social, behavioral, and biomedical sciences. His work on causal inference, program evaluation, and quantitative methods suggests a strong emphasis on developing and applying statistical techniques to understand complex phenomena.
2. **Jianqing Fan**: Professor Fan's research interests span high-dimensional statistics, machine learning, financial econometrics, and computational biology. His work in portfolio theory, high-frequency finance, and time series analysis indicates a strong background in finance and economics, with a focus on developing statistical and computational methods for analyzing complex financial data.
3. **Jason Klusowski**: Professor Klusowski's research focus is on data science, statistical learning, and deep learnin

In [18]:
# Step 4: Feed both outputs to GPT for comparison and prompt improvement
print("Analyzing outputs and generating improved prompt...")

comparison_prompt = f"""
You are an expert in prompt engineering and AI model optimization. I have two responses to the same prompt from different AI models:

ORIGINAL PROMPT:
{formatted_prompt}

OPENAI GPT-4 RESPONSE:
{openai_output}

LLAMA 3.3 70B RESPONSE:
{llama_output}

Please analyze these responses and:

1. Identify the key differences in quality, depth, and structure between the two responses
2. Determine what specific aspects of the Llama response could be improved
3. Generate a new, optimized prompt that would help Llama produce output closer in quality to the GPT-4 response
4. Explain your reasoning for the prompt improvements

Focus on making the prompt more specific, providing better structure guidance, and addressing any weaknesses you observe in the Llama response.
"""

comparison_response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": comparison_prompt}
    ],
    max_tokens=3000,
    temperature=0.3
)

comparison_output = comparison_response.choices[0].message.content
comparison_usage = comparison_response.usage

print("Analysis and Improved Prompt:")
print("=" * 50)
print(comparison_output)
print("=" * 50)
print(f"Token usage: {comparison_usage}")
print()


Analyzing outputs and generating improved prompt...
Analysis and Improved Prompt:
1. Key Differences in Quality, Depth, and Structure:

The GPT-4 response provides a more detailed analysis of each professor's research focus, potential collaboration opportunities, emerging research trends, and recommendations for interdisciplinary research projects. It also provides a more comprehensive understanding of the professors' research interests and how they could potentially collaborate. The structure of the GPT-4 response is more coherent and flows better, with each section logically leading to the next.

The Llama response, while also providing a detailed analysis, is somewhat more generic and lacks the depth of understanding displayed in the GPT-4 response. The structure is also less coherent, with the sections seeming more disjointed and less logically connected.

2. Specific Aspects of the Llama Response That Could Be Improved:

The Llama response could be improved by providing a more in-

In [19]:
# Step 5: Extract and test the improved prompt
print("Extracting improved prompt from analysis...")

# Extract the improved prompt from the comparison output
# This assumes the improved prompt is clearly marked in the response
import re

# Try to extract the improved prompt (this might need adjustment based on actual output format)
improved_prompt_match = re.search(r'IMPROVED PROMPT:|NEW PROMPT:|OPTIMIZED PROMPT:(.*?)(?=\n\n|\Z)', comparison_output, re.DOTALL | re.IGNORECASE)

if improved_prompt_match:
    improved_prompt = improved_prompt_match.group(1).strip()
else:
    # If no clear marker, try to extract the last substantial block of text
    lines = comparison_output.split('\n')
    improved_prompt = '\n'.join(lines[-20:]).strip()  # Take last 20 lines as fallback

print("Extracted Improved Prompt:")
print("=" * 50)
print(improved_prompt)
print("=" * 50)

# Test the improved prompt with Llama
print("\nTesting improved prompt with Llama...")

improved_llama_response = novita_client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[
        {"role": "user", "content": improved_prompt}
    ],
    max_tokens=2000,
    temperature=0.7
)

improved_llama_output = improved_llama_response.choices[0].message.content
improved_llama_usage = improved_llama_response.usage

print("Improved Llama Response:")
print("=" * 50)
print(improved_llama_output)
print("=" * 50)
print(f"Token usage: {improved_llama_usage}")
print()


Extracting improved prompt from analysis...
Extracted Improved Prompt:


Testing improved prompt with Llama...
Improved Llama Response:
It seems like you didn't ask a question or provide any text for me to respond to. Could you please provide more context or clarify what you would like to know? I'm here to help with any questions or topics you'd like to discuss.
Token usage: CompletionUsage(completion_tokens=51, prompt_tokens=35, total_tokens=86, completion_tokens_details=None, prompt_tokens_details=None)



In [20]:
# Summary and Results
print("PROMPT OPTIMIZATION TEST SUMMARY")
print("=" * 60)

print("\n1. ORIGINAL PROMPT:")
print("-" * 30)
print(formatted_prompt[:200] + "..." if len(formatted_prompt) > 200 else formatted_prompt)

print(f"\n2. OPENAI GPT-4 OUTPUT LENGTH: {len(openai_output)} characters")
print(f"   Token usage: {openai_usage}")

print(f"\n3. LLAMA 3.3 70B OUTPUT LENGTH: {len(llama_output)} characters") 
print(f"   Token usage: {llama_usage}")

print(f"\n4. ANALYSIS AND IMPROVED PROMPT LENGTH: {len(comparison_output)} characters")
print(f"   Token usage: {comparison_usage}")

print(f"\n5. IMPROVED LLAMA OUTPUT LENGTH: {len(improved_llama_output)} characters")
print(f"   Token usage: {improved_llama_usage}")

print("\n6. COMPARISON:")
print("-" * 30)
print("Original Llama vs Improved Llama:")
print(f"  Original: {len(llama_output)} chars, {llama_usage.total_tokens} tokens")
print(f"  Improved: {len(improved_llama_output)} chars, {improved_llama_usage.total_tokens} tokens")

print("\nTest completed! Check the outputs above to evaluate the effectiveness of the prompt optimization.")


PROMPT OPTIMIZATION TEST SUMMARY

1. ORIGINAL PROMPT:
------------------------------

You are an expert academic researcher. Please analyze the following professor profiles and provide:

1. A summary of each professor's research focus
2. Potential collaboration opportunities between t...

2. OPENAI GPT-4 OUTPUT LENGTH: 3255 characters
   Token usage: CompletionUsage(completion_tokens=541, prompt_tokens=356, total_tokens=897, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))

3. LLAMA 3.3 70B OUTPUT LENGTH: 4460 characters
   Token usage: CompletionUsage(completion_tokens=784, prompt_tokens=382, total_tokens=1166, completion_tokens_details=None, prompt_tokens_details=None)

4. ANALYSIS AND IMPROVED PROMPT LENGTH: 2630 characters
   Token usage: CompletionUsage(completion_tokens=456, prompt_tokens=1829, total_tokens=2