In [1]:
profiles = [
    r'''Matias Cattaneo
Position
Professor
Website
Matias Cattaneo's Site
Office Phone
(609) 258-8825
Email
cattaneo@princeton.edu
Office
230 - Sherrerd Hall
Bio/Description
Research Interests: Econometrics, statistics, machine learning, data science, causal inference, program evaluation, quantitative methods in the social, behavioral and biomedical sciences.''',
    r'''Jianqing Fan
Position
Frederick L. Moore Professor in Finance
Website
Jianqing Fan's Site
Office Phone
(609) 258-7924
Email
jqfan@princeton.edu
Office
205 - Sherrerd Hall
Bio/Description
Research Interests: High-dimensional statistics, Machine Learning, financial econometrics, computational biology, biostatistics, graphical and network modeling, portfolio theory, high-frequency finance, time series.''',
    r'''Jason Klusowski
Position
Assistant Professor
Website
Jason Klusowski's Site
Office Phone
(609) 258-5305
Email
jason.klusowski@princeton.edu
Office
327 - Sherrerd Hall
Bio/Description
Research Interests: Data science, statistical learning, deep learning, decision tree learning; high-dimensional statistics, information theory, statistical physics, network modeling'''
]
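
The profile strings above follow a regular layout: the name on the first line, then alternating label and value lines. A minimal sketch of turning one such string into a dictionary is shown below; `parse_profile` and the shortened `sample` text are illustrative names introduced here, not part of the original notebook, and the sketch assumes every label is followed by exactly one value line.

```python
def parse_profile(text):
    """Split one profile string into a name plus a {label: value} dict."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    name, rest = lines[0], lines[1:]
    # Labels sit at even positions of `rest`, their values at odd positions.
    fields = dict(zip(rest[0::2], rest[1::2]))
    return {"name": name, **fields}


# Shortened sample in the same layout as the entries in `profiles`.
sample = """Jason Klusowski
Position
Assistant Professor
Email
jason.klusowski@princeton.edu"""

parsed = parse_profile(sample)
print(parsed["name"], "|", parsed["Position"], "|", parsed["Email"])
```

Structured fields like these could be used to pass only the research interests to the model, though the notebook itself sends the raw text.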

In [2]:
from openai import OpenAI
from dotenv import load_dotenv
import anthropic
import os

load_dotenv()

novita_client = OpenAI(
    api_key=os.getenv("novita_api_key"),
    base_url="https://api.novita.ai/openai"
)
openai_client = OpenAI(
    api_key=os.getenv("openai_api_key")
)

client = anthropic.Anthropic(api_key=os.getenv("claude_api_key"))
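
`os.getenv` returns `None` for a missing variable, so a typo in the `.env` file only surfaces later as an authentication error. One possible guard, sketched below, lists the missing keys up front. The key names are the ones read in the cell above; `missing_keys` and `example_env` are hypothetical helpers added for illustration.

```python
import os

# Environment variable names read by the client setup above.
REQUIRED_KEYS = ("novita_api_key", "openai_api_key", "claude_api_key")


def missing_keys(names=REQUIRED_KEYS, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
example_env = {"novita_api_key": "sk-example", "openai_api_key": ""}
print(missing_keys(env=example_env))
```

Calling `missing_keys()` with no arguments right after `load_dotenv()` would check the real environment.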


In [3]:
# Step 1: Define the task prompt for analyzing professor profiles
task_prompt = """
You are an expert academic researcher. Please analyze the following professor profiles and provide:

1. A summary of each professor's research focus
2. Potential collaboration opportunities between them
3. Emerging research trends in their fields
4. Recommendations for interdisciplinary research projects

Here are the professor profiles:

{profiles}

Please provide a comprehensive analysis that would be valuable for academic planning and research strategy.
"""

# Format the prompt with the actual profiles
formatted_prompt = task_prompt.format(profiles="\n\n".join(profiles))
print("Task prompt created and formatted with profiles")


Task prompt created and formatted with profiles


In [4]:
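# Step 2: Test the prompt with Llama 3.3 70B via Novita
# (the openai_* variable names are kept for later cells; no OpenAI model is called here)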
print("Testing with Llama 3.3 70b via Novita...")

openai_response = novita_client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-instruct",
    messages=[
        {"role": "user", "content": formatted_prompt}
    ],
    max_tokens=2000,
    temperature=0.7
)

openai_output = openai_response.choices[0].message.content
openai_usage = openai_response.usage

print("Llama 3.3 Response:")
print("=" * 50)
print(openai_output)
print("=" * 50)
print(f"Token usage: {openai_usage}")
print()


Testing with Llama 3.3 70b via Novita...
Llama 3.3 Response:
**Summary of Each Professor's Research Focus**

1. **Matias Cattaneo**: Professor Cattaneo's research focus lies at the intersection of econometrics, statistics, machine learning, and data science, with a strong emphasis on causal inference, program evaluation, and quantitative methods in the social, behavioral, and biomedical sciences.
2. **Jianqing Fan**: Professor Fan's research interests span a broad range of topics, including high-dimensional statistics, machine learning, financial econometrics, computational biology, biostatistics, and graphical and network modeling, with applications in finance, biology, and other fields.
3. **Jason Klusowski**: Professor Klusowski's research focus is on data science, statistical learning, deep learning, and decision tree learning, with a particular interest in high-dimensional statistics, information theory, statistical physics, and network modeling.

**Potential Collaboration Opportu

In [5]:
# Step 3: Test the same prompt with Llama 3.1 8B via Novita
print("Testing with Llama 3.1 8B via Novita...")

llama_response = novita_client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[
        {"role": "user", "content": formatted_prompt}
    ],
    max_tokens=2000,
    temperature=0.7
)

llama_output = llama_response.choices[0].message.content
llama_usage = llama_response.usage

print("Llama 3.1 Response:")
print("=" * 50)
print(llama_output)
print("=" * 50)
print(f"Token usage: {llama_usage}")
print()


Testing with Llama 3.1 8B via Novita...
Llama 3.1 Response:
Based on the provided professor profiles, here's a comprehensive analysis:

**Professor Profile Summaries:**

1. **Matias Cattaneo**: Professor Cattaneo's research focus lies at the intersection of econometrics, statistics, machine learning, and data science. His interests include causal inference, program evaluation, and quantitative methods in social, behavioral, and biomedical sciences. His work involves developing and applying statistical and machine learning techniques to analyze complex data and inform policy decisions.
2. **Jianqing Fan**: Professor Fan is a renowned expert in high-dimensional statistics, machine learning, and financial econometrics. His research encompasses topics like graphical and network modeling, portfolio theory, high-frequency finance, and time series analysis. Fan's work aims to develop new statistical and computational methods for analyzing and modeling complex data in finance, biology, and oth
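
Steps 2 and 3 issue the same chat-completion call with only the model name changed, so the call could be wrapped in one helper. The sketch below is one way to do that. `run_prompt` and the offline `FakeCompletions` stub are hypothetical names introduced here; the stub only mimics the response attributes the notebook reads, so the example runs without an API key.

```python
from types import SimpleNamespace


def run_prompt(client, model, prompt, max_tokens=2000, temperature=0.7):
    """Send one user message and return (text, usage, truncated)."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        temperature=temperature,
    )
    choice = response.choices[0]
    # finish_reason == "length" means the reply hit the max_tokens cap.
    truncated = choice.finish_reason == "length"
    return choice.message.content, response.usage, truncated


class FakeCompletions:
    """Offline stand-in that mimics the response shape used above."""

    def create(self, model, messages, max_tokens, temperature):
        text = f"[{model}] echo: {messages[0]['content']}"
        return SimpleNamespace(
            choices=[SimpleNamespace(
                message=SimpleNamespace(content=text),
                finish_reason="stop",
            )],
            usage=SimpleNamespace(total_tokens=len(text.split())),
        )


fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))
text, usage, truncated = run_prompt(fake_client, "demo-model", "hello")
print(text, truncated)
```

With the real clients, the same helper would be called as `run_prompt(novita_client, "meta-llama/llama-3.1-8b-instruct", formatted_prompt)`.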

In [6]:
# Step 4: Feed both outputs to GPT for comparison and prompt improvement
print("Analyzing outputs and generating improved prompt...")

comparison_prompt = f"""
You are an expert in prompt engineering and AI model optimization. I have two responses to the same prompt from different AI models:

ORIGINAL PROMPT:
{formatted_prompt}

LLAMA 3.3 70B RESPONSE:
{openai_output}

LLAMA 3.1 8B RESPONSE:
{llama_output}

Please analyze these responses and:

1. Identify the key differences in quality, depth, and structure between the two responses
2. Determine what specific aspects of the Llama 3.1 8B response could be improved

Focus on making the prompt more specific, providing better structure guidance, and addressing any weaknesses you observe in the Llama 3.1 8B response.
"""

comparison_response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": comparison_prompt}
    ],
    max_tokens=3000,
    temperature=0.3
)

comparison_output = comparison_response.choices[0].message.content
comparison_usage = comparison_response.usage

print("Analysis:")
print("=" * 50)
print(comparison_output)
print("=" * 50)
print(f"Token usage: {comparison_usage}")
print()

messages = [
    {"role": "user", "content": comparison_prompt},
    {"role": "assistant", "content": comparison_output},
    {"role": "user", "content": "Please provide the improved prompt only."},
]

response = openai_client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    max_tokens=1000,
    temperature=0.3
)
print("Improved Prompt:")
print("=" * 50)
print(response.choices[0].message.content)

Analyzing outputs and generating improved prompt...
Analysis:
**1. Key Differences in Quality, Depth, and Structure between the Two Responses**

Quality:
- Both responses are of high quality, providing detailed and comprehensive analyses of the professors' research interests, potential collaborations, emerging trends, and interdisciplinary research projects. However, the Llama 3.3 70B response seems to provide a slightly more in-depth analysis, particularly in the section on potential collaboration opportunities and recommendations for interdisciplinary research projects.

Depth:
- The Llama 3.3 70B response goes into more depth in its analysis, providing more detailed descriptions of the professors' research interests and potential collaborations. It also provides more specific recommendations for interdisciplinary research projects, and includes an additional section on academic planning and research strategy that the Llama 3.1 8B response does not include.

Structure:
- Both respons

In [7]:
# Step 5: Extract and test the improved prompt

improved_prompt_match = response.choices[0].message.content
improved_prompt = improved_prompt_match.strip().strip('"').strip("'")

# Test the improved prompt with Llama
print("\nTesting improved prompt with Llama 3.1...")

improved_llama_response = novita_client.chat.completions.create(
    model="meta-llama/llama-3.1-8b-instruct",
    messages=[
        {"role": "user", "content": improved_prompt}
    ],
    max_tokens=2000,
    temperature=0.7
)

improved_llama_output = improved_llama_response.choices[0].message.content
improved_llama_usage = improved_llama_response.usage

print("Improved Llama Response:")
print("=" * 50)
print(improved_llama_output)
print("=" * 50)
print(f"Token usage: {improved_llama_usage}")
print()



Testing improved prompt with Llama 3.1...
Improved Llama Response:
**Professor Profile Analysis**

**1. One-sentence summary of each professor's research focus:**

- **Matias Cattaneo**: Professor Cattaneo's research focus is on developing and applying cutting-edge econometric and statistical methods to address complex social, behavioral, and biomedical science questions.
- **Jianqing Fan**: Professor Fan's research focus is on advancing high-dimensional statistics and machine learning techniques to analyze and model complex data structures in finance, biology, and other fields.
- **Jason Klusowski**: Professor Klusowski's research focus is on exploring the application of machine learning, deep learning, and statistical learning methods to analyze and understand complex data structures in various fields, including data science.

**2. Potential collaboration projects and research questions:**

- **Project 1:** "Causal Inference in High-Dimensional Settings"
  - Professor Cattaneo's exp
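
Step 5 strips surrounding quotes from the model's reply before reusing it as a prompt. Replies sometimes also arrive wrapped in a Markdown code fence, or as a template that still contains a `{profiles}` placeholder. The sketch below handles both cases under those assumptions; `clean_prompt` is a hypothetical helper, not the notebook's own code, and whether the actual GPT-4 reply needs it is not known from the output shown.

```python
FENCE = "`" * 3  # Markdown code-fence marker


def clean_prompt(reply, profiles_text=None):
    """Remove wrapping fences and quotes, then fill a {profiles} placeholder."""
    text = reply.strip()
    lines = text.splitlines()
    # Drop a surrounding fence only when it occupies the first and last lines.
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        text = "\n".join(lines[1:-1]).strip()
    text = text.strip('"').strip("'").strip()
    # str.replace avoids the KeyError that str.format raises on other braces.
    if profiles_text is not None and "{profiles}" in text:
        text = text.replace("{profiles}", profiles_text)
    return text


reply = FENCE + '\n"Summarize each profile:\n{profiles}"\n' + FENCE
print(clean_prompt(reply, "Profile A"))
```

In the notebook, the second argument would be `"\n\n".join(profiles)`.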

In [8]:
# Summary and Results
print("PROMPT OPTIMIZATION TEST SUMMARY")
print("=" * 60)

print("\n1. ORIGINAL PROMPT:")
print("-" * 30)
print(formatted_prompt[:200] + "..." if len(formatted_prompt) > 200 else formatted_prompt)

print(f"\n2. LLAMA 3.3 70B OUTPUT LENGTH: {len(openai_output)} characters")
print(f"   Token usage: {openai_usage}")

print(f"\n3. LLAMA 3.1 8B OUTPUT LENGTH: {len(llama_output)} characters")
print(f"   Token usage: {llama_usage}")

print(f"\n4. ANALYSIS OUTPUT LENGTH: {len(comparison_output)} characters")
print(f"   Token usage: {comparison_usage}")

print(f"\n5. IMPROVED LLAMA OUTPUT LENGTH: {len(improved_llama_output)} characters")
print(f"   Token usage: {improved_llama_usage}")

print("\n6. COMPARISON:")
print("-" * 30)
print("Original Llama vs Improved Llama:")
print(f"  Original: {len(llama_output)} chars, {llama_usage.total_tokens} tokens")
print(f"  Improved: {len(improved_llama_output)} chars, {improved_llama_usage.total_tokens} tokens")

print("\nTest completed! Check the outputs above to evaluate the effectiveness of the prompt optimization.")


PROMPT OPTIMIZATION TEST SUMMARY

1. ORIGINAL PROMPT:
------------------------------

You are an expert academic researcher. Please analyze the following professor profiles and provide:

1. A summary of each professor's research focus
2. Potential collaboration opportunities between t...

2. LLAMA 3.3 70B OUTPUT LENGTH: 5488 characters
   Token usage: CompletionUsage(completion_tokens=939, prompt_tokens=382, total_tokens=1321, completion_tokens_details=None, prompt_tokens_details=None)

3. LLAMA 3.1 8B OUTPUT LENGTH: 4421 characters
   Token usage: CompletionUsage(completion_tokens=767, prompt_tokens=382, total_tokens=1149, completion_tokens_details=None, prompt_tokens_details=None)

4. ANALYSIS OUTPUT LENGTH: 3457 characters
   Token usage: CompletionUsage(completion_tokens=603, prompt_tokens=2179, total_tokens=2782, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tok
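
The summary reports raw character and token counts; a relative change can be easier to read. The sketch below applies a hypothetical `percent_change` helper to the two original-prompt responses reported above: `openai_output` (5488 characters, 939 completion tokens) against `llama_output` (4421 characters, 767 completion tokens). Length is only a rough proxy and says nothing about response quality.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


# Figures taken from the summary output above.
chars_change = percent_change(4421, 5488)
tokens_change = percent_change(767, 939)
print(f"characters: {chars_change:+.1f}%  completion tokens: {tokens_change:+.1f}%")
```

The same helper could compare `llama_output` with `improved_llama_output` once both lengths are available.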