## AI Forecasting Bot Template


This is a simple bot template that you can use to forecast in the Metaculus AI Benchmarking Warmup Contest. It is a single-shot GPT prompt that you are encouraged to experiment with!

In order to run this notebook as is, you'll need to enter a few API keys (use the key icon on the left to input them):

- `METACULUS_TOKEN`: find your Metaculus token on your bot's user settings page (https://www.metaculus.com/accounts/settings/) or on the bot registration page where you created the account (https://www.metaculus.com/aib/)
- `OPENAI_API_KEY`: get one from OpenAI's API keys page: https://platform.openai.com/settings/profile?tab=api-keys
- `PERPLEXITY_API_KEY`: used to search for up-to-date information about the question. Get one from https://www.perplexity.ai/settings/api
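
If you run the code outside Colab, the key icon is not available. One possible approach, shown here only as a sketch, is to read the same three names from environment variables and check that none is missing before calling any API. The helper `missing_keys` and the constant `REQUIRED_KEYS` are hypothetical and not part of the notebook; the key names are the ones the notebook's code reads.

```python
import os

# The three secrets the notebook's code expects (names taken from the notebook).
REQUIRED_KEYS = ("METACULUS_TOKEN", "OPENAI_API_KEY", "PERPLEXITY_API_KEY")


def missing_keys(env=None):
    """Return the names of required keys that are absent or empty in `env`.

    `env` defaults to the process environment; pass a dict to test the check.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"METACULUS_TOKEN": "abc", "OPENAI_API_KEY": ""}
print(missing_keys(example_env))
```

An empty value is treated the same as a missing one, since an empty token would fail later with a less obvious authentication error.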




In [None]:
# Make sure you have set these in the sidebar to the left, by pressing the key icon.
from google.colab import userdata
METACULUS_TOKEN = userdata.get('METACULUS_TOKEN')
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
PERPLEXITY_API_KEY = userdata.get('PERPLEXITY_API_KEY')

### ChatGPT Prompt

You can change the prompt below to experiment. Key parameters that you can include in your prompt are:

*   `{title}` The question itself
*   `{summary_report}` An up-to-date news compilation generated by Perplexity
*   `{background}` The background section of the Metaculus question. This comes from the `description` field on the question
*   `{fine_print}` The fine print section of the question
*   `{today}` Today's date. Remember that your bot doesn't know the date unless you tell it explicitly!
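
The placeholders above are filled with Python's `str.format`. The following sketch uses a shortened, hypothetical template (`MINI_TEMPLATE`) and made-up values to show the mechanics: every placeholder that appears in the template must be supplied, otherwise `format` raises a `KeyError`.

```python
# A shortened stand-in for the real prompt; the text and values are illustrative only.
MINI_TEMPLATE = """Question: {title}
News: {summary_report}
Background: {background}
Fine print: {fine_print}
Today is {today}.
"""

filled = MINI_TEMPLATE.format(
    title="Will it rain tomorrow?",
    summary_report="Forecasts disagree.",
    background="A hypothetical question.",
    fine_print="None.",
    today="2024-07-01",
)
print(filled)

# Leaving out a placeholder that the template uses raises KeyError.
try:
    MINI_TEMPLATE.format(title="Only the title")
except KeyError as err:
    missing = err.args[0]
    print(f"Missing placeholder: {missing}")
```

If you add literal curly braces to your prompt (for example a JSON snippet), double them as `{{` and `}}` so that `format` does not treat them as placeholders.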


**IMPORTANT**: As you experiment with changing the prompt, be aware that the last number followed by a `%` sign in GPT's output is used as the forecast probability. The last line of the template asks GPT to end with its answer in that format.
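
To see why the position of the final percentage matters, here is a minimal sketch of the extraction step, assuming it takes the last run of digits directly before a `%` sign, as the notebook's `find_number_before_percent` helper does. The name `last_percent` and the sample strings are hypothetical.

```python
import re


def last_percent(text):
    """Return the last integer that is immediately followed by '%', or None."""
    matches = re.findall(r"(\d+)%", text)
    return int(matches[-1]) if matches else None


good = "The base rate is about 30%. Probability: 20%"
trailing = "Probability: 20%. Note that polls moved 5% last week."
decimal = "Probability: 12.5%"

print(last_percent(good))      # 20
print(last_percent(trailing))  # 5, because a later percentage overrides the answer
print(last_percent(decimal))   # 5, because only the digits right before '%' match
print(last_percent("No number here"))  # None
```

If you change the prompt, it is therefore safest to keep the instruction that the answer comes last and is a whole number.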


In [None]:
PROMPT_TEMPLATE = """
You are a professional forecaster interviewing for a job.
The interviewer is also a professional forecaster, with a strong track record of
accurate forecasts of the future. They will ask you a question, and your task is
to provide the most accurate forecast you can. To do this, you evaluate past data
and trends carefully, make use of comparison classes of similar events, take into
account base rates about how past events unfolded, and outline the best reasons
for and against any particular outcome. You know that great forecasters don't
just forecast according to the "vibe" of the question and the considerations.
Instead, they think about the question in a structured way, recording their
reasoning as they go, and they always consider multiple perspectives that
usually give different conclusions, which they reason about together.
You can't know the future, and the interviewer knows that, so you do not need
to hedge your uncertainty, you are simply trying to give the most accurate numbers
that will be evaluated when the events later unfold.

Your interview question is:
{title}

Your research assistant says:
{summary_report}

background:
{background}

fine_print:
{fine_print}

Today is {today}.

You write your rationale and give your final answer as: "Probability: ZZ%", 0-100
"""

## Some setup code

This section sets up some simple helper code you can use to get data about forecasting questions and to submit a prediction.

In [None]:
!pip install -qU openai
import datetime
import json
import os
import requests
import re

from openai import OpenAI

AUTH_HEADERS = {"headers": {"Authorization": f"Token {METACULUS_TOKEN}"}}
API_BASE_URL = "https://www.metaculus.com/api2"
WARMUP_TOURNAMENT_ID = 32506
SUBMIT_PREDICTION = False

def find_number_before_percent(s):
    # Use a regular expression to find all numbers followed by a '%'
    matches = re.findall(r'(\d+)%', s)
    if matches:
        # Return the last number found before a '%'
        return int(matches[-1])
    else:
        # Return None if no number found
        return None

def post_question_comment(question_id, comment_text):
    """
    Post a comment on the question page as the bot user.
    """

    response = requests.post(
        f"{API_BASE_URL}/comments/",
        json={
            "comment_text": comment_text,
            "submit_type": "N",
            "include_latest_prediction": True,
            "question": question_id,
        },
        **AUTH_HEADERS,
    )

    if not response.ok:
        raise Exception(response.text)

def post_question_prediction(question_id, prediction_percentage):
    """
    Post a prediction on the question, given as a percentage (the caller clamps it to 1-99).
    """
    url = f"{API_BASE_URL}/questions/{question_id}/predict/"
    response = requests.post(
        url,
        json={"prediction": float(prediction_percentage) / 100},
        **AUTH_HEADERS,
    )

    if not response.ok:
        raise Exception(response.text)


def get_question_details(question_id):
    """
    Get all details about a specific question.
    """
    url = f"{API_BASE_URL}/questions/{question_id}/"
    response = requests.get(
        url,
        **AUTH_HEADERS,
    )

    if not response.ok:
        raise Exception(response.text)

    return json.loads(response.content)

def list_questions(tournament_id=WARMUP_TOURNAMENT_ID, offset=0, count=10):
    """
    List (all details) {count} questions from the {tournament_id}
    """
    url_qparams = {
        "limit": count,
        "offset": offset,
        "has_group": "false",
        "order_by": "-activity",
        "forecast_type": "binary",
        "project": tournament_id,
        "status": "open",
        "type": "forecast",
        "include_description": "true",
    }
    url = f"{API_BASE_URL}/questions/"
    response = requests.get(url, **AUTH_HEADERS, params=url_qparams)

    if not response.ok:
        raise Exception(response.text)

    data = json.loads(response.content)
    return data

def call_perplexity(query):
    url = "https://api.perplexity.ai/chat/completions"
    headers = {
        "accept": "application/json",
        "authorization": f"Bearer {PERPLEXITY_API_KEY}",
        "content-type": "application/json",
    }
    payload = {
        "model": "llama-3.1-sonar-large-128k-chat",
        "messages": [
            {
                "role": "system",
                "content": """
You are an assistant to a superforecaster.
The superforecaster will give you a question they intend to forecast on.
To be a great assistant, you generate a concise but detailed rundown of the most relevant news, including if the question would resolve Yes or No based on current information.
You do not produce forecasts yourself.
""",
            },
            {"role": "user", "content": query},
        ],
    }
    response = requests.post(url=url, json=payload, headers=headers)

    if not response.ok:
        raise Exception(response.text)

    content = response.json()["choices"][0]["message"]["content"]
    return content

def get_gpt_prediction(question_details):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = OpenAI(api_key=OPENAI_API_KEY)

    title = question_details["title"]
    resolution_criteria = question_details["question"]["resolution_criteria"]
    background = question_details["question"]["description"]
    fine_print = question_details["question"]["fine_print"]

    # To skip Perplexity, replace this call with: summary_report = ""
    summary_report = call_perplexity(title)

    chat_completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
        {
            "role": "user",
            "content": PROMPT_TEMPLATE.format(
                title=title,
                summary_report=summary_report,
                today=today,
                background=background,
                fine_print=fine_print,
            )
        }
        ]
    )

    gpt_text = chat_completion.choices[0].message.content

    # Take the last number followed by '%' in the response as the forecast
    probability = find_number_before_percent(gpt_text)

    # Compare against None so that a parsed value of 0 is not discarded
    if probability is not None:
        print(f"The extracted probability is: {probability}%")
        probability = min(max(probability, 1), 99) # To prevent extreme forecasts

    return probability, summary_report, gpt_text

## GPT prediction and submitting a forecast

This is an example of how you can use the helper functions from above.
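
To forecast on more than one question, the same steps can be wrapped in a loop. The sketch below is one possible shape, not part of the original template: `forecast_many` is a hypothetical helper that takes the prediction and submission functions as arguments, and it assumes each question dict carries an `"id"` key.

```python
def forecast_many(questions, predict, submit=None):
    """Run `predict` on each question dict and optionally pass results to `submit`.

    `predict` must return (probability, report, rationale), like get_gpt_prediction.
    `submit` is called as submit(question_id, probability); None means a dry run.
    Assumption: every question dict has an "id" key.
    """
    results = {}
    for question in questions:
        probability, _report, _rationale = predict(question)
        if probability is None:
            continue  # no percentage could be parsed, so skip this question
        results[question["id"]] = probability
        if submit is not None:
            submit(question["id"], probability)
    return results


# Dry run with stand-in functions, so no API is called.
def fake_predict(question):
    return (None, "", "") if question["id"] == 2 else (40, "report", "rationale")


submitted = []
results = forecast_many(
    [{"id": 1}, {"id": 2}],
    fake_predict,
    submit=lambda qid, p: submitted.append((qid, p)),
)
print(results, submitted)
```

In the notebook you would pass `get_gpt_prediction` as `predict`, and `post_question_prediction` as `submit` only when `SUBMIT_PREDICTION` is true.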

In [None]:

question_id = 25140
question_details = get_question_details(question_id)
# print(question_details)

prediction, perplexity_result, gpt_result = get_gpt_prediction(question_details)
print("GPT predicted: ", prediction, perplexity_result, gpt_result)


if prediction is not None and SUBMIT_PREDICTION:
    post_question_prediction(question_id, prediction)
    comment = "PERPLEXITY\n\n" + perplexity_result + "\n\n#########\n\n" + "GPT\n\n" + gpt_result
    post_question_comment(question_id, comment)


The extracted probability is: 20%
GPT predicted:  20 As of the current date, here are some key points to consider regarding the market capitalization of Nvidia and Apple:

## Current Market Capitalization
- As of recent data, Apple's market capitalization is significantly larger than Nvidia's. Apple is one of the largest companies in the world by market cap, often hovering around or above $2 trillion.
- Nvidia, while a large and influential company in the tech sector, particularly in the fields of graphics processing units (GPUs) and artificial intelligence (AI), has a market capitalization that is substantially lower than Apple's, typically in the range of $500 billion to $1 trillion.

## Historical Trends
- Historically, Apple's market capitalization has been much larger than Nvidia's due to its diverse product lineup, strong brand loyalty, and significant revenue from various segments including iPhones, Macs, iPads, and services.
- Nvidia has seen significant growth in recent years 