<a href="https://colab.research.google.com/github/merplumander/ai-forecasting/blob/metaculus-bot/Q4_Metaculus_Bot_Template.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Forecasting Bot Template

Instructions for getting this colab running are here: https://www.notion.so/metaculus/Instructions-Resources-for-Bot-Building-in-Q4-8995fc6f18034f52af8d9b084936a2b5?pvs=4#1216aaf4f1018027aaebd8d469d10e07

This is a simple bot template that you can use to forecast in the Q4 Metaculus AI Benchmarking Contest. It uses a single-shot LLM prompt that you are encouraged to experiment with!

In order to run this notebook as is, you'll need to enter a few API keys (use the key icon on the left to input them):

- `METACULUS_TOKEN`: you can find your Metaculus token under your bot's user settings page: https://www.metaculus.com/accounts/settings/, or on the bot registration page where you created the account: https://www.metaculus.com/aib/
- `ASKNEWS_CLIENT_ID` and `ASKNEWS_SECRET` - used to search for relevant news articles from [AskNews](https://asknews.app). Sign up for a [**free account**](https://my.asknews.app/en/plans) and get the secrets from https://my.asknews.app/en/settings/api-credentials
  Here is a Metaculus promo code for the Pro tier, which covers all calls for 500 tournament questions: METACULUSBMSERIESQ4

Full disclosure: AskNews is also entering a bot into the Q4 competition.

Some common alternatives to AskNews include: Perplexity.ai, Exa.ai, or Tavily.
The prompt used in this notebook (when combined with a Perplexity research summary and GPT-4o) was used for the original Metaculus bot in Q3 (mf-bot-1 got 4th place).

# Setup

First we need to install the necessary dependencies, load our API keys, define our GPT prompt, and set up some helper functions.

## Install dependencies

In [None]:
!pip install -qU openai asknews

## Load API keys from secrets

In [None]:
# Make sure you have set these in the sidebar to the left, by pressing the key icon.
from google.colab import userdata
METACULUS_TOKEN = userdata.get('METACULUS_TOKEN')
# PERPLEXITY_API_KEY = userdata.get('PERPLEXITY_API_KEY')

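The cell above relies on `google.colab.userdata`, which is only available inside Colab. If you run the notebook elsewhere, a fallback along the following lines may help. This is a minimal sketch: `get_secret` is a hypothetical helper, and reading the keys from environment variables of the same name is an assumption, not part of the template.

```python
import os
from typing import Optional


def get_secret(name: str) -> Optional[str]:
    """Return a secret from Colab if available, else from the environment."""
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        return os.environ.get(name)
    try:
        return userdata.get(name)
    except Exception:
        # Secret missing or notebook access not granted: fall back to env.
        return os.environ.get(name)


# Example: set a dummy value (hypothetical name), then read it back.
os.environ.setdefault("GR_EXAMPLE_TOKEN", "dummy-value")
print(get_secret("GR_EXAMPLE_TOKEN"))
```

With such a helper, `METACULUS_TOKEN = get_secret('METACULUS_TOKEN')` would work both in Colab and in a local Jupyter session where the variable is exported.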

## LLM Prompt

You can change the prompt below to experiment. Key parameters that you can include in your prompt are:

*   `{title}` The question itself
*   `{background}` The background section of the Metaculus question. This comes from the `description` field on the question
*   `{resolution_criteria}` The resolution criteria section of the question
*   `{fine_print}` The fine print section of the question
*   `{today}` Today's date. Remember that your bot doesn't know the date unless you tell it explicitly!
*   `{summary_report}` A summary of news articles from AskNews and, optionally, Perplexity. Metaculus does not provide this directly; the code below calls AskNews by default. For this to work, you must sign up for an account as explained at the top of this document.

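To see how these placeholders are filled in, here is a small sketch using Python's `str.format`. The shortened template and the sample values are made up for illustration; the real `PROMPT_TEMPLATE` is defined further down.

```python
import datetime

# A shortened stand-in for PROMPT_TEMPLATE (illustrative only).
MINI_TEMPLATE = """Question: {title}
Background: {background}
Today is {today}.
"""

filled = MINI_TEMPLATE.format(
    title="Will it rain tomorrow?",    # made-up sample question
    background="Forecasts disagree.",  # made-up sample background
    today=datetime.date(2024, 10, 1).isoformat(),
)
print(filled)
```

If you add literal curly braces to your prompt (for example a JSON snippet), double them (`{{` and `}}`); otherwise `str.format` will treat them as placeholders.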

**IMPORTANT**: As you experiment with changing the prompt, be aware that the last number followed by a `%` sign in the LLM output will be used as the forecast probability. The last line in the template asks the model to end with its answer in that form.

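As a rough illustration of why the last percentage matters, the sketch below mirrors the extraction approach used by the helper functions in this notebook: find every whole number followed by `%` and keep the last one. The function name `last_percentage` and the sample texts are hypothetical.

```python
import re


def last_percentage(text: str) -> int:
    """Return the last integer followed by '%', clamped to the range 1-99."""
    matches = re.findall(r"(\d+)%", text)
    if not matches:
        raise ValueError("No percentage found in text")
    return min(99, max(1, int(matches[-1])))


# The intermediate "30%" is ignored; only the final number counts.
good = "Base rate is about 30%. Probability: 12%"
# Here a trailing remark overrides the intended answer.
bad = "Probability: 12%. I am 90% sure of this."

print(last_percentage(good))  # 12
print(last_percentage(bad))   # 90
```

Note that a decimal answer such as `12.5%` would be read as `5` by this pattern, so asking the model for whole numbers is the safer choice.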
In [None]:
PROMPT_TEMPLATE = """
You are a professional forecaster interviewing for a job.

Your interview question is:
{title}

Background:
{background}

{resolution_criteria}

{fine_print}


Your research assistant says:
{summary_report}

Today is {today}.

Before answering you write:
(a) The time left until the outcome to the question is known.
(b) What the outcome would be if nothing changed.
(c) What you would forecast if there was only a quarter of the time left.
(d) What you would forecast if there was 4x the time left.

You write your rationale and then the last thing you write is your final answer as: "Probability: ZZ%", 0-100

"""

## Helper functions
This section sets up some simple helper code you can use to get data about forecasting questions and to submit a prediction.

In [None]:
import datetime
import json
import os
import requests
import re
from asknews_sdk import AskNewsSDK
import textwrap

AUTH_HEADERS = {"headers": {"Authorization": f"Token {METACULUS_TOKEN}"}}
API_BASE_URL = "https://www.metaculus.com/api2"

TOURNAMENT_ID = 32506 # 32506 is the tournament ID for Q4 AI Benchmarking


def post_question_comment(question_id: int, comment_text: str) -> None:
    """
    Post a comment on the question page as the bot user.
    """

    response = requests.post(
        f"{API_BASE_URL}/comments/",
        json={
            "comment_text": comment_text,
            "submit_type": "N",
            "include_latest_prediction": True,
            "question": question_id,
        },
        **AUTH_HEADERS,
    )
    if not response.ok:
        raise Exception(response.text)


def post_question_prediction(question_id: int, prediction_percentage: float) -> None:
    """
    Post a prediction value (between 1 and 100) on the question.
    """
    assert 1 <= prediction_percentage <= 100, "Prediction must be between 1 and 100"
    url = f"{API_BASE_URL}/questions/{question_id}/predict/"
    response = requests.post(
        url,
        json={"prediction": float(prediction_percentage) / 100},
        **AUTH_HEADERS,
    )
    if not response.ok:
        raise Exception(response.text)


def get_question_details(question_id: int) -> dict:
    """
    Get all details about a specific question.
    """
    url = f"{API_BASE_URL}/questions/{question_id}/"
    response = requests.get(
        url,
        **AUTH_HEADERS,
    )
    if not response.ok:
        raise Exception(response.text)
    return json.loads(response.content)


def list_questions(tournament_id=TOURNAMENT_ID, offset=0, count=10) -> dict:
    """
    List up to {count} open binary questions (with all details) from tournament {tournament_id}.
    The questions are returned under the "results" key of the response.
    """
    url_qparams = {
        "limit": count,
        "offset": offset,
        "has_group": "false",
        "order_by": "-activity",
        "forecast_type": "binary",
        "project": tournament_id,
        "status": "open",
        "type": "forecast",
        "include_description": "true",
    }
    url = f"{API_BASE_URL}/questions/"
    response = requests.get(url, **AUTH_HEADERS, params=url_qparams)
    if not response.ok:
        raise Exception(response.text)
    data = json.loads(response.content)
    return data


def call_perplexity(query: str) -> str:
    PERPLEXITY_API_KEY = userdata.get("PERPLEXITY_API_KEY")
    url = "https://api.perplexity.ai/chat/completions"
    api_key = PERPLEXITY_API_KEY
    headers = {
        "accept": "application/json",
        "authorization": f"Bearer {api_key}",
        "content-type": "application/json",
    }
    payload = {
        "model": "llama-3.1-sonar-large-128k-chat",
        "messages": [
            {
                "role": "system", # this is a system prompt designed to guide the perplexity assistant
                "content": """
                  You are an assistant to a superforecaster.
                  The superforecaster will give you a question they intend to forecast on.
                  To be a great assistant, you generate a concise but detailed rundown of the most relevant news, including if the question would resolve Yes or No based on current information.
                  You do not produce forecasts yourself.
                  """,
            },
            {
                "role": "user", # this is the actual prompt we ask the perplexity assistant to answer
                "content": query,
            },
        ],
    }
    response = requests.post(url=url, json=payload, headers=headers)
    if not response.ok:
        raise Exception(response.text)
    content = response.json()["choices"][0]["message"]["content"]
    print(
        f"\n\nCalled perplexity with:\n----\n{json.dumps(payload)}\n---\n, and got\n:",
        content,
    )
    return content



def get_asknews_context(query: str) -> tuple[str, str]:
    """
    Use the AskNews `news` endpoint to get news context for your query.
    The full API reference can be found here: https://docs.asknews.app/en/reference#get-/v1/news/search
    """
    ask = AskNewsSDK(
        client_id=ASKNEWS_CLIENT_ID,
        client_secret=ASKNEWS_SECRET,
        scopes=["news"]
    )

    # get the latest news related to the query (within the past 48 hours)
    hot_response = ask.news.search_news(
        query=query, # your natural language query
        n_articles=5, # control the number of articles to include in the context, originally 5
        return_type="both",
        strategy="latest news" # enforces looking at the latest news only
    )

    # get context from the "historical" database that contains a news archive going back to 2023
    historical_response = ask.news.search_news(
        query=query,
        n_articles=20,
        return_type="both",
        strategy="news knowledge" # looks for relevant news within the past 60 days
    )

    # you can also specify a time range for your historical search if you want to
    # slice your search up periodically.
    # now = datetime.datetime.now().timestamp()
    # start = (datetime.datetime.now() - datetime.timedelta(days=100)).timestamp()
    # historical_response = ask.news.search_news(
    #     query=query,
    #     n_articles=20,
    #     return_type="both",
    #     historical=True,
    #     start_timestamp=int(start),
    #     end_timestamp=int(now)
    # )

    news_articles_with_full_context = hot_response.as_string + historical_response.as_string
    formatted_articles = format_asknews_context(
        hot_response.as_dicts, historical_response.as_dicts)
    return news_articles_with_full_context, formatted_articles


def format_asknews_context(hot_articles: list[dict], historical_articles: list[dict]) -> str:
    """
    Format the articles for posting to Metaculus.
    """

    formatted_articles = "Here are the relevant news articles:\n\n"

    if hot_articles:
      hot_articles = [article.__dict__ for article in hot_articles]
      hot_articles = sorted(
          hot_articles, key=lambda x: x['pub_date'], reverse=True)

      for article in hot_articles:
          pub_date = article["pub_date"].strftime("%B %d, %Y %I:%M %p")
          formatted_articles += f"**{article['eng_title']}**\n{article['summary']}\nOriginal language: {article['language']}\nPublish date: {pub_date}\nSource:[{article['source_id']}]({article['article_url']})\n\n"

    if historical_articles:
      historical_articles = [article.__dict__ for article in historical_articles]
      historical_articles = sorted(
          historical_articles, key=lambda x: x['pub_date'], reverse=True)

      for article in historical_articles:
          pub_date = article["pub_date"].strftime("%B %d, %Y %I:%M %p")
          formatted_articles += f"**{article['eng_title']}**\n{article['summary']}\nOriginal language: {article['language']}\nPublish date: {pub_date}\nSource:[{article['source_id']}]({article['article_url']})\n\n"

    if not hot_articles and not historical_articles:
      formatted_articles += "No articles were found.\n\n"
      return formatted_articles

    # formatted_articles += f"*Generated by AI at [AskNews](https://asknews.app), check out the [API](https://docs.asknews.app) for more information*."

    return formatted_articles


def extract_prediction_from_response_as_percentage_not_decimal(forecast_text: str) -> float:
    matches = re.findall(r"(\d+)%", forecast_text)
    if matches:
        # Return the last number found before a '%'
        number = int(matches[-1])
        number = min(99, max(1, number))  # clamp the number between 1 and 99
        return number
    else:
        raise ValueError(
            f"Could not extract prediction from response: {forecast_text}"
        )


def get_gpt_prediction(question_details: dict) -> tuple[float,str]:

    today = datetime.datetime.now().strftime("%Y-%m-%d")

    title = question_details["question"]["title"]
    resolution_criteria = question_details["question"]["resolution_criteria"]
    background = question_details["question"]["description"]
    fine_print = question_details["question"]["fine_print"]

    if GET_NEWS:
      # If you want to use AskNews, use the below
      full_article_context, formatted_articles = get_asknews_context(title)
      summary_report = formatted_articles

      # If you want to use Perplexity, use the below
      # summary_report += call_perplexity(title)
    else:
      summary_report = ""

    content = PROMPT_TEMPLATE.format(
                  title=title,
                  today=today,
                  background=background,
                  resolution_criteria=resolution_criteria,
                  fine_print=fine_print,
                  summary_report=summary_report
              )

    PRINT_LLM_PROMPT = True
    if PRINT_LLM_PROMPT:
      print("\n\n--------LLM PROMPT----------")
      print(content)
      print("\n\n----END LLM PROMPT----")


    response = requests.post(
      "https://www.metaculus.com/proxy/openai/v1/chat/completions/",
      json={
          "model": "gpt-4o",
          "messages": [{"role": "user", "content": content}],
          "temperature": 0,
      }, headers={"Authorization": f"Token {METACULUS_TOKEN}"}
    )
    # Fail loudly on HTTP errors instead of hitting a confusing KeyError below
    if not response.ok:
        raise Exception(response.text)
    chat_completion = json.loads(response.content)

    rationale = chat_completion["choices"][0]["message"]["content"]
    probability = extract_prediction_from_response_as_percentage_not_decimal(rationale)
    comment = f"Extracted Probability: {probability}%\n\nGPT's Answer: {rationale}\n\n\n ######### PROMPT USED TO GENERATE THE RESPONSE ABOVE ######## {content}\n\n"
    return probability, comment

# List open questions

Check which questions you can predict on in the Q4 AI Benchmarking tournament.

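`list_questions` accepts `offset` and `count`, so longer tournaments can be walked page by page. The sketch below shows the general idea with a stand-in fetch function, so it runs without network access. `fetch_page`, `FAKE_QUESTIONS`, and `iter_all_questions` are hypothetical, and treating a short or empty `results` list as the last page is an assumption rather than documented API behavior.

```python
from typing import Callable, Iterator

# Fake data standing in for the API (illustrative only).
FAKE_QUESTIONS = [{"id": i, "status": "open"} for i in range(1, 26)]


def fetch_page(offset: int = 0, count: int = 10) -> dict:
    """Stand-in for list_questions(offset=..., count=...)."""
    return {"results": FAKE_QUESTIONS[offset:offset + count]}


def iter_all_questions(
    fetch: Callable[..., dict], page_size: int = 10
) -> Iterator[dict]:
    """Yield questions page by page until a page comes back short."""
    offset = 0
    while True:
        results = fetch(offset=offset, count=page_size)["results"]
        yield from results
        if len(results) < page_size:
            break
        offset += page_size


ids = [q["id"] for q in iter_all_questions(fetch_page)]
print(len(ids))  # 25
```

With the real helper you would pass `list_questions`, which takes the same `offset` and `count` keywords, in place of `fetch_page`.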
In [None]:
questions = list_questions()

open_questions_ids = []
for question in questions["results"]:
  if question["status"] == "open":
    print(f"ID: {question['id']}\nQ: {question['title']}\nCloses: {question['scheduled_close_time']}")
    open_questions_ids.append(question["id"])

ID: 28905
Q: Will Tim Walz cease to be Kamala Harriss's running mate before November 1, 2024?
Closes: 2024-10-21T14:30:00Z
ID: 28902
Q: [PRACTICE] Will Donald Trump be jailed or incarcerated before 2030?
Closes: 2024-10-21T14:30:00Z
ID: 28901
Q: [PRACTICE] Will there be a US-China war before 2035?
Closes: 2024-10-21T14:30:00Z
ID: 28900
Q: [PRACTICE] Will Iran possess a nuclear weapon before 2030?
Closes: 2024-10-21T14:30:00Z
ID: 28895
Q: [PRACTICE] Will any peer-reviewed replication attempt before 2025 confirm the discovery of room-temperature and ambient-pressure superconductivity in LK-99?
Closes: 2024-10-18T14:30:00Z
ID: 28893
Q: [PRACTICE] Will there be Human-machine intelligence parity before 2040?
Closes: 2024-10-21T14:30:00Z
ID: 28892
Q: [PRACTICE] Will someone born before 2001 live to be 150?
Closes: 2024-10-21T14:30:00Z
ID: 28891
Q: [PRACTICE] Will humans go extinct before 2100?
Closes: 2024-10-21T14:30:00Z


# LLM forecasting

In [None]:
SUBMIT_PREDICTION = False # set to True to publish your predictions to Metaculus
FORECAST_Q4_AIB = False # set to True to forecast Q4 AI Benchmarking.
GET_NEWS = False # set to True to enable AskNews after entering ASKNEWS secrets

# The list of questions to forecast
forecast_questions_ids = []
if FORECAST_Q4_AIB:
  forecast_questions_ids = open_questions_ids
else:
  forecast_questions_ids = [28830, 28706] # Only include binary questions


if GET_NEWS:
  ASKNEWS_CLIENT_ID = userdata.get('ASKNEWS_CLIENT_ID')
  ASKNEWS_SECRET = userdata.get('ASKNEWS_SECRET')


for question_id in forecast_questions_ids:

  question_details = get_question_details(question_id)

  title = question_details["question"]["title"]
  resolution_criteria = question_details["question"]["resolution_criteria"]
  background = question_details["question"]["description"]
  fine_print = question_details["question"]["fine_print"]
  print(f"------------------------\nQuestion: {title}\n\nResolution criteria: {resolution_criteria}\n\nDescription: {background}\n\nFine print: {fine_print}\n\n")

  probability, comment = get_gpt_prediction(question_details)

  print(f"\n\n------------------LLM RESPONSE------------\n\n")
  print(f"--------------\nProbability: {probability}\n\nComment: {comment}\n\n")
  print(f"\n\n------------------END LLM RESPONSE------------\n\n")

  if SUBMIT_PREDICTION:
      assert probability is not None, "Unexpected probability format"
      post_question_prediction(question_id, probability)
      post_question_comment(question_id, comment)


------------------------
Question: Will at least 95% of all new road vehicles with 4+ wheels sold in the US in 2075 have SAE Level 5 autonomy?

Resolution criteria: This question will resolve as **Yes** if available data at the time suggest that the percentage of new road vehicles sold in the United States, from January 1, 2075 to December 31, 2075, that have SAE level 5 capabilities is equal to or higher than 95%.

For the purposes of this question, road vehicles are defined as motorised machines with **a minimum of 4 wheels** designed for use on roads to transport people, goods, or materials. These include cars, trucks, and buses.

Data used to resolve this question need not cover all road vehicle sales in the US and Metaculus admins might use their judgement to determine the resolution.

Description: SAE International uses a [classification of 6 levels](https://www.sae.org/blog/sae-j3016-update), from 0 to 5, for self-driving capabilities. Level 5 correspond to full automation where