# 000 Forecasting Bot

Starting from https://colab.research.google.com/drive/1_Il5h2Ed4zFa6Z3bROVCE68LZcSi4wHX?usp=sharing

## Imports

In [1]:
from IPython.display import Markdown

## 000_bot

### API Keys

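To run this notebook as is, you'll need a few API keys. The cell below reads them from `tokens.yaml`; uncomment the `OmegaConf.save` line once to create that file from the placeholder values, then edit it with your own keys: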

- `METACULUS_TOKEN`: you can find your Metaculus token under your bot's user settings page: https://www.metaculus.com/accounts/settings/, or on the bot registration page where you created the account: https://www.metaculus.com/aib/
- `OPENAI_API_KEY`: get one from OpenAI's page: https://platform.openai.com/settings/profile?tab=api-keys
- `PERPLEXITY_API_KEY`: used to search for up-to-date information about the question. Get one from https://www.perplexity.ai/settings/api
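
Before running the rest of the notebook, it can help to confirm that no key is still a placeholder. The following is a minimal sketch, assuming the placeholder values `xx`, `yy` and `zz` used in this notebook; `missing_keys` is a hypothetical helper, not part of the original code, and it accepts any mapping with a `.get` method.

```python
REQUIRED_KEYS = ("METACULUS_TOKEN", "OPENAI_API_KEY", "PERPLEXITY_API_KEY")
PLACEHOLDERS = {"", "xx", "yy", "zz"}  # assumed placeholder values


def missing_keys(settings):
    """Return the required keys that are absent or still hold a placeholder."""
    return [
        key for key in REQUIRED_KEYS
        if str(settings.get(key, "")).strip() in PLACEHOLDERS
    ]


# Illustrative values only; none of these is a real key.
example = {
    "METACULUS_TOKEN": "xx",
    "OPENAI_API_KEY": "sk-example",
    "PERPLEXITY_API_KEY": "",
}
print(missing_keys(example))  # ['METACULUS_TOKEN', 'PERPLEXITY_API_KEY']
```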

In [2]:
from omegaconf import OmegaConf

tokens = OmegaConf.create("""
METACULUS_TOKEN: xx
OPENAI_API_KEY: yy
OPENAI_MODEL: gpt-4o
PERPLEXITY_API_KEY: zz
PERPLEXITY_MODEL: llama-3-sonar-large-32k-online""")

token_fn = "tokens.yaml"
# OmegaConf.save(config=tokens, f=token_fn)
config = OmegaConf.load(token_fn)

def pr(tokens):
    # Print the configuration that was passed in (this includes the secret keys).
    print(OmegaConf.to_yaml(tokens))

### LLM and Metaculus Interaction

This section sets up simple helper code for fetching data about forecasting questions and for submitting predictions and comments.

In [3]:
import datetime
import json
import os
import requests
import re
from openai import OpenAI
from tqdm import tqdm

In [4]:
AUTH_HEADERS = {"headers": {"Authorization": f"Token {config.METACULUS_TOKEN}"}}
API_BASE_URL = "https://www.metaculus.com/api2"
WARMUP_TOURNAMENT_ID = 3349
SUBMIT_PREDICTION = True

def find_number_before_percent(s):
    # Use a regular expression to find all numbers followed by a '%'
    matches = re.findall(r'(\d+)%', s)
    if matches:
        # Return the last number found before a '%'
        return int(matches[-1])
    else:
        # Return None if no number found
        return None

def post_question_comment(question_id, comment_text):
    """
    Post a comment on the question page as the bot user.
    """

    response = requests.post(
        f"{API_BASE_URL}/comments/",
        json={
            "comment_text": comment_text,
            "submit_type": "N",
            "include_latest_prediction": True,
            "question": question_id,
        },
        **AUTH_HEADERS,
    )
    response.raise_for_status()
    print("Comment posted for ", question_id)

def post_question_prediction(question_id, prediction_percentage):
    """
    Post a prediction value (between 1 and 100) on the question.
    """
    url = f"{API_BASE_URL}/questions/{question_id}/predict/"
    response = requests.post(
        url,
        json={"prediction": float(prediction_percentage) / 100},
        **AUTH_HEADERS,
    )
    response.raise_for_status()
    print("Prediction posted for ", question_id)


def get_question_details(question_id):
    """
    Get all details about a specific question.
    """
    url = f"{API_BASE_URL}/questions/{question_id}/"
    response = requests.get(
        url,
        **AUTH_HEADERS,
    )
    response.raise_for_status()
    return json.loads(response.content)

def list_questions(tournament_id=WARMUP_TOURNAMENT_ID, offset=0, count=1000):
    """
    List (all details) {count} questions from the {tournament_id}
    """
    url_qparams = {
        "limit": count,
        "offset": offset,
        "has_group": "false",
        "order_by": "-activity",
        "forecast_type": "binary",
        "project": tournament_id,
        "status": "open",
        "type": "forecast",
        "include_description": "true",
    }
    url = f"{API_BASE_URL}/questions/"
    response = requests.get(url, **AUTH_HEADERS, params=url_qparams)
    response.raise_for_status()
    data = json.loads(response.content)
    return data
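
The `find_number_before_percent` helper above returns the last integer that precedes a `%` sign. A variant that also clamps the value to the 1 to 99 range used by the prompts later in this notebook might look like the sketch below; `extract_probability` and its `low`/`high` parameters are hypothetical names introduced for illustration.

```python
import re


def extract_probability(text, low=1, high=99):
    """Return the last 'NN%' in text as an int clamped to [low, high], or None."""
    matches = re.findall(r"(\d+)%", text)
    if not matches:
        return None
    return max(low, min(high, int(matches[-1])))


print(extract_probability("Base rate is 20%, but I estimate 35%."))  # 35
print(extract_probability("Almost certain: 100%"))                    # 99
print(extract_probability("No number here"))                          # None
```

Taking the last match rather than the first reflects the assumption that a model states its final answer at the end of its reasoning.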

### IFP

In [5]:
class IFP:

    def __init__(self, question_id):
        self.question_id = question_id
        self.question_details = get_question_details(self.question_id)
        self.today = datetime.datetime.now().strftime("%Y-%m-%d")   
        self.title = self.question_details["title"]
        self.resolution_criteria = self.question_details["resolution_criteria"]
        self.background = self.question_details["description"]
        self.fine_print = self.question_details["fine_print"]

    def report(self):
        rpt = f"""
The future event is described by this question: [ {self.title} ]
The resolution criteria are: [ {self.resolution_criteria} ]
The background is: [ {self.background} ]"""
        if self.fine_print:
            rpt += f"""
The fine print is: [ {self.fine_print} ]"""
        return rpt

### LLMs

In [6]:
class LLM:
    def __init__(self, system_role):
        self.messages = [{"role": "system", "content": system_role}]

    def chat(self, query):
        self.messages.append({"role": "user", "content": query})
        text = self.message()
        self.messages.append({"role": "assistant", "content": text})
        return text

class Perplexity(LLM):
    def message(self):
        url = "https://api.perplexity.ai/chat/completions"
        headers = {
            "accept": "application/json",
            "authorization": f"Bearer {config.PERPLEXITY_API_KEY}",
            "content-type": "application/json",
        }
        payload = {"model": config.PERPLEXITY_MODEL, "messages": self.messages}
        response = requests.post(url=url, json=payload, headers=headers)
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]

class ChatGPT(LLM):
    def __init__(self, system_role):
        super().__init__(system_role)
        self.client = OpenAI(api_key=config.OPENAI_API_KEY)

    def message(self):
        chat_completion = self.client.chat.completions.create(
            model=config.OPENAI_MODEL,
            messages= self.messages)
        return chat_completion.choices[0].message.content

### Test questions

In [101]:
jul25 = [26568, 26569, 26570, 26571, 26572, 26573, 26574, 26575, 26576, 26577]

ifps = {id: IFP(id) for id in jul25}

### Agents

In [102]:
class Agent:
    def __init__(self, system_role, llm):
        self.llm = llm(system_role)

    def chat(self, prompt):
        return self.llm.chat(prompt)

#### Analyst

In [133]:
class Analyst(Agent):
    def __init__(self, llm):
        self.system_role = f"""
You are an open source intelligence analyst.
You summarize news related to questions about events.
Questions are given as separate lines formatted as |event|id|question|.
You will find and report any reliable information you can gather about these questions.
Reply with a line for each question formatted as |NEWS|id|news|.
Do not add a header line.  The news should be a detailed research report in a single paragraph of about 200 words.
"""
        super().__init__(self.system_role, llm)

    def research(self, ifps):
        prompt = '\n'.join([f"{ifp.question_id}: {ifp.title}" for ifp in ifps.values()])
        R = self.chat(prompt)
        R1 = R.split('\n\n')
        R2 = [x.replace('\n','') for x in R1]       
        R3 = [x.split('|') for x in R2]
        R4 = [(int(x),y) for _,_,x,_,y in R3]
        for id,news in R4:
            ifps[id].news = news
            print(id, news)

In [134]:
analyst = Analyst(ChatGPT)

In [135]:
analyst.research(ifps)

26568 As of October 2023, Chasiv Yar, located in the Donetsk region of Ukraine, remains contested between Ukrainian forces and Russian-backed separatists. The overall military situation in Eastern Ukraine is fluid with ongoing skirmishes and no clear indication of a decisive victory for either side. Based on current patterns, it is uncertain whether Russia will have control of Chasiv Yar on October 1, 2024. Predictive analysis would have to consider numerous variables, including international diplomatic developments, military strategies, and the continued support of Western nations to Ukraine.
26569 The Atlantic Ocean's sea surface temperature reached an unprecedented peak in 2023, breaking historical records due to ongoing climate change factors. Predictions for 2024 suggest that unless significant mitigation efforts are taken globally, temperatures may continue to rise. Climate models indicate a growing likelihood that the Atlantic could set new records in daily mean sea surface temp
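
The `research` method relies on the model following the `|NEWS|id|news|` format exactly, and a single malformed line makes the tuple unpacking fail. A more tolerant parser could look like the sketch below. It assumes each record arrives on one line; `parse_news_lines` is a hypothetical helper, not the notebook's own method.

```python
def parse_news_lines(reply):
    """Parse '|NEWS|id|news|' lines into {id: news}, skipping malformed lines."""
    news_by_id = {}
    for line in reply.splitlines():
        fields = [f.strip() for f in line.strip().strip("|").split("|")]
        # A valid record has the NEWS tag, a numeric id and some news text.
        if len(fields) < 3 or fields[0] != "NEWS" or not fields[1].isdigit():
            continue
        # Re-join the remainder in case the news text itself contains a '|'.
        news_by_id[int(fields[1])] = "|".join(fields[2:])
    return news_by_id


# Illustrative reply, not real model output.
reply = "|NEWS|1|First report.|\n\nnot a record\n|NEWS|2|Second report.|"
print(parse_news_lines(reply))  # {1: 'First report.', 2: 'Second report.'}
```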

#### Question relator

In [144]:
class QuestionRelator(Agent):
    def __init__(self, llm):
        self.system_role = f"""
You are prompted with a list of forecasting questions, each with an id and a title.
Label each question with an underlying event.
If the questions are for the same event, use the same name for the underlying event.
Please return as separate lines formatted as |event|id|title|, do not add any other formatting.
"""
        super().__init__(self.system_role, llm)

    def relate(self, ifps):
        prompt = '\n'.join([f"{ifp.question_id}: {ifp.title}" for ifp in ifps.values()])
        KL = self.chat(prompt)
        K1 = [x.split('|') for x in KL.split('\n')]
        K2 = [(int(id),event) for _,event,id,_,_ in K1]
        for id,event in K2:
            ifps[id].event = event
            print(id, event)

qr = QuestionRelator(ChatGPT)

qr.relate(ifps)

26568 Russia's control of Chasiv Yar
26569 Atlantic Ocean sea surface temperature
26570 2024 Warhammer 40,000 World Team Championship winner
26571 2024 Warhammer 40,000 World Team Championship winner
26572 2024 Warhammer 40,000 World Team Championship winner
26573 2024 Warhammer 40,000 World Team Championship winner
26574 H5 human cases in the United States
26575 H5 human cases in the United States
26576 H5 human cases in the United States
26577 H5 human cases in the United States


#### Superforecaster

In [353]:
class Superforecaster(Agent):
    def __init__(self, llm):
        self.system_role = f"""
You are a superforecaster.  
You assign a probability to questions about events.
Questions are given as separate groups of lines formatted as |FORECAST|event|id|question|news|criteria|background|fineprint|.
Groups are separated by '^^^'.
Questions which are about the same event should be assigned consistent probabilities.
Reply to questions with |ASSESSMENT|id|ZZ|rationale| where ZZ is an integer probability from 1 to 99 and rationale is your reasoning for the forecast.
Separate each question with '^^^'.
Do not add any additional headings or group labels or other formatting.
After your initial forecast you may receive feedback of form |CRITIC|id|feedback|.
Reply to each feedback with |id|ZZ|rationale| where ZZ is an integer probability from 1 to 99
and rationale is a revised assessment which may be adjusted from a prior assessment due to the feedback unless the feedback is "I concur".
"""
        super().__init__(self.system_role, llm)

    def forecast(self, ifps):
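        # Fields follow the order promised in the system prompt, including the question title.
        prompt = '^^^'.join([f"|FORECAST|{ifp.event}|{ifp.question_id}|{ifp.title}|{ifp.news}|{ifp.resolution_criteria}|{ifp.background}|{ifp.fine_print}|" for ifp in ifps.values()])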
        self.F0 = self.chat(prompt)
        self.F1 = [x.strip().replace('\n', '') for x in self.F0.split('^^^')]
        self.F2 = [x.split('|') for x in self.F1] 
        self.F3 = [[x for x in y if x] for y in self.F2]
        self.F4 = [(int(id),int(forecast),rationale) for _, id, forecast, rationale in self.F3]
        for id, forecast, rationale in self.F4:
            ifps[id].forecast = forecast
            ifps[id].rationale = rationale
            print(id, forecast, rationale)

    def reassess(self, ifp):
        prompt = f"|CRITIC|{ifp.question_id}|{ifp.feedback}|"
        self.R0 = self.chat(prompt)
        # Expect |id|ZZ|rationale|; update the IFP passed in rather than a global dict.
        _, fcst, rationale = [x for x in self.R0.strip().split('|') if x]
        ifp.forecast = int(fcst)
        ifp.rationale = rationale
        print(ifp.question_id, ifp.forecast, rationale)

In [354]:
sf = Superforecaster(ChatGPT)

In [355]:
sf.forecast(ifps)

26568 35 The fighting in Chasiv Yar remains intense and fluid, with Russia making incremental advances but not maintaining decisive control. The uncertainty around future support to Ukraine and the unpredictable nature of the conflict make it less than likely, but not highly unlikely, that Russia will have firm control by October 1, 2024.
26569 85 The consistent trend of rising sea surface temperatures, ongoing climate change, and recent record-breaking patterns strongly suggest that the Atlantic Ocean is likely to surpass its 2023 peak temperature before October 1, 2024.
26570 15 Although Poland has historically performed well in Warhammer 40,000 tournaments, the highly competitive nature of the WTC and the many strong contenders make it less likely Poland will win the 2024 championship.
26571 20 Germany is a strong competitor but has not consistently dominated the WTC tournaments. Combined with the competitive nature of the event, there is a moderate but limited probability they will

In [337]:
for ifp in ifps.values():
    if ifp.feedback != 'I concur':
        sf.reassess(ifp)

26570 30 Acknowledging Poland's history of strong performances and the rigorous nature of the competition, their chances are higher than initially assessed. Although they face significant competition, historical success warrants a revised, higher probability.
26571 25 Considering Germany's strong presence in the Warhammer 40,000 competitive scene and their historical performances, their chances of winning are higher than initially assessed. While they face tough competition, their proven track record warrants a more optimistic probability.
26576 15 Given the current biosecurity measures and monitoring systems in place to prevent widespread transmission, the probability of 21-100 cases is lower than initially assessed. While some risk remains, the likelihood is reduced due to effective containment efforts.


#### Critic

In [347]:
class Critic(Agent):
    def __init__(self, llm):

        self.system_role = f"""
You are a very smart and worldly person reviewing a superforecaster's assignment of probabilities to events.
You will receive an event with probabilities given as |event|id|question|zz|rationale|news|criteria|background|fineprint|.
zz is an integer probability from 1 to 99 and rationale is the superforecaster's logic for assigning a probability of zz.
You will reply with a line |id|feedback| where feedback is "I concur" if you see no problem with the rationale and zz; otherwise, feedback presents possible problems with the rationale and zz.
"""
        super().__init__(self.system_role, llm)

    def feedback(self, ifp):
        prompt = f"|{ifp.event}|{ifp.question_id}|{ifp.title}|{ifp.forecast}|{ifp.rationale}|{ifp.news}|{ifp.resolution_criteria}|{ifp.background}|{ifp.fine_print}|"
        self.fb = self.chat(prompt)
        self.fb1 = self.fb.split('|')
        self.fb2 = [x for x in self.fb1 if x]
        try:
            # Expect |id|feedback|; store on the IFP passed in rather than a global dict.
            _, feedback = self.fb2
            ifp.feedback = feedback.strip()
            print(ifp.question_id, ifp.feedback)
        except ValueError:
            # Malformed reply: treat it as agreement so the refinement loop can stop.
            print('problem', self.fb2)
            ifp.feedback = 'I concur'

In [348]:
critic = Critic(Perplexity)

In [349]:
for ifp in ifps.values():
    critic.feedback(ifp)

26568 The probability of 50 seems reasonable given the ongoing conflict and the fluid nature of the military situation in Eastern Ukraine. The recent reports of Russian advances and Ukrainian retreats in Chasiv Yar suggest that the situation is volatile, and it is difficult to predict with certainty whether Russia will have control of the town on October 1, 2024. The continued support of Western nations to Ukraine and the potential impact of diplomatic developments on the conflict also add to the uncertainty. Overall, a probability of 50 reflects the current state of uncertainty and the need for ongoing monitoring of the situation to make more accurate predictions.
26569 The probability of 85 seems reasonable given the ongoing record-breaking heat in the Atlantic Ocean. The recent data shows that the daily mean sea surface temperature has been consistently high, with over 90% of the tropical Atlantic experiencing record or near-record warm temperatures. Climate models and seasonal fore

### Forecasting process

In [None]:
ifps = {id: IFP(id) for id in jul25}

In [361]:
def forecasting(ifps):
    max_tries = 4
    
    analyst = Analyst(ChatGPT)
    analyst.research(ifps)
    
    qr = QuestionRelator(ChatGPT)
    qr.relate(ifps)
    
    sf = Superforecaster(ChatGPT)
    sf.forecast(ifps)
    
    for ifp in ifps.values():
        print("Refining", ifp.question_id)
        ifp.feedback = ''
        critic = Critic(Perplexity)
        for i in range(max_tries):
            print("Pass", i, "of", max_tries, "on", ifp.question_id)
            if ifp.feedback == 'I concur':
                break
            critic.feedback(ifp)
            if ifp.feedback == 'I concur':
                break
            sf.reassess(ifp)
        print("===============================================")

    for ifp in ifps.values():
        print(ifp.question_id, ifp.title)
        print("Forecast", ifp.forecast)
        print("Rationale", ifp.rationale, '\n')
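
The refinement loop in `forecasting` alternates between the critic and the superforecaster until the critic replies "I concur" or `max_tries` passes have been used. The control flow can be sketched in isolation with stub functions in place of the LLM calls; `refine`, `stub_critic` and `stub_reassess` are hypothetical names used only for this illustration.

```python
def refine(forecast, critic, reassess, max_tries=4):
    """Alternate critic feedback and reassessment until the critic concurs."""
    for attempt in range(max_tries):
        feedback = critic(forecast)
        if feedback == "I concur":
            return forecast, attempt + 1
        forecast = reassess(forecast, feedback)
    return forecast, max_tries


def stub_critic(probability):
    # Stub: objects until the forecast reaches at least 30.
    return "I concur" if probability >= 30 else "Too low"


def stub_reassess(probability, feedback):
    # Stub: raises the forecast by 10 points per objection.
    return probability + 10


print(refine(15, stub_critic, stub_reassess))  # (35, 3)
```

The second element of the result is the number of critic passes used, which makes it easy to see whether the loop ended by agreement or by exhausting `max_tries`.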

In [363]:
def upload(ifp):
    post_question_prediction(ifp.question_id, ifp.forecast)
    post_question_comment(ifp.question_id, ifp.rationale)

In [364]:
def uploads(ifps):
    for ifp in ifps.values():
        upload(ifp)
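
`SUBMIT_PREDICTION` is defined earlier in the notebook, but `uploads` does not consult it. One way to add a dry-run mode is sketched below. The posting functions are passed in as arguments so the example runs without network access; `upload_all` and the stub functions are hypothetical, and the question id and rationale are illustrative values.

```python
def upload_all(forecasts, post_prediction, post_comment, submit=True):
    """Post each (forecast, rationale) pair, or only report it when submit is False."""
    posted = []
    for question_id, (forecast, rationale) in forecasts.items():
        if not submit:
            print("Dry run:", question_id, forecast)
            continue
        post_prediction(question_id, forecast)
        post_comment(question_id, rationale)
        posted.append(question_id)
    return posted


# Stub posting functions that record calls instead of contacting the API.
calls = []


def fake_prediction(question_id, percentage):
    calls.append(("predict", question_id, percentage))


def fake_comment(question_id, text):
    calls.append(("comment", question_id, text))


forecasts = {101: (40, "Example rationale")}
print(upload_all(forecasts, fake_prediction, fake_comment, submit=False))  # []
print(upload_all(forecasts, fake_prediction, fake_comment, submit=True))   # [101]
```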

## Daily forecast

### Get IFP ids

In [365]:
ifps = list_questions()['results']
today_ids = sorted(x['id'] for x in ifps)
# today_ids = [25876, 25877, 25875, 25873, 25871, 25878, 25874, 25872] # 08JUL24
# today_ids = [26006, 25936, 25935, 25934, 25933, 26004, 26005] # 09JUL24
# today_ids = [25955, 25956, 25957, 25960, 25959, 25954, 25953, 25952, 25958] # 10JUL24
# today_ids = [26019, 26018, 26017, 26020, 26022, 26021, 26023, 26024] # 11JUL24
# today_ids = [26095, 26096, 26097, 26098, 26099, 26100, 26101, 26102] # 12JUL24
# today_ids = [26133, 26134, 26138, 26139, 26140, 26157, 26158, 26159] # 15JUL24
# today_ids = [26189, 26190, 26191, 26192, 26193, 26194, 26195, 26196] # 16JUL24
# today_ids = [26210, 26211, 26212, 26213, 26214, 26215, 26216] # 17JUL24
# today_ids = [26232, 26233, 26234, 26235, 26236] # 18JUL24
# today_ids = [26302, 26303, 26304, 26305, 26306, 26307] # 19JUL24
# today_ids = [26387, 26388, 26389, 26390, 26391, 26392] # 22JUL24
# today_ids = [26404, 26405, 26406, 26407, 26408] # 23JUL24
# today_ids = [26550, 26551, 26552, 26553, 26554, 26555] # 24JUL24
# today_ids = [26568, 26569, 26570, 26571, 26572, 26573, 26574, 26575, 26576, 26577] # 25JUL24
# today_ids = [26638, 26639, 26640, 26641, 26642, 26643, 26644, 26645, 26646] # 26JUL24

In [366]:
today_ids

[26638, 26639, 26640, 26641, 26642, 26643, 26644, 26645, 26646]

## Forecast

In [367]:
ifps = {id: IFP(id) for id in today_ids}

In [368]:
forecasting(ifps)

26638 As of now, there are ongoing legal challenges to provisions of the Clean Air Act, particularly those that involve the Environmental Protection Agency's (EPA) authority to regulate greenhouse gas emissions. A key case to watch is West Virginia v. Environmental Protection Agency, argued before the Supreme Court in 2022, which questions the extent of the EPA's authority under the Clean Air Act. Chevron deference, a judicial doctrine that compels courts to defer to a federal agency's reasonable interpretation of ambiguous laws, remains a pivotal issue in these cases. If the Chevron doctrine is overturned or significantly narrowed, it could impact the outcome of such challenges. Notably, legal experts expect that the Supreme Court's decision in the matter may trigger a wave of lower federal court cases re-evaluating prior rulings under a new framework. However, the specific timeline for any ruling before October 1, 2024, remains uncertain, although the courts' calendars and the urgenc

## Next time

Add an agent to summarize the back and forth between critic and forecaster into a single cogent rationale that incorporates all that was discussed.
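
A possible starting point for that agent is a function that flattens the exchange into a single prompt. This is only a sketch of the idea: `build_summary_prompt`, the `(speaker, text)` tuple format and the wording of the instruction are all assumptions, and the notebook does not currently record the exchange in this form.

```python
def build_summary_prompt(question, exchanges):
    """Format a critic/forecaster exchange as one prompt for a summarizing agent."""
    lines = [f"Question: {question}", "Discussion:"]
    for turn, (speaker, text) in enumerate(exchanges, start=1):
        lines.append(f"{turn}. {speaker}: {text}")
    lines.append("Write one cogent rationale that incorporates all points above.")
    return "\n".join(lines)


# Illustrative exchange, not real model output.
exchanges = [
    ("Superforecaster", "15: strong field of contenders."),
    ("Critic", "Past results suggest a higher chance."),
    ("Superforecaster", "30: revised upward given track record."),
]
print(build_summary_prompt("Will team X win?", exchanges))
```

The resulting string could be passed to the `chat` method of an `Agent` whose system role asks for a single consolidated rationale.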

## Upload

In [369]:
uploads(ifps)

Prediction posted for  26638
Comment posted for  26638
Prediction posted for  26639
Comment posted for  26639
Prediction posted for  26640
Comment posted for  26640
Prediction posted for  26641
Comment posted for  26641
Prediction posted for  26642
Comment posted for  26642
Prediction posted for  26643
Comment posted for  26643
Prediction posted for  26644
Comment posted for  26644
Prediction posted for  26645
Comment posted for  26645
Prediction posted for  26646
Comment posted for  26646


In [370]:
len(today_ids)

9