<a href="https://colab.research.google.com/github/granade/metaculus-bot-forecaster/blob/main/granade_bot_v14.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## AI Forecasting Bot Template


This is a simple bot template that you can use to forecast in the Metaculus AI Benchmarking Warmup Contest. It is a single-shot GPT prompt that you are encouraged to experiment with!

In order to run this notebook as is, you'll need to enter a few API keys (use the key icon on the left to input them):

- `METACULUS_TOKEN`: you can find your Metaculus token under your bot's user settings page: https://www.metaculus.com/accounts/settings/, or on the bot registration page where you created the account: https://www.metaculus.com/aib/
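- `OPENAI_API_KEY`: get one from OpenAI's API keys page: https://platform.openai.com/settings/profile?tab=api-keys
- `PERPLEXITY_API_KEY`: used to search for up-to-date information about the question. Get one from https://www.perplexity.ai/settings/api
- `ASKNEWS_CLIENT_ID` and `ASKNEWS_SECRET`: credentials for the AskNews SDK, which the setup code uses to pull news articles
- `CLAUDE_API_KEY`: used by the setup code to create the Anthropic client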
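
If you also want to run the notebook outside Colab, one option is a small lookup helper that tries Colab's secrets first and falls back to environment variables. The sketch below is an illustration only: `get_secret` is a hypothetical helper name and is not part of the template, and the fallback order is an assumption.

```python
import os


def get_secret(name, default=None):
    """Return a secret from Colab's userdata if available, else from os.environ.

    Hypothetical helper for illustration; the template itself reads the keys directly.
    """
    try:
        # Only importable when running inside Google Colab
        from google.colab import userdata
    except ImportError:
        return os.environ.get(name, default)

    try:
        value = userdata.get(name)
    except Exception:
        # Secret missing or notebook access not granted; fall back to the environment
        value = None
    return value if value is not None else os.environ.get(name, default)
```

For example, `METACULUS_TOKEN = get_secret("METACULUS_TOKEN")` would work both in Colab and in a local session where the variable has been exported.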




In [17]:
# Install necessary libraries
!pip install -U -q pydrive2
!pip install -qU openai asknews
!pip install tiktoken
!pip install anthropic==0.3.10
!pip install lxml_html_clean

# Import necessary libraries
import csv
import datetime
import json
import os
import re
from datetime import timedelta

import anthropic
import requests
import tiktoken
from asknews_sdk import AskNewsSDK
from openai import OpenAI

# use the below to detect if it's being run in google colab
def in_colab():
    try:
        import google.colab
        from pydrive2.auth import GoogleAuth
        from pydrive2.drive import GoogleDrive
        from google.colab import auth
        from google.colab import drive
        from oauth2client.client import GoogleCredentials
        return True
    except ImportError:
        return False

def load_secrets(secrets_path):
    try:
        with open(secrets_path, 'r') as secrets_file:
            secrets = json.loads(secrets_file.read())
            for k, v in secrets.items():
                os.environ[k] = v
    except Exception as e:
        print(f"Error loading secrets from {secrets_path}: {e}")

if in_colab():
    from google.colab import userdata

try:
    if 'secretsPath' in globals():
        print(f"secretsPath exists: {secretsPath}")
        load_secrets(secretsPath)
        # os.environ is a mapping, so index it with [] (a missing key raises KeyError)
        METACULUS_TOKEN = os.environ['METACULUS_TOKEN']
        OPENAI_API_KEY = os.environ['OPENAI_API_KEY']
        PERPLEXITY_API_KEY = os.environ['PERPLEXITY_API_KEY']
        ASKNEWS_CLIENT_ID = os.environ['ASKNEWS_CLIENT_ID']
        ASKNEWS_SECRET = os.environ['ASKNEWS_SECRET']
        CLAUDE_API_KEY = os.environ['CLAUDE_API_KEY']
        # And Other Keys Saved In GitHub Secrets
    else:
        raise NameError("secretsPath not defined")
except NameError:
    print("Loading secrets from userdata (Google Colab)")
    METACULUS_TOKEN = userdata.get('METACULUS_TOKEN')
    OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
    PERPLEXITY_API_KEY = userdata.get('PERPLEXITY_API_KEY')
    ASKNEWS_CLIENT_ID = userdata.get('ASKNEWS_CLIENT_ID')
    ASKNEWS_SECRET = userdata.get('ASKNEWS_SECRET')
    CLAUDE_API_KEY = userdata.get('CLAUDE_API_KEY')
except KeyError as e:
    print(f"Missing required environment variable: {e}")

AUTH_HEADERS = {"headers": {"Authorization": f"Token {METACULUS_TOKEN}"}}
API_BASE_URL = "https://www.metaculus.com/api2"
WARMUP_TOURNAMENT_ID = 3294
# 3294 --> AI Test Bot Page
# 3349 --> AI Bot Competition
# 3366 --> Regular Quarterly Tournament
SUBMIT_PREDICTION = False

Collecting lxml_html_clean
  Downloading lxml_html_clean-0.2.2-py3-none-any.whl.metadata (1.8 kB)
Downloading lxml_html_clean-0.2.2-py3-none-any.whl (13 kB)
Installing collected packages: lxml_html_clean
Successfully installed lxml_html_clean-0.2.2
Loading secrets from userdata (Google Colab)


### ChatGPT Prompt

You can change the prompt below to experiment. Key parameters that you can include in your prompt are:

*   `{title}` The question itself
*   `{summary_report}` An up-to-date news compilation generated from Perplexity
*   `{background}` The background section of the Metaculus question. This comes from the `description` field on the question
*   `{fine_print}` The fine print section of the question
*   `{today}` Today's date. Remember that your bot doesn't know the date unless you tell it explicitly!
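
As a rough illustration of how these placeholders get filled, the sketch below formats a shortened stand-in template with Python's `str.format`. `DEMO_TEMPLATE`, `fill_demo_prompt`, and the sample question are hypothetical and only mirror the field names listed above. Note that literal braces in a template (for example, a JSON skeleton) have to be doubled as `{{` and `}}` so that `format` does not treat them as placeholders.

```python
import datetime

# Shortened stand-in for the notebook's prompt templates (illustration only)
DEMO_TEMPLATE = """
Question: {title}
Background: {background}
Fine print: {fine_print}
Today is {today}.

Respond in JSON:
{{
"probability": "integer between 0 and 100"
}}
"""


def fill_demo_prompt(question, today=None):
    """Fill the demo template from a question dict shaped like the Metaculus API response."""
    today = today or datetime.date.today().strftime("%Y-%m-%d")
    return DEMO_TEMPLATE.format(
        title=question["title"],
        background=question.get("description", ""),  # background comes from `description`
        fine_print=question.get("fine_print", ""),
        today=today,
    )


# Made-up question used only to show the substitution
demo_question = {
    "title": "Will it rain in Paris tomorrow?",
    "description": "Paris sees rain on many days each year.",
    "fine_print": "Any measurable precipitation counts.",
}
print(fill_demo_prompt(demo_question, today="2024-08-01"))
```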


**IMPORTANT**: As you experiment with changing the prompt, be aware that the last number output by GPT will be used as the forecast probability. The last line in the template specifies that.
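
To make the "last number wins" rule concrete, here is a minimal sketch of pulling the final percentage out of a model response. It is a variant of the notebook's `find_number_before_percent` helper, not a copy: the name `last_percent`, the support for decimals, and the clamping to the 0-100 range are assumptions added for illustration.

```python
import re


def last_percent(text):
    """Return the last number followed by '%' in text, clamped to [0, 100], or None."""
    # Accept integers or decimals, with optional whitespace before the percent sign
    matches = re.findall(r"(\d+(?:\.\d+)?)\s*%", text)
    if not matches:
        return None
    value = float(matches[-1])
    return min(max(value, 0.0), 100.0)


sample = "The base rate is about 20%. After weighing the news: Probability: 35%"
print(last_percent(sample))
```

Because earlier percentages such as a base rate are ignored, a prompt that asks the model to state its final probability last keeps the extraction predictable.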


In [18]:
ASKNEWS_CLEANUP_PROMPT = """
You are a professional researcher, who works for a forecaster.  The forecaster is working
on the following question:
{title}

Today is {today}.

As part of the forecasting process, the researcher pulled ~50 articles from a database
that contains newspapers and other media articles from around the world.

The database is good at pulling lots of articles but not at identifying the most important articles.
That's where you come in.

I need you to read through all the articles and send back the
relevant ones given the question your client (the professional forecaster) is trying to answer.
Your answer could be anywhere from 0 to 50 articles.

Here is the stack of articles:
{llm_context}

Your final response should present the relevant articles in a format similar to the
format you received them. Please present the title, summary, source, published date.
There is no need to include the other fields in your final response.

A few things to think about in your process.  First, you should weigh the quality of the sources.
Some media is better than others so be sure to pull from higher quality sources.
Second, try to get a wide variety of perspectives.  Divergent views are useful -- even
helpful.  Third, you should not develop a forecast; you should present the relevant articles.

"""

ASKNEWS_HOTNEWS_CLEANUP_PROMPT = """
You are a professional researcher, who works for a forecaster.  The forecaster is working
on the following question:
{title}

Today is {today}.

As part of the forecasting process, the researcher pulled ~10 articles from a media database that contains newspapers and other media articles from around the world.
All the articles are from the past 48 hours.  The idea is to get the latest news related to the topic so he or
she is fully up-to-date.

The database is good at pulling lots of articles but not at identifying the most important articles.
That's where you come in.

I need you to read through all the articles and send back the
relevant ones given the question your client (the professional forecaster) is trying to answer.

You should put an emphasis on breaking news.  What is the latest news related to this question?

Your answer could be anywhere from 0 to 10 articles.

Here is the stack of articles:
{hotnews}

Your final response should present the relevant articles in a format similar to the
format you received them. Please present the title, summary, source, published date.
There is no need to include the other fields in your final response.

"""

PROMPT_TEMPLATE = """
You are a professional forecaster, and I need your help making a prediction.

Your goal is to make an accurate prediction. To do this, you evaluate past data
and trends carefully, make use of comparison classes of similar events, take into
account base rates about how past events unfolded, and outline the best reasons
for and against any particular outcome. You know that great forecasters don't
just forecast according to the "vibe" of the question -- they do the work.
They think about the question in a structured way, recording their
reasoning as they go, and they always consider multiple perspectives. You do not
need to hedge your uncertainty, you are simply trying to give the most accurate
probability.  Your answer will be evaluated later when the event unfolds.

Here is some information about the question.

The question is:
{title}

Here is some background on the question:
{background}

Here is how the question gets resolved:
{resolution_criteria}

Here is the fine print on the question:
{fine_print}

Today is {today}.

Your research assistant did some work to establish a fact-base on the question
you're trying to answer.  Specifically, you sent her some important questions and she
did work to answer each of them.  Here's what she found:

{summary_report}

I also thought it would be useful for you to have some background articles and other reporting on the question.
I asked another research assistant to pull some headlines and article summaries from
a wide range of media sources related to the question you're trying to forecast.
The database he used has articles from the last year.  Here they are:

{llm_context_cleaned}

I asked your research assistant also to pull a few headlines from the past
48 hours related to the question you're trying to forecast. Here are the headlines and a
summary of each article: {hotnews_cleaned}.

Let's now go through some steps that good forecasters use to answer a question.
I am going to lay out some questions I would like you to answer and to think
about before you give a probability. You should give an explicit answer to all
of these questions -- and think about them carefully -- before you give your
probability. Showing your work will help you develop a better answer.

1. Given the question above, please rephrase and expand the question to help
you do a better job answering it.  Maintain all of the information in the
original question.

2. Think about the default resolution, meaning if the question resolved today how
would it resolve.

3.  The time element is always important in prediction.  So make sure you know
today's date.  Then think about how much time is left until this question gets resolved.
In other words, if nothing else changes, what is the resolution?

4. Using your knowledge of the world and topic, as well as the information provided,
including the information provided by your research assistant, list a few reasons
why the answer might be NO.  Rate the strength of those reasons.

5. Using your knowledge of the world and topic, as well as the information provided,
including the information provided by your research assistant, list a few reasons
why the answer might be YES.  Rate the strength of those reasons.

6. Think about historical data and determine a base rate for the question.  Here's a
good definition of base rate: a base rate is the fundamental likelihood of an event
occurring based on historical data.

7. Your research assistant created an important file of factual information, which
I provided you above. Your other research assistant pulled headlines and articles.
How do these inputs change your thinking about the question, if at all?

8. Another important thing to think about is recent developments -- essentially
breaking news. Above I gave you a few headlines from the past
48 hours related to the question you're trying to forecast. Sometimes they will
be very relevant; sometimes they will be completely irrelevant. It's your job to
sort that out. How do these headlines impact your prediction, if at all?

9. You know that the resolution criteria and fine print of a question often
contain important edge cases that should be considered. Considering the resolution criterion and
fine-print provided to you above, how do you think that impacts the probabilities of a given
outcome here?

10. Now aggregate your considerations.  Think like a superforecaster (e.g., Nate
Silver, Phil Tetlock).  Based on everything you've learned in steps 1 through 9,
give us your best answer.  You should aggregate your answer into a probability between 0% (very, very unlikely) and 100% (very, very likely).
Be sure to answer the question as it is phrased -- i.e., provide a probability for
the question you're trying to answer (not the inverse).

***Do not predict NONE or NO.***
You should always provide a number.  That said, the number can be very low or very high. Don't be
afraid to go to the extremes if your analysis suggests so.

Thanks for your help.

Follow these steps when generating an output:

1) **show your work** Provide your analysis based on each of the steps described above i.e., write out an answer to each step.
Then given the question, all the material provided to you, and your step-by-step work
provide your expert forecast on whether or not the resolution criteria will be achieved and your rationale.
Overall "show your work" will be several paragraphs long.  That's okay -- take your time and write out what you need to write out.
2) **determine a forecast probability** Given the resolution criteria and your rationale, determine the probability (likelihood) that the resolution criteria will be achieved; this is an integer between 0 and 100.

Output your response in the following JSON structure:

{{
"rationale": "string",
"probability": "integer between 0 and 100"
}}

"""

PROMPT_TEMPLATE2 = """
You are a professional forecaster. Your goal is to make an accurate prediction
on an important question. To do this, you evaluate past data and trends, make
use of comparison classes of similar events, take into account base rates about
how past events unfolded, and outline the best reasons for and against any particular outcome.

I am going to give you some information about the question I need you to forecast.

Your question is:
{title}

background:
{background}

Today is {today}.

Before you do any forecasting, your research assistant is going to do some work
to discover important background information for your forecast.  You need to give
the research assistant guidance on what would be most helpful to you.  The more explicit
you are about what you want / need to know, the more likely your assistant is to help you.

Given the question you need to forecast and the background -- as well as everything
you know about forecasting -- what are five or so questions your research assistant
could help you with?  These questions should primarily be factual in nature -- things you can
look up in news sources, encyclopedias, press releases, on the internet, etc. Please list them
in order of importance.

Your response should be in the form of instructions to your research assistant, which
I will pass on directly.  You should begin with some context on what
you are trying to do.  Then list the ~5 or so questions you want him or her to research.
You also should point out a couple of best practices to the research assistant:
First, the research assistant should use a wide range of high quality sources -- especially
news sources.  Second, he or she should not develop their own forecast. You need the research assistant to develop
a useful factbase; you will then do the forecasting.  Include all of this in your instructions.

"""

PROMPT_TEMPLATE4 = """
You are a professional forecaster, and I need your help making a prediction.
Your goal is to make an accurate prediction.

Here is some information about the question I need you to predict.

The question is:
{title}

Here is some background on the question:
{background}

Here is how the question gets resolved:
{resolution_criteria}

Today is {today}.

You know that examining the reasoning of other
forecasters is an excellent way to improve your own forecast. Below I have provided the reasoning from several other forecasters who predicted on the same question.
Examine their reasoning and use it to inform your own, using your expertise as a forecaster to assess which reasoning seems strongest and which seems flawed,
as well as which reasoning seems to incorporate the most accurate information about base rates and historic reference classes. Construct your own reasoning and forecast,
describing your reasoning step by step and incorporating the strongest arguments from the other forecasters in a way that improves your own reasoning. First produce a
one sentence summary of the reasoning of each forecaster, then describe your forecast.

You have a team of forecasters who work for you.  In each case, they went through
a multi-step process to make a prediction. They wrote up their work at every step.
They also included a brief summary of their logic.

Here are the forecasters' predictions and logic:

Forecaster 1:
{forecaster1}

Forecaster 2:
{forecaster2}

Forecaster 3:
{forecaster3}

Forecaster 4:
{forecaster4}

Forecaster 5:
{forecaster5}

Your job is to review their work and then develop your own prediction.

Here is one thing to think about in particular -- if the other forecasters have scores that are less than 10% or more than 90%
chances are they are suffering from a failure to extremize the forecast.  What does that mean?
When you get near extremes -- like 0% and 100% -- forecasters tend to hedge.  They tend to add or subtract
a few percentage points.  That's dangerous.  Tell us what you really think!

Now aggregate your considerations.  Think like a superforecaster (e.g., Nate
Silver, Phil Tetlock).  Based on everything you've learned above, give us your best answer.
You should aggregate your answer into a probability between 0% (very, very unlikely) and 100% (very, very likely).
Be sure to answer the question as it is phrased -- i.e., provide a probability for
the question you're trying to answer (not the inverse).

Do not predict NONE or NO. You should always provide a number.  That said, the number can be very low or very high. Don't be
afraid to go to the extremes if your analysis suggests so.

Thanks for your help.


Follow these steps when generating output:

1) **provide rationale** This should have three parts.  Part I: Start by offering a one sentence summary of each of the forecasts submitted to you.  Part II: Then comment overall on the best and worst arguments.
Part III: Given the question and everything else submitted to you, provide your expert forecasting rationale behind whether or not the resolution criteria will be achieved.
2) **determine a forecast probability** Given the resolution criteria and your rationale, determine the probability (likelihood) that the resolution criteria will be achieved; this is an integer between 0 and 100.

Output your response in the following JSON structure:

{{
"rationale": "string",
"probability": "integer between 0 and 100"
}}


"""

CLAUDE_PROMPT_TEMPLATE = """
\n\nHuman:
You are a professional forecaster, and I need your help making a prediction.

Your goal is to make an accurate prediction. To do this, you evaluate past data
and trends carefully, make use of comparison classes of similar events, take into
account base rates about how past events unfolded, and outline the best reasons
for and against any particular outcome. You know that great forecasters don't
just forecast according to the "vibe" of the question -- they do the work.
They think about the question in a structured way, recording their
reasoning as they go, and they always consider multiple perspectives. You do not
need to hedge your uncertainty, you are simply trying to give the most accurate
probability.  Your answer will be evaluated later when the event unfolds.

Here is some information about the question.

The question is:
{title}

Here is some background on the question:
{background}

Here is how the question gets resolved:
{resolution_criteria}

Here is the fine print on the question:
{fine_print}

Today is {today}.

Your research assistant did some work to establish a fact-base on the question
you're trying to answer.  Specifically, you sent her some important questions and she
did work to answer each of them.  Here's what she found:

{summary_report}

I also thought it would be useful for you to have some background articles and other reporting on the question.
I asked another research assistant to pull some headlines and article summaries from
a wide range of media sources related to the question you're trying to forecast.
The database he used has articles from the last year.  Here they are:

{llm_context_cleaned}

I asked your research assistant also to pull a few headlines from the past
48 hours related to the question you're trying to forecast. Here are the headlines and a
summary of each article: {hotnews_cleaned}.

Let's now go through some steps that good forecasters use to answer a question.
I am going to lay out some questions I would like you to answer and to think
about before you give a probability. You should give an explicit answer to all
of these questions -- and think about them carefully -- before you give your
probability. Showing your work will help you develop a better answer.

When you state or restate the question in your output, please do not write "%"; instead,
spell out "percent" ... except in your final answer, where you should write "%" as instructed below.  Thanks.

1. Given the question above, please rephrase and expand the question to help
you do a better job answering it.  Maintain all of the information in the
original question.

2. Think about the default resolution, meaning if the question resolved today how
would it resolve.

3.  The time element is always important in prediction.  So make sure you know
today's date.  Then think about how much time is left until this question gets resolved.
In other words, if nothing else changes, what is the resolution?

4. Using your knowledge of the world and topic, as well as the information provided,
including the information provided by your research assistant, list a few reasons
why the answer might be NO.  Rate the strength of those reasons.

5. Using your knowledge of the world and topic, as well as the information provided,
including the information provided by your research assistant, list a few reasons
why the answer might be YES.  Rate the strength of those reasons.

6. Think about historical data and determine a base rate for the question.  Here's a
good definition of base rate: a base rate is the fundamental likelihood of an event
occurring based on historical data.

7. Your research assistant created an important file of factual information, which
I provided you above. Your other research assistant pulled headlines and articles.
How do these inputs change your thinking about the question, if at all?

8. Another important thing to think about is recent developments -- essentially
breaking news. Above I gave you a few headlines from the past
48 hours related to the question you're trying to forecast. Sometimes they will
be very relevant; sometimes they will be completely irrelevant. It's your job to
sort that out. How do these headlines impact your prediction, if at all?

9. You know that the resolution criteria and fine print of a question often
contain important edge cases that should be considered. Considering the resolution criterion and
fine-print provided to you above, how do you think that impacts the probabilities of a given
outcome here?

10. Now aggregate your considerations.  Think like a superforecaster (e.g., Nate
Silver, Phil Tetlock).  Based on everything you've learned in steps 1 through 9,
give us your best answer. You should write your answer as: "Probability: ZZ%", 0-100.

A few critical things to remember:
a) Do not predict NONE or NO.
b) You should always provide a number.
c) That said, the number can be very low or very high. Don't be
afraid to go to the extremes if your analysis suggests so.
d) Be sure to answer the question as it is phrased -- i.e., provide a probability for
the question you're trying to answer (not the inverse).

You should also provide a three or four sentence summary of why you think that probability is correct.

Thanks for your help!

\n\nAssistant:
"""


## Some setup code

This section sets up some simple helper code you can use to get data about forecasting questions and to submit a prediction.
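
As a small illustration of the submission step, the sketch below converts a percentage into the fractional `prediction` payload that the notebook's `post_question_prediction` sends. The helper name `build_prediction_payload` and the clamping bounds of 1 and 99 percent are assumptions for illustration; check the Metaculus API for the range it actually accepts.

```python
def build_prediction_payload(prediction_percentage, low=1.0, high=99.0):
    """Turn a percentage (0-100) into a {"prediction": fraction} payload.

    The low/high clamp is an assumed safety margin, not a documented API limit.
    """
    pct = float(prediction_percentage)
    pct = min(max(pct, low), high)  # keep the forecast away from exactly 0% or 100%
    return {"prediction": pct / 100}


print(build_prediction_payload(35))
print(build_prediction_payload("0"))
```

Clamping before submitting is one way to avoid posting a hard 0% or 100% when the model's output is extreme or malformed.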

In [19]:


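def write_to_txt(question_id, label, text_string):
    """Append a labeled text block to today's snapshot file for the question."""
    today = datetime.date.today().strftime("%Y-%m-%d")
    file_name = f'{today}_{question_id}.txt'
    # Assumes Google Drive is mounted at /content/drive and the snapshots folder exists
    file_path = f'/content/drive/My Drive/snapshots/{file_name}'

    with open(file_path, 'a') as f:
        f.write(f"**{label}:**\n {text_string}\n\n\n\n")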


def find_number_before_percent(s):
    # Use a regular expression to find all numbers followed by a '%'
    matches = re.findall(r'(\d+)%', s)
    if matches:
        # Return the last number found before a '%'
        return int(matches[-1])
    else:
        # Return None if no number found
        return None

def post_question_comment(question_id, comment_text):
    """
    Post a comment on the question page as the bot user.
    """

    response = requests.post(
        f"{API_BASE_URL}/comments/",
        json={
            "comment_text": comment_text,
            "submit_type": "N",
            "include_latest_prediction": True,
            "question": question_id,
        },
        **AUTH_HEADERS,
    )
    response.raise_for_status()

def get_claude_prediction(question_details, llm_context_cleaned, hotnews_cleaned, summary_report):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = anthropic.Anthropic(api_key=CLAUDE_API_KEY)

    title = question_details["title"]
    resolution_criteria = question_details["resolution_criteria"]
    background = question_details["description"]
    fine_print = question_details["fine_print"]

    prompt = CLAUDE_PROMPT_TEMPLATE.format(
      title=title,
      today=today,
      resolution_criteria=resolution_criteria,
      background=background,
      fine_print=fine_print,
      llm_context_cleaned=llm_context_cleaned,
      hotnews_cleaned=hotnews_cleaned,
      summary_report=summary_report
    )

#    print(prompt)

    response = client.completions.create(
        model="claude-2",
        max_tokens_to_sample=1000,
        prompt=prompt,
    )

#    print(response.completion)
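    probability = find_number_before_percent(response.completion)
    if probability is None:
        # No "NN%" pattern in the completion; fail with a clear message instead of int(None)
        raise ValueError("Claude response did not contain a percentage")
#    print(f"The extracted probability is: {probability}%")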
    return probability, response.completion

def post_question_prediction(question_id, prediction_percentage):
    """
    Post a prediction value (between 1 and 100) on the question.
    """
    url = f"{API_BASE_URL}/questions/{question_id}/predict/"
    response = requests.post(
        url,
        json={"prediction": float(prediction_percentage) / 100},
        **AUTH_HEADERS,
    )
    response.raise_for_status()
#    print("The prediction percentage is:", prediction_percentage)


def get_question_details(question_id):
    """
    Get all details about a specific question.
    """
    url = f"{API_BASE_URL}/questions/{question_id}/"
    response = requests.get(
        url,
        **AUTH_HEADERS,
    )
    response.raise_for_status()
    return json.loads(response.content)

def list_questions(tournament_id=WARMUP_TOURNAMENT_ID, offset=0, count=10):

    url_qparams = {
        "limit": count,
        "offset": offset,
        "has_group": "false",
        "order_by": "-activity",
        "forecast_type": "binary",
        "project": tournament_id,
        "status": "open",
        "type": "forecast",
        "include_description": "true",
    }
    url = f"{API_BASE_URL}/questions/"
    response = requests.get(url, **AUTH_HEADERS, params=url_qparams)
    response.raise_for_status()
    data = json.loads(response.content)
    return data

def get_asknews_llmcontext(query):

  ask = AskNewsSDK(
      client_id=ASKNEWS_CLIENT_ID,
      client_secret=ASKNEWS_SECRET,
      scopes=["news"]
  )

  historical_response = ask.news.search_news(
      query=query,
      n_articles=50,
      return_type="string",
      historical=True,
      method="both",
      diversify_sources=True,
      strategy="default",
      provocative="low",
      hours_back=1400,
  )

  llm_context = historical_response.as_string
  return llm_context

def asknews_cleanup(llm_context, title):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = OpenAI(api_key=OPENAI_API_KEY)

    chat_completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
        {
            "role": "user",
            "content": ASKNEWS_CLEANUP_PROMPT.format(
                llm_context=llm_context,
                today=today,
                title=title,
            )
        }
        ]
    )

    llm_context_cleaned = chat_completion.choices[0].message.content
    return llm_context_cleaned

def get_asknews_hotnews(query):
  """
  Use the AskNews `news` endpoint to get news context for your query.
  The full API reference can be found here: https://docs.asknews.app/en/reference#get-/v1/news/search
  """
  ask = AskNewsSDK(
      client_id=ASKNEWS_CLIENT_ID,
      client_secret=ASKNEWS_SECRET,
      scopes=["news"]
  )

  hotnews_response = ask.news.search_news(
      query=query,
      n_articles=10,
      return_type="string",
      historical=False,
      method="both",
      diversify_sources=False,
      strategy="default",
      similarity_score_threshold=0.9,
      provocative="low",
      hours_back=48
  )

  hotnews = hotnews_response.as_string
  return hotnews

def asknews_hotnews_cleanup(hotnews, title):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = OpenAI(api_key=OPENAI_API_KEY)

    chat_completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
        {
            "role": "user",
            "content": ASKNEWS_HOTNEWS_CLEANUP_PROMPT.format(
                hotnews=hotnews,
                today=today,
                title=title,
            )
        }
        ]
    )

    hotnews_cleaned = chat_completion.choices[0].message.content
    return hotnews_cleaned

def call_perplexity(query):
    # NOTE: relies on the module-level `question_details` and `question_id`
    # set by the calling code; they are not passed in as arguments.
    gpt_question = get_gpt_questions(question_details)
    write_to_txt(question_id, "Question for Perplexity", gpt_question)

    url = "https://api.perplexity.ai/chat/completions"
    headers = {
        "accept": "application/json",
        "authorization": f"Bearer {PERPLEXITY_API_KEY}",
        "content-type": "application/json",
    }

    payload = {
        "model": "llama-3.1-sonar-large-128k-chat",
        "messages": [
            {
                "role": "system",
                "content": gpt_question,
            },
            {"role": "user", "content": query},
        ],
    }

    response = requests.post(url=url, json=payload, headers=headers)
    response.raise_for_status()
    content = response.json()["choices"][0]["message"]["content"]

    return content

def get_gpt_questions(question_details):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = OpenAI(api_key=OPENAI_API_KEY)

    title = question_details["title"]
    resolution_criteria = question_details["resolution_criteria"]
    background = question_details["description"]
    fine_print = question_details["fine_print"]

    chat_completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
        {
            "role": "user",
            "content": PROMPT_TEMPLATE2.format(
                title=title,
                today=today,
                background=background,
                fine_print=fine_print,
            )
        }
        ]
    )
    gpt_questions = chat_completion.choices[0].message.content
    return gpt_questions

def get_gpt_prediction(question_details, llm_context_cleaned, hotnews_cleaned, summary_report):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = OpenAI(api_key=OPENAI_API_KEY)

    title = question_details["title"]
    resolution_criteria = question_details["resolution_criteria"]
    background = question_details["description"]
    fine_print = question_details["fine_print"]

    # Build the prompt once and reuse it for the API call.
    filled_prompt = PROMPT_TEMPLATE.format(
      title=title,
      llm_context_cleaned=llm_context_cleaned,
      hotnews_cleaned=hotnews_cleaned,
      summary_report=summary_report,
      today=today,
      resolution_criteria=resolution_criteria,
      background=background,
      fine_print=fine_print
    )

#    print(filled_prompt)

    chat_completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
        {
            "role": "user",
            "content": filled_prompt,
        }
        ],
        response_format={ "type": "json_object" }
    )

    gpt_text = chat_completion.choices[0].message.content

    parsed_dict = json.loads(gpt_text)
    probability = parsed_dict.get("probability", None)
    rationale = parsed_dict.get("rationale", None)

    return probability, rationale

def get_gpt_finalenhancedprediction(question_details, all_forecasts):
    today = datetime.datetime.now().strftime("%Y-%m-%d")
    client = OpenAI(api_key=OPENAI_API_KEY)

    title = question_details["title"]
    resolution_criteria = question_details["resolution_criteria"]
    background = question_details["description"]
    fine_print = question_details["fine_print"]
    forecaster1 = all_forecasts[0][1]
    forecaster2 = all_forecasts[1][1]
    forecaster3 = all_forecasts[2][1]
    forecaster4 = all_forecasts[3][1]
    forecaster5 = all_forecasts[4][1]

    # NOTE: llm_context and hotnews are not parameters of this function; they are
    # read from the notebook's global scope, where the main loop assigns them.
    # Build the prompt once and reuse it for the API call.
    filled_prompt = PROMPT_TEMPLATE4.format(
      title=title,
      llm_context=llm_context,
      hotnews=hotnews,
      forecaster1=forecaster1,
      forecaster2=forecaster2,
      forecaster3=forecaster3,
      forecaster4=forecaster4,
      forecaster5=forecaster5,
      today=today,
      resolution_criteria=resolution_criteria,
      background=background,
      fine_print=fine_print,
    )

#    print(filled_prompt)

    chat_completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
        {
            "role": "user",
            "content": filled_prompt,
        }
        ],
        response_format={ "type": "json_object" }
    )

    gpt_text = chat_completion.choices[0].message.content

    parsed_dict = json.loads(gpt_text)
    probability = parsed_dict.get("probability", None)
    rationale = parsed_dict.get("rationale", None)

    return probability, rationale


## GPT prediction and submitting a forecast

This is an example of how you can use the helper functions from above.
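
The cell below combines five individual GPT forecasts (weight 0.1 each) with one integrated forecast (weight 0.5). As a minimal standalone sketch of that arithmetic, the helper below computes the same weighted sum; the name `combine_forecasts` and the sample values are hypothetical and are not part of the notebook.

```python
def combine_forecasts(individual, integrated,
                      individual_weight=0.1, integrated_weight=0.5):
    """Return the weighted sum of several individual forecasts and one integrated forecast.

    Hypothetical helper for illustration. With five individual forecasts,
    the default weights sum to 1.0 (5 * 0.1 + 0.5).
    """
    total = 0.0
    for value in individual:
        # float() accepts numbers or numeric strings, as a model's JSON output may contain either
        total += individual_weight * float(value)
    total += integrated_weight * float(integrated)
    return total


# Made-up example values, in percent
individual = [60, 70, 80, 65, 75]
integrated = 72
combined = combine_forecasts(individual, integrated)
print(int(combined))  # int() truncates the fractional part, as the cell below does
```

Because `int()` truncates rather than rounds, a combined value such as 70.9 would be submitted as 70; `round()` would be the alternative if nearest-integer behavior is preferred.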

In [20]:

questions = list_questions()

open_questions_ids = []

for question in questions["results"]:
  if question["active_state"] == "OPEN":
# and question['my_predictions'] is None and question["id"] != 0:
#    print(f"ID: {question['id']}\nQ: {question['title']}\nCloses: {question['close_time']}")
    open_questions_ids.append(question["id"])

# Iterate directly over the question IDs (avoids shadowing the built-in id()).
for question_id in open_questions_ids:

  drive.mount('/content/drive', force_remount=True)

#  question_id = 27911
  question_details = get_question_details(question_id)

  print(question_id)
  print("Title: ", question_details['title'])

  write_to_txt(question_id, "Question ID", str(question_id))
  write_to_txt(question_id, "Title", question_details['title'])
  write_to_txt(question_id, "Resolution Criteria", question_details['resolution_criteria'])
  write_to_txt(question_id, "Description", question_details['description'])
  write_to_txt(question_id, "Fine Print", question_details['fine_print'])


  title = question_details["title"]

  llm_context = get_asknews_llmcontext(title)
#  print("AskNews historical articles: ", llm_context)
  write_to_txt(question_id, "AskNews Background", llm_context)

  llm_context_cleaned = asknews_cleanup(llm_context, title)
#  print("AskNews cleaned articles: ", llm_context_cleaned)
  write_to_txt(question_id, "AskNews Background Cleaned", llm_context_cleaned)

  hotnews = get_asknews_hotnews(title)
#  print("AskNews hotnews: ", hotnews)
  write_to_txt(question_id, "AskNews Hotnews", hotnews)

  hotnews_cleaned = asknews_cleanup(hotnews, title)
#  print("AskNews hotnews: ", hotnews_cleaned)
  write_to_txt(question_id, "AskNews Hotnews Cleaned", hotnews_cleaned)

  summary_report = call_perplexity(title)
#  print("Perplexity said: ", summary_report)
  write_to_txt(question_id, "Perplexity Report", summary_report)

  all_forecasts = []
  promptset = [1, 2, 3, 4, 5]

  prompt = 0
  while prompt < 5:
      probability, rationale = get_gpt_prediction(question_details, llm_context_cleaned, hotnews_cleaned, summary_report)
      print("OpenAI predicted: ", probability)
#      print("GPT said: ", rationale)
      write_to_txt(question_id, "GPT Forecast", rationale)
      if probability is None:
          pass  # no usable probability returned; retry without advancing the counter
      else:
          all_forecasts.append((probability, rationale))
          prompt = prompt + 1

  prompt2 = 0
  while prompt2 < 1:
      probability, rationale = get_gpt_finalenhancedprediction(question_details, all_forecasts)
      print("GPT integrated predicted: ", probability)
#      print("GPT said: ", rationale)
      write_to_txt(question_id, "GPT Integrated Forecast", rationale)
      if probability is None:
          pass  # no usable probability returned; retry without advancing the counter
      else:
          all_forecasts.append((probability, rationale))
          prompt2 = prompt2 + 1

  # Five individual forecasts at 0.1 each plus the integrated forecast at 0.5 (weights sum to 1).
  forecaster_weight = 0.1
  weighted_forecast = (
      forecaster_weight*float(all_forecasts[0][0])
      + forecaster_weight*float(all_forecasts[1][0])
      + forecaster_weight*float(all_forecasts[2][0])
      + forecaster_weight*float(all_forecasts[3][0])
      + forecaster_weight*float(all_forecasts[4][0])
      + 0.5*float(all_forecasts[5][0])
  )
  # int() truncates the fractional part (it does not round).
  weighted_forecast = int(weighted_forecast)

  print("Final forecast for submission is: ", weighted_forecast)
  prediction = weighted_forecast
  comment = (all_forecasts[5][1])

  if prediction is not None and SUBMIT_PREDICTION:
      post_question_prediction(question_id, prediction)
      post_question_comment(question_id, comment)
      print("The submitted prediction is: ", prediction)
#      print("The submitted comment is: ", comment)
      write_to_txt(question_id, "Submitted forecast is", prediction)
      write_to_txt(question_id, "GPT Integrated Forecast", rationale)

  new_array = []
  new_array = np.insert(new_array, 0, question_id)
  new_array = np.insert(new_array, 1, all_forecasts[0][0])
  new_array = np.insert(new_array, 2, all_forecasts[1][0])
  new_array = np.insert(new_array, 3, all_forecasts[2][0])
  new_array = np.insert(new_array, 4, all_forecasts[3][0])
  new_array = np.insert(new_array, 5, all_forecasts[4][0])
  new_array = np.insert(new_array, 6, all_forecasts[5][0])
  new_array = np.insert(new_array, 7, weighted_forecast)

#  print(new_array)


  file_path = '/content/drive/My Drive/data.csv'
  with open(file_path, 'a', newline='') as file:
      writer = csv.writer(file)
      writer.writerow(new_array)

  for prompt in promptset:
      prediction, gpt_result = get_claude_prediction(question_details, llm_context_cleaned, hotnews_cleaned, summary_report)
      all_forecasts.append((prediction, gpt_result))
#      print("Claude said: ", gpt_result)
      print("Claude predicted: ", prediction)
      write_to_txt(question_id, "Claude Forecast", gpt_result)

  new_array = np.insert(new_array, 8, all_forecasts[6][0])
  new_array = np.insert(new_array, 9, all_forecasts[7][0])
  new_array = np.insert(new_array, 10, all_forecasts[8][0])
  new_array = np.insert(new_array, 11, all_forecasts[9][0])
  new_array = np.insert(new_array, 12, all_forecasts[10][0])

  for prompt in promptset:
      probability, rationale = get_gpt_prediction(question_details, llm_context_cleaned, hotnews_cleaned, summary_report)
      all_forecasts.append((probability, rationale))
      print("OpenAI predicted: ", probability)
#      print("GPT said: ", rationale)
      write_to_txt(question_id, "GPT Forecast", rationale)

  new_array = np.insert(new_array, 13, all_forecasts[11][0])
  new_array = np.insert(new_array, 14, all_forecasts[12][0])
  new_array = np.insert(new_array, 15, all_forecasts[13][0])
  new_array = np.insert(new_array, 16, all_forecasts[14][0])
  new_array = np.insert(new_array, 17, all_forecasts[15][0])

  drive.mount('/content/drive', force_remount=True)
  file_path = '/content/drive/My Drive/data.csv'
  with open(file_path, 'a', newline='') as file:
      writer = csv.writer(file)
      writer.writerow(new_array)

if in_colab():
  from google.colab import runtime
  runtime.unassign()
else: # in github
  import sys
  sys.exit()


Mounted at /content/drive
27890
Title:  [PRACTICE] Will Donald Trump be elected US President in 2024?


KeyboardInterrupt: 