# AI(t)HistChronicle: Exploring Counterfactual History with Generative AI

## Info: AI(t)HistChronicle

This project explores the intersection of AI and counterfactual history — using large language models to generate plausible alternate timelines for pivotal moments in human history.

AI(t)HistChronicle is an AI agent pipeline that combines external knowledge retrieval, structured reasoning (via Tree-of-Thoughts), and controlled text generation to create richly detailed historical narratives based on user-defined “what-if” scenarios.

### AI Capabilities used:
- RAG (Retrieval-Augmented Generation) by feeding in background information from Wikipedia and news articles from Wikinews
- Controlled generation by specifying, for instance, the role the model should take and the output type of the response
- Structured output (JSON)
- Agents that combine reasoning, API connections, and model output
- Few-shot/one-shot prompting
- Tree-of-Thoughts (ToT) prompting

### Models used
- DeepSeek-R1-Distill-Llama-70B (DeepSeek)
- Llama3-70B-8192 (Meta)

### Author
Dr. Patrick Thiel

This notebook is part of my submission to the [Gen AI Intensive Course Capstone 2025Q1](https://www.kaggle.com/competitions/gen-ai-intensive-course-capstone-2025q1).

I also wrote a blog post on it. See [@Medium](https://medium.com/@patthie/alternate-realities-powered-by-ai-a-creative-experiment-in-counterfactual-history-30b41ed5b1be)

## Setup

Note: A Groq API key ([Link](https://console.groq.com/keys)) is required to run this notebook.

In [1]:
# installation (if required)
!pip install groq
!pip install wikipedia

Collecting groq
  Downloading groq-0.22.0-py3-none-any.whl.metadata (15 kB)
Downloading groq-0.22.0-py3-none-any.whl (126 kB)
Installing collected packages: groq
Successfully installed groq-0.22.0
Collecting wikipedia
  Downloading wikipedia-1.4.0.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: wikipedia
  Building wheel for wikipedia (setup.py) ... done
  Created wheel for wikipedia: filename=wikipedia-1.4.0-py3-none-any.whl size=11679 sha256=938495a5f9439eaa47d35ea56b48d3b57969499b6955c21239c5b572588dfef2
  Stored in directory: /root/.cache/pip/wheels/5e/b6/c5/93f3dec388ae76edc830cb42901bb0232504dfc0df02fc50de
Successfully built wikipedia
Installing collected packages: wikipedia
Successfully installed wikipedia-1.4.0


In [3]:
from groq import Groq
import re
import json
import textwrap
from kaggle_secrets import UserSecretsClient
import wikipedia
import requests

In [4]:
# Retrieve the API key from Kaggle secrets
user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("deepseek")
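
Outside of Kaggle, the `kaggle_secrets` module is not available. The following is a minimal sketch of a fallback that reads the key from an environment variable instead. The helper name `load_api_key` and the variable name `GROQ_API_KEY` are assumptions made for illustration; they are not part of the original notebook.

```python
import os


def load_api_key(secret_name: str = "deepseek", env_var: str = "GROQ_API_KEY") -> str:
    """Return the API key from Kaggle secrets, else from an environment variable."""
    try:
        # Only importable inside a Kaggle notebook session
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret(secret_name)
    except Exception:
        # Assumed fallback for local runs: read the key from the environment
        key = os.environ.get(env_var)
        if not key:
            raise RuntimeError(f"No API key found: set the {env_var} environment variable.")
        return key
```

With this in place, `api_key = load_api_key()` could stand in for the two lines above when the notebook is run locally.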

**NOTE**: I make use of the groq library and groq API service to utilize the following models:
- DeepSeek-R1-Distill-Llama-70B ([Link](https://console.groq.com/docs/model/deepseek-r1-distill-llama-70b))
- Llama3-70B-8192 ([Link](https://console.groq.com/docs/model/llama3-70b-8192))

In [5]:
# initialize client for Groq models (using API key)
client = Groq(api_key = api_key)

## Specification of Input

- Defines which counterfactual world is created

### Explanation
- event: the actual historical event of interest
- scenario: counterfactual (what-if) statement
- output_type: format the final response should take
- role: perspective the model takes on
- additional_instructions (optional): anything else the agent should consider. Can be used to set, e.g., the tone of the document
- text_length (optional): length of the produced output (in tokens)
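
The inputs listed above could also be bundled and validated in one place. This is only a sketch: the class name `ScenarioInput` and its checks are hypothetical and are not used elsewhere in the notebook.

```python
from dataclasses import dataclass


@dataclass
class ScenarioInput:
    """Container for the user inputs described above (hypothetical helper)."""
    event: str
    scenario: str
    output_type: str
    role: str
    additional_instructions: str = ""  # optional
    text_length: int = 1000            # optional, in tokens

    def __post_init__(self):
        # The four required fields must be non-empty strings
        for name in ("event", "scenario", "output_type", "role"):
            if not getattr(self, name).strip():
                raise ValueError(f"'{name}' must not be empty")
        if self.text_length <= 0:
            raise ValueError("'text_length' must be a positive number of tokens")


inputs = ScenarioInput(
    event="The fall of the Berlin Wall",
    scenario="What if the Berlin Wall never came down?",
    output_type="school textbook page",
    role="lecturer",
)
```

Failing early on an empty required field avoids spending API calls on a prompt that cannot produce a meaningful result.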

In [6]:
# inputs (adjust to the setting you are interested in)
event = "The fall of the Berlin Wall"
scenario = "What if the Berlin Wall never came down?"
output_type = "school textbook page"
role = "lecturer"

# optional inputs (if not needed, leave blank or at default)
additional_instructions = "Be precise. You are writing for 10th graders. Generate also the name of the textbook." # In case you want to give the text a specific twist. If not, leave empty!
text_length = 1000 # number of tokens

## API Connection to Wikipedia

- This serves the RAG capability: it gives the model additional background information by summarizing the Wikipedia article related to the event.
- By default, the request is limited to a 50-sentence summary. This limit is arbitrary and can easily be changed in the function below. My reasoning was to give the model just enough information for a reasonable amount of context.

In [7]:
# retrieve background information from Wikipedia
def getting_background_wikipedia(event: str, num_sentences: int = 50) -> str:
    """
    Retrieving information from Wikipedia regarding the supplied event.
    The information represents a summary (in 50 sentences) of the Wikipedia article.

    Args:
        event (str): User specified historical event.
        num_sentences (int): Number of sentences to be retrieved (default: 50)

    Returns:
        str: Summary description of the event based on Wikipedia.
    """
    summary = wikipedia.summary(event, sentences = num_sentences)
    return summary

## Getting news articles (Wikinews)

- To extend the RAG capability further, I also pull in news articles from Wikinews. Since historical events are discussed prominently in the news (even if the event dates back some time), this gives the model more material for its reasoning and for the final response.
- News articles also offer different perspectives, which can help the model's reasoning.
- The retrieval is, again, restricted by default (to 5 articles), which can easily be adjusted in the function below.
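
Because full news articles can be long, one option is to cap the length of each article before it goes into the prompt. The sketch below assumes a character budget as a rough proxy for tokens; the helper name `trimming_articles` and the default of 2000 characters are illustrative choices, not part of the original pipeline.

```python
def trimming_articles(articles: list, max_chars: int = 2000) -> list:
    """Shorten each article's content to at most max_chars characters (hypothetical helper)."""
    trimmed = []
    for article in articles or []:  # tolerate None or an empty list
        content = article.get("content", "")
        if len(content) > max_chars:
            # Cut at the last space before the limit to avoid splitting a word
            cut = content.rfind(" ", 0, max_chars)
            content = content[: cut if cut > 0 else max_chars].rstrip() + " [...]"
        trimmed.append({"title": article.get("title", ""), "content": content})
    return trimmed
```

The input is expected to have the same shape as the list of `{"title": ..., "content": ...}` dictionaries returned by the Wikinews function below.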

In [26]:
def getting_background_wikinews(event: str, num_articles: int = 5) -> list:
    """
    Retrieving information from Wikinews regarding the specified event.
    The information retrieval is restricted to 5 articles since the model using this information has a token limitation.
    
    Args:
        event (str): User specified historical event.
        num_articles (int): Number of articles to be considered (default: 5)

    Returns:
        list: Title and content of news articles. 
    """
    # look for news
    url = "https://en.wikinews.org/w/api.php"
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": event,
    }
    response = requests.get(url, params = params).json()
    search_results = response.get("query", {}).get("search", [])
    
    if not search_results:
        return []  # keep the declared return type (list) when nothing is found

    articles_list = list()
    for element in search_results[0:num_articles]:
        # Get title of article
        title = element["title"]
        
        # Get page content
        page_params = {
            "action": "query",
            "format": "json",
            "prop": "extracts",
            "explaintext": True,
            "titles": title,
        }
        
        page_response = requests.get(url, params = page_params).json()
        pages = page_response["query"]["pages"]
        page_content = next(iter(pages.values())).get("extract", "")  # avoid a KeyError if a page has no extract

        articles_list.append({"title": title, "content": page_content})

    return articles_list

## Define background query (based on RAG input)

The prompt combines the retrieved information into a single context prompt.

In [27]:
# Retrieve the Wiki information
wikipedia_info = getting_background_wikipedia(event)
wikinews_info = getting_background_wikinews(event)

# Construct prompt for background information
background_context = f"""
    Background Information:
    From Wikipedia: {wikipedia_info}
    
    From Wikinews : {wikinews_info}
"""
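
The f-string above inserts the Wikinews results as the printed form of a Python list. As an alternative, the following sketch lays the articles out as plain text. The helper name `building_background_context` is hypothetical, and it assumes the list-of-dictionaries shape returned by `getting_background_wikinews`.

```python
def building_background_context(wikipedia_info: str, wikinews_info: list) -> str:
    """Combine both sources into one readable context block (hypothetical helper)."""
    parts = ["Background Information:", f"From Wikipedia: {wikipedia_info}"]
    if wikinews_info:
        parts.append("From Wikinews:")
        for number, article in enumerate(wikinews_info, start=1):
            # One titled section per article instead of a raw list dump
            parts.append(f"Article {number}: {article['title']}\n{article['content']}")
    else:
        parts.append("From Wikinews: no articles found.")
    return "\n\n".join(parts)
```

Whether the model benefits from the cleaner layout has not been tested here; it mainly makes the prompt easier to inspect.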

In [24]:
print(wikipedia_info)

The fall of the Berlin Wall (German: Mauerfall, pronounced [ˈmaʊ̯ɐˌfal] ) on 9 November 1989, during the Peaceful Revolution, marked the beginning of the destruction of the Berlin Wall and the figurative Iron Curtain, as East Berlin transit restrictions were overwhelmed and discarded.  Sections of the wall were breached, and planned deconstruction began the following June. It was one of the series of events that started the fall of communism in Central and Eastern Europe. The fall of the inner German border took place shortly afterward. An end to the Cold War was declared at the Malta Summit in early December, and German reunification took place in October the following year.


== Background ==


=== Opening of the Iron Curtain ===

The opening of the Iron Curtain between Austria and Hungary at the Pan-European Picnic on 19 August 1989 set in motion a peaceful chain reaction, at the end of which there was no longer an East Germany and the Eastern Bloc had disintegrated. After the picni

In [28]:
print(wikinews_info)

[{'title': 'Thousands to celebrate twenty years since fall of Berlin Wall', 'content': 'Monday, November 9, 2009 \n\nLeaders from around the world are set to join hundreds of thousands of people in Berlin, Germany tonight to celebrate twenty years of the fall of the Berlin Wall, an historic event which contributed to the end of the Cold War.\nAs the capital prepared for the anniversary of the Wall\'s fall with festivities planned throughout the city, United States Secretary of State Hillary Clinton called for renewed worldwide efforts for freedom for those still living in repressive regimes. "Our history did not end the night the Wall came down," she said last night to influential political figures, past and present. "To expand freedom to more people, we cannot accept that freedom does not belong to all people. We cannot allow oppression defined and justified by religion or tribe to replace that of [communist] ideology."\nCelebrations will centre around the Brandenburg Gate, which has 

## Step 1: Implementation of Tree-of-Thoughts (ToT) - Generate thoughts (using background)

This step generates n ideas (default: n = 3) about the event and scenario based on the background information. It is the first step of the reasoning part through Tree-of-Thoughts.

I use the model [DeepSeek-R1-Distill-Llama-70B](https://console.groq.com/docs/model/deepseek-r1-distill-llama-70b) in this step as it is made for reasoning and strategic planning tasks. It is also free to use, fast, and easily accessible through the Groq API.

I use a relatively high randomness factor (temperature = 0.9) to allow the model to generate "free" ideas.

Note that I also tried using top-p to control the randomness of the thought generation (instead of temperature). Since the outputs are semantically very similar, I stick with temperature for consistency with the other specifications downstream. Further, the Groq documentation recommends adjusting either temperature or top-p, not both (see [groq-docs](https://console.groq.com/docs/api-reference#chat)).

The testing is documented on GitHub (see [DOCUMENTATION-Prompts](https://github.com/PThie/Google-Kaggle-GenAI/blob/main/Cap-Stone/DOCUMENTATION-Prompts.md)).
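
To make the effect of the temperature setting concrete, here is a small sketch of the textbook mechanism: logits are divided by the temperature before the softmax, so a lower value concentrates probability on the top candidate. The logits are made-up numbers, and how the serving backend applies temperature internally is not covered by this notebook.

```python
import math


def softmax_with_temperature(logits: list, temperature: float) -> list:
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [value / temperature for value in logits]
    highest = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for temperature in (0.5, 0.9):
    probs = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At 0.9 the distribution is flatter than at 0.5, which is why the higher value suits brainstorming and the lower one suits evaluation.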

In [10]:
def generating_thoughts(event: str, scenario: str, role: str, output_type: str, background: str, num_thoughts: int = 3) -> str:
    """
    Generate thoughts given the event, the alternative scenario and the background information. The thoughts (after being evaluated) can be used in the steps downstream to generate the actual output text.
    
    Args:
        event (str): User specified historical event.
        scenario (str): User specified alternative scenario related to the event.
        role (str): User specified role the model should take on.
        output_type (str): User specified output type which provides the structure for the response.
        background (str): Prompt with background information related to the event.
        num_thoughts (int): Number of thoughts that should be generated. Default = 3.

    Returns:
        str: Text with response from the model given the inputs. The response text includes the thinking process as well as the actual output.
    """
    prompt = f"""
        {background}

        You are brainstorming text outlines as a {role} in 2025.
        The historical event is: {event}.
        The counterfactual scenario is: {scenario}.

        Based on the above background, generate {num_thoughts} brief but distinct outlines (1-2 sentences each) for a {output_type}.

        Please return the response as JSON with the following structure:
            
            "thought_id": integer, running ID,
            "title": title of the thought outline,
            "content": content of the thought

        Example output:

            "thought_id": 1,
            "title": "The Road Not Taken: JFK's Vision for a United America",
            "content": "If JFK had not been assassinated, his 'New Frontier' agenda might have prioritized space exploration, civil rights, and educational reforms, potentially uniting a fractured nation and altering the course of American history in the 1960s and beyond."
    
    """

    response = client.chat.completions.create(
        model = "deepseek-r1-distill-llama-70b",
        messages = [
            {
                "role": "user",
                "content": prompt
            }
        ],
        temperature = 0.9,
        max_completion_tokens = 1000
    )

    return response.choices[0].message.content

In [11]:
thoughts = generating_thoughts(event = event, scenario = scenario, role = role, output_type = output_type, background = background_context)

Below we can see how the model thinks. I will call this the **"outline" model**. First, it reasons about the prompt (i.e. the task) and the supplied background information (everything inside `<think></think>`). After the think block, the model generates the desired thoughts.

In [12]:
print(thoughts)

<think>
Okay, I'm trying to help someone who wants to create three brief outlines for a school textbook page about the fall of the Berlin Wall, but in a counterfactual scenario where the wall never came down. The user provided a bunch of background information, so I need to extract the key points and imagine different directions.

First, I should understand the historical event: the Berlin Wall fell in 1989, leading to German reunification and the end of the Cold War. The counterfactual is that the wall didn't fall, so I need to explore what that would mean.

The user wants three distinct outlines, each with a title and a 1-2 sentence content. They should be in JSON format, with a thought_id, title, and content.

Looking at the example response, each outline takes a different angle. The first one focuses on the continued division, the second on global politics, and the third on technological and cultural stagnation.

So, I'll need to think of three different angles:

1. **Geopolitical 

### Extracting raw response

- Since the model returns the thinking process as well as the actual output, let's extract just the output. In the following steps, I do not want the model to be influenced by the reasoning of the previous model run; it should base its reasoning solely on the (raw) output.

In [13]:
def extracting_raw_response(response_text: str, json_format = False) -> str:
    """
    Extract the raw output from the model and drop the thinking/ reasoning output. This should provide a clear response which can be used further.
    
    Args:
        response_text (str): Response output from the model.
        json_format (bool): If True, the fenced JSON block is parsed and returned as Python objects.
        
    Returns:
        str: Clean output from the model without reasoning output (parsed JSON, e.g. a list of dicts, if json_format is True).
    
    """
    if json_format:
        raw_output = json.loads(re.search(r"```json(.*)```", response_text, re.DOTALL).group(1).strip())
    else:
        raw_output = re.search(r"</think>(.*)", response_text, re.DOTALL).group(1).strip()
    
    return raw_output

In [14]:
raw_thoughts = extracting_raw_response(response_text = thoughts)
print(raw_thoughts)

```json
[
    {
        "thought_id": 1,
        "title": "A Divided World: The Persistence of the Cold War",
        "content": "If the Berlin Wall had never fallen, the Cold War might have endured, with continued division between East and West, prolonged geopolitical tensions, and a different trajectory for global politics."
    },
    {
        "thought_id": 2,
        "title": "Stagnation and Isolation",
        "content": "Without the fall of the Berlin Wall, technological and cultural advancements might have been hindered due to continued isolation, affecting global progress and exchange."
    },
    {
        "thought_id": 3,
        "title": "Economic Disparities and Global Trade",
        "content": "The persistence of the Berlin Wall could have led to ongoing economic disparities, affecting global trade and potentially slowing the integration of world markets."
    }
]
```
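
The extraction above relies on the model emitting both a closing `</think>` tag and a fenced JSON block. As a sketch of a more tolerant variant, the helper below (hypothetical name `extracting_json_tolerant`) also accepts bare JSON and returns `None` instead of raising when nothing can be parsed.

```python
import json
import re


def extracting_json_tolerant(response_text: str):
    """Parse the JSON payload of a model response (hypothetical, more tolerant variant)."""
    # Drop the reasoning block if the model produced one
    answer = re.sub(r"<think>.*?</think>", "", response_text, flags=re.DOTALL).strip()
    # Three backticks, built programmatically so this example is easy to embed
    fence = "`" * 3
    # Prefer a fenced block; otherwise try the whole remaining text
    fenced = re.search(fence + r"(?:json)?\s*(.*?)" + fence, answer, re.DOTALL)
    payload = fenced.group(1) if fenced else answer
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        return None  # the caller decides whether to retry the model call
```

Returning `None` leaves room for a retry, for example when the response was cut off by the token limit before the JSON block was complete.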


## Step 2 ToT: Evaluate thoughts

- Task: Given the thoughts generated in step 1, it is now crucial to identify the best thought from the **outline model**, which becomes the main outline for the final text output.
- I send the generated thoughts through another model, which I label the **editor model**.
- I use a medium randomness parameter (temperature = 0.5) here. The lower randomness seems to work better, as it produces clearer reasoning in the evaluation and sharper statements. I also tried a higher temperature value before. The testing is documented on GitHub (see [DOCUMENTATION-Prompts](https://github.com/PThie/Google-Kaggle-GenAI/blob/main/Cap-Stone/DOCUMENTATION-Prompts.md)).
- NOTE: This is based on the clean thoughts. I disregard any reasoning from before to keep the decision process as unbiased as possible.

In [15]:
def evaluating_thoughts(thoughts: str, num_tokens: int) -> str:
    """
    Evaluate the generated thoughts and rank them according to their creativity, plausibility, and potential impact

    Args:
        thoughts (str): Multiple thoughts generated in the previous step which should be evaluated (supplied as JSON if raw response is converted, see function 'extracting_raw_response').
        num_tokens (int): User specified number of tokens to use (text length).
    Returns:
        str: List of thoughts where each thought is evaluated according to its content.
    """
    evaluation_prompt = f"""
        You are an expert editor. Evaluate the following text outlines for creativity, plausibility, and potential impact.
        Rank them from best to worst and provide a short reason.

        Thoughts:
        {thoughts}

        Please return the response as JSON keeping the following structure and adding the evaluation report:
            
            "thought_id": integer, running ID,
            "title": title of the thought outline,
            "content": content of the thought,
            "evaluation": evaluation comments of the thought,
            "evaluation_rank": integer, rank of the thought

        Example output:

            "thought_id": 1,
            "title": "The Road Not Taken: JFK's Vision for a United America",
            "content": "If JFK had not been assassinated, his 'New Frontier' agenda might have prioritized space exploration, civil rights, and educational reforms, potentially uniting a fractured nation and altering the course of American history in the 1960s and beyond.",
            "evaluation": "Focuses on international implications, providing a fresh perspective. Broad scope and focus on international effects make it a strong contender.",
            "evaluation_rank": 3
    """

    response = client.chat.completions.create(
        model = "deepseek-r1-distill-llama-70b",
        messages = [
            {
                "role": "user",
                "content": evaluation_prompt
            }
        ],
        temperature = 0.5,
        max_completion_tokens = num_tokens
    )

    return response.choices[0].message.content

Again, we can inspect the reasoning process of our **editor model** using the thoughts of the **outline model**.

In [16]:
evaluated_thoughts = evaluating_thoughts(raw_thoughts, num_tokens = text_length)
print(evaluated_thoughts)

<think>
Okay, so I need to evaluate these three thought outlines based on creativity, plausibility, and potential impact, then rank them from best to worst. Let me go through each one step by step.

First, looking at Thought 1: "A Divided World: The Persistence of the Cold War." The content suggests that if the Berlin Wall hadn't fallen, the Cold War would have continued, leading to ongoing division between East and West, prolonged tensions, and a different global politics trajectory. This seems pretty plausible because the Berlin Wall was a significant symbol of the Cold War, and its fall marked a turning point. The idea that the Cold War would have continued makes sense, so plausibility is high. Creativity-wise, it's a classic alternate history scenario, which is common but still relevant. The potential impact is also considerable because it affects global politics, which is a big area. So, I think this is strong but maybe a bit on the common side.

Next, Thought 2: "Stagnation and I

## Step 3 ToT: Select the best thought (for further processing)

Based on the generated ideas (step 1 of ToT) and their evaluation (step 2 of ToT), we can select the best thought and pass it to the final model run, which generates the output returned to the user.

In [17]:
# Retrieve the best thought
thoughts_json = extracting_raw_response(evaluated_thoughts, json_format = True)
best_thought = [thought for thought in thoughts_json if thought["evaluation_rank"] == 1][0]

In [18]:
print(f"Best thought: {best_thought}")

Best thought: {'thought_id': 1, 'title': 'A Divided World: The Persistence of the Cold War', 'content': 'If the Berlin Wall had never fallen, the Cold War might have endured, with continued division between East and West, prolonged geopolitical tensions, and a different trajectory for global politics.', 'evaluation': "This thought presents a plausible and impactful scenario, exploring the continuation of the Cold War. While it's a common topic, its comprehensive approach makes it strong.", 'evaluation_rank': 1}
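
The selection above assumes that exactly one thought carries `evaluation_rank == 1`. A slightly more defensive sketch picks the lowest rank that is present; the helper name `selecting_best_thought` is hypothetical.

```python
def selecting_best_thought(evaluated: list) -> dict:
    """Return the thought with the lowest evaluation_rank (hypothetical helper)."""
    # Ignore entries where the model omitted the rank or returned a non-integer
    ranked = [t for t in evaluated if isinstance(t.get("evaluation_rank"), int)]
    if not ranked:
        raise ValueError("No thought carries an integer 'evaluation_rank'.")
    # min() keeps working if the editor model starts ranking at 0 or skips rank 1
    return min(ranked, key=lambda t: t["evaluation_rank"])
```

For a well-formed response this returns the same thought as the list comprehension, so it could be swapped in without changing the result shown above.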


## Text Generation

For text generation, I switch to a different LLM: instead of DeepSeek-R1-Distill-Llama-70B, I rely on Llama3-70B-8192, which is also available via the Groq API. While the first model excels at reasoning, the second focuses on content generation, with a strength in creative writing and storytelling. This makes it a good fit for the task described in the prompt below.
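
The three model calls in this notebook differ only in model name, temperature, and token limit. The sketch below collects those settings in one table, using the values from the notebook's own calls. The names `STAGE_SETTINGS` and `building_request` and the stage labels are assumptions for illustration.

```python
STAGE_SETTINGS = {
    "outline": {"model": "deepseek-r1-distill-llama-70b", "temperature": 0.9},
    "editor": {"model": "deepseek-r1-distill-llama-70b", "temperature": 0.5},
    "writer": {"model": "llama3-70b-8192", "temperature": 0.9},
}


def building_request(stage: str, prompt: str, num_tokens: int) -> dict:
    """Assemble keyword arguments for a chat completion call (hypothetical helper)."""
    settings = STAGE_SETTINGS[stage]
    return {
        "model": settings["model"],
        "messages": [{"role": "user", "content": prompt}],
        "temperature": settings["temperature"],
        "max_completion_tokens": num_tokens,
    }
```

A call would then look like `client.chat.completions.create(**building_request("writer", prompt, 1000))`, which keeps the per-stage choices visible in a single place.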

In [19]:
def generating_text(event: str, scenario: str, role: str, output_type: str, additional_instructions: str, background: str, best_thought: dict, num_tokens: int) -> str:
    """
    Generate the final response given the inputs of the user.
    
    Args:
        event (str): User specified historical event.
        scenario (str): User specified alternative scenario related to the event.
        role (str): User specified role the model should take on.
        output_type (str): User specified output type which provides the structure for the response.
        additional_instructions (str): User specified additional instructions.
        background (str): Prompt with background information related to the event.
        best_thought (dict): Dictionary with the title and content of the best thought.
        num_tokens (int): User specified number of tokens to use (text length).

    Returns:
        str: Final text response.
    """
    prompt = f"""
        Based on the following background on the {event}:
        {background}

        And based on this outline with the title {best_thought["title"]} and the content {best_thought["content"]} 

        Take on the professional role {role} in 2025. Write a detailed {output_type}.
        Imagining that the following event in history had a different outcome: {event}.
        Consider the following scenario: {scenario} as counterfactual scenario.
        {additional_instructions}
        
        Do not write in conditional sentences. Write in active style such that you could think it really happened like this!
    """

    response = client.chat.completions.create(
        model = "llama3-70b-8192",
        messages = [
            {
                "role": "user",
                "content": prompt
            }
        ],
        temperature = 0.9,
        max_completion_tokens = num_tokens
    )

    return response.choices[0].message.content

In [20]:
final_story = generating_text(
    event = event,
    scenario = scenario,
    role = role,
    output_type = output_type,
    additional_instructions = additional_instructions,
    background = background_context,
    best_thought = best_thought,
    num_tokens = text_length
)

In [21]:
print(final_story)

**Textbook:** "A Divided World: Exploring the Consequences of a Persistent Cold War"

**Chapter 5: The Enduring Divide**

**The Berlin Wall Stands Tall**

November 9, 1989, was supposed to be a day of hope and unity. Instead, the Berlin Wall, a physical and symbolic barrier between East and West Berlin, remained standing. The protests, the demands for freedom, and the expectations of a new era of cooperation were met with a firm response from the Soviet Union and its Eastern European allies.

**Consequences of a Persistent Cold War**

The failure of the Berlin Wall to fall marked a turning point in modern history. The division between East and West, which had defined the post-World War II era, persisted. Geopolitical tensions continued to simmer, and the world remained mired in a Cold War stalemate.

**The Eastern Bloc: A Fortress of Ideology**

In the aftermath of the failed protests, the Soviet Union and its satellite states redoubled their efforts to maintain control and suppress di

## Adding some styling to the text

This part adds formatting to the final response for a more visually appealing output. Technically, this is not required; it is meant to give the user a more pleasing result and to increase the plausibility of the content.

In [22]:
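def formatting_text(text: str) -> str:
    """
    Format the generated text for display: wrap lines at 90 characters and mark dates, bylines, headings, and quotes with icons.

    Args:
        text (str): Final text response from the model.

    Returns:
        str: Formatted text.
    """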
    lines = text.split("\n")
    formatted_article = []

    # Wrap settings
    wrapper = textwrap.TextWrapper(width = 90)

    # Detect and format special parts
    for line in lines:
        line = line.strip()
        if line.startswith("**Date:"):
            formatted_article.append(f"\n📅 {line.replace('**', '')}")
        elif line.startswith("**By:"):
            formatted_article.append(f"✍️  {line.replace('**', '')}\n")
        elif line.startswith("---"):
            continue  # skip the horizontal line
        elif line.startswith("**") and line.endswith("**"):
            title = line.replace("**", "")
            formatted_article.append(f"\n📰 {title}\n" + "-" * len(title))
        elif line.startswith('"') or line.startswith("“"):
            # Quote handling
            formatted_article.append("\n🗨️  " + wrapper.fill(line))
        else:
            formatted_article.append(wrapper.fill(line))
    
    # Join the final result
    formatted_output = "\n\n".join(formatted_article)
    
    # Return the formatted text
    return formatted_output
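
Because empty input lines are kept and everything is joined with two newlines, the formatted text can contain long runs of blank lines. One possible post-processing step, sketched here with the hypothetical helper `collapsing_blank_lines`, reduces each run to a single blank line.

```python
import re


def collapsing_blank_lines(text: str) -> str:
    """Reduce any run of three or more newlines to a single blank line (hypothetical helper)."""
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

It could be applied as `collapsing_blank_lines(formatting_text(final_story))` if a more compact layout is preferred.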

In [23]:
final_text = formatting_text(final_story)
print(final_text)

**Textbook:** "A Divided World: Exploring the Consequences of a Persistent Cold War"




📰 Chapter 5: The Enduring Divide
------------------------------




📰 The Berlin Wall Stands Tall
---------------------------



November 9, 1989, was supposed to be a day of hope and unity. Instead, the Berlin Wall, a
physical and symbolic barrier between East and West Berlin, remained standing. The
protests, the demands for freedom, and the expectations of a new era of cooperation were
met with a firm response from the Soviet Union and its Eastern European allies.




📰 Consequences of a Persistent Cold War
-------------------------------------



The failure of the Berlin Wall to fall marked a turning point in modern history. The
division between East and West, which had defined the post-World War II era, persisted.
Geopolitical tensions continued to simmer, and the world remained mired in a Cold War
stalemate.




📰 The Eastern Bloc: A Fortress of Ideology
--------------------------------------