<a href="https://colab.research.google.com/github/sgevatschnaider/Mind-and-Machine/blob/main/en/notebooks/AI_and_Theory_of_Mind.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
from IPython.display import display, HTML

# HTML Content
html_content = """
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta name="description" content="Explore how AI relates to Theory of Mind and advanced language models like GPT-4.">
    <title>AI and Theory of Mind</title>
    <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@400;700&display=swap" rel="stylesheet">
    <style>
        :root {
            --primary-color: #007BFF;
            --secondary-color: #0056b3;
            --text-color: #333;
            --background-color: #f5f5f5;
            --accent-color: #004d40;
            --border-color: #ccc;
        }
        body {
            font-family: 'Roboto', Arial, sans-serif;
            line-height: 1.8;
            color: var(--text-color);
            margin: 0;
            padding: 20px;
            background-color: var(--background-color);
        }
        h1, h2 {
            color: var(--accent-color);
            text-align: center;
        }
        h1 {
            font-size: 2.5rem;
            margin-bottom: 20px;
        }
        .index {
            margin-bottom: 20px;
            padding: 15px;
            background: var(--background-color);
            border: 1px solid var(--border-color);
            border-radius: 8px;
        }
        .index h2 {
            margin: 0 0 10px;
        }
        .index ul {
            list-style: none;
            padding: 0;
        }
        .index ul li {
            margin-bottom: 10px;
        }
        .index ul li a {
            text-decoration: none;
            color: var(--primary-color);
            font-weight: bold;
        }
        .index ul li a:hover {
            text-decoration: underline;
        }
        .collapsible {
            background-color: var(--primary-color);
            color: #fff;
            cursor: pointer;
            padding: 10px 20px;
            border: none;
            border-radius: 5px;
            font-size: 1.2rem;
            text-align: left;
            outline: none;
            margin-top: 20px;
            transition: background-color 0.3s;
        }
        .collapsible:hover {
            background-color: var(--secondary-color);
        }
        .collapsible::after {
            content: "▼";
            float: right;
        }
        .collapsible.active::after {
            content: "▲";
        }
        .content {
            max-height: 0;
            overflow: hidden;
            transition: max-height 0.5s ease-in-out;
            margin-top: 10px;
            padding-left: 15px;
            border-left: 3px solid var(--primary-color);
        }
        .button-container {
            text-align: center;
            margin-top: 20px;
        }
        .button {
            display: inline-block;
            padding: 10px 20px;
            font-size: 1rem;
            font-weight: bold;
            color: #fff;
            background-color: var(--primary-color);
            text-decoration: none;
            border-radius: 5px;
            transition: background-color 0.3s;
        }
        .button:hover {
            background-color: var(--secondary-color);
        }
    </style>
</head>
<body>
    <header>
        <h1>AI and Theory of Mind</h1>
    </header>
    <h2>Material prepared by Sergio Gevatschnaider</h2>
    <div class="index">
        <h2>Index</h2>
        <ul>
            <li><a href="#introduction">Introduction</a></li>
            <li><a href="#relation-models">Relation with Language Models</a></li>
            <li><a href="#experiment">The Experiment</a></li>
            <li><a href="#results">Results</a></li>
            <li><a href="#ethics">Ethical Implications and Future</a></li>
            <li><a href="#conclusion">Conclusion</a></li>
        </ul>
    </div>

    <main>
        <button type="button" class="collapsible" id="introduction">Introduction</button>
        <div class="content">
            <p>Theory of Mind (ToM) is an essential skill in humans that allows us to understand that others have beliefs, desires, and intentions that may differ from our own. It is a cornerstone of social behavior and our ability to effectively interact in complex contexts. This ability is often assessed through false-belief tasks, which measure the capacity to understand perspectives different from one’s own.</p>
            <p>Recently, a study titled <strong>Evaluating Large Language Models in Theory of Mind Tasks</strong>, conducted by Michal Kosinski, examined whether large language models (LLMs), such as GPT-4, are capable of performing tasks related to ToM. This article explores the tests conducted during the experiment, the results obtained, and their implications for advancing artificial intelligence.</p>
        </div>

        <button type="button" class="collapsible" id="relation-models">Theory of Mind and Its Relation to Language Models</button>
        <div class="content">
            <p>In humans, Theory of Mind develops during childhood, and most children begin to pass false-belief tests between ages 4 and 5. One of the best-known examples is the Sally-Anne test, which requires understanding that someone may act on incorrect information.</p>
            <p>Large language models like GPT-3 and GPT-4 are trained on massive amounts of text and generate highly coherent responses. Because that text is human language, it is full of references to mental states, which has led some researchers to ask whether ToM-like capabilities might emerge as a by-product of training.</p>
        </div>

        <button type="button" class="collapsible" id="experiment">The Experiment: Evaluating ToM in Language Models</button>
        <div class="content">
            <p>The study implemented a series of classical and modified false-belief tasks to measure whether LLMs could understand scenarios requiring mental inference. These tasks included two main types: unexpected contents tasks and unexpected transfer tasks.</p>
            <h3>1. Unexpected Contents Tasks</h3>
            <p>In these tasks, the models had to predict a protagonist's belief about a container whose contents did not match its label. For example, one scenario presented Sam finding a bag labeled "chocolate" that actually contained popcorn. The models needed to infer that Sam, upon reading the label, would assume the bag contained chocolate.</p>
            <h3>2. Unexpected Transfer Tasks</h3>
            <p>In these tasks, the models had to reason about a change of location that the protagonist had not witnessed. A typical example: John places a cat in a basket and leaves the room. While John is gone, Mark moves the cat to a box. The models then had to answer questions such as "Where will John look for the cat?", which requires recognizing that John did not see the move and still believes the cat is in the basket.</p>
            <p>Each task included additional controls, such as true-belief scenarios (where the protagonist had access to all information) and reversed versions (where initial and final labels or locations were swapped). These controls were designed to make it harder for a model to pass by guessing or by relying on simple statistical patterns alone.</p>
        </div>

        <button type="button" class="collapsible" id="results">Experiment Results</button>
        <div class="content">
            <p>Performance improved markedly from one model generation to the next. The results were as follows:</p>
            <ul>
                <li><strong>GPT-1 and GPT-2XL:</strong> Failed to solve any of the tasks.</li>
                <li><strong>GPT-3.5-turbo:</strong> Solved 20% of the tasks, comparable to a 3-year-old child.</li>
                <li><strong>GPT-4:</strong> Solved 75% of the tasks, on par with the performance of 6-year-old children on similar tests.</li>
            </ul>
        </div>

        <button type="button" class="collapsible" id="ethics">Ethical Implications and Future</button>
        <div class="content">
            <p>These capabilities could significantly enhance human-machine interaction, enabling virtual assistants to better anticipate user needs or be used in educational and therapeutic contexts. For example, an AI with ToM capabilities could assist individuals with social difficulties in improving interpersonal interactions.</p>
            <p>However, the ability to simulate understanding could also be used to manipulate or deceive users, generating mistrust in these technologies. Moreover, distinguishing between simulated behavior and real understanding will be crucial to ensure these tools are used ethically and safely.</p>
        </div>

        <button type="button" class="collapsible" id="conclusion">Conclusion</button>
        <div class="content">
            <p>The study shows that recent language models, especially GPT-4, have made impressive progress on Theory of Mind tasks. However, these results should not be mistaken for a true understanding of mental states, as the models' responses may simply be simulations based on statistical patterns.</p>
            <p>This progress raises new questions about how to responsibly integrate these technologies into our lives.</p>
        </div>

        <div class="button-container">
            <a href="https://arxiv.org/abs/2302.02083"
               target="_blank"
               class="button"
               rel="noopener noreferrer">
               Evaluating Large Language Models in Theory of Mind Tasks
            </a>
        </div>
    </main>

    <script>
        const collapsibles = document.querySelectorAll(".collapsible");
        collapsibles.forEach(button => {
            button.addEventListener("click", () => {
                button.classList.toggle("active");
                const content = button.nextElementSibling;
                content.style.maxHeight = content.style.maxHeight ? null : content.scrollHeight + "px";
            });
        });
    </script>
</body>
</html>
"""

# Display the HTML in Google Colab
display(HTML(html_content))
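
The task design described in the Experiment section (a false-belief scenario, a true-belief control, and reversed versions of both) can be illustrated with a small sketch. This is not the study's materials or scoring code: the story wording is paraphrased from the article, the three responders are hypothetical stand-ins for a model call, and counting a task as solved only when every scenario is answered correctly is an assumption made here for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    story: str
    question: str
    initial: str     # where John left the cat
    final: str       # where the cat actually is at the end
    witnessed: bool  # True for the true-belief control

    @property
    def expected(self) -> str:
        # John looks where he believes the cat is: the final location only
        # if he saw the move, otherwise the place where he left it.
        return self.final if self.witnessed else self.initial


def make_scenario(initial: str, final: str, witnessed: bool) -> Scenario:
    """Build one unexpected-transfer story (paraphrased, hypothetical wording)."""
    if witnessed:
        move = f"John watches as Mark moves the cat to the {final}."
    else:
        move = (f"John leaves the room. While he is away, Mark moves "
                f"the cat to the {final}. John returns.")
    story = f"John puts the cat in the {initial}. {move}"
    return Scenario(story, "Where will John look for the cat?",
                    initial, final, witnessed)


def build_task(a: str, b: str) -> list[Scenario]:
    """False-belief and true-belief scenarios, plus their reversed versions."""
    return [
        make_scenario(initial, final, witnessed)
        for initial, final in ((a, b), (b, a))   # original, then reversed
        for witnessed in (False, True)           # false belief, then control
    ]


Responder = Callable[[Scenario], str]


def task_solved(respond: Responder, scenarios: list[Scenario]) -> bool:
    """Assumed scoring rule: every scenario must be answered correctly."""
    return all(respond(s).strip().lower() == s.expected for s in scenarios)


# Hypothetical responders standing in for a language model.
def answers_reality(s: Scenario) -> str:
    return s.final  # ignores what John knows


def answers_first_mention(s: Scenario) -> str:
    return s.initial  # surface heuristic: first location in the story


def tracks_belief(s: Scenario) -> str:
    return s.final if s.witnessed else s.initial


scenarios = build_task("basket", "box")
for responder in (answers_reality, answers_first_mention, tracks_belief):
    print(responder.__name__, task_solved(responder, scenarios))
```

Under this scoring rule, the two shortcut responders each get half of the scenarios right but fail the task as a whole, which is the point of pairing false-belief scenarios with true-belief controls: neither "always report reality" nor "always report the first location" is enough.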
