# AD Classification Notebook

In this notebook we load the **Group 6 model**, which was fine-tuned on the ADReSS dataset.  
The model weights can be downloaded here: [weights](https://technionmail-my.sharepoint.com/my?id=%2Fpersonal%2Fyarden%5Fnahum%5Fcampus%5Ftechnion%5Fac%5Fil%2FDocuments%2F%D7%A4%D7%A8%D7%95%D7%99%D7%A7%D7%98%20%D7%AA%D7%9B%D7%9F&viewid=80aca3e3%2Dbd79%2D43a9%2D91ca%2D05ae2b1505ec)

After loading the model, we use it to classify the synthetic stories we generated for different personas.  
At the end, a **DataFrame** is displayed showing, for each story of a persona, the model's probability estimates (control vs. AD) together with its confidence, defined as the larger of the two probabilities.  
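
To make the reported columns concrete, here is a minimal sketch of how two logits turn into `prob_control`, `prob_ad`, `pred_label`, and `confidence`. It uses plain Python instead of `torch`, and the logits are made-up values chosen for illustration, not model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits):
    """Map two logits [control, AD] to the columns shown in the DataFrame."""
    prob_control, prob_ad = softmax(logits)
    return {
        "prob_control": prob_control,
        "prob_ad": prob_ad,
        # 0=Control, 1=AD; a tie goes to index 0, matching argmax behavior
        "pred_label": 1 if prob_ad > prob_control else 0,
        # confidence is simply the probability of the predicted class
        "confidence": max(prob_control, prob_ad),
    }


# Hypothetical logits for a single story
row = summarize([1.2, -0.3])
print(row)
```

Because there are only two classes, `confidence` always lies between 0.5 and 1.0, so a value near 0.5 means the model is close to undecided.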

Currently, the stories from **06/07** look more consistent and suitable for the model.

In [8]:
import json
from pathlib import Path
from typing import Dict, List, Optional, Union

import matplotlib.pyplot as plt
import pandas as pd
import torch
from IPython.display import display
from torch import nn
from transformers import AutoModel, AutoTokenizer

In [9]:
class ADClassifier(nn.Module):
    def __init__(self, model_name: str, dropout=0.1, hidden_dim=None):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        h = self.bert.config.hidden_size
        h2 = hidden_dim or (h // 2)
        self.drop = nn.Dropout(dropout)
        self.fc1 = nn.Linear(h, h2)
        self.bn = nn.BatchNorm1d(h2)
        self.act = nn.ReLU()
        self.fc2 = nn.Sequential(
            nn.Linear(h2, h2//2),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(h2//2, 2)
        )

    def forward(self, input_ids, attention_mask):
        out = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        x = out.pooler_output
        x = self.drop(x)
        x = self.fc1(x)
        x = self.bn(x)
        x = self.act(x)
        x = self.drop(x)
        return self.fc2(x)

In [10]:
# -------- CONFIG --------
MODEL_NAME = "bert-base-uncased"
CKPT_PATH = "best_model_base_bert.pt"
MAX_LEN = 512         # max tokens
BATCH_SIZE = 16
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# -------- LOAD MODEL + TOKENIZER --------
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = ADClassifier(model_name=MODEL_NAME, dropout=0.1).to(DEVICE)
state = torch.load(CKPT_PATH, map_location=DEVICE)
model.load_state_dict(state)
model.eval()

ADClassifier(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0-11): 12 x BertLayer(
          (attention): BertAttention(
            (self): BertSdpaSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_

In [11]:
@torch.no_grad()
def predict_proba(texts: List[str]) -> torch.Tensor:
    """Return class probabilities [N, 2]: [:,0]=Control, [:,1]=AD."""
    all_probs = []
    for i in range(0, len(texts), BATCH_SIZE):
        batch_texts = texts[i:i+BATCH_SIZE]
        enc = tokenizer(
            batch_texts,
            truncation=True,
            padding=True,               # dynamic padding
            max_length=MAX_LEN,
            return_tensors="pt"
        )
        enc = {k: v.to(DEVICE) for k, v in enc.items()}
        logits = model(enc["input_ids"], enc["attention_mask"])  # [B, 2]
        probs = torch.softmax(logits, dim=-1).cpu()
        all_probs.append(probs)
    return torch.cat(all_probs, dim=0) if all_probs else torch.empty(0, 2)


def compute_predicted_start_age(ages: List[int], prob_ad: List[float], threshold: float = 0.5) -> Optional[int]:
    """
    Optional helper:
    Return the first age where prob_ad >= threshold (default 0.5).
    If never crosses threshold, return None.
    """
    if not ages:
        return None
    # Ensure ages and prob_ad are aligned and sorted by age
    paired = sorted(zip(ages, prob_ad), key=lambda x: x[0])
    for age, p in paired:
        if p >= threshold:
            return age
    return None

In [12]:
def df_from_person_json(person: Dict, add_predicted_start_age: bool = False, threshold: float = 0.5) -> pd.DataFrame:
    """
    Build a DataFrame for a single person JSON with 'start_deterioration_age' column preserved.
    """
    name = person.get("name", "UNKNOWN")
    # <-- keep provided start age
    sda = person.get("start_deterioration_age", None)
    stories = person.get("stories", [])

    texts = [s.get("story", "") for s in stories]
    ages = [s.get("age", None) for s in stories]

    probs = predict_proba(texts)  # [N,2]
    if probs.numel() == 0:
        cols = ["person_name", "start_deterioration_age", "age",
                "story", "prob_control", "prob_ad", "pred_label", "confidence"]
        return pd.DataFrame(columns=cols)

    prob_control = probs[:, 0].numpy()
    prob_ad = probs[:, 1].numpy()
    pred_label = probs.argmax(dim=1).numpy()
    confidence = probs.max(dim=1).values.numpy()

    df = pd.DataFrame({
        "person_name": name,
        "start_deterioration_age": sda,            # <-- column per row
        "age": ages,
        "story": texts,
        "prob_control": prob_control,
        "prob_ad": prob_ad,
        "pred_label": pred_label,                  # 0=Control, 1=AD
        "confidence": confidence
    }).sort_values("age", kind="stable").reset_index(drop=True)

    if add_predicted_start_age:
        predicted = compute_predicted_start_age(
            df["age"].tolist(), df["prob_ad"].tolist(), threshold=threshold)
        # same value for all rows of this person
        df["predicted_start_age"] = predicted
        # store threshold as metadata (optional)
        df.attrs["predicted_start_age_threshold"] = threshold

    return df

In [13]:
def run_on_json_file(json_path: Union[str, Path], add_predicted_start_age: bool = False, threshold: float = 0.5) -> pd.DataFrame:
    with open(json_path, "r", encoding="utf-8") as f:
        person = json.load(f)
    return df_from_person_json(person, add_predicted_start_age=add_predicted_start_age, threshold=threshold)


def run_on_json_dir(json_dir: Union[str, Path], add_predicted_start_age: bool = False, threshold: float = 0.5) -> pd.DataFrame:
    json_dir = Path(json_dir)
    dfs = []
    for p in sorted(json_dir.glob("*.json")):
        try:
            df = run_on_json_file(
                p, add_predicted_start_age=add_predicted_start_age, threshold=threshold)
            df["source_file"] = p.name
            dfs.append(df)
        except Exception as e:
            print(f"[WARN] Failed on {p}: {e}")
    if not dfs:
        cols = ["person_name", "start_deterioration_age", "age", "story",
                "prob_control", "prob_ad", "pred_label", "confidence", "source_file"]
        if add_predicted_start_age:
            cols.append("predicted_start_age")
        return pd.DataFrame(columns=cols)
    return pd.concat(dfs, ignore_index=True)

# Usage
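
The loaders above read a persona file with a `name`, a `start_deterioration_age`, and a list of `stories`, each holding an `age` and a `story`. The sketch below shows that shape, with field names inferred from `df_from_person_json`. The persona, file name, and story text are hypothetical placeholders.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical persona; only the field names come from the notebook code.
person = {
    "name": "Example Persona",
    "start_deterioration_age": 69,
    "stories": [
        {"age": 63, "story": "First retelling of the memory."},
        {"age": 60, "story": "An earlier retelling of the memory."},
    ],
}

# Round-trip through a file, as run_on_json_file would read it.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example-persona.json"
    path.write_text(json.dumps(person, ensure_ascii=False), encoding="utf-8")
    loaded = json.loads(path.read_text(encoding="utf-8"))

# Same extraction pattern as df_from_person_json: missing keys fall back to defaults.
texts = [s.get("story", "") for s in loaded.get("stories", [])]
ages = [s.get("age") for s in loaded.get("stories", [])]
print(ages, len(texts))
```

Stories do not need to be stored in age order, since the resulting DataFrame is sorted by `age` before it is returned.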

In [18]:
# Path updated to new dementia stories (relative path)
# Example: run on a single persona file
# anthony-samuel-reyes
df_one = run_on_json_file(
    "stories/data_oct_6/dementia/jacob-stein.json",
    add_predicted_start_age=True,
    threshold=0.5,
)
df_one

| | person_name | start_deterioration_age | age | story | prob_control | prob_ad | pred_label | confidence | predicted_start_age |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Jacob Stein | 69 | 60 | It was my first lecture back after sabbatical—... | 0.938315 | 0.061685 | 0 | 0.938315 | 72 |
| 1 | Jacob Stein | 69 | 63 | That lecture after I came back from sabbatical... | 0.945287 | 0.054713 | 0 | 0.945287 | 72 |
| 2 | Jacob Stein | 69 | 66 | I did this demonstration once—well, many times... | 0.846283 | 0.153717 | 0 | 0.846283 | 72 |
| 3 | Jacob Stein | 69 | 69 | There was this lecture I gave... I had a tube,... | 0.50967 | 0.49033 | 0 | 0.50967 | 72 |
| 4 | Jacob Stein | 69 | 72 | I had this demonstration... a tube, I think. A... | 0.222562 | 0.777438 | 1 | 0.777438 | 72 |
| 5 | Jacob Stein | 69 | 75 | There was a feather... and something else. In ... | 0.075813 | 0.924187 | 1 | 0.924187 | 72 |
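
As a worked check of the `predicted_start_age` column, the sketch below re-applies the first-crossing rule to the `prob_ad` values reported for Jacob Stein. The function name `first_crossing_age` is a stand-in written for this example. It follows the same logic as `compute_predicted_start_age` but does not replace it.

```python
def first_crossing_age(ages, prob_ad, threshold=0.5):
    """Return the earliest age whose prob_ad reaches the threshold, else None."""
    for age, p in sorted(zip(ages, prob_ad), key=lambda pair: pair[0]):
        if p >= threshold:
            return age
    return None


# Values copied from the output table above
ages = [60, 63, 66, 69, 72, 75]
prob_ad = [0.061685, 0.054713, 0.153717, 0.49033, 0.777438, 0.924187]
labeled_start = 69  # start_deterioration_age from the persona file

predicted = first_crossing_age(ages, prob_ad, threshold=0.5)
print(f"predicted={predicted}, labeled={labeled_start}, gap={predicted - labeled_start}")

# The age-69 story sits just under 0.5, so a slightly lower threshold moves the estimate.
print(first_crossing_age(ages, prob_ad, threshold=0.49))
```

With the default threshold the prediction lands one story (three years) after the labeled start, because the age-69 story scores 0.49033, just below 0.5.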


## Previous: Prediction on new stories (using Copilot - 10/8 data)

In [8]:
import random


def generate_texts(n):
    # Story-based texts with speech-to-text features that indirectly show symptoms
    prompts = [
        "So, uhh... last Tuesday I was getting ready to go to the store, you know? And I put my keys down somewhere while I was... while I was looking for my wallet. Then when I went to leave, I couldn't... ahh... I searched everywhere for those keys. Kitchen counter, coffee table, even checked my pockets three times. My wife found them in the refrigerator, of all places. I don't... I honestly don't remember putting them there. She just laughed and said I must have been thinking about something else, but... but it felt strange, you know? Like there was this gap where... where I should remember something but it's just... blank.",

        "Yesterday at the grocery store, I ran into this woman from... from our church, I think? She waved at me and came over with this big smile, talking about how her daughter just graduated. And I'm standing there thinking, 'I know this person, I definitely know her,' but her name just... it wouldn't come to me. So I just smiled and nodded and said congratulations, hoping she wouldn't notice that I... that I had no idea what her name was. It was only when I got home that it hit me - Margaret! From the choir. But in that moment, it was like... like trying to grab something in the dark.",

        "Last week my grandson had his birthday party, and I was so excited to go. I had it written down on my calendar and everything. But when I showed up on Sunday afternoon with his present, nobody was home. Turns out the party was on Saturday. I just... I was so sure it was Sunday. The look on his face when I came by the next day to give him his gift... he tried to be polite about it, but I could tell he was disappointed. My daughter said it happens to everyone, but... but I used to be the one who remembered everything for the whole family.",

        "My daughter got me this new smartphone for Christmas, and she sat with me for two hours trying to show me how to... how to use it. She'd say something like 'First tap here, then swipe this way,' and by the time I found the first button, I'd already forgotten what came after. We went through the same steps maybe five times, and each time it felt like... like she was explaining it for the first time. I could see her getting a little frustrated, though she tried not to show it. Finally she just set it up for me and said we'd practice later.",

        "I was making my famous apple pie last Sunday - the recipe I've been using for thirty years. Got all the ingredients out, started mixing everything together. But then I'm standing there with the salt shaker in my hand, staring at the bowl, and I can't... I couldn't remember if I'd already added it or not. The mixture looked the same either way. So I tasted it, but that didn't help much. Ended up starting over because I was worried I'd... I'd ruin it if I guessed wrong. Took me three times as long as usual.",

        "I've been doing fine, really. Still drive myself to the store, still handle all my bills and paperwork. Last month though, my son came over and noticed I had three bottles of milk in the fridge. All with different expiration dates. I told him I must have forgotten I already had some when I went shopping. He didn't say much, just helped me clean out the fridge. But I could see him and his wife exchanging looks. They think I don't notice, but... but I do. They're watching me more carefully now.",

        "We were at my nephew's wedding last month - beautiful ceremony. During the reception, this man came up to me and started talking about golf, asking how my swing was coming along. He seemed to know me really well, but I... I couldn't place him at all. I just went along with it, laughing at his jokes and nodding when he talked about our 'last game.' It wasn't until my wife came over and said 'Hi, Bob' that I realized it was our neighbor from three doors down. We've been talking over the fence for fifteen years.",

        "My wife asked me to pick up some things from the pharmacy while I was out running errands. Simple enough - just her prescription and some aspirin. But when I got there, I stood in line for ten minutes trying to remember what the second thing was. I knew it started with 'A'... aspirin? Antacid? Advil? The pharmacist was very patient while I called my wife from their phone. Turned out it was just aspirin, but standing there not knowing... it felt like my mind was full of fog.",

        "I've been driving these same roads for forty years, so when I got a little turned around on my way to the hardware store last week, I wasn't too worried at first. But then I found myself on a street I didn't recognize, and then another one. The landmarks all looked... different somehow. It took me twenty minutes of driving in circles before I found Main Street again. When I finally got home, my wife asked why I was so late. I just told her there was traffic.",

        "I was telling my neighbor about this great movie I'd watched the night before - really enjoyed it, lots of action and... and good acting. But halfway through describing the plot, he got this funny look on his face. Turns out I'd told him the exact same story about the exact same movie just two days earlier. I honestly... I thought I was telling him for the first time. He was polite about it, just said 'Oh yeah, you mentioned that one.' But I could tell he was... he was concerned. Made me wonder what else I might be repeating without knowing it."
    ]
    return random.sample(prompts, n)


MAX_ATTEMPTS = 20  # safety cap so the loop cannot run forever
results = []
for attempt in range(1, MAX_ATTEMPTS + 1):
    texts = generate_texts(5)
    probs = predict_proba(texts)
    prob_ad = probs[:, 1].numpy()
    results.extend(zip(texts, prob_ad))
    print(f"Attempt {attempt}: prob_ad values: {prob_ad}")
    # Stop as soon as at least one sampled story is classified as AD
    if (prob_ad > 0.5).any():
        break

# Show all tested texts and their prob_ad
for i, (text, p) in enumerate(results):
    print(f"Text {i+1}: prob_ad={p:.3f}")
    print(f"Length: {len(text.split())} words")
    print(f"Text: {text[:100]}...")
    print()

Attempt 1: prob_ad values: [0.0554108  0.2319879  0.66435313 0.6493704  0.35883102]
Text 1: prob_ad=0.055
Length: 89 words
Text: My wife asked me to pick up some things from the pharmacy while I was out running errands. Simple en...

Text 2: prob_ad=0.232
Length: 92 words
Text: I've been doing fine, really. Still drive myself to the store, still handle all my bills and paperwo...

Text 3: prob_ad=0.664
Length: 88 words
Text: I've been driving these same roads for forty years, so when I got a little turned around on my way t...

Text 4: prob_ad=0.649
Length: 111 words
Text: Yesterday at the grocery store, I ran into this woman from... from our church, I think? She waved at...

Text 5: prob_ad=0.359
Length: 96 words
Text: I was making my famous apple pie last Sunday - the recipe I've been using for thirty years. Got all ...



In [9]:
# Test all 10 story-based texts and see which ones score highest
all_texts = [
    "So, uhh... last Tuesday I was getting ready to go to the store, you know? And I put my keys down somewhere while I was... while I was looking for my wallet. Then when I went to leave, I couldn't... ahh... I searched everywhere for those keys. Kitchen counter, coffee table, even checked my pockets three times. My wife found them in the refrigerator, of all places. I don't... I honestly don't remember putting them there. She just laughed and said I must have been thinking about something else, but... but it felt strange, you know? Like there was this gap where... where I should remember something but it's just... blank.",

    "Yesterday at the grocery store, I ran into this woman from... from our church, I think? She waved at me and came over with this big smile, talking about how her daughter just graduated. And I'm standing there thinking, 'I know this person, I definitely know her,' but her name just... it wouldn't come to me. So I just smiled and nodded and said congratulations, hoping she wouldn't notice that I... that I had no idea what her name was. It was only when I got home that it hit me - Margaret! From the choir. But in that moment, it was like... like trying to grab something in the dark.",

    "Last week my grandson had his birthday party, and I was so excited to go. I had it written down on my calendar and everything. But when I showed up on Sunday afternoon with his present, nobody was home. Turns out the party was on Saturday. I just... I was so sure it was Sunday. The look on his face when I came by the next day to give him his gift... he tried to be polite about it, but I could tell he was disappointed. My daughter said it happens to everyone, but... but I used to be the one who remembered everything for the whole family.",

    "My daughter got me this new smartphone for Christmas, and she sat with me for two hours trying to show me how to... how to use it. She'd say something like 'First tap here, then swipe this way,' and by the time I found the first button, I'd already forgotten what came after. We went through the same steps maybe five times, and each time it felt like... like she was explaining it for the first time. I could see her getting a little frustrated, though she tried not to show it. Finally she just set it up for me and said we'd practice later.",

    "I was making my famous apple pie last Sunday - the recipe I've been using for thirty years. Got all the ingredients out, started mixing everything together. But then I'm standing there with the salt shaker in my hand, staring at the bowl, and I can't... I couldn't remember if I'd already added it or not. The mixture looked the same either way. So I tasted it, but that didn't help much. Ended up starting over because I was worried I'd... I'd ruin it if I guessed wrong. Took me three times as long as usual.",

    "I've been doing fine, really. Still drive myself to the store, still handle all my bills and paperwork. Last month though, my son came over and noticed I had three bottles of milk in the fridge. All with different expiration dates. I told him I must have forgotten I already had some when I went shopping. He didn't say much, just helped me clean out the fridge. But I could see him and his wife exchanging looks. They think I don't notice, but... but I do. They're watching me more carefully now.",

    "We were at my nephew's wedding last month - beautiful ceremony. During the reception, this man came up to me and started talking about golf, asking how my swing was coming along. He seemed to know me really well, but I... I couldn't place him at all. I just went along with it, laughing at his jokes and nodding when he talked about our 'last game.' It wasn't until my wife came over and said 'Hi, Bob' that I realized it was our neighbor from three doors down. We've been talking over the fence for fifteen years.",

    "My wife asked me to pick up some things from the pharmacy while I was out running errands. Simple enough - just her prescription and some aspirin. But when I got there, I stood in line for ten minutes trying to remember what the second thing was. I knew it started with 'A'... aspirin? Antacid? Advil? The pharmacist was very patient while I called my wife from their phone. Turned out it was just aspirin, but standing there not knowing... it felt like my mind was full of fog.",

    "I've been driving these same roads for forty years, so when I got a little turned around on my way to the hardware store last week, I wasn't too worried at first. But then I found myself on a street I didn't recognize, and then another one. The landmarks all looked... different somehow. It took me twenty minutes of driving in circles before I found Main Street again. When I finally got home, my wife asked why I was so late. I just told her there was traffic.",

    "I was telling my neighbor about this great movie I'd watched the night before - really enjoyed it, lots of action and... and good acting. But halfway through describing the plot, he got this funny look on his face. Turns out I'd told him the exact same story about the exact same movie just two days earlier. I honestly... I thought I was telling him for the first time. He was polite about it, just said 'Oh yeah, you mentioned that one.' But I could tell he was... he was concerned. Made me wonder what else I might be repeating without knowing it."
]

# Test all texts and sort by prob_ad
probs = predict_proba(all_texts)
prob_ad_all = probs[:, 1].numpy()

# Create a list of (text, prob_ad, word_count) and sort by prob_ad
results_sorted = []
for i, (text, prob) in enumerate(zip(all_texts, prob_ad_all)):
    word_count = len(text.split())
    results_sorted.append((i+1, text, prob, word_count))

# Sort by prob_ad descending
results_sorted.sort(key=lambda x: x[2], reverse=True)

print("All story-based texts ranked by AD probability (highest first):")
print("=" * 70)
for rank, (text_num, text, prob, words) in enumerate(results_sorted, 1):
    print(f"Rank {rank}: Text {text_num} - prob_ad={prob:.3f} ({words} words)")
    print(f"Story: {text[:100]}...")
    print()

All story-based texts ranked by AD probability (highest first):
Rank 1: Text 9 - prob_ad=0.664 (88 words)
Story: I've been driving these same roads for forty years, so when I got a little turned around on my way t...

Rank 2: Text 2 - prob_ad=0.649 (111 words)
Story: Yesterday at the grocery store, I ran into this woman from... from our church, I think? She waved at...

Rank 3: Text 1 - prob_ad=0.548 (111 words)
Story: So, uhh... last Tuesday I was getting ready to go to the store, you know? And I put my keys down som...

Rank 4: Text 7 - prob_ad=0.379 (97 words)
Story: We were at my nephew's wedding last month - beautiful ceremony. During the reception, this man came ...

Rank 5: Text 5 - prob_ad=0.359 (96 words)
Story: I was making my famous apple pie last Sunday - the recipe I've been using for thirty years. Got all ...

Rank 6: Text 4 - prob_ad=0.350 (105 words)
Story: My daughter got me this new smartphone for Christmas, and she sat with me for two hours trying to sh...

Rank 7: T

# Longitudinal Story Analysis

This section tests one core memory (a car accident at age 40) retold at ages 63, 66, and 69, with each retelling written to show more cognitive decline than the previous one.

In [10]:
# Longitudinal story: Same core memory (car accident at age 40) retold at different ages
# Core event: A car accident where the person hit a deer while driving to work

longitudinal_stories = {
    "Age 63 (2023)": """
    You know, I was just thinking about this accident I had back when I was... oh, must have been around forty. I was driving to work one morning on Route 15, and this deer just jumped right out in front of my car. Boom! Hit it head on. The whole front end was smashed up pretty bad. I remember calling my boss from the side of the road to tell him I'd be late. Had to wait for the tow truck for about an hour. The deer didn't make it, poor thing. Insurance covered most of it, but I was without a car for two weeks. My wife had to drive me to work during that time. It was a real hassle, but these things happen, you know? At least no one got hurt.
    """,

    "Age 66 (2026)": """
    Oh, that reminds me of this accident I had... when was it? Must have been in my forties sometime. I was driving to work - or maybe it was coming home? Anyway, this deer came out of nowhere and... and I hit it. The car was pretty banged up. I remember being stuck there for a while waiting for... for someone to come help me. A tow truck, I think. The deer was... well, it didn't survive. I felt terrible about that. My insurance took care of most of it, but I was without a car for... oh, it felt like forever. Maybe a week? Two weeks? My wife - what's her name... Sarah, yes Sarah - she had to drive me around. She wasn't too happy about that, I can tell you. But these things happen, right?
    """,

    "Age 69 (2029)": """
    There was this time... I can't quite remember when exactly, but I was driving somewhere important. Work, I think? Or maybe the store? Anyway, this animal - a deer, I'm pretty sure it was a deer - it just appeared right in front of me. I couldn't stop in time and... and there was this loud crash. The front of my car was all smashed up. I remember sitting there feeling confused about what to do next. Someone came to help me eventually, but I can't recall who it was. A police officer? Or maybe... no, it was someone with a truck. They took my car away. I had to call someone to pick me up, but I'm not sure who I called. My wife maybe? Though I'm having trouble remembering... was she still driving then? Anyway, the whole thing was very upsetting. I kept thinking I should remember more details, but it all feels so foggy now.
    """
}

# Test the longitudinal progression
ages = [63, 66, 69]
stories = [
    longitudinal_stories[f"Age {age} ({2023 + (age-63)})"] for age in ages]

# Get predictions for all three stories
long_probs = predict_proba(stories)
long_prob_ad = long_probs[:, 1].numpy()

print("Longitudinal Story Analysis: Same Memory Retold Over Time")
print("=" * 60)
print("Core Event: Car accident with deer at age 40")
print()

for i, (age, story, prob) in enumerate(zip(ages, stories, long_prob_ad)):
    year = 2023 + (age - 63)
    word_count = len(story.split())

    print(f"Age {age} ({year}): prob_ad = {prob:.3f} ({word_count} words)")
    print(f"Story preview: {story.strip()[:120]}...")
    print()

# Calculate progression
prob_increase_66 = long_prob_ad[1] - long_prob_ad[0]
prob_increase_69 = long_prob_ad[2] - long_prob_ad[1]
total_increase = long_prob_ad[2] - long_prob_ad[0]

print("Probability Progression:")
# The "+" format flag prints an explicit sign, so a drop shows as "-0.386" instead of "+-0.386"
print(f"Age 63 → 66: {prob_increase_66:+.3f}")
print(f"Age 66 → 69: {prob_increase_69:+.3f}")
print(f"Total change: {total_increase:+.3f}")

Longitudinal Story Analysis: Same Memory Retold Over Time
Core Event: Car accident with deer at age 40

Age 63 (2023): prob_ad = 0.490 (134 words)
Story preview: You know, I was just thinking about this accident I had back when I was... oh, must have been around forty. I was drivin...

Age 66 (2026): prob_ad = 0.816 (138 words)
Story preview: Oh, that reminds me of this accident I had... when was it? Must have been in my forties sometime. I was driving to work ...

Age 69 (2029): prob_ad = 0.430 (158 words)
Story preview: There was this time... I can't quite remember when exactly, but I was driving somewhere important. Work, I think? Or may...

Probability Progression:
Age 63 → 66: +0.326
Age 66 → 69: -0.386
Total change: -0.060


# Systematic Approach: Achieving Sequential Increase

We now try different story-writing strategies until `prob_ad` increases strictly with age, i.e. prob_ad(63) < prob_ad(66) < prob_ad(69).
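
The goal can be written as a small check that works for any number of ages. The sketch below is one way to express it. The `min_step` margin is an added assumption for this example, useful when a tiny increase should not count as real progression.

```python
def is_increasing(values, min_step=0.0):
    """True if every consecutive step is larger than min_step."""
    return all(b - a > min_step for a, b in zip(values, values[1:]))


# prob_ad for ages 63, 66, 69 from the first longitudinal run above
first_run = [0.490, 0.816, 0.430]
print(is_increasing(first_run))

# Hypothetical sequence: increasing, but not by at least 0.15 per step
print(is_increasing([0.1, 0.2, 0.3]), is_increasing([0.1, 0.2, 0.3], min_step=0.15))
```

The first run fails the check because the score drops between ages 66 and 69.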

In [11]:
# Testing function for systematic attempts
def test_longitudinal_attempt(attempt_num, stories_dict, description):
    """Test a set of longitudinal stories and return results"""
    ages = [63, 66, 69]
    stories = [stories_dict[f"Age {age}"] for age in ages]

    probs = predict_proba(stories)
    prob_ad = probs[:, 1].numpy()

    print(f"ATTEMPT {attempt_num}: {description}")
    print("=" * 50)

    for i, (age, story, prob) in enumerate(zip(ages, stories, prob_ad)):
        word_count = len(story.split())
        print(f"Age {age}: prob_ad = {prob:.3f} ({word_count} words)")
        print(f"Preview: {story.strip()[:100]}...")
        print()

    # Check if sequential increase achieved
    sequential = prob_ad[0] < prob_ad[1] < prob_ad[2]
    increase_66 = prob_ad[1] - prob_ad[0]
    increase_69 = prob_ad[2] - prob_ad[1]
    total_increase = prob_ad[2] - prob_ad[0]

    print(
        f"Progression: {prob_ad[0]:.3f} → {prob_ad[1]:.3f} → {prob_ad[2]:.3f}")
    # The "+" format flag prints an explicit sign, so decreases show as "-0.026" instead of "+-0.026"
    print(
        f"Changes: {increase_66:+.3f}, {increase_69:+.3f} (Total: {total_increase:+.3f})")
    print(f"Sequential increase: {'✅ SUCCESS!' if sequential else '❌ Failed'}")
    print()

    return sequential, prob_ad


# ATTEMPT 1: Minimize age 66 "sweet spot", maximize age 69 confusion
attempt1_stories = {
    "Age 63": """
    I had a car accident when I was forty. Hit a deer while driving to work. Called my boss, waited for tow truck. Deer died. Insurance covered repairs. Wife drove me to work for two weeks.
    """,

    "Age 66": """
    There was an accident... I hit a deer while driving. Car got damaged. Had to wait for help. The deer died. Insurance helped. My wife drove me around for a while.
    """,

    "Age 69": """
    I can't remember... there was an accident. Hit something... an animal? My car was broken. I was so confused. Couldn't remember what to do. Someone helped but I don't know who. Had to call... who did I call? My daughter? I can't remember her name. Was lost. Couldn't find my way home. People asked questions but I didn't know the answers. What street? What time? I don't know. Everything is foggy. I forget things. Can't remember names. Get lost all the time.
    """
}

success1, probs1 = test_longitudinal_attempt(
    1, attempt1_stories, "Short early stories, long confused final story")

ATTEMPT 1: Short early stories, long confused final story
Age 63: prob_ad = 0.913 (36 words)
Preview: I had a car accident when I was forty. Hit a deer while driving to work. Called my boss, waited for ...

Age 66: prob_ad = 0.887 (31 words)
Preview: There was an accident... I hit a deer while driving. Car got damaged. Had to wait for help. The deer...

Age 69: prob_ad = 0.668 (82 words)
Preview: I can't remember... there was an accident. Hit something... an animal? My car was broken. I was so c...

Progression: 0.913 → 0.887 → 0.668
Changes: -0.026, -0.219 (Total: -0.245)
Sequential increase: ❌ Failed



In [12]:
# ATTEMPT 5: Similar-length stories (no normalization), same core memory, escalating cues by age
attempt5_stories = {
    "Age 63": """
    When I was forty, I had a car accident on Route 15 on my way to work. A deer leapt into the lane and I hit it before I could brake. I pulled over safely, called my boss Mr. Johnson, and then phoned the tow service from the mile marker near the old bridge. I waited about an hour; the driver Mike helped me file the basic report. The deer died, which made me feel awful, but I handled the insurance claim and kept receipts and dates in a folder. My wife Sarah drove me for two weeks while the shop did repairs. I remember the cross street, the weather, and the time pretty clearly, and I told the story once to my neighbor without mixing anything up.
    """,

    "Age 66": """
    I’ve been thinking again about that accident from my forties. I was on Route 15—at least I think it was 15—driving home from work when a deer jumped out. I hit it, pulled over, and called for a tow. I hesitated for a second on the company’s name, and I paused on my wife’s name—Sarah—before I said it. A few details were fuzzy, like the exact cross street or whether the mile marker was before or after the bridge, but I explained what happened and the sequence of calls. I finished the insurance forms even though I needed to double-check the date. Later, telling it to a friend, I repeated one part and had to start that sentence over, but I still felt mostly confident about the main points.
    """,

    "Age 69": """
    I keep retelling that accident, and sometimes I’m not sure if I said this already. I was driving—maybe from work, maybe to the store—and an animal, I think a deer, ran out. I hit it and stopped, but then I felt stuck. I tried to call someone and couldn’t pull the number up, so I asked a bystander, and then asked again because I lost the thread. When a man spoke to me, I knew his face but used the wrong name. For a moment I blanked on my daughter’s name too. They asked me the street and the time and I didn’t know. When I left, I circled and got turned around before I found the main road. I felt foggy, repeated myself, and needed step-by-step help to get home.
    """,
}

success5, probs5 = test_longitudinal_attempt(
    5, attempt5_stories, "Similar-length stories without normalization; escalating cues by age")

ATTEMPT 5: Similar-length stories without normalization; escalating cues by age
Age 63: prob_ad = 0.191 (128 words)
Preview: When I was forty, I had a car accident on Route 15 on my way to work. A deer leapt into the lane and...

Age 66: prob_ad = 0.193 (129 words)
Preview: I’ve been thinking again about that accident from my forties. I was on Route 15—at least I think it ...

Age 69: prob_ad = 0.374 (131 words)
Preview: I keep retelling that accident, and sometimes I’m not sure if I said this already. I was driving—may...

Progression: 0.191 → 0.193 → 0.374
Changes: +0.002, +0.181 (Total: +0.183)
Sequential increase: ✅ SUCCESS!



## Why the decline happened and how we reversed it

- The model tends to score highest on mild-to-moderate symptoms (age 66), and drops when the narrative becomes very fragmented (age 69). This likely reflects patterns in its training data, where moderately impaired speech shows clearer AD signals than severely disorganized speech.
- Length differences can bias scores. Short, confident narratives (often at 63) and very long, fragmented ones (at 69) can both shift probabilities in unintended ways.
- Reversal strategy: keep story lengths similar (within ~10 words) and escalate high-signal cues gradually: clear, detailed recall at 63 → mild name/route uncertainty at 66 → stronger name/face confusion, getting lost, and repeated questions at 69. This produced a monotonic increase in Attempt 5 (0.191 → 0.193 → 0.374), although the 63 → 66 step is only +0.002.
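
The two checks described above (similar story lengths, and a rising probability across ages) can be sketched as small helpers. This is a minimal sketch, not the notebook's `test_longitudinal_attempt`; the function names `word_count_spread` and `is_strictly_increasing` and the `min_step` parameter are hypothetical, introduced only for illustration. The probabilities are copied from the outputs shown in this notebook.

```python
from typing import Dict, List


def word_count_spread(stories: Dict[str, str]) -> int:
    """Gap between the longest and shortest story, in whitespace-separated words."""
    counts = [len(text.split()) for text in stories.values()]
    return max(counts) - min(counts)


def is_strictly_increasing(probs: List[float], min_step: float = 0.0) -> bool:
    """True if every consecutive step rises by more than min_step (hypothetical threshold)."""
    return all(b - a > min_step for a, b in zip(probs, probs[1:]))


# prob_ad values reported in the outputs above
declining_run = [0.913, 0.887, 0.668]  # the run shown just before Attempt 5
attempt5 = [0.191, 0.193, 0.374]

print(is_strictly_increasing(declining_run))            # False
print(is_strictly_increasing(attempt5))                 # True
print(is_strictly_increasing(attempt5, min_step=0.01))  # False: the 63 -> 66 step is only +0.002

# Toy stories (not the notebook's) to show the length check
toy_stories = {"Age 63": "I hit a deer", "Age 66": "I think I hit a deer"}
print(word_count_spread(toy_stories))  # 2
```

Requiring a minimum step is one possible way to separate a real increase from a change that is probably within noise; whether 0.01 is a sensible threshold for this model is an assumption, not something the notebook establishes.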

In [13]:
# ATTEMPT 6: Similar-length stories (within ~10 words), reverse decline with calibrated cues
attempt6_stories = {
    "Age 63": """
    When I was forty, I had a small car accident on Route 15 while driving to work. A deer jumped into my lane and I hit it before I could stop. I pulled over, called my boss Mr. Johnson, and then called the tow service using the sign by the old bridge. I waited calmly for about an hour; the driver Mike helped me note the time and location. The deer did not survive, which made me sad, but the process was straightforward. I filed the insurance claim, kept the receipts in a folder, and scheduled the repairs. My wife Sarah drove me to work for two weeks. I remember the weather, the cross street, and the mile marker clearly, and when I described it later, I did not repeat myself or lose the details.
    """,

    "Age 66": """
    I keep revisiting that accident from my forties. I was on Route 15—yes, I’m fairly sure—driving home when a deer darted out. I hit it, pulled over, and called for a tow, but I stalled on the company name and asked my wife—Sarah—then asked again because I lost it. I repeated the mile‑marker bit and the bridge detail once before I caught myself. The dispatcher asked the time; I guessed, corrected it, and wrote it down so I wouldn’t forget. I mixed the cross street—Elm? no, Oak—and fixed that too. I double‑checked the insurance date, signed the form twice by mistake, and laughed it off. When I told my neighbor later, I restarted mid‑sentence and then finished in order. The main sequence stayed intact, but I needed small prompts to keep it straight.
    """,

    "Age 69": """
    I keep telling the same accident story. Sometimes I can’t tell if I already said this part... I was driving—maybe from work, maybe to the store—and an animal, I think a deer, ran out. I hit it and stopped. I tried to call someone but I couldn’t remember the number, so I asked a bystander and then asked again because I lost the thread. A man spoke to me and I knew his face but used the wrong name. For a moment I blanked on my daughter’s name too. They asked me the street and the time, and I didn’t know. When I left, I circled and got turned around before I found the main road. I felt foggy, repeated myself, and needed step-by-step help to get home safely.
    """,
}

# Quick length check (should be within ~10 words)
for k, v in attempt6_stories.items():
    print(k, "→", len(v.split()), "words")

success6, probs6 = test_longitudinal_attempt(
    6, attempt6_stories, "Similar-length within ~10 words; reversing decline with calibrated cues (v2)")


# Observation: short sentences raise prob_ad. One option is to use longer sentences in the
# earlier-age stories and shorten them, or split them in two, in the later-age stories.

Age 63 → 135 words
Age 66 → 133 words
Age 69 → 129 words
ATTEMPT 6: Similar-length within ~10 words; reversing decline with calibrated cues (v2)
Age 63: prob_ad = 0.152 (135 words)
Preview: When I was forty, I had a small car accident on Route 15 while driving to work. A deer jumped into m...

Age 66: prob_ad = 0.386 (133 words)
Preview: I keep revisiting that accident from my forties. I was on Route 15—yes, I’m fairly sure—driving home...

Age 69: prob_ad = 0.405 (129 words)
Preview: I keep telling the same accident story. Sometimes I can’t tell if I already said this part... I was ...

Progression: 0.152 → 0.386 → 0.405
Changes: +0.234, +0.019 (Total: +0.252)
Sequential increase: ✅ SUCCESS!
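
The comment in the cell above notes that short sentences seem to raise `prob_ad`. One way to keep an eye on this while writing stories is to measure the mean sentence length of each text. The helper below is a rough sketch under stated assumptions: `mean_sentence_length` is a hypothetical name, sentences are split naively on `.`, `!`, or `?` followed by whitespace (so ellipses and abbreviations are not handled specially), and the two example texts are illustrative rather than taken from the attempts.

```python
import re
from typing import List


def sentence_lengths(text: str) -> List[int]:
    """Word count of each sentence, splitting naively after ., ! or ? plus whitespace."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def mean_sentence_length(text: str) -> float:
    """Average number of words per sentence; 0.0 for empty text."""
    lengths = sentence_lengths(text)
    return sum(lengths) / len(lengths) if lengths else 0.0


# Illustrative texts: the same content as one long sentence and as three short ones
long_form = "I pulled over, called my boss, and then called the tow service."
short_form = "I pulled over. I called my boss. Then I called the tow service."

print(mean_sentence_length(long_form))  # 12.0
print(mean_sentence_length(short_form) < mean_sentence_length(long_form))  # True
```

Printing this value for each age next to the word count would show whether sentence length, rather than the intended cues, differs between the stories being compared.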

