# Task 0 (Try 2): Story-Level Generation

**Problem:** The paragraph-level approach in Try 1 made the AI text too easy to detect. The explicit structural constraints and lens-based prompts created predictable patterns in punctuation density, sentence rhythm, and readability metrics.

**Solution:** Generate longer continuous narratives (1,500-3,000 words) and segment them naturally into paragraphs afterward. This allows organic variation in:
- Sentence length and rhythm
- Punctuation patterns  
- Stylistic drift across paragraphs
- Topic emergence through narrative flow
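
To make "sentence rhythm" and "punctuation density" concrete, here is a minimal sketch of how such features could be measured per paragraph. The function name `rhythm_features` and the naive sentence splitter are assumptions for illustration; they are not the metrics code used elsewhere in this project.

```python
import re
import statistics


def rhythm_features(paragraph: str) -> dict:
    """Rough per-paragraph rhythm and punctuation features (illustrative only)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    n_words = len(paragraph.split())
    n_punct = len(re.findall(r"[,;:.!?\-]", paragraph))
    return {
        "n_sentences": len(sentences),
        "mean_sentence_len": statistics.mean(lengths) if lengths else 0.0,
        # Population standard deviation: higher means more varied rhythm
        "stdev_sentence_len": statistics.pstdev(lengths) if lengths else 0.0,
        "punct_per_word": n_punct / n_words if n_words else 0.0,
    }


features = rhythm_features("She waited. The rain, steady and cold, fell all night!")
print(features)
```

Comparing the spread of `stdev_sentence_len` and `punct_per_word` between classes is one way to check whether story-level generation actually reduces the predictable patterns seen in Try 1.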

This notebook generates Classes 2 and 3 using the story-level approach. Class 1 remains the same human text from Project Gutenberg.

**Output files:**
- `class2_ai_story_paragraphs_try2.json` - 500 AI-generated paragraphs (neutral style, story-based)
- `class3_ai_story_paragraphs_try2.json` - 500 AI-generated paragraphs (author-styled, story-based)

## Imports

All dependencies consolidated in one place.

In [None]:
import os
import json
import time
import uuid
import re
import hashlib
import random
from itertools import product
from google import genai


# The API key is read from the GOOGLE_API_KEY environment variable.
# Set it outside the notebook; never hard-code the key here.

## API Configuration

As in Try 1, generation uses the gemma-3-27b-it model, served through the Gemini API via Google's GenAI SDK. It has a free tier for experimentation and works well for fiction generation without needing paid API access.

**Temperature:** Using higher diversity settings (temp=1.0, top_p=0.99) to encourage natural variation in story flow.

In [None]:
client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
MODEL_NAME = "models/gemma-3-27b-it"

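def call_gemma_api(prompt, temperature=1.0):
    """Call the Gemma model with high-diversity sampling settings.

    Returns the stripped response text, or None if the call fails.
    There is no retry here; callers skip a failed call and try a new prompt.
    """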
    try:
        response = client.models.generate_content(
            model=MODEL_NAME,
            contents=prompt,
            config={
                "temperature": temperature,
                "top_p": 0.99,
                "top_k": 50,
                "max_output_tokens": 3500
            }
        )
        return response.text.strip()
    except Exception as e:
        print("API error:", e)
        return None
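
`call_gemma_api` returns `None` on any error, and the generation loops simply move on to a new prompt. For transient failures such as the 503 "model is overloaded" responses, a backoff wrapper is one option. The sketch below is an assumption, not part of this notebook or of the GenAI SDK: `call_with_retry`, `max_attempts`, and `base_delay` are hypothetical names.

```python
import time
from typing import Callable, Optional


def call_with_retry(
    fn: Callable[[], Optional[str]],
    max_attempts: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call fn until it returns a non-empty result, backing off exponentially.

    fn is expected to return None (or an empty string) on failure,
    which is the contract of call_gemma_api in this notebook.
    """
    for attempt in range(max_attempts):
        result = fn()
        if result:
            return result
        # Wait 2s, 4s, 8s, ... but do not sleep after the final attempt
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None
```

Usage would look like `story = call_with_retry(lambda: call_gemma_api(prompt))`. The `sleep` parameter exists only so the wrapper can be tested without real delays.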

## Thematic Anchors

These are the same 9 corpus-level topics extracted from the Austen and Gaskell novels in Try 1. They are not per-book topics but themes that span the entire literary corpus, and they provide narrative direction without over-constraining the generation.



In [3]:
topics = [
    "Courtship and Marriage",
    "Domestic Life and Family Obligation",
    "Social Class and Reputation",
    "Moral Judgment and Personal Character",
    "Gender Roles and Social Constraint",
    "Work, Industry, and Economic Struggle",
    "Community, Gossip, and Social Networks",
    "Individual Desire versus Social Expectation",
    "Change, Mobility, and Social Reform"
]


In [4]:
AUTHORS = ["austen", "gaskell"]

## Author Style Guides (Class 3 Only)

For Class 3, we provide the same stylistic guides as in Try 1 so the model can mimic Austen's or Gaskell's narrative voice.

In [4]:
AUTHOR_STYLE_GUIDES = {
    "austen": """
- Use syntactically complex but balanced sentences
- Employ indirect evaluation and mild irony
- Prefer formal, polite diction
- Embed moral judgment subtly within narration
- Avoid emotional excess or overt sentiment
""",
    "gaskell": """
- Use emotionally expressive but controlled language
- Emphasize social conditions and human impact
- Allow moral concern to be explicit
- Alternate between reflection and description
- Use grounded, socially aware narration
"""
}


## Segmentation Function

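After generating a long story, we split it into 100-200 word chunks. Each chunk size is drawn at random from that range, so paragraph lengths vary (unlike Try 1's uniform chunks). A leftover tail shorter than 100 words is discarded. Note that the cuts fall on word counts, not sentence boundaries, so a chunk can begin or end mid-sentence.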

In [None]:
def segment_story(text, min_words=100, max_words=200):
    words = text.split()
    paras = []
    i = 0
    while i < len(words):
        size = random.randint(min_words, max_words)
        chunk = words[i:i+size]
        if len(chunk) >= min_words:
            paras.append(" ".join(chunk))
        i += size
    return paras


def normalize_hash(text): 
    text = text.lower()    
    text = re.sub(r"[^\w\s]", "", text)
    text = re.sub(r"\s+", " ", text)
    return hashlib.md5(text.strip().encode()).hexdigest()
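
`segment_story` cuts purely on word counts, so chunks can start or end in the middle of a sentence. If that turns out to be a detectable artifact, a sentence-aligned variant is one alternative. The sketch below is an assumption and was not used to build the datasets in this notebook; `segment_story_by_sentence` and its naive regex splitter are hypothetical.

```python
import random
import re


def segment_story_by_sentence(text, min_words=100, max_words=200, seed=None):
    """Group whole sentences into chunks of roughly min_words..max_words words."""
    rng = random.Random(seed)
    # Naive sentence split: break after ., ! or ? followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    paras, current, count = [], [], 0
    target = rng.randint(min_words, max_words)
    for sentence in sentences:
        current.append(sentence)
        count += len(sentence.split())
        if count >= target:
            # Keep the chunk only if it stayed inside the allowed range
            if min_words <= count <= max_words:
                paras.append(" ".join(current))
            current, count = [], 0
            target = rng.randint(min_words, max_words)

    # Keep the tail only if it is long enough, mirroring segment_story
    if min_words <= count <= max_words:
        paras.append(" ".join(current))
    return paras
```

The trade-off is that a chunk which overshoots `max_words` because of one long sentence is dropped, so this variant may yield fewer paragraphs per story than the word-count version.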

## Class 2: Neutral AI Stories

Prompt structure:
- Randomly sample 3 topics from the 9 available
- Request 2,000-2,500 word continuous narrative
- No explicit structural constraints (unlike Try 1's lens + constraint system)
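- Variation ID included so that every prompt is unique, even when the same topics are sampled again (sampling at temperature 1.0 is already non-deterministic, so this is an extra nudge rather than the only source of diversity)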

In [None]:
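def build_class2_story_prompt(topic):
    # `topic` is the list of sampled topics; render it as a bullet list so the
    # prompt does not contain a raw Python list repr.
    vid = uuid.uuid4().hex[:8]
    topic_block = "\n".join(f"- {t}" for t in topic)
    return f"""
Topics:
{topic_block}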

Task:
Write a continuous fictional narrative of approximately 2,000-2,500 words.

Guidelines:
- Neutral voice
- Do NOT imitate any author
- Allow natural variation in sentence length and punctuation
- Avoid structured exposition or list-like explanations
- Let the topic emerge organically through events or reflection

Rules:
- No explicit summaries
- No references to real novels or characters
- No need to maintain strict structure

Variation ID: {vid}

Return only the story text.
""".strip(), vid


## Generate Class 2 Dataset

Strategy:
1. Generate long stories (not individual paragraphs)
2. Segment each story into 100-200 word chunks
3. Deduplicate using an MD5 hash of the normalized paragraph text (lowercased, punctuation stripped, whitespace collapsed)
4. Collect until we have 500 paragraphs

Each story produces multiple paragraphs, so we need fewer API calls than Try 1 (which generated 500 separate paragraphs).
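
The saving in API calls can be estimated with simple arithmetic. The sketch below assumes about 10 usable paragraphs per story, which is roughly what the run logs in this notebook show; the helper names are hypothetical.

```python
import math

# random.randint(100, 200) is uniform, so the average chunk is 150 words
MEAN_CHUNK_WORDS = (100 + 200) / 2


def stories_needed(target_paragraphs: int, paragraphs_per_story: float) -> int:
    """Number of successful story generations needed to reach the target."""
    return math.ceil(target_paragraphs / paragraphs_per_story)


def expected_paragraphs(story_words: int) -> float:
    """Approximate paragraphs from one story, ignoring the discarded tail chunk."""
    return story_words / MEAN_CHUNK_WORDS


print(stories_needed(500, 10))    # 50
print(expected_paragraphs(1500))  # 10.0
```

So about 50 successful calls replace the 500 needed in Try 1. Read the other way, 10 paragraphs per story corresponds to roughly 1,500 usable words per story.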

In [None]:
class2_story_paragraphs = []
seen = set()
story_count = 0

while len(class2_story_paragraphs) < 500:
    topic = random.sample(topics, k=3)
    prompt, vid = build_class2_story_prompt(topic)
    story = call_gemma_api(prompt)

    if not story:
        continue

    story_count += 1
    paras = segment_story(story)

    added = 0
    for p in paras:
        if len(class2_story_paragraphs) >= 500:
            break

        # Validate word count
        word_count = len(p.split())
        if word_count < 100 or word_count > 200:
            continue

        # MD5 hash for robust deduplication
        key = normalize_hash(p)
        if key in seen:
            continue

        seen.add(key)
        class2_story_paragraphs.append({
            "text": p,
            "label": "ai",
            "variant": "story",
            "topic": topic,
            "variation_id": vid
        })
        added += 1

    print(f"Story {story_count}: added {added} paragraphs (total: {len(class2_story_paragraphs)}/500)")
    time.sleep(1.0)

print(f"\nClass 2 story dataset complete: {len(class2_story_paragraphs)} paragraphs")

with open("class2_ai_story_paragraphs_try2.json", "w") as f:
    json.dump(class2_story_paragraphs, f, indent=2)

Story 1: added 10 paragraphs (total: 10/500)
Story 2: added 9 paragraphs (total: 19/500)
Story 3: added 9 paragraphs (total: 28/500)
Story 4: added 11 paragraphs (total: 39/500)
Story 5: added 11 paragraphs (total: 50/500)
Story 6: added 10 paragraphs (total: 60/500)
Story 7: added 9 paragraphs (total: 69/500)
Story 8: added 10 paragraphs (total: 79/500)
Story 9: added 9 paragraphs (total: 88/500)
Story 10: added 10 paragraphs (total: 98/500)
API error: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}
Story 11: added 10 paragraphs (total: 108/500)
Story 12: added 10 paragraphs (total: 118/500)
Story 13: added 11 paragraphs (total: 129/500)
Story 14: added 10 paragraphs (total: 139/500)
API error: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}
Story 15: added 10 paragraphs (total: 149/500)
Story 16: added 11 paragraphs (total

## Class 3: Author-Styled AI Stories

Similar to Class 2 but with:
- Author style guidance (Austen or Gaskell tendencies)
- Same story-level generation approach
- Topic subsets for narrative direction

In [None]:
def build_class3_story_prompt(topics, author):
    vid = uuid.uuid4().hex[:8]
    style_guide = AUTHOR_STYLE_GUIDES[author]

    topic_block = "\n".join(f"- {t}" for t in topics)

    return f"""
Write a continuous fictional narrative of approximately 2,000-2,500 words.

The story should naturally engage with the following themes:
{topic_block}

Write in a style inspired by {author.title()}, following these tendencies:
{style_guide}

Guidelines:
- Maintain a consistent narrative voice shaped by the author's style
- Allow natural variation in sentence length, rhythm, and punctuation
- Let themes emerge organically through character experience
- Accept minor stylistic unevenness, as in real long-form prose

Rules:
- Do NOT reference specific novels, characters, or places
- Do NOT quote or paraphrase existing text
- Avoid modern slang or anachronisms
- No explicit summaries or sectioning

Return only the story text.
""".strip(), vid


## Generate Class 3 Dataset

Same process as Class 2:
1. Generate styled stories with author guidance
2. Segment naturally into paragraphs
3. Deduplicate and collect 500 paragraphs

The continuous narrative approach should produce more human-like artifacts than Try 1's structured paragraph generation.

In [None]:
class3_story_paragraphs = []
seen = set()
story_count = 0

TARGET_PARAS = 500
AUTHORS = list(AUTHOR_STYLE_GUIDES.keys())

while len(class3_story_paragraphs) < TARGET_PARAS:
    author = random.choice(AUTHORS)
    topic_subset = random.sample(topics, k=3)

    prompt, vid = build_class3_story_prompt(
        topics=topic_subset,
        author=author
    )

    story = call_gemma_api(prompt, temperature=1.0)

    if not story:
        continue

    story_count += 1
    paras = segment_story(story)

    added = 0
    for p in paras:
        if len(class3_story_paragraphs) >= TARGET_PARAS:
            break

        # Validate word count
        word_count = len(p.split())
        if word_count < 100 or word_count > 200:
            continue

        # MD5 hash for robust deduplication
        key = normalize_hash(p)
        if key in seen:
            continue

        seen.add(key)
        class3_story_paragraphs.append({
            "text": p,
            "label": "ai",
            "variant": "story",
            "author_style": author,
            "topics": topic_subset,
            "variation_id": vid
        })
        added += 1

    print(f"Story {story_count} ({author}): added {added} paragraphs (total: {len(class3_story_paragraphs)}/{TARGET_PARAS})")

    # Checkpoint after every story so an interrupted run keeps its progress
    with open("class3_ai_story_paragraphs_try2.json", "w") as f:
        json.dump(class3_story_paragraphs, f, indent=2)

    time.sleep(1.0)

print(f"\nClass 3 story dataset complete: {len(class3_story_paragraphs)} paragraphs")

Story 1 (gaskell): added 10 paragraphs (total: 10/500)
Story 2 (gaskell): added 9 paragraphs (total: 19/500)
Story 3 (austen): added 11 paragraphs (total: 30/500)
Story 4 (gaskell): added 10 paragraphs (total: 40/500)
Story 5 (austen): added 11 paragraphs (total: 51/500)
API error: 503 UNAVAILABLE. {'error': {'code': 503, 'message': 'The model is overloaded. Please try again later.', 'status': 'UNAVAILABLE'}}
Story 6 (austen): added 10 paragraphs (total: 61/500)
Story 7 (austen): added 9 paragraphs (total: 70/500)
Story 8 (gaskell): added 9 paragraphs (total: 79/500)
Story 9 (gaskell): added 9 paragraphs (total: 88/500)
Story 10 (austen): added 9 paragraphs (total: 97/500)
Story 11 (austen): added 10 paragraphs (total: 107/500)
Story 12 (austen): added 9 paragraphs (total: 116/500)
Story 13 (austen): added 9 paragraphs (total: 125/500)
Story 14 (austen): added 9 paragraphs (total: 134/500)
Story 15 (gaskell): added 9 paragraphs (total: 143/500)
API error: 503 UNAVAILABLE. {'error': {'c

## Done

Generated two AI datasets using story-level generation:
- `class2_ai_story_paragraphs_try2.json`: 500 AI paragraphs (neutral)
- `class3_ai_story_paragraphs_try2.json`: 500 AI paragraphs (styled)

Class 1 remains the same human text from Try 1. The key difference is generation granularity: stories instead of paragraphs.
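
Before moving on, it can be worth sanity-checking the saved files. The sketch below is one way to do that, assuming the record layout written by the generation loops (`text` and `label` keys); `validate_records` is a hypothetical helper, not something defined earlier in this notebook.

```python
import json


def validate_records(records, expected_count=500, min_words=100, max_words=200):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if len(records) != expected_count:
        problems.append(f"expected {expected_count} records, found {len(records)}")

    seen = set()
    for i, rec in enumerate(records):
        n_words = len(rec["text"].split())
        if not min_words <= n_words <= max_words:
            problems.append(f"record {i}: {n_words} words is outside the allowed range")
        if rec["text"] in seen:
            problems.append(f"record {i}: duplicate text")
        seen.add(rec["text"])
        if rec.get("label") != "ai":
            problems.append(f"record {i}: unexpected label {rec.get('label')!r}")
    return problems


# Usage, assuming the output file exists in the working directory:
# with open("class2_ai_story_paragraphs_try2.json") as f:
#     print(validate_records(json.load(f)) or "all checks passed")
```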

### Dataset Summary

**Class 1 (Human):**
- Same as Try 1: 6 novels (3 Austen + 3 Gaskell)
- Processing: HTML cleaning, boilerplate removal, re-chunking
- Output: 4,000+ paragraphs, 100-200 words each

**Class 2 (AI-Neutral):**
- Source: Gemini API (gemma-3-27b-it)
- Strategy: Story-level generation (1.5-3k words) then segmentation
- Output: 500 paragraphs, 100-200 words each

**Class 3 (AI-Styled):**
- Source: Gemini API (gemma-3-27b-it)
- Strategy: Story-level with author guidance, then segmentation
- Output: 500 paragraphs, 100-200 words each

The continuous narrative approach should produce more human-like variation than Try 1's paragraph-level constraints.

### Why This Matters

Try 2 addresses a key limitation of Try 1: the paragraph-level generation with explicit structural constraints made AI text too easy to detect. By generating longer narratives and segmenting naturally:
- Sentence rhythm varies organically
- Punctuation patterns emerge naturally
- Stylistic drift occurs across paragraphs (like real writing)

In Task 2, we'll compare detectability across Try 1, Try 2, and Try 3 to see which generation strategy is most human-like.

### Next: Try 3 - Few-Shot Style Mimicry

**Limitation of Try 2:**
While story-level generation creates more natural flow than Try 1's isolated paragraphs, the style guidance is still abstract (e.g., "use complex sentences," "employ irony"). The model interprets these guidelines based on its training, which may not capture the **specific nuances** of Austen or Gaskell's actual prose.


**Try 3 Approach:**
Instead of abstract style guides, Try 3 will use **few-shot prompting with actual examples**:
1. **Extract 3-5 representative paragraphs** from each author's novels (Class 1)
2. **Show these as examples** in the prompt before generation
3. **Generate stories** that mimic the demonstrated style more precisely
4. **Extract paragraphs** as in Try 2
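
The four steps above could be wired into a prompt builder along these lines. This is a sketch under stated assumptions: `build_fewshot_story_prompt` is a hypothetical name, the prompt wording is a placeholder, and `examples` stands for a list of Class 1 paragraphs by the chosen author.

```python
import random


def build_fewshot_story_prompt(examples, topics, author, k=3, seed=None):
    """Hypothetical Try 3 prompt: show k real paragraphs, then request a story."""
    rng = random.Random(seed)
    # Sample a different subset each call so example sets rotate across stories
    shots = rng.sample(examples, k=min(k, len(examples)))
    example_block = "\n\n".join(
        f"Example {i}:\n{text}" for i, text in enumerate(shots, start=1)
    )
    topic_block = "\n".join(f"- {t}" for t in topics)

    return f"""
The paragraphs below illustrate the prose style of {author.title()}.

{example_block}

Write a continuous fictional narrative of approximately 2,000-2,500 words
in the same style. Do NOT copy sentences or phrases from the paragraphs above.

The story should naturally engage with the following themes:
{topic_block}

Return only the story text.
""".strip()
```

The returned prompt could then be passed to `call_gemma_api` and the story segmented exactly as in Try 2.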

**Why This Might Work Better:**
- Model sees **concrete examples** of Austen/Gaskell's actual writing
- Can capture **specific patterns**: sentence structure, word choice, punctuation habits
- Leverages the in-context learning that makes LLMs good at style transfer
- Should produce Class 3 text that's **harder to distinguish** from Class 1

**Risk/Trade-off:**
- **Direct phrase copying**: Model might paraphrase or repeat exact sentences from examples
- **Template rigidity**: Outputs might follow the same structural pattern as examples
- **Reduced diversity**: Strong anchoring might limit stylistic exploration beyond what examples demonstrate

**Mitigation:**
- Use examples from **different novels/chapters** to show style variety
- Post-process similarity checks to detect direct copying
- Use high temperature (1.0+) to encourage variation while maintaining style
- Rotate different example sets across generations
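
For the similarity check, one simple option is word n-gram overlap between a generated paragraph and the few-shot examples it was shown. The sketch below is an assumption about how this might be done; `copy_ratio`, the 5-gram window, and any flagging threshold are hypothetical choices that would need tuning.

```python
import re


def ngram_set(text, n=5):
    """Set of lowercase word n-grams, ignoring punctuation."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def copy_ratio(generated, sources, n=5):
    """Fraction of the generated text's n-grams that also occur in any source."""
    gen = ngram_set(generated, n)
    if not gen:
        return 0.0
    src = set()
    for source in sources:
        src |= ngram_set(source, n)
    return len(gen & src) / len(gen)


source = "the quick brown fox jumps over the lazy dog near the old mill"
print(copy_ratio("the quick brown fox jumps over a fence", [source]))  # 0.5
```

A paragraph with a high ratio would be flagged for manual review or dropped. Exact n-gram matching catches verbatim copying but not close paraphrase, so it is only a first filter.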

**Why topic leakage is NOT a concern:**
- All classes already share the same 9 topics by design
- Examples show HOW to write about those topics in the target style
- The goal is learning style patterns, not introducing new content areas

If Try 3 works, Class 3 should approach human-level stylistic authenticity. If it overfits, we'll see verbatim copying or mechanical template-following rather than genuine style transfer.