In [15]:
#@markdown # **Welcome to PaperStack!**

#@markdown ### **What is PaperStack?**
#@markdown PaperStack is a **multi-agent AI pipeline** for **automated document generation**.
#@markdown It demonstrates how specialized AI agents can work together to produce **structured, domain-specific writing**—whether for research papers, technical reports, or other formal documents.

#@markdown This notebook is an **experiment in automation**, not an attempt to generate human-quality academic work.
#@markdown Instead, it provides a **transparent and modular** way to observe and tweak how AI systems compose structured content step by step.
#@markdown Since all outputs are fully AI-generated, they are **not attributable to any person or entity**.

#@markdown ---

#@markdown # **How It Works**

#@markdown ### **1. User Input**

#@markdown Once the user inputs a topic, the AI pipeline progresses through a structured sequence of tasks to produce, refine, and format the document.

#@markdown ### **2. AI Pipeline Overview**
#@markdown PaperStack operates through a structured multi-agent process, ensuring systematic content generation.

#@markdown **Task-Specific System Prompts**
#@markdown - Each AI agent is assigned a specialized role, receiving structured prompts tailored to its function within the pipeline. These task-specific instructions ensure focused execution at each stage.

#@markdown **Strict Formatting Guidelines for Outputs (JSON)**
#@markdown - All output content adheres to standardized **JSON formatting**, maintaining consistency and ensuring compatibility for further processing.

#@markdown **Dual-Layer JSON Validation**
#@markdown - Generated JSON responses undergo **automated validation** to check for structural and syntactical correctness. If JSON parsing fails, the result is passed to an **LLM JSON fixing agent** to correct errors. This loop repeats until parsing succeeds (or the configured maximum number of attempts is reached); a successfully parsed result is passed to the next agent in the pipeline.

#@markdown **Drafting and Paragraph-Level Revisions**
#@markdown - The system first drafts the document, then performs **paragraph-by-paragraph revisions**, enhancing clarity, coherence, and logical flow.

#@markdown **Section Scanning and Coherence Revisions**
#@markdown - After paragraph-level refinements, each section is analyzed for **coherence** and **repetition**. A separate AI agent provides targeted revision instructions for each paragraph to ensure logical consistency and reduce redundancy across the document.

#@markdown **Attribution Scanning and Citation Insertion**
#@markdown - The system scans the document to identify statements requiring attribution. Where necessary, citations are inserted in a structured format to maintain proper referencing.

#@markdown **Abstract Generation**
#@markdown - A summarization step creates a **concise abstract**, distilling the core arguments and conclusions of the document.

#@markdown **Formatting for Printing**
#@markdown - The final document is structured for readability and prepared for output in a format suitable for review.
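
The staged flow described above can be sketched in a few lines. This is a minimal illustration, not the notebook's actual implementation: `stub_thesis_agent`, `stub_outline_agent`, and `run_pipeline` are hypothetical names, and the stubs return canned JSON instead of calling a model.

```python
import json

# Hypothetical stand-ins for LLM-backed agents: each takes the accumulated
# state and returns a JSON string, as a real model call would.
def stub_thesis_agent(state):
    return json.dumps({"thesis": f"A position on {state['topic']}"})

def stub_outline_agent(state):
    section = {"title": "Introduction", "summary": state["thesis"]}
    return json.dumps({"sections": [section]})

def run_pipeline(topic, stages):
    """Run each stage in order, parsing its JSON output into shared state."""
    state = {"topic": topic}
    for stage in stages:
        raw = stage(state)
        state.update(json.loads(raw))  # a parse failure would raise here
    return state

result = run_pipeline("free will", [stub_thesis_agent, stub_outline_agent])
print(result["sections"][0]["title"])
```

Because every stage communicates through parsed JSON, a later stage can rely on keys produced by an earlier one, which is why a parse failure has to be handled before the pipeline moves on.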

In [16]:
#@markdown ## **Future Plans**
#@markdown
#@markdown - Add more fields for user inputs, including:
#@markdown   - Additional context to guide the AI's understanding
#@markdown   - Additional fields of study
#@markdown   - Key points or arguments the user wants included
#@markdown   - User-prompted hypotheses, methodology, data, and results
#@markdown   - Preferred philosophical frameworks or schools of thought
#@markdown   - Specific philosophers or works to reference
#@markdown   - Citation style preferences
#@markdown   - Option to include particular counterarguments or opposing views, and how to address them
#@markdown   - Writing style preference (academic, accessible, narrative)
#@markdown
#@markdown - Refine and improve the attribution scanner
#@markdown - Link to a database of works like Google Scholar and field-specific OAI-PMH databases to validate works cited
#@markdown - Incorporate specialized fine-tuned models for editing, revision, in-text attribution, and citation
#@markdown - Transfer to a web UI for improved UX

In [35]:

#@markdown # **Getting Started**

#@markdown ### **Step 1: Get your API key.**

#@markdown PaperStack requires a **Together AI API key** to generate content using the **Llama-3.3-70B-Instruct-Turbo-Free** model.


#@markdown - Visit [**Together AI**](https://together.ai) and sign up or log in.
#@markdown - Navigate to **API Keys** in your account settings.
#@markdown - Click **Generate New Key**, then copy it.

#@markdown ### **Step 1 (continued): Store your API key in Google Colab.**
#@markdown - In the **left sidebar menu**, click the **key** icon.
#@markdown - Click **"Add a new secret"**.
#@markdown - In the **"Name"** field, enter:
#@markdown   `TogetherAPI`
#@markdown - In the **"Value"** field, paste your **API key**.
#@markdown - Toggle **"Notebook access"** to **ON**.
#@markdown - Press the ▶ in the upper left corner of this cell.

#@markdown This notebook ensures **API key security** using Google Colab's **Secrets** feature.
#@markdown - API keys are **never displayed in outputs** or stored in the notebook.
#@markdown - If the notebook is **copied, shared, or downloaded**, **API keys do not transfer**.
#@markdown - Users must re-enter API credentials each session for security.
#@markdown
#@markdown For more information on how the Secrets feature works in Colab, refer to:
#@markdown [How to Use Secrets in Google Colab](https://medium.com/@parthdasawant/how-to-use-secrets-in-google-colab-450c38e3ec75)

%%capture
!pip install together
!pip install pylatex
!apt-get install -y texlive texlive-latex-extra texlive-xetex
from pylatex import Document, Section, NoEscape, Command
from google.colab import userdata, files
from together import Together
from IPython.display import Markdown, display
import json
import re
import os
from tqdm import tqdm

KEY = userdata.get('TogetherAPI')
client = Together(api_key = KEY)

CACHE_FILE = "essay_cache.json"

def save_cache(data):
    """Saves the current state of essay generation to a JSON cache."""
    with open(CACHE_FILE, "w") as f:
        json.dump(data, f, indent=4)

def load_cache():
    """Loads cached essay generation progress."""
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "r") as f:
            return json.load(f)
    return {}

def clear_cache():
    """Clears the essay generation cache."""
    if os.path.exists(CACHE_FILE):
        os.remove(CACHE_FILE)
        print("Cache cleared.")
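
A cache like this is typically used to skip stages that already finished in an earlier run. The sketch below shows one possible checkpoint-and-resume pattern; `run_stage`, `DEMO_CACHE`, and the demo helpers are hypothetical names introduced for illustration, and the file lives in a temporary directory so it does not touch `essay_cache.json`.

```python
import json
import os
import tempfile

# Hypothetical cache path in a temp directory, so this sketch does not
# touch the notebook's real cache file.
DEMO_CACHE = os.path.join(tempfile.mkdtemp(), "demo_cache.json")

def load_demo_cache():
    """Return the cached dict, or an empty dict if no cache file exists."""
    if os.path.exists(DEMO_CACHE):
        with open(DEMO_CACHE, "r") as f:
            return json.load(f)
    return {}

def save_demo_cache(data):
    """Write the whole cache dict back to disk."""
    with open(DEMO_CACHE, "w") as f:
        json.dump(data, f, indent=4)

def run_stage(name, compute):
    """Return the cached result for `name`, computing and saving it if absent."""
    cache = load_demo_cache()
    if name not in cache:
        cache[name] = compute()
        save_demo_cache(cache)
    return cache[name]

calls = []

def expensive_thesis():
    calls.append("thesis")  # track how often the stage actually runs
    return {"thesis": "placeholder thesis"}

first = run_stage("thesis", expensive_thesis)
second = run_stage("thesis", expensive_thesis)  # served from the cache file
print(len(calls))
```

Keying the cache by stage name means an interrupted run can resume from the first stage whose key is missing, at the cost of having to clear the cache when the topic changes.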

In [2]:
#@markdown ## **Step 2: Set the Max JSON fixing attempts.**

#@markdown Part of what makes PaperStack's AI agent pipeline robust is the JSON fixing agent, which ensures that each agent's output can be broken down and interpreted correctly. If the parsing fails, the JSON fixing agent will try to fix it and put it back into the pipeline. This operation loops until the output is fixed, or the maximum number of attempts is reached. Parsing errors are rare, and fixing usually works on the first try. However, it's possible that an error could persist and stall the pipeline indefinitely. To prevent that, you can set the maximum number of attempts here.

MAX_JSON_ATTEMPTS = 5 #@param {"type":"integer", "placeholder":"Maximum number of times the JSON fixing agent should try"}

#@markdown After you set the maximum, press the ▶ in the upper left corner of this cell.
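
The bounded fixing loop described in this step can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own validation code: `parse_with_repair` and `trim_to_braces` are hypothetical names, and the fixer is a simple string operation standing in for the LLM-based JSON fixing agent.

```python
import json

def parse_with_repair(text, fix, max_attempts):
    """Parse `text` as JSON, calling `fix` on failure up to `max_attempts` times."""
    for attempt in range(max_attempts + 1):
        try:
            return json.loads(text)
        except json.JSONDecodeError:
            if attempt == max_attempts:
                break  # out of fixing attempts, give up below
            text = fix(text)  # in the notebook, an LLM agent plays this role
    raise ValueError(f"JSON still invalid after {max_attempts} fixing attempts")

def trim_to_braces(text):
    """Hypothetical fixer: keep only the span from the first { to the last }."""
    start, end = text.find("{"), text.rfind("}")
    return text[start:end + 1] if start != -1 and end > start else text

noisy = 'Here is the JSON you asked for: {"thesis": "example"} Hope it helps!'
parsed = parse_with_repair(noisy, trim_to_braces, max_attempts=5)
print(parsed["thesis"])
```

Raising an error once the limit is reached is one design choice among several; a pipeline could instead fall back to a default value or skip the affected paragraph.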

In [43]:
#@markdown ## **Step 3: Generate the Paper.**

#@markdown Type your topic in the field below. PaperStack performs best with specific and well-posed questions or detailed suggestions that are rich enough to be discussed at length. Vague, broad, or simple topics tend to produce lower-quality results with a high degree of repetition.

Topic = "The phenomenological implications of the control scheme of \"Getting Over It with Bennett Foddy\", i.e., how it subverts C. Thi Nguyen's aesthetic beauty of games through the lens of Merleau-Ponty's philosophy and the anatomy and physiology of human hands." #@param {"type":"string", "placeholder":"Type the topic or title"}

#@markdown Choose output format:
format_choice = "PDF"  #@param ["PDF", "plain text", "markdown", "LaTeX"]

#@markdown Choose output method (PDF can *only* be saved to file):

method_choice = "save to file" #@param ["display", "save to file"]

#@markdown Press the ▶ in the upper left corner of this cell to generate the paper. This could take 8-12 minutes, so please be patient.

def call_llm(prompt, system_prompt):
    """
    General function to call the LLM with a system prompt and user input.
    """
    response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": prompt}
        ],
        max_tokens=None,
        temperature=0.7,
        top_p=0.7,
        top_k=50,
        repetition_penalty=1,
        stop=["<|eot_id|>", "<|eom_id|>"],
        stream=False
    )
    return response.choices[0].message.content.strip()

def context_agent(topic):
    system_prompt = (
        "You are a philosophy researcher specializing in historical and conceptual analysis in the field of philosophy. Your task is to identify key philosophical works, "
        "thinkers, and concepts relevant to the given topic and summarize their relevance in relation to the relevant arguments and concepts. "
        "Your response must be formatted strictly as JSON and contain no extra text or explanations."
    )

    prompt = f"""
    Provide context for the following topic:
    {topic}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "philosophers": [
        {{
          "name": "string",
          "work": "string",
          "relevance": "string"
        }}
      ],
      "concepts": [
        {{
          "name": "string",
          "definition": "string",
          "relevance": "string"
        }}
      ]
    }}

    - Ensure at least two philosophers and two concepts are included.
    - Explanations must be concise yet specific, directly connecting each philosopher and concept to the topic.
    """

    return call_llm(prompt, system_prompt)

def thesis_agent(topic, context):
    system_prompt = (
        "You are a philosophy professor specializing in academic philosophical writing. Your task is to generate a strong, clear, and well-reasoned thesis statement on the given topic using the context provided. "
        "The thesis should be debatable, precise, and philosophically rigorous."
    )

    prompt = f"""
    Generate a thesis statement for the following topic:
    Topic: {topic}

    Context: {context}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "thesis": "string"
    }}

    Ensure the thesis is concise (one or two sentences) and presents a clear position that can be logically argued.
    """

    return call_llm(prompt, system_prompt)


def argument_agent(thesis):
    system_prompt = (
        "You are a formal logician. Your task is to construct a rigorous philosophical argument in support of the given thesis. "
        "Your response must follow formal logical principles, ensuring clear premises that lead to a reasoned conclusion. "
        "Additionally, you must present a counterargument and a refutation of that counterargument. "
        "Your output must conform to JSON structure with no additional text, comments, characters, or markdown, beginning with { and "
        "ending with }."
    )

    prompt = f"""
    Construct a structured argument based on the following thesis:
    {thesis}

    Your response must be formatted as a JSON object:

    {{
      "argument": {{
        "premises": [
          "string"
        ],
        "conclusion": "string"
      }},
      "counterargument": {{
        "premises": [
          "string"
        ],
        "conclusion": "string"
      }},
      "refutation": "string"
    }}

    Do not include any additional text before or after the JSON response. Do not include any markdown or other formatting. Only output the text of the JSON.
    """

    return call_llm(prompt, system_prompt)

def discussion_agent(argument, context):
    system_prompt = (
        "You are a philosophy seminar leader facilitating an advanced discussion on the given argument in a graduate-level philosophy course. "
        "Your task is to generate a set of meaningful philosophical questions and accompanying exposition that critically explore, expand, and challenge the argument. "
        "You and the other participants in the philosophy seminar have familiarized yourselves with the context."
    )

    prompt = f"""
    Discuss the following arguments in the appropriate context:

    Context:
    {context}

    Arguments:
    {argument}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "questions": [
        {{
          "id": "integer",
          "content": "string",
          "answer": "string",
          "category": "string"
        }}
      ]
    }}

    Each question should be categorized under one of the following:
    - "validity" (questions about logical structure and consistency)
    - "soundness" (questions about the truth of premises)
    - "alternative perspectives" (questions that explore different viewpoints)
    - "implications" (questions about consequences of accepting the argument)

    Keep questions open-ended and specific enough to guide meaningful discussion, ensuring clarity and coherence.
    """

    return call_llm(prompt, system_prompt)

def overview_agent(thesis, arguments, context, discussion):
    system_prompt = (
      """You are an academic writer specializing in structuring philosophical essays for maximum clarity, coherence, and intellectual depth. Your task is to generate a comprehensive and well-organized overview of the provided reference materials for a major academic work.

      The overview should not merely summarize each section but should reconstruct the content into a logically structured and compelling exposition of the central themes, arguments, and discussions.

      Your overview should:

      - Present the topic, thesis, and core arguments in a clear and logically progressive manner.
      - Synthesize key ideas, discussions, and counterarguments into a cohesive narrative.
      - Reorganize content if necessary to enhance clarity, argumentative strength, and thematic development.
      - Ensure fluid transitions between concepts and sections to guide the reader effectively.
      - Capture nuance, theoretical depth, and the broader implications of the discussion."""
    )

    prompt = f"""
    Generate an essay overview based on the following:

    Thesis:
    {thesis}

    Arguments:
    {arguments}

    Context:
    {context}

    Discussion:
    {discussion}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "overview": "string",
      "sections": [
        {{
          "title": "string",
          "summary": "string"
        }}
      ]
    }}

    - The "overview" field should contain a 2-3 sentence high-level summary of the essay.
    - The "sections" array should include each major section of the essay, with a brief summary of its purpose and content.
    - Ensure clarity, coherence, and proper structuring.
    """

    return call_llm(prompt, system_prompt)

def structure_agent(overview):
    system_prompt = (
        "You are an academic writing strategist in the field of philosophy leading a team of professional philosophers "
        "in the authorship of an academic paper. Your task is to create a structured outline for a philosophy essay "
        "based on the provided overview. The outline must be logically structured, ensuring coherence and flow. However, "
        "the structure need not follow the input structure. Rather, the structure should optimally present the thesis in a "
        "logical and cohesive progression of detailed academic exposition and discussion, for the "
        "purpose of drawing meaningful conclusions regarding the thesis. "
        "It should be formatted as a JSON object that can be parsed programmatically."
    )

    prompt = f"""
    Generate a structured outline for the following essay overview:

    {overview}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "title": "string",
      "sections": [
        {{
          "title": "string",
          "summary": "string",
          "paragraphs": [
            {{
              "id": "integer",
              "topic": "string",
              "details": "string"
            }}
          ]
        }}
      ]
    }}

    - The "title" field should contain the title of the essay.
    - Each "sections" object should include a section title and a brief summary of its purpose.
    - Each section should contain a "paragraphs" array with numbered paragraph entries, including a topic and a description of what it should cover.
    - Ensure that the outline provides a logical flow from introduction to conclusion.
    """

    return call_llm(prompt, system_prompt)

def split_paragraph(paragraph_text):
    """Splits a paragraph into sentences while avoiding splitting after common abbreviations."""
    if not paragraph_text.strip():
        return []  # Return empty list for empty input

    # Define common abbreviations that should not cause sentence splits
    abbreviations = {
        "Dr.": "Dr<abbr>",
        "Mr.": "Mr<abbr>",
        "Ms.": "Ms<abbr>",
        "Mrs.": "Mrs<abbr>",
        "Jr.": "Jr<abbr>",
        "Sr.": "Sr<abbr>",
        "vs.": "vs<abbr>",
        "etc.": "etc<abbr>",
        "e.g.": "e<abbr>g<abbr>",  # one placeholder per period so "e.g." is restored intact
        "i.e.": "i<abbr>e<abbr>"
    }

    # Step 1: Temporarily replace abbreviations
    for abbr, placeholder in abbreviations.items():
        paragraph_text = paragraph_text.replace(abbr, placeholder)

    # Step 2: Split sentences using a simple regex
    sentences = re.split(r'(?<=[.!?])\s+', paragraph_text.strip())

    # Step 3: Restore abbreviations
    sentences = [sentence.replace("<abbr>", ".") for sentence in sentences]

    # Return structured sentence objects
    return [{"id": i, "text": sentence.strip()} for i, sentence in enumerate(sentences) if sentence.strip()]

def drafting_agent(section_outline, overview):
    system_prompt = (
        "You are an academic writer. Your task is to write a well-structured and logically coherent paragraph "
        "based on the provided section outline, while ensuring it aligns with the overall structure and flow "
        "outlined in the essay overview. The paragraph should follow academic standards for clarity and precision. "
        "Ensure that the paragraph remains consistent with the topic and details provided in the outline."
    )

    prompt = f"""
    Generate a paragraph based on the following outline and essay overview:

    Overview:
    {overview}

    Section Outline:
    {section_outline}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "text": "string"
    }}

    - The paragraph should be clear, well-structured, and aligned with the topic and details provided.
    - Ensure logical flow and coherence within the paragraph.
    - Maintain consistency with the essay's overall structure as outlined in the overview.
    """

    return call_llm(prompt, system_prompt)

def gross_revision_agent(paragraph, thesis, argument, section_outline):
    system_prompt = (
        "You are a philosophy editor specializing in deepening academic writing. "
        "Your task is to revise the provided paragraph to enhance its philosophical depth, detail, exposition, and clarity. "
        "Strengthen the argument by expanding key points, improving explanations, and incorporating additional support where necessary. "
        "Ensure that every claim is well-articulated, logically developed, and contextualized within the broader discussion. "
        "Focus on increasing precision and depth without altering the intended meaning or introducing unrelated ideas. "
        "Clarify abstract or ambiguous statements, reinforce logical connections, and provide additional exposition where needed to make the argument more rigorous and comprehensive. "
        "Maintain an academic tone, ensuring that the paragraph aligns seamlessly with the section’s argument and the overarching thesis of the essay."
    )

    prompt = f"""
    Revise the following paragraph for deeper philosophical engagement, refining arguments, and adding necessary detail.

    Thesis:
    {thesis}

    Section Outline:
    {section_outline}

    Supporting Argument:
    {argument}

    Original Paragraph:
    {paragraph}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "text": "string"
    }}

    - Ensure the revised paragraph maintains logical coherence with the thesis and section argument.
    - Improve depth, precision, and clarity in philosophical reasoning.
    - Preserve the intended meaning while enhancing readability and engagement.
    """

    return call_llm(prompt, system_prompt)

def section_instruction_agent(section, thesis, argument):
    system_prompt = (
        "You are an academic writing editor. Your task is to analyze a section of a philosophy essay "
        "and generate paragraph-specific revision instructions. Identify redundancies, improve logical flow, "
        "and suggest wording adjustments. You may not restructure the section or recommend the addition or "
        "removal of full paragraphs. Do NOT rewrite the section—only provide structured revision guidance."
    )

    prompt = f"""
    Analyze the following section and generate structured revision instructions for each paragraph.

    Thesis:
    {thesis}

    Argument:
    {argument}

    Section Title: {section["title"]}

    Section Content:
    {" ".join(paragraph["text"] for paragraph in section["paragraphs"])}

    Your response must be formatted as a JSON object:
    {{
      "revision_instructions": [
        {{
          "id": "integer",
          "instructions": "string"
        }}
      ]
    }}

    - Identify redundant ideas across paragraphs.
    - Improve logical flow between paragraphs.
    - Suggest rewording for clarity and conciseness.
    - Do NOT rewrite the section, only provide structured revision guidance.
    """

    return call_llm(prompt, system_prompt)

def paragraph_revision_agent(paragraph, instructions, thesis, argument):
    system_prompt = (
        "You are an academic writing editor. Your task is to refine a single paragraph based on structured revision instructions. "
        "Ensure clarity, conciseness, and logical alignment. Apply the suggested revisions, but do NOT remove the paragraph."
    )

    prompt = f"""
    Refine the following paragraph based on the provided revision instructions.

    Thesis:
    {thesis}

    Argument:
    {argument}

    Revision Instructions:
    {instructions}

    Original Paragraph:
    {paragraph["text"]}

    Your response must be formatted as a JSON object:
    {{
      "text": "string"
    }}

    - Implement the suggested improvements while maintaining the paragraph's meaning.
    - Improve clarity, conciseness, and logical flow.
    - Do NOT remove the paragraph.
    """

    return call_llm(prompt, system_prompt)

def section_refinement_agent(section, thesis, argument):
    system_prompt = (
        "You are an academic writing editor specializing in philosophical essays. Your task is to revise the following section "
        "by removing redundant content, ensuring coherence between paragraphs, and maintaining logical flow. Eliminate unnecessary "
        "repetition and redundant language to improve the flow of the text while preserving detail, depth and rigor in argumentation. "
        "Add additional support or exposition that serves to conceptually connect and unify the section into a cohesive whole. "
        "This is the last step of the revision process, so your final product should be exemplary of excellent academic writing, "
        "rich philosophical inquiry, and clear and effective communication in the field of academic philosophy."
    )

    prompt = f"""
    Refine the following section of a philosophy essay to improve coherence and eliminate redundancy.

    Thesis:
    {thesis}

    Argument:
    {argument}

    Section Title: {section["title"]}

    Section Summary: {section["summary"]}

    Section Content:
    {section['paragraphs']}

    Your response must be formatted as a JSON object with the following structure:
    {{
      "title": "string",
      "summary": "string",
      "paragraphs": [
        {{
          "id": "integer",
          "text": "string"
        }}
      ]
    }}
    - Maintain logical flow and academic rigor.
    - Ensure each paragraph contributes uniquely to the argument.
    - Improve readability by reducing unnecessary repetition.
    - Keep the response strictly formatted as JSON.
    """

    return call_llm(prompt, system_prompt)

def citation_insertion_agent(paragraph):
    system_prompt = (
        "You are a reference manager specializing in academic philosophy. Your task is to insert parenthetical citations "
        "of the form (LastName) into the sentences of the given paragraph for the purpose of attributing non-original ideas "
        "to their owners. Inserting parenthetical references is the ONLY edit you are permitted to make to any sentence. "
        "For each sentence, if the source is mentioned in the sentence by name, you do not need to edit the sentence. "
        "The result of your work should be identical to the original sentence apart from the parenthetical citations you add."
    )

    prompt = f"""
    Insert appropriate citations into the following paragraph.

    Paragraph:
    {paragraph}

    Your response must be formatted as a JSON object with the following structure:

    {{
      "text": "string"
    }}

    - The "text" field is the full text of the paragraph with citations inserted where appropriate.
    - Use the format (LastName) for the appropriately attributable entity.
    - Do not include any additional information like works or year in the parenthetical citation. ONLY the last name of the identity receiving the attribution should appear in the parenthetical citation.
    """
    return call_llm(prompt, system_prompt)

def citation_extraction_agent(paragraph):
    system_prompt = (
        "You are a reference extraction assistant. Your task is to extract all parenthetical citations from the provided paragraph "
        "and return them in an annotated, aggregated JSON format. "
        "Your output must conform to JSON structure with no additional text, comments, characters, or markdown, beginning with { and "
        "ending with }. You may not deviate from the JSON structure or use alternate keys."
    )

    prompt = f"""
    Extract citations from the following paragraph:

    {paragraph}

    Your response must be formatted as a JSON object with the following structure:

    {{
      "works": [
        {{
          "identity": "string",
          "description": "string",
          "relevance": "string"
        }}
      ]
    }}

    - The "identity" field should be the name of the author or entity to whom attribution should be made.
    - The "description" field should be a brief sentence noting why the attribution was made to the author or entity.
    - Only extract unique citations (do not create duplicate entries for the same author or entity).
    - Do not add any extra text outside of the JSON object.
    """

    return call_llm(prompt, system_prompt)

def works_cited_agent(paragraph, thesis, abstract):
    system_prompt = (
        "You are an academic reference organizer. Your task is to process the parenthetical citations extracted from a paragraph. Based on "
        "the textual context and the thesis of the paper from which the paragraph is taken, you will create a works cited entry "
        "to give attribution to the author of concepts and ideas that influenced the paper from which the paragraph is taken. "
        "Your output must conform to JSON structure with no additional text, comments, characters, or markdown, beginning with { and "
        "ending with }. You may not deviate from the JSON structure or use alternate keys."
    )

    prompt = f"""

    Thesis:
    {thesis}

    Abstract:
    {abstract}

    Citation List:
    {paragraph}

    Your response must be formatted as a JSON object:

    {{
      "works": [
        {{
          "identity": "string",
          "description": "string",
          "relevance": "string"
        }}
      ]
    }}

    - The "identity" field should be the name of the author or entity to whom attribution is made.
    - The "description" field gives a brief overview of the author or entity and the totality of their work.
    - The "relevance" field gives a few sentences on how the author or entity is relevant to the paper.
    - Standardize the citation formatting for consistency.
    - Do not include any extra text outside the JSON object.
    """

    return call_llm(prompt, system_prompt)

def works_cited_aggregation_agent(works_cited):
    system_prompt = (
        "You are an academic reference organizer. Your task is to review the following authors and the accompanying notes, which "
        "were generated automatically based on parenthetical citations. As you can see, there are many repetitions and redundancies. "
        "Distill this list down to unique authors, and combine the description and relevance for each author to create a single "
        "attribution entry for each author. Your output must conform to JSON structure with no additional text, comments, characters, "
        "or markdown, beginning with { and ending with }. The JSON structure output should be flat. You may not deviate from the JSON structure or use alternate keys."
    )

    prompt = f"""

    Citation List:
    {works_cited}

    Your response must be formatted as a flat JSON object:

    {{
      {{
          "identity": "string",
          "description": "string",
          "relevance": "string"
        }},
    }}

    - The "identity" field should be the name of the author or entity to whom attribution is made, written as 'Last, First'.
    - The "description" field gives a brief overview of the author or entity and the totality of their work.
    - The "relevance" field gives a few sentences on how the author or entity is relevant to the paper.
    - Standardize the citation formatting for consistency.
    - Do not include any extra text outside the JSON object.
    """

    return call_llm(prompt, system_prompt)

def abstract_agent(essay_json):
    system_prompt = (
        "You are an academic summarizer specializing in philosophy. Your task is to write a concise, well-structured abstract "
        "that effectively summarizes the entire essay, including the thesis, key arguments, philosophical context, discussion points, and conclusion. "
        "Ensure the abstract is engaging, clear, and informative while adhering to academic standards."
    )

    # Step 1: Summarize each section separately to reduce input size
    section_summaries = []
    for section in essay_json["sections"]:
        section_prompt = f"""
        Summarize the following section of a philosophy essay:

        Section Title: {section['title']}

        Section Content:
        {section['summary']}

        Paragraphs:
        {" ".join(paragraph['text'] for paragraph in section['paragraphs'])}

        Your response must be formatted as a JSON object:
        {{
          "section_summary": "string"
        }}
        """
        summary_response = call_llm(section_prompt, system_prompt)
        section_summary = parse_json_with_validation(summary_response).get("section_summary", "")
        section_summaries.append({"title": section["title"], "summary": section_summary})

    # Step 2: Use section summaries for the final abstract generation
    abstract_prompt = f"""
    Below are the key components of a philosophy essay. Your task is to generate a concise abstract.

    Thesis:
    {essay_json["thesis"]}

    Section Summaries:
    """
    for section in section_summaries:
        abstract_prompt += f"Section: {section['title']}\nSummary: {section['summary']}\n\n"

    abstract_prompt += """
    Your response must be formatted as a JSON object with the following structure:
    {
      "abstract": "string"
    }

    - Ensure the abstract is concise (150-250 words), engaging, and informative.
    - Clearly summarize the thesis, major arguments, counterarguments, and philosophical significance.
    - Maintain logical coherence and clarity for an academic audience.
    """

    return call_llm(abstract_prompt, system_prompt)

def title_generation_agent(abstract):
    """
    Uses an LLM to generate a concise, informative title from an abstract.

    Parameters:
    - abstract (str): The abstract of the essay.

    Returns:
    - str: JSON-formatted response containing the title.
    """

    system_prompt = (
        "You are an academic writing assistant specialized in generating concise, informative titles. "
        "Your task is to create a clear, engaging, and academically appropriate title based on the provided abstract. "
        "The title should summarize the core theme of the abstract while being brief and compelling."
    )

    prompt = f"""

    Topic:
    {Topic}

    Abstract:
    {abstract}

    Generate a concise, informative title for this abstract. Ensure the title is:
    - No longer than 12 words.
    - Academically appropriate and engaging.
    - Clearly related to the main theme of the abstract.

    Your response must be formatted as a JSON object:
    {{
      "title": "string"
    }}
    """

    response = call_llm(prompt, system_prompt)
    return response  # The main function will handle JSON parsing

def json_fixing_agent(response_text):
    """
    Uses an LLM to clean and fix malformed JSON structures, ensuring proper formatting and structure.
    This function is designed to be called iteratively within a validation loop.
    """

    system_prompt = (
        "You are an expert JSON repair agent. You will be given a malformed JSON object. Your task is to clean the text input "
        "and correct any malformed JSON found in a given text response. The JSON may have any number of errors. You "
        "must find them and correct them. Your response must contain ONLY valid JSON written with no extra text, explanations, "
        "or markdown. Output only the text characters that represent the correct JSON. Your response should start with { and "
        "end with }."
    )

    prompt = f"""
    The following text contains malformed JSON that may have structural errors or extra text.
    Extract and correct the JSON, ensuring it is properly formatted and syntactically valid.

    Response to fix:
    {response_text}

    Return ONLY the corrected JSON.
    """

    fixed_response = call_llm(prompt, system_prompt)

    return fixed_response

class MaxJsonAttemptsExceededError(Exception):
    """Raised when JSON parsing fails after the maximum number of attempts."""
    def __init__(self, attempts, message="JSON parsing failed after maximum number of attempts. This is rare, so try again!"):
        self.attempts = attempts
        self.message = f"{message} ({attempts} attempts)"
        super().__init__(self.message)

def parse_json_with_validation(response_text):
    """
    Attempts to parse a JSON response by running it through the JSON Fixing Agent iteratively
    until it parses successfully or reaches a retry limit.
    """
    attempts = 0
    while attempts < MAX_JSON_ATTEMPTS:
        try:
            # Remove markdown-style code blocks if they exist
            response_text = re.sub(r'```json|```', '', response_text).strip()

            # Extract the outermost {...} span (greedy match across newlines)
            json_match = re.search(r'\{.*\}', response_text, re.DOTALL)
            if not json_match:
                raise json.JSONDecodeError("No valid JSON found", response_text, 0)

            return json.loads(json_match.group(0))  # Successfully parsed JSON
        except json.JSONDecodeError:
            print(f"JSON parsing failed. Running JSON Fixing Agent (attempt {attempts + 1})...")
            response_text = json_fixing_agent(response_text)
            attempts += 1

    # Propagate the failure so callers never receive None in place of parsed JSON
    raise MaxJsonAttemptsExceededError(MAX_JSON_ATTEMPTS)


def convert_json_to_plaintext(essay_json):
    """
    Converts the final structured JSON essay into a plain-text readable format.
    """
    plaintext = ""
    plaintext += f"{essay_json['title']}\n\n"
    plaintext += f"Abstract\n{essay_json['abstract']}\n\n"

    plaintext += "Body\n\n"
    for section in essay_json['sections']:
        plaintext += f"{section['title']}\n\n"
        for paragraph in section['paragraphs']:
            plaintext += f"{paragraph['text']}\n\n"

    plaintext += "Works Cited\n\n"
    for philosopher in essay_json['context']['philosophers']:
        plaintext += f"{philosopher['name']}. *{philosopher['work']}*. {philosopher['relevance']}.\n"
    plaintext += "\n"
    for concept in essay_json['context']['concepts']:
        plaintext += f"{concept['name']}. {concept['definition']}. {concept['relevance']}.\n"
    plaintext += "\n"

    return plaintext

def save_to_file(filename, content):
    """Helper function to save content to a file."""
    with open(filename, "w", encoding="utf-8") as file:
        file.write(content)
    print(f"File saved: {filename}")

def convert_json_to_markdown(essay_json):
    """Converts the essay JSON into Markdown format."""
    markdown_text = f"# {essay_json['title']}\n\n"
    markdown_text += f"**Abstract**\n\n{essay_json['abstract']}\n\n"

    for section in essay_json['sections']:
        markdown_text += f"## {section['title']}\n\n"
        for paragraph in section['paragraphs']:
            markdown_text += f"{paragraph['text']}\n\n"

    markdown_text += "### Works Cited\n\n"
    for entry in essay_json["works_cited"]:
        markdown_text += f"- **{entry['identity']}**: {entry['description']}\n  - {entry['relevance']}\n"

    return markdown_text

def convert_json_to_latex(essay_json):
    """Converts the essay JSON into LaTeX format."""
    latex_text = r"""\documentclass{article}
\title{""" + essay_json['title'] + r"""}
\author{}
\date{}
\begin{document}
\maketitle

\section*{Abstract}
""" + essay_json['abstract'] + r"""

"""

    for section in essay_json['sections']:
        latex_text += f"\\section{{{section['title']}}}\n"
        for paragraph in section['paragraphs']:
            latex_text += paragraph['text'] + "\n\n"

    latex_text += r"\section*{Works Cited}"

    for entry in essay_json["works_cited"]:
        latex_text += f"\n\\textbf{{{entry['identity']}}}: {entry['description']}\n\n"
        latex_text += f"\\textit{{{entry['relevance']}}}\n\n"

    latex_text += r"\end{document}"

    return latex_text

def compile_tex_to_pdf(tex_filename):
    """Compiles a .tex file to .pdf using pdflatex."""
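    # Fall back to the current directory when the path has no directory component
    output_dir = os.path.dirname(tex_filename) or "."
    # Pass arguments as a list so filenames with spaces are not split by a shell
    command = ["pdflatex", "-interaction=nonstopmode", f"-output-directory={output_dir}", tex_filename]
    try:
        subprocess.run(command, check=True)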
        print(f"PDF compiled: {tex_filename.replace('.tex', '.pdf')}")
    except subprocess.CalledProcessError as e:
        print(f"Error compiling LaTeX to PDF: {e}")

def output_essay(essay_json, format_choice, method_choice):
    """Handles different output formats and methods."""

    if format_choice == "plain text":
        text_output = convert_json_to_plaintext(essay_json)
        if method_choice == "display plain text":
            print(text_output)
        elif method_choice == "save txt to file":
            save_to_file("essay.txt", text_output)

    elif format_choice == "markdown":
        markdown_output = convert_json_to_markdown(essay_json)
        if method_choice == "display markdown":
            display(Markdown(markdown_output))
        elif method_choice == "save markdown to txt to file":
            save_to_file("essay.md", markdown_output)

    elif format_choice == "PDF":
        tex_output = convert_json_to_latex(essay_json)
        if method_choice == "save to tex":
            save_to_file("essay.tex", tex_output)
        elif method_choice == "save file":
            tex_filename = "essay.tex"
            save_to_file(tex_filename, tex_output)
            compile_tex_to_pdf(tex_filename)

def generate_philosophy_essay(topic):
    """
    Generates a full philosophy essay with caching support.
    If the process crashes, it resumes from the last completed step.
    """

    cache = load_cache()
    cache["topic"] = topic  # Always store topic in case we need to reset

    if "context" in cache:
        print("Using cached context...")
        context = cache["context"]
    else:
        print("Gathering context...")
        context_response = context_agent(topic)
        context = parse_json_with_validation(context_response)
        cache["context"] = context
        save_cache(cache)

    if "thesis" in cache:
        print("Using cached thesis...")
        thesis = cache["thesis"]
    else:
        print("Generating thesis...")
        thesis_response = thesis_agent(topic, context)
        thesis = parse_json_with_validation(thesis_response)["thesis"]
        cache["thesis"] = thesis
        save_cache(cache)

    if "arguments" in cache:
        print("Using cached arguments...")
        arguments = cache["arguments"]
    else:
        print("Generating arguments...")
        argument_response = argument_agent(thesis)
        arguments = parse_json_with_validation(argument_response)
        cache["arguments"] = arguments
        save_cache(cache)

    if "discussion" in cache:
        print("Using cached discussion...")
        discussion = cache["discussion"]
    else:
        print("Generating discussion...")
        discussion_response = discussion_agent(arguments, context)
        discussion = parse_json_with_validation(discussion_response)
        cache["discussion"] = discussion
        save_cache(cache)

    if "overview" in cache:
        print("Using cached overview...")
        overview = cache["overview"]
    else:
        print("Creating overview...")
        overview_response = overview_agent(thesis, arguments, context, discussion)
        overview = parse_json_with_validation(overview_response)
        cache["overview"] = overview
        save_cache(cache)

    if "structure" in cache:
        print("Using cached structure...")
        structure = cache["structure"]
    else:
        print("Generating essay structure...")
        structure_response = structure_agent(overview)
        structure = parse_json_with_validation(structure_response)
        cache["structure"] = structure
        save_cache(cache)

    paragraphs = [p for section in structure["sections"] for p in section["paragraphs"]]

    if "drafted_paragraphs" in cache:
        print("Using cached drafted paragraphs...")
        drafted_paragraphs = cache["drafted_paragraphs"]
    else:
        drafted_paragraphs = run_with_progress("Drafting...", paragraphs, drafting_agent, overview)
        cache["drafted_paragraphs"] = drafted_paragraphs
        save_cache(cache)

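    # Write the drafted paragraphs back into the structure, in document order.
    # Assigning by position is required: rebinding the loop variable changes nothing.
    index = 0
    for section in structure["sections"]:
        for i in range(len(section["paragraphs"])):
            section["paragraphs"][i] = drafted_paragraphs[index]
            index += 1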

    paragraphs = [p for section in structure["sections"] for p in section["paragraphs"]]

    if "revised_paragraphs" in cache:
        print("Using cached revised paragraphs...")
        revised_paragraphs = cache["revised_paragraphs"]
    else:
        revised_paragraphs = run_with_progress("Revising...", paragraphs, gross_revision_agent, thesis, arguments, structure)
        cache["revised_paragraphs"] = revised_paragraphs
        save_cache(cache)

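    # Write the revised paragraphs back into the structure, in document order.
    # Assigning by position is required: rebinding the loop variable changes nothing.
    index = 0
    for section in structure["sections"]:
        for i in range(len(section["paragraphs"])):
            section["paragraphs"][i] = revised_paragraphs[index]
            index += 1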

    paragraphs = [p for section in structure["sections"] for p in section["paragraphs"]]

    if "refined_sections" in cache:
        print("Using cached refined sections...")
        refined_sections = cache["refined_sections"]
    else:
        refined_sections = run_with_progress("Revising for coherence...", structure["sections"], section_refinement_agent, thesis, arguments)
        cache["refined_sections"] = refined_sections
        save_cache(cache)

    for i, refined_section in enumerate(refined_sections):
      structure["sections"][i] = refined_section

    paragraphs = [p for section in structure["sections"] for p in section["paragraphs"]]

    essay_json = {
        "thesis": thesis,
        "sections": structure["sections"],
    }

    print("Generating abstract...")
    if "abstract" in cache:
        print("Using cached abstract...")
        abstract = cache["abstract"]
    else:
        for section in structure["sections"]:
            section["paragraphs"] = [
                {"text": p["text"]} if (isinstance(p, dict) and "text" in p) else {"text": p} for p in section["paragraphs"]
            ]
        abstract_response = abstract_agent(essay_json)
        abstract = parse_json_with_validation(abstract_response)["abstract"]
        cache["abstract"] = abstract
        save_cache(cache)

    print("Generating title...")
    if "title" in cache:
        print("Using cached title...")
        title = cache["title"]
    else:
        title_response = title_generation_agent(abstract)
        title = parse_json_with_validation(title_response)["title"]
        essay_json["title"] = title
        cache["title"] = title
        save_cache(cache)

    if "cited_paragraphs" in cache:
        print("Using cached cited paragraphs...")
        cited_paragraphs = cache["cited_paragraphs"]
    else:
        cited_paragraphs = run_with_progress("Adding citations...", paragraphs, citation_insertion_agent)
        cache["cited_paragraphs"] = cited_paragraphs
        save_cache(cache)

    # Write the cited text back into the essay structure, in document order
    index = 0
    for section in structure["sections"]:
        for paragraph in section["paragraphs"]:
            paragraph["text"] = cited_paragraphs[index]["text"]
            index += 1

    paragraphs = [p for section in structure["sections"] for p in section["paragraphs"]]

    if "citations" in cache:
        print("Using cached citations list...")
        citations = cache["citations"]
    else:
        citations = run_with_progress("Collecting citations...", paragraphs, works_cited_agent, thesis, abstract)
        cache["citations"] = citations
        save_cache(cache)

    # Merge the per-paragraph citation results into one entry per cited identity
    print("Consolidating citations...")
    unique_citations = consolidate_citations(citations)
    cache["unique_citations"] = unique_citations
    save_cache(cache)

    print("Essay generation complete!")
    essay_json = {
        "title": title,
        "thesis": thesis,
        "context": context,
        "arguments": arguments,
        "discussion": discussion,
        "sections": structure["sections"],
        "abstract": abstract,
        "works_cited": unique_citations
    }
    cache["essay_json"] = essay_json
    save_cache(cache)

    return essay_json

def consolidate_citations(citations):
    """
    Merges per-paragraph citation results into one entry per cited identity.
    Each entry keeps the first description seen and collects every relevance note in a list.
    """
    unique_citations = {}

    for entry in citations:
        for work in entry["works"]:
            identity = work["identity"]
            if identity not in unique_citations:
                unique_citations[identity] = {
                    "identity": identity,
                    "description": work["description"],
                    "relevance": []
                }
            unique_citations[identity]["relevance"].append(work["relevance"])

    # Convert the dictionary back to a list
    return list(unique_citations.values())

def convert_json_to_latex(essay_json):
    """
    Converts the essay JSON into a properly formatted LaTeX document.

    Parameters:
    - essay_json (dict): The structured JSON object containing the essay content.

    Returns:
    - Document: A `pylatex` Document object ready for PDF compilation.
    """

    # Initialize LaTeX document with necessary packages and settings
    doc = Document()
    doc.preamble.append(NoEscape(r"\usepackage[margin=1in]{geometry}"))  # 1-inch margins
    doc.preamble.append(NoEscape(r"\usepackage{hyperref}"))  # Hyperlinks support
    doc.preamble.append(NoEscape(r"\setlength{\parindent}{0cm}"))  # Remove paragraph indentation

    # Add title, author, and date
    doc.preamble.append(Command('title', essay_json['title']))
    doc.preamble.append(Command('author', 'Generated by AI'))
    doc.preamble.append(Command('date', NoEscape(r'\today')))

    doc.append(NoEscape(r"\maketitle"))  # Generate the title page

    # Abstract Section
    doc.append(NoEscape(r"\section*{Abstract}"))
    doc.append(NoEscape(essay_json['abstract'] + r" \\" + "\n\n"))  # Ensure line break after abstract

    # Add sections and content
    for section in essay_json['sections']:
        with doc.create(Section(section['title'])):
            for paragraph in section['paragraphs']:
                doc.append(NoEscape(paragraph['text'] + r"\\[1em]"))

    # Works Cited Section
    doc.append(NoEscape(r"\section*{Works Cited}"))

    for entry in essay_json["works_cited"]:
        doc.append(NoEscape(f"\\textbf{{{entry['identity']}}}: {entry['description']} \\\\[1em]" + "\n\n"))
        #doc.append(NoEscape(f"\\textit{{{entry['relevance']}}} \\"))  # Italicized relevance + Line break

    return doc

def compile_tex_to_pdf(doc, output_filename="essay"):
    """
    Compiles a LaTeX document into a PDF and provides a download link in Google Colab.

    Parameters:
    - doc (Document): A `pylatex` Document object.
    - output_filename (str): The name of the output PDF file (without extension).

    Output:
    - Saves the PDF in the current working directory and provides a download link.
    """

    try:
        pdf_filename = f"{output_filename}.pdf"
        doc.generate_pdf(output_filename, compiler="pdflatex", clean_tex=False)

        # Provide a download link in Google Colab
        files.download(pdf_filename)
        print(f"PDF compiled successfully: {pdf_filename}")
    except Exception as e:
        print(f"Error compiling LaTeX to PDF: {e}")


def output_essay(essay_json, format_choice, method_choice):
    """
    Handles different output formats and methods.

    Parameters:
    - essay_json (dict): The structured JSON object containing the essay.
    - format_choice (str): Output format ("plain text", "markdown", "PDF").
    - method_choice (str): Output method depending on the format.

    Actions:
    - Displays or saves the essay in the chosen format.
    """
    if format_choice == "PDF" and method_choice == "display":
        print("PDF can only be saved to file. If you'd like a PDF, please change the method to \"save_to_file\" and try again.")
        return None

    if format_choice == "plain text":
        text_output = convert_json_to_plaintext(essay_json)
        if method_choice == "display plain text":
            print(text_output)
        elif method_choice == "save txt to file":
            save_to_file("essay.txt", text_output)

    elif format_choice == "markdown":
        markdown_output = convert_json_to_markdown(essay_json)
        if method_choice == "display markdown":
            display(Markdown(markdown_output))  # Display Markdown in Jupyter/Colab
        elif method_choice == "save markdown to txt to file":
            save_to_file("essay.md", markdown_output)

    elif format_choice == "PDF":
        tex_output = convert_json_to_latex(essay_json)
        compile_tex_to_pdf(tex_output, "essay")

def save_to_file(filename, content):
    """
    Saves content to a file.

    Parameters:
    - filename (str): The name of the file to save.
    - content (str): The content to write to the file.

    Output:
    - Saves the content as a file in the current working directory.
    """
    with open(filename, "w", encoding="utf-8") as f:
        f.write(content)
    print(f"File saved: {filename}")

def run_with_progress(task_name, items, agent_function, *args):
    """
    Runs an agent function with a dynamic progress bar.

    Parameters:
    - task_name (str): Name of the task (e.g., "Processing Sections", "Generating Titles").
    - items (list): The list of items to iterate over (e.g., sections, paragraphs).
    - agent_function (function): The function to run for each item.
    - *args: Additional arguments to pass to the agent function.

    Returns:
    - list: A list of results from running the agent function on each item.
    """

    results = []

    with tqdm(total=len(items), desc=task_name, bar_format="{l_bar}{bar} {n_fmt}/{total_fmt} items") as pbar:
        for item in items:
            results.append(parse_json_with_validation(agent_function(item, *args)))
            pbar.update(1)  # Update progress bar

    return results

cache = load_cache()
if "essay_json" in cache:
    print("Using cached essay...")
    essay_json = cache["essay_json"]
else:
    essay_json = generate_philosophy_essay(Topic)
print("")
print("Outputting essay...")
output_essay(essay_json, format_choice, method_choice)

Using cached essay...

Outputting essay...

PDF compiled successfully: essay.pdf
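
The validation loop in `parse_json_with_validation` can be tried in isolation, without calling an LLM. The following is a minimal sketch under stated assumptions: `extract_json_object`, `parse_with_repair`, `RepairFailed`, and `toy_fixer` are hypothetical names that are not part of this notebook, and `toy_fixer` is a stand-in for `json_fixing_agent` that only removes trailing commas.

```python
import json
import re
from typing import Callable


class RepairFailed(Exception):
    """Raised when no attempt yields parseable JSON (hypothetical name)."""


def extract_json_object(text: str) -> dict:
    """Strip Markdown code fences, then parse the outermost {...} span."""
    text = re.sub(r"`{3}(?:json)?", "", text).strip()
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise json.JSONDecodeError("No JSON object found", text, 0)
    return json.loads(match.group(0))


def parse_with_repair(text: str, fixer: Callable[[str], str], max_attempts: int = 3) -> dict:
    """Parse `text`; on failure, hand it to `fixer` and retry up to `max_attempts` times."""
    for _ in range(max_attempts):
        try:
            return extract_json_object(text)
        except json.JSONDecodeError:
            text = fixer(text)  # stand-in for the LLM-based repair call
    raise RepairFailed(f"JSON parsing failed after {max_attempts} attempts")


def toy_fixer(broken: str) -> str:
    """Demonstration only: drop trailing commas before a closing brace or bracket."""
    return re.sub(r",\s*([}\]])", r"\1", broken)


# A fenced response with a trailing comma, similar to what a model might return.
fence = "`" * 3
raw = "Here you go:\n" + fence + 'json\n{"title": "On Virtue",}\n' + fence
result = parse_with_repair(raw, toy_fixer)
print(result)
```

Passing the repair step in as a callable keeps the loop testable offline. In the notebook the same role is played by `json_fixing_agent`, which sends the text to the model.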

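
The pylatex-based `convert_json_to_latex` wraps generated text in `NoEscape`, so characters such as `&` or `%` in model output reach `pdflatex` unchanged and may stop compilation. One possible mitigation is sketched below. It assumes the generated text contains no intentional LaTeX markup, since that markup would be escaped too. `LATEX_SPECIALS` and `escape_latex_text` are hypothetical names, not part of this notebook.

```python
# Characters with special meaning in LaTeX, mapped to a printable replacement.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}


def escape_latex_text(text: str) -> str:
    """Escape LaTeX special characters in a single pass over the input.

    Working character by character avoids double escaping: the braces
    introduced by a replacement are never re-examined.
    """
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


print(escape_latex_text("Profit & loss rose 5% in Q3_2024"))
```

A function like this would be applied to `paragraph['text']`, the abstract, and the citation fields before they are wrapped in `NoEscape`.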

In [18]:
clear_cache()

Cache cleared.
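
The resume-from-cache behaviour used throughout `generate_philosophy_essay` follows a common pattern: check for a key, otherwise compute the step and persist it immediately. The sketch below illustrates that pattern with a JSON file. It is not the notebook's actual `load_cache`, `save_cache`, or `clear_cache` implementation, which is not shown here; the function names, the explicit `path` argument, and the file format are all assumptions.

```python
import json
import os
import tempfile
from typing import Callable


def load_step_cache(path: str) -> dict:
    """Return the cached steps, or an empty dict when no cache file exists."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def save_step_cache(path: str, cache: dict) -> None:
    """Persist the whole cache so a crash loses at most the step in progress."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f, ensure_ascii=False, indent=2)


def clear_step_cache(path: str) -> None:
    """Delete the cache file if it exists."""
    if os.path.exists(path):
        os.remove(path)


def cached_step(cache: dict, key: str, path: str, compute: Callable[[], object]):
    """Return cache[key] if present; otherwise compute, store, and persist it."""
    if key not in cache:
        cache[key] = compute()
        save_step_cache(path, cache)
    return cache[key]


calls = []


def expensive_thesis() -> str:
    """Stand-in for an agent call; records each invocation."""
    calls.append("thesis")
    return "Virtue is a habit."


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cache.json")
    cache = load_step_cache(path)
    first = cached_step(cache, "thesis", path, expensive_thesis)
    # Simulate a restart: reload from disk and request the same step again.
    cache = load_step_cache(path)
    second = cached_step(cache, "thesis", path, expensive_thesis)
    clear_step_cache(path)
    exists_after_clear = os.path.exists(path)

print(first == second, len(calls), exists_after_clear)
```

A helper such as `cached_step` would collapse each repeated `if "..." in cache: ... else: ...` block in the pipeline into a single call, at the cost of hiding the per-step progress messages.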
