 # Citation Assistant for Buddhist AI Paper



 This notebook drafts acknowledgment sentences and Chicago-style citations in response to Daniel's edit comments.

 It is meant to help you acknowledge relevant scholarly works without claiming deep familiarity with them.

 ## Import Libraries

In [25]:
# Standard library
import glob
import json
import os
import re
from pathlib import Path

# Third-party
import anthropic
import numpy as np
import openai
from dotenv import load_dotenv

# Load environment variables
load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY", "")

# Set API keys
openai.api_key = OPENAI_API_KEY
claude_client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY)


 ## Define Daniel's Edit Comments

In [26]:
# Daniel's edit comments with their context
DANIEL_EDITS = [
    {
        "id": "buddhist_epistemology",
        "comment": "For this particular point—pattern recognition in the absence of discrete facts—I think you definitely want to consult the edited volume Apoha: Buddhist Nominalism and Human Cognition. Dunne also has a nice summary in his article entitled something like 'Key Points in Dharmakīrti's Apoha Theory.'",
        "context": "In LLMs, meaning emerges from relationship patterns rather than discrete facts.",
        "section": "2. Beyond Databases: Considering Vector Spaces as Epistemological Frameworks",
        "files": ["Apoha-Buddhist Nominalism and Human Cognition", "Dunne_J_Key_Features_of_Dharmakirtis_Apoha_Theory"]
    },
    {
        "id": "dignaga_dharmakirti",
        "comment": "Dates for these, and citations. For citations, the most recent book on Digṅāga I'm aware of is Digṅāga's Investigation of the Percept (certainly relevant here). For Dharmakīrti I recommend Dunne or Tillemans in this case. Dan Arnold's twin books—one on Digṅāga, the other on Dharmakīrti—also come to mind.",
        "context": "specifically the epistemological tradition (pramāṇavāda) developed by Digṅāga and Dharmakīrti",
        "section": "2. Beyond Databases: Considering Vector Spaces as Epistemological Frameworks",
        "files": ["Dan Arnold - Brains, Buddhas, and Believing", "Dreyfus_RecognizingReality", "Dunne_Foundations"]
    },
    {
        "id": "embodiment",
        "comment": "Personally, I'd love to see a footnote here—maybe a reference to Diane's paper in this volume, or to Michael Radich, or even to earlier work like Thompson and Varela.",
        "context": "This exploration examines conventional embodiment requirements for awakened manifestation.",
        "section": "Introduction",
        "files": ["DDenis. AI.April.2025"]
    },
    {
        "id": "sensory_liberation",
        "comment": "Good. You might cite James Gentry's book on Power Objects here (or mention that you cite him below). I also recommend Peter Woods' recent RYI MA thesis which explores 'liberation by wearing, tasting, hearing, seeing' explicitly",
        "context": "Historical precedents—particularly Tibetan 'liberation through sensory encounters' practices—provide frameworks for considering how awakened presence might potentially manifest through various means.",
        "section": "Introduction",
        "files": ["Peter-Woods-MA-Thesis-2022"]
    },
    {
        "id": "buddha_nature",
        "comment": "in Mahāyāna this means the complete perfection of wisdom and means; I think you're speaking more about Vajrayāna/Mahāmudrā/Dzogchen, so it might be good to just be clear on that.",
        "context": "Just as tathagatagarbha represents the potential for awakening that manifests through specific and appropriate causes and conditions",
        "section": "3. Exploring AI as Potential Manifestation Vehicle",
        "files": ["Dreyfus_RecognizingReality"]
    },
    {
        "id": "mirror_concept",
        "comment": "Citation? Perhaps A Mirror is for Reflection (edited volume)?",
        "context": "Just as the Buddha taught that dharma functions as a mirror reflecting one's own mind (dharmādāsa)",
        "section": "5. Mathematical Integration and Cross-Disciplinary Knowledge Transfer",
        "files": ["Jake H. Davis, Owen Flanagan - A mirror is for reflection"]
    },
    {
        "id": "comparative_religion",
        "comment": "If you wanted to be unbelievably thorough here, you might reference that this is precisely Mircea Eliade's method in, e.g., Patterns in Comparative Religion. Eliade (culprit-weather example), but in terms of comparative religion/detextualizing them to make comparison. possibly cite...A Magic Still Dwells Patton, Kimberly.",
        "context": "reducing complex Buddhist philosophical concepts to vector representations necessarily involves significant simplification, potentially losing contextual and cultural dimensions",
        "section": "5. Mathematical Integration and Cross-Disciplinary Knowledge Transfer",
        "files": ["A Magic Still Dwells Comparative Religion in the Postmodern Age"]
    }
]


 ## Helper Functions

In [27]:
# Function to load all available text files
def load_all_texts(directory="."):
    all_texts = {}
    text_files = glob.glob(f"{directory}/*.txt")
    
    for file_path in text_files:
        file_name = Path(file_path).stem
        try:
            with open(file_path, "r", encoding="utf-8") as f:
                all_texts[file_name] = f.read()
            print(f"Loaded text from: {file_name}")
        except Exception as e:
            print(f"Error loading {file_path}: {e}")
    
    return all_texts

# Function to load all available embeddings
def load_all_embeddings(directory="."):
    all_embeddings = {}
    embedding_files = glob.glob(f"{directory}/*_embeddings.json")
    
    for file_path in embedding_files:
        file_name = Path(file_path).stem.replace("_embeddings", "")
        try:
            with open(file_path, "r", encoding="utf-8") as f:
                all_embeddings[file_name] = json.load(f)
            print(f"Loaded embeddings from: {file_name}")
        except Exception as e:
            print(f"Error loading {file_path}: {e}")
    
    return all_embeddings
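
The loader above does not say what an `*_embeddings.json` file contains. Judging only from the keys that `semantic_search` reads later (`embedding`, `chunk`, `chunk_index`), each file is presumably a JSON list of records shaped like the ones below. This is a minimal sketch under that assumption: the file name `Toy_Source_embeddings.json`, the chunk text, and the three-dimensional vectors are made up for illustration and are not taken from the real extraction script.

```python
import glob
import json
import tempfile
from pathlib import Path

# Hypothetical records; the key names are inferred from semantic_search,
# the values are invented toy data.
toy_records = [
    {"chunk_index": 0, "chunk": "First toy chunk.", "embedding": [0.1, 0.2, 0.3]},
    {"chunk_index": 1, "chunk": "Second toy chunk.", "embedding": [0.3, 0.1, 0.0]},
]

with tempfile.TemporaryDirectory() as tmp:
    # Write one embeddings file using the "<name>_embeddings.json" naming pattern
    path = Path(tmp) / "Toy_Source_embeddings.json"
    path.write_text(json.dumps(toy_records), encoding="utf-8")

    # Same discovery and naming logic as load_all_embeddings
    loaded = {}
    for file_path in glob.glob(f"{tmp}/*_embeddings.json"):
        name = Path(file_path).stem.replace("_embeddings", "")
        with open(file_path, "r", encoding="utf-8") as f:
            loaded[name] = json.load(f)

print(list(loaded))
print(loaded["Toy_Source"][0]["chunk"])
```

The dictionary key is the file stem with `_embeddings` removed, which is the name the `files` entries in `DANIEL_EDITS` are matched against.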


 ## Search Functions

In [28]:
# Function to search for relevant passages in text
def text_search(query, texts_dict, target_files=None):
    if target_files:
        # Filter to only the specified files
        texts_dict = {k: v for k, v in texts_dict.items() if any(target in k for target in target_files)}
    
    if not texts_dict:
        return []
    
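    # Simple keyword search, used as a fallback when semantic search returns nothing
    results = []
    
    # Tokenize the query once; an empty query cannot match anything
    query_words = query.lower().split()
    if not query_words:
        return []
    # Require at least half of the query words, and never fewer than one
    min_matches = max(1, len(query_words) // 2)
    
    for doc_name, text in texts_dict.items():
        # Split into paragraphs
        paragraphs = text.split('\n\n')
        
        for i, para in enumerate(paragraphs):
            # Count query words that occur (as substrings) in the paragraph
            para_lower = para.lower()
            word_matches = sum(1 for word in query_words if word in para_lower)
            
            if word_matches >= min_matches: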
                results.append({
                    "document": doc_name,
                    "paragraph": para,
                    "paragraph_index": i,
                    "relevance": word_matches / len(query_words)
                })
    
    # Sort by relevance
    results.sort(key=lambda x: x["relevance"], reverse=True)
    
    return results[:3]  # Return top 3 results

# Function to search for relevant passages using embeddings
def semantic_search(query, embeddings_dict, target_files=None, top_n=3):
    if target_files:
        # Filter to only the specified files
        embeddings_dict = {k: v for k, v in embeddings_dict.items() if any(target in k for target in target_files)}
    
    if not embeddings_dict:
        return []
    
    # Get embedding for the query
    try:
        response = openai.embeddings.create(
            model="text-embedding-3-small",
            input=query
        )
        query_embedding = response.data[0].embedding
        query_embedding_array = np.array(query_embedding)
        
        # Search across all documents
        all_results = []
        
        for doc_name, embeddings in embeddings_dict.items():
            for item in embeddings:
                embed_array = np.array(item["embedding"])
                # Cosine similarity; skip zero-length vectors to avoid dividing by zero
                denom = np.linalg.norm(query_embedding_array) * np.linalg.norm(embed_array)
                if denom == 0:
                    continue
                similarity = np.dot(query_embedding_array, embed_array) / denom
                
                all_results.append({
                    "document": doc_name,
                    "chunk": item["chunk"],
                    "similarity": float(similarity),
                    "chunk_index": item["chunk_index"]
                })
        
        # Sort by similarity
        all_results.sort(key=lambda x: x["similarity"], reverse=True)
        
        return all_results[:top_n]
    except Exception as e:
        print(f"Error in semantic search: {e}")
        return []
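
The ranking step in `semantic_search` is plain cosine similarity between the query vector and each chunk vector. A small worked sketch with two-dimensional toy vectors may make the ordering easier to see; the helper name `cosine_similarity` and the vectors are illustrative and do not come from the notebook's data.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector has zero length."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

# Toy query and chunk vectors (real embeddings have far more dimensions)
query = [1.0, 0.0]
chunks = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "diagonal": [1.0, 1.0],
}

# Highest similarity first, mirroring the sort in semantic_search
ranked = sorted(chunks, key=lambda k: cosine_similarity(query, chunks[k]), reverse=True)
for name in ranked:
    print(name, round(cosine_similarity(query, chunks[name]), 4))
```

Because the score depends only on direction, the vector `[2.0, 0.0]` scores the same as `[1.0, 0.0]` would; vector length plays no role in the ranking.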


 ## Citation Generator

In [29]:
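# Function to generate acknowledgment sentence and citation
def generate_acknowledgment(edit_info, passages, text_files):
    """Ask Claude for an acknowledgment sentence and bibliography entries.

    `passages` is the list returned by semantic_search or text_search.
    `text_files` is accepted as a parameter but is currently unused.
    """
    # Extract basic information about the source documents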
    source_details = []
    
    for passage in passages:
        doc_name = passage.get("document")
        if "chunk" in passage:
            content = passage.get("chunk", "")
        else:
            content = passage.get("paragraph", "")
        
        # Try to extract author information from the document name or content
        author_match = re.search(r'(\w+),?\s+(\w+\.?)', doc_name)
        author = author_match.group(1) if author_match else doc_name.split('-')[0].split(' ')[0]
        
        # Try to extract year from content
        year_match = re.search(r'\b(19|20)\d{2}\b', content[:500])
        year = year_match.group(0) if year_match else "n.d."
        
        # Try to extract title
        title = doc_name
        if " - " in doc_name:
            title = doc_name.split(" - ")[1]
        elif "_" in doc_name:
            title = doc_name.replace("_", " ")
        
        source_details.append({
            "document": doc_name,
            "author": author,
            "year": year,
            "title": title,
            "content_sample": content[:300]  # First 300 chars for context
        })
    
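    # Create prompt for Claude to generate acknowledgment and citation.
    # Only the document name and content sample are sent; the author, year,
    # and title guesses above are heuristic and are not included in the prompt.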
    source_info = ""
    for i, s in enumerate(source_details):
        source_info += f"Source {i+1}: {s['document']}\n"
        source_info += f"Content sample: {s['content_sample']}...\n\n"
    
    prompt = f"""
    I'm writing an academic paper on AI and Buddhist wisdom. My editor has suggested acknowledging certain works in the scholarly conversation without claiming deep familiarity with them.

    Here's the editor's comment: "{edit_info['comment']}"
    
    It refers to this sentence in my paper: "{edit_info['context']}"
    
    This appears in section: "{edit_info['section']}"

    I have found relevant passages from these sources:

    {source_info}

    Please help me with:
    
    1. A brief (1-2 sentence) acknowledgment that mentions these works exist in the scholarly conversation and their general relevance to my point
    2. Properly formatted Chicago-style in-text citations for these works
    3. Complete bibliography entries for each work

    The acknowledgment should:
    - Be appropriate for inserting near the editor's comment
    - Simply nod to the existence of these works rather than claiming deep engagement
    - Connect to my paper's point about {edit_info['id'].replace('_', ' ')}
    - Be concise (max 2 sentences)
    - Use academic but accessible language

    Format your response in two clear sections:
    1. ACKNOWLEDGMENT: (your 1-2 sentence acknowledgment with in-text citations)
    2. BIBLIOGRAPHY: (one bibliography entry per line)
    """
    
    # Use Claude to generate the acknowledgment and citations
    try:
        message = claude_client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=1000,
            temperature=0.2,
            system="You are a helpful academic writing assistant with expertise in Buddhist studies and Chicago citation format.",
            messages=[
                {"role": "user", "content": prompt}
            ]
        )
        
        response_text = message.content[0].text
        
        # Parse the response manually instead of relying on JSON
        acknowledgment = ""
        bibliography = []
        
        if "ACKNOWLEDGMENT:" in response_text:
            parts = response_text.split("BIBLIOGRAPHY:")
            if len(parts) > 0:
                ack_part = parts[0].split("ACKNOWLEDGMENT:")[1].strip()
                acknowledgment = ack_part
            
            if len(parts) > 1:
                bib_part = parts[1].strip()
                bibliography = [line.strip() for line in bib_part.split("\n") if line.strip()]
        
        return {
            "acknowledgment": acknowledgment,
            "bibliography": bibliography
        }
    except Exception as e:
        print(f"Error generating acknowledgment with Claude: {e}")
        # Fallback to a simple acknowledgment
        return {
            "acknowledgment": f"Scholars have addressed {edit_info['id'].replace('_', ' ')} in works such as {', '.join([s['document'] for s in source_details])}.",
            "bibliography": [f"{s['document']}." for s in source_details]
        }
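
The parser above splits the reply on the two labels. Because the prompt asks for numbered labels (`1. ACKNOWLEDGMENT:` and `2. BIBLIOGRAPHY:`), a reply that follows the numbering literally would leave a stray `2.` at the end of the acknowledgment text. The sketch below shows one slightly more defensive way to parse the same format. The function name `parse_sections` and the sample reply are hypothetical, and the sample is not real model output.

```python
import re

def parse_sections(response_text):
    """Split a reply into acknowledgment text and bibliography lines."""
    acknowledgment, bibliography = "", []
    head, sep, tail = response_text.partition("BIBLIOGRAPHY:")

    if "ACKNOWLEDGMENT:" in head:
        ack = head.split("ACKNOWLEDGMENT:", 1)[1]
        # Drop a trailing list number such as "2." that belongs to the next label
        ack = re.sub(r"\n\s*\d+\.\s*$", "", ack)
        acknowledgment = ack.strip()

    if sep:  # the BIBLIOGRAPHY label was present
        bibliography = [line.strip() for line in tail.splitlines() if line.strip()]

    return {"acknowledgment": acknowledgment, "bibliography": bibliography}

# Invented reply that follows the numbered format requested in the prompt
sample = """1. ACKNOWLEDGMENT: Prior work exists (Author 2000).

2. BIBLIOGRAPHY:
Author, A. 2000. Hypothetical Title. City: Press.
"""

parsed = parse_sections(sample)
print(parsed["acknowledgment"])
print(parsed["bibliography"])
```

A reply with neither label yields an empty acknowledgment and an empty bibliography list, so a caller can detect that case and decide whether to retry.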

 ## Main Process - Load Files and Process Daniel's Edits

In [30]:
# Load text files and embeddings
print("Loading text files and embeddings...")
text_files = load_all_texts()
embeddings = load_all_embeddings()

if not text_files and not embeddings:
    print("No text files or embeddings found. Please make sure the PDF extraction script has completed.")
else:
    print(f"\nFound {len(text_files)} text files and {len(embeddings)} embedding files.")


Loading text files and embeddings...
Loaded text from: A Magic Still Dwells Comparative Religion in the Postmodern Age
Loaded text from: Apoha-Buddhist Nominalism and Human Cognition
Loaded text from: Dan Arnold - Brains, Buddhas, and Believing - The Problem of Intentionality in Classical Buddhist and Cognitive-Scientific Philo
Loaded text from: Dreyfus_RecognizingReality
Loaded text from: Dunne_Foundations FDP smaller file
Loaded text from: Dunne_J_Key_Features_of_Dharmakirtis_Apoha_Theory
Loaded text from: Jake H. Davis, Owen Flanagan - A mirror is for reflection _ understanding Buddhist ethics-Oxford University Press (2017)
Loaded text from: Peter-Woods-MA-Thesis-2022
Loaded embeddings from: Apoha-Buddhist Nominalism and Human Cognition
Loaded embeddings from: Dan Arnold - Brains, Buddhas, and Believing - The Problem of Intentionality in Classical Buddhist and Cognitive-Scientific Philo
Loaded embeddings from: Dreyfus_RecognizingReality
Loaded embeddings from: Dunne_Foundations FDP 

 ## Process Each Edit Comment

In [31]:
results = {}

print("\nProcessing Daniel's edit comments...")
for edit in DANIEL_EDITS:
    print(f"\nProcessing: {edit['id']}")
    edit_id = edit['id']
    
    # Combine the context and comment for better search results
    search_query = f"{edit['context']} {edit['comment']}"
    
    # Try semantic search first if embeddings are available
    passages = []
    if embeddings:
        print("  Using semantic search with embeddings...")
        passages = semantic_search(search_query, embeddings, edit['files'])
    
    # Fall back to text search if no results or no embeddings
    if not passages and text_files:
        print("  Falling back to text search...")
        passages = text_search(search_query, text_files, edit['files'])
    
    if not passages:
        print(f"  No relevant passages found for {edit_id}")
        continue
    
    print(f"  Found {len(passages)} relevant passages")
    
    # Generate acknowledgment and citation
    acknowledgment_info = generate_acknowledgment(edit, passages, text_files)
    
    results[edit_id] = {
        "edit_comment": edit['comment'],
        "context": edit['context'],
        "section": edit['section'],
        "acknowledgment": acknowledgment_info.get("acknowledgment", ""),
        "bibliography": acknowledgment_info.get("bibliography", []),
        "passages": passages
    }



Processing Daniel's edit comments...

Processing: buddhist_epistemology
  Using semantic search with embeddings...
  Found 3 relevant passages

Processing: dignaga_dharmakirti
  Using semantic search with embeddings...
  Found 3 relevant passages

Processing: embodiment
  Using semantic search with embeddings...
  Falling back to text search...
  No relevant passages found for embodiment

Processing: sensory_liberation
  Using semantic search with embeddings...
  Found 3 relevant passages

Processing: buddha_nature
  Using semantic search with embeddings...
  Found 3 relevant passages

Processing: mirror_concept
  Using semantic search with embeddings...
  Found 3 relevant passages

Processing: comparative_religion
  Using semantic search with embeddings...
  Falling back to text search...
  No relevant passages found for comparative_religion
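
As far as the log shows, the two "No relevant passages found" results probably have different causes. For `embodiment`, the only target file (`DDenis. AI.April.2025`) does not appear among the loaded texts, so the file filter leaves nothing to search. For `comparative_religion`, the text is loaded, but `text_search` keeps a paragraph only when about half of the query words occur in it, and the query here is the full context sentence plus Daniel's comment. The sketch below illustrates both effects with toy data; `overlap_score`, the paragraph, and the filler words are invented for the example.

```python
def overlap_score(query, paragraph):
    """Fraction of query words that occur as substrings of the paragraph."""
    words = query.lower().split()
    if not words:
        return 0.0
    para_lower = paragraph.lower()
    return sum(1 for w in words if w in para_lower) / len(words)

# 1) File filter: a target that matches no loaded name leaves nothing to search
loaded_names = ["Dreyfus_RecognizingReality", "Peter-Woods-MA-Thesis-2022"]
targets = ["DDenis. AI.April.2025"]
matched = [n for n in loaded_names if any(t in n for t in targets)]
print(matched)

# 2) Query length: the same paragraph scores well for a short query
#    and poorly once many unrelated words are appended
paragraph = "Comparative religion compares patterns across traditions."
short_query = "comparative religion patterns"
long_query = short_query + " " + " ".join(f"filler{i}" for i in range(9))

print(overlap_score(short_query, paragraph))
print(overlap_score(long_query, paragraph))
```

If this reading is right, searching with a shorter query (for example, only the context sentence or a handful of key terms) would make the keyword fallback far more likely to return passages.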


 ## Generate Final Report

In [32]:
# Generate report
print("\nGenerating citation report...")
report_path = "acknowledgment_suggestions.md"

with open(report_path, "w", encoding="utf-8") as f:
    f.write("# Acknowledgment Suggestions for Buddhist AI Paper\n\n")
    f.write("This report provides suggested acknowledgment sentences and citations based on Daniel's edit comments.\n\n")
    
    for edit_id, info in results.items():
        f.write(f"## {edit_id.replace('_', ' ').title()}\n\n")
        f.write(f"**Daniel's Comment:** {info['edit_comment']}\n\n")
        f.write(f"**Context in Paper:** \"{info['context']}\"\n\n")
        f.write(f"**Section:** {info['section']}\n\n")
        
        f.write("### Suggested Acknowledgment\n\n")
        f.write(f"{info['acknowledgment']}\n\n")
        
        f.write("### Bibliography Entries\n\n")
        for entry in info['bibliography']:
            f.write(f"* {entry}\n")
        
        f.write("\n---\n\n")

print(f"Acknowledgment report generated: {report_path}")
print("\nYou can now review the suggestions and incorporate them into your paper.")



Generating citation report...
Acknowledgment report generated: acknowledgment_suggestions.md

You can now review the suggestions and incorporate them into your paper.


 ## Display Sample Results

In [33]:
# Display a sample result if available
if results:
    sample_key = list(results.keys())[0]
    print(f"Sample result for '{sample_key}':")
    print("\nSuggested acknowledgment:")
    print(results[sample_key]["acknowledgment"])
    print("\nBibliography entries:")
    for entry in results[sample_key]["bibliography"]:
        print(f"- {entry}")

Sample result for 'buddhist_epistemology':

Suggested acknowledgment:
The notion that meaning emerges from relational patterns rather than discrete facts has been explored in Buddhist epistemology, particularly in Dharmakīrti's apoha theory which examines meaning as exclusion (Siderits, Tillemans, and Chakrabarti 2011; Dunne 2011). While a full treatment is beyond the scope of this paper, these works provide valuable context for considering the epistemological implications of vector space models.

Bibliography entries:
- Dunne, John D. 2011. "Key Features of Dharmakīrti's Apoha Theory." In Apoha: Buddhist Nominalism and Human Cognition, edited by Mark Siderits, Tom Tillemans, and Arindam Chakrabarti, 84-108. New York: Columbia University Press.
- Siderits, Mark, Tom Tillemans, and Arindam Chakrabarti, eds. 2011. Apoha: Buddhist Nominalism and Human Cognition. New York: Columbia University Press.
