# Categories of intertextuality
(ordered by their potential edge weight in a document network)

### I. Explicit Interaction  
(Explicit intertextuality involves direct textual borrowing, transformation, or reference, often with clear markers or attribution.)

1. **Quotation**  
   (Corresponds to Fine-grained Quotation Detection and Attribution)  
   - Paper: "Fine-grained Quotation Detection and Attribution in German News Articles"  
     - Link: [https://aclanthology.org/2024.konvens-main.22.pdf](https://aclanthology.org/2024.konvens-main.22.pdf)  
     - Reference: Petersen-Frey, F., & Biemann, C. (2024). Fine-grained quotation detection and attribution in German news articles. In Proceedings of the 20th KONVENS Conference (pp. 191–203).  
     - Summary: This paper presents a method to automatically detect and attribute quotations in German news articles using a sequence-to-sequence transformer model with constrained decoding. The model predicts five types of quotations (Direct, Indirect, Reported, Free Indirect, and Indirect/Free Indirect) along with roles like Speaker, Cue, Addressee, and Frame. It significantly improves upon existing baselines, making it feasible for the first time to extract attributed quotations from German news articles with high accuracy.  
     - Literature Review Context: The paper builds on previous work in quotation detection and attribution, including approaches for English and German news texts. It contrasts with earlier methods like rule-based systems (e.g., Bögel & Gertz, 2015) and neural models (e.g., Papay & Padó, 2019), offering a more advanced, structured solution for handling complex, fine-grained quotation and role detection tasks. The work also draws on structured generation techniques, as explored by Zhang et al. (2023), for handling text spans in a sequence-to-sequence model.

2. **Translation**  
   (Translation involves the transformation of a text from one language to another, where the context, culture, and time-period may shift, thus changing its meaning and interpretation.)  
   - Paper: "Neural Machine Translation by Jointly Learning to Align and Translate"  
     - Link: [https://formacion.actuarios.org/wp-content/uploads/2024/05/1409.0473-Neural-Machine-Translation-By-Jointly-Learning-To-Align-And-Translate.pdf](https://formacion.actuarios.org/wp-content/uploads/2024/05/1409.0473-Neural-Machine-Translation-By-Jointly-Learning-To-Align-And-Translate.pdf)  
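     - Reference: Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv:1409.0473.  
     - Category source: Miola, R. S. (2004). Seven Types of Intertextuality. In Shakespeare, Italy, and Intertextuality.  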
     - Summary: This influential paper by Bahdanau, Cho, and Bengio (2014) introduces an approach to neural machine translation (NMT) that learns alignment and translation jointly. The model uses a bidirectional RNN encoder and an attention mechanism to improve translation performance, especially for longer sentences. This approach overcomes a limitation of earlier encoder-decoder models, which compress the whole source sentence into a single fixed-length vector.  
     - Literature Review Context: Translation can be viewed as a form of intertextuality, where the original text is absorbed and transformed into another language. This aligns with Kristeva's (1980) notion of intertextuality as the interaction and transformation of texts. In the context of NMT, the source and target texts are interconnected, and the attention mechanism enables a dynamic relationship between the two.

3. **Plagiarism**  
   (Plagiarism detection is based on the idea that no two authors should produce the same blocks of text, making unauthorized textual overlap a form of explicit intertextuality.)  
   - Paper: "Academic Plagiarism Detection: A Systematic Literature Review"  
     - Link: [https://dl.acm.org/doi/pdf/10.1145/3345317](https://dl.acm.org/doi/pdf/10.1145/3345317)  
     - Summary: This paper systematically reviews 239 research papers published between 2013 and 2018 on computational methods for academic plagiarism detection. It categorizes the methods into extrinsic and intrinsic approaches and presents a detailed analysis of various techniques such as n-gram matching, latent semantic analysis (LSA), explicit semantic analysis (ESA), and stylometry. The paper also emphasizes the role of non-textual content analysis (e.g., citation patterns, figures, and mathematical content) and machine learning in detecting more sophisticated forms of plagiarism. The authors propose a novel typology for plagiarism forms and detection methods, and they identify a research gap in the lack of thorough performance evaluations of plagiarism detection systems. The review highlights the potential of integrating heterogeneous analysis methods (textual and non-textual) using machine learning as a promising direction for future research.  
     - Mathematical Formalism:  
       - Techniques such as n-gram models (word and character-based), vector space models (VSM), and cosine similarity are commonly used for comparing texts.  
       - The Jaccard Similarity for comparing n-grams is defined as:
         $$
         \text{Jaccard Similarity} = \frac{|A \cap B|}{|A \cup B|}
         $$
         where $A$ and $B$ are sets of n-grams from the two texts.
       - The cosine similarity between two vectors $\mathbf{A}$ and $\mathbf{B}$ (such as tf-idf vectors) is given by:
         $$
         \text{Cosine Similarity} = \frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\| \|\mathbf{B}\|}
         $$
       - Latent Semantic Analysis (LSA) applies Singular Value Decomposition (SVD) to a term-document matrix $X$ and keeps only the largest singular values to obtain a lower-dimensional representation. The decomposition is:
         $$
         X = U \Sigma V^T
         $$
         where:
         - $U$ is the matrix of left singular vectors (terms),
         - $\Sigma$ is the diagonal matrix of singular values,
         - $V^T$ is the matrix of right singular vectors (documents).
       - Citation-based similarity compares citation patterns across documents. The similarity is typically calculated by:
         $$
         \text{Citation-based Similarity} = \frac{\text{Length of matched citation sequence}}{\text{Total citation length in document}}
         $$
       - Machine learning classifiers such as linear Support Vector Machines (SVMs) use the following decision function, where $x$ is the feature vector, $w$ is the weight vector, and $b$ is the bias term; the predicted class is given by the sign of $f(x)$:
         $$
         f(x) = w^T x + b
         $$
     - Method/Procedure:  
       - Data Collection: The review collected research papers using keyword-based searches on Google Scholar and Web of Science. The inclusion period was from 2013 to 2018, but seminal papers from earlier years were also considered. Papers focused on text-based plagiarism detection were prioritized.  
       - Analysis: The papers were categorized into three layers: plagiarism detection methods, plagiarism detection systems, and plagiarism policies. The main focus of the review is on detection methods, which were further divided into extrinsic (comparing a suspicious document to external sources) and intrinsic (analyzing stylistic anomalies within a single document) approaches.  
       - Plagiarism Detection Approaches:  
         - Extrinsic: Two-stage process involving candidate retrieval (using n-gram matching, search engines, etc.) followed by detailed analysis (e.g., text alignment, paraphrase detection).  
         - Intrinsic: Focuses on stylometry, measuring stylistic features such as word frequencies, sentence structure, and part-of-speech (PoS) patterns to detect writing style anomalies.  
         - Non-textual Analysis: Citation-based plagiarism detection (CbPD), as well as analysis of figures, tables, and mathematical expressions, is employed to detect obfuscated forms of plagiarism.  
     - Literature Review Context: Plagiarism detection is a crucial area of intertextuality detection, focusing on identifying inappropriate similarities between academic texts. The paper highlights that techniques like semantic text analysis, non-textual content features (such as citations, figures, and mathematical content), and machine learning have enhanced the ability to detect more complex forms of plagiarism. As the field evolves, there is a growing emphasis on addressing gaps in performance evaluation and the potential for integrating various detection methods to improve accuracy and robustness in detecting disguised plagiarism.
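
The n-gram Jaccard and cosine measures listed under Mathematical Formalism can be sketched in a few lines. This is only an illustration, not the reviewed systems' implementation: the helper names (`word_ngrams`, `jaccard_similarity`, `cosine_similarity`), the whitespace tokenizer, and the use of raw term counts instead of tf-idf weights are all assumptions made for brevity.

```python
import math
from collections import Counter


def word_ngrams(text, n=3):
    """Return the set of word n-grams of a text (lowercased, whitespace-tokenized)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard_similarity(a, b):
    """|A intersect B| / |A union B| for two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine_similarity(a, b):
    """Cosine between two sparse vectors stored as Counters (term -> weight)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy pair: one word is swapped in the suspicious text.
source = "the quick brown fox jumps over the lazy dog"
suspect = "the quick brown fox leaps over the lazy dog"

jac = jaccard_similarity(word_ngrams(source), word_ngrams(suspect))
cos = cosine_similarity(Counter(source.split()), Counter(suspect.split()))
print(f"Jaccard over word trigrams: {jac:.3f}")  # 0.400
print(f"Cosine over term counts:    {cos:.3f}")  # 0.909
```

A single substituted word breaks three of the seven trigrams, so the trigram Jaccard score drops much faster than the bag-of-words cosine. This is one reason the two measures are often used at different stages (candidate retrieval versus detailed comparison).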

4. **Allusion**  
   (Allusion depends on recognizing references to earlier works, often across different time periods or cultural contexts, thus involving a shift in meaning based on the audience's knowledge.)  
   - Paper: "Allusions in the Age of the Digital: four ways of looking at a corpus"  
     - Link: [https://hcommons.org/deposits/download/hc:26884/CONTENT/visualizing-absence-revised-v4-5-19.docx/](https://hcommons.org/deposits/download/hc:26884/CONTENT/visualizing-absence-revised-v4-5-19.docx/)  
     - Summary: The paper investigates the intertextuality within the Sidney family, particularly focusing on Mary Wroth’s allusive practices. By using both close and distant reading techniques, the author explores how computational tools can detect allusions, challenging traditional assumptions about intertextuality. The absence of expected textual connections between Wroth and her aunt Mary Sidney Herbert is a central puzzle addressed through various computational analyses.  
     - Literature Review Context: Allusion detection is a significant issue in both traditional and digital literary analysis. The paper aligns with the challenge of finding subtle intertextual markers, as noted by scholars like Forstall et al. (2015), who use computational tools to detect allusions through word-level n-gram matching. This research further extends the conversation by examining how both human and machine-reading approaches can complement each other in identifying intertextual gaps and patterns.

---

### II. Scholarly/Conceptual Relations  
(Scholarly and conceptual relations focus on how texts are connected through intellectual, thematic, or structural frameworks that extend beyond explicit textual references.)

5. **Sources**  
   (Sources represent explicit intellectual or documentary origins that provide the foundation for texts, often involving the identification of original information or ideas.)  
   - Paper: "Information source detection in the SIR model: A sample-path-based approach"  
     - Link: [https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6962907&casa_token=8450_CI5lAQAAAAA:HB0i4WXYeGXUETew1kiE7PkMopWQpwOW8JoqcNJJ9SGT89FxnKhStKRjWaXHnMJDrnKmEsOC](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6962907&casa_token=8450_CI5lAQAAAAA:HB0i4WXYeGXUETew1kiE7PkMopWQpwOW8JoqcNJJ9SGT89FxnKhStKRjWaXHnMJDrnKmEsOC)  
     - Category source: Miola, R. S. (2004). Seven Types of Intertextuality. In Shakespeare, Italy, and Intertextuality.  
     - Summary: This paper addresses the problem of detecting the information source in networks using the SIR (Susceptible-Infected-Recovered) model. Given a network snapshot where infected nodes are identified, but susceptible and recovered nodes are indistinguishable, the task is to find the information source. The authors propose a sample-path-based approach that selects the root node associated with the most likely sample path leading to the observed snapshot. In tree networks, the estimator is shown to be a node that minimizes the infection eccentricity, a concept akin to the Jordan center in graph theory.  
     - Literature Review Context: Sources are a central form of intertextuality in scholarly writing. Romanello (2016) discusses how citation practices create complex webs of intertextual connections that can be computationally modeled through citation networks.  
     - Method/Procedure:  
        1. Model Assumptions: The network is an undirected graph where each node can be in one of three states: susceptible (S), infected (I), or recovered (R). Initially, one node (the information source) is infected, and the infection spreads probabilistically to susceptible neighbors, while infected nodes recover with a certain probability.  
        2. Snapshot Observation: At a given time $t$, a snapshot of the network is taken. Infected nodes are known, but susceptible and recovered nodes cannot be distinguished.  
        3. Optimal Sample Path: The problem is formulated as finding the optimal sample path—the most likely sequence of infections and recoveries leading to the observed snapshot. The source is then inferred by selecting the node that minimizes the maximum distance (infection eccentricity) to infected nodes.  
        4. Reverse Infection Algorithm: The authors propose a reverse-infection algorithm where each infected node broadcasts its identity, and the node that first collects all identities declares itself the source. Ties between such nodes are broken by minimizing the sum of distances to the infected nodes.  
        5. Performance: On regular trees, the proposed estimator is shown to be within a constant distance from the actual source with high probability, independent of the number of infected nodes or the time the snapshot is taken. Simulations on real-world networks (e.g., Internet Autonomous Systems, Wikipedia Who-Votes-On-Whom) demonstrate that the reverse-infection algorithm outperforms other heuristics like the closeness centrality method.  
     - Mathematical Formalism:  
        - The infection process evolves according to a discrete-time Markov chain, where the state of node $i$ at time $t$ is denoted by $X_i(t)$.  
        - The maximum likelihood estimation (MLE) problem is to find the source node $\hat{s}$ that maximizes the probability of the observed snapshot $Y$, which sums over all sample paths consistent with $Y$:
          $$
          \hat{s} = \arg\max_{s} P(Y \mid s) = \arg\max_{s} \sum_{\text{paths consistent with } Y} P(\text{path} \mid s)
          $$
        - The infection eccentricity of a node $v$ is defined as the maximum distance from $v$ to any infected node:
          $$
          e(v) = \max_{u \in \text{infected nodes}} d(v, u)
          $$
        - The sample-path-based estimator replaces the sum with the single most likely sample path; on tree networks this leads to the node that minimizes the infection eccentricity.
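
The eccentricity-based estimator can be illustrated on a small graph. The sketch below is a centralized stand-in for the reverse-infection procedure, not the authors' code: the adjacency-dict representation, the function names, the toy tree, and the tie-break by sum of distances are assumptions for illustration.

```python
from collections import deque


def bfs_distances(graph, start):
    """Hop distances from start to every reachable node of an unweighted graph."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def infection_eccentricity(graph, node, infected):
    """e(v): the largest distance from node to any observed infected node."""
    dist = bfs_distances(graph, node)
    return max(dist.get(u, float("inf")) for u in infected)


def estimate_source(graph, infected):
    """Node with minimum infection eccentricity; ties go to the smaller distance sum."""
    def key(v):
        dist = bfs_distances(graph, v)
        d = [dist.get(u, float("inf")) for u in infected]
        return (max(d), sum(d))
    return min(graph, key=key)


# Hypothetical tree: a - b - c - d - e, with an extra leaf f attached to c.
graph = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"],
    "d": ["c", "e"], "e": ["d"], "f": ["c"],
}
infected = {"a", "e", "f"}  # observed infected nodes in the snapshot
print(estimate_source(graph, infected))  # c
```

Note that the estimated source `c` is not itself in the infected set. This is consistent with the snapshot model above, where the true source may already have recovered and is therefore indistinguishable from a susceptible node.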

6. **Citation-based Intertextuality**  
   (Citation-based intertextuality highlights how texts reference one another in an academic or scholarly context, forming networks of influence and intellectual exchange.)  
   - Paper: "The use of citation context to detect the evolution of research topics: a large-scale analysis"  
     - Link: [https://link.springer.com/article/10.1007/s11192-020-03858-y](https://link.springer.com/article/10.1007/s11192-020-03858-y)  
     - Summary: This paper presents a large-scale analysis of research topics' evolution using citation contexts from biomedical and life sciences publications. By analyzing the citing sentences (citation contexts) of 64,350 PubMed Central papers from 2008 to 2018, the study identifies trends across ten research areas. The paper also examines how these research topics evolve geographically and in major journals in the biomedical and life sciences.  
     - Method/Procedure: The methodology involves:
       1. Data Collection: 64,350 papers from PubMed Central (2008–2018) were selected.
       2. Citation Context Extraction: Citing sentences were extracted to capture the context in which references were made.
       3. Dynamic Topic Modeling: A dynamic topic model (DTM) was utilized to track the evolution of research topics over time. DTM extends traditional topic models like Latent Dirichlet Allocation (LDA) by incorporating temporal information, allowing the identification of changing patterns in topics.
       4. Geographical and Journal Analysis: The study also analyzed the evolution of topics across different countries and journals, highlighting regional and journal-specific research dynamics.
  
     - Mathematical Formalism: Dynamic topic modeling follows a Bayesian framework in which topics are distributions over words and documents are mixtures of topics. Within a single time slice, with the topic assignments $Z$ marginalized out, the log-likelihood of the observed words $W$ is:

       $$
       \log p(W \mid \theta, \beta) = \sum_{d=1}^{D} \sum_{n=1}^{N_d} \log \left( \sum_{k=1}^{K} \theta_{d,k} \beta_{k, w_{dn}} \right)
       $$

       where $D$ is the number of documents, $N_d$ is the number of words in document $d$, $K$ is the number of topics, $\theta_{d,k}$ is the proportion of topic $k$ in document $d$, and $\beta_{k, w}$ is the probability of word $w$ under topic $k$. The DTM links the topic-word distributions of consecutive time slices, which is what allows the model to capture how topics evolve over time.

     - Literature Review Context: Citation-based intertextuality, as explored in this paper, reflects how scientific ideas propagate and evolve through citation networks. By studying citation contexts, the authors reveal intertextual relationships that showcase patterns of scholarly influence. This approach aligns with previous work by Scheirer et al. (2014), who demonstrated that citation networks can effectively trace intertextual relationships in the scientific literature, highlighting patterns of influence and engagement over time.
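
As a rough illustration of the log-likelihood given under the formalism above, the snippet below evaluates it for one time slice on made-up numbers. The function name and the toy values of `theta`, `beta`, and `docs` are hypothetical; a real DTM would estimate these quantities rather than take them as given.

```python
import math


def log_likelihood(docs, theta, beta):
    """Sum over documents d and tokens w of log(sum_k theta[d][k] * beta[k][w])."""
    total = 0.0
    for d, words in enumerate(docs):
        for w in words:
            total += math.log(sum(theta[d][k] * beta[k][w] for k in range(len(beta))))
    return total


# Toy slice: 2 topics over a 3-word vocabulary, 2 documents given as word ids.
beta = [[0.7, 0.2, 0.1],   # topic 0: distribution over the vocabulary
        [0.1, 0.2, 0.7]]   # topic 1
theta = [[0.9, 0.1],       # document 0: mostly topic 0
         [0.2, 0.8]]       # document 1: mostly topic 1
docs = [[0, 0, 1], [2, 2, 1]]

ll = log_likelihood(docs, theta, beta)
print(f"log-likelihood of the slice: {ll:.4f}")
```

Evaluating the same function on each time slice with that slice's topic-word distributions gives one simple way to inspect how well the evolving topics fit the citation contexts of each year.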

7. **Conceptual Intertextuality**  
   (Conceptual intertextuality focuses on how ideas, themes, or concepts are shared across texts, even when direct textual borrowing is not present.)  
    - Paper: "Accurate and effective latent concept modeling for ad hoc information retrieval"  
      - Link: [https://www.cairn.info/load_pdf.php?ID_ARTICLE=DN_171_0061&download=1&from-feuilleteur=1](https://www.cairn.info/load_pdf.php?ID_ARTICLE=DN_171_0061&download=1&from-feuilleteur=1)  
      - Summary: This paper introduces an unsupervised method for latent concept modeling (LCM) in information retrieval (IR). The method, leveraging Latent Dirichlet Allocation (LDA), extracts latent concepts from pseudo-relevant documents to refine user queries. The goal is to address the common issue of under-specified keyword queries by recreating a conceptual representation of the user's information need. The method improves retrieval effectiveness through query expansion by incorporating these learned concepts, showing significant improvements in document retrieval tasks over large TREC collections (Robust04 and ClueWeb09-B).  
      - Method/Procedure:  
        1. Latent Dirichlet Allocation (LDA): The core of the method is based on LDA, where documents are modeled as mixtures of topics, and topics as mixtures of words. LDA is applied to pseudo-relevant feedback documents to extract topic-related concepts.
        2. Topic/Concept Estimation: The number of latent concepts $K$ is estimated using a heuristic that maximizes the Jensen-Shannon divergence between topic pairs. The optimal $K$ is given by:  
           $$\hat{K} = \arg\max_K \frac{1}{K(K-1)} \sum_{(k,k') \in T_K} D(k||k')$$  
            where $D(k||k')$ is the Jensen-Shannon divergence between topics $k$ and $k'$, $T_K$ is the set of $K$ topics, and the sum runs over ordered pairs of distinct topics in $T_K$.
        3. Feedback Document Selection: To ensure conceptual coherence, the method selects the optimal number of feedback documents $M$ by maximizing the similarity between different concept models generated from varying document samples:  
           $$M = \arg\max_{1 \leq m \leq 20} \sum_{1 \leq n \leq 20, n \neq m} \text{sim}(T_{\hat{K}}(m), T_{\hat{K}}(n))$$  
        4. Concept Weighting: The importance of each concept $k$ is calculated based on its occurrence in top-ranked documents. The score of a concept is given by:  
           $$\delta_k = \sum_{D \in R_Q} P(Q|D) P_{\text{TM}}(k|D)$$  
        5. Document Ranking: The final document ranking is determined by combining the initial query likelihood and the latent concepts as follows:  
           $$s(Q, D) = \lambda P(Q|D) + (1 - \lambda) \prod_{k \in T_{\hat{K}}} \prod_{w \in W_k} P(w|D)^{\hat{\phi}_{k,w} \delta_k}$$  
           where $P(Q|D)$ is the query likelihood, and $P(w|D)$ is the word likelihood in document $D$. $\lambda$ controls the trade-off between the query and the latent concepts.
      - Literature Review Context: Conceptual intertextuality, in the context of IR, refers to how concepts are shared and evolve across related documents. This paper utilizes LDA to model these conceptual links, similar to the intertextual relationships discussed in works like Scheirer et al. (2014), where LSI and LDA are pivotal in uncovering conceptual overlaps between documents.
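
Steps 2 and 4 of the procedure (choosing $K$ by average pairwise Jensen-Shannon divergence and weighting concepts by $\delta_k$) can be sketched as follows. The topic models, query likelihoods, and function names below are invented for illustration; in the paper the topic distributions come from LDA runs over the feedback documents, which this sketch does not perform.

```python
import math
from itertools import permutations


def kl(p, q):
    """Kullback-Leibler divergence, skipping terms where p is zero."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def mean_pairwise_divergence(topics):
    """Average divergence over ordered pairs of distinct topics: sum / (K(K-1))."""
    k = len(topics)
    total = sum(js_divergence(p, q) for p, q in permutations(topics, 2))
    return total / (k * (k - 1))


def choose_k(models):
    """models maps each candidate K to its list of K topic-word distributions."""
    return max(models, key=lambda k: mean_pairwise_divergence(models[k]))


def concept_weights(query_likelihoods, topic_given_doc):
    """delta_k = sum over feedback documents D of P(Q|D) * P_TM(k|D)."""
    n_topics = len(topic_given_doc[0])
    return [
        sum(pq * ptm[k] for pq, ptm in zip(query_likelihoods, topic_given_doc))
        for k in range(n_topics)
    ]


# Hypothetical topic models over a 4-word vocabulary for K = 2 and K = 3.
models = {
    2: [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]],
    3: [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5], [0.25, 0.25, 0.25, 0.25]],
}
best_k = choose_k(models)
deltas = concept_weights([0.6, 0.4], [[0.8, 0.2], [0.3, 0.7]])
print(best_k, [round(d, 2) for d in deltas])  # 2 [0.6, 0.4]
```

In the toy data the third topic of the $K = 3$ model overlaps with both others, which lowers the average divergence, so the heuristic prefers the two well-separated topics.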

8. **Semantic Intertextuality**  
   (Semantic intertextuality involves identifying deeper connections between texts based on meaning, even when lexical or surface-level similarities are absent.)  
    - Paper: "Learning Text Similarity with Siamese Recurrent Networks"  
      - Link: [https://aclanthology.org/D16-1054.pdf](https://aclanthology.org/D16-1054.pdf)  
      - Summary: This paper introduces a method for learning text similarity using Siamese Recurrent Neural Networks (RNNs). The model is trained in a supervised way to detect semantic similarity between sentences by encoding them into fixed-length vectors and measuring the distance between them. This approach is particularly effective for identifying semantically equivalent sentences in different contexts, despite lexical variation.  
      - Method/Procedure:  
        1. Model Architecture: The Siamese RNN architecture consists of two identical RNNs with shared weights, which encode two sentences into fixed-length vectors. The sentences are then compared using a distance metric (e.g., cosine similarity or Euclidean distance).
        2. Training: The model is trained on a dataset of sentence pairs labeled with their degree of similarity. The training objective is to minimize the distance between vectors for similar sentence pairs and maximize it for dissimilar pairs.
        3. Evaluation: The model's performance is typically evaluated using standard metrics like Pearson correlation or Mean Squared Error (MSE), which measure how well the predicted similarity scores align with human judgments of similarity.
        
        The model's loss function can be written as:
        $$
        \mathcal{L}(S_1, S_2, y) = \left( \text{similarity}(S_1, S_2) - y \right)^2
        $$
        where $S_1$ and $S_2$ are the sentence vectors, and $y$ is the ground-truth similarity score.
      - Literature Review Context: Semantic intertextuality, as demonstrated in this paper, can be captured by neural models that focus on meaning rather than surface-level lexical overlap. This aligns with Barbu and Trausan-Matu (2017), who explore how neural embeddings and semantic similarity metrics can uncover deeper intertextual connections based on shared meanings.
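
To make the shared-weight idea concrete, here is a deliberately tiny forward pass: one Elman-style recurrent encoder is applied to both sentences, and the squared-error loss from the formula above is computed on their cosine similarity. Everything in it (the weights, the token ids, the function names, and the plain tanh recurrence) is a toy assumption rather than the architecture of the paper, and no training loop is shown.

```python
import math


def rnn_encode(token_ids, embed, w_in, w_rec):
    """Run a small Elman RNN over the tokens and return the final hidden state."""
    hidden = [0.0] * len(w_rec)
    for t in token_ids:
        x = embed[t]
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][j] * hidden[j] for j in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    return hidden


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def siamese_loss(sent_a, sent_b, y, params):
    """(similarity(S1, S2) - y)^2, with both branches using the same params."""
    s1 = rnn_encode(sent_a, *params)
    s2 = rnn_encode(sent_b, *params)
    return (cosine(s1, s2) - y) ** 2


# Hypothetical parameters: 3-token vocabulary, 2-d embeddings, 2 hidden units.
embed = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_in = [[0.5, -0.3], [0.2, 0.8]]
w_rec = [[0.1, 0.0], [0.0, 0.1]]
params = (embed, w_in, w_rec)

print(siamese_loss([0, 1, 2], [0, 1, 2], 1.0, params))  # identical inputs, loss near 0
print(siamese_loss([0, 1, 2], [2, 1, 0], 1.0, params))
```

Because the two branches share `params`, identical sentences always map to the same vector, which is the property that lets the distance between encodings be read as a similarity score.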

---

### III. Structural/Textual Patterning  
(Structural and textual patterning refers to how texts are organized, revised, or rephrased, reflecting intertextuality through structural or stylistic similarities.)

9. **Paraphrastic Intertextuality**  
   (Paraphrastic intertextuality involves rephrasing or restating content while maintaining semantic similarity, making it a subtle form of structural intertextuality.)  
   - Paper: "Automatic Machine Translation Evaluation in Many Languages via Zero-Shot Paraphrasing"  
     - Link: [https://arxiv.org/pdf/2004.14564](https://arxiv.org/pdf/2004.14564)  
     - Summary: This paper frames machine translation evaluation as a task of scoring machine translation (MT) outputs using a sequence-to-sequence paraphraser, conditioned on a human reference. By treating paraphrasing as a zero-shot translation task (e.g., Czech to Czech), the authors propose a multilingual NMT system that evaluates paraphrastic intertextuality across 39 languages. The model, named Prism, significantly outperforms BLEU and other metrics by focusing on semantic similarity and fluency rather than direct lexical overlap.  
     - Literature Review Context: Paraphrastic intertextuality, or the identification of semantic equivalence through rephrasings, is difficult to detect using traditional lexical-based metrics like BLEU. As highlighted by Ghiban and Trausan-Matu (2013), computational methods such as word embeddings and neural machine translation models are increasingly being used to capture the nuanced relationships of intertextuality, where the focus is on meaning rather than word overlap. The Prism model introduces a novel approach by using multilingual NMT to evaluate MT outputs based on probabilistic paraphrasing, which better aligns with human judgments of translation quality.

10. **Revision**  
   (Revision intertextuality refers to how texts are reworked or iterated upon, often in response to feedback or critique, such as in the peer-review process.)  
   - Paper: "Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review"  
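     - Link: [https://direct.mit.edu/coli/article/48/4/949/112555/Revise-and-Resubmit-An-Intertextual-Model-of-Text](https://direct.mit.edu/coli/article/48/4/949/112555/Revise-and-Resubmit-An-Intertextual-Model-of-Text)  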
     - Category source: Miola, R. S. (2004). Seven Types of Intertextuality. In Shakespeare, Italy, and Intertextuality.  
     - Summary: This paper introduces a model of intertextual collaboration in the peer review process. It focuses on three main phenomena: pragmatic tagging, linking, and version alignment. These phenomena are essential for understanding the iterative cycle of review, revision, and resubmission. The authors propose an intertextual graph-based data model that reflects the structure of documents and the relations between them.  
     - Method/Procedure:  
       - Pragmatic Tagging: Classifies statements in the review text based on communicative purpose (e.g., strengths, weaknesses, requests).  
       - Linking: Identifies fine-grained connections between the review text and the manuscript (explicit and implicit links).  
       - Version Alignment: Aligns different versions of the manuscript to trace changes and revisions. The authors formalize the alignment task as a constrained optimization problem using integer linear programming (ILP).  
     - Mathematical Formalism:  
       The version alignment problem is framed as a node alignment between two graphs $G_t$ and $G_{t+\Delta}$ representing different versions of the manuscript. The objective is to maximize the similarity score between aligned nodes while enforcing one-to-one alignment constraints. The optimization problem can be written as:

       $$ \max \sum_{i,j} x_{i,j} \cdot \text{score}(n^t_i, n^{t+\Delta}_j) $$

       subject to:

       $$ \sum_j x_{i,j} \leq 1 \;\; \forall i \quad \text{and} \quad \sum_i x_{i,j} \leq 1 \;\; \forall j, \qquad x_{i,j} \in \{0, 1\} $$

       where $x_{i,j}$ is a binary variable indicating if node $i$ in $G_t$ is aligned with node $j$ in $G_{t+\Delta}$. The score function $\text{score}(n^t_i, n^{t+\Delta}_j)$ is computed based on the similarity of the text content between the nodes, and additional constraints ensure that nodes with different types (e.g., paragraphs vs. sections) are not aligned.
     - Literature Review Context: Revisions are seen as a form of intertextual dialogue, where the updated manuscript responds to feedback. This aligns with Bakhtin’s (1981) idea of dialogism, where texts are always in conversation with others, and revisions represent an explicit form of this dialogue.
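
The alignment objective above can be tried out on a handful of nodes without an ILP solver. The sketch below swaps in an exhaustive search, which is only feasible for very small inputs, and uses `difflib.SequenceMatcher` as a stand-in score function. The `(type, text)` node representation, the `min_score` threshold, and the example manuscript snippets are assumptions for illustration.

```python
from difflib import SequenceMatcher
from itertools import permutations


def score(old_node, new_node):
    """Text similarity in [0, 1]; nodes of different types score 0.0."""
    (old_kind, old_text), (new_kind, new_text) = old_node, new_node
    if old_kind != new_kind:
        return 0.0
    return SequenceMatcher(None, old_text, new_text).ratio()


def align(old_nodes, new_nodes, min_score=0.5):
    """Return {old_index: new_index} maximizing the summed score of one-to-one pairs."""
    scores = [[score(o, n) for n in new_nodes] for o in old_nodes]
    # Padding with None lets any old node stay unaligned (the "<= 1" constraints).
    targets = list(range(len(new_nodes))) + [None] * len(old_nodes)
    best_total, best = -1.0, {}
    for perm in permutations(targets, len(old_nodes)):
        # Pairs below the threshold (including type mismatches) are dropped.
        pairs = {i: j for i, j in enumerate(perm)
                 if j is not None and scores[i][j] >= min_score}
        total = sum(scores[i][j] for i, j in pairs.items())
        if total > best_total:
            best_total, best = total, pairs
    return best


# Hypothetical manuscript versions: one paragraph inserted, two lightly edited.
old = [
    ("section", "Introduction"),
    ("paragraph", "We propose a new model for peer review."),
    ("paragraph", "Results are shown in Table 2."),
]
new = [
    ("section", "Introduction"),
    ("paragraph", "Related work is discussed here first."),
    ("paragraph", "We propose a novel model for peer review."),
    ("paragraph", "Results are shown in Table 3."),
]
alignment = align(old, new)
print(alignment)  # {0: 0, 1: 2, 2: 3}
```

The search grows factorially with the number of nodes, so for real manuscripts a dedicated solver (as in the ILP formulation described above) is the practical choice.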

11. **Parody**  
   (Parody intertextuality involves imitating or mimicking a text, often with a humorous or satirical intent, creating a structural and thematic dialogue between the original and the parody.)  
   - Paper: "Parody Detection: An Annotation, Feature Construction, and Classification Approach to the Web of Parody"  
     - Link: [https://link.springer.com/chapter/10.1007/978-3-319-54499-1_3](https://link.springer.com/chapter/10.1007/978-3-319-54499-1_3)  
     - Summary: This paper focuses on detecting parody in user-generated content, particularly on platforms like YouTube. The authors aim to establish the "Web of Parody," which refers to the relationships between original content and its parodic adaptations. The study employs machine learning techniques to classify content based on both textual and quantitative features. Specifically, the focus is on descriptive text, comments, and quantitative features (e.g., song structure, lyrics) to identify parodic relationships.  
     - Method/Procedure:  
       1. Data Collection: The authors collected data from YouTube, focusing on music video parodies. This data includes titles, lyrics, descriptive text, and comments.  
       2. Feature Construction: Features were derived from both text (e.g., comments, video descriptions) and quantitative aspects (e.g., song structure, beat patterns).  
       3. Annotation: A user annotation process was employed to label content as parody or non-parody.  
       4. Machine Learning Pipeline: A classification task was formulated to predict whether a piece of content is a parody or not. This involves feature selection, construction, and applying machine learning classifiers such as Support Vector Machines (SVMs) and Naive Bayes.  
       5. Results: The framework was tested on the YouTube dataset, and results indicate that combining textual and quantitative features improves the classification accuracy.  

     - Literature Review Context: Parody is a specific form of intertextuality, where original works are transformed into humorous or satirical versions. This transformation often involves a dialogic exchange, as noted by Bakhtin (1981). The authors of this study contribute to the detection of such transformations using computational methods.

12. **Pastiche**  
   (Pastiche intertextuality involves blending multiple styles or imitating the stylistic features of different texts, often without satirical intent, creating a structural amalgamation.)  
    - Paper: "Stylistic Transfer in Natural Language Generation Systems Using Recurrent Neural Networks"  
      - Link: [https://aclanthology.org/W16-6010.pdf](https://aclanthology.org/W16-6010.pdf)  
      - Summary: The authors explore the task of stylistic transfer in natural language generation (NLG) systems. They propose a method using Recurrent Neural Networks (RNNs), specifically LSTM-based autoencoders, to disentangle style from semantic content. This allows the system to perform stylistic transfer by separating and recombining latent representations of style and content. The process involves training the model on corpora representing different styles, and during testing, swapping the stylistic features between texts to generate output in a new style. This approach can be applied to pastiche generation by blending different styles into a single text.  
      - Method/Procedure:  
        1. Data Collection: Collect corpora for different styles (e.g., Shakespearean English and Simple English Wikipedia).
        2. Training the Model: Train an LSTM-based encoder-decoder model, where the latent representation consists of two components: style and content. This disentanglement between style and content is reinforced through a modified training objective that includes a cross-covariance term to ensure that style and content are separated.
        3. Testing/Generation: For stylistic transfer, during testing, the model is fed text from one style (e.g., A), and the style latent variables are replaced with those corresponding to another style (e.g., B). The output is text in style B while maintaining the semantic content of the input.
        4. Evaluation: The generated text is evaluated based on criteria such as soundness, coherence, and effectiveness, using both human judgments and potentially automatic metrics like BLEU or ROUGE.
        
        The overall training objective is to maximize the conditional log-likelihood of the output given the input, $p(y | x)$, while incorporating a disentanglement loss term to separate style from content.

      - Literature Review Context: Pastiche, like parody, is a form of intertextuality that blends styles from multiple sources. As Kabbara and Cheung (2016) show, stylistic transfer can be computationally modeled using RNNs to capture and separate stylistic and semantic features. This disentanglement allows for the recombination of these features, making it possible to generate texts that blend multiple styles. Ghiban and Trausan-Matu (2013) also explore similar concepts, using Word2Vec for capturing theme and style blending.

---

### IV. Contextual/Cultural Connections  
(Cultural and contextual intertextuality refers to how texts engage in broader dialogues with cultural, ideological, or historical contexts, reflecting the interplay between texts and their surrounding environments.)

13. **Thematic Intertextuality**  
   (Thematic intertextuality involves the recurrence of themes or motifs across texts, often shaped by cultural or contextual factors.)  
   - Paper: "Towards Theme Detection in Personal Finance Questions"  
     - Link: [https://arxiv.org/pdf/2110.01550](https://arxiv.org/pdf/2110.01550)  
     - Summary: This paper presents a method for detecting themes in customer service call transcripts, using a dataset of personal finance questions from StackExchange as a testbed. The approach processes call sections likely to contain the main reason for the call, segments them into sentences, and encodes these sentences using pre-trained sentence embeddings (e.g., Universal Sentence Encoder or SBERT). The encoded sentences are then clustered to identify thematic groupings. The authors find that a combination of the Universal Sentence Encoder and KMeans clustering outperforms more complex methods.  
     - Method/Procedure:  
          1. Data Preparation: The authors use a set of simple heuristics to identify the "Customer Problem Statements" (CPSs) from call center transcripts. These CPSs are then segmented into sentences.  
          2. Sentence Representation: Sentences are transformed into vector representations using different encoding models, including the Universal Sentence Encoder (USE), SBERT-family models, and a baseline TF-IDF weighted n-grams approach.  
          3. Clustering: The encoded sentences are clustered using the KMeans or HDBSCAN algorithms. KMeans with $k = 700$ was selected using the elbow method, while HDBSCAN was run with a minimum cluster size of 5.  
          4. Evaluation: The clusters are treated as a supervised classification model and evaluated using a custom evaluation procedure. For a given question, the closest cluster is identified for each sentence, and the final prediction is made based on the majority tag distribution from those clusters. The evaluation metric used is Micro-F1.  
          
          The prediction for a question $Q_j$ with sentences $S_j$ is computed by finding the nearest cluster $i_s$ for each sentence $s \in S_j$, averaging the tag distributions of those clusters, and selecting the highest-scoring tag $t$:

          $ \text{prediction}(Q_j) = \arg \max_{t} \left( \frac{1}{n} \sum_{s \in S_j} p_{i_s t} \right) $

          Here $p_{it}$ is the weight of tag $t$ in the tag distribution of cluster $i$, and $n = |S_j|$ is the number of sentences in the question.

      - Literature Review Context: Thematic intertextuality can be efficiently modeled using techniques like KMeans clustering combined with sentence embeddings, as demonstrated in this paper. However, as Romanello (2016) suggested, and as the findings here support, multiple techniques (such as USE and SBERT) often need to be compared to find the one that best captures subtle thematic patterns.
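
The evaluation step described above (assign each sentence to its nearest cluster, average the clusters' tag distributions, take the top tag) can be sketched in a few lines. This is an illustrative sketch, not the paper's code: it assumes the centroids have already been fitted (for example by KMeans) and that each cluster's tag distribution has been estimated from training questions. The names `nearest_cluster`, `predict_tag`, and the toy tags are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def nearest_cluster(
    vector: Sequence[float], centroids: Sequence[Sequence[float]]
) -> int:
    """Index of the centroid closest to the sentence vector (Euclidean)."""
    return min(range(len(centroids)), key=lambda i: math.dist(vector, centroids[i]))


def predict_tag(
    sentence_vectors: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
    cluster_tag_dist: Sequence[Dict[str, float]],
) -> str:
    """Average the tag distributions of the clusters nearest to each
    sentence of a question, then return the highest-scoring tag."""
    averaged: Dict[str, float] = defaultdict(float)
    n = len(sentence_vectors)
    for vector in sentence_vectors:
        cluster = nearest_cluster(vector, centroids)
        for tag, weight in cluster_tag_dist[cluster].items():
            averaged[tag] += weight / n
    return max(averaged, key=lambda tag: averaged[tag])


if __name__ == "__main__":
    # Two toy clusters in a 2-d embedding space with assumed tag distributions.
    centroids: List[List[float]] = [[0.0, 0.0], [10.0, 10.0]]
    cluster_tag_dist = [
        {"taxes": 0.8, "loans": 0.2},
        {"taxes": 0.1, "loans": 0.9},
    ]
    # A question with three sentences: one near cluster 0, two near cluster 1.
    question = [[0.5, 0.2], [9.0, 9.5], [10.0, 11.0]]
    print(predict_tag(question, centroids, cluster_tag_dist))
```

With real data the sentence vectors would come from an encoder such as USE or SBERT, and the predicted tags would then be scored against the gold tags with Micro-F1.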

14. **Cultural Intertextuality**  
   (Cultural intertextuality refers to how texts engage with cultural artifacts, traditions, and representations, creating dialogues between texts and broader cultural frameworks.)  
    - Venue: "Cultural Intertexts"   
      - Link: [https://www.cultural-intertexts.com/volumes/](https://www.cultural-intertexts.com/volumes/)  
      - Summary: Cultural Intertexts is an academic journal that explores the interconnectedness of various cultural artifacts, including literature, music, and visual arts, through the lens of intertextuality. One of its recent issues features a special section on "Representations of the Danube in Literature, Music, and Visual Arts." The journal focuses on how cultural spaces, identities, and memories are constructed and represented across different media. Additionally, it addresses broader themes such as gender, ghostwriting, and the politics of representation in literature. It provides a platform for scholars to disseminate their research on Literary and Cultural Studies, emphasizing both global and local perspectives.
      - Literature Review Context: Cultural intertextuality, as noted by Romanello (2016), requires sophisticated methods to detect, especially when working with multilingual corpora. Word2Vec and similar models help capture these cultural connections across texts.

15. **Ideological Intertextuality**  
   (Ideological intertextuality focuses on how texts reflect, engage with, or propagate ideological discourses, highlighting the connection between texts and broader political or ideological frameworks.)  
    - Paper: "Ideological Perspective Detection Using Semantic Features"  
      - Link: [https://aclanthology.org/S15-1015.pdf](https://aclanthology.org/S15-1015.pdf)  
      - Summary: This paper explores the automatic detection of a person's ideological stance from written text using semantic features such as word sense disambiguation (WSD) and latent semantic models. It evaluates the performance of these features in identifying political perspectives using two datasets: one created from American National Election Studies (ANES) questions and another from ideological debates covering topics like abortion, creationism, and gun rights. The authors demonstrate that combining WSD and latent semantic features improves the classification of ideological perspectives over unigram-based methods.  
      - Method/Procedure:  
        1. Datasets:  
            - The paper uses two datasets:  
              (i) A dataset generated from an Amazon Mechanical Turk experiment based on questions from the ANES survey. Answers to open-ended questions are combined and used to predict a person's presidential candidate choice.  
              (ii) The "Ideological Debates" dataset, which contains discussions on topics such as abortion, creationism, gun rights, and gay rights, with pro/against stances.  
        2. Features:  
            - Word Sense Disambiguation (WSD): Two types of WSD are explored:  
              (i) Contextual WSD (WSD-CXT), which uses the modified Lesk algorithm to disambiguate word senses based on surrounding context.  
              (ii) Most Frequent Sense WSD (WSD-MFS), which uses the most frequent sense from WordNet.  
            - Latent Semantic Models:  
              (i) Latent Dirichlet Allocation (LDA) and  
              (ii) Weighted Textual Matrix Factorization (WTMF) are used to generate topic distributions from documents.  
        3. Training:  
            - SVM classifiers are trained using features derived from WSD and latent semantics, combined with unigram features. The classifiers are evaluated using 10-fold cross-validation on the training set and tested on a held-out test set.  
        4. Evaluation:  
            - F$_{\beta=1}$ score is used to measure the performance, and results show that the combination of WSD and latent semantics consistently outperforms unigram baselines in both datasets.  
      - Literature Review Context: Ideological intertextuality aligns with Bakhtin’s (1981) dialogism, where texts respond to social and political discourses. Detecting ideological markers computationally, as explored by Barbu and Trausan-Matu (2017), is critical for understanding how texts engage with broader socio-political landscapes. The semantic features used in this paper facilitate a nuanced detection of such ideological markers in written discourse.
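
To make the feature step above more concrete, the sketch below shows how context-based sense tags can be added to unigram features. It is a deliberately simplified illustration: it uses a basic Lesk-style gloss overlap rather than the modified Lesk algorithm named in the paper, and a tiny hand-written sense inventory (`TOY_SENSES`, hypothetical) in place of WordNet. The resulting feature bag is the kind of input that could then be vectorized and passed to an SVM.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical toy sense inventory (sense label -> gloss).
# The paper draws senses from WordNet instead.
TOY_SENSES: Dict[str, Dict[str, str]] = {
    "arms": {
        "arms#weapon": "weapons guns firearms used for defense or attack",
        "arms#limb": "upper limbs of the human body attached to shoulders",
    },
}


def simplified_lesk(
    word: str, context: List[str], senses: Dict[str, Dict[str, str]]
) -> str:
    """Pick the sense whose gloss shares the most words with the context.
    Words without an entry in the inventory are returned unchanged."""
    candidates = senses.get(word)
    if not candidates:
        return word
    context_words = set(context) - {word}
    return max(
        candidates,
        key=lambda sense: len(context_words & set(candidates[sense].split())),
    )


def sense_and_unigram_features(
    tokens: List[str], senses: Dict[str, Dict[str, str]]
) -> Counter:
    """Combine unigram counts with counts of disambiguated sense labels."""
    features = Counter(f"uni={token}" for token in tokens)
    features.update(
        f"sense={simplified_lesk(token, tokens, senses)}"
        for token in tokens
        if token in senses
    )
    return features


if __name__ == "__main__":
    tokens = "citizens should keep guns and arms for defense".split()
    print(simplified_lesk("arms", tokens, TOY_SENSES))
    print(sense_and_unigram_features(tokens, TOY_SENSES))
```

A most-frequent-sense variant would skip the overlap computation and always return the first listed sense for a word; topic proportions from LDA or WTMF could be appended to the same feature bag.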

---

### V. Horizontal Intertextuality  
(Horizontal intertextuality refers to the connections between texts that exist on the same level, often reflecting parallels in theme, structure, or content across different texts.)

16. **Paralogues**  
   (Paralogues represent thematic or analogical connections between texts, where one text illuminates the intellectual, social, or political meanings of another, creating horizontal intertextual relationships.)  
    - Paper: "Cross-Cultural Paralogues in Literary Criticism"  
      - Reference: Miola, R. S. (2004). Seven Types of Intertextuality. In *Shakespeare, Italy, and Intertextuality*.  
      - Summary: Paralogues represent texts that illuminate intellectual, social, theological, or political meanings in other texts. These connections are often made horizontally through analogical discourses, rather than through direct lineation from the author's intent. This type of intertextuality is key in revealing cultural poetics and broader ideological contexts.  
      - Literature Review Context: Paralogues reflect the idea of ideological and cultural intertextuality, where texts engage in broader social and intellectual dialogues. Bakhtin’s (1981) concept of dialogism underpins this form of intertextuality, where texts are part of a larger cultural discourse.