This is TwoTree, a model designed by our research team that combines dependency syntax trees and constituency syntax trees.
Empathetic response generation aims to enable dialogue systems to perceive the emotions expressed by speakers and to generate empathetic responses accordingly. Existing methods struggle to discern dynamic emotions within complex human conversational patterns, which limits both the model's semantic comprehension and its ability to generate suitable empathetic responses. To address this issue, we propose the Integrating Constituency and Dependency Syntax Parse Trees for Empathetic Response Generation Model (ICDM), which integrates two types of trees for a comprehensive analysis of dialogue sentences, capturing dynamic emotions to generate high-quality responses. First, we build a constituency parse tree for each dialogue sentence to outline the hierarchical syntactic structure among phrases. In parallel, we construct a dependency parse tree to clarify the grammatical relationships between individual words within the sentence. We then use a cross-fusion module, comprising two layers of symmetrical Graph Convolutional Networks and Heterogeneous Graph Neural Networks, to encode and fuse both trees, capturing both local and global dynamic emotional nuances. Finally, both syntactic trees are integrated into the generator to facilitate the generation of empathetic responses. Evaluations on the EMPATHETICDIALOGUES dataset show that ICDM surpasses prior baselines in comprehending complex dialogue semantics and perceiving speakers' emotions, leading to better empathetic responses.
The empathetic response generation task [1] aims to perceive the emotions expressed by the interlocutor and to generate empathetic responses accordingly. As an important factor in promoting empathy [2][3][4], emotions build bridges between communicators and promote understanding [5]. Research in linguistics and psychology shows that emotional interaction is a continuous process, so the emotion of a conversation is constantly changing [6][7] and is closely related to contextual semantics [8][9]. Understanding and perceiving these dynamic emotions plays an important role in promoting conversational empathy [10]. Most existing methods only model the perception of static emotions in conversations [1], [11], [12], [13], [14], [15]. These methods perceive a single emotion type in the conversation through static emotion labels and ignore dynamic emotion perception, so the model's responses lack empathy. To perceive dynamic emotions, other methods capture multiple emotions across the overall conversation by designing multiple emotion listeners [16] or by using imitation strategies based on emotion polarity [17], while SEEK considers the emotional changes between sentences. However, these methods ignore the grammatical correlation between semantics and emotion words, resulting in insufficient semantic understanding. To address this, [18] and [19] introduced dependency trees through syntactic analysis to establish correlations between semantics and emotion and thereby capture dynamic emotions, and [7] proposed SETD-ERC, which constructs constituency graphs to parse sentences and mine latent semantic information to enhance the understanding of emotions.
However, the complex syntactic structure of human discourse makes it difficult for previous methods to fully perceive dynamic emotions. These methods either ignore dynamic emotions or do not parse semantics from dependency and constituency relations simultaneously, which results in inaccurate emotion perception and poor empathetic responses. According to psychological and linguistic research [20][21][22][23], fully considering the syntactic structure of conversational utterances, that is, the lexico-grammatical relationships and the relationships between discourse units (such as phrases and clauses), plays an important role in a deep understanding of semantics and dynamic emotions. The lexico-grammatical relationship corresponds to dependency parsing of emotional sentences, which reflects the grammatical relationship between the words in a sentence [24]; the discourse clause relationship corresponds to constituency parsing, which identifies the phrase structures in a sentence and the hierarchical syntactic relations between phrases [25]. Performing syntactic analysis of conversational sentences to perceive dynamic emotions usually achieves better results [26][27]. As shown in Figure 1, given a context in which "the rain" implies a negative emotional background, ChatbotA does not analyze the syntax of the utterance, so it fails to capture the shift from negative to positive emotion signaled by "despite" and expresses the wrong emotion, "bad".
In contrast, ChatbotB performs constituency parsing on the context, as shown in (a), and understands the emotional contrast between the subordinate clause "despite the rain" and "surprisingly cheerful" in the main clause. At the same time, it performs dependency parsing on the context, as shown in (b), which reveals that the relation between "despite" and "rain" steers the emotion from negative to positive, and that the intensifying relation between "surprisingly" and "cheerful" captures the intensity of the emotion in the sentence. Having perceived the speaker's dynamic emotional state, ChatbotB expresses the positive emotion "wonderful" and points out that the positive emotion is felt "despite the rainy weather". Constituency trees and dependency trees analyze semantics in depth from phrase fragments and from lexical and grammatical roles, respectively. The two complement each other and together promote the model's understanding and fine-grained perception of dynamic emotions. Therefore, we propose an empathetic dialogue generation model, the Integrating Constituency and Dependency Syntax Parse Trees for Empathetic Response Generation Model (ICDM), which combines constituency parse trees and dependency parse trees to perceive dynamic emotions and generate high-quality responses. ICDM constructs a constituency parse tree (constituency tree) and a dependency parse tree (dependency tree) for each dialogue sentence, parsing the sentence from the hierarchical relationships between clauses and from the grammatical relationships between words, respectively, so that the model can deeply understand complex conversational syntactic structures. ICDM then applies a cross-fusion module, consisting of two-layer symmetrical GCN and HGNN structures, to encode the two trees and fuse them in order to perceive changing dynamic emotions.
Finally, ICDM integrates the semantics and dynamic emotions contained in the two trees into the generation process. Through enhanced semantic understanding and fuller perception of dynamic emotions, ICDM is able to generate more empathetic responses. We conducted experiments on the EMPATHETICDIALOGUES (ED) benchmark dataset [11] and verified the effect of ICDM on a large language model. Experimental results show that ICDM achieves the best performance on both automatic and human evaluation metrics, and that it understands contextual semantics more deeply and perceives conversational emotions more accurately. Furthermore, the results of our analysis of constituency and dependency structures are consistent with linguistic conclusions. Our main contributions are as follows: (1) We introduce the concept and importance of dynamic emotions from psychology and linguistics research, and propose using constituency parsing and dependency parsing to enhance the understanding of conversational semantics and better perceive dynamic emotions. (2) We propose the ICDM model, which constructs a constituency tree and a dependency tree to parse the syntactic structure of a sentence from clauses down to words and understand the contextual semantics, uses the cross-fusion module to perceive dynamic emotions, and finally injects the two trees into the generator to produce empathetic responses. (3) Our experimental results on the EMPATHETICDIALOGUES benchmark dataset and validation results on a large language model show that ICDM outperforms the baseline models on all metrics. Building on its deeper understanding of the speaker's dynamic emotional state, ICDM produces more fluent and empathetic responses. Furthermore, additional statistical and analytical experiments show that the constituency and dependency structures of sentences in conversations are consistent with psychological research.
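To make the two syntactic views concrete, the following is a minimal sketch in plain Python and is not the ICDM implementation. Everything in it is an assumption made for illustration: the example sentence, the hand-written dependency arcs and constituency spans (no parser is run), the choice to link tokens that share a constituent span, the toy two-dimensional [negative, positive] features, and the identity weight matrix. It shows only how a symmetric-normalized graph convolution could propagate emotion cues along each structure; the HGNN-based fusion described above is omitted.

```python
import math

# Hypothetical example sentence, loosely modeled on the "despite the rain" case.
TOKENS = ["Despite", "the", "rain", "she", "was", "surprisingly", "cheerful"]

# Hand-written, UD-style dependency arcs (head index, dependent index, label).
DEP_ARCS = [
    (2, 0, "case"),    # rain -> Despite
    (2, 1, "det"),     # rain -> the
    (6, 2, "obl"),     # cheerful -> rain
    (6, 3, "nsubj"),   # cheerful -> she
    (6, 4, "cop"),     # cheerful -> was
    (6, 5, "advmod"),  # cheerful -> surprisingly
]

# Hand-written constituency spans (label, start, end), end exclusive.
CONSTITUENTS = [
    ("PP", 0, 3),    # Despite the rain
    ("NP", 1, 3),    # the rain
    ("ADJP", 5, 7),  # surprisingly cheerful
]


def dependency_adjacency(n, arcs):
    """Undirected adjacency matrix with self-loops built from dependency arcs."""
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for head, dep, _label in arcs:
        adj[head][dep] = adj[dep][head] = 1.0
    return adj


def constituency_adjacency(n, spans):
    """Assumed simplification: link all tokens that share a constituent span."""
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _label, start, end in spans:
        for i in range(start, end):
            for j in range(start, end):
                adj[i][j] = 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}; self-loops keep degrees > 0."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj_norm, feats, weight):
    """One graph convolution: ReLU(A_norm @ H @ W), written with plain lists."""
    n, d_in, d_out = len(feats), len(weight), len(weight[0])
    agg = [[sum(adj_norm[i][k] * feats[k][c] for k in range(n))
            for c in range(d_in)] for i in range(n)]
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)] for i in range(n)]


# Toy features: [negative cue, positive cue]; only two words carry emotion.
feats = [[0.0, 0.0] for _ in TOKENS]
feats[2] = [1.0, 0.0]  # "rain"     -> negative cue
feats[6] = [0.0, 1.0]  # "cheerful" -> positive cue
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned weights

dep_norm = normalize(dependency_adjacency(len(TOKENS), DEP_ARCS))
dep_h1 = gcn_layer(dep_norm, feats, identity)
dep_h2 = gcn_layer(dep_norm, dep_h1, identity)  # second layer: two-hop context

con_norm = normalize(constituency_adjacency(len(TOKENS), CONSTITUENTS))
con_h1 = gcn_layer(con_norm, feats, identity)

for tok, dep_vec, con_vec in zip(TOKENS, dep_h2, con_h1):
    print(tok, [round(x, 3) for x in dep_vec], [round(x, 3) for x in con_vec])
```

In this toy setup, one layer over the dependency graph lets "Despite" see only the negative cue of its head "rain"; the second layer also carries the positive cue of "cheerful" to it through "rain". This gives a small picture of why stacking two layers can capture context beyond immediate neighbors. The constituency graph instead shares cues within phrases such as "surprisingly cheerful".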