Create Debate Data Sets used in NAACL'13 paper “Mining User Relations from Online Discussions using Sentiment Analysis and Probabilistic Matrix Factorization”

yangliuy/Debate-DataSets_NAACL13

Debate-DataSets_NAACL13

/** Copyright (C) 2013 by SMU Text Mining Group/Singapore Management University/Peking University

The Debate Dataset is distributed for research purposes, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

If you use this dataset, please cite the following paper:

Minghui Qiu, Liu Yang and Jing Jiang. Mining User Relations from Online Discussions using Sentiment Analysis and Probabilistic Matrix Factorization. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2013). (http://aclweb.org/anthology//N/N13/N13-1041.pdf) **/

Brief Introduction

  1. These data sets are used in the paper: Minghui Qiu, Liu Yang and Jing Jiang. Mining User Relations from Online Discussions using Sentiment Analysis and Probabilistic Matrix Factorization. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), Atlanta, GA, 2013.

  2. Descriptions: Folder "sents":

    1. It contains the sentences of every thread.
    2. Each file is a thread, and each line is a post.
    3. Sentence format: each post is in the format "source target url post_id sentence".
  3. Folder "labels": It contains the user labels for each thread.

  4. Acknowledgement:

    Please note that the above data sets are from: Amjad Abu-Jbara, Pradeep Dasigi, Mona Diab, and Dragomir R. Radev. 2012. Subgroup detection in ideological discussions. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, pages 399–409.

    The original data sets contain:

    117 Wikipedia discussions collected from www.wikipedia.org (directory: wikipedia)

    30 debates collected from www.createdebate.com (directory: createdeate)

    12 political discussions collected from www.politicalforum.com (directory: politicalforum)
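
Reading a thread file

The post format given for the "sents" folder ("source target url post_id sentence") can be read with a few lines of Python. This is only a minimal sketch, not code shipped with the data sets. It assumes that the five fields are separated by whitespace and that everything after the fourth field is the sentence; the README does not state the delimiter, so check a real file first. The names `Post`, `parse_post` and `parse_thread` are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional


class Post(NamedTuple):
    """One line of a thread file in the "sents" folder (hypothetical type)."""
    source: str
    target: str
    url: str
    post_id: str
    sentence: str


def parse_post(line: str) -> Optional[Post]:
    """Parse one line of the form "source target url post_id sentence".

    ASSUMPTION: fields are whitespace-separated and the sentence is the
    remainder of the line. Returns None for lines with fewer than 5 fields.
    """
    # maxsplit=4 keeps the spaces inside the sentence intact.
    parts = line.strip().split(None, 4)
    if len(parts) < 5:
        return None
    return Post(*parts)


def parse_thread(lines: Iterable[str]) -> List[Post]:
    """Parse every well-formed line of one thread, skipping the rest."""
    posts = []
    for line in lines:
        post = parse_post(line)
        if post is not None:
            posts.append(post)
    return posts
```

To read one thread from disk, pass an open file to `parse_thread`, for example `with open(path, encoding="utf-8") as f: posts = parse_thread(f)`, where `path` points to a file in the "sents" folder. If the files turn out to be tab-separated, replacing `split(None, 4)` with `split("\t", 4)` should be enough.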
