diegoref edited this page Sep 4, 2017 · 23 revisions

… co-located with: Extended Semantic Web Conference 2017

… sponsored by: Springer

News

Challenge results for Task 1

1 - Mattia Atzeni and Amna Dridi: "Fine-Grained Sentiment Analysis on Financial Microblogs and News Headlines" - F1-Score: 0.8675

2 - Marco Federici: "A Knowledge-based Approach For Aspect-Based Opinion Mining" - F1-Score: 0.8424

3 - Walid Iguider and Diego Reforgiato Recupero: "Language Independent Sentiment Analysis of the Shukran Social Network using Apache Spark" - F1-Score: 0.8378

4 - Giulio Petrucci: "The IRMUDOSA System at ESWC-2017 Challenge on Semantic Sentiment Analysis" - F1-Score: 0.8112

The Challenge is open! Please subscribe to the mailing list to be kept up to date.

The ESWC-17 Challenge on Semantic Sentiment Analysis is open to everyone from industry and academia working within the sentiment analysis area.

Background and Relevance for the Semantic Web community

The development of Web 2.0 has given users important tools and opportunities to create, participate in and populate blogs, review sites, web forums, social networks and online discussions. Tracking emotions and opinions on certain subjects makes it possible to identify users' expectations, feelings, needs, reactions to particular events, political views on certain ideas, etc. Therefore, mining, extracting and understanding opinion data from text that resides in online discussions is currently a hot topic for the research community and a key asset for industry.

The resulting discussions span a wide range of domains such as commerce, tourism, education, health, etc. Moreover, this content feeds back into Web 2.0 itself, leading to an exponential expansion.

This explosion of activity and data has created several opportunities that can be exploited in both research and industry. One of them concerns the mining and detection of users' opinions, which started back in 2003 (with the classical problem of polarity detection) and for which several variations have since been proposed. Today there are still open challenges that have raised interest within the scientific community, where new hybrid approaches are being proposed that bring substantial benefits by making use of new lexical resources, natural language processing techniques and Semantic Web best practices.

Computer World [1] estimates that 70%-80% of all digital data consists of unstructured content, much of which is locked away across a variety of different data stores, locations and formats. Moreover, accurately analyzing text in an understandable manner is extremely difficult and still far from solved. In fact, mining, detecting and assessing opinions and sentiments from natural language involves a deep (lexical, syntactic, semantic) understanding of most of the explicit and implicit, regular and irregular rules of a language.

Existing approaches mainly focus on identifying the parts of a text where opinions and sentiments are explicitly expressed, such as polarity terms, expressions and statements that express emotions. They usually adopt purely syntactical approaches and are heavily dependent on the source language and the domain of the input text. As a result, they miss many language patterns in which opinions can be expressed, because detecting these would require a deep analysis of the semantics of a sentence. Today, several tools exist that can help in understanding the semantics of a sentence. This offers an exciting research opportunity and challenge to the Semantic Web community as well. For example, sentic computing is a multi-disciplinary approach to natural language processing and understanding at the crossroads between affective computing, information extraction, and common-sense reasoning, which exploits both computer and human sciences to better interpret and process social information on the Web.

Therefore, the Semantic Sentiment Analysis Challenge looks for systems that can transform unstructured textual information to structured machine processable data in any domain by using recent advances in natural language processing, sentiment analysis and semantic web.

By relying on large semantic knowledge bases, Semantic Web best practices and techniques, and new lexical resources, semantic sentiment analysis steps away from the blind use of keywords and simple statistical analysis based on syntactical rules, and relies instead on the implicit semantic features associated with natural language concepts. Unlike purely syntactical techniques, semantic sentiment analysis approaches are able to detect sentiments that are implicitly expressed within the text and the topics those sentiments refer to, and they can achieve higher performance than purely statistical methods.

[1] Computer World, 25 October 2004, Vol. 38, NO 43.

Submissions

Two-step submission

First step:

  • Abstract: no more than 200 words.
  • Paper (max 4 pages): containing the details of the system, including why the system is innovative, which features or functions the system provides, what design choices were made and what lessons were learned, how semantics has been employed and which tasks the system addresses. Industrial tools with non-disclosure restrictions are also allowed to participate, and in this case they are asked to:
    • explain even at a higher level their approach and engine macro-components, why it is innovative, and how the semantics is involved;
    • provide free access (even limited) for research purposes to their engine, especially to make repeatable the challenge results or other experiments possibly included in their paper

Second step (for accepted systems only):

  • Paper (max 15 pages): full description of the submitted system.
  • Web Access: applications should be accessible via the web or downloadable; otherwise, a RESTful API must be provided to run the challenge test set. If an application is not publicly accessible, a password must be provided for reviewers. A short set of instructions on how to use the application or the RESTful API must be provided as well.
  • The authors will have the possibility to present a poster and a demo advertising their work or networking during a dedicated session.

Please note that:

  • Papers must comply with the LNCS style
  • Papers are submitted in PDF format via the EasyChair submission pages (remember to check the topic Challenge).
  • Accepted papers will be published by Springer.
  • Extended versions of best systems will be invited to journal special issues.
  • All the participants are invited to submit a paper containing the research aspects of their systems to the ESWC 2017 Workshop on Emotions, Modality, Sentiment Analysis and the Semantic Web (http://www.maurodragoni.com/research/opinionmining/events/)

Important Dates

  • Friday March 10th, 2017, 23:59 (CET): First step submission
  • Friday March 17th, 2017, 23:59 (CET): Notification of acceptance
  • Friday March 31st, 2017, 23:59 (CET): Second step submission
  • Friday April 14th, 2017, 23:59 (CET): Reviews
  • Sunday April 23rd, 2017, 23:59 (CET): Camera ready Papers
  • Thursday May 25th, 2017, 23:59 (CET): Test data published
  • Sunday May 28th - June 1st, 2017: The Challenge takes place at ESWC-17
  • Sunday May 28th - June 30th, 2017: Camera ready paper for the challenge post proceedings (15 pages document)

Challenge Criteria

This challenge focuses on the introduction, presentation, development and discussion of novel approaches to semantic sentiment analysis. Participants will have to design a semantic opinion-mining engine that exploits Semantic Web knowledge bases, e.g., ontologies, DBpedia, etc., to perform multi-domain sentiment analysis. The main motivation for this challenge is to go beyond a mere word-level analysis of natural language text and provide novel semantic tools and techniques that allow a more efficient passage from (unstructured) natural language to (structured) machine-processable data in potentially any domain.

The submitted systems must provide an output according to Semantic Web standards (RDF, OWL, etc.). Systems must have a semantic flavour (e.g., by making use of Linked Data or known semantic networks within their core functionalities) and authors need to show how the introduction of semantics improves the performance of their methods. Existing natural language processing methods or statistical approaches can be used too as long as the semantics plays a role within the core approach and improves the precision (engines based merely on syntax/word-count will be excluded from the competition). The target language is English and multi-language capability is a plus.

Tasks

The Semantic Sentiment Analysis Challenge is defined in terms of different tasks. The first task is elementary whereas the others are more advanced.

Task #1: Polarity Detection

This task is binary polarity detection: for each review of the evaluation dataset (test set), the goal is to detect its polarity value (positive OR negative). The participating semantic opinion-mining engines will be assessed according to precision, recall and F-measure computed on the confusion matrix of detected polarity values. In the final ranking to determine the winner, tools will be ordered by the average of the F-measures obtained on each class. Participants can assume that there will be no neutral reviews. The output format for this task is the following:

Task #1

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Sentences>
    <sentence id="apparel_0">
        <text>
        GOOD LOOKING KICKS IF YOUR KICKIN IT OLD SCHOOL LIKE ME. AND COMFORTABLE. AND 
        RELATIVELY CHEAP. I'LL ALWAYS KEEP A PAIR OF STAN SMITH'S AROUND FOR WEEKENDS
        </text>
        <polarity>
        positive
        </polarity>
    </sentence>
    <sentence id="apparel_1">
        <text>
        These sunglasses are all right. They were a little crooked, but still cool..
        </text>
        <polarity>
        positive
        </polarity>
    </sentence>
</Sentences>

Input is the same without the polarity tag. The dataset will consist of one million reviews collected from the Amazon website and split into 20 categories: Amazon Instant Video, Automotive, Baby, Beauty, Books, Clothing Accessories, Electronics, Health, Home Kitchen, Movies TV, Music, Office Products, Patio, Pet Supplies, Shoes, Software, Sports Outdoors, Tools Home Improvement, Toys Games, and Video Games. Each review has been classified (positive or negative) according to the guidelines used for the construction of the Blitzer dataset [2]. Participants will evaluate their system by applying cross-validation over the dataset, where each fold is clearly delimited. The script to compute Precision, Recall, and F-Measure and the confusion matrix will be provided to participants through the website of the challenge.

[2] Blitzer J., Dredze M., Pereira F. Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification. Association for Computational Linguistics (ACL), 2007.
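
As an illustration of the ranking measure described for Task #1, the following Python sketch reads two documents in the output format above and averages the per-class F-measure. It is not the official evaluation script: the function names read_polarities and macro_f1 are hypothetical, and it assumes that a sentence missing from the system output counts as an error for its gold class.

```python
import xml.etree.ElementTree as ET
from collections import Counter

LABELS = ("positive", "negative")


def read_polarities(xml_text):
    """Map sentence id -> polarity label from a Task #1 style document."""
    root = ET.fromstring(xml_text)
    return {
        sentence.get("id"): sentence.findtext("polarity", default="").strip()
        for sentence in root.iter("sentence")
    }


def macro_f1(gold, predicted, labels=LABELS):
    """Average the per-class F-measure over the given labels.

    gold and predicted are dicts mapping sentence id -> label.
    Assumption: ids absent from `predicted` are treated as misses.
    """
    # Confusion matrix stored as counts of (gold label, predicted label).
    confusion = Counter((gold[i], predicted.get(i)) for i in gold)
    scores = []
    for label in labels:
        tp = confusion[(label, label)]
        fp = sum(n for (g, p), n in confusion.items() if p == label and g != label)
        fn = sum(n for (g, p), n in confusion.items() if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)
```

A typical call would be macro_f1(read_polarities(gold_xml), read_polarities(system_xml)), where both arguments are strings holding complete documents.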

Task #2: Polarity Detection in presence of metaphorical language

This task is polarity detection (positive, negative or neutral) of tweets containing figurative expressions such as irony, metaphor and sarcasm. The proposed semantic opinion-mining engines will be assessed according to precision, recall and F-measure computed on the confusion matrix of detected polarity values (positive OR negative) for each tweet of the evaluation dataset. The output format for this task is the following:

Task #2

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Sentences>
    <sentence id="apparel_0">
        <text>
        I just love working for 6.5 hours without a break or anything. 
        Especially when I'm on my period and have awful cramps.
        </text>
        <polarity>
        negative
        </polarity>
    </sentence>
    <sentence id="apparel_1">
        <text>
        I literally love Stephen A smith haha he's hilarious
        </text>
        <polarity>
        positive
        </polarity>
    </sentence>
</Sentences>

Input is the same without the polarity tag. The dataset will consist of three thousand tweets collected from Twitter and already classified with [positive, negative, neutral] polarity values. The manual annotation of each tweet will be performed using Crowdflower [3]. The script to compute Precision, Recall, and F-Measure will be provided to participants through the website of the challenge.

[3] https://www.crowdflower.com/

Task #3: Aspect-Based Sentiment Analysis

The output of this task will be a set of aspects of the reviewed product and a binary polarity value associated with each of these aspects. For example, while Task #1 expects an overall polarity (positive or negative) for a review about a mobile phone, this task requires a set of aspects (such as 'speaker', 'touchscreen', 'camera', etc.) and a polarity value (positive OR negative) associated with each of them. Engines will be assessed on both aspect extraction and aspect polarity detection using precision, recall and F-measure, as was done during the first Concept-Level Sentiment Analysis Challenge held during ESWC2014 and re-proposed at SemEval 2015 Task12 [4].

[4] http://alt.qcri.org/semeval2015/task12/

The output format for such a task is the following:

Task #3

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Review rid="1">
        <sentences>
            <sentence id="348:0">
                <text>Most everything is fine with this machine: speed, capacity, build.</text>
                <Opinions>
                    <Opinion aspect="MACHINE" polarity="positive"/>
                </Opinions>
            </sentence>
            <sentence id="348:1">
                <text>The only thing I don't understand is that the resolution of the
                 screen isn't high enough for some pages, such as Yahoo!Mail.
                </text>
                <Opinions>
                    <Opinion aspect="SCREEN" polarity="negative"/>
                </Opinions>
            </sentence>
            <sentence id="277:2">
                <text>The screen takes some getting use to, because it is smaller
                 than the laptop.</text>
                <Opinions>
                    <Opinion aspect="SCREEN" polarity="negative"/>
                </Opinions>
            </sentence>
        </sentences>
    </Review>

Input is the same without the Opinions tag and its descendant nodes. As training set, we will use the dataset provided by the last two editions of SemEval; as test set, we will extract around 100 sentences from the web and annotate their aspects and related polarity. Two experts will annotate the sentences and disagreements will be analyzed. Precision, Recall and F-Measure will be computed with respect to the extraction of concepts and the computation of their polarity. The script to compute Precision, Recall, and F-Measure will be provided to participants through the website of the challenge.
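
The two scores mentioned above (aspect extraction and aspect polarity) can be sketched in Python as set comparisons between a gold document and a system document. This is only a sketch under stated assumptions, not the official script: the helper names are hypothetical, an aspect counts as correct only on an exact match of sentence id and aspect label, and polarity is scored on (sentence id, aspect, polarity) triples.

```python
import xml.etree.ElementTree as ET


def read_opinions(xml_text):
    """Collect (sentence id, aspect, polarity) triples from a Task #3 document."""
    root = ET.fromstring(xml_text)
    triples = set()
    for sentence in root.iter("sentence"):
        for opinion in sentence.iter("Opinion"):
            triples.add(
                (sentence.get("id"), opinion.get("aspect"), opinion.get("polarity"))
            )
    return triples


def prf(gold, predicted):
    """Precision, recall and F-measure of one set against another."""
    tp = len(gold & predicted)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


def score_task3(gold_xml, system_xml):
    """Score aspect extraction and aspect polarity separately."""
    gold = read_opinions(gold_xml)
    system = read_opinions(system_xml)
    # Aspect extraction ignores the polarity attribute.
    gold_aspects = {(sid, aspect) for sid, aspect, _ in gold}
    system_aspects = {(sid, aspect) for sid, aspect, _ in system}
    return {
        "aspect_extraction": prf(gold_aspects, system_aspects),
        "aspect_polarity": prf(gold, system),
    }
```

Keeping the two scores separate makes it visible whether a system loses points by missing aspects or by assigning the wrong polarity to aspects it did find.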

Task #4: Semantic Sentiment Retrieval

This task focuses on the capability of retrieving relevant documents with respect to opinion-based queries given as input to participant systems. The retrieval process has to be supported by semantic resources. This task combines Information Retrieval (detect features of given entities), Named Entity Recognition (detect smartphone models within the review, possibly using some sort of knowledge base) and Sentiment Analysis (aggregate feature-level opinions into an entity-level sentiment, for either overall or feature-based retrieval). The input format for this task is the following:

Task #4

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<documents>
    <document id="0">
        <text>So far so good. My wife just loves the new Samsung S5: the display is awesome 
        and the colors are very brilliant. However, further memory is necessary for storing
         everything.</text>
    </document>
    <document id="1">
        <text>All the LG G3 have problems with videos: they often are not able to connect
        with tv and when they can, the quality of the image is poor. The only strong point
       is the amount of memory coming from the factory.</text>
    </document>
    <document id="2">
        <text>The team behind a project to build one of the world's largest telescopes said on 
        Monday it has chosen Spain's Canary Islands in the Atlantic Ocean as a possible alternative to Hawaii.
        The decision follows opposition from Native Hawaiians and environmentalists to plans for 
        constructing the so-called Thirty Meter Telescope (TMT), which would cost $1.4 billion, at the Mauna Kea 
        volcano on Hawaii's Big Island.</text>
    </document>
</documents>

The query format will be the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<queries>
    <query id="q0">
        <text>Documents talking about smartphone display.</text>
    </query>
</queries>

The output format should look like the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Ranks>
    <query id="q0">
        <position value="1" documentId="0"/>
        <position value="2" documentId="1"/>
    </query>
</Ranks>

The entire dataset (train + test) will be built from scratch. 2-4 experts will validate the annotations for this task, and document relevance will be evaluated according to the Normalized Discounted Cumulated Gain measure. The script to compute the Normalized Discounted Cumulated Gain measure will be provided to participants through the website of the challenge. The first 20 documents returned by each participant will be manually judged.
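
For readers unfamiliar with Normalized Discounted Cumulated Gain, the following Python sketch shows one common formulation, in which the gain at rank i is divided by log2(i + 1) and the total is normalized by the score of an ideal ordering. The official script may use a different variant; the function names, the graded relevance values and the cutoff parameter k (set to 20 here only because the first 20 documents are judged) are assumptions made for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulated gain of a list of relevance grades in rank order."""
    return sum(
        rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(ranked_doc_ids, relevance, k=20):
    """NDCG of one ranked list for one query.

    ranked_doc_ids: document ids in the order returned by the system.
    relevance: dict mapping document id -> relevance grade (hypothetical scale);
               documents that are not listed are treated as non-relevant.
    """
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_doc_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg else 0.0
```

With the sample query above, ranked_doc_ids would be ["0", "1"], read from the documentId attributes of the position elements in rank order.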

Task #5: Frame Entities Identification

The Challenge focuses on semantic fine-grained sentiment analysis. This means that the proposed engines must work beyond the word/syntax level, addressing a concept/semantics perspective. This task will evaluate the capabilities of the proposed systems to identify the objects involved in a typical opinion frame according to their role: holders, topics, opinion concepts (i.e. terms referring to highly polarised concepts). For example, in a sentence such as "The mayor is loved by the people in the city, but he has been criticized by the state government" (taken from [5]), an approach should be able to identify that "the people" and "the state government" are the opinion holders, "is loved" and "has been criticized" represent the opinion concepts, "the mayor" identifies the topic of the opinion, and that there are two different opinion polarities mentioned in the sentence. The proposed engines will be evaluated according to precision, recall and F-measure.

[5] Sentiment Analysis and Opinion Mining, Bing Liu, 2012

The output format for such a task is the following:

Task #5

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 <Sentences>
     <sentence id="348:0">
         <text>The mayor is loved by the people in the city, 
         but he has been criticized by the state government.
         </text>
         <Frames>
             <Frame>
                 <holder start="22" end="32" value="the people"/>
                 <topic start="0" end="9" value="The mayor"/>
                 <opinion start="10" end="18" value="is loved"/>
                 <polarity>positive</polarity>
             </Frame>
             <Frame>
                 <holder start="76" end="96" value="the state government"/>
                 <topic start="0" end="9" value="The mayor"/>
                 <opinion start="53" end="72" value="has been criticized"/>
                 <polarity>negative</polarity>
             </Frame>            
         </Frames>
     </sentence>
 </Sentences>

Input is the same without the Frames tag and its descendant nodes. As training set, we will use the dataset adopted for the last edition of the challenge; a new set of 100 annotated sentences, built from scratch, will constitute the test set. 2-4 experts will validate the annotations. Precision, Recall, and F-Measure will be computed against the number of recognized entities, and the script to compute them will be provided to participants through the website of the challenge.
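
Since each holder, topic and opinion element carries character offsets, a system can sanity-check its own output before submitting. The Python sketch below does this under one assumption that the page does not state explicitly: offsets are taken to be 0-based, end-exclusive positions in the sentence text after line breaks and runs of whitespace are collapsed to single spaces (the sample above is consistent with this reading). The function name check_frame_offsets is hypothetical.

```python
import xml.etree.ElementTree as ET

EXAMPLE = """<Sentences>
  <sentence id="348:0">
    <text>The mayor is loved by the people in the city,
    but he has been criticized by the state government.
    </text>
    <Frames>
      <Frame>
        <holder start="22" end="32" value="the people"/>
        <topic start="0" end="9" value="The mayor"/>
        <opinion start="10" end="18" value="is loved"/>
        <polarity>positive</polarity>
      </Frame>
      <Frame>
        <holder start="76" end="96" value="the state government"/>
        <topic start="0" end="9" value="The mayor"/>
        <opinion start="53" end="72" value="has been criticized"/>
        <polarity>negative</polarity>
      </Frame>
    </Frames>
  </sentence>
</Sentences>"""


def check_frame_offsets(xml_text):
    """Return (sentence id, role, declared value, actual slice) for each mismatch."""
    root = ET.fromstring(xml_text)
    problems = []
    for sentence in root.iter("sentence"):
        # Assumption: offsets refer to whitespace-normalized sentence text.
        text = " ".join(sentence.findtext("text", default="").split())
        for frame in sentence.iter("Frame"):
            for span in frame:
                if span.get("start") is None:
                    continue  # <polarity> carries no offsets
                start, end = int(span.get("start")), int(span.get("end"))
                actual = text[start:end]
                if actual != span.get("value"):
                    problems.append(
                        (sentence.get("id"), span.tag, span.get("value"), actual)
                    )
    return problems
```

An empty list means every declared value matches the text at its offsets; any entry points at the frame element whose start or end needs to be corrected.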

Task #6: Subjectivity and Objectivity Detection

This task has commonly been defined as follows: given a text, classify it as objective or subjective. An objective sentence does not contain any opinion, whereas a subjective one does. The proposed engines are strongly encouraged to use Semantic Web solutions, best practices and technologies to solve this task, or to indicate how semantics (even if implicitly adopted) is employed in their methods.

The output format for such a task is the following:

Task #6

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
 <Sentences>
     <sentence id="348:0">
         <text>The mayor is loved by the people in the city.
         </text>
         <value>objective</value>
     </sentence>
     <sentence id="348:1">
         <text>The mayor he has been elected by many voters.
         </text>
         <value>subjective</value>
     </sentence>
 </Sentences>

Input is the same without the value tag. Training set and test set will be collected from the web, manually annotating 200 documents as subjective or objective. 2-4 experts will validate the annotations for this task.

Judging and Prizes

During the competition, the new test sets will be released and participants will have to send their output within the next 4 hours. The chairs will run the evaluation scripts on those results against the annotated test sets to compute precision and recall and produce a scoreboard of the systems for each task.

One award will be given for each task (the winner of each task will be the one with the highest score in precision-recall analysis) and one more award will be given for the most innovative approach (the system with the best use of common-sense knowledge and semantics and innovative nature of approach).

The awards will consist of Springer vouchers and a cash prize (depending on sponsor availability).

In addition, each challenge paper will be included in a Springer book, as was done for the 2014, 2015 and 2016 editions of the challenge.

Organizers

  • Diego Reforgiato Recupero, University of Cagliari, Italy (diego.reforgiato@unica.it) Diego Reforgiato Recupero has been an Associate Professor at the Department of Mathematics and Computer Science of the University of Cagliari, Italy since December 2015. He holds a double bachelor from the University of Catania in computer science and a doctoral degree from the Department of Computer Science of University of Naples Federico II. He got the National qualification for Computer Science Engineering and he is a Computer Science Engineer. In 2005 he won the Computer World Horizon Award in the USA for the best research project on OASYS, an opinion analysis system that was commercialized by SentiMetrix (US company he co-founded). In 2008, he won a Marie Curie International Grant and the “Best Researcher Award 2012” for a project about the development of a green router nearing commercialization. In the same year he got to the winning podium of the “Startup Weekend” event held in Catania and was a winner of Telecom Italia Working Capital Award with a grant of 25k euros for the “Green Home Gateway” project. In 2012 he co-founded the Italian company R2M Solution s.r.l. In 2013 he published a paper on Science related to the energy efficiency techniques in Internet. He also co-founded R2M Solution ltd. (UK company founded in 2014), La Zenia s.r.l. (Italian company founded in 2014 for management of sports and recreational events) and B-UP (Italian company founded in 2016 together with colleagues of CNR). He is a patent co-owner in the field of data mining and sentiment analysis (20100023311).

  • Erik Cambria, Nanyang Technological University, Singapore (cambria@ntu.edu.sg) Erik Cambria received his BEng and MEng with honors in Electronic Engineering from the University of Genoa in 2005 and 2008, respectively. In 2012, he was awarded his PhD in Computing Science and Mathematics following the completion of an EPSRC project in collaboration with MIT Media Lab, which was selected as impact case study by the University of Stirling for the UK Research Excellence Framework (REF2014). After working at HP Labs India, Microsoft Research Asia, and NUS Temasek Labs, in 2014 Dr Cambria joined the School of Computer Science and Engineering at NTU as Assistant Professor. His current affiliations also include Rolls-Royce@NTU, A*STAR IHPC, MIT Synthetic Intelligence Lab, and the Brain Sciences Foundation. He is Associate Editor of Elsevier KBS and IPM, Springer AIRE and Cognitive Computation, IEEE CIM, and Editor of the IEEE IS Department on Affective Computing and Sentiment Analysis. Dr Cambria is also recipient of several awards, e.g., the Temasek Research Fellowship, and is involved in many international conferences as Workshop Organizer, e.g., ICDM and KDD, PC Member, e.g., AAAI and ACL, Program and Track Chair, e.g., ELM and FLAIRS, and Keynote Speaker, e.g., CICLing.

  • Emanuele Di Rosa, FINSA s.p.a, Italy (emanuele.dirosa@finsa.it) Emanuele Di Rosa, PhD, is currently Chief Scientific Officer at Finsa s.p.a., leading the R&D department on Machine Learning and Semantic Analysis, and Program Manager of the Web and Mobile department. He is currently focused on the design and implementation of software tools for Opinion Mining in both Product/Service reviews and Social Media Analysis. He previously worked at ETT s.p.a. as Technical Project Leader and Senior Software Engineer, where he led a team of software developers working in the field of new interactive multimedia technologies, mobile apps and web solutions. He was a visiting researcher at the Cork Constraint Computation Centre (4C, University College Cork, Ireland) from August to November 2009, where he worked with Prof. Barry O’Sullivan, currently Director of the Insight Centre for Data Analytics. He received a PhD in April 2011 from the doctoral school of "Science and Technology for Information and Knowledge" of the University of Genoa. In 2007 he received cum laude a Master Degree in Computer Science and Engineering from the University of Genoa. In 2005 he was awarded the "Targa junior" award by Confindustria in the 2nd ed. of the "Perotto Prize" for the best software project in Liguria in the under 30 category. His research interests focus on artificial intelligence (both machine learning and automated reasoning) and software engineering. He has been a reviewer of scientific papers for the following conferences: SAT 2008, SAT 2009, CP 2009, IJCAI 2009, SAT 2010, KR 2010. He is author and co-author of seventeen international peer-reviewed articles published in conference/workshop/doctoral consortium proceedings and journals.

Program Committee

  • Aldo Gangemi, University of Paris13 and CNR (France and Italy)
  • Valentina Presutti, CNR (Italy)
  • Malvina Nissim, University of Bologna (Italy)
  • Hassan Saif, Open University (UK)
  • Rada Mihalcea, University of North Texas (USA)
  • Ping Chen, University of Houston-Downtown (USA)
  • Yongzheng Zhang, LinkedIn Inc. (USA)
  • Giuseppe Di Fabbrizio, Amazon Inc. (USA)
  • Soujanya Poria, Nanyang Technological University (Singapore)
  • Yunqing Xia, Tsinghua University (China)
  • Rui Xia, Nanjing University of Science and Technology (China)
  • Jane Hsu, National Taiwan University (Taiwan)
  • Rafal Rzepka, Hokkaido University (Japan)
  • Amir Hussain, University of Stirling (UK)
  • Alexander Gelbukh, National Polytechnic Institute (Mexico)
  • Bjoern Schuller, Technical University of Munich (Germany)
  • Amitava Das, Samsung Research India (India)
  • Dipankar Das, National Institute of Technology (India)
  • Stefano Squartini, Marche Polytechnic University (Italy)
  • Cristina Bosco, University of Torino (Italy)
  • Paolo Rosso, Technical University of Valencia (Spain)
  • Sergio Consoli, Philips Research (Netherlands)

Mailing List

To ask questions or request information, please join our Google Group (https://groups.google.com/forum/#!forum/semantic-sentiment-analysis). After you join the group, you can post messages to the topic "ESWC2017 Semantic Sentiment Analysis Challenge".