A tool for automatically identifying and clustering implicit and explicit aspects in aspect-based opinion mining / sentiment analysis applications. OpCluster-PT can also be customized for other languages; the minimum requirement is a set of lexical language resources such as WordNet and deverbal, foreign, diminutive and augmentative lexicons. OpCl…

franciellevargas/OpCluster-PT

OpCluster: Automatic Extraction and Hierarchical Clustering of Fine-Grained Opinions

OpCluster-PT is a customized version of OpCluster for the Portuguese language.

OpCluster is an algorithm for extracting and hierarchically clustering implicit and explicit fine-grained opinions (also called aspects) from web consumer reviews. The method organizes similar implicit and explicit aspects, considering their context of use, inside a tree. For example, in the review "she considers the price of the camera very expensive", the consumer used the term "price" to evaluate an aspect (property) of the camera. However, consumers may also use the terms "cost", "value", "investment", "cost-benefit", etc. In addition, consumers may use implicit or explicit expressions to refer to the same aspect, e.g., "she got calls at the São Francisco river" and "working anywhere" were used in smartphone reviews to implicitly evaluate the aspect "signal". It is also interesting to note that, in a wide range of domains, proper names may be used to refer to aspects. For instance, the proper names "Sony" and "Nikon" may be used to evaluate the "product brand" aspect of digital cameras. Hence, this task is hard!
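To make the idea of an aspect tree concrete, here is a toy sketch in Python 3 built only from the examples in the paragraph above. It is a hand-written lookup structure, not the OpCluster algorithm itself; the names `ASPECT_TREE` and `find_aspect`, the two-level product/aspect layout, and the labelling of "Sony" and "Nikon" as a separate `proper_name` kind are all assumptions made for illustration.

```python
# Toy illustration only: a hand-built aspect tree, NOT the OpCluster algorithm.
# All names and the tree layout are hypothetical; the terms come from the README examples.
ASPECT_TREE = {
    "camera": {
        "price": {
            "explicit": ["price", "cost", "value", "investment", "cost-benefit"],
            "implicit": [],
            "proper_name": [],
        },
        "product brand": {
            "explicit": [],
            "implicit": [],
            "proper_name": ["Sony", "Nikon"],
        },
    },
    "smartphone": {
        "signal": {
            "explicit": ["signal"],
            "implicit": [
                "she got calls at the São Francisco river",
                "working anywhere",
            ],
            "proper_name": [],
        },
    },
}


def find_aspect(expression, tree=ASPECT_TREE):
    """Return (product, aspect, kind) for an exact, case-insensitive match, else None."""
    needle = expression.casefold()
    for product, aspects in tree.items():
        for aspect, mentions in aspects.items():
            for kind, expressions in mentions.items():
                if any(needle == item.casefold() for item in expressions):
                    return product, aspect, kind
    return None
```

For instance, `find_aspect("cost")` and `find_aspect("price")` resolve to the same "price" node, which is the grouping behaviour the paragraph describes. The real method has to discover such groupings automatically rather than read them from a fixed table.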

HOW DO YOU USE OPCLUSTER?

  1. Download or clone this repository;
  2. Open the file "OpClusterPT.py" (an IDE and Python 2 or 3 must be installed);
  3. Check that all the input files are in the same folder as the "OpClusterPT.py" file;
  4. Extract the archives "OntoPT.tar.xz" and "corp_xml_reli.zip";
  5. Run the algorithm.
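If you prefer to do the extraction step from Python rather than with a desktop tool, the following is a minimal sketch using only the standard library. It assumes Python 3.3 or later (needed for `.tar.xz` support) and that both archives sit in the given folder; the helper name `extract_archives` is hypothetical and not part of the repository. Only extract archives you trust.

```python
import tarfile
import zipfile
from pathlib import Path

# Archive names as given in the steps above.
ARCHIVES = ("OntoPT.tar.xz", "corp_xml_reli.zip")


def extract_archives(folder="."):
    """Extract the known archives found in `folder`; return the names that were extracted."""
    folder = Path(folder)
    extracted = []
    for name in ARCHIVES:
        archive = folder / name
        if not archive.exists():
            # Report and skip instead of failing, so the other archive is still handled.
            print("missing:", archive)
            continue
        if name.endswith(".zip"):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(folder)
        else:
            with tarfile.open(archive, "r:xz") as tf:
                tf.extractall(folder)
        extracted.append(name)
    return extracted
```

Calling `extract_archives(".")` from the repository folder unpacks both archives next to "OpClusterPT.py", which matches the requirement that the input files live in the same folder as the script.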

We also provide the set of aspect taxonomies and annotated reviews that were used in this master's degree work. However, to apply the algorithm to other data, you need to download the CORP system, desktop version (available here: https://www.inf.pucrs.br/linatural/wordpress/recursos-e-ferramentas/), and run it on the new review dataset. It will generate a set of XML files with the labeled reviews, which are used as input to OpCluster-PT. The OpCluster-PT 2.0 web version will be available soon. Finally, additional information can be found in my full Master's thesis, available here: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-31072018-170236/en.php.
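Before running OpCluster-PT on new data, it can help to confirm that the XML files generated by CORP are present and well-formed. The sketch below (Python 3, standard library only) lists each `.xml` file in a folder with its root tag. The CORP output schema is not documented here, so the code makes no assumption about element names; the helper name `list_xml_inputs` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def list_xml_inputs(folder):
    """Return (file name, root tag) pairs for every .xml file in `folder`.

    The root tag is None when the file cannot be parsed as XML.
    """
    results = []
    for path in sorted(Path(folder).glob("*.xml")):
        try:
            root_tag = ET.parse(str(path)).getroot().tag
        except ET.ParseError:
            root_tag = None  # malformed file; worth regenerating with CORP
        results.append((path.name, root_tag))
    return results
```

Any entry whose second element is `None` points to a file that a parser would probably reject, so it is worth checking before the clustering step.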


The OpCluster-PT web version is available here: http://www.nilc.icmc.usp.br/opcluster/

CITING

Vargas, F.A. and Pardo, T.A.S. (2018). Aspect clustering methods for sentiment analysis. Proceedings of the 13th International Conference on the Computational Processing of Portuguese (PROPOR). pp. 365-374. Canela-RS/Brazil.


BIBTEX

@inproceedings{DBLP:conf/propor/VargasAndPardo18,
  author    = {Francielle A. Vargas and Thiago A. S. Pardo},
  title     = {Aspect Clustering Methods for Sentiment Analysis},
  booktitle = {Proceedings of the 13th International Conference on the Computational Processing of Portuguese, {PROPOR}},
  pages     = {365--374},
  year      = {2018},
  address   = {Canela, Brazil},
  url       = {https://link.springer.com/chapter/10.1007/978-3-319-99722-3_37}
}

