ixa-ehu/sustainable-transport-sentiment-corpus

A Corpus for Sentiment Analysis of Sustainable Transport

This repository provides a new dataset, the Gold Standard Corpus (GSC) on Sentiment Analysis for Transport, built from manually annotated user reviews about transport. The dataset covers a range of transportation modes according to their sustainability.

More specifically, the dataset contains 2000 reviews from the transport domain, each manually annotated as positive or negative. It is the first publicly available corpus of its kind for the transport domain. The annotation process showed that the original TripAdvisor ratings, given on a scale of 1–5 stars, do not necessarily correspond to the actual polarity of the comments: when manually reviewing them, we found that around 25% of the ratings had to be corrected.
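
To illustrate the mismatch between star ratings and annotated polarity described above, here is a minimal sketch in Python. It is not the procedure used to build the corpus: the star-to-polarity thresholds, the function names `stars_to_polarity` and `correction_rate`, and the toy reviews are all assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def stars_to_polarity(stars: int) -> Optional[str]:
    """Map a 1-5 star rating to a polarity label.

    Assumed thresholds (not taken from the corpus): 1-2 stars are negative,
    4-5 stars are positive, and 3 stars have no clear polarity (None).
    """
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range: {stars}")
    if stars <= 2:
        return "negative"
    if stars >= 4:
        return "positive"
    return None


def correction_rate(reviews: List[Tuple[int, str]]) -> float:
    """Fraction of reviews whose star-derived polarity differs from the manual label.

    A 3-star review never matches a manual label under the assumed mapping,
    so it always counts as corrected.
    """
    if not reviews:
        return 0.0
    corrected = sum(
        1 for stars, manual in reviews if stars_to_polarity(stars) != manual
    )
    return corrected / len(reviews)


# Hypothetical toy data: (star rating, manually annotated polarity).
toy_reviews = [
    (5, "positive"),
    (4, "negative"),  # high rating, but the text reads as negative
    (1, "negative"),
    (3, "positive"),  # neutral rating resolved by the annotator
]

rate = correction_rate(toy_reviews)
print(f"Share of ratings needing correction: {rate:.0%}")
```

Applied to the real star ratings and manual labels, the same comparison would yield a figure comparable to the roughly 25% reported above, although the exact value depends on how 3-star reviews are treated.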

Datasets

NOTE: The notebooks use the scripts from this repository: https://github.com/ragerri/transformers-training-scripts

More Details and Citation
