Roman Urdu is perceived as a highly colloquial and informal form of text, and it is a common communication script in the subcontinent. Sentiment analysis has been carried out for many resource-rich languages such as English, French, Chinese, and Arabic, but Roman Urdu, being a resource-starved language, has been studied far less. We present the largest Roman Urdu E-Commerce reviews dataset (RUECD) for multiclass sentiment analysis. The dataset contains 26,824 Roman Urdu reviews, each labeled as Positive (1), Negative (0), or Neutral (2).
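The label encoding described above can be illustrated with a minimal Python sketch. Only the three integer codes (1, 0, 2) come from the dataset description; the helper names `decode_label` and `label_distribution` and the sample labels are hypothetical, and no file name or column layout of RUECD is assumed here.

```python
from collections import Counter

# Label encoding as stated in the dataset description.
LABEL_NAMES = {0: "Negative", 1: "Positive", 2: "Neutral"}


def decode_label(label):
    """Map an integer label (int or numeric string) to its sentiment name."""
    key = int(label)
    if key not in LABEL_NAMES:
        raise ValueError(f"unexpected label: {label!r}")
    return LABEL_NAMES[key]


def label_distribution(labels):
    """Count how many reviews fall into each sentiment class."""
    return Counter(decode_label(label) for label in labels)


# Hypothetical labels for illustration only; not taken from RUECD.
sample_labels = [1, 0, 2, 1, "0"]
distribution = label_distribution(sample_labels)
print(distribution)
```

Checking the class distribution this way before training is one simple way to spot class imbalance or stray label values in a multiclass sentiment dataset.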
A paper titled "Attention-Based RU-BiLSTM Sentiment Analysis Model for Roman Urdu" has been submitted for review. A demo of our proposed model can be accessed at: http://159.65.138.171:5000/