boun-tabi/BounTi-Turkish-Sentiment-Analysis

This repository includes training scripts, the finetuned model, and the dataset for BOUN Turkish Sentiment Analysis.

Implementation

You can use the finetuned model with the HuggingFace Transformers library. See the model link for more details and a demo.

from transformers import pipeline

bounti = pipeline("sentiment-analysis", model="akoksal/bounti")
print(bounti("Bu yemeği pek sevmedim"))
# Output: [{'label': 'negative', 'score': 0.8012508153915405}]
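
To classify several sentences at once, the pipeline can also be called with a list of strings. The sketch below is one possible way to wrap that; `label_texts` and `run_with_bounti` are hypothetical helpers, not part of this repository. `label_texts` accepts any callable that returns pipeline-style results, so it can be exercised without downloading the model.

```python
from typing import Callable, Dict, List


def label_texts(classifier: Callable[[List[str]], List[Dict]], texts: List[str]) -> List[str]:
    """Return one predicted label per input text.

    `classifier` is assumed to behave like a transformers text-classification
    pipeline: given a list of strings, it returns a list of
    {'label': ..., 'score': ...} dicts in the same order.
    """
    if not texts:
        return []
    return [result["label"] for result in classifier(texts)]


def run_with_bounti(texts: List[str]) -> List[str]:
    """Label `texts` with the finetuned model.

    Requires the `transformers` package and network access on first use,
    so the import is kept local to this function.
    """
    from transformers import pipeline

    bounti = pipeline("sentiment-analysis", model="akoksal/bounti")
    return label_texts(bounti, texts)
```

Calling `run_with_bounti(["Bu yemeği pek sevmedim", "Bu yemeği çok sevdim"])` would return one label string per sentence.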

Dataset

You can find the dataset in the data folder with the training, validation, and test splits.

Due to Twitter copyright, we cannot release the full text of the tweets. We share the tweet IDs, and the full text can be downloaded through the official Twitter API.
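
As a rough illustration of downloading the text from the IDs, the sketch below batches tweet IDs and queries the Twitter API v2 tweet lookup endpoint, which accepts up to 100 IDs per request. Several things here are assumptions rather than part of this repository: the helper names, the use of a bearer token, and that you have already read the IDs from the data files into a list of strings. API access terms have changed over time, so check the current official documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, Iterator, List

# Twitter API v2 tweet lookup endpoint (assumed to be available to your account).
LOOKUP_URL = "https://api.twitter.com/2/tweets"
MAX_IDS_PER_REQUEST = 100


def batch_ids(tweet_ids: List[str], size: int = MAX_IDS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` tweet IDs."""
    for start in range(0, len(tweet_ids), size):
        yield tweet_ids[start:start + size]


def build_lookup_url(batch: List[str]) -> str:
    """Build the lookup URL for one batch of IDs (comma-separated, URL-encoded)."""
    query = urllib.parse.urlencode({"ids": ",".join(batch)})
    return f"{LOOKUP_URL}?{query}"


def fetch_texts(tweet_ids: List[str], bearer_token: str) -> Dict[str, str]:
    """Map tweet ID to text for every tweet the API returns.

    Tweets that are deleted or otherwise unavailable are not in the
    response's "data" list and are therefore skipped.
    """
    texts: Dict[str, str] = {}
    for batch in batch_ids(tweet_ids):
        request = urllib.request.Request(
            build_lookup_url(batch),
            headers={"Authorization": f"Bearer {bearer_token}"},
        )
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
        for tweet in payload.get("data", []):
            texts[tweet["id"]] = tweet["text"]
    return texts
```

Because some tweets may no longer be available, the number of downloaded texts can be smaller than the counts reported for the dataset.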

| | Training | Validation | Test |
|---|---|---|---|
| Positive | 1691 | 188 | 469 |
| Neutral | 3034 | 338 | 843 |
| Negative | 1008 | 113 | 280 |
| Total | 5733 | 639 | 1592 |
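
The splits are imbalanced: neutral tweets make up about 53% of each one. A short sketch of that arithmetic, using only the counts from the table above (the function names are illustrative):

```python
# Counts copied from the dataset table.
COUNTS = {
    "training": {"positive": 1691, "neutral": 3034, "negative": 1008},
    "validation": {"positive": 188, "neutral": 338, "negative": 113},
    "test": {"positive": 469, "neutral": 843, "negative": 280},
}


def class_shares(split_counts):
    """Return each class's fraction of the split."""
    total = sum(split_counts.values())
    return {label: count / total for label, count in split_counts.items()}


def majority_baseline(split_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(split_counts.values()) / sum(split_counts.values())


for split, counts in COUNTS.items():
    shares = class_shares(counts)
    summary = ", ".join(f"{label} {share:.1%}" for label, share in shares.items())
    print(f"{split}: {summary}; majority baseline {majority_baseline(counts):.3f}")
```

The majority-class baseline gives a reference point when reading accuracy figures for a classifier trained on this data.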

Model

BERTurk model: Download (1.3 GB)

The scores of the finetuned model with BERTurk:

| | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| Validation | 0.745 | 0.706 | 0.730 | 0.715 |
| Test | 0.723 | 0.692 | 0.729 | 0.701 |
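
The README does not say how precision, recall, and F1 are averaged over the three classes. Since they differ from accuracy, they are not micro-averaged; the sketch below assumes macro-averaging (an unweighted mean of per-class scores), which is only one plausible reading. `macro_scores` is a hypothetical helper written for illustration, not the repository's evaluation code.

```python
from typing import Dict, List, Sequence


def macro_scores(y_true: Sequence[str], y_pred: Sequence[str], labels: List[str]) -> Dict[str, float]:
    """Compute accuracy plus macro-averaged precision, recall, and F1."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios (no predictions or no examples for a class) count as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

With imbalanced classes such as these, macro-averaged scores weight the smaller negative class as heavily as the larger neutral class, so they can differ noticeably from accuracy.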

Citation

You can cite the following paper if you use our work:

@INPROCEEDINGS{BounTi,
  author={Köksal, Abdullatif and Özgür, Arzucan},
  booktitle={2021 29th Signal Processing and Communications Applications Conference (SIU)}, 
  title={Twitter Dataset and Evaluation of Transformers for Turkish Sentiment Analysis}, 
  year={2021},
  volume={},
  number={}
  }

About

Twitter Dataset and Finetuned Transformer Model for Turkish Sentiment Analysis
