
Predicts the gender and age of an author based on Twitter corpus for English, Dutch, Italian and Spanish languages.


yeasmin62/Author_Profiling


Author_Profiling

Multilingual Author Profiling project for the Learning From Data Project.

How To Run:

  1. To run the scripts, keep the training and test directories in the same directory as the scripts.

  2. Specify the following command-line arguments when you run the Python scripts:

    python English_svm.py -tr "training/english/" -te "test/english/"

    python Dutch_svm.py -tr "training/dutch/" -te "test/dutch/"

    python Italian_svm.py -tr "training/italian/" -te "test/italian/"

    python Spanish_svm.py -tr "training/spanish/" -te "test/spanish/"
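
The `-tr` and `-te` flags in the commands above point each script at its training and test directories. The repository's own parsing code is not shown here, so the following is only a minimal sketch of how such flags could be read with Python's standard `argparse` module. The function names `build_parser` and `parse_dirs`, the `dest` names, and the help texts are hypothetical and chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the -tr / -te flags used in the commands above.

    Assumption: the real scripts may parse their arguments differently;
    only the flag names -tr and -te come from the README.
    """
    parser = argparse.ArgumentParser(
        description="Illustrative argument parser for an author profiling script."
    )
    # dest is set explicitly because the flags have no long form.
    parser.add_argument("-tr", dest="train_dir", required=True,
                        help="path to the training directory (hypothetical help text)")
    parser.add_argument("-te", dest="test_dir", required=True,
                        help="path to the test directory (hypothetical help text)")
    return parser


def parse_dirs(argv=None):
    """Return (train_dir, test_dir); argv=None falls back to sys.argv[1:]."""
    args = build_parser().parse_args(argv)
    return args.train_dir, args.test_dir


# Example call with an explicit argument list instead of the real command line.
train_dir, test_dir = parse_dirs(["-tr", "training/english/", "-te", "test/english/"])
```

Passing an explicit list to `parse_dirs` makes the sketch easy to try without a shell; calling it with no arguments reads the flags from the actual command line instead.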

Please cite the paper if you use the code or data:

@InProceedings{10.1007/978-3-030-51859-2_46,
	author="Rahman, Md. Ataur
	and Akter, Yeasmin Ara",
	editor="Chen, Joy Iong-Zong
	and Tavares, Jo{\~a}o Manuel R. S.
	and Shakya, Subarna
	and Iliyasu, Abdullah M.",
	title="Multi-lingual Author Profiling: Predicting Gender and Age from Tweets!",
	booktitle="Image Processing and Capsule Networks",
	year="2021",
	publisher="Springer International Publishing",
	address="Cham",
	pages="505--513",
	abstract="This article describes how we build a multi-lingual classification system for author profiling. We have used Twitter corpus for English, Dutch, Italian and Spanish languages for building different models incorporating SVM classifier that predicts the gender and age of an author. We evaluated each model using 3-fold cross-validation on the training dataset for each of these languages. The overall maximum average accuracy for gender classification was 81.3{\%} for Spanish while for classification of age we achieved a maximum accuracy score of 70.3{\%} for English using the cross-validation scheme. For other languages, the results were between 64--76{\%}.",
	isbn="978-3-030-51859-2"
}
