This repository links to the resources and additional results from our paper:
David Vilares and Carlos Gómez-Rodríguez. Grounding the Semantics of Part-of-Day Nouns Worldwide using Twitter, PEOPLES2018 (co-located with NAACL HLT 2018) - The 2nd Workshop on Computational Modeling of People's Opinions, Personality, and Emotions in Social Media, accepted, New Orleans, Louisiana, USA, 2018.
The usage of part-of-day nouns, such as 'night', and their time-specific greetings ('good night'), varies across languages and cultures. This paper shows the possibilities that Twitter offers for studying the semantics of these terms and their variability between countries. We mine a worldwide sample of multilingual tweets with temporal greetings, and study how their frequencies vary in relation to local time. The results provide insights into the semantics of these temporal expressions and the cultural and sociological factors influencing their usage.
The first figure is a greeting area chart for the USA, where 'good evening' and 'good afternoon' are well differentiated, with the transition happening around 16:30. This contrasts with countries such as Spain (the second figure), where the language has a single word ('tarde') for both 'evening' and 'afternoon', and whose greeting spans from around 14:00, as the morning ends late, to 21:00.
Dataset: each record consists of tab-separated fields, formatted as follows:
TweetID \t Language \t Datetime \t Country \t Part-of-day-noun-category \t Language-dependent-part-of-day-greeting
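As a minimal sketch of how records in this layout could be read, the Python snippet below parses tab-separated lines into named fields. The names `GreetingTweet` and `read_greeting_tweets` are hypothetical, the sample rows are invented for illustration and are not taken from the dataset, and the datetime field is kept as a raw string because its exact format is not stated here.

```python
import csv
import io
from collections import Counter
from typing import Iterator, NamedTuple, TextIO


class GreetingTweet(NamedTuple):
    """One record, with fields in the order given in the format line above."""
    tweet_id: str
    language: str
    datetime: str  # raw string; the timestamp format is an unknown here
    country: str
    category: str  # part-of-day noun category
    greeting: str  # language-dependent part-of-day greeting


def read_greeting_tweets(handle: TextIO) -> Iterator[GreetingTweet]:
    """Yield one GreetingTweet per well-formed line of a tab-separated stream."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 6:
            # Skip blank or malformed lines rather than guessing at their content.
            continue
        yield GreetingTweet(*row)


# Invented sample rows (NOT real data); the third line is deliberately malformed.
SAMPLE = (
    "1\ten\t2017-01-01 08:15:00\tUS\tmorning\tgood morning\n"
    "2\tes\t2017-01-01 15:30:00\tES\tafternoon\tbuenas tardes\n"
    "broken line without enough fields\n"
)

rows = list(read_greeting_tweets(io.StringIO(SAMPLE)))

# Count how many records fall into each (country, category) pair.
counts = Counter((r.country, r.category) for r in rows)
```

To read a real file, the same function could be given an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` points to a local copy of the dataset.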
Additional results and plots (not shown in the paper for space reasons)
If you have any suggestions, inquiries, or bugs to report, please contact david.vilares@udc.es