This repository contains the dataset and code for our paper.
We compare AI-generated and human-written inspiring Reddit content across India and the UK. Although the differences may not be apparent to a human reader, linguistic methods reveal significant syntactic and lexical cross-cultural differences between generated and real inspiring posts.
All data is available in all_data.
Topic modeling features can be explored interactively in topic_analysis.