Social Meme-ing: Measuring Linguistic Variation in Memes

Much work in NLP has used computational methods to explore sociolinguistic variation in text. In this paper, we argue that memes, as multimodal forms of language composed of visual templates and text, also exhibit meaningful social variation. We construct a computational pipeline that clusters individual meme instances into templates and semantic variables, taking advantage of their multimodal structure. We apply this method to a large collection of meme images from Reddit and release the resulting SemanticMemes dataset of 3.8M images clustered by semantic function. We use these clusters to analyze linguistic variation in memes, finding not only that socially meaningful variation in meme usage exists between subreddits, but also that patterns of meme innovation and acculturation within these communities align with previous findings on written language.


naitian/social-memeing
