Notebooks for the Seattle PyData 2017 talk on Scattertext
Updated Jan 12, 2018 - HTML
Summer 2017 Social Media Analytics Workshop Series
2018 Computational Text Analysis Notebooks, University of Mannheim
From using xpdf, rvest, and quanteda on United Nations Digital Library search results to applying dictionaries to speeches in United Nations meeting records
Original corpus of articles relating to refugees, scraped from the Tennessee newspaper The Chattanoogan, along with simple code for a text-as-data word cloud.
A small showcase for topic modeling with the tmtoolkit Python package. I use a corpus of articles from the German online news website Spiegel Online (SPON) to create a topic model for the periods before and during the COVID-19 pandemic.
A tutorial on using regular expressions in R
The ABC of Computational Text Analysis. BA Seminar, Spring 2021, University of Lucerne
Code and models for 3 different tools to measure appeals to 8 discrete emotions in German political text
'dictvectoR' measures the similarity between a concept dictionary and documents, using fastText word vectors. Implements the "Distributed-Dictionary-Representation" (Garten et al. 2018) method in R.
Empirical framework applied to parliament discourses and Twitter data, with a Discourse Polarization Index.
This repository uses text-as-data methods alongside traditional primary source reading to analyze early American state constitutions. The R scripts create a function to scrape and clean the constitutional text, run sentiment analysis, calculate tf-idf, and perform LDA. This is a work-in-progress.
An automated web crawler for extracting central bankers' speeches
A package designed to replicate the estimates and findings of the article "Factionalism and the Red Guards under Mao's China: Ideal Point Estimation Using Text Data."
The ABC of Computational Text Analysis. BA Seminar, Spring 2022, University of Lucerne
🇮🇱🇵🇸 News coverage of Israel-Hamas War 🇵🇸🇮🇱
Material from my Machine Learning for the Social Sciences course
A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis)
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).