PhD in Economics Empirical Research Course
Updated Sep 25, 2024 - HTML
Beautiful visualizations of how language differs among document types.
Code for collecting and cleaning the text of speeches from the 2020 US election campaign. Corresponding publication: "A text dataset of campaign speeches of the main tickets in the 2020 US presidential election", by Ioannis Chalkiadakis, Louise Anglès d’Auriac, Gareth W. Peters, and Divina Frau-Meigs.
Summer/winter schools, workshops and conferences in computational social science 🫂
LinkOrgs: An R package for linking records on organizations using half a billion open-collaborated records from LinkedIn
Replication script for web scraping transcripts of the parliamentary debates in the National Council of the Slovak Republic (1994-2023) and the ensuing sentiment analysis
Collection of text corpora for publicly available speeches from Mexican president Andres Manuel Lopez Obrador (AMLO) sourced from YouTube. The dataset includes his daily morning conferences (conferencias mañaneras) 😴🪿
Interpretable data visualizations for understanding how texts differ at the word level
Text analysis with networks.
Literature 📄 and datasets 📚 on automatic populism detection
A Python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis)
Material from my Machine Learning for the Social Sciences course
The ABC of Computational Text Analysis. BA Seminar, Spring 2022, University of Lucerne
A package for replicating the estimates and findings in the article "Factionalism and the Red Guards under Mao's China: Ideal Point Estimation Using Text Data."
An automated web crawler for extracting central bankers' speeches
This repository uses text-as-data methods alongside traditional primary source reading to analyze early American state constitutions. The R scripts create a function to scrape and clean the constitutional text, run sentiment analysis, calculate tf-idf, and perform LDA. This is a work-in-progress.
Empirical framework applied to parliamentary discourse and Twitter data, with a Discourse Polarization Index.
'dictvectoR' measures the similarity between a concept dictionary and documents, using fastText word vectors. Implements the "Distributed-Dictionary-Representation" (Garten et al. 2018) method in R.