SOTorrent
Popular repositories
- db-scripts Public
SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent data set, to retrieve Stack Overflow references from the BigQuery GitHub data set, and to retrieve data from th…
- posthistory-extractor Public
Extracts the version history of text and code blocks from the official Stack Overflow data dump.
- metric-evaluation Public
Comparison of different string similarity metrics for reconstructing the history of Stack Overflow posts.
Repositories
- preprocessing-pipeline Public
Preprocessing pipeline to extract and normalize text/code blocks from Stack Exchange forum posts and comments.
- posthistory-extractor Public
Extracts the version history of text and code blocks from the official Stack Overflow data dump.
- db-scripts Public
SQL and Bash scripts to import the official Stack Overflow data dump and the SOTorrent data set, to retrieve Stack Overflow references from the BigQuery GitHub data set, and to retrieve data from the SOTorrent dataset for analysis.
- so-internal-refs Public
Scripts used to import and analyze internal web server logs provided by Stack Overflow under an NDA.