Commit

maelick committed Mar 8, 2018
1 parent 25a7ac2 commit 01d2ff5
Showing 7 changed files with 20,028 additions and 0 deletions.
2,001 changes: 2,001 additions & 0 deletions data-raw/kubertenes.csv

Large diffs are not rendered by default.

2,001 changes: 2,001 additions & 0 deletions data-raw/lines.10k.cfo.sample.2000 - Kubernetes (Slackarchive.io).csv

Large diffs are not rendered by default.

2,001 changes: 2,001 additions & 0 deletions data-raw/lines.10k.cfo.sample.2000 - Lucene-dev mailing list.csv

Large diffs are not rendered by default.

10,001 changes: 10,001 additions & 0 deletions data-raw/lines.10k.cfo.sample.2000 - Mozilla (Firefox, Core, OS).csv

Large diffs are not rendered by default.

2,001 changes: 2,001 additions & 0 deletions data-raw/lucene.csv

Large diffs are not rendered by default.

22 changes: 22 additions & 0 deletions data-raw/make_data.R
@@ -0,0 +1,22 @@
library(devtools)
library(data.table)

filenames <- c(mozilla="data-raw/lines.10k.cfo.sample.2000 - Mozilla (Firefox, Core, OS).csv",
               kubertenes="data-raw/lines.10k.cfo.sample.2000 - Kubernetes (Slackarchive.io).csv",
               lucene="data-raw/lines.10k.cfo.sample.2000 - Lucene-dev mailing list.csv")

MakeFactor <- function(ratings) factor(ratings, labels=c("NL", "Not"))

nlon.data <- rbindlist(lapply(names(filenames), function(x) {
  res <- fread(filenames[x], encoding="UTF-8")[!is.na(Disagreement)]
  if (x == "lucene") {
    res[, Text := gsub("^[>|\\s]+", "", Text, perl=TRUE)]
  }
  res <- res[, list(source=x, text=Text,
                    rater1=MakeFactor(Mika),
                    rater2=MakeFactor(Fabio))]
  fwrite(res, sprintf("data-raw/%s.csv", x))
  res
}))

devtools::use_data(nlon.data, overwrite=TRUE)
2,001 changes: 2,001 additions & 0 deletions data-raw/mozilla.csv

Large diffs are not rendered by default.
