Article feedback research
Created to examine the value of project quality assessments (GA, FA, etc.) in light of the broad user feedback available from the Article Feedback Tool.
Eventually we want to model the relationship between marginal changes in project quality assessment and user feedback.
- Ingests the feedback data and the list of project rated articles.
- Contains three options for sourcing the data: local, remote and initial (explained below).
- feed.plots.R, feed.Animations.R, feed.ggplot.R
- For now, most of the big stuff is in here. Plots for summary stats and some other things
- Simple linear regression, an aux regression and some binomial/multinomial regressions
All of this code comes from my personal use. I've tried to comment where possible and make sensible decisions, but I make no guarantees.
It is also not written in the style of an R package. The code expects you to run each script as needed (after the import script) in order to load the functions and objects into your environment. Then you can plot or model as needed using the supplied functions/objects.
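The "source each script, then use what it defines" workflow can be sketched roughly as follows. `load_scripts` is a hypothetical helper, not part of this repository, and the commented call uses only the plot script names listed above. The import script has to be sourced first.

```r
# Hypothetical helper: source every script that exists and warn about the rest.
load_scripts <- function(paths, envir = globalenv()) {
  found <- file.exists(paths)
  for (p in paths[found]) {
    # local = envir evaluates the script inside the chosen environment
    source(p, local = envir)
  }
  if (any(!found)) {
    warning("Scripts not found: ", paste(paths[!found], collapse = ", "))
  }
  invisible(paths[found])
}

# Example call (run the import script before this):
# load_scripts(c("feed.plots.R", "feed.ggplot.R"))
```

Sourcing into the global environment is the default here because the scripts expect their functions and objects to be visible at the top level.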
- All code is released under a CC-BY-SA license, as described and linked in license.md
- ggplot2 for plotting
- boot for some bootstrapping (not strictly required but it makes the code easier)
- animation for saving animations (also requires ImageMagick)
- MASS for a few regression models and utility functions
- XML for reading from the MediaWiki API
Used for miscellany (or not used inside the script proper)
- xtable for printing tables
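A quick way to see which of the packages listed above are missing before running anything is sketched below. `missing_packages` is a hypothetical helper, and the sketch assumes the CRAN package names `ggplot2`, `boot`, `animation`, `MASS`, `XML` and `xtable`.

```r
required <- c("ggplot2", "boot", "animation", "MASS", "XML")
optional <- c("xtable")

# Hypothetical helper: return the packages that cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

not_installed <- missing_packages(c(required, optional))
if (length(not_installed) > 0) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
  # install.packages(not_installed)  # uncomment to install from CRAN
}
```

ImageMagick is a system program rather than an R package, so this check does not cover it.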
Individual script notes
Some things of note:
- The options to download and create directories (initial) and to load from local files (local) expect a certain set of folders and will fail rather gracelessly if those folders are modified. This should be fine unless you change your working directory after saving the files to disk.
- Depending on your internet connection, loading the files remotely will take a few minutes. Both (remote) and (initial) call the MediaWiki API to enumerate assessment categories and download the feedback CSV.
- Most of these use the full dataset without decimation so they will take a while to render.
- Two basic linear regressions. The first is a simple regression comparing rating average to other variables. The second is an auxiliary regression meant to back out the influence of article length on rating average.
- The simple model is checked against a bootstrapped regression for a rough test of error structure.
- Summary stats and a table of differences between assessed articles are also produced.
- Two proportional odds models are fitted and bootstrapped to better illustrate how length, count and rating average relate to the likelihood of assessment. Plots are produced in this script due to the overhead in fitting the function repeatedly.
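As an illustration of the category enumeration mentioned in the import notes, the sketch below builds a MediaWiki API request URL for the members of one category. It is not the import script's code: `build_category_url` is a hypothetical helper, and the endpoint and the category name are assumptions chosen for the example.

```r
# Hypothetical helper: build a list=categorymembers query URL.
build_category_url <- function(category, limit = 500,
                               endpoint = "https://en.wikipedia.org/w/api.php") {
  query <- c(
    action  = "query",
    list    = "categorymembers",
    cmtitle = utils::URLencode(category, reserved = TRUE),
    cmlimit = as.character(limit),
    format  = "xml"
  )
  paste0(endpoint, "?", paste(names(query), query, sep = "=", collapse = "&"))
}

# Assumed category name, for illustration only.
url <- build_category_url("Category:FA-Class articles")
print(url)
# The XML response would then be downloaded and parsed, for example with
# the XML package listed in the requirements.
```

Large categories return results in batches, so a full enumeration also has to follow the continuation token the API returns with each response.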
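The linear regression notes can be illustrated on simulated data as below. This is a sketch under assumptions, not the repository's code: the data frame `articles` and its columns `rating`, `log_length` and `count` are made up, and the auxiliary step shown (regress rating on length, keep the residuals) is one common reading of "backing out" the influence of length.

```r
library(boot)

set.seed(1)
n <- 200
articles <- data.frame(
  log_length = rnorm(n, mean = 10, sd = 2),  # hypothetical: log article length
  count      = rpois(n, lambda = 20)         # hypothetical: number of ratings
)
articles$rating <- 3 + 0.1 * articles$log_length +
  0.01 * articles$count + rnorm(n, sd = 0.3)

# Simple regression of rating average on the other variables.
simple_fit <- lm(rating ~ log_length + count, data = articles)

# Auxiliary regression: the residuals are the rating net of length.
aux_fit <- lm(rating ~ log_length, data = articles)
articles$rating_net <- resid(aux_fit)

# Case-resampling bootstrap of the simple model's coefficients.
coef_stat <- function(data, idx) {
  coef(lm(rating ~ log_length + count, data = data[idx, ]))
}
boot_fit <- boot(articles, statistic = coef_stat, R = 200)

# Compare model-based and bootstrap standard errors.
ols_se  <- summary(simple_fit)$coefficients[, "Std. Error"]
boot_se <- apply(boot_fit$t, 2, sd)
print(cbind(ols_se, boot_se))
```

If the two columns of standard errors differ noticeably, the usual error assumptions of the simple model are worth a closer look.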
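A proportional odds model of the kind described in the last note can be fitted with `MASS::polr`. The sketch below uses simulated data. The ordering `unassessed < GA < FA`, the column names and the effect sizes are all assumptions made for illustration.

```r
library(MASS)

set.seed(2)
n <- 300
rating     <- runif(n, 1, 5)  # hypothetical: rating average
log_length <- rnorm(n)        # hypothetical: standardised log length

# Simulate an ordered outcome from a latent logistic variable.
latent <- 1.2 * rating + 0.8 * log_length + rlogis(n)
assessment <- cut(latent,
                  breaks = quantile(latent, c(0, 0.5, 0.8, 1)),
                  labels = c("unassessed", "GA", "FA"),
                  include.lowest = TRUE, ordered_result = TRUE)
d <- data.frame(assessment, rating, log_length)

# Hess = TRUE stores the Hessian so summary() can report standard errors.
po_fit <- polr(assessment ~ rating + log_length, data = d, Hess = TRUE)
print(summary(po_fit))

# Predicted probability of each assessment level for two example articles.
newdata <- data.frame(rating = c(2, 4.5), log_length = 0)
probs <- predict(po_fit, newdata = newdata, type = "probs")
print(probs)
```

Bootstrapping such a model means refitting it on every resample, which is the overhead the note above refers to.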