SO_reputation

A script/RESTful API to compute an approximation of the reputation gained by a Stack Overflow user up to a given date.

The script can be tested online at this address (be patient, it may take a few seconds). You can also use it as a web service in your own app; see the RESTful API documentation page for details.

NOTE: You will need to install the Git LFS extension to check out this project. Once it is installed and initialized, simply run:

$ git lfs clone https://github.com/collab-uniba/SO_reputation.git

DISCLAIMER

The computed reputation is only an estimate (with an error of about 10%). The following reputation rules are not taken into account:

  • Suggested edit is accepted: +2 (up to +1000 total per user)
  • Bounty awarded to your answer: + full bounty amount
  • One of your answers is awarded a bounty automatically: + half of the bounty amount (see more details about how bounties work)
  • Example you contributed to is voted up: +5
  • Proposed change is approved: +2
  • First time an answer that cites documentation you contributed to is upvoted: +5
  • You place a bounty on a question: - full bounty amount
  • One of your posts receives 6 spam or offensive flags: -100
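To make the size of this gap concrete, the following is a minimal sketch that adds up the reputation the tool does not count, using only the point values listed above. The `UncountedEvents` record and the `uncounted_reputation` function are hypothetical names introduced for illustration; they are not part of this repository. Halving an automatically awarded bounty with integer division is also an assumption.

```python
from dataclasses import dataclass

# Point values taken from the list of rules that are not considered.
SUGGESTED_EDIT_POINTS = 2
SUGGESTED_EDIT_CAP = 1000
EXAMPLE_UPVOTE_POINTS = 5
PROPOSED_CHANGE_POINTS = 2
DOC_CITATION_POINTS = 5
SPAM_FLAG_PENALTY = -100


@dataclass
class UncountedEvents:
    """Hypothetical record of the events the estimate ignores for one user."""
    accepted_suggested_edits: int = 0
    bounties_awarded: tuple = ()        # amounts awarded to the user's answers
    bounties_auto_awarded: tuple = ()   # amounts awarded automatically (half is credited)
    examples_upvoted: int = 0
    proposed_changes_approved: int = 0
    first_doc_citation_upvotes: int = 0
    bounties_placed: tuple = ()         # amounts the user offered on questions
    posts_flagged_spam_offensive: int = 0  # posts that reached 6 spam/offensive flags


def uncounted_reputation(ev: UncountedEvents) -> int:
    """Return the reputation delta that the estimate leaves out."""
    # Suggested edits earn +2 each, capped at +1000 in total per user.
    total = min(ev.accepted_suggested_edits * SUGGESTED_EDIT_POINTS, SUGGESTED_EDIT_CAP)
    total += sum(ev.bounties_awarded)
    # Assumption: half of the bounty, rounded down.
    total += sum(amount // 2 for amount in ev.bounties_auto_awarded)
    total += ev.examples_upvoted * EXAMPLE_UPVOTE_POINTS
    total += ev.proposed_changes_approved * PROPOSED_CHANGE_POINTS
    total += ev.first_doc_citation_upvotes * DOC_CITATION_POINTS
    total -= sum(ev.bounties_placed)
    total += ev.posts_flagged_spam_offensive * SPAM_FLAG_PENALTY
    return total
```

A user with many accepted suggested edits or large bounties can therefore differ from the estimate by hundreds of points, which is worth keeping in mind when interpreting the ~10% error.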

Fair Use Policy

Please cite the following works if you intend to use our tool for your own research:

F. Calefato, F. Lanubile, N. Novielli. “Moving to Stack Overflow: Best-Answer Prediction in Legacy Developer Forums.” In Proc. 10th Int’l Symposium on Empirical Softw. Eng. and Measurement (ESEM’16), Ciudad Real, Spain, Sept. 8-9, 2016, DOI:10.1145/2961111.2962585.

@inproceedings{Calefato_esem2016,
 author = {Calefato, Fabio and Lanubile, Filippo and Novielli, Nicole},
 title = {Moving to Stack Overflow: Best-Answer Prediction in Legacy Developer Forums},
 booktitle = {Proc. 10th ACM/IEEE Int'l Symposium on Empirical Software Engineering and Measurement}, 
 series = {ESEM '16},
 year = {2016},
 isbn = {978-1-4503-4427-2},
 location = {Ciudad Real, Spain},
 pages = {13:1--13:10},
 articleno = {13},
 numpages = {10},
 url = {http://doi.acm.org/10.1145/2961111.2962585},
 doi = {10.1145/2961111.2962585},
 publisher = {ACM},
} 

F. Calefato, F. Lanubile, M.C. Marasciulo, N. Novielli. MSR Challenge: “Mining Successful Answers in Stack Overflow.” In Proc. 12th IEEE Working Conf. on Mining Software Repositories (MSR 2015), Florence, Italy, May 16-17, 2015.

@inproceedings{Calefato_msr2015,
 author = {Calefato, Fabio and Lanubile, Filippo and Marasciulo, Maria Concetta and Novielli, Nicole},
 title = {Mining Successful Answers in Stack Overflow},
 booktitle = {Proc. 12th Working Conf. on Mining Software Repositories},
 series = {MSR '15},
 year = {2015},
 isbn = {978-0-7695-5594-2},
 location = {Florence, Italy},
 pages = {430--433},
 numpages = {4},
 url = {http://dl.acm.org/citation.cfm?id=2820518.2820579},
 publisher = {IEEE Press},
} 

DB Setup

If you want to run the script locally or deploy the web service on your own server, you need to set up a MySQL database. From this point forward, we assume that you have already imported the SO dump into a local MySQL db (there are several scripts that you can easily adapt to your purpose; see here and here, for example).

Requirements
  • Java 8+

Go to the scripts/db-setup/ folder and run the shell script setup.sh in batch mode:

$ sh setup.sh

NOTE: Before running, edit the first lines of the file to change the following information:

#!/bin/bash
MYSQL_USER=root
MYSQL_PASS=secret
MYSQL_SO_DB=stackoverflow
...

This script will create several tables/views and indexes to speed up the querying process, plus some CSV files named Question_Answer_?_(asc|desc).csv and Posts_Votes?_(asc|desc).csv. These CSV files are needed by the web service. Therefore, whenever you create or update the SO database, you should copy or move them to the subfolder webservice/StackOverflowRESTfulWebService/WebContent/WEB-INF where you deployed the reputation web service.
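The copy step can be automated. The following is a minimal sketch, assuming the generated CSV files sit in a single directory that you pass in (the README does not state where setup.sh writes them). The `copy_csv_files` helper and its loose file-name pattern are hypothetical and should be adjusted to the names actually produced on your system.

```python
import re
import shutil
from pathlib import Path

# Loose pattern for the two CSV families named above; the part between the
# prefix and the _asc/_desc suffix is matched permissively (assumption).
CSV_NAME = re.compile(r"^(Question_Answer_|Posts_Votes).*_(asc|desc)\.csv$")


def copy_csv_files(src_dir, web_inf_dir):
    """Copy the matching CSV files into WEB-INF and return their names, sorted."""
    src = Path(src_dir)
    dst = Path(web_inf_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file() and CSV_NAME.match(path.name):
            shutil.copy2(path, dst / path.name)  # copy2 also keeps timestamps
            copied.append(path.name)
    return copied
```

For example, `copy_csv_files("scripts/db-setup", "webservice/StackOverflowRESTfulWebService/WebContent/WEB-INF")` would copy the files if they were written next to setup.sh; that source path is a guess, not something the project documents.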

How to Run

The scripts referenced in this section must be edited prior to execution, in order to customize the following variables:

  • MySQL username
  • MySQL password
  • SO database name

Script: Sequential version

Requirements
  • Python 3
    • PyMySQL version 0.7.9
Usage

From the command line, run:

$ python reputation.py
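The internals of reputation.py are not reproduced here. As a minimal sketch of how the three settings to customize (MySQL username, password, and SO database name) could feed a PyMySQL connection, consider the following. The helper names `connection_settings` and `open_connection` are hypothetical, the default values mirror the ones shown in setup.sh, and `localhost` is an assumption based on the local MySQL db mentioned in the DB setup.

```python
# Settings to customize before running; defaults mirror the setup.sh example.
MYSQL_USER = "root"
MYSQL_PASS = "secret"
MYSQL_SO_DB = "stackoverflow"


def connection_settings(user=MYSQL_USER, password=MYSQL_PASS,
                        database=MYSQL_SO_DB, host="localhost"):
    """Return the keyword arguments for pymysql.connect as a plain dict."""
    return {"host": host, "user": user, "password": password, "database": database}


def open_connection(**overrides):
    """Open a connection to the SO database (requires PyMySQL and a running MySQL server)."""
    import pymysql  # imported lazily so connection_settings works without the driver

    settings = {**connection_settings(), **overrides}
    return pymysql.connect(**settings)
```

Keeping the settings in one dict makes it easy to check them before connecting, for instance by printing `connection_settings()` with the password masked.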

Script: Parallel Version

Requires the CSV files generated by the DB setup script.

Requirements
  • Java 8+
    • Akka version 2.1.4
    • OpenCSV version 3.9
Usage Akka-script

From the command line, run:

$ java -jar Akka-script-final.jar [UserId1] [date1] (...[UserIdN] [dateN]) -1 

where:

  • UserIdX is an integer representing a user id
  • dateX is a string representing a date in the format YYYY-mm-dd
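When calling the jar from another program, the argument list can be assembled and checked beforehand. The following is a minimal sketch, assuming the jar sits in the current directory; `build_akka_command` is a hypothetical helper, not part of the repository. It appends the trailing `-1` exactly as shown in the usage line, without assuming anything about its meaning.

```python
from datetime import datetime


def build_akka_command(pairs, jar="Akka-script-final.jar"):
    """Build the argument list for the Akka script from (user_id, date) pairs."""
    cmd = ["java", "-jar", jar]
    for user_id, date in pairs:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(user_id, bool) or not isinstance(user_id, int):
            raise ValueError(f"user id must be an integer: {user_id!r}")
        # Raises ValueError if the date does not parse as year-month-day.
        datetime.strptime(date, "%Y-%m-%d")
        cmd.extend([str(user_id), date])
    cmd.append("-1")  # trailing argument copied from the usage line
    return cmd


# Usage (requires Java and the jar):
#   import subprocess
#   subprocess.run(build_akka_command([(22656, "2015-01-01")]), check=True)
```

Passing the command as a list rather than a single string avoids shell quoting issues.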
Test Akka-script

From the command line, run:

$ java -jar Akka-script-test.jar [n] [date]

where:

  • n is an integer representing the first n users in the dump
  • date is a string representing a date in the format YYYY-mm-dd

Web Service

API

Documentation is available here.

Requirements
  • Java 8
    • OpenCSV version 3.9
    • Jersey version 1.17.1
    • Jackson version 1.9.10
  • Tomcat 8+