This program generates an aggregate of the polls listed in English Wikipedia's article on historical rankings of US presidents. The aggregate ranks presidents by the ratio of favourable to total pairwise comparisons, excluding ties. That is, for each poll in which a president is ranked, each president ranked below him in that poll counts as a favourable comparison or "victory", and each president ranked above him counts as an unfavourable comparison or "defeat". Presidents are then ranked by their score, defined as victories/(victories+defeats).
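The scoring rule described above can be sketched in Java as follows. This is a minimal illustration, not the program's actual source: the class name PairwiseAggregate, the methods scores and ranking, and the representation of a poll as a map from name to rank are all assumptions made for the example. It also assumes that a president with no decisive comparisons at all is simply left out of the result.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the scoring rule; not taken from the program itself.
final class PairwiseAggregate {

    /**
     * Computes victories / (victories + defeats) for every president.
     * Each poll maps a president's name to his rank (1 = best); a president
     * absent from a poll's map is treated as not ranked in that poll.
     */
    static Map<String, Double> scores(List<Map<String, Integer>> polls) {
        Map<String, int[]> tally = new LinkedHashMap<>(); // name -> {victories, defeats}
        for (Map<String, Integer> poll : polls) {
            for (Map.Entry<String, Integer> a : poll.entrySet()) {
                int[] t = tally.computeIfAbsent(a.getKey(), k -> new int[2]);
                for (Map.Entry<String, Integer> b : poll.entrySet()) {
                    int cmp = Integer.compare(a.getValue(), b.getValue());
                    if (cmp < 0) {
                        t[0]++; // b is ranked below a: victory for a
                    } else if (cmp > 0) {
                        t[1]++; // b is ranked above a: defeat for a
                    }
                    // cmp == 0 is a tie (or a compared with himself) and is skipped
                }
            }
        }
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : tally.entrySet()) {
            int victories = e.getValue()[0];
            int defeats = e.getValue()[1];
            // Assumption: presidents without any decisive comparison are omitted.
            if (victories + defeats > 0) {
                result.put(e.getKey(), (double) victories / (victories + defeats));
            }
        }
        return result;
    }

    /** Names ordered from highest to lowest score. */
    static List<String> ranking(Map<String, Double> scores) {
        List<String> names = new ArrayList<>(scores.keySet());
        names.sort(Comparator.comparingDouble((String n) -> scores.get(n)).reversed());
        return names;
    }

    public static void main(String[] args) {
        Map<String, Integer> pollA = Map.of("Lincoln", 1, "Washington", 2, "Grant", 3);
        Map<String, Integer> pollB = Map.of("Washington", 1, "Lincoln", 1, "Grant", 2);
        Map<String, Double> s = scores(List.of(pollA, pollB));
        System.out.println(ranking(s)); // prints [Lincoln, Washington, Grant]
    }
}
```

In this made-up example Lincoln has 3 victories and 0 defeats (score 1), Washington has 2 victories and 1 defeat (score 2/3), and Grant has 0 victories and 4 defeats (score 0); the tie between Lincoln and Washington in the second poll counts for neither.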
The program also computes the correct quartile divisions for each individual poll as well as for the aggregate, and prints the lowest rank in each quartile (higher rank = higher quartile = lower number). The quartiles are defined by the 'median goes up' rule: the data is divided into top and bottom halves, which are in turn divided into the first two and last two quartiles, respectively, and in each split the median goes into the top half. Note: ties are not accounted for when computing the quartiles; only the total number of presidents in the poll is taken into account. For example, if a poll ranks 40 presidents, the first quartile should end at rank 10; if five presidents are tied at rank 10, the program will still report 10, so the first quartile has four presidents too many and the second has four too few. Ending the first quartile at rank 9 instead, and putting the tied presidents in the second quartile, leaves the first quartile with one too few and the second with one too many, which is a better result. Such adjustments are left to the user who notices a tie close to the end of a quartile.
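One way to read the 'median goes up' rule is that whenever a group with an odd number of entries is split, the upper part receives the extra entry. The Java sketch below follows that reading; the class name Quartiles and the method lowestRanks are hypothetical, and the code is an illustration under this assumption rather than the program's own implementation. Like the program, it ignores ties and uses only the number of presidents in the poll.

```java
import java.util.Arrays;

// Hypothetical sketch of the 'median goes up' rule; ties are ignored.
final class Quartiles {

    /** Returns the lowest (numerically largest) rank in quartiles 1 to 4 for n ranked presidents. */
    static int[] lowestRanks(int n) {
        int topHalf = (n + 1) / 2;                   // odd n: the median joins the top half
        int bottomHalf = n - topHalf;
        int first = (topHalf + 1) / 2;               // odd top half: its median joins quartile 1
        int third = topHalf + (bottomHalf + 1) / 2;  // odd bottom half: its median joins quartile 3
        return new int[] {first, topHalf, third, n};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lowestRanks(40))); // prints [10, 20, 30, 40]
        System.out.println(Arrays.toString(lowestRanks(43))); // prints [11, 22, 33, 43]
    }
}
```

With 43 presidents the top half holds 22 and the bottom half 21, so the quartiles contain 11, 11, 11 and 10 presidents, respectively.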
The aggregate is generated from the current table, saved as a .csv file. To import the current table from Wikipedia, open Google Sheets, enter the following formula into a cell, and then download the sheet as .csv:
=importHTML("https://en.wikipedia.org/wiki/Historical_rankings_of_presidents_of_the_United_States";"table";1)
To run this program, download the .jar from the 'releases' tab of the project's GitHub page. Running .jar files requires Java to be installed. Open a console (Command Prompt on Windows), go to the folder where the .jar is located, and enter
java -jar JARNAME.jar arg1 arg2 arg3
where JARNAME is the name of the .jar file. Any number of arguments (arg1, arg2, ...), including none, may be given.
Usage:
- If the first argument is --help, the program prints a short help text describing usage, then exits.
- If the first argument is --doc, the program creates javadoc files in the folder JARNAME_doc (where, again, JARNAME is the name of the .jar being run) within the .jar's folder, opens the main class's documentation file in the default program for .html files, then exits.
- Otherwise, the first argument is taken as the path to the .csv file. If no arguments are given, the path US-president-rankings-table.csv is used by default, and if the table is not found at this location, the user is prompted for the path.
- The second argument should be y if the table already has an aggregate, and anything else otherwise; it is only used if no aggregate is found in the table. If it is y, the program prints an error message and exits; otherwise the program proceeds. If there is no second argument, the user is prompted for input as needed.
The table is assumed to contain one president per row, except for the first and last rows, and one poll per column, except for the first three columns and, if the table has an aggregate, the last column. The first row should be a header, and the last row should give the total number of presidents ranked in each poll. The first column should contain president numbers, the second president names, and the third party affiliations; if an aggregate is present, it should be in the last column. The program checks the following criteria, and prints an error message and exits unless all of them hold:
- All rows have equal length.
- First row's first three entries are or start with "No.", "President", "Political party".
- First row's last entry is or starts with "Aggr." or user specifies that the table has no aggregate.
- Last row's second entry is or starts with "Total in survey".
- There is some string X such that every entry outside the first and last rows, the first three columns, and (if the table has an aggregate) the last column is either identical to X (indicating 'not ranked') or an integer optionally followed by " (tie)", " *", or both.
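The last criterion can be illustrated with a regular expression. The Java sketch below is a guess at how such a check might look, not the program's actual code: the class RankCells and its methods parseRank and entriesAreValid are hypothetical names, and it assumes that when both suffixes are present they may appear in either order. It also limits the integer to nine digits so that parsing cannot overflow.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the poll-entry check; not taken from the program itself.
final class RankCells {

    // An integer (at most 9 digits, so Integer.parseInt cannot overflow),
    // optionally followed by " (tie)", " *", or both in either order (assumption).
    private static final Pattern RANK =
            Pattern.compile("(\\d{1,9})(?: \\(tie\\)(?: \\*)?| \\*(?: \\(tie\\))?)?");

    /** Returns the rank in the cell, or empty if the cell is not a rank entry. */
    static OptionalInt parseRank(String cell) {
        Matcher m = RANK.matcher(cell);
        return m.matches() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    /** True if every non-rank cell is identical to one and the same string X. */
    static boolean entriesAreValid(List<String> cells) {
        String notRanked = null;
        for (String cell : cells) {
            if (parseRank(cell).isPresent()) {
                continue;
            }
            if (notRanked == null) {
                notRanked = cell;           // the first non-rank entry defines X
            } else if (!notRanked.equals(cell)) {
                return false;               // a second, different non-rank string
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(entriesAreValid(List.of("1", "12 (tie)", "7 *", "-", "-"))); // prints true
        System.out.println(entriesAreValid(List.of("1", "-", "n/a")));                  // prints false
    }
}
```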
The program also checks that the 'Total in survey' numbers are correct and prints corrections for any that are wrong, but does not exit in that case.
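The 'Total in survey' check amounts to counting the ranked entries in each poll column and comparing the count with the stated total. A minimal Java sketch follows; the class TotalsCheck, the method corrections, and the array layout (null meaning 'not ranked') are assumptions for illustration, not the program's real data structures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the 'Total in survey' check; not taken from the program itself.
final class TotalsCheck {

    /**
     * ranks[p][c] is president p's rank in poll c, or null if he is not ranked there.
     * Returns poll index -> counted total for every poll whose stated total is wrong.
     */
    static Map<Integer, Integer> corrections(Integer[][] ranks, int[] statedTotals) {
        Map<Integer, Integer> wrong = new LinkedHashMap<>();
        for (int c = 0; c < statedTotals.length; c++) {
            int counted = 0;
            for (Integer[] row : ranks) {
                if (row[c] != null) {
                    counted++; // this president is ranked in poll c
                }
            }
            if (counted != statedTotals[c]) {
                wrong.put(c, counted);
            }
        }
        return wrong;
    }

    public static void main(String[] args) {
        Integer[][] ranks = {{1, 1}, {2, null}, {3, 2}};
        int[] stated = {3, 3}; // the second poll really ranks only two presidents
        System.out.println(corrections(ranks, stated)); // prints {1=2}
    }
}
```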
Dependencies (packaged into the .jar): opencsv 4.0, commons-beanutils 1.9.3, commons-lang3 3.6, commons-text 1.1, commons-collections 3.2.2, commons-logging 1.2