Note:  This is not a runnable notebook.  You will need to copy the code to a file on your local machine.

## A script for scraping the Kaggle website for additional submission stats

Here's an example run:

```
$ ruby agent_stats.rb 14958858

[...lots of log messages ignored...]

Stats for Submission 14958858
Last 152 Games

As Player 1
 Record: 67-2-1
 Best Win: 1411
 Worst Loss: 1294
 Ties: 1302
 Your errors: 0
 Opponent errors: 0
 Suspect games: 0

As Player 2
 Record: 35-15-9
 Best Win: 1416
 Worst Loss: 1133
 Ties: 1174, 1430, 1219, 1203, 1300, 1276, 1302, 1075, 1061
 Your errors: 2
 Opponent errors: 0
 Suspect games: 5

Error for both players: 15
Games not processed: 

```

The following Ruby script uses the Watir library to scrape the Kaggle website.  (This script worked on the website as of March 31.)

```ruby
#!/usr/bin/ruby

require 'json'
require 'watir'

HISTORY_FILE="#{__dir__}/_historical_games.json"

def process_player_string(pstring)
  error = false
  player_score = "Unknown"
  md = pstring.match(/(.*)\((.*)\)/)
  score_change = md[2]
  if score_change == "Error" or score_change == "Validation"
    if score_change == "Error"
      error = true
      score_change = 0  # is this true?
    else
      score_change = 0
      player_score = 600
    end
    player_name = md[1].strip()
  else
    player_name, _, player_score = md[1].strip().rpartition(" ")
  end
  return [error, player_name, player_score, score_change]
end

historical_games = {}
if File.file?(HISTORY_FILE)
  s = File.read(HISTORY_FILE)
  historical_games = JSON.parse(s)
end

games = []
opp_strings = []
my_strings = []
submission_id = ARGV[0]
submission = "https://www.kaggle.com/c/connectx/submissions?dialog=episodes-submission-#{submission_id}"
browser = Watir::Browser.new
browser.goto(submission)

sleep(2)
elem = browser.element(:xpath => "//*[@id=\"site-content\"]/div[3]/div[1]/div/div[1]/ul")
# Scrolls 20 times to get more games
20.times do
  elem.scroll.to :bottom
  sleep(0.1)
end
browser.links(:href => /episodes-episode/).each_with_index do |link, i|
  my_strings << browser.element(:xpath => "//*[@id=\"site-content\"]/div[3]/div[1]/div/div[1]/ul/li[#{i*2 + 1}]/a/span/div/span/span[1]").text()
  opp_strings << browser.element(:xpath => "//*[@id=\"site-content\"]/div[3]/div[1]/div/div[1]/ul/li[#{i*2 + 1}]/a/span/div/span/span[2]").text()
  games << link.href.gsub(/[^\d]/, '')
end


not_retrieved = []
system_error_count = 0
p1 = { "wins"=> [], "losses"=> [], "ties"=> [], "my_error_count"=> 0, "opp_error_count"=> 0, "suspect_count"=> 0 }
p2 = { "wins"=> [], "losses"=> [], "ties"=> [], "my_error_count"=> 0, "opp_error_count"=> 0, "suspect_count"=> 0 }

games.zip(my_strings, opp_strings) { |game, my_string, opp_string|
  me = process_player_string(my_string)
  opp = process_player_string(opp_string)
  if (me[0] and opp[0]) # error reported for both players
    system_error_count += 1
    next
  end

  finished = false
  retries = 0
  if historical_games.key?(game)
    p1_id, p2_id, p1_reward, p2_reward, suspect = historical_games[game]
    finished = true
  end
  while not finished and retries < 3
    if (retries > 0)
      print("Retrying game #{game}\n")
    end
    begin
      sleep(2)
      game_url = "https://www.kaggle.com/c/connectx/submissions?dialog=episodes-episode-#{game}"
      print("Game URL #{game_url}\n")
      browser.goto(game_url)
      iframe = browser.iframe(:xpath => "//*[@id=\"site-content\"]/div[3]/div[1]/div/div/div[1]/iframe").wait_until(&:present?)
      player1 = iframe.element(:xpath => "/html/body/div/div[2]/ul/li[1]")
      player2 = iframe.element(:xpath => "/html/body/div/div[2]/ul/li[2]")

      p1_id = player1.title().match("id: (.*)")[1]
      p2_id = player2.title().match("id: (.*)")[1]

      sleep(2)
      browser.goto("https://www.kaggle.com/competitions/episodes/#{game}.json")
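      unless browser.text.start_with?("{")
        # Not JSON: count it as a failed attempt, otherwise `next` would
        # restart the while loop forever without ever increasing retries.
        retries += 1
        next
      end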
      gamelog = JSON.parse(browser.text)

      num_steps = gamelog["steps"].length() - 1
      last_board = gamelog["steps"][-1][0]["observation"]["board"]
      num_pieces = 42 - last_board.count(0)
      p1_reward = gamelog["steps"][-1][0]["reward"]
      p2_reward = gamelog["steps"][-1][1]["reward"]

      suspect = (num_steps != num_pieces)

      finished = true
      historical_games[game] = [p1_id, p2_id, p1_reward, p2_reward, suspect]
    rescue Watir::Exception::Error, JSON::ParserError => e
      retries += 1
      puts "Could not process game #{game}", e
    end
  end

  if finished
    if p1_id == submission_id and p2_id == submission_id
      # ignore validation episode
      next
    end
    if p1_id == submission_id
#      puts "You are player 1"
      my_stats = p1
      my_reward = p1_reward
    else
#      puts "You are player 2"
      my_stats = p2
      my_reward = p2_reward
    end

    # [error, player_name, player_score, score_change]
    if me[0]
      my_stats["my_error_count"] += 1
    elsif opp[0]
      my_stats["opp_error_count"] += 1
    elsif suspect
      my_stats["suspect_count"] += 1
    else
      if my_reward == 1.0
        my_stats["wins"] << opp[2]
      elsif my_reward == 0.0
        my_stats["losses"] << opp[2]
      else
        my_stats["ties"] << opp[2]
      end
    end
  else
    not_retrieved << game
  end
}
print "Player 1 #{p1}\n"
print "Player 2 #{p2}\n"

print "\nStats for Submission #{submission_id}\n"
print "Last #{games.length()} Games\n"
[p1, p2].each_with_index { |stats, i|
  print "\nAs Player #{i+1}\n"
  print " Record: #{stats["wins"].length()}-#{stats["losses"].length()}-#{stats["ties"].length()}\n"
  print " Best Win: #{stats["wins"].sort_by(&:to_i)[-1]}\n"
  print " Worst Loss: #{stats["losses"].sort_by(&:to_i)[0]}\n"
  print " Ties: #{stats["ties"].join(', ')}\n"
  print " Your errors: #{stats["my_error_count"]}\n"
  print " Opponent errors: #{stats["opp_error_count"]}\n"
  print " Suspect games: #{stats["suspect_count"]}\n"
}

print "\nError for both players: #{system_error_count}\n"
print "Games not processed: #{not_retrieved.join(', ')}\n\n"

File.open(HISTORY_FILE, "w") do |f|
  f.write(historical_games.to_json)
end


```
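
To make the parsing step concrete, here is a minimal sketch of what `process_player_string` returns for three sample inputs.  The sample strings are assumptions inferred from the regular expression in the script (a name, a score, then a parenthesized score change); the text on the actual Kaggle page may look different.

```ruby
# Copy of process_player_string from the script above, so this runs standalone.
def process_player_string(pstring)
  error = false
  player_score = "Unknown"
  md = pstring.match(/(.*)\((.*)\)/)
  score_change = md[2]
  if score_change == "Error" or score_change == "Validation"
    if score_change == "Error"
      error = true
      score_change = 0
    else
      score_change = 0
      player_score = 600
    end
    player_name = md[1].strip()
  else
    # Everything before the last space is the name, the rest is the score
    player_name, _, player_score = md[1].strip().rpartition(" ")
  end
  return [error, player_name, player_score, score_change]
end

# Assumed sample inputs, not copied from the real page.
p process_player_string("Some Player 1302 (+4)")
# prints [false, "Some Player", "1302", "+4"]
p process_player_string("Some Player (Error)")
# prints [true, "Some Player", "Unknown", 0]
p process_player_string("Some Player (Validation)")
# prints [false, "Some Player", 600, 0]
```

Note that a regular score comes back as a string, which is why the summary sorts with `sort_by(&:to_i)`.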


Notes:
* The "Record" counts only games that were played to completion.
* Error games and suspect games are not included in the "Record".
* The "Record" is in the format W-L-T.
* "Ties" lists the scores of all of the opponents that you have tied.
* Games are counted as "Suspect" if the number of pieces on the board at the end of the game does not match the number of moves played.
* A history file is stored in the same directory as the script so that subsequent runs won't have to download the same games again.
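
The "Suspect" check can be tried without a browser.  The sketch below applies the same comparison the script makes to two tiny hand-built episode logs.  The helper names `suspect_game?` and `step_with` are hypothetical, and the logs are made-up data that only mimic the fields the script reads (`steps`, `observation`, `board`), not real Kaggle episodes.

```ruby
BOARD_CELLS = 42  # same constant the script uses for the ConnectX board

# Hypothetical helper: same comparison as in the script.
def suspect_game?(gamelog)
  # The script subtracts one from the step count before comparing
  num_steps = gamelog["steps"].length - 1
  last_board = gamelog["steps"][-1][0]["observation"]["board"]
  num_pieces = BOARD_CELLS - last_board.count(0)
  num_steps != num_pieces
end

# Hypothetical helper: builds one step whose board holds `pieces` pieces.
def step_with(pieces)
  board = Array.new(BOARD_CELLS, 0)
  pieces.times { |i| board[i] = (i % 2) + 1 }
  [{ "observation" => { "board" => board } }, {}]
end

# Two moves played, two pieces on the final board
clean = { "steps" => [step_with(0), step_with(1), step_with(2)] }
# Two moves recorded, but only one piece on the final board
short = { "steps" => [step_with(0), step_with(1), step_with(1)] }

puts suspect_game?(clean)  # prints false
puts suspect_game?(short)  # prints true
```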

Issues:
* It is slow.  Some of the sleep() calls seemed necessary; I added the others so as not to hammer the Kaggle servers.
* To change the name or location of the history file, you need to edit the HISTORY_FILE constant in the script.
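
One possible way around the last issue is to read the path from an environment variable and fall back to the current default.  This is only a sketch: `AGENT_STATS_HISTORY` and `history_file` are hypothetical names, and the script as posted does not read any environment variable.

```ruby
# Hypothetical helper: returns the history file path.
# `env` defaults to the real environment; a plain Hash can be passed for testing.
def history_file(env = ENV)
  env.fetch("AGENT_STATS_HISTORY") do
    # Same default as the script: next to the script itself.
    # __dir__ is nil when the code is run with `ruby -e`, hence the fallback.
    File.join(__dir__ || ".", "_historical_games.json")
  end
end

puts history_file
puts history_file({ "AGENT_STATS_HISTORY" => "/tmp/connectx_history.json" })
```

With something like this in place, the script could set `HISTORY_FILE = history_file` and be run as `AGENT_STATS_HISTORY=/tmp/connectx_history.json ruby agent_stats.rb 14958858`.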
