### Game analysis generator

This notebook works like `game_analysis.ipynb` but analyzes several SGFs at once. It has the same dependencies.

Before proceeding, create the directory `./output/<player_id>/sgf_for_analysis/`.
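
The directory can also be created from Python instead of by hand. This is a minimal sketch; the helper name `ensure_sgf_dir` and its `base` parameter are illustrative and not part of the notebook.

```python
import os

def ensure_sgf_dir(player_id, base='./output'):
    """Create <base>/<player_id>/sgf_for_analysis/ if it is missing and return its path."""
    target = os.path.join(base, player_id, 'sgf_for_analysis')
    # exist_ok=True makes the call safe to repeat on later runs
    os.makedirs(target, exist_ok=True)
    return target
```

Calling `ensure_sgf_dir(player_id)` once before copying the SGFs in should be enough.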

In [None]:
import os
import json
import io
import pandas as pd

Technically, analyze-sgf can also pull SGFs from the web. Maybe we'll make this more interactive later.

Place the SGFs in a separate folder and set the `path` variable accordingly.

In [None]:
# Replace with player id
player_id = '<player_id>'
path = './output/'+player_id+'/sgf_for_analysis/'

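# Scan path directory for sgfs. Make sure you did the emptying step!
# Only names ending in .sgf are kept, so leftover folders or other files are not passed to analyze-sgf.
sgf_list = [x for x in os.listdir(path) if x.lower().endswith('.sgf')]

# Space-separated list of SGF paths for the shell command
sgf_string = ' '.join(path+x for x in sgf_list)
print(sgf_string)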

On Windows systems, the analyze-sgf shell command is `analyze-sgf.cmd`. On Unix systems, it is `analyze-sgf`. Modify the next cell accordingly.

This cell performs a full KataGo analysis of each game file, so it will take quite some time to complete if you set a large value of `maxVisits` in `.analyze-sgf.yml` or analyze many games at once.

For reference, with an NVIDIA GeForce GTX 1650 my machine can typically analyze a full game (completed to scoring) in about 30 minutes with 2000 visits per root.
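
For planning a batch, a rough estimate can be sketched as below. It assumes every game costs about the same and that total time grows linearly with the number of games. The 30-minute default is the figure quoted above for one machine, so adjust it for your hardware and `maxVisits`. The function name `estimate_minutes` is hypothetical.

```python
def estimate_minutes(n_games, minutes_per_game=30):
    """Rough total analysis time, assuming each game takes about the same time."""
    return n_games * minutes_per_game

print(estimate_minutes(8))  # 8 games at ~30 minutes each -> 240
```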

In [None]:
!analyze-sgf.cmd -s {sgf_string}
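
The Windows/Unix choice described above could be made automatically rather than by editing the cell. The sketch below only builds the argument list and does not run anything. The helper name `build_analyze_command` is an assumption, while the `-s` flag is the one used in the cell above.

```python
import os

def build_analyze_command(sgf_paths, os_name=os.name):
    """Return the analyze-sgf invocation as a list of arguments."""
    # os.name is 'nt' on Windows; everything else is treated as Unix-like here
    executable = 'analyze-sgf.cmd' if os_name == 'nt' else 'analyze-sgf'
    return [executable, '-s', *sgf_paths]
```

Joining the result with spaces, for example `' '.join(build_analyze_command(paths))`, gives a string that can be substituted into a `!` line the same way `sgf_string` is.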

In [None]:
# Output organization and formatting

# Create the output folders if they do not exist yet
for folder in ('analyzed_sgf', 'analyzed_json', 'analyzed_json_clean'):
    os.makedirs(path+folder, exist_ok=True)

drop_columns = ['id','isDuringSearch', 'rootInfo',
                'rawStScoreError', 'rawStWrError', 'rawVarTimeLeft',
               'symHash', 'thisHash']
for gameName in sgf_list:
    # File name without the '.sgf' extension
    base = gameName[:-4]
    sgf_out = path+'analyzed_sgf/'+base+'-analyzed.sgf'
    json_out = path+'analyzed_json/'+base+'.json'

    # Move the analyze-sgf output files into their folders
    os.replace(path+base+'-analyzed.sgf', sgf_out)
    os.replace(path+base+'.json', json_out)

    # Skip the first line, then parse each remaining line as one JSON object
    with io.open(json_out, mode='r', encoding='utf-8') as f:
        lines = f.readlines()[1:]
    data = [json.loads(x) for x in lines]
    # Rewrite the file as a single indented JSON array
    with io.open(json_out, mode='w', encoding='utf-8') as f:
        f.write(json.dumps(data,indent=2))

    # Creating clean version of JSON analysis
    df = pd.read_json(json_out)
    rootInfo = pd.DataFrame([x for x in df['rootInfo']])
    df = pd.concat([df, rootInfo], axis=1)
    df.set_index('turnNumber', inplace=True)
    df.drop(drop_columns, axis=1, inplace=True)
    df.sort_index(inplace=True)
    df.to_json(path+'analyzed_json_clean/'+base+'_clean.json')
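
The line-by-line JSON conversion in the cell above can be tried on in-memory data without running KataGo. This is only a sketch: the sample lines are made up, and the helper name `jsonl_to_records` is hypothetical. Only the field names `turnNumber` and `rootInfo` come from the cell above.

```python
import io
import json

def jsonl_to_records(stream, skip_first=True):
    """Parse a stream holding one JSON object per line into a list of dicts."""
    lines = stream.readlines()
    if skip_first:
        # Mirror the cell above, which drops the first line of the file
        lines = lines[1:]
    # Ignore blank lines so a trailing newline does not break json.loads
    return [json.loads(line) for line in lines if line.strip()]

# Made-up sample: one leading line followed by two per-turn records
sample = io.StringIO(
    '{"header": true}\n'
    '{"turnNumber": 1, "rootInfo": {"winrate": 0.48}}\n'
    '{"turnNumber": 0, "rootInfo": {"winrate": 0.47}}\n'
)
records = jsonl_to_records(sample)
print([r['turnNumber'] for r in records])  # [1, 0]
```

The records come back in file order, which is why the cell above sorts by `turnNumber` after loading them into a DataFrame.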