### Game analysis generator

This notebook uses the [analyze-sgf](https://github.com/9beach/analyze-sgf) tool, which employs [KataGo's Analysis Engine](https://github.com/lightvector/KataGo/blob/master/docs/Analysis_Engine.md), to generate and organize KataGo analyses of full-game SGF files. For this project we assume the input SGFs are located under the `./output` directory, but the same steps work for any SGF file.

Output:
* SGF file containing the analysis in `./output/<player_id>/analyzed_sgf`
* JSON file containing the analysis in `./output/<player_id>/analyzed_json`
* Cleaned, pandas-ready JSON analysis file in `./output/<player_id>/analyzed_json_clean`
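
To make the difference between the raw and the cleaned JSON concrete, here is a minimal sketch of the flattening step using only the standard library. The sample record and its values are made up for illustration, the key names are the ones this notebook handles in its final cell, and `flatten_record` is a hypothetical helper, not part of analyze-sgf or KataGo.

```python
# Hypothetical helper illustrating the "clean" step; not part of analyze-sgf.
DROP_KEYS = {'id', 'isDuringSearch', 'rootInfo',
             'rawStScoreError', 'rawStWrError', 'rawVarTimeLeft',
             'symHash', 'thisHash'}

def flatten_record(record):
    """Promote the nested rootInfo fields to the top level and drop bookkeeping keys."""
    flat = dict(record)
    flat.update(record.get('rootInfo', {}))
    return {k: v for k, v in flat.items() if k not in DROP_KEYS}

# Made-up sample record; real analysis records contain many more fields.
sample = {
    'id': 'example',
    'isDuringSearch': False,
    'turnNumber': 12,
    'rootInfo': {'winrate': 0.53, 'scoreLead': 1.2,
                 'symHash': 'ABC', 'thisHash': 'DEF'},
}

clean = flatten_record(sample)
print(clean)
```

The notebook itself performs this step with pandas; the sketch only shows which fields survive the cleaning.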

Maybe later:
* Append outputs to player game datasets?

### Dependencies:

This notebook requires the following to be installed on your machine:
* [KataGo](https://github.com/lightvector/KataGo)
* [analyze-sgf](https://github.com/9beach/analyze-sgf)

In addition, you should modify the `.analyze-sgf.yml` file associated with your analyze-sgf installation to reflect your KataGo paths.
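
Before running the analysis, it can help to check that the tools are reachable. The sketch below assumes the executables are named `katago` and `analyze-sgf` and are on your `PATH`; neither is guaranteed, since KataGo may live at whatever location you configured in `.analyze-sgf.yml`. `find_tools` is a hypothetical helper name.

```python
import shutil

def find_tools(names=('katago', 'analyze-sgf')):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}

# Executable names are assumptions; adjust them to match your installation.
for name, location in find_tools().items():
    print(f"{name}: {location or 'not found on PATH'}")
```

A missing `katago` entry is not necessarily a problem if `.analyze-sgf.yml` points to the KataGo binary by its full path.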

In [None]:
import os
import json
import io
import pandas as pd

Technically analyze-sgf can pull SGFs from the web as well. Maybe we'll make this more interactive later.

In [None]:
# Replace with player id
player_id = 'PLAYER_ID'
# Replace with the SGF filename, without the '.sgf' extension
sgf_filename = 'FILENAME'
path = './output/'+player_id

On Windows systems, the analyze-sgf shell command is `analyze-sgf.cmd`. On Unix systems, it is `analyze-sgf`. Modify the next cell accordingly.

This cell performs a full KataGo analysis of the game file, so it will take quite some time to complete if you set a large value of `maxVisits` in `.analyze-sgf.yml`.
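
As an alternative to editing the command by hand, the platform check can be sketched in Python. This is a minimal sketch: `analyze_sgf_command` is a hypothetical helper, and it assumes that `os.name == 'nt'` is a sufficient test for Windows.

```python
import os

def analyze_sgf_command(os_name=os.name):
    """Return the analyze-sgf executable name for the given os.name value."""
    return 'analyze-sgf.cmd' if os_name == 'nt' else 'analyze-sgf'

cmd = analyze_sgf_command()
print(cmd)
```

In a notebook, the result can be interpolated into the shell line using IPython's `{}` syntax, for example `!{cmd} -s {path+'/ogssgf/'+sgf_filename+'.sgf'}`.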

In [None]:
!analyze-sgf.cmd -s {path+'/ogssgf/'+sgf_filename+'.sgf'}

In [None]:
# Output organization and formatting

# Create the output subdirectories if they do not exist yet
for subdir in ('analyzed_sgf', 'analyzed_json', 'analyzed_json_clean'):
    os.makedirs(os.path.join(path, subdir), exist_ok=True)

os.replace(path+'/ogssgf/'+sgf_filename+'-analyzed.sgf', path+'/analyzed_sgf/'+sgf_filename+'-analyzed.sgf')
os.replace(path+'/ogssgf/'+sgf_filename+'.json', path+'/analyzed_json/'+sgf_filename+'.json')

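# The raw file is read as one JSON object per line, skipping the first line;
# the remaining records are rewritten as a single, indented JSON array.
with io.open(path+'/analyzed_json/'+sgf_filename+'.json', mode='r', encoding='utf-8') as f:
    lines = f.readlines()[1:]
data = [json.loads(x) for x in lines]
with io.open(path+'/analyzed_json/'+sgf_filename+'.json', mode='w', encoding='utf-8') as f:
    f.write(json.dumps(data, indent=2))
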
# Creating clean version of JSON analysis
df = pd.read_json(path+'/analyzed_json/'+sgf_filename+'.json')
# Expand the nested rootInfo dicts into their own columns
rootInfo = pd.DataFrame(df['rootInfo'].tolist())
df = pd.concat([df, rootInfo], axis=1)
# Columns that are not needed for downstream analysis
drop_columns = ['id', 'isDuringSearch', 'rootInfo',
                'rawStScoreError', 'rawStWrError', 'rawVarTimeLeft',
                'symHash', 'thisHash']
df = df.set_index('turnNumber').drop(columns=drop_columns)
df.to_json(path+'/analyzed_json_clean/'+sgf_filename+'_clean.json')