# Pipeline 2 - Stockfish engine evaluation

## Inputs
- The extracted files from previous pipeline 1

## Output
- A csv file

## CSV file columns
For each row of the csv file, the output contains:
- **Basic game info**
    - WhiteUsername
    - BlackUsername
    - WhiteElo
    - BlackElo
    - EloDifference
    - TimeControl
    - Opening
    - GameId
- **Moving info**
    - MoveId
    - RemainingTime
    - MovePlayed
    - MovePlayedEval
    - ProcessTime
    - NumberofNodes
- **Algorithm prediction** ($n$ from $1$ to $5$)
    - best_move_$n$,
    - best_score_$n$,
    - ProcessTime_$n$,
    - NumberofNodes_$n$,

WhiteUsername,BlackUsername,WhiteElo,BlackElo,EloDifference,TimeControl,Opening,GameId,
MoveId,RemainingTime,MovePlayed,MovePlayedEval,ProcessTime,NumberofNodes,

best_move_1,best_score_1,ProcessTime_1,NumberofNodes_1,
best_move_2,best_score_2,ProcessTime_2,NumberofNodes_2,
best_move_3,best_score_3,ProcessTime_3,NumberofNodes_3,
best_move_4,best_score_4,ProcessTime_4,NumberofNodes_4,
best_move_5,best_score_5,ProcessTime_5,NumberofNodes_5

---

## Import APIs

In [1]:
# import julia libraries
using PyCall
using CSV
using DataFrames
using Plots

# import python chess library
@pyimport chess
cp = pyimport("chess.pgn")
ce = pyimport("chess.engine")

PyObject <module 'chess.engine' from '/home/ubuntu/.local/lib/python3.10/site-packages/chess/engine.py'>

## Path

In [2]:
# Constants
const STOCKFISH_PATH = "/usr/local/bin/stockfish"

const DATA_PATH = "./data/"
const INPUT_DATA_PATH = "$(DATA_PATH)pipeline1_exported/" # file type: .pgn and .csv file
const OUTPUT_DATA_PATH = "$(DATA_PATH)pipeline2_exported/" # file type: .csv file

const TEMPLATE_FILE_PATH = "$(DATA_PATH)expected_output_template.csv"

# Variables
INPUT_FILENAME = ".pgn"
OUTPUT_FILENAME = ".csv"

".csv"

In [3]:
# Test file paths
const TEST_INPUT_PATH = "./pre-pipeline-test/"
const TEST_OUTPUT_PATH = "./"

const TEST_INPUT_FILENAME = "$(TEST_INPUT_PATH)test100.pgn"
const TEST_OUTPUT_FILENAME = "$(TEST_OUTPUT_PATH)test_output.csv"

"./test_output.csv"

## Methods

In [4]:
function initialize_stockfish(STOCKFISH_PATH)
    stockfish = ce.SimpleEngine.popen_uci(STOCKFISH_PATH)

    return stockfish
end
stockfish = initialize_stockfish(STOCKFISH_PATH)

PyObject <SimpleEngine (pid=182980)>

In [29]:
# not using
get(stockfish.options, "Hash")


# get(stockfish.options, "Threads")

LoadError: MethodError: no method matching get(::PyObject)

[0mClosest candidates are:
[0m  get(::PyObject, [91m::Union{Tuple{Vararg{Type, N}}, Type} where N[39m, [91m::Any[39m)
[0m[90m   @[39m [36mPyCall[39m [90m~/.julia/packages/PyCall/KLzIO/src/[39m[90m[4mPyCall.jl:782[24m[39m
[0m  get(::PyObject, [91m::Union{Tuple{Vararg{Type, N}}, Type} where N[39m, [91m::Any[39m, [91m::Any[39m)
[0m[90m   @[39m [36mPyCall[39m [90m~/.julia/packages/PyCall/KLzIO/src/[39m[90m[4mPyCall.jl:772[24m[39m
[0m  get(::PyObject, [91m::Any[39m)
[0m[90m   @[39m [36mPyCall[39m [90m~/.julia/packages/PyCall/KLzIO/src/[39m[90m[4mPyCall.jl:787[24m[39m
[0m  ...


# empty hash table by using `ucinewgame`

In [5]:
pgn_file = open(TEST_INPUT_FILENAME)
game = cp.read_game(pgn_file)

PyObject <Game at 0x7fa2b3fc4cd0 ('Manorainjan' vs. 'Harshin', '2023.09.01' at 'https://lichess.org/OsBizORo')>

In [6]:
function basic_game_info(game)
    headers = game.headers

    WhiteUsername = get(headers, "White", "NA")
    BlackUsername = get(headers, "Black", "NA")
    WhiteElo = parse(Int64, get(headers, "WhiteElo", "0"))
    BlackElo = parse(Int64, get(headers, "BlackElo", "0"))
    EloDifference = WhiteElo - BlackElo
    TimeControl = get(headers, "TimeControl", "NA")
    Opening = get(headers, "Opening", "NA")
    GameId = split(get(headers, "Site", "NA"), "/")[end]

    basic_game_info = DataFrame(WhiteUsername = [WhiteUsername],
                   BlackUsername = [BlackUsername],
                   WhiteElo = [WhiteElo],
                   BlackElo = [BlackElo],
                   EloDifference = [EloDifference],
                   TimeControl = [TimeControl],
                   Opening = [Opening],
                   GameId = [GameId])

    return basic_game_info
end

basic_game_info (generic function with 1 method)

In [7]:
basic_game_info(game)

Row,WhiteUsername,BlackUsername,WhiteElo,BlackElo,EloDifference,TimeControl,Opening,GameId
Unnamed: 0_level_1,String,String,Int64,Int64,Int64,String,String,SubStrin…
1,Manorainjan,Harshin,2283,2161,122,1800+0,Indian Defense,OsBizORo


- **Moving info**
    - MoveId
    - RemainingTime
    - MovePlayed
    - MovePlayedEval
    - ProcessTime
    - NumberofNodes
- **Algorithm prediction** ($n$ from $1$ to $5$)
    - best_move_$n$,
    - best_score_$n$,
    - ProcessTime_$n$,
    - NumberofNodes_$n$,

--

- Clock: `[%clk ...]`
- Eval: `[%eval ...]`

In [24]:
function prediction(game, RemainingTime, MovePlayed, depth, multipv)
    predictions = DataFrame(RemainingTime = Float64[], MovePlayed = String[], MovePlayedEval = String[],
                            ProcessTime = Int64[], NumberofNodes = Int64[], best_move_1 = String[], best_score_1 = String[],
                            ProcessTime_1 = Int64[], NumberofNodes_1 = Int64[], best_move_2 = String[], best_score_2 = String[],
                            ProcessTime_2 = Int64[], NumberofNodes_2 = Int64[], best_move_3 = String[], best_score_3 = String[],
                            ProcessTime_3 = Int64[], NumberofNodes_3 = Int64[], best_move_4 = String[], best_score_4 = String[],
                            ProcessTime_4 = Int64[], NumberofNodes_4 = Int64[], best_move_5 = String[], best_score_5 = String[],
                            ProcessTime_5 = Int64[], NumberofNodes_5 = Int64[])

    board = game.board()
    analysis_results = stockfish.analyse(board, ce.Limit(depth=depth), multipv=multipv)

    # Initialize placeholders for each multipv best move and score
    best_moves = fill(nothing, multipv)
    best_scores = fill("", multipv)
    process_times = fill(nothing, multipv)
    number_of_nodes = fill(nothing, multipv)

    for (index, result) in enumerate(analysis_results)
        best_move = result["pv"][1].uci()
        score = result["score"].white()
        best_score_str = score.is_mate() ? "Mate in $(score.mate())" : "Cp $(score.score())"
        process_time = get(result, "time", 0)
        nodes = get(result, "nodes", 0)

        # best_moves[index] = best_move
        # best_scores[index] = best_score_str
        # process_times[index] = process_time
        # number_of_nodes[index] = nodes
        print(best_moves)
        print(best_scores)
        print(process_times)
        print(number_of_nodes)
    end

    # Fill the DataFrame with the obtained data
    row = DataFrame(RemainingTime = [RemainingTime], MovePlayed = [MovePlayed], MovePlayedEval = [nothing],
                    ProcessTime = [process_times[1]], NumberofNodes = [number_of_nodes[1]], 
                    best_move_1 = [best_moves[1]], best_score_1 = [best_scores[1]],
                    ProcessTime_1 = [process_times[1]], NumberofNodes_1 = [number_of_nodes[1]], 
                    best_move_2 = [best_moves[2]], best_score_2 = [best_scores[2]],
                    ProcessTime_2 = [process_times[2]], NumberofNodes_2 = [number_of_nodes[2]], 
                    best_move_3 = [best_moves[3]], best_score_3 = [best_scores[3]],
                    ProcessTime_3 = [process_times[3]], NumberofNodes_3 = [number_of_nodes[3]], 
                    best_move_4 = [best_moves[4]], best_score_4 = [best_scores[4]],
                    ProcessTime_4 = [process_times[4]], NumberofNodes_4 = [number_of_nodes[4]], 
                    best_move_5 = [best_moves[5]], best_score_5 = [best_scores[5]],
                    ProcessTime_5 = [process_times[5]], NumberofNodes_5 = [number_of_nodes[5]])

    append!(predictions, row)

    return predictions
end


prediction (generic function with 1 method)

In [25]:
function moving_info(game)
    MoveId = 0
    RemainingTime = 60.0 # Placeholder, adjust according to actual remaining time logic
    MovePlayed = "e2e4" # Placeholder, use actual move played
    depth = 20
    multipv = 5

    predictions = prediction(game, RemainingTime, MovePlayed, depth, multipv)
    
    # Update this part to properly fetch MovePlayedScore and MovePlayedMate
    MovePlayedScore = 0 # Placeholder, should be updated based on actual game move score
    MovePlayedMate = "" # Placeholder, should be empty if score is used

    ProcessTime = 0.0 # Placeholder, calculate actual processing time
    NumberofNodes = 0 # Placeholder, calculate actual number of nodes
    
    moving = DataFrame(
        MoveId = [MoveId],
        RemainingTime = [RemainingTime],
        MovePlayed = [MovePlayed],
        MovePlayedScore = [MovePlayedScore],
        MovePlayedMate = [MovePlayedMate],
        ProcessTime = [ProcessTime],
        NumberofNodes = [NumberofNodes]
    )

    return hcat(moving, predictions)
end


moving_info (generic function with 2 methods)

In [26]:
function moving_info(game)
    # Assuming the game, RemainingTime, and MovePlayed are correctly set
    depth = 20
    multipv = 5
    predictions = prediction(game, 60.0, "e2e4", depth, multipv) # Example RemainingTime and MovePlayed
    return predictions
end


moving_info (generic function with 2 methods)

In [27]:
df = moving_info(game)
# CSV.write("./dataframe.csv", df)
df


[nothing, nothing, nothing, nothing, nothing]["", "", "", "", ""][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing]["", "", "", "", ""][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing]["", "", "", "", ""][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing]["", "", "", "", ""][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing]["", "", "", "", ""][nothing, nothing, nothing, nothing, nothing][nothing, nothing, nothing, nothing, nothing]

[91m[1m┌ [22m[39m[91m[1mError: [22m[39mError adding value to column :MovePlayedEval. Maybe you forgot passing `promote=true`?
[91m[1m└ [22m[39m[90m@ DataFrames ~/.julia/packages/DataFrames/58MUJ/src/dataframe/insertion.jl:303[39m


LoadError: MethodError: [0mCannot `convert` an object of type [92mNothing[39m[0m to an object of type [91mString[39m

[0mClosest candidates are:
[0m  convert(::Type{T}, [91m::PyObject[39m) where T<:AbstractString
[0m[90m   @[39m [36mPyCall[39m [90m~/.julia/packages/PyCall/KLzIO/src/[39m[90m[4mconversions.jl:92[24m[39m
[0m  convert(::Type{String}, [91m::WeakRefStrings.WeakRefString[39m)
[0m[90m   @[39m [33mWeakRefStrings[39m [90m~/.julia/packages/WeakRefStrings/31nkb/src/[39m[90m[4mWeakRefStrings.jl:81[24m[39m
[0m  convert(::Type{String}, [91m::FilePathsBase.AbstractPath[39m)
[0m[90m   @[39m [35mFilePathsBase[39m [90m~/.julia/packages/FilePathsBase/4RrDh/src/[39m[90m[4mpath.jl:117[24m[39m
[0m  ...


In [None]:
function measure_execution_time(n)
    start_time = time()
    best_move = stockfish.analyse(game.board(), ce.Limit(depth=n), multipv=5)
    end_time = time()

    return end_time - start_time
end
depths = 15:25
execution_times = Float64[]

for n in depths
    push!(execution_times, measure_execution_time(n))
end

plot(depths, execution_times, xlabel="Depth", ylabel="Time (s)", title="Execution Time of Depth per game (top 5 best move)")

In [None]:
function plot_execution_times(execution_times, depth_range)
    plot(depth_range, execution_times, 
         xlabel="Depth", 
         ylabel="Average Time (s)", 
         title="Average Execution Time per Depth (One game)", 
         legend=false)
end

plot_execution_times(execution_times, 15:25)


In [None]:
function main()
    stockfish = initialize_stockfish(STOCKFISH_PATH)
    pgn_file = open(TEST_INPUT_FILENAME)

    all_data = DataFrame(WhiteUsername = String[], BlackUsername = String[], WhiteElo = Int64[], BlackElo = Int64[], 
                         EloDifference = Int64[], TimeControl = String[], Opening = String[], GameId = String[],
                         MoveId = Int64[], RemainingTime = Float64[], MovePlayed = String[], MovePlayedMate = String[], MovePlayedScore = Int64[],
                         ProcessTime = Int64[], NumberofNodes = Int64[], 
                        best_move_1 = String[], best_mate_1 = String[], best_score_1 = Int64[], ProcessTime_1 = Int64[], NumberofNodes_1 = Int64[], 
                        best_move_2 = String[], best_mate_2 = String[], best_score_2 = Int64[], ProcessTime_2 = Int64[], NumberofNodes_2 = Int64[], 
                        best_move_3 = String[], best_mate_3 = String[], best_score_3 = Int64[], ProcessTime_3 = Int64[], NumberofNodes_3 = Int64[], 
                        best_move_4 = String[], best_mate_4 = String[], best_score_4 = Int64[], ProcessTime_4 = Int64[], NumberofNodes_4 = Int64[], 
                        best_move_5 = String[], best_mate_5 = String[], best_score_5 = Int64[], ProcessTime_5 = Int64[], NumberofNodes_5 = Int64[])

    game = cp.read_game(pgn_file)
    # while !isnothing(game)
        game_info = basic_game_info(game)

        move_data = moving_info(game, stockfish)
        MoveId += 1 # Ensure MoveId is incremented correctly in the loop

        # Merging game info with move data for each move
        row = hcat(game_info, move_data)

        # Append the current move's data to the all_data DataFrame
        append!(all_data, row)

        # Attempt to move to the next game/move, adjust this logic based on your file's structure
        game = cp.read_game(pgn_file)
    # end

    close(pgn_file)
    CSV.write(TEST_OUTPUT_FILENAME, all_data)
end


## Execution

In [None]:
main()