
Ideas #14

Closed
mcoolin opened this issue Jan 18, 2022 · 12 comments
Labels
question Further information is requested

Comments

@mcoolin

mcoolin commented Jan 18, 2022

Hi Mitchel,

Nice job on the engine!

Just starting to get familiar with the code. New to Rust, so it will take some time. So far it looks great!

No specific issue. Is there anything you need help with?

On my initial review it does not appear that you support using any sort of opening book moves. I'm looking at another GitHub project that could act as a source for a set of opening book moves. See https://github.com/niklasf/chess-openings

I have cloned and built Walleye.

Cute Chess does not seem to install as documented, but that issue will go to them.

A few other ideas:

  • break out the fen functions from board
  • record moves
  • add AI to the engine
@MitchelPaulin
Owner

MitchelPaulin commented Jan 18, 2022

Hey Mike, glad you were able to build it!

> On my initial review it does not appear that you support using any sort of opening book moves.

This is a deliberate choice: opening book management is usually offloaded to the GUI (Cute Chess has an option to load a PGN file, for example).

> • break out the fen functions from board
> • record moves
> • add AI to the engine

  • I think breaking up the board file is a great idea; it has grown pretty large, and moving the piece definitions to their own file would be nice. Though I do like the fen functions where they are.

  • Recording moves is, again, usually a job handled by the GUI.

  • As far as adding AI, I'm not sure what you mean. The engine has AI via an optimized version of the minimax algorithm. Do you mean adding some type of neural network/deep learning?
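
For readers new to the term, minimax with alpha-beta pruning (a common optimization of minimax) can be sketched on a toy tree. This is a minimal illustration under stated assumptions, not Walleye's search code: the `Node` type, the `alpha_beta` signature, and `sample_tree` are made up for the example.

```rust
/// A toy game tree: a leaf holds a static evaluation, an inner node holds children.
/// (Hypothetical type for illustration only.)
enum Node {
    Leaf(i32),
    Inner(Vec<Node>),
}

/// Minimax with alpha-beta pruning.
/// `maximizing` is true when the side to move wants the highest score.
fn alpha_beta(node: &Node, mut alpha: i32, mut beta: i32, maximizing: bool) -> i32 {
    match node {
        Node::Leaf(score) => *score,
        Node::Inner(children) => {
            if maximizing {
                let mut best = i32::MIN;
                for child in children {
                    best = best.max(alpha_beta(child, alpha, beta, false));
                    alpha = alpha.max(best);
                    if alpha >= beta {
                        break; // the minimizing parent would never allow this line
                    }
                }
                best
            } else {
                let mut best = i32::MAX;
                for child in children {
                    best = best.min(alpha_beta(child, alpha, beta, true));
                    beta = beta.min(best);
                    if alpha >= beta {
                        break; // the maximizing parent already has something better
                    }
                }
                best
            }
        }
    }
}

/// Two replies for the maximizer, each answered by the minimizer.
fn sample_tree() -> Node {
    Node::Inner(vec![
        Node::Inner(vec![Node::Leaf(3), Node::Leaf(5)]),
        Node::Inner(vec![Node::Leaf(2), Node::Leaf(9)]),
    ])
}
```

Calling `alpha_beta(&sample_tree(), i32::MIN, i32::MAX, true)` searches the first branch fully, then abandons the second branch as soon as it sees the leaf worth 2, because the maximizer already has 3 guaranteed.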

> No specific issue. Is there anything you need help with?

Move generation is one thing. It's pretty slow right now relative to other engines, mostly because I use clone in a lot of places, which makes the move generation code much easier to understand and debug but much slower. If you can find a way to make some optimizations there, that would be great. You can check out the README for how to run benchmarks.

MitchelPaulin added the question label Jan 18, 2022
@mcoolin
Author

mcoolin commented Jan 19, 2022

Mitchel,

I did not see the AI that already exists. I was referring more to neural/deep learning AI.

I spent my evening running tests/perf and reading.

Clone is usually number 1 or 2 in the perf stats.

I'm running on Linux on an i7 Lenovo with 4 CPUs and 8 threads.

My observations so far:

  • most of the time everything seems to run on two CPUs.
  • memory usage is pretty even
  • After a lot of reading and code study, I'm wondering if Copy would be faster than Clone, as it's essentially a memcpy. There are 18 Clones in the code.
  • I'm wondering if the get moves could use threads for each of the piece types to run in parallel?
  • I want to put a summary together on the current performance
  • I noticed that the number of nodes per second really drops off fast the higher the depth goes
  • oddly depth 5 and 6 gave me the same node count
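
On the Copy versus Clone question, a small sketch may help. For a plain-data type that derives both traits, `clone()` performs the same bitwise copy as an implicit `Copy`, so switching between them is unlikely to change the cost by itself; the number of bytes being moved is what shows up in a profile. The `SmallBoard` type below is hypothetical and is not Walleye's `BoardState`. Note also that a type owning heap data such as a `Vec` cannot be `Copy` at all.

```rust
/// Hypothetical 8x8 board of plain bytes; not Walleye's actual `BoardState`.
#[derive(Clone, Copy, PartialEq, Debug)]
struct SmallBoard {
    squares: [[u8; 8]; 8],
}

/// Duplicates a board once implicitly (Copy) and once explicitly (Clone),
/// then mutates both duplicates to show they are independent of the original.
fn copies_are_independent() -> (SmallBoard, SmallBoard, SmallBoard) {
    let original = SmallBoard { squares: [[0; 8]; 8] };
    let mut by_copy = original; // implicit bitwise copy
    let mut by_clone = original.clone(); // for a Copy type, the same bitwise copy
    by_copy.squares[0][0] = 1;
    by_clone.squares[0][0] = 2;
    (original, by_copy, by_clone)
}
```

Either way, each duplicate here moves `std::mem::size_of::<SmallBoard>()` bytes, so reducing how often a board is duplicated, or how large it is, is where the savings would come from.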

@mcoolin
Author

mcoolin commented Jan 19, 2022

Testing results:
Debug
./walleye -T --depth=5
Searched to a depth of 5 and evaluated 5072212 nodes in 28.646725553s for a total speed of 181150 nps
Searched to a depth of 5 and evaluated 5072212 nodes in 27.975565682s for a total speed of 187859 nps
Searched to a depth of 5 and evaluated 5072212 nodes in 27.871391048s for a total speed of 187859 nps
Searched to a depth of 5 and evaluated 5072212 nodes in 28.314559684s for a total speed of 181150 nps
Searched to a depth of 5 and evaluated 5072212 nodes in 27.945402446s for a total speed of 187859 nps
sudo perf record ./walleye -T --depth=5
[sudo] password for mike:
Searched to a depth of 5 and evaluated 5072212 nodes in 28.027864331s for a total speed of 181150 nps
[ perf record: Woken up 17 times to write data ]
[ perf record: Captured and wrote 4.283 MB perf.data (111850 samples) ]

Perf
23.31% walleye walleye [.] walleye::evaluation::get_evaluation
7.64% walleye walleye [.] <core::ops::range::Range as core::iter::range::Ra
5.26% walleye walleye [.] walleye::move_generation::is_check_cords
3.97% walleye walleye [.] core::mem::replace
3.79% walleye walleye [.] <walleye::board::PieceColor as core::cmp::PartialEq>

./walleye -T --depth=6
Searched to a depth of 6 and evaluated 124132536 nodes in 742.085057102s for a total speed of 167294 nps
sudo perf record ./walleye -T --depth=6
Searched to a depth of 6 and evaluated 124132536 nodes in 746.898943079s for a total speed of 166397 nps
[ perf record: Woken up 454 times to write data ]
[ perf record: Captured and wrote 113.881 MB perf.data (2984771 samples) ]

perf
23.39% walleye walleye [.] walleye::evaluation::get_evaluation
7.53% walleye walleye [.] <core::ops::range::Range as core::iter::range::Ra
5.26% walleye walleye [.] walleye::move_generation::is_check_cords
4.14% walleye walleye [.] core::mem::replace
3.72% walleye walleye [.] <walleye::board::PieceColor as core::cmp::PartialEq>

sudo perf record ./walleye -P -S

perf
8.23% walleye walleye [.] walleye::move_generation::is_check_cords
7.59% walleye walleye [.] walleye::evaluation::get_evaluation
5.05% walleye walleye [.] <core::ops::range::Range as core::iter::range::R
4.24% walleye walleye [.] <core::slice::iter::Iter as core::iter::traits::
4.05% walleye walleye [.] <walleye::board::Square as core::cmp::PartialEq>::e

@MitchelPaulin
Owner

MitchelPaulin commented Jan 19, 2022

Thanks for putting this together. For accurate results, though, make sure you are profiling the release build (with these nps numbers I strongly suspect you are profiling the debug build). For perf to still work properly, you can instruct the compiler to keep the debugging symbols using

[profile.release]
debug = true

The speed drops off rapidly as depth increases because there are many more terminal nodes (exponentially more each level) and each of those positions gets evaluated. If you were to disable the evaluation part of the test bench, you would get a much more consistent nps across depths.
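
To make the growth concrete: the node totals reported earlier in this thread (5072212 at depth 5 and 124132536 at depth 6) match the running sums of the standard perft counts for the initial chess position. The sketch below tallies those counts; the function names are illustrative and not taken from Walleye.

```rust
/// Standard perft counts from the initial chess position:
/// the number of positions exactly n plies deep, for n = 1..=6.
const PERFT: [u64; 6] = [20, 400, 8_902, 197_281, 4_865_609, 119_060_324];

/// Total nodes visited when walking the tree to `depth` plies (root excluded).
fn total_nodes(depth: usize) -> u64 {
    PERFT[..depth].iter().sum()
}

/// Share of the visited nodes that sit at the final ply.
fn leaf_share(depth: usize) -> f64 {
    PERFT[depth - 1] as f64 / total_nodes(depth) as f64
}
```

Each extra ply multiplies the node count by roughly 20 to 25 in this range, and about 96% of all visited nodes sit at the deepest ply, which is where evaluation is run.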

@mcoolin
Author

mcoolin commented Jan 19, 2022

I was profiling debug.

Where do I put the debug = true line?

I expect the release numbers would be better, but the items would likely stay the same.
I'll give it a try when I update the release entries.

@MitchelPaulin
Owner

You would add it here https://github.com/MitchelPaulin/Walleye/blob/main/Cargo.toml

The numbers actually do change slightly. For example, is_check_cords has a large footprint in debug but an almost unnoticeable performance impact in release mode.

@mcoolin
Author

mcoolin commented Jan 19, 2022

Release version
./walleye -T --depth=5
Searched to a depth of 5 and evaluated 5072212 nodes in 1.687552567s for a total speed of 5072212 nps

sudo perf record ./walleye -T --depth=5
[sudo] password for mike:
Searched to a depth of 5 and evaluated 5072212 nodes in 1.622877638s for a total speed of 5072212 nps
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.261 MB perf.data (6434 samples) ]

perf
51.48% walleye walleye [.] walleye::evaluation::get_evaluation
14.86% walleye libc-2.31.so [.] __memmove_avx_unaligned_erms
14.06% walleye walleye [.] walleye::move_generation::generate_moves
8.58% walleye walleye [.] walleye::move_generation::is_check_cords
2.14% walleye walleye [.] walleye::board::BoardState::move_piece
1.89% walleye walleye [.] _mi_page_retire
1.57% walleye walleye [.] walleye::move_generation::generate_moves_test
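
A note on the figures above: in the release run the reported nps equals the node count exactly, and the debug figures match 5072212 divided by 28 and by 27. That suggests the elapsed time is truncated to whole seconds before dividing. This is an inference from the logs, not something confirmed from the source. The sketch below contrasts the two calculations; both function names are hypothetical.

```rust
use std::time::Duration;

/// Nodes per second using whole seconds only, which loses precision on short runs.
fn nps_whole_seconds(nodes: u64, elapsed: Duration) -> u64 {
    // `.max(1)` avoids dividing by zero for runs shorter than one second.
    nodes / elapsed.as_secs().max(1)
}

/// Nodes per second using the fractional elapsed time.
fn nps_fractional(nodes: u64, elapsed: Duration) -> u64 {
    (nodes as f64 / elapsed.as_secs_f64()) as u64
}
```

With 5072212 nodes in 1.687552567s, the fractional version gives roughly 3.0 million nps rather than 5.07 million, so short runs are better compared by elapsed time than by the printed nps.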

@mcoolin
Author

mcoolin commented Jan 19, 2022

And when playing itself:
sudo perf record ./walleye -P -S

perf
27.61% walleye walleye [.] walleye::move_generation::generate_moves
18.09% walleye libc-2.31.so [.] __memmove_avx_unaligned_erms
16.91% walleye walleye [.] walleye::evaluation::get_evaluation
15.07% walleye walleye [.] walleye::move_generation::is_check_cords
2.95% walleye walleye [.] walleye::engine::alpha_beta_search
2.84% walleye walleye [.] walleye::board::BoardState::move_piece
2.44% walleye walleye [.] walleye::move_generation::rook_moves
2.03% walleye walleye [.] _mi_page_retire

Does it only play so far (e.g. number of moves)? What determines when it stops?

@MitchelPaulin
Owner

MitchelPaulin commented Jan 19, 2022

> Does it only play so far (e.g. number of moves)? What determines when it stops?

It's set to stop after 100 moves or checkmate. It usually ends up drawing when it plays itself, and so reaches the 100-move mark.

Profiling it with -PS gives a better idea of how it performs in real games, whereas -T is good if you want move generation specifically. is_check_cords still seems to dominate, though, which is not what I expected. It could be a good place to look for optimizations, either by reducing the number of calls to it or by improving the function itself.

@mcoolin
Author

mcoolin commented Jan 22, 2022

Hi Mitchel,

I've got a few things I'm trying, but I'm running into borrow checker issues.

I'm trying to run multiple threads in generate_moves:

for i in BOARD_START..BOARD_END {
    for j in BOARD_START..BOARD_END {
        if let Square::Full(piece) = board.board[i][j] {
            if piece.color == board.to_move {
                // spawn this
                generate_moves_for_piece(
                    piece,
                    board,
                    Point(i, j),
                    &mut new_moves,
                    move_gen_mode,
                    zobrist_hasher,
                );
                //
            }
        }
    }
}

I clone the board but keep getting a 'static lifetime error on the zobrist_hasher. Not sure how to fix that yet.

I have to understand ownership and 'static better.

My idea is that running the move generation in threads should be faster.

I have a similar idea in the check code but am less sure it will help.

Mike
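
On the 'static error: `std::thread::spawn` requires its closure to own everything it uses (or hold only `'static` references), so a borrowed hasher cannot be passed in directly. One common way around this is to share the value through `Arc`. The sketch below shows the pattern with a stand-in `Hasher` type; the real hasher's fields and methods are not shown in this thread, so everything here is an assumption for illustration.

```rust
use std::sync::Arc;
use std::thread;

/// Stand-in for the engine's hasher (hypothetical; the real type is not shown here).
struct Hasher {
    seed: u64,
}

impl Hasher {
    fn hash(&self, value: u64) -> u64 {
        value ^ self.seed
    }
}

/// Hashes each input on its own thread, sharing one hasher through `Arc`.
fn hash_on_threads(hasher: Arc<Hasher>, inputs: Vec<u64>) -> Vec<u64> {
    let handles: Vec<_> = inputs
        .into_iter()
        .map(|value| {
            // Each thread owns its own handle, which satisfies the 'static bound.
            let hasher = Arc::clone(&hasher);
            thread::spawn(move || hasher.hash(value))
        })
        .collect();
    // Join in spawn order so results line up with the inputs.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Spawning one thread per item, as this toy does, is only reasonable for a handful of items; it demonstrates the ownership pattern rather than a sensible threading granularity.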

@MitchelPaulin
Owner

MitchelPaulin commented Jan 23, 2022

Hey Mike,

I do not think using multiple cores during play is allowed under the CCRL testing conditions, or at least they put the engine on a separate leaderboard and would prefer that you use no more than one core during play. You can see this issue for more details. Usually you are allowed more than one thread, but these need to be "lightweight" threads, usually just waiting to be woken up by input; using additional threads for move generation would have a noticeable multi-core CPU footprint.

For learning purposes, adding multithreading is a great idea, but unfortunately I would not be able to include the results in Walleye.

Also a quick note:
Spawning the thread where you are planning to would create hundreds of thousands of threads to find a single move. You will probably need to split up the work higher in the call stack if you go this route.
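
As a learning exercise, splitting the work higher in the call stack could look like the sketch below: the root moves are divided into a fixed number of chunks and each chunk is searched on one scoped thread. Scoped threads (`std::thread::scope`, available since Rust 1.63) may borrow local data, which also sidesteps the 'static requirement. The `search_root_move` function and the `table` parameter are placeholders, not Walleye's API.

```rust
use std::thread;

/// Placeholder for searching the subtree under one root move; returns a score.
/// (Hypothetical: a real engine would run its search here.)
fn search_root_move(root_move: u32, table: &[i32]) -> i32 {
    table[root_move as usize % table.len()]
}

/// Splits the root moves into at most `workers` chunks and searches each chunk
/// on one scoped thread, so the thread count stays bounded by `workers`.
fn best_root_move(root_moves: &[u32], table: &[i32], workers: usize) -> Option<(u32, i32)> {
    let workers = workers.max(1);
    let chunk_size = ((root_moves.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        let handles: Vec<_> = root_moves
            .chunks(chunk_size)
            .map(|chunk| {
                // The closure only borrows `chunk` and `table`; the scope
                // guarantees both outlive the thread.
                scope.spawn(move || {
                    chunk
                        .iter()
                        .map(|&m| (m, search_root_move(m, table)))
                        .max_by_key(|&(_, score)| score)
                })
            })
            .collect();
        // Combine the best result from each worker.
        handles
            .into_iter()
            .filter_map(|h| h.join().unwrap())
            .max_by_key(|&(_, score)| score)
    })
}
```

With this shape the number of threads per search is fixed up front instead of growing with the number of pieces and positions visited.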

@MitchelPaulin
Owner

I'm going to close this issue, but feel free to keep commenting here if you have any other questions.
