Add `pgn.BaseVisitor.begin_parse_san()` method #984

snowdrop4 · 2023-03-30T13:43:06Z

Add a new pgn.BaseVisitor method: begin_parse_san().

This is called when the parser encounters a SAN, but before the very expensive step of calling board.parse_san() on the SAN (which entails generating all the possible legal moves).

If the begin_parse_san() method returns SKIP, parsing of the SAN is skipped.

There are a couple of potential use-cases:

only parsing the first $x moves from each game mainline (perhaps you're only interested in the opening stages of each game?)
only parsing the first $x moves from each variation (perhaps the follow-up moves inside each variation do not matter to you?)
trawling through lots of PGNs to find games where a specific opening sequence was played

As parsing the moves is by far the most time-consuming part of parsing a PGN, skipping any moves can result in huge time-savings.

On a real-world codebase that reads in PGNs with variations specified for most moves (and does a lot of computations afterwards), I'm seeing around a 20% speedup in overall program runtime, entirely from skipping parsing all but the first move of each variation.

niklasf · 2023-04-09T12:45:38Z

Thanks, this makes a lot of sense. In the more common case where every move is needed, how much does calling an extra method for every move cost?

snowdrop4 · 2023-04-12T14:24:23Z

Reading all games in a tight loop from a PGN with 74 games in it, 100 times. Measured with timeit.

Master:

run 1 = 131.732180291001
run 2 = 132.3093085839937
run 3 = 131.67879545799224

mean = 131.906761444 seconds

PR branch:

run 1 = 132.53046183301194
run 2 = 132.58470095800294
run 3 = 132.302952500002

mean = 132.472705097 seconds

For an increase of 0.004% in total runtime.

Python 3.11.2 on macOS 13.3.1 on an M1 MacBook air.

snowdrop4 · 2023-04-12T14:27:40Z

Bonus benchmark, run the same way on the PR branch, but making use of the begin_parse_san() visitor to skip all but the first move from each variation:

run 1 = 95.46104741700401
run 2 = 95.54273075000674
run 3 = 95.95566983299796

mean = 95.6531493333 seconds

snowdrop4 · 2023-04-19T12:37:10Z

Semi-unrelatedly, but would you be amenable to a PR that ported the move generation to Rust with PyO3?

niklasf · 2023-04-25T18:31:13Z

Thanks again.

Regarding #722, I am not sure, honestly. Priority would memory safety and correctness, of course, followed by simplicity of the build/release process. I've also considered ditching the aged python-chess API and making bindings for shakmaty, instead.

add BaseVisitor.begin_parse_san() method

7d670a4

niklasf merged commit bb3a2a4 into niklasf:master Apr 25, 2023

niklasf added this to the v1.10.0 milestone Jul 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `pgn.BaseVisitor.begin_parse_san()` method #984

Add `pgn.BaseVisitor.begin_parse_san()` method #984

snowdrop4 commented Mar 30, 2023 •

edited

Loading

niklasf commented Apr 9, 2023

snowdrop4 commented Apr 12, 2023 •

edited

Loading

snowdrop4 commented Apr 12, 2023

snowdrop4 commented Apr 19, 2023

niklasf commented Apr 25, 2023

Add pgn.BaseVisitor.begin_parse_san() method #984

Add pgn.BaseVisitor.begin_parse_san() method #984

Conversation

snowdrop4 commented Mar 30, 2023 • edited Loading

niklasf commented Apr 9, 2023

snowdrop4 commented Apr 12, 2023 • edited Loading

snowdrop4 commented Apr 12, 2023

snowdrop4 commented Apr 19, 2023

niklasf commented Apr 25, 2023

Add `pgn.BaseVisitor.begin_parse_san()` method #984

Add `pgn.BaseVisitor.begin_parse_san()` method #984

snowdrop4 commented Mar 30, 2023 •

edited

Loading

snowdrop4 commented Apr 12, 2023 •

edited

Loading