Speed up full plugin parsing for correctness checking #5

Closed
Ortham opened this issue Jun 17, 2018 · 4 comments

@Ortham
Owner

Ortham commented Jun 17, 2018

libloadorder takes about 800x longer to fully parse a load order consisting of Skyrim.esm and 20 copies of Update.esm than it does to parse only their headers. It would be great if full plugin parsing were (much) faster. libloadorder only cares whether the file format is self-consistent (i.e. structure sizes are correct, even if the actual data is nonsense), so it might be worth offering a parse-and-discard mode.

The first step is adding some benchmarks though.
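
To make the parse-and-discard idea concrete, here is a minimal sketch of a size-only consistency check. It is not esplugin's code. It assumes the Skyrim layout (24-byte record and group headers, a little-endian size at offset 4, group sizes that include the header and record sizes that don't), and `is_self_consistent`, `check_entries`, `read_u32_le` and `MAX_DEPTH` are all hypothetical names. It doesn't look inside record data, so subrecord sizes go unchecked.

```rust
/// Header length for records and groups (assumed: Skyrim layout).
const HEADER_LEN: usize = 24;
/// Arbitrary nesting limit so hostile input can't exhaust the stack (assumption).
const MAX_DEPTH: usize = 16;

/// Read a little-endian u32 at `offset`, or None if the slice is too short.
fn read_u32_le(bytes: &[u8], offset: usize) -> Option<u32> {
    let b = bytes.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Walk a run of records and groups, checking only that each declared size
/// fits inside its parent. Nothing is allocated and no data is decoded.
fn check_entries(mut bytes: &[u8], depth: usize) -> bool {
    while !bytes.is_empty() {
        if bytes.len() < HEADER_LEN {
            return false;
        }
        let size = match read_u32_le(bytes, 4) {
            Some(s) => s as usize,
            None => return false,
        };
        let total = if &bytes[..4] == b"GRUP" {
            // Assumed: a group's size field includes its own header.
            if depth >= MAX_DEPTH || size < HEADER_LEN || size > bytes.len() {
                return false;
            }
            if !check_entries(&bytes[HEADER_LEN..size], depth + 1) {
                return false;
            }
            size
        } else {
            // Assumed: a record's size field excludes its header.
            match HEADER_LEN.checked_add(size) {
                Some(total) if total <= bytes.len() => total,
                _ => return false,
            }
        };
        bytes = &bytes[total..];
    }
    true
}

/// Hypothetical entry point: true if every size in the buffer is consistent.
fn is_self_consistent(plugin: &[u8]) -> bool {
    check_entries(plugin, 0)
}
```

Called on a memory-mapped or fully read plugin buffer, this only ever touches the headers, which is the property a discard mode would want.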

@Ortham Ortham self-assigned this Jun 17, 2018
@Ortham
Owner Author

Ortham commented Jun 17, 2018

Some basic benchmarks:

No changes

Plugin.parse_file() header-only
                        time:   [31.472 us 31.723 us 32.002 us]

Plugin.parse_file() full
                        time:   [20.630 ms 20.753 ms 20.893 ms]

Plugin.parse_mmapped_file() header-only
                        time:   [55.817 us 56.109 us 56.473 us]

Plugin.parse_mmapped_file() full
                        time:   [8.7143 ms 8.7664 ms 8.8223 ms]

With FormID collection commented out (but still parsed from records)

Plugin.parse_file() header-only
                        time:   [28.998 us 29.137 us 29.301 us]

Plugin.parse_file() full
                        time:   [4.4352 ms 4.4549 ms 4.4763 ms]

Plugin.parse_mmapped_file() header-only
                        time:   [57.613 us 57.734 us 57.861 us]

Plugin.parse_mmapped_file() full
                        time:   [2.4021 ms 2.4146 ms 2.4276 ms]

So FormID collection is the slowest bit, but even just parsing records is still too slow for libloadorder.

@Ortham
Owner Author

Ortham commented Jun 17, 2018

Skipping allocation of record header fields when parsing for only the FormID brings collectionless parsing of a memory-mapped Hearthfires.esm down to about 1 ms:

Plugin.parse_file() header-only
                        time:   [28.681 us 28.844 us 29.060 us]

Plugin.parse_file() full
                        time:   [3.1757 ms 3.2031 ms 3.2328 ms]

Plugin.parse_mmapped_file() header-only
                        time:   [55.188 us 55.419 us 55.682 us]

Plugin.parse_mmapped_file() full
                        time:   [1.0637 ms 1.0673 ms 1.0714 ms]
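
As an illustration of reading only the FormID from a record header, here is a small sketch. It is not the change that was actually made. It assumes the Skyrim header layout (type, data size, flags, then FormID, each four bytes), and `record_form_id` and `FORM_ID_OFFSET` are hypothetical names.

```rust
/// Offset of the FormID within a record header (assumed: Skyrim layout of
/// type, data size, flags, FormID, four bytes each).
const FORM_ID_OFFSET: usize = 12;

/// Hypothetical helper: return just the FormID from a record header slice,
/// without building a struct or copying any of the other header fields.
fn record_form_id(header: &[u8]) -> Option<u32> {
    let b = header.get(FORM_ID_OFFSET..FORM_ID_OFFSET + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}
```

The slice is borrowed from the input buffer, so the only value produced per record is a single `u32`.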

Here are the results with FormID collection:

Plugin.parse_file() header-only
                        time:   [28.812 us 29.114 us 29.502 us]

Plugin.parse_file() full
                        time:   [7.7028 ms 7.7235 ms 7.7459 ms]

Plugin.parse_mmapped_file() header-only
                        time:   [57.017 us 57.195 us 57.429 us]

Plugin.parse_mmapped_file() full
                        time:   [5.4778 ms 5.4877 ms 5.4986 ms]

Specialising the record parsing function to be game-specific (taking the conditions out of the nom macros) doesn't have any effect on performance, and neither does skipping allocation of the FormIDs. This makes sense, as the collectionless results are probably close to my SSD's throughput: it's rated for 2,150 MB/s and I'm getting 3,636 MB/s, though the reads here aren't sustained, so the two figures aren't really comparable.

@Ortham
Owner Author

Ortham commented Jun 18, 2018

read_to_end() doesn't pre-allocate a buffer for the whole file. Doing that before reading improves performance for parse_file() full. With FormID collection:

Plugin.parse_file() full
                        time:   [6.1372 ms 6.1650 ms 6.1954 ms]

Without FormID collection:

Plugin.parse_file() full
                        time:   [1.7348 ms 1.7417 ms 1.7497 ms]
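
The pre-allocation described above can be sketched with only the standard library. This is an illustration, not esplugin's actual code, and `read_whole_file` is a hypothetical name.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Hypothetical helper: read a whole file into a buffer whose capacity is
/// reserved up front from the file's reported size.
fn read_whole_file(path: &Path) -> io::Result<Vec<u8>> {
    let mut file = File::open(path)?;
    // The size is only a capacity hint; fall back to 0 if it is unavailable.
    let size = file.metadata().map(|m| m.len() as usize).unwrap_or(0);
    let mut buffer = Vec::with_capacity(size);
    // read_to_end() appends to the buffer, so it can fill the reserved
    // capacity instead of growing the Vec repeatedly.
    file.read_to_end(&mut buffer)?;
    Ok(buffer)
}
```

`std::fs::read()` takes a similar approach internally, so it may be a simpler option where a plain `Vec<u8>` of the file contents is all that's needed.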

@Ortham
Owner Author

Ortham commented Jun 18, 2018

With just the conversion from Vec<u32>s to BTreeSet<FormId> commented out:

Plugin.parse_file() full
                        time:   [1.4648 ms 1.4728 ms 1.4816 ms]

then with that still commented out and pushing u32s to a single vec:

Plugin.parse_file() full
                        time:   [1.0750 ms 1.0821 ms 1.0906 ms]

The Vec<u32> isn't preallocated: although an estimated size could be obtained from the TES4 header's record and group count field, preallocating saves less than 2% of the execution time (still ignoring the conversion to FormID objects), so it's not really worth it.

This is interesting for speeding up LOOT's sorting, as that's what uses FormIDs. As they are, FormIDs are highly portable but also very inefficient. Portability isn't really a concern, though, as they never get passed through the C FFI, so esplugin could instead have the FormIDs reference the masters rather than each taking a copy.
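
A rough sketch of what a FormID that references the masters could look like follows. It is not esplugin's actual type. `BorrowedFormId` and `resolve` are hypothetical names, and the sketch assumes the usual scheme in which the top byte of a raw FormID indexes the masters list and any index past the end refers to the plugin itself. Case-insensitive filename comparison is ignored here.

```rust
/// Hypothetical borrowed FormID: the plugin name points into a shared list
/// of masters (or at the plugin's own name) instead of owning a String.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct BorrowedFormId<'a> {
    plugin_name: &'a str,
    object_index: u32,
}

/// Resolve a raw FormID against the plugin's masters without copying names.
fn resolve<'a>(raw: u32, plugin_name: &'a str, masters: &'a [String]) -> BorrowedFormId<'a> {
    // Assumed: the top byte is an index into the masters list, and an
    // out-of-range index means the record belongs to this plugin.
    let mod_index = (raw >> 24) as usize;
    let name = masters
        .get(mod_index)
        .map(String::as_str)
        .unwrap_or(plugin_name);
    BorrowedFormId {
        plugin_name: name,
        object_index: raw & 0x00FF_FFFF,
    }
}
```

The trade-off is the lifetime parameter: the resolved FormIDs can't outlive the masters list they borrow from. That would matter if they crossed the C FFI, but as noted above they don't.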

However, it's not very relevant to speeding up correctness checking, which has probably reached its limit. Full plugin parsing is now 40% to 60% faster for large plugins, and memory mapping is used only when it's likely to give better performance, but that's still not fast enough for libloadorder to fully parse plugins to check for correctness. As such, closing this.
