Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Codechange: Build station and depot vehicle lists from shared order lists. #11676

Merged
merged 1 commit into from Jan 5, 2024

Conversation

PeterN
Copy link
Member

@PeterN PeterN commented Jan 3, 2024

Motivation / Problem

When building lists of vehicles from stations and depots, we iterate all vehicles, filter to find suitable vehicles, and then check each other to see if it matches the station/depot we're listing.

This is a bit bruteforce, and means for shared orders, we are checking order lists multiple times.

Description

Instead, build station and depot vehicle lists from shared order lists.

This brings some performance advantages:

  • No need to iterate all vehicles and check for primary vehicle as only vehicles that can have orders are listed.
  • Orders shared by vehicles only need to be tested once instead of for each vehicle sharing them.
  • Vehicle tests only need to be performed on the first shared vehicle instead of all.

Stats from Xarick's corona test game:

version AI station
master avg 3600µs
this PR avg 1200µs

Stats from Wentbourne:

version VLI station list VLI depot list
master avg 2100µs avg 2000µs
this PR avg 350µs avg 350µs

Limitations

This is reasonable improvement, and only uses data that we already have.

Checklist for review

Some things are not automated, and forgotten often. This list is a reminder for the reviewers.

  • The bug fix is important enough to be backported? (label: 'backport requested')
  • This PR touches english.txt or translations? Check the guidelines
  • This PR affects the save game format? (label 'savegame upgrade')
  • This PR affects the GS/AI API? (label 'needs review: Script API')
    • ai_changelog.hpp, game_changelog.hpp need updating.
    • The compatibility wrappers (compat_*.nut) need updating.
  • This PR affects the NewGRF API? (label 'needs review: NewGRF')

2TallTyler
2TallTyler previously approved these changes Jan 4, 2024
Copy link
Member

@2TallTyler 2TallTyler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me.

@rubidium42
Copy link
Contributor

I'm seeing four times essentially the same loop, with only small differences in conditionals at different places. It looks to me like this can be written in a single templated function, with some small functions for the differentiating logic. Like whether to filter on company or not, what type of order is allowed, and how to add to a collection.

That would split the internals of the (complicated) loops from the actual important bits for each of the instances. It's bit similar to something like std::copy_if, which has an UnaryPredicate to separate the if logic from the looping logic.

@PeterN
Copy link
Member Author

PeterN commented Jan 5, 2024

I've duffed the code a bit, replacing nested if conditions with continue.

And here's a good thing. Making locals out of ScriptObject::GetCompany() and ScriptCompanyMode::IsDeity() made another performance difference:

dbg: [misc] [Old] 3663 us [avg: 3663.0 us]
dbg: [misc] [New] 196 us [avg: 196.0 us]
dbg: [misc] [Old] 2916 us [avg: 2916.0 us]
dbg: [misc] [New] 188 us [avg: 188.0 us]
dbg: [misc] [Old] 2970 us [avg: 2970.0 us]
dbg: [misc] [New] 257 us [avg: 257.0 us]

@PeterN
Copy link
Member Author

PeterN commented Jan 5, 2024

I'm seeing four times essentially the same loop, with only small differences in conditionals at different places. It looks to me like this can be written in a single templated function, with some small functions for the differentiating logic. Like whether to filter on company or not, what type of order is allowed, and how to add to a collection.

That would split the internals of the (complicated) loops from the actual important bits for each of the instances. It's bit similar to something like std::copy_if, which has an UnaryPredicate to separate the if logic from the looping logic.

I had a play with this and ended up with the function like this.

template <class VehiclePredicate, class OrderPredicate, class VehicleFunc>
void FindVehiclesWithOrder(VehiclePredicate vp, OrderPredicate op, VehicleFunc vf) {
...
}

This seems to be at least about 50% slower. Not sure if I'm doing it wrong, or if it's because it needs to capture variables.

@glx22
Copy link
Contributor

glx22 commented Jan 5, 2024

And here's a good thing. Making locals out of ScriptObject::GetCompany() and ScriptCompanyMode::IsDeity() made another performance difference:

Maybe do the same for ScriptVehicleList too, and the Group ones.

@PeterN
Copy link
Member Author

PeterN commented Jan 5, 2024

Ah, good point!

src/vehiclelist.cpp Outdated Show resolved Hide resolved
@PeterN
Copy link
Member Author

PeterN commented Jan 5, 2024

Performance difference since #11686:

dbg: [misc] [old] 207814 us [avg: 2078.1 us]
dbg: [misc] [new] 17636 us [avg: 176.4 us]
dbg: [misc] [old] 193357 us [avg: 1933.6 us]
dbg: [misc] [new] 16664 us [avg: 166.6 us]
dbg: [misc] [old] 203591 us [avg: 2035.9 us]
dbg: [misc] [new] 17441 us [avg: 174.4 us]
dbg: [misc] [old] 192353 us [avg: 1923.5 us]
dbg: [misc] [new] 16598 us [avg: 166.0 us]
dbg: [misc] [old] 202319 us [avg: 2023.2 us]
dbg: [misc] [new] 17576 us [avg: 175.8 us]

But please double check I haven't got the conditions wrong :)

src/vehiclelist.cpp Outdated Show resolved Hide resolved
…ists.

The brings some performance advantages:

* No need to iterate all vehicles and check for primary vehicle as only vehicles that can have orders are listed.
* Shared orders only need to be tested once instead of for each vehicle sharing them.
* Vehicle tests only need to be performed on the first shared vehicle instead of all.
Copy link
Contributor

@rubidium42 rubidium42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the category of massive bike shedding... but if you're looking for optimal performance, some things that might help:

  • move the order->GetDestination check to before the order->IsType checks. For an order it's generally way more likely that the destination ID differs from what you want, than whether it's station vs depot, as such the destination check drops way more and that saves the two or three IsType expressions. Technically order->GetDestination requires the order to be one of a few specific types, but that isn't checked and I think the documentation of order->GetDestination is not even listing all of them.
  • when is_deity is true, you can make a separate call where the vehicle predicate just returns true, which ought to remove the whole predicate with minor optimisations; then for the other branch you only test for owner. It's more code, but fewer comparisons, so it might make a difference.

Of both I'm definitely not saying that we should do it, but rather that it could be done and could improve performance even more. Even though I do not think this code is called that often.

@PeterN
Copy link
Member Author

PeterN commented Jan 5, 2024

Splitting the deity check seems to save on average about 20µs (so about 10%) in the stationlist case.

@PeterN PeterN merged commit 34e8c8e into OpenTTD:master Jan 5, 2024
20 checks passed
@PeterN PeterN deleted the vehicle-list-station-depot branch January 5, 2024 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants