Skip to content

[RULE REQUEST] Warn for compiler and linker flags that are bad for performance are used for release builds#659

Closed
dmachaj wants to merge 6 commits into
microsoft:mainfrom
dmachaj:user/dmachaj/performance-scanning
Closed

[RULE REQUEST] Warn for compiler and linker flags that are bad for performance are used for release builds#659
dmachaj wants to merge 6 commits into
microsoft:mainfrom
dmachaj:user/dmachaj/performance-scanning

Conversation

@dmachaj
Copy link
Copy Markdown
Collaborator

@dmachaj dmachaj commented Jun 25, 2022

Closes #651
[RULE REQUEST] Warn for compiler and linker flags that are bad for performance are used for release builds

Why is this change being made?

The purpose of this change is to establish a pattern of identifying and warning about performance problems in BinSkim, in addition to security problems. BinSkim has a wide (and well deserved) reach so it would be great if it could provide coverage here. There are a variety of compiler/linker options that are good or bad for performance and in many cases the problems are non-obvious. BinSkim warning about these non-obvious problems would help improve the performance of software whenever it is run and followed.

Briefly summarize what changed

This change starts performance warnings with the least controversial problem that I am aware of: incremental linking. Incremental linking is good for debug builds but it is very bad for optimized release builds that are released publicly. It leaves padding around the binary so that more code can be shoved in later. That is wasteful for release builds.

The BA6001.DisableIncrementalLinking rule was created. I semi-arbitrarily picked the 6000 range for performance-related warnings. This warning tries to only apply to optimized code so it tries to sniff out whether the binary was optimized, or not. If the binary is unoptimized then clearly performance checking is not needed so it skips analysis. The current approach for detecting optimization is a heuristic so there is probably a better way to do it. I don't know it so I'm hoping this PR can help drive discussion.

The other aspect is detecting the linker flags. There does not seem to be precedent for analyzing the link.exe command line so I had to establish a LinkerCommandLine class to encapsulate that. The docs.microsoft.com documentation for incremental linking was encoded as logic in here to determine whether incremental linking will happen, or not.

The outer rule then checks if optimizations are enabled and incremental linking is also enabled. If both are true then it is a problem and it logs a warning.

The documentation was also updated as appropriate.

How was this change tested?

I tested it locally using a variety of binaries with different variations of linker flags. These binaries are added as test cases. There is also a new set of functional tests for this rule (which pass; although the NotApplicable test is sketchy).

Comment thread src/BinaryParsers/PEBinary/ProgramDatabase/LinkerCommandLine.cs Fixed
Comment thread src/BinaryParsers/PEBinary/PortableExecutable/PE.cs
Comment thread src/BinaryParsers/PEBinary/PortableExecutable/PE.cs
// Arbitrary threshold of 50% optimized.
// Things linked into the binary also count as symbols so this can be complicated. There is probably a better heuristic than this.
// A small single-file win32 console application is just over 50% optimized. A large project is more like 95%.
if (((float)optimizedCount / (float)count) > 0.50f)
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This heuristic is not ideal. I'm looking for better options.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detection just from a binary? Tough one.

Imports to debug CRTs imply not-optimized

Similarly, if the binary has certain symbols from a statically linked debug CRT it's likely not-optimized. Not sure what symbols to look for. In particular in cases like WinAppSDK using the HybridCRT approach the Standard C library is UCRT and just the Standard C++ stuff (vector etc) are static. Not sure if there's any debug artifacts in there one could sniff out

Copy link
Copy Markdown
Collaborator Author

@dmachaj dmachaj Jul 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting suggestion. Linking to the debug C runtime definitely means a debug build. However many projects link to the release C runtime, even for debug builds, so it is not a firm guarantee. That might be a good early-exit approach. I'll think about this more.

Also, this rule requires access to the PDB so it isn't just the binary.

Comment thread docs/BinSkimRules.md

---

## Rule `BA6001.DisableIncrementalLinking`
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to clash with #657
@michaelcfanning or @marmegh can you advise on how to assign these new proposed rule IDs?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 6000 range was not claimed when I started coding this change. If that range will be in use before this merges then I don't mind renumbering to something else (7000's?).

Thanks for raising that. I didn't realize there was another PR adding a 6001 rule.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind renumbering mine (SourceLink) either. Jumping up by 1,000 each time seems silly. For mine, I can't imagine having more than one or two rules related to Source Link. I certainly don't need to reserve 1,000! Maybe we should both just append onto the 2,000 range.
@michaelcfanning, @marmegh what's the thinking on rule ID assignment? Is there an established convention or policy? For example, the "RULE REQUEST" template asks for a rule ID, but isn't it up to the maintainers to decide what the new ID should be (and avoid clashes like this)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pharring, we're still reviewing both changes for the best rule id range.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be late jumping in, these are both excellent contributions and we appreciate them. I just assigned BA2027 to Paul, as source link has a clear security connection. I tend to think that starting a new id line with BA6001 for analysis that's strictly performance relevant makes sense. So you win, @dmachaj, stick with it. :)

I plan to give this PR careful attention tomorrow, when I have some more time. Thanks again for the contribution! This will be an interesting discussion.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelcfanning please let me know if you have any thoughts, questions, or feedback. I'm especially interested in discussing the expansion of BinSkim into performance warnings. I have several more that I'd like to follow up with once the pattern is established. Thanks.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very sorry for the delay here, and I'm headed out for a break. We did manage to get @pharring's rule merged and you're next in the queue. That merge means you have file conflicts to resolve, sorry about that. Also, the fact that you're contributing from a fork (which is typical for mature/larger OSS projects) happens to prevent our CI/CD build from verifying your change (because it is hardened to only process branches from trusted contributors). I have therefore made you a write contributor. So if you have time to resolve merge conflicts and move your contribution over to a branch off main and submit a new PR, I'd appreciate it. I will be reviewing your rule as soon as I can get to it, which will be 7/20 (Wednesday) at the earliest. Sorry again for the delay.

Once we get through this first contribution and establish a pattern, the next perf rules s/be smoother sailing. :) Thank you for this contribution!

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries. I have merged and un-forked and there is now #667.

Comment thread src/BinSkim.Rules/PERules/BA6001.DisableIncrementalLinking.cs Outdated
Comment thread src/BinSkim.Rules/PERules/BA6001.DisableIncrementalLinking.cs Outdated
Comment thread src/BinaryParsers/PEBinary/ProgramDatabase/CompilerCommandLine.cs Outdated
Comment thread src/BinaryParsers/PEBinary/ProgramDatabase/LinkerCommandLine.cs Fixed
@DrusTheAxe
Copy link
Copy Markdown
Member

DrusTheAxe commented Jul 1, 2022

How about MSBUILD .props/.targets to do this detection? Projects using MSBUILD (including VS *proj) could import PerformanceWarning.target to get detection at build-time (including dev inner-loop build-time). That would be a big plus over binskim which is rarely (can't be?) run locally as part of inner-loop.

I've seen build logic like that but can't recall where. Something something UCRT something Windows something?

@dmachaj
Copy link
Copy Markdown
Collaborator Author

dmachaj commented Jul 2, 2022

How about MSBUILD .props/.targets to do this detection? Projects using MSBUILD (including VS *proj) could import PerformanceWarning.target to get detection at build-time (including dev inner-loop build-time). That would be a big plus over binskim which is rarely (can't be?) run locally as part of inner-loop.

I've seen build logic like that but can't recall where. Something something UCRT something Windows something?

BinSkim is a tool that is packaged up in a NuGet. It can be run locally so long as you know to. That wouldn't be part of the inner loop but these sorts of things don't really change that often so I'm not sure it makes sense to run it with every build. I think the important part is outer loop coverage.

The other benefit of BinSkim over something like .props/.targets is that it is agnostic of the build tooling. As long as cl.exe/link.exe are used it doesn't matter if you use CMake, Ninja, nmake, msbuild, a 20 year old batch file, or whatever else to compile your project. It can still check the results.

@DrusTheAxe
Copy link
Copy Markdown
Member

Ohhh I didn't mean to imply either-or. BinSkim is cool but as you note the weakness is it's a step removed from source so you're reduced to digging up dinosaur bones to determine if they were warm- or cold-blooded, their color and if they had feathers. Much easier to do at the source level starting with a Condition="'$(Configuration)'=='Release'". Devs usually build Debug flavors locally so its not really an inconvenience, and even better if that Condition includes an optional opt-out if you define a property false</> and if unspecified default to true for Release and false for Debug

The actual detection is easy since you have access to the actual compiler and linker options. WinAppSDK and other projects have various pre-build checks so it's paved ground too

P.S. The debugCRT isn't definitive but seems like a solid way to early exit. Heuristically distinguishing optimized vs not just from a binary is hard

RTTI is a good one to probe for. Fortunately you should be able to spot RTTI metadata in the binary to reliable warnnon that front

Do you have a specific list of what options you want to detect? Always good to start with goals... 🙂

@dmachaj
Copy link
Copy Markdown
Collaborator Author

dmachaj commented Jul 19, 2022

@DrusTheAxe @jonwis @pharring @lhecker as an FYI this PR is being closed and is moving to #667 (same changes, not from a fork).

@dmachaj dmachaj closed this Jul 19, 2022
@dmachaj dmachaj deleted the user/dmachaj/performance-scanning branch July 19, 2022 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[RULE REQUEST] Warn for compiler and linker flags that are bad for performance are used for release builds

8 participants