Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance regression in r18 with ndk-build inside makefiles themselves #799

Closed
b-spencer opened this issue Sep 21, 2018 · 13 comments
Closed
Assignees
Milestone

Comments

@b-spencer
Copy link

b-spencer commented Sep 21, 2018

Description

When upgrading from r16b to r18, I'm experiencing a significant performance regression with ndk-build when doing even ndk-build clean (or a anything else). One example:

ndk-build -j16 clean time for r16b: 0m1.070s
ndk-build -j16 clean time for r18: 1m21.414s

Yes, that's an increase of 80 seconds. :) This project only takes 70 seconds to compile in r16b.

The time is all spent executing expressions inside the setup of make. It appears that the changes for f99eb9f (JSON Compilation Database support) are the source of the problem, even though I am taking no actions to enable or use this feature.

It seems that make is spending a lot of time executing the highly recursive (and probably slow) generate-list-file sequence for thousands of targets. By simply commenting out three lines in definitions.mk, the performance regression disappears (and the build seems to work):

https://android.googlesource.com/platform/ndk.git/+/ndk-release-r18/build/core/definitions.mk#1509
https://android.googlesource.com/platform/ndk.git/+/ndk-release-r18/build/core/definitions.mk#1526
https://android.googlesource.com/platform/ndk.git/+/ndk-release-r18/build/core/definitions.mk#1527

ndk-build -j16 clean time for r18 with these lines commented out: 0m1.288s

The first commented-out line is the most important and gives the largest benefit. Perhaps this feature and thus these expansions can, at the very least, be enabled only when the JSON Compilation Database is requested?

Environment Details

  • NDK Version: 18.0.5002713
  • Build system: ndk-build
  • Host OS: Linux x86_64 Debian Stretch
  • Compiler: clang
  • ABI: android-19
  • STL: libc++
@b-spencer b-spencer changed the title Serious performance regression in r18 with ndk-build inside makefiles themselves Performance regression in r18 with ndk-build inside makefiles themselves Sep 21, 2018
@DanAlbert DanAlbert self-assigned this Sep 21, 2018
@DanAlbert DanAlbert added this to the r19 milestone Sep 21, 2018
@DanAlbert
Copy link
Member

DanAlbert commented Sep 21, 2018

Putting this in the r19 bucket for now. If we end up making an r18b this will be in it assuming I can come up with a relatively safe looking change (should be able to).

@DanAlbert DanAlbert modified the milestones: r19, r18b Sep 26, 2018
@DanAlbert
Copy link
Member

DanAlbert commented Sep 26, 2018

Nothing else has come in yet, but I think this probably justifies r18b on its own.

@DanAlbert
Copy link
Member

DanAlbert commented Oct 3, 2018

Managed to make generate-list-file quite a bit faster. Would you mind applying this patch to your NDK to see how much it helps? It did a fair amount for the libshaderc, but I don't have an ndk-build project on hand that takes anywhere near a minute to build (though there could be some -j differences). I think we probably should still only use the list files when LOCAL_SHORT_COMMANDS is set, but if the performance cost is now negligible I'd rather keep everyone on the same code path.

@b-spencer
Copy link
Author

b-spencer commented Oct 3, 2018

Hmm. It's better, but unfortunately, it's still slow enough to be frustrating in a edit-compile cycle.

For the same project, it now takes ~9 seconds to clean, almost all of which is spent in the startup time:

r16b:
  real    0m1.046s

r18 orig: 
  real    1m25.578s
  real    1m20.962s

r18 my commented-out hack:
  real    0m1.291s
  real    0m1.311s

r18 upstream patch:
  real    0m9.064s
  real    0m9.083s

Commenting out the generate-list-file call right before the _JSON_INTERMEDIATE target still shaves the time down to only twice as long as r16. Cutting the number of objects in play (such as by removing one of the ABI targets, for example) also cuts the time, of course.

BTW, I was able to measure that generate-list-file is being evaluated 2748 times (once per .o) in this project when running ndk-build clean.

(Another large project I tried it on takes ~6.2 seconds to clean with r18 plus this patch vs. ~2.0 seconds with r16b.)

Just to be completely clear, I'm using the clean target as a proxy for measuring the makefile overhead. The same delay occurs no matter what target you ask for.

@b-spencer
Copy link
Author

b-spencer commented Oct 3, 2018

I just tried this out on a minimal ndk-build project that just compiles 3000 zero-byte .cc files into an executable and its clean target exhibits the same kind of relative timings. I'm definitely happy to help out, but perhaps making a project like that would let you get a good feel for the speed without waiting for me . :)

@DanAlbert
Copy link
Member

DanAlbert commented Oct 3, 2018

That's what I figured, but thanks for confirming. I'll continue on to making sure we don't use these code paths unless necessary then.

@DanAlbert
Copy link
Member

DanAlbert commented Oct 3, 2018

Okay, here's the second round of improvements: https://android-review.googlesource.com/c/platform/ndk/+/777020

With that, a project with 3000 empty source files (12000 object files total) takes 6 seconds to do a no-op clean instead of 2 like it would in r17. If the user sets LOCAL_SHORT_COMMANDS=true it takes 30 seconds compared to 2m10s for r17b (that first patch really paid off, apparently).

I'm not sure if we can do better without changing the interface. The only way to get more time back would be to check if the compilation database was requested and avoid generating the rules entirely if it wasn't. This is indicated by either building the compile_commands.json goal or setting $(GEN_COMPILE_COMMANDS_DB) to true. We can detect if the user explicitly requests the former on the command line with $(MAKECMDGOALS), but I don't know off the top of my head how to detect something like the following:

# Android.mk snippet
my-custom-goal: compile_commands.json
$ ndk-build my-custom-goal

I'll give it some more thought.

@DanAlbert
Copy link
Member

(one option would of course be documenting that unlikely case and just requiring that anyone with such a rule handles it by flipping GEN_COMPILE_COMMANDS_DB to true if they see my-custom-goal in MAKECMDGOALS, and that might be good enough)

@b-spencer
Copy link
Author

b-spencer commented Oct 4, 2018

👍 With both yesterday's and today's patches applied, r18 is showing good speeds on the same project:

r18 + both patches:
  real    0m1.552s
  real    0m1.733s

As for the my-custom-goal question, you might consider making the compile_commands.json target have a conditional definition. When it's not a command-line target and GEN_COMPILE_COMMANDS_DB isn't defined, it could fail and print a message that says "Please define GEN_COMPILE_COMMANDS_DB before depending on compile_commands.json" or something like that.

@DanAlbert
Copy link
Member

Oh, so what we have now is actually more or less equivalent to r17 for a less contrived project. That's better than I thought. I suppose an extra half a second might still be worth trying to win back though?

I think for the moment it's time to stop fiddling with it to keep the risk down for r18b, but that could be an improvement for r19.

@DanAlbert
Copy link
Member

The two big fixes are in the r18 branch now. Filed #808 to track post-r18 work.

@DanAlbert
Copy link
Member

r18b was just released, FYI. Should be able to undo your local patches and update now.

scyclzy pushed a commit to scyclzy/ndk that referenced this issue Oct 20, 2018
Using list file generation for creating compile_commands.json slowed
down ndk-build a fair amount for large projects. For the following
command:

    $NDK/ndk-build -C $NDK/sources/third_party/shaderc \
        NDK_PROJECT_PATH=. APP_BUILD_SCRIPT=Android.mk APP_ABI=all \
        APP_PLATFORM=android-24 -j 72 clean

r17b takes 0.5 seconds, but r18 takes 10 seconds.

The algorithm itself is rather slow and can be sped up. With this
change, the above command takes only 0.9 seconds.

We should still gate the use of this on LOCAL_SHORT_COMMANDS so we
can get back to r17's speeds, but that will be a follow up patch.

Test: Generated a compile_commands.json for a project and compared it
      to one generated with r18. The only differences are whitespace.
Bug: android/ndk#799

Change-Id: Ic8f58c8f2c05c19b613e095c0479049b31ad6d72
(cherry picked from commit 4ab6658)
@DanAlbert
Copy link
Member

DanAlbert commented Sep 10, 2019

Managed to speed things up more (about 2x) in the LOCAL_SHORT_COMMANDS=true case now that we have Make 4: https://android-review.googlesource.com/c/platform/ndk/+/1118751. It looks like we've also improved since r18b from other changes (presumably just by updating to a newer version of make?).

disigma pushed a commit to wimal-build/ndk that referenced this issue Sep 25, 2019
This looks to be about twice as fast and lets us delete some code. To
test performance, I generated an ndk-build project with 1000 empty C++
files and built a single static library from them. That project was
built with:

    $ time $NDK/ndk-build -C foo LOCAL_SHORT_COMMANDS=true -j 72 clean

Checked with r18b (the last time we improved list file performance),
and master with and without this patch.

The results:

NDK                  | Time
-------------------- | ----
r18b                 | 5.4s
r21 (without change) | 4.5s
r21 (with change)    | 2s

Test: ./checkbuild.py && ./run_tests.py
Bug: android/ndk#799
Change-Id: If8b13cc72b462bda9e5238acd12f5e64f3bf0a96
mehulagg pushed a commit to mehulagg/superproject that referenced this issue Dec 21, 2019
* Update ndk from branch 'ndk-release-r18'
  to 6003f1835c0a758edee845caf7464078e2060de3
  - Merge changes I5a440505,Ic8f58c8f into ndk-release-r18
    
    * changes:
      Only use list files for the JSON db when requested.
      Improve generate-list-file performance.
    
  - Only use list files for the JSON db when requested.
    
    Generating list files is faster now, but just the generation of the
    rules is still quite slow when there are many objects. Avoid doing so
    unless LOCAL_SHORT_COMMANDS is used.
    
    When the following command is used to build a project with 1000 empty
    source files:
    
        ndk-build APP_ABI=arm64-v8a -B -j72 clean
    
    r17b takes 0.25 seconds and r18 takes 11 seconds.
    
    With this change, r19 takes 1 second when LOCAL_SHORT_COMMANDS is
    true and .32 seconds when it is false.
    
    Test: Generated a compile_commands.json file and checked accuracy
          with both LOCAL_SHORT_COMMANDS=true and
          LOCAL_SHORT_COMMANDS=false.
    Bug: ndk-build -C foo APP_ABI=arm64-v8a -B -j72 clean
    
    Change-Id: I5a440505c68d52f30131290998a5ce080f489732
    (cherry picked from commit f90e7869a7e1ed07a3679128a95dcdfc659c5e16)
    
  - Improve generate-list-file performance.
    
    Using list file generation for creating compile_commands.json slowed
    down ndk-build a fair amount for large projects. For the following
    command:
    
        $NDK/ndk-build -C $NDK/sources/third_party/shaderc \
            NDK_PROJECT_PATH=. APP_BUILD_SCRIPT=Android.mk APP_ABI=all \
            APP_PLATFORM=android-24 -j 72 clean
    
    r17b takes 0.5 seconds, but r18 takes 10 seconds.
    
    The algorithm itself is rather slow and can be sped up. With this
    change, the above command takes only 0.9 seconds.
    
    We should still gate the use of this on LOCAL_SHORT_COMMANDS so we
    can get back to r17's speeds, but that will be a follow up patch.
    
    Test: Generated a compile_commands.json for a project and compared it
          to one generated with r18. The only differences are whitespace.
    Bug: android/ndk#799
    
    Change-Id: Ic8f58c8f2c05c19b613e095c0479049b31ad6d72
    (cherry picked from commit 4ab66582190ef3d51d43e9603cce961f85656c44)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants