Factorial function gets over-vectorized and slower with -O2 #36266

@llvmbot

Description

Bugzilla Link: 36918
Version: 5.0
OS: Linux
Reporter: LLVM Bugzilla Contributor
CC: @davidbolvansky, @DougGregor, @efriedma-quic, @hfinkel, @laytonio, @RKSimon, @jeremy-rifkin

Extended Description

Consider a simple recursive factorial function (not strictly tail-recursive, since the multiplication happens after the recursive call returns):

int factorial(int n)
{
    if (n <= 0) return 1;
    return n * factorial(n - 1);
}

Compiling with -O2 on clang++ produces far more code than -O1 or -Os, or than any of these flags with gcc. This holds for at least version 3.8 onwards.

A comparison can be found here: https://godbolt.org/g/fbztqo

Basic performance tests on Ubuntu 16.04 (Core i7-6700 CPU) also show that the heavily vectorized -O2 version is considerably slower than the non-vectorized one, at least for all values that don't lead to overflow. The clang binary is also slightly larger.
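The test harness used for these measurements is not included in the report. A minimal sketch of such a timing test, assuming a 32-bit int (so 12 is the largest argument that does not overflow), might look like the following. The helper name time_factorial and its parameters are hypothetical and only serve the illustration.

```cpp
#include <chrono>
#include <cstdint>

// Function from the report, unchanged.
int factorial(int n)
{
    if (n <= 0) return 1;
    return n * factorial(n - 1);
}

// Hypothetical helper (not from the report): calls factorial(0..max_n)
// `iterations` times, stores a checksum of the results, and returns the
// elapsed time in nanoseconds.
std::int64_t time_factorial(int max_n, int iterations, std::uint64_t& checksum)
{
    // Reading the bound through a volatile keeps the compiler from
    // constant-folding the whole loop at compile time.
    volatile int bound = max_n;
    std::uint64_t sum = 0;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        int limit = bound;
        for (int n = 0; n <= limit; ++n)
            sum += static_cast<std::uint64_t>(factorial(n));
    }
    auto stop = std::chrono::steady_clock::now();

    // Returning the checksum to the caller keeps the calls from being
    // removed as dead code.
    checksum = sum;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Building the same file once with -O2 and once with -Os, then calling time_factorial(12, N, checksum) for a large N from main, should give a rough comparison of the two code paths. The absolute numbers will depend on the machine and compiler version.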

I would expect clang to not attempt this "optimization".

Note: Using -Os gives a result almost identical to gcc with -O2.
