[SR-11640] Multiplying doubles from struct #54051

Open
zoecarver opened this issue Oct 19, 2019 · 4 comments

@zoecarver (Collaborator) commented Oct 19, 2019

Previous ID: SR-11640
Radar: None
Original Reporter: @zoecarver
Type: Improvement

Additional Detail from JIRA:
Votes: 0
Component/s: Compiler
Labels: Improvement, CodeGen, Optimizer, Performance
Assignee: None
Priority: Medium

md5: 833dd33d195d879e7d8fc3c9618b65e2

Issue Description:

struct X {
    let x : Double = 2
    let y : Double = 2
    let z : Double = 2
}

func test(numbers: [X]) -> Double {
    var x = 1.0
    for num in numbers {
        x *= num.x
        x *= num.y
        x *= num.z
    }
    return x
}

assert(test(numbers: [X(), X(), X(), X(), X()]) != 0)

The code above generates 9 `fmul` instructions when it should only need 3. It also generates 31 `getelementptr` instructions, far more than necessary; in theory 4 would be enough.

An equivalent program in C++ takes about a quarter of the time to run. Here is a comparison of the codegen from Swift and Clang.

If others agree this is an issue, I will start working on a patch to try to resolve it.
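For anyone who wants to reproduce the runtime side of this, a minimal timing sketch follows. The `bestTime` helper, the input size, and the repetition count are all assumptions made for illustration; none of them come from the issue, and the numbers it prints will depend on the machine and on the optimization flags used.

```swift
import Foundation

struct X {
    let x: Double = 2
    let y: Double = 2
    let z: Double = 2
}

// The function from the issue, unchanged.
func test(numbers: [X]) -> Double {
    var x = 1.0
    for num in numbers {
        x *= num.x
        x *= num.y
        x *= num.z
    }
    return x
}

// Hypothetical helper (not from the issue): runs `body` several times and
// returns the best wall-clock time in seconds along with the last result.
func bestTime(iterations: Int, _ body: () -> Double) -> (seconds: Double, result: Double) {
    var best = Double.infinity
    var result = 0.0
    for _ in 0..<iterations {
        let start = DispatchTime.now().uptimeNanoseconds
        result = body()
        let end = DispatchTime.now().uptimeNanoseconds
        best = min(best, Double(end - start) / 1e9)
    }
    return (best, result)
}

// Assumed input size. Every field is 2, so the product overflows to
// +infinity for an array this large; only the elapsed time is of interest.
let input = [X](repeating: X(), count: 1_000_000)
let (seconds, result) = bestTime(iterations: 10) { test(numbers: input) }
print("best of 10: \(seconds) s, result: \(result)")
```

Printing the result keeps the optimizer from discarding the call entirely.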

@belkadan (Contributor) commented Oct 21, 2019

I only see three fmul instructions in the optimized code. Are you looking at unoptimized code?

@zoecarver (Collaborator, Author) commented Oct 21, 2019

Yes, I am looking at the optimized code. Take a look at the assembly or the IR gen in the link above.

@belkadan (Contributor) commented Oct 21, 2019

Ah, I thought you meant LLVM IR. The nine instructions are from loop unrolling; if you use -Osize instead of -O you get the three you originally expected.
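As a rough illustration of why unrolling multiplies the instruction count, here is a hand-unrolled Swift version of the loop. The name `testUnrolled` and the shape (a factor of two plus a one-element remainder) are assumptions chosen for illustration; that shape happens to contain nine multiply statements, but the thread does not say what shape LLVM actually picked.

```swift
struct X {
    let x: Double = 2
    let y: Double = 2
    let z: Double = 2
}

// The loop from the issue: three multiplies per element.
func test(numbers: [X]) -> Double {
    var x = 1.0
    for num in numbers {
        x *= num.x
        x *= num.y
        x *= num.z
    }
    return x
}

// Hypothetical hand-unrolled equivalent (assumed factor of two).
func testUnrolled(numbers: [X]) -> Double {
    var x = 1.0
    var i = 0
    // Main body: two elements per trip, six multiplies.
    while i + 1 < numbers.count {
        let a = numbers[i]
        let b = numbers[i + 1]
        x *= a.x
        x *= a.y
        x *= a.z
        x *= b.x
        x *= b.y
        x *= b.z
        i += 2
    }
    // Remainder: at most one leftover element, three more multiplies.
    if i < numbers.count {
        let a = numbers[i]
        x *= a.x
        x *= a.y
        x *= a.z
    }
    return x
}
```

The multiplications happen in the same order in both versions, so the results are identical; only the number of multiply instructions in the emitted code differs, not the number executed per element.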

@zoecarver (Collaborator, Author) commented Oct 22, 2019

Ah, you're right! That's probably not the slow part. I tried removing the conditional fails, which made it a little faster, but not by much. Any other ideas?
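One source-level way to experiment with this, assuming the conditional fails in question are array bounds checks (the thread does not say), is to iterate over the array's contiguous storage through `withUnsafeBufferPointer`. The function name `testUnchecked` is hypothetical, and this is a sketch for comparison rather than a recommended fix.

```swift
struct X {
    let x: Double = 2
    let y: Double = 2
    let z: Double = 2
}

// Hypothetical variant of `test` that walks the array's storage directly.
// In optimized builds this should avoid the array's own bounds checks.
func testUnchecked(numbers: [X]) -> Double {
    return numbers.withUnsafeBufferPointer { buffer -> Double in
        var x = 1.0
        for num in buffer {
            x *= num.x
            x *= num.y
            x *= num.z
        }
        return x
    }
}
```

Comparing the generated code for this variant against the original would show whether the checks account for any of the gap.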

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022