Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize access of array member in a structure. #13972

Merged
merged 1 commit into from May 12, 2023
Merged

Conversation

shwqf
Copy link
Contributor

@shwqf shwqf commented Dec 16, 2022

This change adds a simple pattern match for a short AIR sequence of 'struct_field_ptr -> load -> array_elem_val', and generates same LLVM IR as the sequence of 'struct_field_ptr -> ptr_elem_ptr -> load'.

Although this change will still generate CallInst of Memcpy, LLVM will successfully identify the CallInst as dead code and eliminate it, preventing unuseful memcpy of the array field.

Resolves #13938

test_io_uring_renameat_old Outdated Show resolved Hide resolved
@Vexu
Copy link
Member

Vexu commented Dec 16, 2022

It is good to fix things like this but doing it in llvm.zig means that it either has to be implemented for every backend or that it needs to be reimplemented in an earlier stage of the compiler.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 16, 2022

It is good to fix things like this but doing it in llvm.zig means that it either has to be implemented for every backend or that it needs to be reimplemented in an earlier stage of the compiler.

I am familiar with LLVM, so I changed llvm.zig first. I will try to optimize it in Sema.zig tomorrow.

@Vexu
Copy link
Member

Vexu commented Dec 16, 2022

I am familiar with LLVM, so I changed llvm.zig first. I will try to optimize it in Sema.zig tomorrow.

Note that it is not a blocker for this PR since the other backends (besides maybe the C backend) are not ready for use.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 16, 2022

I am familiar with LLVM, so I changed llvm.zig first. I will try to optimize it in Sema.zig tomorrow.

Note that it is not a blocker for this PR since the other backends (besides maybe the C backend) are not ready for use.

lol, it would be great if we have a LLVM-like pass manager for AIR

@IntegratedQuantum
Copy link
Contributor

Although this change will still generate CallInst of Memcpy, LLVM will successfully identify the CallInst as dead code and eliminate it

Does that also work in debug mode?
Because if it doesn't and the debug performance is still 10000 times worse, then this PR wouldn't help me(being the author of #13938) since I want to be able to test my program in debug mode.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 17, 2022

Although this change will still generate CallInst of Memcpy, LLVM will successfully identify the CallInst as dead code and eliminate it

Does that also work in debug mode? Because if it doesn't and the debug performance is still 10000 times worse, then this PR wouldn't help me(being the author of #13938) since I want to be able to test my program in debug mode.

Now it can

@matu3ba
Copy link
Contributor

matu3ba commented Dec 19, 2022

Can you rebase on top of master and run zig fmt? It fails:

-- Install configuration: "Debug"
+ echo Looking for non-conforming code formatting...
Looking for non-conforming code formatting...
+ stage3-debug/bin/zig fmt --check .. --exclude ../test/cases/ --exclude ../build-debug
../src/codegen/llvm.zig
Error: Process completed with exit code 1.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 20, 2022

Can you rebase on top of master and run zig fmt? It fails:

-- Install configuration: "Debug"
+ echo Looking for non-conforming code formatting...
Looking for non-conforming code formatting...
+ stage3-debug/bin/zig fmt --check .. --exclude ../test/cases/ --exclude ../build-debug
../src/codegen/llvm.zig
Error: Process completed with exit code 1.

fixed

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for sending this enhancement.

In the future, I want to implement #13265. This will make the LLVM backend no longer use the LLVM IR builder API and instead output bitcode directly. Under such circumstances, we will not have access to these C++ APIs, or this SSA structure that LLVM creates. This patch takes us further away from that goal, and so I request that the optimization be done without doing either of these things:

  • Introducing new C++ code
  • Depending on more of LLVM's API surface area

@shwqf
Copy link
Contributor Author

shwqf commented Dec 23, 2022

Thank you for sending this enhancement.

In the future, I want to implement #13265. This will make the LLVM backend no longer use the LLVM IR builder API and instead output bitcode directly. Under such circumstances, we will not have access to these C++ APIs, or this SSA structure that LLVM creates. This patch takes us further away from that goal, and so I request that the optimization be done without doing either of these things:

  • Introducing new C++ code
  • Depending on more of LLVM's API surface area

The new C++ code is for removing useless memcpy in debug mode, LLVM can optimize it automatically in release mode.

I think we can't avoid it because we need to track the number of users of the AllocaInst, to make sure that users don't intend to copy the field, that is why I call 'hasOneUser'.

I guess it is ok, if we want to generate .bc directly, we should implement use-def and def-use in air, this function is easy to reimplement.

@Vexu
Copy link
Member

Vexu commented Dec 23, 2022

  • Introducing new C++ code

  • Depending on more of LLVM's API surface area

One possible but slightly harder solution would be to implement this in AstGen.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 23, 2022

  • Introducing new C++ code
  • Depending on more of LLVM's API surface area

One possible but slightly harder solution would be to implement this in AstGen.

I tried but I found AST is untyped.

@Vexu
Copy link
Member

Vexu commented Dec 23, 2022

Types are not needed to fix this, field access can result in a pointer via field_ptr and array access can take a pointer with elem_ptr and then turn it into a value using load. The issue is that currently AstGen uses field_val and elem_val and then relies on LLVM to fix it.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 23, 2022

Types are not needed to fix this, field access can result in a pointer via field_ptr and array access can take a pointer with elem_ptr and then turn it into a value using load. The issue is that currently AstGen uses field_val and elem_val and then relies on LLVM to fix it.

How can we tell if a field is a slice or an array?

My first implementation is in AstGen as you suggest, statg3 cannot be compiled.

@Vexu
Copy link
Member

Vexu commented Dec 23, 2022

elem_ptr works for all indexable types including arrays and slices.

@shwqf
Copy link
Contributor Author

shwqf commented Dec 25, 2022

I got COVID in China, too tired to continue to finish AstGen version.

Some comptime_field_ptr problems appear, looks not naive.

I remove the redundant C++ code, now this PR only work in non-Debug mode.

@Vexu @matu3ba @andrewrk

@shwqf shwqf requested review from andrewrk and Vexu and removed request for matu3ba, andrewrk and Vexu December 25, 2022 04:38
@shwqf shwqf requested review from matu3ba, Vexu and andrewrk and removed request for andrewrk, matu3ba and Vexu December 25, 2022 04:38
@shwqf shwqf requested review from matu3ba and Vexu and removed request for andrewrk and matu3ba January 1, 2023 06:18
@Vexu Vexu enabled auto-merge (rebase) May 10, 2023 17:53
@matu3ba

This comment was marked as resolved.

@matu3ba
Copy link
Contributor

matu3ba commented May 11, 2023

Thanks, I think you need to apply zig fmt:

+ echo Looking for non-conforming code formatting...
+ stage3-debug/bin/zig fmt --check .. --exclude ../test/cases/ --exclude ../build-debug
Looking for non-conforming code formatting...
../lib/std/c/darwin.zig
Error: Process completed with exit code 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Another array access performance issue.
5 participants