More closely mimic the Clang compilation #5543

Draft: wants to merge 7 commits into base: trunk

ilya-biryukov
Contributor

ilya-biryukov commented May 27, 2025

Instead of using ASTUnit and tooling APIs, directly mimic what Clang does during the compilation.

Run the Clang frontend until the translation unit parsing is done, at which point switch back to Carbon to finish the Check phase and interface with Clang through ASTContext and Sema. In lower, finish the corresponding Clang compilation phase, i.e. CodeGen.

Clang does not have the corresponding APIs and instead provides a callback-based mechanism. To map this back to Carbon APIs, we create a separate thread that gives control back to Carbon through the callbacks, and then finishes the compilation when the Carbon code is done.

Although this is essentially a hack, it lets the change fit into the Carbon codebase quickly, and we should have the option of refactoring the Clang code upstream in LLVM if this approach proves fruitful.
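
To make the thread hand-off described above concrete, here is a minimal sketch of the general pattern. It is not the code in this PR. `RunCallbackFrontend`, `InFlightFrontend`, `Finish`, and `RunDemo` are hypothetical names invented for illustration, and the "frontend" is a stand-in function rather than Clang. The sketch assumes a frontend that only exposes a callback once parsing is done, and wraps it in two blocking calls using a worker thread and a condition variable.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a callback-driven frontend: it "parses", invokes
// the callback once parsing is done, and only then "generates code".
void RunCallbackFrontend(std::vector<std::string>& log,
                         const std::function<void()>& on_parsed) {
  log.push_back("frontend: parse");
  on_parsed();
  log.push_back("frontend: codegen");
}

// Hypothetical wrapper. The constructor returns once parsing is done, and
// Finish() lets the frontend run to completion.
class InFlightFrontend {
 public:
  explicit InFlightFrontend(std::vector<std::string>& log)
      : worker_([this, &log] {
          RunCallbackFrontend(log, [this] {
            std::unique_lock<std::mutex> lock(mutex_);
            state_ = State::Parsed;
            cv_.notify_all();
            // Park the frontend thread inside its own callback until the
            // caller is done with its part of the work.
            cv_.wait(lock, [this] { return state_ == State::Resume; });
          });
        }) {
    // Block the caller until the frontend reaches the callback.
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return state_ == State::Parsed; });
  }

  ~InFlightFrontend() {
    if (worker_.joinable()) {
      Finish();
    }
  }

  // Resumes the parked frontend thread and waits for it to finish.
  void Finish() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      assert(state_ == State::Parsed);
      state_ = State::Resume;
    }
    cv_.notify_all();
    worker_.join();
  }

 private:
  enum class State { Running, Parsed, Resume };

  std::mutex mutex_;
  std::condition_variable cv_;
  State state_ = State::Running;
  // Declared last so the members above are initialized before the thread
  // starts using them.
  std::thread worker_;
};

// Usage: the caller's work is interleaved between the two frontend phases.
std::vector<std::string> RunDemo() {
  std::vector<std::string> log;
  InFlightFrontend frontend(log);
  log.push_back("caller: check");
  frontend.Finish();
  return log;
}
```

Calling `RunDemo()` yields the entries `frontend: parse`, `caller: check`, `frontend: codegen`, in that order. Although two threads share `log`, only one of them runs at any given time and every hand-off goes through the mutex, so the shared state is never accessed concurrently in this sketch.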


@danakj
Contributor

danakj commented May 27, 2025

> Run the Clang frontend until the translation unit parsing is done, at which point switch back to Carbon to finish the Check face

Did you mean Check phase?

@ilya-biryukov
Contributor Author

This is very raw: tests fail, the approach to codegen needs to be updated, and the documentation and PR description need to be improved.

Still, I'm posting this to get early feedback on the feasibility of this approach, especially the dance with multiple threads.
Sharing the same CompilerInstance, AST, Sema, etc. between multiple threads is not unheard of, but, like other use cases (e.g. clangd), it's unusual in its own way. LLVM should be prepared to handle that, though. Even if there are a few global variables or thread locals left, they should be easy to fix.

bricknerb self-requested a review May 27, 2025 18:32
@ilya-biryukov
Contributor Author

> Did you mean Check phase?

🤦 I did, thanks for pointing that out.

@@ -24,6 +24,21 @@ cc_library(
     ],
 )

+cc_library(
+    name = "in_flight_clang",
+    hdrs = ["in_flight_clang.h"],

[diff] reported by reviewdog 🐶

Suggested change
-    hdrs = ["in_flight_clang.h"],

+    name = "in_flight_clang",
+    hdrs = ["in_flight_clang.h"],
+    srcs = ["in_flight_clang.cpp"],
+    deps = [

[diff] reported by reviewdog 🐶

Suggested change
-    deps = [
+    hdrs = ["in_flight_clang.h"],
+    deps = [

+    srcs = ["in_flight_clang.cpp"],
+    deps = [
+        "//common:check",
+        "@llvm-project//llvm:Support",

[diff] reported by reviewdog 🐶

Suggested change
-        "@llvm-project//llvm:Support",

+        "@llvm-project//clang:driver",
+        "@llvm-project//clang:frontend",
+        "@llvm-project//clang:frontend_tool",
+    ],

[diff] reported by reviewdog 🐶

Suggested change
-    ],
+        "@llvm-project//llvm:Support",
+    ],

@ilya-biryukov
Contributor Author

I've picked this up again and it's shaping up now; it should be close to something I'd like to land.
In particular, we now use the code generator from Clang rather than creating our own, and we take the llvm::Module that Clang produces.

I want to do one last polish of the code before sending this out for review, but it's now roughly what I want it to be in terms of behavior. One last bit that I want to do a little later is to get rid of the use of CodeGenerator for marking which functions are used and instead go through Clang frontend interfaces. But since it's pretty independent, I am thinking of doing it as a follow-up.
