Using --no-codegen blocks --emit options #5820

Open
bew opened this issue Mar 13, 2018 · 11 comments

@bew
Contributor

bew commented Mar 13, 2018

When we run crystal build hello.cr --emit llvm-ir --no-codegen, the hello.ll file is not generated even though it was asked for.

This is because all the emit options are handled in one place, in the CompilationUnit#emit method:

    def emit(value : String, output_filename)
      case value
      when "asm"
        compiler.target_machine.emit_asm_to_file llvm_mod, "#{output_filename}.s"
      when "llvm-bc"
        FileUtils.cp(bc_name, "#{output_filename}.bc")
      when "llvm-ir"
        llvm_mod.print_to_file "#{output_filename}.ll"
      when "obj"
        FileUtils.cp(object_name, "#{output_filename}.o")
      end
    end

Which is called in the codegen block:

    @progress_tracker.stage("Codegen (bc+obj)") do
      @progress_tracker.stage_progress_total = units.size
      if units.size == 1
        first_unit = units.first
        first_unit.compile
        reused << first_unit.name if first_unit.reused_previous_compilation?
        if emit = @emit
          first_unit.emit(emit, emit_base_filename || output_filename)
        end
      else
        reused = codegen_many_units(program, units, target_triple)
      end
    end

At line 348.

I think it should emit each requested artifact as soon as it becomes available: the llvm-ir right after the IR is built, the obj in the codegen phase, etc.

@bew
Contributor Author

bew commented Apr 23, 2018

This also means that there is no way to dump the LLVM IR when we hit a "module validation failed" error that we would like to debug.

@straight-shoota
Member

In order to debug #5972 I had to modify the compiler manually to disable module validation. So +1 to that!

@bew
Contributor Author

bew commented Apr 23, 2018

While looking into that, I noticed that all the codegen phases (mainly Crystal to LLVM IR, and LLVM IR to BC+OBJ) are done in the codegen method.
I suggest separating the Crystal to LLVM IR codegen phase from the binary codegen (LLVM IR to BC+OBJ), maybe naming it IR codegen?

Also, the current emit options fall into 2 categories:

  • after llvm ir generation: for llvm-ir (should be processed even on --no-codegen)
  • after binary codegen: for asm, llvm-bc, obj (not processed on --no-codegen)

Those are currently all handled and processed after binary codegen. With the IR codegen separated, we could handle the different emit options at different times.
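
The two categories above could be sketched roughly as follows. This is only an illustration under assumptions: EMIT_STAGES, emit_targets_for and the stage symbols :ir_codegen / :binary_codegen are hypothetical names that do not exist in the compiler. Only the target names ("asm", "llvm-bc", "llvm-ir", "obj") come from CompilationUnit#emit.

```crystal
# Hypothetical sketch: the stage after which each --emit target could be
# written, following the two categories listed above. The stage symbols
# are assumed names, not compiler API.
EMIT_STAGES = {
  "llvm-ir" => :ir_codegen,     # written from the in-memory LLVM module
  "asm"     => :binary_codegen, # emitted through the target machine
  "llvm-bc" => :binary_codegen, # copied from the generated .bc file
  "obj"     => :binary_codegen, # copied from the generated .o file
}

# Returns the requested targets that could be written once `stage` is done.
# Unknown targets are ignored here; real code would probably reject them.
def emit_targets_for(stage, requested)
  requested.select { |target| EMIT_STAGES.fetch(target, nil) == stage }
end
```

With something like this, each stage would call emit_targets_for with its own symbol right after finishing, so an llvm-ir request would not have to wait for (or depend on) binary codegen.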

WDYT?

@bew
Contributor Author

bew commented Apr 23, 2018

Also, should --no-codegen then disable all codegen, or only binary codegen? And how would that be configured?

Maybe when --emit llvm-ir is given, it would only disable binary codegen (IR codegen would still run, so that the IR can be dumped), and when it is not given, all codegen would be disabled.
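
As a minimal sketch of that rule, assuming the IR and binary stages were already split: codegen_mode and the symbols :full, :ir_only and :none are made-up names for illustration, not anything the compiler defines.

```crystal
# Hypothetical policy: which codegen stages run for a flag combination.
# `no_codegen` mirrors --no-codegen; `emit` is the --emit value, or nil
# when --emit was not given.
def codegen_mode(no_codegen, emit)
  if !no_codegen
    :full    # IR codegen and binary codegen both run
  elsif emit == "llvm-ir"
    :ir_only # build the IR so it can be dumped, skip binary codegen
  else
    :none    # skip every codegen stage
  end
end
```

Note that under this rule --no-codegen combined with --emit obj (or asm, llvm-bc) still produces nothing, since those targets need binary codegen.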

@RX14 modified the milestone: Next (Apr 25, 2018)

@asterite
Member

What's so bad about not using --no-codegen when using --emit? No codegen means the codegen phase doesn't run, and this trivially means no artifacts (.ll, .s, .o, final executable) will be produced.

@straight-shoota
Member

straight-shoota commented Apr 29, 2018

@asterite True, these are all different stages of codegen. But what do you do if you want the .ll (or .s) but not .o?

--no-codegen --emit llvm-ir makes sense for this: It essentially says "skip codegen, but emit LLVM-IR" which can be interpreted as "do codegen but abort after LLVM-IR is dumped". There is no other logical interpretation of the combination of these two flags other than ignoring or failing when --emit is presented together with --no-codegen. But if it can be expressed that way, why shouldn't it be usable?

@asterite
Member

> True, these are all different stages of codegen. But what do you do if you want the .ll (or .s) but not .o?

You just wait a little more? :-)

> It essentially says "skip codegen, but emit LLVM-IR" which can be interpreted as "do codegen but abort after LLVM-IR is dumped".

There's the confusion. Codegen means "run code to create the LLVM in memory". Emit comes after that. Of course we could change that, but I don't see the point. Just wait a few more seconds and you'll have it.

@asterite
Member

Or, well, put another way: if someone wants to implement it, please send a PR (I won't).

@bew
Contributor Author

bew commented Jun 3, 2018

Also, there could be a --no-binary-codegen in addition to --no-codegen, where the latter disables all codegen (IR and binary), and the former disables the binary codegen only, but allows the IR codegen to be done.

I think it allows more control, and removes weird edge cases like "always do IR codegen when --emit is llvm-ir even on --no-codegen".
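
A rough sketch of how the two flags could interact with --emit. Keep in mind that --no-binary-codegen is only the flag proposed in this comment and does not exist in the compiler, and emit_possible? is a hypothetical helper name.

```crystal
# Hypothetical check: can `target` be emitted under the given flags?
# `no_codegen` mirrors --no-codegen (skip IR and binary codegen);
# `no_binary_codegen` mirrors the proposed --no-binary-codegen.
def emit_possible?(target, no_codegen, no_binary_codegen)
  # --no-codegen disables everything, so nothing can be emitted.
  return false if no_codegen
  # The IR only needs the IR codegen stage, which still runs.
  return true if target == "llvm-ir"
  # asm, llvm-bc and obj need the binary codegen stage.
  !no_binary_codegen
end
```

With this split, --no-codegen keeps its current meaning (no artifacts at all), and --emit llvm-ir --no-binary-codegen expresses "dump the IR and stop" without a special case.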

@HertzDevil
Contributor

For the Compiler Explorer it would be very useful if the assembly and the IR can both be generated without actually compiling / linking that into a binary or .o file. One reason is that link failures should be tolerated there unless the code is ultimately executed as requested by the user.

@mattrberry
Contributor

Agreed with HertzDevil above. It would be useful to be able to emit and inspect the IR without needing to link.
