Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Compressed Instructions] Support compressed instructions for RVC #7

Merged
merged 4 commits into from Sep 22, 2021

Conversation

zhengxiaolinX
Copy link
Collaborator

Hi team,

Could I have a review of this patch of compressed instructions support based on current implementation? Thanks in advance.

This patch can introduce:

  • 4.4% performance up on average, evaluated on SPECjbb2005 and SPECjbb2015
  • 21% code size reduction in template interpreter generated code
  • 20%~25% code size reduction in C1 generated code, evaluated by a common SpringBoot program
  • 15%~20% code size reduction in C2 generated code, evaluated by a common SpringBoot program

Having passed related tests based on the current code base.

There are things about this patch:

  1. It is an implicit phase to scan all assembly instructions generated by Assembler::emit() and convert them as they can into compressed instructions - it should be because it is somewhat C-Ext's semantics and we cannot change instructions written by programmers explicitly one by one.
  2. About the _nc postfix of some of Assembler instructions: we know a bunch of places should be reserved for patching, where we cannot change them into compressed instructions. _nc is short for not compressed - with this, that instruction should keep its origin 4-byte form and remain uncompressed.
  3. About whether a machine supports c-ext and autodetection: please see this. It seems that we need to use a compressed instruction to test if it supports C-Ext or not. This part of the logic is what I cannot test because: we do not have a machine that does not support C-Ext; Qemu seems not able to easily turn off C-Ext as I see, and Spike behaves a bit weird for it will turn to a dead loop or something if you write a compressed instruction but run the simulator without C-Ext. I only tested it with a SIGILL to protect its function.
  4. There are things not easy to compress like MachBranchNodes. Please see the comments in my code - but it seems no potential benefits for compressing these MachBranchNodes after we have done some work for this so we can directly disable compression of these instructions until we think of a better plan. We think it is not trivial to support this.

This patch may not be a small one so it may take a while to get merged or something - but I was kind of hoping this could be done before the next merge, which contains the biased lock removal and stuff. Hope everything safe.

Thanks again for your great work.
Xiaolin

@zhengxiaolinX
Copy link
Collaborator Author

Fixed conflicts with #6 .

@zhengxiaolinX
Copy link
Collaborator Author

Rebased and fixed conflicts with #14 and #15.

Copy link
Collaborator

@yhzhu20 yhzhu20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello,
I have some questions about this PR:

  1. Performence of Assembler::emit(): this function is called when each intruction is generated, , and instructions like "condition branch" are excuted frequently, but only beqz/bnez have corresponding compressed c.beqz/c.bnez, maybe part of the call is not needed .
  2. The default value of argument compressed is confusing, sometime is true, sometime false.
  3. Naming of compresseed instructions c_xx and xx_nc is confusing, too .
  4. Duplicate code in Assembler::emit_compressed_ld_st.
  5. It seems no need to modify instruction_size -> normale_instruction_size.

Thanks,
Yanhong

@zhengxiaolinX
Copy link
Collaborator Author

zhengxiaolinX commented Sep 8, 2021

Hello,
I have some questions about this PR:

  1. Performence of Assembler::emit(): this function is called when each intruction is generated, , and instructions like "condition branch" are excuted frequently, but only beqz/bnez have corresponding compressed c.beqz/c.bnez, maybe part of the call is not needed .
  2. The default value of argument compressed is confusing, sometime is true, sometime false.
  3. Naming of compresseed instructions c_xx and xx_nc is confusing, too .
  4. Duplicate code in Assembler::emit_compressed_ld_st.
  5. It seems no need to modify instruction_size -> normale_instruction_size.

Thanks,
Yanhong

Hi Yanhong,

Thanks for your reviews. Your points are undeniable. It is hard to make it an implicit phase base on the currently existing system. Code style is an annoying thing because I have to either hook instructions or fork a piece of code slice but only change one or two of them. Both are tiresome and I chose the former one. Also, especially because of the MachBranchNode problem, branching instructions like beq are handled differently from other instructions, hence causing some confusion.
I do not mind changing this patch so any suggestion is welcome. About the questions:

  1. Would you mind my reverting Assembler::emit() to the previous one, with adding a function like Assembler::emit_may_compress() to only transform instructions that are able to compress (explicitly)? This could solve this, yielding better performance. But as a trade-off, lots of macro definitions like INSN(add, 0b0110011, 0b000, 0b0000000); may become INSN(add, 0b0110011, 0b000, 0b0000000, false); like things because I need to explicitly determine which one is needed to transform. Also, branch instructions like beqz are treated separately, I will consider a way to modify them.
  2. It is because of the MachBranchNode problem. They cannot be easily treated and currently, we cannot compress them. In this case, the compressed is false. It is not easy to deal with because static conditional_branch_insn conditional_branches[] is a function pointer set. They need to have the same method signature. Let me have a try to modify it.
  3. c_xx are of course compressed instructions - but _nc means we cannot compress them. I am okay with any suggestions of naming if you have.
  4. I will see if I can have a refactoring of them.
  5. After introducing C-Ext, I think the instruction_size's semantics are changed. I will change it back if you insist - it is easy to change.

Thanks again for your reviews - I know it is not easy to review this patch because it may break the current code style.

Regards,
Xiaolin

@zhengxiaolinX
Copy link
Collaborator Author

zhengxiaolinX commented Sep 8, 2021

Hello,
I have some questions about this PR:

  1. Performence of Assembler::emit(): this function is called when each intruction is generated, , and instructions like "condition branch" are excuted frequently, but only beqz/bnez have corresponding compressed c.beqz/c.bnez, maybe part of the call is not needed .
  2. The default value of argument compressed is confusing, sometime is true, sometime false.
  3. Naming of compresseed instructions c_xx and xx_nc is confusing, too .
  4. Duplicate code in Assembler::emit_compressed_ld_st.
  5. It seems no need to modify instruction_size -> normale_instruction_size.

Thanks,
Yanhong

Hi Yanhong,

I have force-pushed my patch.

About the questions:

  1. changed the code as the proposal. No such performance problem such as blt, bgt etc. again.
  2. changed the code as the proposal. There are no different styles again.
  3. this remains unchanged now. Maybe we need one suggestion.
  4. done.
  5. changed back to the previous state.

The code style seems better. Thanks for your advice.

Would you mind having another review when available? Thanks.

Regards,
Xiaolin

…ith relocations (e.g. `INSN_ENTRY_RELOC`) by forcing to use 4-byte instructions to prevent protential problems
@zhengxiaolinX
Copy link
Collaborator Author

Testing on full tiers and no more personal changes for this patch.

Thanks,
Xiaolin

Copy link
Collaborator

@kuaiwei kuaiwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@zhengxiaolinX
Copy link
Collaborator Author

Gentle ping.

Could I get another review? Tests have passed. I can get to merge with JDK17 with this patch entering master.

Copy link
Collaborator

@yadongw yadongw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants