Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8278322: riscv: Support RVC: compressed instructions #24

Closed
wants to merge 1 commit into from

Conversation

zhengxiaolinX
Copy link
Collaborator

@zhengxiaolinX zhengxiaolinX commented Dec 7, 2021

Hi team,

This patch support RISC-V RVC extension. It can introduce:

  • 21% code size reduction in template interpreter generated code
  • 20%~25% code size reduction in C1 generated code, evaluated by a common SpringBoot program
  • 15%~20% code size reduction in C2 generated code, evaluated by a common SpringBoot program

In my observation, the code size footprint could be reduced to nearly a level of the AArch64 back-end. About the performance, there seems a stable ~0.7% performance improvement on SPECjbb2015 on one HiFive Unleashed board, considering the code density increase. I think the performance aspect might be a speculative behavior on different hardware implementations because C910's performance might be better than that, but HiFive Unleashed may be more general.

Things about this patch:

  • If an instruction is compressible, then we will implicitly emit a 16-bit compressed instruction instead of the 32-bit instruction in Assembler.
  • About the _nc postfix of some of Assembler instructions: we know a bunch of places should be reserved for patching, where we cannot change them into compressed instructions. _nc is short for not compressible - with this, those instructions should keep their origin 4-byte form and remain uncompressed.
  • There are things not easy to compress like MachBranchNodes. Please see the comments in the code - currently this patch does not support this. We will support this improvement in patches coming afterward.

The macros after their expansion might be like:

void andr(Register Rd, Register Rs1, Register Rs2) {
+  {
+    Register src = noreg;
+    if (UseRVC && Rs1->is_compressed_valid() && Rs2->is_compressed_valid() &&
+        ((src = Rs1, Rs2 == Rd) || (src = Rs2, Rs1 == Rd))) {
+      and_c(Rd, src);
+      return;
+    }
+  }
  unsigned insn = 0;
  patch((address)&insn, 6, 0, 0b0110011);
  patch((address)&insn, 14, 12, 0b111);
  patch((address)&insn, 31, 25, 0b0000000);
  patch_reg((address)&insn, 7, Rd);
  patch_reg((address)&insn, 15, Rs1);
  patch_reg((address)&insn, 20, Rs2);
  emit(insn);
};

For further information, please see comments in assembler_riscv_cext.hpp.

--

This patch may need some time to acquire a review. I have been polishing this patch for quite a long time and it might seem stable under a bunch of full-tier tests and some tier1 jdk/hotspot jtreg tests. But I think we might mark it experimental at first, though it is turned on by default. The original patch, just for reference, might be quite different from the current one.

I am pleased to receive any suggestion.

Thanks,
Xiaolin


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8278322: riscv: Support RVC: compressed instructions

Contributors

  • Wei Kuai <kuaiwei.kw@alibaba-inc.com>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/riscv-port pull/24/head:pull/24
$ git checkout pull/24

Update a local copy of the PR:
$ git checkout pull/24
$ git pull https://git.openjdk.java.net/riscv-port pull/24/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 24

View PR using the GUI difftool:
$ git pr show -t 24

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/riscv-port/pull/24.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 7, 2021

👋 Welcome back xlinzheng! A progress list of the required criteria for merging this PR into riscv-port will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 7, 2021
@zhengxiaolinX
Copy link
Collaborator Author

I would humbly recommend adding my colleague Wei Kuai(kuaiwei.kw@alibaba-inc.com) as a co-author of this patch, especially including facilitating me to troubleshoot problems that are not easy to address, out of his insightfulness, and also the support of the compression of MachBranchNodes coming afterward, etc. He has done quite a bunch of contributions to this patch.

/contributor add Wei Kuai kuaiwei.kw@alibaba-inc.com

@openjdk
Copy link

openjdk bot commented Dec 7, 2021

@zhengxiaolinX
Contributor Wei Kuai <kuaiwei.kw@alibaba-inc.com> successfully added.

@mlbridge
Copy link

mlbridge bot commented Dec 7, 2021

Webrevs

@yadongw
Copy link
Collaborator

yadongw commented Dec 7, 2021

About the _nc postfix of some of Assembler instructions: we know a bunch of places should be reserved for patching, where we cannot change them into compressed instructions. _nc is short for not compressible - with this, those instructions should keep their origin 4-byte form and remain uncompressed.

Why the compressed instructions can not be patched?

@zhengxiaolinX
Copy link
Collaborator Author

About the _nc postfix of some of Assembler instructions: we know a bunch of places should be reserved for patching, where we cannot change them into compressed instructions. _nc is short for not compressible - with this, those instructions should keep their origin 4-byte form and remain uncompressed.

Why the compressed instructions can not be patched?

Thank you for your reviews, yadong. For example, if we need to relocate oops to another place, we may emit a movptr, whose length is fixed, including lui+addi+slli+addi+slli+addi. For instructions like addis, if we compress them to a 16-byte form, we will face troubles when we want to relocate them to another place since the new address may not be presented by simple c.addis -- because c.addi could only present [-32, +31] but addi could present [-2048, +2047]. With this respect, it may be a good choice to remain their original forms for the patchable places, though quite conservative.

@RealFYang
Copy link
Member

Well, I don't think it's a good idea for assembler to emit 16-bit compressed instructions by default.
It's quite error prone and will make the code base hard to maintain.
I would suggest the opposite way, i.e., default to 32-bit normal instructions and use compressed instructions when it is absolutly safe.
I see some places where we could make use of compressed instructions.
One scenario would be C2 architecture description file. (C1 should be similar in theory)

Let's say,
6305 instruct addI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{
6306 match(Set dst (AddI src1 src2));
6307
6308 ins_cost(ALU_COST);
6309 format %{ "addw $dst, $src1, $src2\t#@addI_reg_reg" %}
6310
6311 ins_encode %{
6312 __ addw(as_Register($dst$$reg),
6313 as_Register($src1$$reg),
6314 as_Register($src2$$reg));
6315 %}
6316
6317 ins_pipe(ialu_reg_reg);
6318 %}

In this case, I think it should be safe to emit a compressed C.ADD when possible here for this instruct.
Then I think the "addw" assembler function would need one extra argument, say "compress", defaulting to false.
Then for the above instruct, we would call "addw" passing UseRVC as the "compress" argument.
We can then carry out the work incrementally if we go this way and this thus will also make code review easier.
I agree that we will miss some opportunities compared with the current solution, but that shouldn't be a big issue for now.

@yadongw
Copy link
Collaborator

yadongw commented Dec 8, 2021

About the _nc postfix of some of Assembler instructions: we know a bunch of places should be reserved for patching, where we cannot change them into compressed instructions. _nc is short for not compressible - with this, those instructions should keep their origin 4-byte form and remain uncompressed.

Why the compressed instructions can not be patched?

Thank you for your reviews, yadong. For example, if we need to relocate oops to another place, we may emit a movptr, whose length is fixed, including lui+addi+slli+addi+slli+addi. For instructions like addis, if we compress them to a 16-byte form, we will face troubles when we want to relocate them to another place since the new address may not be presented by simple c.addis -- because c.addi could only present [-32, +31] but addi could present [-2048, +2047]. With this respect, it may be a good choice to remain their original forms for the patchable places, though quite conservative.

The solution by this PR is hooking the assembler and making all instructions as compressed as possible except 'patchable' instructions, because we must make enough room of immutable constants embedded. Let us call it a "block" list solution. I don't think there's a proper way to ensure the list is complete, and if incomplete, the result may be a wierd exception/crash or unexpected behavior, and it will be difficult to review and debug.
How about we make a "white" list to compressed instruction generation. You can make a compressed region where you think it's safe to do so,for example:

{
  CompressRegion cr;
  __ addi(rd, rd, 1);
}

@zhengxiaolinX
Copy link
Collaborator Author

zhengxiaolinX commented Dec 9, 2021

Thanks for your reviews, Felix and Yadong, and also thanks for your advice -- also quite sorry for the late reply.

I think I totally understand your concerns.

Firstly please let me talk about my opinion about this.
Most compressible instructions could be safely compressed, say, ~25% of total RISCV instructions, except only several places like

  1. patchable code, for instance: la_patchable related code;
  2. and several places requiring a fixed length like BoxLockNode.
    We have found these places, excluded them in this patch, and wrapped them for developers into ld_patchable, jalr_patchable in this patch. If you feel unsafe for misuse from developers, maybe we can further carry out some strong guarantees to prevent this from happening. We can disable UseRVC by default for now, the jvm shoud be same behavior as before. And after enough testing, we can have more confidence to enable it.

I think your suggestions are indeed reasonable in your aspects, for safety.


About our first scheme:

We may mark some instructions as compressible by adding function arguments. But compressed instructions can be used in almost every place in the code cache, like C1, C2, interpreter, and stub code. The range is quite big -- for instance, after some time we may want to compress some intrinsics, say, C2_MacroAssembler::string_indexof_char_short().

After addi(Rd, Rs, imm) becomes addi(Rd, Rs, imm, bool compressible), it will be

bind(MATCH1);
addi(index, index, 1);   ->    addi(index, index, 1, true);
j(MATCH);                ->    j(MATCH, true);

bind(MATCH2);
addi(index, index, 2);   ->    addi(index, index, 2, true);
j(MATCH);                ->    j(MATCH, true)

bind(MATCH3);
addi(index, index, 3);   ->   ...
j(MATCH);

bind(MATCH4);
addi(index, index, 4);
j(MATCH);

bind(MATCH5);
addi(index, index, 5);
j(MATCH);

We might not add extra explicit arguments to them since this may make maintenance quite hard. :-)


About our second scheme:

Considering the CompressibleRegion solution, this could indeed, largely expand the range of code included. But we may face some situations:

As code function and features begin to complete, we may continue to make code compress further denser afterward. If we want to use this region to include a quite large range of function, of which we think is safe to compress.

#define __ _masm->
intrinsic A()
{
	CompressibleRegion cr;                 // The CompressibleRegion
	
	... balabala ...
	... balabala ...
	__ macroassembler_function();          // Here may include a tiny 'la_patchable'
	... balabala ...
	... balabala ...
}

Inside the macroassembler_function(), a la_patchable might be hidden somewhere...
What should we do? It seems that we need another UncompressibleRegion ur to exclude such a smaller region.

#define __ _masm->
intrinsic A()
{
	CompressibleRegion cr;
	
	... balabala ...
	... balabala ...
	{
	    UncompressibleRegion ur;             // Add one UncompressibleRegion
            __ macroassembler_function();
	}
	... balabala ...
	... balabala ...
}

As the inherent laws of things, the number of CompressibleRegion will inevitably become more and more; the same pattern emerges one by one; hence, the CompressibleRegion may fly to every part of the world; hmm, we may think naturally: why don't we remove the CompressibleRegion and remain the UncompressibleRegion? Because it seems most of the code could get compressed, and only several places could not be compressed. Why not maintain a blacklist for them? ...

So comes

#define __ _masm->
intrinsic A()
{
 	... balabala ... (automatically compressed)
	... balabala ... (automatically compressed)
	{
		__ macroassembler_function_nc();               // nc things :-)
	}
	... balabala ... (automatically compressed)
	... balabala ... (automatically compressed)
}

I think finally it may become the solution of this patch (i.e. the blacklist)... :-)

I hope this analysis is reasonable and it could bring you some smiles.

In that the blacklist must be the core of exclusions, I might recommend focusing on the maintenance of the blacklist, because anyway, we might face the blacklist in the future. To relieve your concerns, maybe making UseRVC false as default for now is a better plan because it is an one-click switch.

But, to step back, I totally understand your points as well. If you still consider the whitelist scheme might be a good scheme for you, I will choose to use your scheme for now, and add the CompressibleRegion and UncompressibleRegion gradually. But I assume maybe turning UseRVC as false could better solve this?


About splitting the patch:

Let's come back to our topic. I have considered from your angle and I begin to realize this design might be quite counterintuitive for guys who always use the normal 4-byte instructions to work; and also because of this 'blacklist' mode, which may cause a burden for reviewers to review. But I believe the primary part (the definitions of C-Ext, like c.j/c.beqz/c.add... stuff) could be trivial and this could be separated as a primitive patch -- it may relieve the burden for the reviewers.


About an ideal but specious solution:

In fact, I think the ideal solution might be to use another brand-new pass to handle this compression logic, at the end of C1's Compilation::emit_code_body, C2's PhaseOutput, and other BufferBlobs' end of emission, to compress instructions in batches after all instructions have emitted to CodeBlobs. I think it will prevent all the above-mentioned issues: no hard-to-understand _nc things, no black and white lists, and programmers have no need to care about whether there is C-Ext or not. But it might be very difficult and specious:

  • First, we need to re-do all relocations, check all paddings like CodeEntryAlignment, OptoLoopAlignment... also self-written __ align()s. If developers want to add ten nops to align addresses, I think we lack knowledge about these things just before code installation and make the code not 'what they see is what they get'.
  • Second, we need to re-do all OopMaps and other related stuff. These delicate data structures are precisely calculated by C2's BuildOopMaps(). If we insert a phase at the end of all phases to compress code, it will be no less than doing these stuff once again. Also, for modifying those maps and debuginfos, maybe we need to add APIs to change the data inside them. This will further cause invasions to the public codebase just because RISC-V has this extension -- maybe other guys will not allow us to do such a big change. Besides, I speculate that it should be limited to our back-end only.

I consider this strategy has other limits except for those above-mentioned ones. So I did not choose this plan.


It might be so energy-consuming for me to write so many words in English to explain my reasoning and write analysis in detail. If there are some typos or errors, please don't mind. I wonder if is there a need to arrange a small meeting or not -- we may need to split the patch, discuss friendly about solutions and precautions, and gradually reach a consensus. I think for this patch, we might finally reach that place.

Thanks again for your reviews and advice.

Sincerely,
Xiaolin

@RealFYang
Copy link
Member

RealFYang commented Dec 9, 2021

About our first scheme:

We may mark some instructions as compressible by adding function arguments. But compressed instructions can be used in almost every place in the code cache, like C1, C2, interpreter, and stub code. The range is quite big -- for instance, after some time we may want to compress some intrinsics, say, C2_MacroAssembler::string_indexof_char_short().

After addi(Rd, Rs, imm) becomes addi(Rd, Rs, imm, bool compressible), it will be

bind(MATCH1);
addi(index, index, 1);   ->    addi(index, index, 1, true);
j(MATCH);                ->    j(MATCH, true);

bind(MATCH2);
addi(index, index, 2);   ->    addi(index, index, 2, true);
j(MATCH);                ->    j(MATCH, true)

bind(MATCH3);
addi(index, index, 3);   ->   ...
j(MATCH);

bind(MATCH4);
addi(index, index, 4);
j(MATCH);

bind(MATCH5);
addi(index, index, 5);
j(MATCH);

That's exactly what I mean :-)
And this could be carried out incrementally, module by module.

We might not add extra explicit arguments to them since this may make maintenance quite hard. :-)

No. Passing one extra argument here won't affect future code maintainance in anyway here.
In fact, I think this will make reviewer/maintainer's life easier since this makes it explict the places where it's safe to do compress.

@zhengxiaolinX
Copy link
Collaborator Author

About our first scheme:

We may mark some instructions as compressible by adding function arguments. But compressed instructions can be used in almost every place in the code cache, like C1, C2, interpreter, and stub code. The range is quite big -- for instance, after some time we may want to compress some intrinsics, say, C2_MacroAssembler::string_indexof_char_short().
After addi(Rd, Rs, imm) becomes addi(Rd, Rs, imm, bool compressible), it will be

bind(MATCH1);
addi(index, index, 1);   ->    addi(index, index, 1, true);
j(MATCH);                ->    j(MATCH, true);

bind(MATCH2);
addi(index, index, 2);   ->    addi(index, index, 2, true);
j(MATCH);                ->    j(MATCH, true)

bind(MATCH3);
addi(index, index, 3);   ->   ...
j(MATCH);

bind(MATCH4);
addi(index, index, 4);
j(MATCH);

bind(MATCH5);
addi(index, index, 5);
j(MATCH);

That's exactly what I mean :-) And this could be carried out incrementally, module by module.

We might not add extra explicit arguments to them since this may make maintenance quite hard. :-)

No. Passing one extra argument here won't affect future code maintainance in anyway here. In fact, I think this will make reviewer/maintainer's life easier since this makes it explict the places where it's safe to do compress.

Thanks for your quick response, Felix. In fact, I could not realize the philosophy inside this strategy -- so just one double-check: after this, if we define #define COMPRESSIBLE true here, I think nearly all the RISC-V backend will be full of inst(Rd, Rs, ..., COMPRESSIBLE) everywhere, which means the whole backend's code will be fully coupling with a single RISC-V extension. Hmm... it seems quite unacceptable in my opinion -- could you please simply explain it further for me?

@RealFYang
Copy link
Member

About our first scheme:

We may mark some instructions as compressible by adding function arguments. But compressed instructions can be used in almost every place in the code cache, like C1, C2, interpreter, and stub code. The range is quite big -- for instance, after some time we may want to compress some intrinsics, say, C2_MacroAssembler::string_indexof_char_short().
After addi(Rd, Rs, imm) becomes addi(Rd, Rs, imm, bool compressible), it will be

bind(MATCH1);
addi(index, index, 1);   ->    addi(index, index, 1, true);
j(MATCH);                ->    j(MATCH, true);

bind(MATCH2);
addi(index, index, 2);   ->    addi(index, index, 2, true);
j(MATCH);                ->    j(MATCH, true)

bind(MATCH3);
addi(index, index, 3);   ->   ...
j(MATCH);

bind(MATCH4);
addi(index, index, 4);
j(MATCH);

bind(MATCH5);
addi(index, index, 5);
j(MATCH);

That's exactly what I mean :-) And this could be carried out incrementally, module by module.

We might not add extra explicit arguments to them since this may make maintenance quite hard. :-)

No. Passing one extra argument here won't affect future code maintainance in anyway here. In fact, I think this will make reviewer/maintainer's life easier since this makes it explict the places where it's safe to do compress.

Thanks for your quick response, Felix. In fact, I could not realize the philosophy inside this strategy -- so just one double-check: after this, if we define #define COMPRESSIBLE true here, I think nearly all the RISC-V backend will be full of inst(Rd, Rs, ..., COMPRESSIBLE) everywhere, which means the whole backend's code will be fully coupling with a single RISC-V extension. Hmm... it seems quite unacceptable in my opinion -- could you please simply explain it further for me?

In fact, I am not expecting we catch all the possible places in the backend converting to compressed instructions.
I guess maybe the best places to do this would be C1 and C2.
It doesn't make much sense to me to do this kind of thing for code stubs, template interpreter, runtime, etc.
We might churn too much without little gain.
Code size might be an issue for 32-bit RV32G embedded systems, but I feel it's less likely an big issue for 64-bit systems like RV64G. I even doubt future chip vendors would ship this RVC feature for RV64G systems.

@RealFYang
Copy link
Member

Based on my last comment, I think we can then further consider the "CompressibleRegion" solution suggested by Yadong.
That way we will avoid adding the COMPRESSIBLE argument in many places. Just leave aside the uncertain places.

@zhengxiaolinX
Copy link
Collaborator Author

Thank you for your explanation, Felix. :-) Seems we finally reach a consensus now.

I guess maybe the best places to do this would be C1 and C2.

I think I could get your meaning this time.

I think we can then further consider the "CompressibleRegion" solution suggested by Yadong.

Yadong's plan works for me. I am quite in favor of his strategy.

I will implement this approach ASAP. Maybe next week I might open another PR including this implementation and the definition of compressed instructions presented in the assembler_riscv_cext.hpp part for a better review. I wonder if this is okay for you.

I am going to close this PR, save it for future reference, and may open another PR and start from scratch. Before that, just one more question: would you mind my keeping definitions of C-Ext instructions into one separate file like assembler_riscv_cext.hpp, as the current implementation? I think we might better keep different RISCV extensions into different files.

@RealFYang
Copy link
Member

Thank you for your explanation, Felix. :-) Seems we finally reach a consensus now.

I guess maybe the best places to do this would be C1 and C2.

I think I could get your meaning this time.

I think we can then further consider the "CompressibleRegion" solution suggested by Yadong.

Yadong's plan works for me. I am quite in favor of his strategy.

I am happy that we are agreed on this solution ;-)
We need to be safe at this very early stage, so I would suggest UseRVC disabled by default like UseRVV even if the hardware RVC extension is there. I hope you could understand that.
In the future, we will consider auto-enable UseRVC and UseRVV after enough testing and evaluation.

I will implement this approach ASAP. Maybe next week I might open another PR including this implementation and the definition of compressed instructions presented in the assembler_riscv_cext.hpp part for a better review. I wonder if this is okay for you.

I am going to close this PR, save it for future reference, and may open another PR and start from scratch. Before that, just one more question: would you mind my keeping definitions of C-Ext instructions into one separate file like assembler_riscv_cext.hpp, as the current implementation? I think we might better keep different RISCV extensions into different files.

Yes, I think assembler_riscv_c.hpp will do here.

@zhengxiaolinX
Copy link
Collaborator Author

Thank you for your explanation, Felix. :-) Seems we finally reach a consensus now.

I guess maybe the best places to do this would be C1 and C2.

I think I could get your meaning this time.

I think we can then further consider the "CompressibleRegion" solution suggested by Yadong.

Yadong's plan works for me. I am quite in favor of his strategy.

I am happy that we are agreed on this solution ;-) We need to be safe at this very early stage, so I would suggest UseRVC disabled by default like UseRVV even if the hardware RVC extension is there. I hope you could understand that. In the future, we will consider auto-enable UseRVC and UseRVV after enough testing and evaluation.

I will implement this approach ASAP. Maybe next week I might open another PR including this implementation and the definition of compressed instructions presented in the assembler_riscv_cext.hpp part for a better review. I wonder if this is okay for you.
I am going to close this PR, save it for future reference, and may open another PR and start from scratch. Before that, just one more question: would you mind my keeping definitions of C-Ext instructions into one separate file like assembler_riscv_cext.hpp, as the current implementation? I think we might better keep different RISCV extensions into different files.

Yes, I think assembler_riscv_c.hpp will do here.

Totally agree --
Thanks again for your reviews.
Gonna close this PR and the related issue.

@yadongw
Copy link
Collaborator

yadongw commented Dec 9, 2021

Thank you for your explanation, Felix. :-) Seems we finally reach a consensus now.

I guess maybe the best places to do this would be C1 and C2.

I think I could get your meaning this time.

I think we can then further consider the "CompressibleRegion" solution suggested by Yadong.

Yadong's plan works for me. I am quite in favor of his strategy.

I am happy that we are agreed on this solution ;-) We need to be safe at this very early stage, so I would suggest UseRVC disabled by default like UseRVV even if the hardware RVC extension is there. I hope you could understand that. In the future, we will consider auto-enable UseRVC and UseRVV after enough testing and evaluation.

I will implement this approach ASAP. Maybe next week I might open another PR including this implementation and the definition of compressed instructions presented in the assembler_riscv_cext.hpp part for a better review. I wonder if this is okay for you.
I am going to close this PR, save it for future reference, and may open another PR and start from scratch. Before that, just one more question: would you mind my keeping definitions of C-Ext instructions into one separate file like assembler_riscv_cext.hpp, as the current implementation? I think we might better keep different RISCV extensions into different files.

Yes, I think assembler_riscv_c.hpp will do here.

Totally agree -- Thanks again for your reviews. Gonna close this PR and the related issue.

@zhengxiaolinX I think a seperate file for rvc is a good choice, and maybe we should do the same thing for rvv later.
By the way, the Open/Closed principle may be suitable for your design. We can use a RVC_Assembler extend the Assembler, override all instructions that may be compressed, and then deliver to the super Assembler if the "compression" failed. Maybe you can swap the context easier when entering or leaving the "CompressedRegion". Just for suggestion, and thank you for your big effort.

@zhengxiaolinX
Copy link
Collaborator Author

zhengxiaolinX commented Dec 9, 2021

Thank you for your explanation, Felix. :-) Seems we finally reach a consensus now.

I guess maybe the best places to do this would be C1 and C2.

I think I could get your meaning this time.

I think we can then further consider the "CompressibleRegion" solution suggested by Yadong.

Yadong's plan works for me. I am quite in favor of his strategy.

I am happy that we are agreed on this solution ;-) We need to be safe at this very early stage, so I would suggest UseRVC disabled by default like UseRVV even if the hardware RVC extension is there. I hope you could understand that. In the future, we will consider auto-enable UseRVC and UseRVV after enough testing and evaluation.

I will implement this approach ASAP. Maybe next week I might open another PR including this implementation and the definition of compressed instructions presented in the assembler_riscv_cext.hpp part for a better review. I wonder if this is okay for you.
I am going to close this PR, save it for future reference, and may open another PR and start from scratch. Before that, just one more question: would you mind my keeping definitions of C-Ext instructions into one separate file like assembler_riscv_cext.hpp, as the current implementation? I think we might better keep different RISCV extensions into different files.

Yes, I think assembler_riscv_c.hpp will do here.

Totally agree -- Thanks again for your reviews. Gonna close this PR and the related issue.

@zhengxiaolinX I think a seperate file for rvc is a good choice, and maybe we should do the same thing for rvv later. By the way, the Open/Closed principle may be suitable for your design. We can use a RVC_Assembler extend the Assembler, override all instructions that may be compressed, and then deliver to the super Assembler if the "compression" failed. Maybe you can swap the context easier when entering or leaving the "CompressedRegion". Just for suggestion, and thank you for your big effort.

Thanks for your suggestion, Yadong. In fact I nearly have the same consideration about this. RISC-V is a modularized ISA with pluggable extensions. Now that extensions will be supported one by one, I think the codebase will inevitably be full of lots of unreadable if (A-Ext enabled)...else... if (B-Ext enabled)...else... unified in one VM in order to handle all possible cases. The runtime costs of performance will gradually get bigger and bigger as time goes. So we might do the same modularized things in the Assembler as your words from the beginning to prevent regrets, for at last code might be extremely difficult to get refactored. I have noticed that at every compressible instruction I need to check one UseRVC now, but the workload might be so big to change so I didn't determine if I should do that. After your words, I realize if we are thinking about the same thing, we might start to consider changes from the beginning. Thanks again for your advice and I will take this into careful consideration.

@zhengxiaolinX zhengxiaolinX mentioned this pull request Jan 12, 2022
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Pull request is ready for review
3 participants