Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow addressing Registers by their number (e.g. x0) #51

Closed
superenginegit opened this issue Nov 6, 2023 · 9 comments
Closed

Allow addressing Registers by their number (e.g. x0) #51

superenginegit opened this issue Nov 6, 2023 · 9 comments

Comments

@superenginegit
Copy link
Contributor

Currently it is only possible to use a register by it's ABI-name. I think it would be nice, to be able to also use a register by it's number instead of it's name.

This is something I can implement and create a PR for it. But I wanted to ask if this is something you would be interested in having in this project before putting work into it.

@superlopuh
Copy link
Collaborator

There are a few things that make it a little trickier than it might seem, as there are some mechanisms, like register allocation, that can only reason about the ABI names. It would probably be a good idea to do change that logic to reason on x0 etc as that would help us avoid the confusion wrt fp and s0 both being the same as x8. I'd be happy to move to xk being the source of truth regarding identity, as long as we would still be able to pretty-print the ABI name in the IR, as that helps us reason about the logic. We would probably want to let the user specify what register should be printed in the assembly in case of ambiguity.

@superlopuh
Copy link
Collaborator

Thinking about it a bit more, I don't know of a good way to get roundtrippung to work wrt printing and parsing. We don't want to print the indices in the IR, as that would add a lot of noise, but without that I'm not sure how to recover the ABI name. What would you like this for? Is it for inspecting the IR when printed, or for reasoning about iR? Because there might be different solutions depending on the use-case you have in mind.

@superenginegit
Copy link
Contributor Author

The main use-case I had in mind was actually being able to have for example x0 instead of zero in my assembly code, as I sometimes write x0 etc. and when I go to run the code and it doesn't work it takes some time to change it.

I definitely think that changing the logic to work with the register number would be the way to go. For printing I actually prefer the ABI names almost always, maybe a command-line flag could be useful to select whether to print the name or the number, or maybe, as an optional parameter for reg.dump and mem.dump to select what format to use.

@AntonLydike
Copy link
Owner

Thanks for showing interest and offering to contribute @superenginegit!

I'm not 100% sure on what's the best way quite yet, but maybe a preprocessor pass during parsing to canonicalize register names could be an option?

@superenginegit
Copy link
Contributor Author

I would say that a preprocessor pass would be a good way to go about this. I can't really think of another way that would work as well.

@superlopuh
Copy link
Collaborator

superlopuh commented Nov 7, 2023

I think it would be really interesting to see what happens when the register is represented with a more complex structure that more accurately identifies it than just the string that we use currently. There are a few restrictions:

  1. it has to be immutable
  2. all register representations with the same bits in the binary encoding should be equal and have the same hash
  3. it has to support registers outside of the current binary encodings (we use j0, j1, ... for extra registers)

Maybe something like this could work?

@dataclass(frozen=True)
class RISCVRegister:
  index: int
  name_hint: str = field(default="", hash=False)

class RISCVRegisterType(Data[RISCVRegister], TypeAttribute, ABC):
  ...

@irdl_attr_definition
class IntRegisterType(RISCVRegisterType):
    name = "riscv.reg"

x0 = IntRegisterType(0)
zero = IntRegisterType(0, "zero")
assert x0 == zero
fp = IntRegisterType(8, "fp")
s0 = IntRegisterType(8, "s0")
j0 = IntRegisterType(-1, "j0")
f0 = FloatRegisterType(0)
ft0 = FloatRegisterType(0, "ft0")

If you have the time, could you try this out, and see what happens? Hopefully it would still preserve most of our logic for canonicalization and register allocation, while allowing for writing "x0" when that's more convenient.

@superenginegit
Copy link
Contributor Author

Sure, that looks great, I'll have time to test it out in the next few days.

@AntonLydike
Copy link
Owner

@superenginegit, excuse my colleague, he thought this was an issue on another project of ours. I guess a good place to put the canonicalization would be around line 30 here:

def parse_instruction(token: Token, args: Tuple[str], context: ParseContext):
if context.section is None:
context.new_section(".text", MemorySectionType.Instructions)
if context.section.type != MemorySectionType.Instructions:
raise ParseException(
"{} {} encountered in invalid context: {}".format(token, args, context)
)
ins = SimpleInstruction(
token.value, args, context.context, context.current_address()
)
context.section.data.append(ins)

And I probably wouldn't bother with creating a custom class for representing register names.

@superenginegit
Copy link
Contributor Author

Thanks, I'll get started then.

AntonLydike pushed a commit that referenced this issue Nov 28, 2023
This PR is for issue #51.

I have added a function to parse the arguments of an instruction, that is used to canonicalize the register names. This function could be expanded to do other things with the arguments in the future, if needed.

The canonicalization is done using a dict, that is used replace register-index-names with the according ABI-name.

Additionally I have added a file-check, that is just like the "hello world"-check but uses register-indices instead of ABI-names.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants