Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arm64Emitter: Improve MOVI2R #9412

Merged
merged 5 commits into from Feb 14, 2021
Merged

Conversation

JosJuice
Copy link
Member

@JosJuice JosJuice commented Jan 1, 2021

More or less a complete rewrite of the function which aims to be equally good or better for each given input, without relying on special cases like the old implementation did.

In particular, we now have more extensive support for MOVN, as mentioned in a TODO comment.

@JosJuice
Copy link
Member Author

JosJuice commented Jan 3, 2021

Now that PR #9423 exists, ideally this PR should be merged after that one.

@JosJuice JosJuice changed the title JitArm64: Improve MOVI2R Arm64Emitter: Improve MOVI2R Jan 13, 2021
@JosJuice JosJuice force-pushed the jitarm64-movi2r branch 2 times, most recently from ce22608 to f521822 Compare January 13, 2021 10:13
@JosJuice
Copy link
Member Author

Added ORR support, made us prefer ADRP+ADD over ADR+MOVK and ADRP+MOVK, and fixed a bug in my ADRP code.

@degasus
Copy link
Member

degasus commented Jan 27, 2021

If I calculated correctly, this code will call malloc 8 times per MOVI2R. I fear this might have a significat time hit in code compilation.

@JosJuice
Copy link
Member Author

What do you think about replacing std::vector<Part> with std::array<Part, 4> plus a size_t to keep track of the size?

@degasus
Copy link
Member

degasus commented Jan 27, 2021

I assume the std::array will make this a lot faster. I'm also fine with the vector if you tell me that it has no large impact in your profile ;)

@JosJuice
Copy link
Member Author

Done.

I don't really see the use of this. (Maybe in the past it
was used for when we need a constant number of instructions
for backpatching? But we don't use MOVI2R for that now.)
Source/Core/Common/Arm64Emitter.cpp Outdated Show resolved Hide resolved
Source/Core/Common/Arm64Emitter.cpp Outdated Show resolved Hide resolved
Source/Core/Common/Arm64Emitter.cpp Outdated Show resolved Hide resolved
Source/Core/Common/Arm64Emitter.cpp Show resolved Hide resolved
Source/Core/Common/Arm64Emitter.cpp Outdated Show resolved Hide resolved
More or less a complete rewrite of the function which aims
to be equally good or better for each given input, without
relying on special cases like the old implementation did.

In particular, we now have more extensive support for
MOVN, as mentioned in a TODO comment.
This tests for a bug with ADRP which was present in an
earlier version of this pull request.

Also adding the MOVI2R unit test to the VS build.
Copy link
Member

@leoetlino leoetlino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM. The unit tests still pass, I hope?

@JosJuice
Copy link
Member Author

Yes, the PowerPC tests pass. (The VertexLoader tests still aren't passing, like in master, but that has no relation to this PR.)

@leoetlino leoetlino merged commit c33d944 into dolphin-emu:master Feb 14, 2021
9 checks passed
@JosJuice JosJuice deleted the jitarm64-movi2r branch February 14, 2021 15:28
@dovart
Copy link

dovart commented Feb 15, 2021

@JosJuice not building on Arm64

../Source/Core/Common/Arm64Emitter.cpp: In member function ‘void Arm64Gen::ARM64XEmitter::MOVI2RImpl(Arm64Gen::ARM64Reg, T)’:
../Source/Core/Common/Arm64Emitter.cpp:2039:5: internal compiler error: Segmentation fault
     MOVZ,
     ^~~~
0x7f99269d23 __libc_start_main
	../csu/libc-start.c:308

@degasus
Copy link
Member

degasus commented Feb 15, 2021

@dovart This is an internal compiler error. So which compiler are you using? Which version of the compiler is it? Is this for android, windows or linux?

@dovart
Copy link

dovart commented Feb 15, 2021

@degasus gcc version 8.3.0 (Debian 8.3.0-6)
platform: pi

@dovart
Copy link

dovart commented Feb 15, 2021

@degasus was building fine before the commits from 2 days ago.

@JosJuice
Copy link
Member Author

I wonder if it works fine on newer versions of GCC... If so, maybe it's time to simply drop support for GCC 8. (See PR #9294)

@merryhime
Copy link
Contributor

merryhime commented Feb 15, 2021

I can reproduce the above ICE on gcc (Debian 8.3.0-6) 8.3.0 on a desktop AArch64 Debian buster system.
Installed gcc-10 (most recent package available via testing).
Unfortunately this ICE remains on the latest version of GCC available via debian's package repository (gcc (Debian 10.2.1-6) 10.2.1 20210110).
Will submit a GCC bugreport once I get an account on their bugzilla.


Reduced to minimal testcase:

Required command line:
g++ -Wshadow a.cpp

a.cpp:

class ARM64XEmitter
{
  void MOVZ();

  template <typename T>
  void MOVI2RImpl(T imm);
};

void ARM64XEmitter::MOVZ()
{
}

template <typename T>
void ARM64XEmitter::MOVI2RImpl(T imm)
{
  enum class Approach
  {
    MOVZ,
  };
}

Bug in shadow detection. Having the member function be templated is required for ICE to occur.


Submitted GCC bug. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99120

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants