[lldb][AArch64] Add SME2's ZT0 register (#70205)
SME2 is documented as part of the main SME supplement:
https://developer.arm.com/documentation/ddi0616/latest/

The one change for debug is this new ZT0 register. This register
contains data to be used with new table lookup instructions.
Its size is always 512 bits (not scalable) and it can be
interpreted in many different ways depending on the instructions
that use it. 
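
As a rough illustration of "interpreted in many different ways", the sketch below views the same 64 bytes as lanes of different widths. `ZT0Bytes`, `LaneCount` and `ReadLane` are hypothetical names made up for this example; they are not part of lldb or the kernel, and the lane layout is only illustrative.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// ZT0 is always 512 bits (64 bytes), regardless of vector length.
constexpr size_t kZT0SizeBytes = 64;
using ZT0Bytes = std::array<uint8_t, kZT0SizeBytes>;

// Hypothetical helper: how many lanes of type T fit in ZT0.
template <typename T> constexpr size_t LaneCount() {
  static_assert(kZT0SizeBytes % sizeof(T) == 0, "lanes must tile ZT0");
  return kZT0SizeBytes / sizeof(T);
}

// Hypothetical helper: read lane `index` of ZT0 as a value of type T.
// memcpy avoids alignment and aliasing problems when reinterpreting bytes.
template <typename T> T ReadLane(const ZT0Bytes &zt0, size_t index) {
  T value;
  std::memcpy(&value, zt0.data() + index * sizeof(T), sizeof(T));
  return value;
}
```

For example, `LaneCount<uint32_t>()` is 16 and `LaneCount<uint64_t>()` is 8, both covering the same 64 bytes.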

The kernel has implemented this as a new register set containing
this single register. It always returns register data (with no header,
unlike ZA which does have a header).

https://docs.kernel.org/arch/arm64/sme.html
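
Because the register set has no header, the buffer handed to the kernel is just the 64 bytes of register data. A minimal sketch of that setup, assuming a hypothetical helper `MakeZTIoVec` (the note type value 0x40d is the one this patch defines for `NT_ARM_ZT`):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <sys/uio.h> // struct iovec

// Note type for the ZT register set, as defined in this patch when the
// system headers do not already provide NT_ARM_ZT.
constexpr unsigned kNtArmZt = 0x40d;
// ZT0 is a fixed 512 bit register.
constexpr size_t kZT0SizeBytes = 64;

// Hypothetical helper: describe a ZT0 buffer for a register set read or
// write. Unlike ZA there is no header, so the whole buffer is register data.
inline iovec MakeZTIoVec(std::array<uint8_t, kZT0SizeBytes> &buffer) {
  iovec io{};
  io.iov_base = buffer.data();
  io.iov_len = buffer.size();
  return io;
}
```

On Linux the resulting `iovec` would be passed as the data argument of `ptrace(PTRACE_GETREGSET, tid, NT_ARM_ZT, &io)`, which is what lldb's `PtraceWrapper` call in this change ends up doing.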

ZT0 is only active when ZA is active (when SVCR.ZA is 1). In the 
inactive state the kernel returns 0s for its contents. Therefore
lldb doesn't need to create 0s like it does for ZA. 

However, we will skip restoring the value of ZT0 if we know that
ZA is inactive, because writing to an inactive ZT0 sets SVCR.ZA to 1,
which is not desirable as it would also activate ZA. Whether
SVCR.ZA is set will be determined only by the ZA data we restore.

Due to this, I've added a new save/restore kind SME2. This is easier
than accounting for the variable length ZA in the SME data. We'll only
save an SME2 data block if ZA is active. If it's not we can get fresh
0s back from the kernel for ZT0 anyway so there's nothing for us to
restore.
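
The save decision described above can be modelled roughly as follows. This is a simplified sketch, not the code in the patch: `SavedKind`, `IsZAActive`, `BlocksToSave` and the header size constant are assumptions made for the example. The one detail taken from the patch is that ZA counts as active when the size reported in its header is larger than the header itself.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical mirror of the two save/restore kinds discussed above.
enum class SavedKind : uint32_t { SME, SME2 };

// Assumed header size for this example only.
constexpr uint32_t kZAHeaderSize = 16;

// The header's size field covers the header plus any ZA data, so a size
// equal to the header size means ZA (and therefore ZT0) is inactive.
inline bool IsZAActive(uint32_t za_size) { return za_size > kZAHeaderSize; }

// Returns the SME related blocks in the order they are saved. Restoring in
// the same order means ZT0 is only written after ZA has been made active.
inline std::vector<SavedKind> BlocksToSave(bool has_sme, bool has_zt,
                                           uint32_t za_size) {
  std::vector<SavedKind> blocks;
  if (!has_sme)
    return blocks;
  blocks.push_back(SavedKind::SME);
  // Only save ZT0 when ZA is active. An inactive ZT0 reads as 0s from the
  // kernel anyway, and writing it back would switch ZA on.
  if (has_zt && IsZAActive(za_size))
    blocks.push_back(SavedKind::SME2);
  return blocks;
}
```

With an inactive ZA (`za_size == kZAHeaderSize`) only the SME block is produced, so a later restore never touches ZT0.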

This new register will only show up if the system has SME2. Therefore
the SME set presented to the user may change, and I've had to account
for that in a few places.
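
A simplified model of that variable sized set, assuming the order svcr, svg, za and then an optional zt0. `BuildSMERegnums` and `IsZT` are hypothetical names for this sketch and only approximate what the register info class does:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: build the list of register numbers in the SME set.
// svcr, svg and za are always present; zt0 is appended only with SME2.
inline std::vector<uint32_t> BuildSMERegnums(uint32_t first_regnum,
                                             bool has_zt) {
  std::vector<uint32_t> regnums;
  for (uint32_t i = 0; i < 3; ++i)
    regnums.push_back(first_regnum + i);
  if (has_zt)
    regnums.push_back(first_regnum + 3);
  return regnums;
}

// Hypothetical helper: a register is ZT0 only if the set has a fourth entry
// and the number matches it. Checking the size first avoids reading past the
// end of the list on systems without SME2.
inline bool IsZT(const std::vector<uint32_t> &sme_regnums, uint32_t reg) {
  return sme_regnums.size() >= 4 && reg == sme_regnums[3];
}
```

The size check is the important part: any code that indexes the set must first allow for it having either three or four registers.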

I've referred to it internally as simply "ZT" as the kernel does in
NT_ARM_ZT, but the architecture refers to the specific register as "ZT0"
so that's what you'll see in lldb.

```
(lldb) register read -s 6
Scalable Matrix Extension Registers:
      svcr = 0x0000000000000000
       svg = 0x0000000000000004
        za = {0x00 <...> 0x00}
       zt0 = {0x00 <...> 0x00}
```
DavidSpickett committed Nov 1, 2023
1 parent aaba376 commit b8150c8
Showing 13 changed files with 312 additions and 48 deletions.
4 changes: 4 additions & 0 deletions lldb/packages/Python/lldbsuite/test/lldbtest.py
@@ -1271,6 +1271,10 @@ def isAArch64SVE(self):
def isAArch64SME(self):
return self.isAArch64() and "sme" in self.getCPUInfo()

def isAArch64SME2(self):
# If you have sme2, you also have sme.
return self.isAArch64() and "sme2" in self.getCPUInfo()

def isAArch64SMEFA64(self):
# smefa64 allows the use of the full A64 instruction set in streaming
# mode. This is required by certain test programs to setup register
140 changes: 126 additions & 14 deletions lldb/source/Plugins/Process/Linux/NativeRegisterContextLinux_arm64.cpp
@@ -45,6 +45,11 @@
#define NT_ARM_ZA 0x40c /* ARM Scalable Matrix Extension, Array Storage */
#endif

#ifndef NT_ARM_ZT
#define NT_ARM_ZT \
0x40d /* ARM Scalable Matrix Extension 2, lookup table register */
#endif

#ifndef NT_ARM_PAC_MASK
#define NT_ARM_PAC_MASK 0x406 /* Pointer authentication code masks */
#endif
@@ -104,6 +109,17 @@ NativeRegisterContextLinux::CreateHostNativeRegisterContextLinux(
.Success())
opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskZA);

// SME2's ZT0 is a 512 bit register.
std::array<uint8_t, 64> zt_reg;
ioVec.iov_base = zt_reg.data();
ioVec.iov_len = zt_reg.size();
regset = NT_ARM_ZT;
if (NativeProcessLinux::PtraceWrapper(PTRACE_GETREGSET,
native_thread.GetID(), &regset,
&ioVec, zt_reg.size())
.Success())
opt_regsets.Set(RegisterInfoPOSIX_arm64::eRegsetMaskZT);

NativeProcessLinux &process = native_thread.GetProcess();

std::optional<uint64_t> auxv_at_hwcap =
@@ -148,6 +164,7 @@ NativeRegisterContextLinux_arm64::NativeRegisterContextLinux_arm64(
::memset(&m_pac_mask, 0, sizeof(m_pac_mask));
::memset(&m_tls_regs, 0, sizeof(m_tls_regs));
::memset(&m_sme_pseudo_regs, 0, sizeof(m_sme_pseudo_regs));
std::fill(m_zt_reg.begin(), m_zt_reg.end(), 0);

m_mte_ctrl_reg = 0;

@@ -164,6 +181,7 @@ NativeRegisterContextLinux_arm64::NativeRegisterContextLinux_arm64(
m_pac_mask_is_valid = false;
m_mte_ctrl_is_valid = false;
m_tls_is_valid = false;
m_zt_buffer_is_valid = false;

// SME adds the tpidr2 register
m_tls_size = GetRegisterInfo().IsSSVEPresent() ? sizeof(m_tls_regs)
@@ -355,6 +373,15 @@ NativeRegisterContextLinux_arm64::ReadRegister(const RegisterInfo *reg_info,
// storage. Therefore its effective byte offset is always 0 even if it
// isn't 0 within the SME register set.
src = (uint8_t *)GetZABuffer() + GetZAHeaderSize();
} else if (GetRegisterInfo().IsSMERegZT(reg)) {
// Unlike ZA, the kernel will return register data for ZT0 when ZA is not
// enabled. This data will be all 0s so we don't have to invent anything
// like we did for ZA.
error = ReadZT();
if (error.Fail())
return error;

src = (uint8_t *)GetZTBuffer();
} else {
error = ReadSMESVG();
if (error.Fail())
@@ -552,22 +579,31 @@ Status NativeRegisterContextLinux_arm64::WriteRegister(

return WriteTLS();
} else if (IsSME(reg)) {
if (!GetRegisterInfo().IsSMERegZA(reg))
return Status("Writing to SVG or SVCR is not supported.");
if (GetRegisterInfo().IsSMERegZA(reg)) {
error = ReadZA();
if (error.Fail())
return error;

error = ReadZA();
if (error.Fail())
return error;
// ZA is part of the SME set but not stored with the other SME registers.
// So its byte offset is effectively always 0.
dst = (uint8_t *)GetZABuffer() + GetZAHeaderSize();
::memcpy(dst, reg_value.GetBytes(), reg_info->byte_size);

// ZA is part of the SME set but not stored with the other SME registers.
// So its byte offset is effectively always 0.
dst = (uint8_t *)GetZABuffer() + GetZAHeaderSize();
::memcpy(dst, reg_value.GetBytes(), reg_info->byte_size);
// While this is writing a header that contains a vector length, the only
// way to change that is via the vg register. So here we assume the length
// will always be the current length and no reconfigure is needed.
return WriteZA();
} else if (GetRegisterInfo().IsSMERegZT(reg)) {
error = ReadZT();
if (error.Fail())
return error;

// While this is writing a header that contains a vector length, the only
// way to change that is via the vg register. So here we assume the length
// will always be the current length and no reconfigure is needed.
return WriteZA();
dst = (uint8_t *)GetZTBuffer();
::memcpy(dst, reg_value.GetBytes(), reg_info->byte_size);

return WriteZT();
} else
return Status("Writing to SVG or SVCR is not supported.");
}

return Status("Failed to write register value");
@@ -580,7 +616,8 @@ enum RegisterSetType : uint32_t {
// Pointer authentication registers are read only, so not included here.
MTE,
TLS,
SME, // ZA only, SVCR and SVG are pseudo registers.
SME, // ZA only, because SVCR and SVG are pseudo registers.
SME2, // ZT only.
};

static uint8_t *AddRegisterSetType(uint8_t *dst,
@@ -624,6 +661,21 @@ NativeRegisterContextLinux_arm64::CacheAllRegisters(uint32_t &cached_size) {
error = ReadZA();
if (error.Fail())
return error;

// We will only be restoring ZT data if ZA is active, because writing to an
// inactive ZT enables ZA, which may not be desirable.
if (
// If we have ZT0, or in other words, if we have SME2.
GetRegisterInfo().IsZTPresent() &&
// And ZA is active, which means that ZT0 is also active.
m_za_header.size > sizeof(m_za_header)) {
cached_size += sizeof(RegisterSetType) + GetZTBufferSize();
// The kernel handles an inactive ZT0 for us, and it will read as 0s if
// inactive (unlike ZA where we fake that behaviour).
error = ReadZT();
if (error.Fail())
return error;
}
}

// If SVE is enabled we need not copy FPR separately.
@@ -731,6 +783,19 @@ Status NativeRegisterContextLinux_arm64::ReadAllRegisterValues(
m_za_header.size);
}

// If ZT0 is present and we are going to be restoring an active ZA (which
// implies an active ZT0), then restore ZT0 after ZA has been set. This
// prevents us enabling ZA accidentally after the restore of ZA disabled it.
// If we leave ZA/ZT0 inactive and read ZT0, the kernel returns 0s. Therefore
// there's nothing for us to restore if ZA was originally inactive.
if (
// If we have SME2 and therefore ZT0.
GetRegisterInfo().IsZTPresent() &&
// And ZA is enabled.
m_za_header.size > sizeof(m_za_header))
dst = AddSavedRegisters(dst, RegisterSetType::SME2, GetZTBuffer(),
GetZTBufferSize());

if (GetRegisterInfo().IsMTEPresent()) {
dst = AddSavedRegisters(dst, RegisterSetType::MTE, GetMTEControl(),
GetMTEControlSize());
@@ -874,6 +939,14 @@ Status NativeRegisterContextLinux_arm64::WriteAllRegisterValues(
error = ReadZA();
src += GetZABufferSize();
break;
case RegisterSetType::SME2:
// Doing this would activate an inactive ZA, however we will only get here
// if the state we are restoring had an active ZA. Restoring ZT0 will
// always come after restoring ZA.
error = RestoreRegisters(
GetZTBuffer(), &src, GetZTBufferSize(), m_zt_buffer_is_valid,
std::bind(&NativeRegisterContextLinux_arm64::WriteZT, this));
break;
}

if (error.Fail())
@@ -1063,6 +1136,7 @@ void NativeRegisterContextLinux_arm64::InvalidateAllRegisters() {
m_pac_mask_is_valid = false;
m_mte_ctrl_is_valid = false;
m_tls_is_valid = false;
m_zt_buffer_is_valid = false;

// Update SVE and ZA registers in case there is change in configuration.
ConfigureRegisterContext();
@@ -1300,10 +1374,48 @@ Status NativeRegisterContextLinux_arm64::WriteZA() {

m_za_buffer_is_valid = false;
m_za_header_is_valid = false;
// Writing to ZA may enable ZA, which means ZT0 may change too.
m_zt_buffer_is_valid = false;

return WriteRegisterSet(&ioVec, GetZABufferSize(), NT_ARM_ZA);
}

Status NativeRegisterContextLinux_arm64::ReadZT() {
Status error;

if (m_zt_buffer_is_valid)
return error;

struct iovec ioVec;
ioVec.iov_base = GetZTBuffer();
ioVec.iov_len = GetZTBufferSize();

error = ReadRegisterSet(&ioVec, GetZTBufferSize(), NT_ARM_ZT);
m_zt_buffer_is_valid = error.Success();

return error;
}

Status NativeRegisterContextLinux_arm64::WriteZT() {
Status error;

error = ReadZT();
if (error.Fail())
return error;

struct iovec ioVec;
ioVec.iov_base = GetZTBuffer();
ioVec.iov_len = GetZTBufferSize();

m_zt_buffer_is_valid = false;
// Writing to an inactive ZT0 will enable ZA as well, which invalidates our
// current copy of it.
m_za_buffer_is_valid = false;
m_za_header_is_valid = false;

return WriteRegisterSet(&ioVec, GetZTBufferSize(), NT_ARM_ZT);
}

void NativeRegisterContextLinux_arm64::ConfigureRegisterContext() {
// ConfigureRegisterContext gets called from InvalidateAllRegisters
// on every stop and configures SVE vector length and whether we are in
@@ -83,6 +83,7 @@ class NativeRegisterContextLinux_arm64
bool m_fpu_is_valid;
bool m_sve_buffer_is_valid;
bool m_mte_ctrl_is_valid;
bool m_zt_buffer_is_valid;

bool m_sve_header_is_valid;
bool m_za_buffer_is_valid;
@@ -129,6 +130,9 @@ class NativeRegisterContextLinux_arm64

struct tls_regs m_tls_regs;

// SME2's ZT is a 512 bit register.
std::array<uint8_t, 64> m_zt_reg;

bool IsGPR(unsigned reg) const;

bool IsFPR(unsigned reg) const;
@@ -163,6 +167,10 @@ class NativeRegisterContextLinux_arm64
// Instead use WriteZA and ensure you have the correct ZA buffer size set
// beforehand if you wish to disable it.

Status ReadZT();

Status WriteZT();

// SVCR is a pseudo register and we do not allow writes to it.
Status ReadSMEControl();

@@ -190,6 +198,8 @@ class NativeRegisterContextLinux_arm64

void *GetSMEPseudoBuffer() { return &m_sme_pseudo_regs; }

void *GetZTBuffer() { return m_zt_reg.data(); }

void *GetSVEBuffer() { return m_sve_ptrace_payload.data(); }

size_t GetSVEHeaderSize() { return sizeof(m_sve_header); }
@@ -210,6 +220,8 @@ class NativeRegisterContextLinux_arm64

size_t GetSMEPseudoBufferSize() { return sizeof(m_sme_pseudo_regs); }

size_t GetZTBufferSize() { return m_zt_reg.size(); }

llvm::Error ReadHardwareDebugInfo() override;

llvm::Error WriteHardwareDebugRegs(DREGType hwbType) override;
50 changes: 38 additions & 12 deletions lldb/source/Plugins/Process/Utility/RegisterInfoPOSIX_arm64.cpp
@@ -90,6 +90,10 @@ static lldb_private::RegisterInfo g_register_infos_sme[] = {
{"za", nullptr, 16, 0, lldb::eEncodingVector, lldb::eFormatVectorOfUInt8,
KIND_ALL_INVALID, nullptr, nullptr, nullptr}};

static lldb_private::RegisterInfo g_register_infos_sme2[] = {
{"zt0", nullptr, 64, 0, lldb::eEncodingVector, lldb::eFormatVectorOfUInt8,
KIND_ALL_INVALID, nullptr, nullptr, nullptr}};

// Number of register sets provided by this context.
enum {
k_num_gpr_registers = gpr_w28 - gpr_x0 + 1,
@@ -98,6 +102,8 @@ enum {
k_num_mte_register = 1,
// Number of TLS registers is dynamic so it is not listed here.
k_num_pauth_register = 2,
// SME2's ZT0 will also be added to this set if present. So this number is
// only for SME1 registers.
k_num_sme_register = 3,
k_num_register_sets_default = 2,
k_num_register_sets = 3
@@ -253,7 +259,7 @@ RegisterInfoPOSIX_arm64::RegisterInfoPOSIX_arm64(
AddRegSetTLS(m_opt_regsets.AllSet(eRegsetMaskSSVE));

if (m_opt_regsets.AnySet(eRegsetMaskSSVE))
AddRegSetSME();
AddRegSetSME(m_opt_regsets.AnySet(eRegsetMaskZT));

m_register_info_count = m_dynamic_reg_infos.size();
m_register_info_p = m_dynamic_reg_infos.data();
@@ -358,21 +364,35 @@ void RegisterInfoPOSIX_arm64::AddRegSetTLS(bool has_tpidr2) {
m_dynamic_reg_sets.back().registers = m_tls_regnum_collection.data();
}

void RegisterInfoPOSIX_arm64::AddRegSetSME() {
uint32_t sme_regnum = m_dynamic_reg_infos.size();
for (uint32_t i = 0; i < k_num_sme_register; i++) {
m_sme_regnum_collection.push_back(sme_regnum + i);
void RegisterInfoPOSIX_arm64::AddRegSetSME(bool has_zt) {
const uint32_t first_sme_regnum = m_dynamic_reg_infos.size();
uint32_t sme_regnum = first_sme_regnum;

for (uint32_t i = 0; i < k_num_sme_register; ++i, ++sme_regnum) {
m_sme_regnum_collection.push_back(sme_regnum);
m_dynamic_reg_infos.push_back(g_register_infos_sme[i]);
m_dynamic_reg_infos[sme_regnum + i].byte_offset =
m_dynamic_reg_infos[sme_regnum + i - 1].byte_offset +
m_dynamic_reg_infos[sme_regnum + i - 1].byte_size;
m_dynamic_reg_infos[sme_regnum + i].kinds[lldb::eRegisterKindLLDB] =
sme_regnum + i;
m_dynamic_reg_infos[sme_regnum].byte_offset =
m_dynamic_reg_infos[sme_regnum - 1].byte_offset +
m_dynamic_reg_infos[sme_regnum - 1].byte_size;
m_dynamic_reg_infos[sme_regnum].kinds[lldb::eRegisterKindLLDB] = sme_regnum;
}

lldb_private::RegisterSet sme_regset = g_reg_set_sme_arm64;

if (has_zt) {
m_sme_regnum_collection.push_back(sme_regnum);
m_dynamic_reg_infos.push_back(g_register_infos_sme2[0]);
m_dynamic_reg_infos[sme_regnum].byte_offset =
m_dynamic_reg_infos[sme_regnum - 1].byte_offset +
m_dynamic_reg_infos[sme_regnum - 1].byte_size;
m_dynamic_reg_infos[sme_regnum].kinds[lldb::eRegisterKindLLDB] = sme_regnum;

sme_regset.num_registers += 1;
}

m_per_regset_regnum_range[m_register_set_count] =
std::make_pair(sme_regnum, m_dynamic_reg_infos.size());
m_dynamic_reg_sets.push_back(g_reg_set_sme_arm64);
std::make_pair(first_sme_regnum, m_dynamic_reg_infos.size());
m_dynamic_reg_sets.push_back(sme_regset);
m_dynamic_reg_sets.back().registers = m_sme_regnum_collection.data();

// When vg is written during streaming mode, svg will also change, as vg and
@@ -488,6 +508,12 @@ bool RegisterInfoPOSIX_arm64::IsSMERegZA(unsigned reg) const {
return reg == m_sme_regnum_collection[2];
}

bool RegisterInfoPOSIX_arm64::IsSMERegZT(unsigned reg) const {
// ZT0 is part of the SME register set only if SME2 is present.
return m_sme_regnum_collection.size() >= 4 &&
reg == m_sme_regnum_collection[3];
}

bool RegisterInfoPOSIX_arm64::IsPAuthReg(unsigned reg) const {
return llvm::is_contained(pauth_regnum_collection, reg);
}