Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm: debug: Add GDB stub for aarch32 #58067

Merged
merged 3 commits into from Dec 18, 2023

Conversation

marek-vedral
Copy link
Contributor

@marek-vedral marek-vedral commented May 19, 2023

Add implementation of GDB stub for 32-bit ARM. It has been
tested only on the Zynq-7000 SoC and I would like to get any feedback
from others.

The stub still has these issues:

  • To implement single stepping, it uses instruction address mismatch
    breakpoint, as recommended in ARMv7 reference. The breakpoint control
    register is configured (the state control fields) for the "PL0, Supervisor
    and System modes only" option. Otherwise the breakpoint would also halt the
    processor in abort mode, in which the stub loop runs. Zephyr kernel
    runs in the system mode. This works well until the kernel enables
    interrupts, as interrupt handlers typically run in Supervisor mode.
    Single stepping therefore sometimes "catches" a handler instead of the
    next application instruction. I have not tried User mode, because
    Cortex-A SoCs do not appear to have the ARCH_HAS_USERSPACE flag.

    @wentasah
    Cc: Michal Sojka michal.sojka@cvut.cz
    Signed-off-by: Marek Vedral marek.vedral@gmail.com

subsys/debug/gdbstub.c Outdated Show resolved Hide resolved
@microbuilder
Copy link
Member

I'm not sure many of the reviewers here will have a Zync-7000 to test with (I don't). Could this be made to work with qemu_cortex_a9, which is the same core used on the Zync-7000?

@marek-vedral
Copy link
Contributor Author

Yes, it does seem to work on the emulator, except for single stepping. The emulator seems to ignore the mismatch breakpoint and continues the execution. Do you know whether QEMU also emulates these debug registers?

I tried to run the samples/subsys/debug/gdbstub program. The property zephyr,gdbstub-uart = &uart1; needs to be added to the devicetree of qemu_cortex_a9 and the same option -serial tcp:127.0.0.1:5678,server as for qemu_x86 must be specified in CMakeList.txt.

@microbuilder
Copy link
Member

Yes, it does seem to work on the emulator, except for single stepping. The emulator seems to ignore the mismatch breakpoint and continues the execution. Do you know whether QEMU also emulates these debug registers?

I'm not sure about the internals myself, but if we have an easy to reproduce qemu sample I can ping the people who would know around me. See point below to that effect.

I tried to run the samples/subsys/debug/gdbstub program. The property zephyr,gdbstub-uart = &uart1; needs to be added to the devicetree of qemu_cortex_a9 and the same option -serial tcp:127.0.0.1:5678,server as for qemu_x86 must be specified in CMakeList.txt.

Since we'll probably want some sort of basic testing for CI for this, could you add those changes for the qemu target in a separate commit, and we can figure out how to put together a basic test scenario from there?

@marek-vedral
Copy link
Contributor Author

Since we'll probably want some sort of basic testing for CI for this, could you add those changes for the qemu target in a separate commit, and we can figure out how to put together a basic test scenario from there?

Added a new commit with those changes.

@microbuilder
Copy link
Member

microbuilder commented Jun 7, 2023

Added a new commit with those changes.

Thanks. The sample folder is missing some files to be a valid sample, but if you can add either some notes in the sample's readme, plus a useful main.c source file and project config file, or update this PR with some instructions on testing this, I can kick the tires in on it.

EDIT: My mistake, sorry. This is editing the existing sample in samples/subsys/debug/gdbstub 🤦

@MaureenHelm
Copy link
Member

@povergoing can you take a look?

arch/Kconfig Outdated Show resolved Hide resolved
arch/Kconfig Outdated Show resolved Hide resolved
Copy link
Member

@povergoing povergoing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor typos and issues need to be fixed.

also, link to #60031 in case there is an impact. @SgrrZhf @ithinuel would you also take a look?

/* Wrapper function to save and restore execution context */
void z_gdb_entry(z_arch_esf_t *esf, unsigned int exc_cause)
{
/* Disable the hardware breapoint in case it was set */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: breakpoint

uint32_t reg_val = ctx.registers[PC];
/* set BVR (Breakpoint value register) to PC, make sure it is word aligned */
reg_val &= ~(0x3);
__asm__ volatile("mcr p14, 0, %0, c0, c0, 4" ::"r"(reg_val) :);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reg_val |= (0xF & DBGDBCR_BYTE_ADDR_MASK) << DBGDBCR_BYTE_ADDR_SHIFT;
/* Breakpoint enable */
reg_val |= DBGDBCR_BRK_EN_MASK;
__asm__ volatile("mcr p14, 0, %0, c0, c0, 5" ::"r"(reg_val) :);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

{
int ret = 0;
/* Fill the buffer with 'x' in case the stub does not support the required register */
memset(buf, "xxxxxxxx", 8);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean memset(buf, "x", 8);?

int ret = 0;
/* Fill the buffer with 'x' in case the stub does not support the required register */
memset(buf, "xxxxxxxx", 8);
ret = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the meaning of ret = 8, here? Why not int ret = 8 at the beginning? Can you also explain a bit the meaning of the 8? It's the default size of the function read?

return ret;
}

size_t arch_gdb_reg_writeone(struct gdb_ctx *ctx, uint8_t *hex, size_t hexlen, uint32_t regno)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that all arch_gdb_reg_read/writexxxs share almost the same logic, would you like to refine them to make them more compact if it's not too difficult?

@@ -415,6 +415,9 @@ endchoice

config GDBSTUB_BUF_SZ
int "GDB backend send/receive buffer size (in bytes)"
# GDB for ARM expects up to 18 4-byte plus 8 12-byte
# registers - 336 HEX letters
default 350 if ARM
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this one also be set by arch/soc/board specifically instead of default xxx if ARM

/* Enable the monitor debug mode */
__asm__ volatile("mrc p14, 0, %0, c0, c2, 2" : "=r"(reg_val)::);
reg_val |= DBGDSCR_MONITOR_MODE_EN;
__asm__ volatile("mcr p14, 0, %0, c0, c2, 2" ::"r"(reg_val) :);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, registers defined in lib_heler

#if defined(CONFIG_GDBSTUB)
z_gdb_entry(esf, GDB_EXCEPTION_INVALID_INSTRUCTION);
/* Might not be fatal if GDB stub placed it in the code. */
return false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I saw you return false after z_gdb_entry, would you like to elaborate on the details if there is a real fault instead of a gdb exception? You are going to send all faults to the gdb clients?

@microbuilder
Copy link
Member

Still a lot of failures here. Do you have time to look through and sort some of them out to be able to get this merged in eventually? Thanks.

I tried rebasing on the main branch again. Most of the checks are successful, not sure why the two remaining checks fail each time.

This needs to merge first: #62431

@microbuilder
Copy link
Member

This needs to merge first: #62431

Should pass now if you rebase.

@marek-vedral
Copy link
Contributor Author

This needs to merge first: #62431

Should pass now if you rebase.

Thanks for the notification, I rebased on the main branch again.

@SgrrZhf
Copy link
Member

SgrrZhf commented Oct 7, 2023

It has conflicts now. Would you please rebase it again? @marek-vedral

@marek-vedral
Copy link
Contributor Author

It has conflicts now. Would you please rebase it again? @marek-vedral

Rebased on the main branch, there should not be conflicts anymore.

@SgrrZhf
Copy link
Member

SgrrZhf commented Oct 8, 2023

I have tested it again and I always get stuck in the GDB command line after I type n and the application is running to the end and won't stop. This sample (samples/subsys/debug/gdbstub) works well on the qemu_x86 board well. Could you check and test it again?

@marek-vedral
Copy link
Contributor Author

I have tested it again and I always get stuck in the GDB command line after I type n and the application is running to the end and won't stop. This sample (samples/subsys/debug/gdbstub) works well on the qemu_x86 board well. Could you check and test it again?

We discussed the issue with @microbuilder, see the earlier comments. I am not sure whether QEMU also emulates the debug registers, which would allow single stepping.

@SgrrZhf
Copy link
Member

SgrrZhf commented Oct 8, 2023

We discussed the issue with @microbuilder, see the earlier comments. I am not sure whether QEMU also emulates the debug registers, which would allow single stepping.

Sorry I didn't see it.

SgrrZhf
SgrrZhf previously approved these changes Oct 8, 2023
Copy link

github-actions bot commented Dec 8, 2023

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Dec 8, 2023
microbuilder
microbuilder previously approved these changes Dec 8, 2023
@microbuilder microbuilder removed the Stale label Dec 8, 2023
@microbuilder
Copy link
Member

@marek-vedral Sorry, I've been very slow with getting back to this one. I'm OK with the current code, but can you rebase to resolve conflicts since the last push, and we can hopefully get this in? Again, sorry for the delay.

@carlescufi
Copy link
Member

@marek-vedral please rebase

If GDBSTUB is enabled and the kernel runs in tickless mode, the timer
must not convert the delta cycles to a 32-bit data type (cycle_diff_t in
this case). The delta_ticks variable would overflow and the next timeout
would be set before the current timestamp, thus generating an interrupt
right after leaving the handler. As a result, the system would receive
tens of thousands of interrupts per second and would not boot.

Cc: Michal Sojka <michal.sojka@cvut.cz>
Signed-off-by: Marek Vedral <marek.vedral@gmail.com>
This commit adds a devicetree overlay file and an extra condition in the
CMakeLists.txt file to allow remote debugging for the qemu_cortex_a9
board.

Signed-off-by: Marek Vedral <marek.vedral@gmail.com>
This commit adds implementation of GDB stub for 32-bit ARM. It has been
tested only on the Zynq-7000 SoC and I would like to get any feedback
from others.

The stub still has these issues:

- To implement single stepping, it uses instruction address mismatch
  breakpoint, as recommended in ARMv7 reference. The breakpoint control
  register is configured (the state control fields) for the "PL0,
  Supervisor and System modes only" option. Otherwise the breakpoint
  would also halt the processor in abort mode, in which the stub loop
  runs. Zephyr kernel runs in the system mode. This works well until the
  kernel enables interrupts, as interrupt handlers typically run in
  Supervisor mode. Single stepping therefore sometimes "catches" a
  handler instead of the next application instruction. I have not tried
  User mode, because Cortex-A SoCs do not appear to have the
  ARCH_HAS_USERSPACE flag.

Cc: Michal Sojka <michal.sojka@cvut.cz>
Signed-off-by: Marek Vedral <marek.vedral@gmail.com>
@marek-vedral
Copy link
Contributor Author

@marek-vedral please rebase

Rebased on main

@microbuilder
Copy link
Member

@povergoing Do you mind giving this another look? Thanks.

@microbuilder
Copy link
Member

@carlocaione Do you have time to give this a review?

@carlescufi carlescufi merged commit 93a4287 into zephyrproject-rtos:main Dec 18, 2023
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

9 participants