Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

m1n1.hv: Introduce gdbserver #194

Merged
merged 7 commits into from
Jun 4, 2022
Merged

m1n1.hv: Introduce gdbserver #194

merged 7 commits into from
Jun 4, 2022

Conversation

akihikodaki
Copy link
Contributor

No description provided.

@akihikodaki
Copy link
Contributor Author

I have done changes I have planned for so here is a summary of changes since the initial version:

General m1n1.hv changes

  • Shadow used debug registers, fixing debugging Linux.
  • Allow to switch CPU without continuing.
  • Always continue on the stepped thread, aligning with GDB's semantics (otherwise GDB crashes).
  • Propagate breakpoint change to other CPUs.
  • Add hardware watchpoint support.

gdbserver-specific changes

  • Interrupt a CPU when connected with GDB, otherwise GDB connection times out.
  • Support switching threads.
  • Support watchpoints.
  • Add missing SPDX-License-Identifier: MIT.
  • Minor code changes.

@marcan
Copy link
Member

marcan commented May 30, 2022

Thank you for working on this! This is a really cool feature I wanted to have for a long time.

Can you fix the conflicts on hv.py, and maybe explain a bit how the exception handling / CPU switching stuff changes? From what I can tell:

  • The GDB stub runs in its own thread
  • The HV has its existing shell stuff largely as-is
  • The SIGUSR1 hack is introduced to allow the cont() / cpu() stuff to cross thread boundaries from GDB to the main thread
  • The in-thread CPU switching relies on this thread split so the HV thread can exit back out to the proxy path for CPU switching, while the GDB thread continues in line

I've long been bothered by the cpu() stuff causing an exit out of the shell. If I figure out a better way to implement it without an unwind (e.g. with proxy changes so the context switch happens without an exit out to the main loop, at least from the Python perspective) would that help make this a bit simpler?

@marcan
Copy link
Member

marcan commented May 30, 2022

I just pushed some refactoring that makes cpu() not exit the shell context, and the same approach can hopefully be extended to do things like single-step if needed. Do you think you can use this to make your life easier, and maybe avoid the SIGUSR1 hack? :)

@marcan
Copy link
Member

marcan commented May 30, 2022

Though I guess you'd still need a signal or something similar to exit the shell if you want synchronize the gdbserver and hv proxy shell execution state and allow using both at once (which I guess might be handy?) while still supporting break/continue from both sides. But at least it should simplify the CPU switching stuff.

Alternatively you could have the gdbserver stuff mutex with the shell, so only one side can "own" the HV (when paused) at once.

@akihikodaki
Copy link
Contributor Author

Changes since my last comment:

  • Bunch of fixes
  • LLDB support

Main advantages of LLDB are TBI/PAC support and Darwin kernel dyld support.

GDB's TBI/PAC support is somewhat limited; it can only clear the upper bits and cannot handle TBI/PAC pointers in the kernel, which needs the upper bits to be filled.

Darwin kernel dyld support on LLDB is enabled by default so LLDB can provide decent kernel debugging experience out-of-box. You may create a LLDB script to load the kernel and extensions:

echo target create -s kernel.development.t8101.dSYM kernel.development.t8101 > target.lldb
for k in $(find Extensions); do [ "$(file -b --mime-type $k)" != application/x-mach-binary ] || printf 'image add %q\n' $k; done >> target.lldb

The following commands for LLDB loads the generated script and connects to m1n1.

command source -e false target.lldb
process connect unix-connect:///tmp/.m1n1-unix

@marcan Rebased to commit 60254a3. I'll explain the rebased code from now on:

The SIGUSR1/SIGUSR2 hack and the change for CPU switching code and should be distinguished.

The SIGUSR1/SIGUSR2 hack are necessary to exit the shell as you noticed. gdbserver runs concurrently with the shell so you can use both of them at the same time.
The state is synchronized one-way: the changes gdbserver made will be visible for the shell but the changes the shell made will not be propagated to GDB. For example, breakpoints added from the shell will not be visible for GDB. When GDB issues "continue" command, gdbserver signals SIGUSR1 so the shell also exits. The opposite direction will not work however; even if you exit the shell, GDB keeps waiting for commands.

The change for CPU switching code is to prevent from switching the CPU when stepping, which causes GDB trigger an assertion failure. The behavior is implemented with the new hv_wait_cpu command; it makes m1n1 wait for the specified CPU and prevents it from switching to another CPU.

The version before rebasing also had a change to maintain the exception contexts while switching CPUs to propagate breakpoints and watchpoints, but it is no longer necessary thanks to your refactoring. The refactoring also made things simpler by removing the necessity to store the intermediate states when switching contexts.

src/hv_exc.c Outdated Show resolved Hide resolved
src/hv_exc.c Outdated Show resolved Hide resolved
src/hv_exc.c Outdated Show resolved Hide resolved
src/hv.c Outdated Show resolved Hide resolved
src/hv_exc.c Show resolved Hide resolved
src/hv_exc.c Outdated Show resolved Hide resolved
src/hv_exc.c Show resolved Hide resolved
@marcan
Copy link
Member

marcan commented May 31, 2022

Makes sense! I left some comments as to how to implement the hv_wait_cpu stuff differently, but it should be effectively functionally identical in the end.

@marcan
Copy link
Member

marcan commented Jun 3, 2022

Thanks! Looks good to me, if you can fix the merge conflicts I'll merge it! I think it's just an artifact of having moved hv.py, since what Lina pushed yesterday shouldn't clash with your changes line-wise.

Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
@akihikodaki
Copy link
Contributor Author

e1a1419: rebased to 7ed86d8.

Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
@akihikodaki
Copy link
Contributor Author

c6e60c2: Added initialization of hv_pinned_cpu to hv_start and converted argument -1 for hv_pin_cpu to 0xffffffffffffffff since M1N1Proxy encodes arguments as unsigned.

I forgot to recompile m1n1 so the code has actually not been tested at all(!). Fortunately the two changes were enough to get it work. (Background: I use a Makefile which automatically compiles the binary before launching m1n1 for Linux kernel debugging, but I don't use it for Darwin kernel debugging.)

@marcan marcan merged commit 5cea0db into AsahiLinux:main Jun 4, 2022
@marcan
Copy link
Member

marcan commented Jun 4, 2022

Thanks again! FWIW my approach with the hypervisor is to move fast and break things, so I'm not worried if there are any regressions lurking, we'll fix them in time. Sorry for taking so long to review this - it was large enough that I wanted to take a decent look and I've been busy, but I'm glad it's in!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants