[tool] Implement initial support for swift-inspect under Linux 64-bit #63576

Open
wants to merge 4 commits into
base: main

@mikolasstuchlik mikolasstuchlik commented Feb 10, 2023

swift-inspect

swift-inspect is a tool in apple/swift that allows users to debug Swift programs by dumping various information. It is built on SwiftRemoteMirror and is currently available on macOS and Windows.

This Pull Request

This Pull Request adds initial support for Linux, specifically for the x86-64 architecture. Since swift-inspect requires the ability to enumerate the heap, and such functionality is not natively available in glibc, it is provided by a stand-alone tool called memtool.

memtool

memtool is a Swift Package created and maintained by the creator of this Pull Request. It uses ptrace and bash to provide swift-inspect with all the information it requires.
Since memtool is not maintained by the Swift maintainers, I have decided against using it as a Swift Package dependency. Users are expected to read the README, where memtool is linked, and are advised to review memtool before downloading or executing it.


Related discussions

https://forums.swift.org/t/building-swift-inspect-on-linux/56858
https://forums.swift.org/t/tool-for-analyzing-heap-of-linux-glibc-swift-process-in-swift/62017
https://forums.swift.org/t/is-there-a-way-to-differentiate-swift-types-and-c-types-at-runtime/61950

@MaxDesiatov
Contributor

@swift-ci smoke test

@ktoso ktoso requested a review from mikeash March 23, 2023 00:29
@ktoso
Contributor

ktoso commented Mar 23, 2023

The external package dependency is a bit problematic, I guess... What do you think, @mikeash?
It'd be very nice for some of our Linux adopters to be able to build on swift-inspect... 🤔

The actual goal here is to implement this all "in swift itself"... though that leaves me wondering what to do with this PR, perhaps we can view it as an incremental step 🤔

@mikeash
Contributor

mikeash commented Mar 23, 2023

Oh, this looks sweet. I'm OK with doing whatever we need to get an initial version running on Linux, and we can refine it over time. If the initial version involves an external dependency you build manually, that's fine, it'll be way easier to eventually transition that to having everything built in.

@al45tair this is relevant to your interests as well.

@al45tair
Contributor

:-) Making it work on Linux is/was on my TODO list. Probably without the external dependency though.

@mikolasstuchlik
Author

My initial goal was to implement the Linux support within swift-inspect. I decided to split the implementation into a different repo when I started to invoke various external programs. I am now confident that this particular issue could be resolved by using the LLDB C++ API.

However, the implementation of iterateHeap relies on quite complex logic that depends on the internal implementation of glibc (both the libc implementation and dl). Would you be open to accepting a PR into apple/swift that includes this kind of fragile code?

One future direction I've been thinking about is hooking malloc and free, which would solve the issue with mmapped chunks and would not require the fragile logic that currently supports iterateHeap.
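
As a rough illustration of the book-keeping such hooks would maintain, here is a minimal sketch. It only models the ledger; a real hook would be native code interposed on malloc and free. The class and method names are hypothetical and are not part of memtool or swift-inspect.

```python
class AllocationLedger:
    """Hypothetical ledger that malloc/free hooks could keep up to date."""

    def __init__(self):
        self._live = {}  # address -> requested size

    def on_malloc(self, address, size):
        # A failed malloc returns NULL (0) and is not recorded.
        if address:
            self._live[address] = size

    def on_free(self, address):
        # free(NULL) and unknown addresses are ignored.
        self._live.pop(address, None)

    def iterate_heap(self):
        """Yield (address, size) for every live allocation, lowest address first."""
        yield from sorted(self._live.items())
```

Because the ledger records every allocation regardless of how the allocator obtained the memory, it would not need to distinguish arena chunks from mmapped ones.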

@al45tair
Contributor

My understanding is that internal layout of glibc's memory allocator is well documented, so I think we probably would accept code that walked that, particularly in swift-inspect which is already poking at bits at a fairly low level.
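
For readers unfamiliar with what walking the allocator involves, the following is a minimal sketch under stated assumptions: a 64-bit little-endian process, a contiguous run of chunks already copied out of the target, and the documented chunk header of two 8-byte words (prev_size, then size with three flag bits in its low bits). The function name is hypothetical, and this is not the code from the PR.

```python
import struct

PREV_INUSE, IS_MMAPPED, NON_MAIN_ARENA = 0x1, 0x2, 0x4
FLAG_MASK = 0x7  # the three low bits of the size field are flags


def walk_chunks(memory, base=0):
    """Yield (address, size, flags) for consecutive chunks in a byte buffer."""
    offset = 0
    while offset + 16 <= len(memory):
        _prev_size, size_field = struct.unpack_from("<QQ", memory, offset)
        size = size_field & ~FLAG_MASK
        # Stop on an implausible size instead of looping forever or overrunning.
        if size < 16 or offset + size > len(memory):
            break
        yield base + offset, size, size_field & FLAG_MASK
        offset += size  # the next chunk starts right after this one
```

The sketch ignores the top chunk, free-list membership, and non-main arenas, all of which a real implementation has to handle.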

What's "the issue with mmapped chunks"?

@mikolasstuchlik
Author

What's "the issue with mmapped chunks"?

For the sake of this discussion, the chunks could be divided into two kinds: mmapped chunks and arena chunks.

  • The arena chunks are book-kept by glibc malloc via the private struct malloc_state (copied from the glibc source code to heap_utils.h) and thus can be located with certainty. Instances of this struct are referred to as thread arenas and the main arena, and are linked together. The entry point to the list is a private symbol named main_arena. So in order to locate the chunks, we need to know the binary layout of glibc-private structs.
  • The mmapped chunks are not book-kept. They are allocated by a direct call to mmap and therefore cannot be enumerated with certainty (as discussed in "Linux Memory Forensics", section "II. Glibc Analysis", subsection "3) MMAPPED Chunks:"). We know that mmapped chunks are always set up in a particular way, so we might attempt to search for them in /proc/pid/maps regions that are not known to have a different purpose. This approach, however, brings the risk of false positives.
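
The search described in the second bullet could be sketched roughly as follows. This is an illustration, not memtool's implementation: it assumes a 64-bit little-endian target, treats every private, read-write, anonymous mapping as a candidate, and only checks the IS_MMAPPED flag in the size word of a chunk header. All function names are hypothetical.

```python
from dataclasses import dataclass

IS_MMAPPED = 0x2  # second-lowest bit of a glibc chunk's size field


@dataclass
class Region:
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text):
    """Parse the text of /proc/<pid>/maps into Region records."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)  # the path column is optional
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) == 6 else ""
        regions.append(Region(start, end, fields[1], path))
    return regions


def candidate_regions(regions):
    """Keep private, read-write, anonymous mappings with no known purpose."""
    return [r for r in regions if r.perms == "rw-p" and r.path == ""]


def header_claims_mmapped(header):
    """Check the IS_MMAPPED flag in a 16-byte, little-endian chunk header."""
    size_field = int.from_bytes(header[8:16], "little")
    return bool(size_field & IS_MMAPPED)
```

Any anonymous mapping whose first 16 bytes happen to have that bit set would pass this check, which is exactly the false-positive risk mentioned above.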

I have therefore decided to ignore mmapped chunks at this stage, which means that malloc-allocated memory above a certain size (a threshold that changes dynamically at runtime, but is generally known to be at most 4*1024*1024*sizeof(long) on a 64-bit system) will not be reached by iterateHeap.
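
To put a number on that size, read the figure above as 4*1024*1024*sizeof(long), which matches the default upper limit documented for M_MMAP_THRESHOLD in mallopt(3) on 64-bit systems. The helper name below is hypothetical, and the limit can be changed by the target process through mallopt.

```python
import ctypes


def mmap_threshold_max(sizeof_long=ctypes.sizeof(ctypes.c_long)):
    """Default upper limit of glibc's dynamic mmap threshold on 64-bit, in bytes."""
    return 4 * 1024 * 1024 * sizeof_long


# With an 8-byte long this prints: 32 MiB
print(mmap_threshold_max(8) // (1024 * 1024), "MiB")
```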

My understanding is that internal layout of glibc's memory allocator is well documented

It is unfortunately not that easy. For example, some pointers are obfuscated. Also, the documentation talks about the Thread Local Cache (tcache), but locating the tcache is not an easy task, since it is a private __thread variable (and thus stored in the TLS, which I could not access even via lldb, so I had to find my own way). My proposed way of locating the tcache is based on knowledge of the implementation of both glibc malloc and the glibc dynamic loader, and on knowledge of the binary layout of several other glibc-private structs. Those are just some of the issues.
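
One example of such obfuscation in recent glibc versions is the "safe-linking" of tcache and fastbin next pointers (added in glibc 2.32), where the stored value is the real pointer XORed with the storage address shifted right by 12 bits. Whether this is the obfuscation meant here is an assumption; the sketch below only illustrates the arithmetic for a 64-bit process, and the function names are hypothetical.

```python
PAGE_SHIFT = 12
MASK64 = (1 << 64) - 1


def protect_ptr(pos, ptr):
    """Obfuscate ptr for storage at address pos, as safe-linking does."""
    return ((pos >> PAGE_SHIFT) ^ ptr) & MASK64


def reveal_ptr(pos, stored):
    """Recover the real pointer from the value stored at address pos."""
    # XOR is its own inverse, so revealing applies the same operation.
    return protect_ptr(pos, stored)
```

A tool that reads these lists from another process therefore needs both the stored value and the address it was read from in order to follow the link.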

I'm not trying to convince you to reject this Pull Request :) (it took me a lot of free time to self-study and implement this :) ). But at the same time, I don't feel confident proposing to "merge" mikolasstuchlik/memtool and swift-inspect, since memtool has very little to do with Swift (or the Swift toolchain) and can have uses beyond the scope of swift-inspect.
That's why, in my opinion, either swift-inspect will use some different method of iterating memory (for example, the above-mentioned custom book-keeping via malloc hooks), or memtool itself will reach a point in the future where it is stable enough (possibly based on the lldb API) to be forked (by a trusted authority) and made an SPM dependency of swift-inspect.

@ktoso
Contributor

ktoso commented Mar 29, 2023

Adding a reference to radar rdar://107360568
