Can't profile ruby 2.7.3+ or 3.0.1+ processes on Windows #340

Closed
acj opened this issue Oct 24, 2021 · 13 comments

Comments

acj (Member) commented Oct 24, 2021

A couple of tests fail on Windows when profiling newer rubies:

---- sampler::tests::test_sample_single_process_with_time_limit stdout ----
thread 'sampler::tests::test_sample_single_process_with_time_limit' panicked at 'unexpected error: initialize

Caused by:
    0: get ruby VM state
    1: Couldn't get ruby version: get Ruby version
       
       Caused by:
           0: retrieve ruby version
           1: Only part of a ReadProcessMemory or WriteProcessMemory request was completed. (os error 299)
           2: Only part of a ReadProcessMemory or WriteProcessMemory request was completed. (os error 299)', src\sampler\mod.rs:399:16

I've confirmed that we're looking in the correct place (the libruby DLL) for the version symbol, finding it, and returning a reasonable-looking memory address. When we try to read the process's memory at that address, though, we get the above error.

I looked through the changelog for ruby 2.7.3 and 3.0.1 and didn't see any relevant changes. I also looked at the changelog for RubyInstaller2 (2.7, 3.0) but didn't find anything. It's possible that there was a relevant change in another project that wouldn't be mentioned in these changelogs, e.g. a change to the build parameters (ASLR?). The build toolchain for RubyInstaller2 is based on MSYS2/MinGW, which I'm not familiar with.

If we're somehow trying to read the wrong memory address for the ruby_version symbol, it's possible that there's a bug in one of our crate dependencies.
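As a rough illustration of the lookup described above: the absolute address of a symbol in another process is typically the module's load address plus the symbol's offset (RVA) within that module. The minimal Rust sketch below shows that arithmetic, plus a bounds check that is one way to decide whether an address is "reasonable-looking". The names `ModuleInfo` and `resolve_symbol` are hypothetical and are not rbspy's actual API.

```rust
/// Hypothetical description of a module (e.g. the libruby DLL) as loaded
/// in the target process. Not a type from rbspy or its dependencies.
#[derive(Debug, Clone, Copy)]
pub struct ModuleInfo {
    /// Address at which the module is loaded in the target process.
    pub base: u64,
    /// Size of the mapped image in bytes.
    pub size: u64,
}

/// Returns the absolute address of a symbol given its offset (RVA) from the
/// module base, or `None` if the sum overflows or falls outside the module.
pub fn resolve_symbol(module: ModuleInfo, rva: u32) -> Option<u64> {
    let addr = module.base.checked_add(u64::from(rva))?;
    let end = module.base.checked_add(module.size)?;
    if addr < end {
        Some(addr)
    } else {
        None
    }
}
```

An address that passes this check can still be wrong if the base itself was recorded incorrectly, so the check narrows the problem down rather than ruling it out.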

Update (12/28/2021)

2.5.9+ and 2.6.6+ builds are also affected. The issue seems to affect every release of Ruby that's been built since December 2020. I had hoped to build an older version of Ruby (e.g. 2.7.2) with the latest version of RubyInstaller2 to see whether it would be broken, but the targets for those old versions have been removed, and I haven't had any luck adding them back in.

Poking around in Process Explorer, I noticed that the ASLR status for the ruby binaries has changed. On older versions, it's enabled (and also disabled? The label is confusing...), and on newer versions it's "High-Entropy" and/or "Bottom-Up". Note that the bottom-right window here is a 32-bit binary.

[Screenshot: Process Explorer windows showing the ASLR status of several ruby binaries]

There was activity in the gcc and binutils MSYS packages late last year that was related to the "dynamicbase" feature, which (as I understand it) is related to ASLR. Nothing definitive, but it's a possible clue. That PR links to a few related issues.

Next, I'd like to try building a recent version of ruby using rubyinstaller2, but with downgraded packages (gcc, etc) that were released before December 2020. Or, alternatively, using the most recent packages but disabling ASLR. There's a list of MINGW packages here that might be a useful reference.
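For comparing builds without relying on Process Explorer's labels, the ASLR-related linker flags can also be read straight from a binary's PE header. The sketch below is a minimal, assumption-laden example rather than a full PE parser: it reads the `DllCharacteristics` field of the optional header and reports whether the `DYNAMIC_BASE` (0x0040) and `HIGH_ENTROPY_VA` (0x0020) bits are set. The function names are made up for illustration.

```rust
/// ASLR-related bits of the PE optional header's `DllCharacteristics` field.
pub const HIGH_ENTROPY_VA: u16 = 0x0020;
pub const DYNAMIC_BASE: u16 = 0x0040;

fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Extracts `DllCharacteristics` from the raw bytes of a PE file (EXE or DLL).
/// Returns `None` if the buffer is too short or the signatures don't match.
pub fn dll_characteristics(image: &[u8]) -> Option<u16> {
    if image.get(0..2)? != &b"MZ"[..] {
        return None;
    }
    // The DOS header stores the file offset of the PE signature at 0x3c.
    let pe_off = read_u32(image, 0x3c)? as usize;
    if image.get(pe_off..pe_off.checked_add(4)?)? != &b"PE\0\0"[..] {
        return None;
    }
    // The 4-byte signature and 20-byte COFF header precede the optional header.
    let opt = pe_off.checked_add(24)?;
    let magic = read_u16(image, opt)?;
    if magic != 0x10b && magic != 0x20b {
        return None; // neither PE32 nor PE32+
    }
    // DllCharacteristics sits at offset 70 in both PE32 and PE32+.
    read_u16(image, opt.checked_add(70)?)
}

/// Returns `(dynamic_base, high_entropy_va)` for the given PE image bytes.
pub fn aslr_flags(image: &[u8]) -> Option<(bool, bool)> {
    let c = dll_characteristics(image)?;
    Some(((c & DYNAMIC_BASE) != 0, (c & HIGH_ENTROPY_VA) != 0))
}
```

One could call `aslr_flags(&std::fs::read(path)?)` on the ruby executable and DLL from an older and a newer installer and compare the results.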

aldent95 commented Mar 5, 2023

Has there been any update on this? I really need to get some profiling done of my Rails app to make some improvements, but I can't because of this bug sadly.

acj (Member, Author) commented Mar 7, 2023

No, unfortunately I haven't had any luck diagnosing and fixing this.

aldent95 commented Mar 7, 2023

@acj Looking at your update above, would having the older installers help you figure it out? I may have some sitting around.
Otherwise is there anything I can take a look at to help out?

acj (Member, Author) commented Mar 9, 2023

I'm not sure. The only thing that seems clear is that something changed, possibly in the Ruby build tooling, around Dec 2020.

Another surprising thing is that py-spy does fundamentally the same thing as rbspy (reads memory from another process) using the same library code. It is much more popular among Windows users, though, and presumably doesn't have this issue.

If you have any hunches about what might have changed in the build tools, or if you're also a py-spy user (or at least willing to give it a try for comparison's sake), that help would be much appreciated.

jvns (Collaborator) commented Mar 9, 2023

This might be obvious, but I think the major difference between py-spy and rbspy is that Python exposes more of the structs required for profiling in public header files than Ruby does, so those definitions are more stable. As a result, the list of Python versions that py-spy needs to generate bindings for is a lot shorter.

acj (Member, Author) commented Mar 10, 2023

Based on my notes, I don't think we're getting far enough through initialization to need the ruby bindings. The first failure is trying to read the version string. We're able to locate the ruby_version symbol and get its memory address (which could be wrong somehow) but then can't read from it.

I remember hard-coding the version string to see if that was the only issue, and I think I got the same "Only part of a ..." error on the next memory read, which was probably looking for another symbol. It feels like something fundamental isn't working.
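To make the error itself more concrete: os error 299 (`ERROR_PARTIAL_COPY`) means the requested range could not be read in full, which is what one would expect if the address points at or runs into memory that isn't mapped in the target. The toy model below is only a sketch under that assumption. It does not call any Windows API, the `Region` type and `read_remote` function are hypothetical, and a real cross-process read has more failure modes than this.

```rust
/// Windows error code reported when a cross-process read cannot be fully satisfied.
pub const ERROR_PARTIAL_COPY: u32 = 299;

/// Hypothetical stand-in for one readable region of the target's address space.
pub struct Region {
    pub start: u64,
    pub bytes: Vec<u8>,
}

/// Toy model of a cross-process read: succeeds only if `[addr, addr + len)`
/// lies entirely inside one region; otherwise returns `ERROR_PARTIAL_COPY`.
pub fn read_remote(regions: &[Region], addr: u64, len: usize) -> Result<Vec<u8>, u32> {
    for r in regions {
        let end = r.start.saturating_add(r.bytes.len() as u64);
        if addr >= r.start && addr < end {
            let off = (addr - r.start) as usize;
            // `get` returns None when the range runs past the end of the region.
            return match r.bytes.get(off..off.saturating_add(len)) {
                Some(s) => Ok(s.to_vec()),
                None => Err(ERROR_PARTIAL_COPY),
            };
        }
    }
    // The address is not inside any readable region at all.
    Err(ERROR_PARTIAL_COPY)
}
```

In this model, a read fails the same way whether the address is slightly off (running past the end of a region) or entirely wrong, which is consistent with every read failing once the first one does.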

adfoster-r7 commented

I believe I'm hitting this too, so I thought I'd add an extra data point. I'm running Ruby 3.0.5 on Windows Server 2022 👍

C:\Users\vagrant\Downloads\rbspy-x86_64-pc-windows-msvc.exe>.\rbspy-x86_64-pc-windows-msvc.exe snapshot --pid 2768
Something went wrong while rbspy was sampling the process. Here's what we know:
- get ruby VM state
- Failed to read Ruby version symbol
- Only part of a ReadProcessMemory or WriteProcessMemory request was completed. (os error 299)
- Only part of a ReadProcessMemory or WriteProcessMemory request was completed. (os error 299)

I'm not sure what tool was used for the screenshots in the original comment, but Process Hacker shows that ASLR and DEP are enabled:

[Screenshot: Process Hacker showing ASLR and DEP enabled for the ruby process]

jabamaus commented Nov 7, 2023

Me too. Ruby 3.2.2, MSVC 2022, Windows 10.

jabamaus commented Nov 7, 2023

Worth noting that I built ruby myself with VS2022, so we can rule out the problem being in MSYS2/MinGW or in RubyInstaller.

acj (Member, Author) commented Dec 31, 2023

I think this is fixed in 0.18.1. Tested on Windows 11 with ruby 2.6, 2.7, 3.0, 3.1, and 3.2 from the official installers. There are details in the linked spytools issue if anyone is curious. The short version is that one of our dependency crates needed to be updated with better support for 64-bit memory layouts.

@aldent95 @jabamaus @adfoster-r7 Please give it a try and let me know :)
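The thread doesn't spell out the defect beyond "64-bit memory layouts". Purely as an assumed illustration, and not a description of the actual bug or fix, the sketch below shows one way such a problem can surface: an address above 4 GiB loses its upper bits if it passes through a 32-bit field, and a read at the resulting address would then target the wrong memory. The function names are hypothetical.

```rust
/// Simulates storing a 64-bit address in a 32-bit field and widening it again.
/// The upper 32 bits are silently discarded.
pub fn through_32_bits(addr: u64) -> u64 {
    addr as u32 as u64
}

/// True if the address would survive a round trip through a 32-bit field.
pub fn fits_in_32_bits(addr: u64) -> bool {
    addr <= u64::from(u32::MAX)
}
```

For example, `through_32_bits(0x0000_7ffb_1234_5678)` yields `0x1234_5678`, a different address, while an address below 4 GiB comes through unchanged.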

adfoster-r7 commented

@acj Thanks! Looks like that works for me now using the latest rbspy release and following the same env and steps that didn't previously work 💯

acj (Member, Author) commented Jan 3, 2024

Happy new year, everyone, and thanks for your patience

acj closed this as completed Jan 3, 2024

jabamaus commented Jan 3, 2024 via email
