Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JMP offsets to enter in to blt4l exceed the maximum value of a 32-bit integer #3

Closed
RomanHargrave opened this issue Mar 28, 2016 · 11 comments
Labels
bug Confirmed fault in BLT4L fixed This issue has been resolved succesfully

Comments

@RomanHargrave
Copy link
Member

Currently, SubHook uses the JMP opcode E9 in order to perform a relative jump to the target function.

Due to the way that BLT4L is loaded, our functions live in address space that is extremely far from the LUA function calls we need access to.

The actual jump offsets we would need to use are larger than a signed 32-bit integer would allow. Unfortunately, 0xE9 expects a signed paramater in order to support backward jumps.

Inside subhook_make_jmp, the result of the offset calculation is cast to (int32_t) which truncates the jump value, and can result in a negative jump when a positive jump is needed, not to mention that the jump offset is far to small on the order of several magnitudes.

@polarathene
Copy link

Is there no way to jmp to something in between that is reliable to work around the issue? Do you have a way to locate an address by doing an Array of Bytes scan?

@RomanHargrave
Copy link
Member Author

Locating the address is not the problem. The problem we are running in to is the way that the kernel arranges libraries in memory.

When you look at the memory maps for PAYDAY 2, you will notice that the process image resides at a far lower space in virtual memory than any other libraries. The "distance", if you can call it that, between PAYDAY 2 and the libraries that it statically links, which make up the image, and our library along with others like the steam overlay and API is larger than the maximum value that opcode E9 will accept as a relative jump offset.

This is done intentionally, but for other reasons. It is just not particularly beneficial to our rather specific scenario.

@polarathene
Copy link

I take it reducing the amount of memory available for the PD2 process and any libs it links statically is probably not an option? Something like a docker container to isolate the processes from the rest of the system. This isn't my forte, wish I could help in some way though.

@Ozymandias117
Copy link
Contributor

I finally found some time to try to help with the hooking situation.
I've made a temporary fork of subhook which pushes the address onto the stack and returns rather than uses a jump: https://github.com/Ozymandias117/subhook

See if this works any better for you.

@ljrk0
Copy link
Contributor

ljrk0 commented Apr 4, 2016

@Ozymandias117 Hm, this is interesting.

I've tried yours with this branch: https://github.com/LeonardKoenig/blt4l/tree/PRQ_clenaup_and_fixes

And I at least can see that our hooked function is getting called (instead of the original) -- which is definitely an improvement, thanks!

Sadly it still hangs (sometimes) when trying to execute std::__1::recursive_mutex::lock :/

@RomanHargrave
Copy link
Member Author

@Ozymandias117 has effectively solved the core issue for this thread, which I appreciate, because I really don't like to deal with Intel-family CISC assembly, so I'm going to close the issue for now.

I had researched different methods for function intercepts previously, and this was certainly one that was suggested, though I had trouble researching it due to the rather general nature of the terminology involved. (Just try searching for information about direct and indirect jumps and the problems introduced by the 64-bit extensions. I dare you.)

That tangent aside, I think that we should certainly continue to investigate the SubHook option, albeit with @Ozymandias117 fork; additionally, would go so far as to suggest you open a PR to subhook with that modification if you don't mind putting the time in to backing up your changes on the PR page.

Now, with regards to the changes and their affect on process longevity.

From my limited understanding of the exact behaviour implied by RET, we shouldn't be contaminating the stack any more than we would be with the JMP behaviour; and, if the material that I just read (in haste) would suggest anything on the matter, it would be that this is the more "correct" way to do something this far south of "correct".

@Ozymandias117
Copy link
Contributor

Yes, the stack should be clean (as can be seen by examining %rsp in gdb).

I can look into a PR, but I'll still need to clean up the Trampoline case first. Currently my fork can create "correct" trampolines that will throw you off into invalid memory still. This is why I'd mentioned in the other thread that it should only be used with ScopedRemove (or c-style remove/install).

Using this though, I've been able to hook into do_game_update and play a full mission without any crashes (Just printing "Enter game update" "Leaving game update" on either side of the original call). I think a lot of the effort now is going to be figuring out how to attach to the current BLT infrastructure.

@RomanHargrave
Copy link
Member Author

That's good. I'm going to toy with your update as the submodule inplace of the standard upstream.

We do not, and probably will not in the foreseeable future, use trampolines, and if so, sparingly as very few functions are even "twiddled" with, if you will.

I will also have to investigate @leonardkoenig 's claim about thread mutexes, because I really want to steer clear of anything that would involve both C++ and threads (moreso the C++ part. at least threads make sense and do what you tell them to do).

Also, if by "BLT" infrastructure you mean mod compatibility, we should have 100% compatibility with a given mod provided that there are no platform "quirks" and that the mod does not use drive letters. I can always modify the runtime to translate windows path separators to real path separators, but I would like to avoid it, because that's really LUA's job as a cross-platform (*) language, and not ours (though the modder should have known better and used real path separators).

@Ozymandias117
Copy link
Contributor

Yeah, I was trying to reproduce @leonardkoenig's issue, but I wasn't able to get this hook to run correctly. So I just added the do_game_update hook to the small main hook I'd used to make the changes to SubHook. If I can get a build working that has that issue, I can try to investigate. Speaking of which, is there any reason your CMakeLists requires cmake 3.4.1?

The BLT infrastructure I'm referring to is the rest of that Payday-2-BLT
/src folder. My understanding of BLT is that it exposes some additional features that may require some work for the Linux side.

@RomanHargrave
Copy link
Member Author

@Ozymandias117

  1. WRT CMakeLists, it's because it's the oldest version I have and can guarantee compatibility with. If you can verify it works with an older version, let me know what it is. This is just standard configuration management bureaucracy.
  2. BLT modifies a few core LUA functions and exposes some things. None of these appear, after reading them, to be terribly difficult, and if we can get the hook working, those are the least of our concern.
  3. I just tested hooking with your RET mod, and PAYDAY boots, but is slow as molasses. I'll remove the lua_newstate hook I added, but I fear that blt4l's game update hook is the cause of the lag. Can you please write (and test) a short POC that demonstrates how you hooked it in such a manner that the game was playable? Any other details would be greatly appreciated.

@RomanHargrave
Copy link
Member Author

BTW lets carry this dialog over to #5

@blt4linux blt4linux locked and limited conversation to collaborators Apr 5, 2016
@RomanHargrave RomanHargrave added bug Confirmed fault in BLT4L fixed This issue has been resolved succesfully labels Apr 13, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Confirmed fault in BLT4L fixed This issue has been resolved succesfully
Projects
None yet
Development

No branches or pull requests

4 participants