Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libnative: main task has bad thread local data on windows #13259

Closed
klutzy opened this issue Apr 2, 2014 · 26 comments
Closed

libnative: main task has bad thread local data on windows #13259

klutzy opened this issue Apr 2, 2014 · 26 comments
Assignees
Labels
O-windows Operating system: Windows
Milestone

Comments

@klutzy
Copy link
Contributor

klutzy commented Apr 2, 2014

More investigation from #13073.

extern crate native;
extern crate green;
extern crate rustuv;

use std::libc::{c_void, LPWSTR, LPVOID, DWORD};

extern "system" {
    fn MessageBoxA(hWnd: u32, lpText: *u8, lpCaption: *u8, uType: u32) -> i32;

    fn FormatMessageW(flags: DWORD,
                      lpSrc: LPVOID,
                      msgId: DWORD,
                      langId: DWORD,
                      buf: LPWSTR,
                      nsize: DWORD,
                      args: *c_void)
                      -> DWORD;

    fn GetLastError() -> u32;
}

fn test() {
    let mut buf: [u16, ..50] = [0, ..50];
    let ret = unsafe {
        FormatMessageW(0x1000, 0 as *mut c_void, 1, 0x400,
                       buf.as_mut_ptr(), buf.len() as u32, 0 as *c_void)
    };
    if ret == 0 {
        let err = unsafe { GetLastError() };
        println!("err: {:?}", err);
    }
    let s = std::str::from_utf16(buf);
    println!("{:?}", s);

    //unsafe {
    //    MessageBoxA(0, "ABC".as_ptr(), "ABC".as_ptr(), 0);
    //}
}

fn main() {
    if cfg!(spawn) {
        spawn(proc() {
            test();
        });
    } else {
        test();
    }
}

#[start]
pub fn start(argc: int, argv: **u8) -> int {
    if cfg!(green) {
        green::start(argc, argv, rustuv::event_loop, main)
    } else if cfg!(raw) {
        main();
        0
    } else {
        native::start(argc, argv, main)
    }
}

If the code is bulit with --cfg green, --cfg raw or --cfg spawn, it works as expected. However, if it is built with no cfg, it behaves strangely: FormatMessageW() fails to get system locale mssage, and MessageBoxA() shows gui message box with non-system-themed border.

I guess the main task has bad thread-local data due to libnative::start().

cc @alexcrichton

@alexcrichton
Copy link
Member

What happens if you add this to the top of the test function?

unsafe { ::std::rt::stack::record_sp_limit(0); }

@klutzy
Copy link
Contributor Author

klutzy commented Apr 2, 2014

Wow, that one fixed all problems!
(Currently backtrace is not shown on windows + libnative due to same reason. Indeed, the line also fixed the issue.)

@alexcrichton
Copy link
Member

See this comment for how I discovered that. It sounds like this modification to the TIB is causing something to fail when some initial library is loaded.

At this point, I think we need to figure out the minimal set of things that need to be done to "get things loaded". This minimal set of things can be done whenever libnative starts up, but I'm not sure what the minimal set of things are.

The minimal set of things includes "boot libgreen and spawn a task" because that's what using libgreen will do, but it would be nice to narrow it down more than that!

Does that make sense?

@klutzy
Copy link
Contributor Author

klutzy commented Apr 3, 2014

The behavior seems to suggest that it's not a good idea to modify $fs:0x14 (or $gs:0x28 on win64), which LLVM/Rust currently uses for segmented stack.

fs:0x14 is known as "ArbitraryUserPointer", but Raymond Chen said that it's "arbitrary" for internals, not for application. (it's safe to believe what Raymond says!)
I met the article before, but I also found conflicting claims (e.g. this and this) and at the time rust worked well, so I assumed it's ok in practice. Not we have to check if it really is :'( Here is C code reproducing the bad behavior:

#include <windows.h>
int main() {
    asm("movl $1234, %fs:0x14");
    MessageBoxA(0, "abc", "abc", 0);
}

So I want to investigate if LLVM can use TIB's stack bounds instead of ArbitraryUserPointer for segmented stack. We use them in target_record_stack_bounds() on win64 to record full available stack area, but if we call target_record_stack_bounds(stack_lo + RED_ZONE, stack_hi) then it can replace ArbitraryUserPointer usage.

@klutzy
Copy link
Contributor Author

klutzy commented Apr 3, 2014

I've replaced all occurrence (including llvm) of $fs:0x14 by $fs:0x08. The example code seems to work well, but it definitely needs more tests.

@alexcrichton
Copy link
Member

Are we sure that the kernel won't modify those TIB values to some other internal stack limit? (just a mild concern of mine)

@klutzy
Copy link
Contributor Author

klutzy commented Apr 4, 2014

Hmm, StackLimit ($fs:0x08) actually means the lowest address of committed page, not limit address of whole stack. Windows internally uses it to detect stack usage on uncommitted page: it will be changed if "thread uses successively lower addresses in the stack". (Uh wait, the link says that pvArbitrary is theoretically safe to use!)
So it is not a good idea to use StackLimit with native stacks which libnative currently uses.

BTW, the librand issue (CryptAcquireContextA) does not occur on win 8.1, but does occur on win 7. (Maybe there are some difference on internal libraries?) The result indicates that it seems hard to collect minimal set of "get things loaded".

@alexcrichton
Copy link
Member

Hm, interestingly your C code example works for me. I think I'm on a windows 7 VM. The CryptAcquireContext example doesn't work for me, however.

@huonw huonw added the A-windows label Apr 4, 2014
@vadimcn
Copy link
Contributor

vadimcn commented Apr 21, 2014

How about using the actual TLS API for this? TlsGetValue() and TlsSetValue() are both like 10 instructions long. Is this enough overhead to bother messing with undocumented TIB fields? And if it is, maybe we could write directly into TLS slots (after allocating one via TlsAlloc(), of course).

@alexcrichton
Copy link
Member

It may be possible to do that, although this is an extra bit of overhead an all function calls made in rust (each function is preceded with this information). I also fear that the root cause is still unknown so it's too soon to move away from the current implementation (which seems like it should work).

@vadimcn
Copy link
Contributor

vadimcn commented Apr 21, 2014

Actually, function prologues could use current bottom of stack (fs:[8]) for quick comparison, and only check the hard stack limit when kicked off of the fast path. The latter would happen only when stack commit limit is about to grow anyway, so it wouldn't matter for perf.
Of course changing this in LLVM would break everybody's runtime libraries, which use "ArbitraryUserPointer"...

@vadimcn
Copy link
Contributor

vadimcn commented Apr 21, 2014

I also fear that the root cause is still unknown so it's too soon to move away from the current implementation (which seems like it should work).

I tend to believe Raymond, when he says that "ArbitraryUserPointer" is reserved for OS use. Looks like you've discovered this yourself too.

alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 6, 2014
There's a good long comment explaining why. The tl;dr; is that I have no idea
why this is necessary, but it gets --test to work on windows which is something,
right?

cc rust-lang#13259
cc rust-lang#16275
cc rust-lang/cargo#302
@brson brson mentioned this issue Aug 12, 2014
33 tasks
@alexcrichton
Copy link
Member

Nominating, I just spent an hour tracking this down for what feels like the 10th time. This causes syscalls to fail at random for no apparent reason, and the errors are often quite misleading.

@pnkfelix
Copy link
Member

P-backcompat-libs, 1.0 milestone.

@brson
Copy link
Contributor

brson commented Sep 11, 2014

@Zoxc what's the state of your stack probe patch?

@thestinger
Copy link
Contributor

Ah, so this is why #[thread_local] doesn't work on Windows.

@thestinger
Copy link
Contributor

@brson: AFAIK, we already have stack probes on Windows via __chkstk. The segmented stack checks are likely already redundant, beyond how the error is reported.

@Zoxc
Copy link
Contributor

Zoxc commented Sep 11, 2014

@brson The error message reporting (#16388) is mostly working. I suspect there's some unrelated error breaking it on Windows). I've yet to find someone who will commit my x86 LLVM patches (which adds support for stack probes on non-Windows x86 systems), but you could just adapt my code (Zoxc@ecec992) to use stack probes on Windows only. Since fixing this error is more important than nice stack overflow messages (which will added later anyway).

@DanielKeep
Copy link
Contributor

Just FYI: this also appears to break CoCreateInstance, which means anything trying to create COM objects will explode with ERROR_NOACCESS. This is particularly fun, because according to the MSDN docs, CoCreateInstance isn't even supposed to fail with that. 😒

@brson
Copy link
Contributor

brson commented Sep 19, 2014

Working on adapting @Zoxc's patch now.

@brson brson self-assigned this Sep 19, 2014
@klutzy
Copy link
Contributor Author

klutzy commented Sep 21, 2014

Win32 backtrace also broken!

brson added a commit to brson/rust that referenced this issue Sep 25, 2014
@brson
Copy link
Contributor

brson commented Sep 25, 2014

@klutzy I can't reproduce the problem on either 64-bit win8 or 32-bit win 2008 using your test case in the op. FormatMessage succeeds producing 'Incorrect function.'.

bors added a commit that referenced this issue Sep 30, 2014
This is the bare minimum to stop using split stacks on Windows, fixing #13259 and #14742, by turning on stack probes for all functions and disabling compiler and runtime support for split stacks on Windows.

It does not restore the out-of-stack error message, which requires more runtime work.

This includes a test that the Windows TCB is no longer being clobbered, but the out-of-stack test itself is pretty weak, only testing that the program exits abnormally, not that it isn't writing to bogus memory, so I haven't truly verified that this is providing the safety we claim.

A more complete solution is in #16388, which has some unresolved issues yet.

cc @Zoxc @klutzy @vadimcn
@brson brson closed this as completed in 3b3d702 Sep 30, 2014
@thestinger thestinger reopened this Oct 1, 2014
AnthIste added a commit to AnthIste/WinMan that referenced this issue Feb 11, 2015
matthiaskrgr pushed a commit to matthiaskrgr/rust that referenced this issue Oct 11, 2022
Simplify feature representation in CargoConfig
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-windows Operating system: Windows
Projects
None yet
Development

No branches or pull requests

10 participants