New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ELT call stub issues on Linux #18977

Open
dotnetjt opened this Issue Jul 18, 2018 · 30 comments

Comments

Projects
None yet
7 participants
@dotnetjt

dotnetjt commented Jul 18, 2018

Hi @noahfalk - this is to continue the discussion from a closed issue prior.

I'm (slowly) porting our windows CLR profiler to linux. Everything was going spectacularly well until I tried to use the ICorProfilerInfo3::GetFunctionEnter3Info method. Calling this immediately results in a segmentation fault:

HRESULT hr;
COR_PRF_FRAME_INFO pFrameInfo;
ULONG pcbArgumentInfo = 0;
hr = profilerInfo->GetFunctionEnter3Info(functionId, eltInfo, &pFrameInfo, &pcbArgumentInfo, NULL);

I'm smart enough to know the issue is probably a bad pointer for eltInfo, which of course is going to be the enter stub code. Let's pretend for a moment that I don't know or understand assembly or calling conventions enough to and that I only got my windows version working with the help of some really great samples that magically worked.

Here's my setup of the ELT hooks:

hr = profilerInfo->SetEnterLeaveFunctionHooks3WithInfo((FunctionEnter3WithInfo*)MethodEntered3,
		(FunctionLeave3WithInfo*)MethodLeft3, (FunctionTailcall3WithInfo*)MethodLeft3);

These are defined as:

extern "C" {
    void MethodEntered3(FunctionID function_id_or_client_id, COR_PRF_ELT_INFO elt_info);
    void MethodLeft3(FunctionID function_id_or_client_id, COR_PRF_ELT_INFO elt_info);
}

And finally, the assembly behind all this, which is pretty much a straight lift of AsmHelpers.S but with method names changed:

.intel_syntax noprefix
#include "unixasmmacrosamd64.inc"

#define real4 dword
#define real8 qword

.equ SIZEOF_PROFILE_PLATFORM_SPECIFIC_DATA, 0x8*22 + 0x8   // includes fudge to make FP_SPILL right
.equ SIZEOF_FP_ARG_SPILL, 0x10*2

.equ SIZEOF_STACK_FRAME, SIZEOF_PROFILE_PLATFORM_SPECIFIC_DATA + SIZEOF_FP_ARG_SPILL

.equ PROFILE_ENTER, 0x1
.equ PROFILE_LEAVE, 0x2
.equ PROFILE_TAILCALL, 0x4

// ***********************************************************
//   NOTE:
//
//   Register preservation scheme:
//
//       Preserved:
//           - all non-volatile registers
//           - rax, rdx
//           - xmm0, xmm1
//
//       Not Preserved:
//           - integer argument registers (rcx, rdx, r8, r9)
//           - floating point argument registers (xmm1-3)
//           - volatile integer registers (r10, r11)
//           - volatile floating point registers (xmm4-5)
//           - upper halves of ymm registers on AVX (which are volatile)
//
// ***********************************************************

// EXTERN_C void MethodEntered3(FunctionIDOrClientID functionIDOrClientID, size_t profiledRsp);
// <NOTE>
//
// </NOTE>
NESTED_ENTRY MethodEntered3, _TEXT, NoHandler
  //       Upon entry :
  //           r14 = clientInfo
  //           r15 = profiledRsp

  push_nonvol_reg         rax

  lea                     rax, [rsp + 0x10]    // caller rsp
  mov                     r10, [rax - 0x8]     // return address

  push_argument_register  rdx
  alloc_stack             SIZEOF_STACK_FRAME

  // correctness of return value in structure doesn't matter for enter probe

  // setup ProfilePlatformSpecificData structure
  xor                     r11, r11 // nullify r11
  mov                     [rsp +  0x0], r11    // r11 is null     -- struct functionId field
  save_reg_postrsp        rbp, 0x8             //                 -- struct rbp field
  mov                     [rsp + 0x10], rax    // caller rsp      -- struct probeRsp field
  mov                     [rsp + 0x18], r10    // return address  -- struct ip field
  mov                     [rsp + 0x20], r15    //                 -- struct profiledRsp field
  mov                     [rsp + 0x28], r11    // return value    -- struct rax field
  mov                     [rsp + 0x30], r11    // r11 is null     -- struct hiddenArg field
  movsd                   real8 ptr [rsp + 0x38], xmm0    //      -- struct flt0 field
  movsd                   real8 ptr [rsp + 0x40], xmm1    //      -- struct flt1 field
  movsd                   real8 ptr [rsp + 0x48], xmm2    //      -- struct flt2 field
  movsd                   real8 ptr [rsp + 0x50], xmm3    //      -- struct flt3 field
  movsd                   real8 ptr [rsp + 0x58], xmm4    //      -- struct flt4 field
  movsd                   real8 ptr [rsp + 0x60], xmm5    //      -- struct flt5 field
  movsd                   real8 ptr [rsp + 0x68], xmm6    //      -- struct flt6 field
  movsd                   real8 ptr [rsp + 0x70], xmm7    //      -- struct flt7 field
  mov                     [rsp + 0x78], rdi     //                -- struct rdi field
  mov                     [rsp + 0x80], rsi     //                -- struct rsi field
  mov                     [rsp + 0x88], rdx     //                -- struct rdx field
  mov                     [rsp + 0x90], rcx     //                -- struct rcx field
  mov                     [rsp + 0x98], r8      //                -- struct r8 field
  mov                     [rsp + 0xa0], r9      //                -- struct r9 field
  mov                     r10, 0x1 // PROFILE_ENTER
  mov                     [rsp + 0xa8], r10d   //                -- struct flags field

  // get aligned stack ptr (rsp + FRAME_SIZE) & (-16)
  lea                     rax, [rsp + 0xb8]
  and                     rax, -16

  // we need to be able to restore the fp return register
  // save fp return registers
  movdqa                  [rax + 0x00], xmm0
  movdqa                  [rax + 0x10], xmm1

  END_PROLOGUE

  // rdi already contains the clientInfo
  mov                     rdi, r14
  lea                     rsi, [rsp + 0x0]
  call                    C_PLTFUNC(EnterMethodStub)

  // restore fp return registers
  lea                     rax, [rsp + 0xb8]
  and                     rax, -16
  movdqa                  xmm0, [rax + 0x00]
  movdqa                  xmm1, [rax + 0x10]

  // restore arg registers
  mov                     rdi, [rsp + 0x78]
  mov                     rsi, [rsp + 0x80]
  mov                     rdx, [rsp + 0x88]
  mov                     rcx, [rsp + 0x90]
  mov                     r8, [rsp + 0x98]
  mov                     r9, [rsp + 0xa0]

  // begin epilogue
  free_stack              SIZEOF_STACK_FRAME
  pop_argument_register   rdx

  pop_nonvol_reg          rax

  ret
NESTED_END MethodEntered3, _TEXT

// EXTERN_C void MethodLeft3(FunctionIDOrClientID functionIDOrClientID, size_t profiledRsp);
// <NOTE>
//
// </NOTE>
NESTED_ENTRY MethodLeft3, _TEXT, NoHandler
//       Upon entry :
//           rdi = clientInfo
//           rsi = profiledRsp

  push_nonvol_reg         rbx

  lea                     rbx, [rsp + 0x10]    // caller rsp
  mov                     r10, [rbx - 0x8]     // return address

  // rdx should be saved here because it can be used for returning struct values
  push_argument_register  rdx
  alloc_stack             SIZEOF_STACK_FRAME

  // correctness of argument registers in structure doesn't matter for leave probe

  // setup ProfilePlatformSpecificData structure
  xor                     r11, r11  // nullify r11
  mov                     [rsp +  0x0], r11    // r11 is null     -- struct functionId field
  save_reg_postrsp        rbp, 0x8             //                 -- struct rbp field
  mov                     [rsp + 0x10], rbx    // caller rsp      -- struct probeRsp field
  mov                     [rsp + 0x18], r10    // return address  -- struct ip field
  mov                     [rsp + 0x20], rsi    //                 -- struct profiledRsp field
  mov                     [rsp + 0x28], rax    // return value    -- struct rax field
  mov                     [rsp + 0x30], r11    // r11 is null     -- struct hiddenArg field
  movsd                   real8 ptr [rsp + 0x38], xmm0    //      -- struct flt0 field
  movsd                   real8 ptr [rsp + 0x40], xmm1    //      -- struct flt1 field
  movsd                   real8 ptr [rsp + 0x48], xmm2    //      -- struct flt2 field
  movsd                   real8 ptr [rsp + 0x50], xmm3    //      -- struct flt3 field
  movsd                   real8 ptr [rsp + 0x58], xmm4    //      -- struct flt4 field
  movsd                   real8 ptr [rsp + 0x60], xmm5    //      -- struct flt5 field
  movsd                   real8 ptr [rsp + 0x68], xmm6    //      -- struct flt6 field
  movsd                   real8 ptr [rsp + 0x70], xmm7    //      -- struct flt7 field
  mov                     [rsp + 0x78], r11     //                -- struct rdi field
  mov                     [rsp + 0x80], r11     //                -- struct rsi field
  mov                     [rsp + 0x88], r11     //                -- struct rdx field
  mov                     [rsp + 0x90], r11     //                -- struct rcx field
  mov                     [rsp + 0x98], r11     //                -- struct r8 field
  mov                     [rsp + 0xa0], r11    //                -- struct r9 field
  mov                     r10, 0x2  // PROFILE_LEAVE
  mov                     [rsp + 0xa8], r10d   // flags           -- struct flags field

  // get aligned stack ptr (rsp + FRAME_SIZE) & (-16)
  lea                     rax, [rsp + 0xb8]
  and                     rax, -16

  // we need to be able to restore the fp return register
  // save fp return registers
  movdqa                  [rax + 0x00], xmm0
  movdqa                  [rax + 0x10], xmm1

  END_PROLOGUE

  // rdi already contains the clientInfo
  lea                     rsi, [rsp + 0x0]
  call                    C_PLTFUNC(LeaveMethodStub)

  // restore fp return registers
  lea                     rax, [rsp + 0xb8]
  and                     rax, -16
  movdqa                  xmm0, [rax + 0x00]
  movdqa                  xmm1, [rax + 0x10]

  // restore int return register
  mov                     rax, [rsp + 0x28]

  // begin epilogue
  free_stack              SIZEOF_STACK_FRAME
  pop_argument_register   rdx

  pop_nonvol_reg          rbx

  ret
NESTED_END MethodLeft3, _TEXT

It seems to me like the frame isn't being set up right, but I just have no idea where to begin. Help? Thanks in advance for any insight.

@jkotas

This comment has been minimized.

Show comment
Hide comment
@jkotas

jkotas Jul 19, 2018

Member

Calling this immediately results in a segmentation fault:

Could you please share the details about the segmentation fault?

Member

jkotas commented Jul 19, 2018

Calling this immediately results in a segmentation fault:

Could you please share the details about the segmentation fault?

@jkotas

This comment has been minimized.

Show comment
Hide comment
@jkotas

jkotas Jul 19, 2018

Member

This looks suspect:

  // get aligned stack ptr (rsp + FRAME_SIZE) & (-16)
  lea                     rax, [rsp + 0xb8]
  and                     rax, -16

We should be running with aligned stack pointer here. It should not be required to manually align the stack pointer. Calling the C/C++ stub methods with misagligned stack pointer can lead to segmentation faults.

This was introduced in #12603. @sergign60 Do you happen to remember why you had to do the manual stack alignment there?

Member

jkotas commented Jul 19, 2018

This looks suspect:

  // get aligned stack ptr (rsp + FRAME_SIZE) & (-16)
  lea                     rax, [rsp + 0xb8]
  and                     rax, -16

We should be running with aligned stack pointer here. It should not be required to manually align the stack pointer. Calling the C/C++ stub methods with misagligned stack pointer can lead to segmentation faults.

This was introduced in #12603. @sergign60 Do you happen to remember why you had to do the manual stack alignment there?

@noahfalk

This comment has been minimized.

Show comment
Hide comment
@noahfalk

noahfalk Jul 19, 2018

Member

In addition to wherever the stack alignment issue raised by Jan leads, I think you've got a more fundamental issue - there are two different variations of the callbacks:
FunctionEnter3WithInfo(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo)
FunctionEnter3(FunctionOrRemappedID functionOrRemappedID)

The assembly you copied from the runtime isn't actually either of those signatures. The runtime uses it as if it were a FunctionEnter3 but when we do so we are relying on an implementation detail of the jitted code that passes us a 2nd hidden argument not mentioned in the FunctionEnter3 signature. In your case you are casting incorrectly and claiming to the runtime that the assembly routine implements the FunctionEnter3WithInfo signature. Because the COR_PROF_ELT_INFO argument is not the same type as the hidden profiledRsp argument, the ABI doesn't match up and the code crashes. You either need to use SetEnterLeaveFunctionHooks3 that takes FunctionEnter3 as input, or you need to implement a method with the FunctionEnter3WithInfo signature.

The runtime doesn't have any implementation of FunctionEnter3WithInfo to use an example, but a decent starting point that would probably get you unblocked is to write the function in C. Technically that doesn't give strong enough register preservation requirements according to MSDN, but if you stay away from doing any floating point math (including things the compiler would auto-vectorize into floating point math) it would probably work. Looking over some of the code it doesn't feel like even the devs at Microsoft quite agree on what the correct bar is for preserving registers here, but I hope we can start getting that ironed out (#19023). I'll look around and see if I can find an example implementation of FunctionEnter3WithInfo that looks correct.

Member

noahfalk commented Jul 19, 2018

In addition to wherever the stack alignment issue raised by Jan leads, I think you've got a more fundamental issue - there are two different variations of the callbacks:
FunctionEnter3WithInfo(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo)
FunctionEnter3(FunctionOrRemappedID functionOrRemappedID)

The assembly you copied from the runtime isn't actually either of those signatures. The runtime uses it as if it were a FunctionEnter3 but when we do so we are relying on an implementation detail of the jitted code that passes us a 2nd hidden argument not mentioned in the FunctionEnter3 signature. In your case you are casting incorrectly and claiming to the runtime that the assembly routine implements the FunctionEnter3WithInfo signature. Because the COR_PROF_ELT_INFO argument is not the same type as the hidden profiledRsp argument, the ABI doesn't match up and the code crashes. You either need to use SetEnterLeaveFunctionHooks3 that takes FunctionEnter3 as input, or you need to implement a method with the FunctionEnter3WithInfo signature.

The runtime doesn't have any implementation of FunctionEnter3WithInfo to use an example, but a decent starting point that would probably get you unblocked is to write the function in C. Technically that doesn't give strong enough register preservation requirements according to MSDN, but if you stay away from doing any floating point math (including things the compiler would auto-vectorize into floating point math) it would probably work. Looking over some of the code it doesn't feel like even the devs at Microsoft quite agree on what the correct bar is for preserving registers here, but I hope we can start getting that ironed out (#19023). I'll look around and see if I can find an example implementation of FunctionEnter3WithInfo that looks correct.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 19, 2018

@noahfalk Thank you, that was what my gut was telling me as well, I just needed to validate. Perhaps it would help if I posted my working code from the Windows side? The syntax is different as I was relying on VS/ML64 over there. But it might be a better starting point, if someone can help translate to what really needs done.

I can work on this on the straight C approach as well - but again, not fully understanding the ABI and how to deal with saving registers is my main issue. Assembly just adds a level of syntax on top of that issue.

And yes, what I need is a working FunctionEnter3WithInfo method.

Thanks again for the help.

@jkotas no additional detail on the fault. Segmentation fault (core dumped) and process exits.

dotnetjt commented Jul 19, 2018

@noahfalk Thank you, that was what my gut was telling me as well, I just needed to validate. Perhaps it would help if I posted my working code from the Windows side? The syntax is different as I was relying on VS/ML64 over there. But it might be a better starting point, if someone can help translate to what really needs done.

I can work on this on the straight C approach as well - but again, not fully understanding the ABI and how to deal with saving registers is my main issue. Assembly just adds a level of syntax on top of that issue.

And yes, what I need is a working FunctionEnter3WithInfo method.

Thanks again for the help.

@jkotas no additional detail on the fault. Segmentation fault (core dumped) and process exits.

@noahfalk

This comment has been minimized.

Show comment
Hide comment
@noahfalk

noahfalk Jul 20, 2018

Member

I peaked at some of our old test code laying around and it was doing it directly as C/C++:

FunctionEnter3WithInfo *pFEWI3 = reinterpret_cast<FunctionEnter3WithInfo *>(&FunctionEnter3WithInfoStub);
...
PINFO->SetEnterLeaveFunctionHooks3WithInfo(pFEWI3, pFLWI3, pFTCWI3));

EXTERN_C VOID STDMETHODCALLTYPE FunctionEnter3WithInfoStub(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo)
{
    TestProfiler::Instance()->FunctionEnter3WithInfoCallBack(functionIDOrClientID, eltInfo);
}

but again, not fully understanding the ABI and how to deal with saving registers is my main issue.

At the moment I'm not even sure we at Microsoft have a solid consensus on what the ABI is supposed to be - ProfileEnter callers/ProfilerEnter implementers/MSDN documentation/comments are all making different claims about what the ABI is (see #19023) so I don't have an immediate answer for you - I'm working on it : ) In the meantime doing it in C may not qualify as rigorously correct, but it will probably unblock you while we work this out. Once we've got a specific ABI we are trying to hit we can help a bit with the mechanics of implementing an example routine.

Member

noahfalk commented Jul 20, 2018

I peaked at some of our old test code laying around and it was doing it directly as C/C++:

FunctionEnter3WithInfo *pFEWI3 = reinterpret_cast<FunctionEnter3WithInfo *>(&FunctionEnter3WithInfoStub);
...
PINFO->SetEnterLeaveFunctionHooks3WithInfo(pFEWI3, pFLWI3, pFTCWI3));

EXTERN_C VOID STDMETHODCALLTYPE FunctionEnter3WithInfoStub(FunctionIDOrClientID functionIDOrClientID, COR_PRF_ELT_INFO eltInfo)
{
    TestProfiler::Instance()->FunctionEnter3WithInfoCallBack(functionIDOrClientID, eltInfo);
}

but again, not fully understanding the ABI and how to deal with saving registers is my main issue.

At the moment I'm not even sure we at Microsoft have a solid consensus on what the ABI is supposed to be - ProfileEnter callers/ProfilerEnter implementers/MSDN documentation/comments are all making different claims about what the ABI is (see #19023) so I don't have an immediate answer for you - I'm working on it : ) In the meantime doing it in C may not qualify as rigorously correct, but it will probably unblock you while we work this out. Once we've got a specific ABI we are trying to hit we can help a bit with the mechanics of implementing an example routine.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 20, 2018

Thanks, @noahfalk . I'll try this approach and see what I get. When I was originally working on some very prototype-y code, I was doing similar, and it was at least passing functionID and it was stable (didn't ever try to use the ELT info). So this makes some sense. Will let you know how it turns out.

dotnetjt commented Jul 20, 2018

Thanks, @noahfalk . I'll try this approach and see what I get. When I was originally working on some very prototype-y code, I was doing similar, and it was at least passing functionID and it was stable (didn't ever try to use the ELT info). So this makes some sense. Will let you know how it turns out.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 20, 2018

@noahfalk this "doesn't not work." It no longer results in a segmentation fault, however I've yet to validate the results of the API call. If they are good - then I can keep working. Otherwise I'm just stuck at a new spot. :)

Just curious, why / how would this work, and not destroy the stack in the process?

dotnetjt commented Jul 20, 2018

@noahfalk this "doesn't not work." It no longer results in a segmentation fault, however I've yet to validate the results of the API call. If they are good - then I can keep working. Otherwise I'm just stuck at a new spot. :)

Just curious, why / how would this work, and not destroy the stack in the process?

@noahfalk

This comment has been minimized.

Show comment
Hide comment
@noahfalk

noahfalk Jul 20, 2018

Member

Just curious, why / how would this work, and not destroy the stack in the process?

The current implementation for FunctionEnter3WithInfo (and similar for Leave/Tailcall) calls your callback with a stack like this:

your_function_enter_3_with_info
coreclr!ProfileEnter
coreclr!ProfileEnterNaked
jitted_code

ProfileEnterNaked saves and restores some registers on your behalf (like the integer argument registers and some of the return registers). The remaining registers aren't saved, but it works either because undocumented implementation details in the JIT means we don't wind up caring about those registers, or your callback doesn't wind up trashing them. For an example that might not work you could intrument a managed method that takes floating point arguments and you call a native method in your callback that also uses floating point arguments. I'd expect the native call with floating point args would change some of the xmm register values, no code in any of those functions would restore them and after the callback returns the jitted code would observe the arguments used within your callback, not the ones its caller originally provided.

Member

noahfalk commented Jul 20, 2018

Just curious, why / how would this work, and not destroy the stack in the process?

The current implementation for FunctionEnter3WithInfo (and similar for Leave/Tailcall) calls your callback with a stack like this:

your_function_enter_3_with_info
coreclr!ProfileEnter
coreclr!ProfileEnterNaked
jitted_code

ProfileEnterNaked saves and restores some registers on your behalf (like the integer argument registers and some of the return registers). The remaining registers aren't saved, but it works either because undocumented implementation details in the JIT means we don't wind up caring about those registers, or your callback doesn't wind up trashing them. For an example that might not work you could intrument a managed method that takes floating point arguments and you call a native method in your callback that also uses floating point arguments. I'd expect the native call with floating point args would change some of the xmm register values, no code in any of those functions would restore them and after the callback returns the jitted code would observe the arguments used within your callback, not the ones its caller originally provided.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 23, 2018

Makes sense. Thanks!

dotnetjt commented Jul 23, 2018

Makes sense. Thanks!

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 25, 2018

@noahfalk your above explanation makes sense. It got me past the point that I cared about it - I was only trying to get to class layout and field info.

But now, moving on, trying to get parameter info. And seeing some odd behavior that I think is related.

Take this .net method:

public class AwesomeClass
{
    public string MyField;
    public void PrintMyField(string testThis)
    {
            
      Console.WriteLine(MyField);
    }
}

I want to get the string testThis param from PrintMyField

I start to set things up by getting the range of arguments from the profiling api:

HRESULT hr;
COR_PRF_FRAME_INFO pFrameInfo;
ULONG pcbArgumentInfo = 0;

hr = profilerInfo->GetFunctionEnter3Info(functionId, eltInfo, &pFrameInfo, &pcbArgumentInfo, NULL);

COR_PRF_FUNCTION_ARGUMENT_INFO* pArgumentInfo = (COR_PRF_FUNCTION_ARGUMENT_INFO*)malloc(pcbArgumentInfo);

hr = profilerInfo->GetFunctionEnter3Info(functionId, eltInfo, &pFrameInfo, &pcbArgumentInfo, pArgumentInfo);

But then something a bit odd happens. My COR_PRF_FUNCTION_ARGUMENT_INFO structure shows me this:
image

numRanges: 2 makes sense because this method should have a hidden 'this' argument. But then, it only shows one argument. And, of course, when I go to that startAddress, it's actually the address of that 'this' object hidden as the first argument, and not of the first parameter of my method.

Of course the documentation here says that "Arguments that are stored in registers are spilled into memory to build the structures. "

So, I'm guessing that until I resolve this, I'm not going to be getting valid addresses for arguments.

Let me know if there's anything else I can do to help this all out.

dotnetjt commented Jul 25, 2018

@noahfalk your above explanation makes sense. It got me past the point that I cared about it - I was only trying to get to class layout and field info.

But now, moving on, trying to get parameter info. And seeing some odd behavior that I think is related.

Take this .net method:

public class AwesomeClass
{
    public string MyField;
    public void PrintMyField(string testThis)
    {
            
      Console.WriteLine(MyField);
    }
}

I want to get the string testThis param from PrintMyField

I start to set things up by getting the range of arguments from the profiling api:

HRESULT hr;
COR_PRF_FRAME_INFO pFrameInfo;
ULONG pcbArgumentInfo = 0;

hr = profilerInfo->GetFunctionEnter3Info(functionId, eltInfo, &pFrameInfo, &pcbArgumentInfo, NULL);

COR_PRF_FUNCTION_ARGUMENT_INFO* pArgumentInfo = (COR_PRF_FUNCTION_ARGUMENT_INFO*)malloc(pcbArgumentInfo);

hr = profilerInfo->GetFunctionEnter3Info(functionId, eltInfo, &pFrameInfo, &pcbArgumentInfo, pArgumentInfo);

But then something a bit odd happens. My COR_PRF_FUNCTION_ARGUMENT_INFO structure shows me this:
image

numRanges: 2 makes sense because this method should have a hidden 'this' argument. But then, it only shows one argument. And, of course, when I go to that startAddress, it's actually the address of that 'this' object hidden as the first argument, and not of the first parameter of my method.

Of course the documentation here says that "Arguments that are stored in registers are spilled into memory to build the structures. "

So, I'm guessing that until I resolve this, I'm not going to be getting valid addresses for arguments.

Let me know if there's anything else I can do to help this all out.

@mjsabby

This comment has been minimized.

Show comment
Hide comment
@mjsabby

mjsabby Jul 25, 2018

Member

@dotnetjt are you sure the data is not there? Try printing out the values of the data from the pArgumentInfo array. I would recommend you use non-pointer types so you can easily validate using printf debugging.

Your screenshot indicates to me that the data might be there but your debugger UI doesn't know to associate numRanges with the the ranges array.

Member

mjsabby commented Jul 25, 2018

@dotnetjt are you sure the data is not there? Try printing out the values of the data from the pArgumentInfo array. I would recommend you use non-pointer types so you can easily validate using printf debugging.

Your screenshot indicates to me that the data might be there but your debugger UI doesn't know to associate numRanges with the the ranges array.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 25, 2018

@mjsabby You may just be entirely right. I let my loop run through the ranges array, and there was a 2nd value there. But, I have a separate issue that seems to be something different from my Windows version.

When I enumerate ranges I parse the signature of each parameter to check the type, etc, and all that checks out.

I pass through a couple of helper methods, and in this case (correctly) land on a helper to pull that string buffer out of memory. I know this code works, as it works on fields (ObjectID of the class + offset of field). But passing the startAddress of a param doesn't. Admittedly, I haven't dug into it yet, but maybe you know something I don't about why this works fine on windows, but not Linux.

Basically, it looks like this (startAddress being passed to the helper):

ObjectID* id = (ObjectID*)(startAddress);
LPWSTR buffer = (LPWSTR)(*id+pBufferOffset);

pBufferOffset of course obtained via ICorProfilerInfo2::GetStringLayout

Anything obvious that I'm missing?

dotnetjt commented Jul 25, 2018

@mjsabby You may just be entirely right. I let my loop run through the ranges array, and there was a 2nd value there. But, I have a separate issue that seems to be something different from my Windows version.

When I enumerate ranges I parse the signature of each parameter to check the type, etc, and all that checks out.

I pass through a couple of helper methods, and in this case (correctly) land on a helper to pull that string buffer out of memory. I know this code works, as it works on fields (ObjectID of the class + offset of field). But passing the startAddress of a param doesn't. Admittedly, I haven't dug into it yet, but maybe you know something I don't about why this works fine on windows, but not Linux.

Basically, it looks like this (startAddress being passed to the helper):

ObjectID* id = (ObjectID*)(startAddress);
LPWSTR buffer = (LPWSTR)(*id+pBufferOffset);

pBufferOffset of course obtained via ICorProfilerInfo2::GetStringLayout

Anything obvious that I'm missing?

@mjsabby

This comment has been minimized.

Show comment
Hide comment
@mjsabby

mjsabby Jul 25, 2018

Member

I don't have any immediate insight on why you're running into this.

I recommend you look at the length offset that is returned from GetStringLayout2 and see if at least matches what you're passing in.

Member

mjsabby commented Jul 25, 2018

I don't have any immediate insight on why you're running into this.

I recommend you look at the length offset that is returned from GetStringLayout2 and see if at least matches what you're passing in.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 26, 2018

Nope, definitely some issue. If I take the UINT_PTR startAddress for the argument I'm looking for, gdb confirms to me that there's no object there:

-exec x/a 139941557150376
0x7f46aecda2a8:	0x0

Whereas if I take the address of a field in a class layout, run it through the exact same helper, it's good:

-exec x/a 140602064603752
0x7fe0781d8a68:	0x7fe0781d89d0

Substituting into gdb what my code does:

-exec x/hs *(UINT_PTR*)140602064603752 + 12
0x7fe0781d89dc:	u"hello profiler"

So, all I'm trying to say is that the code that looks at memory is good. But for parameters, I'm getting suspect startAddresses, which I am going to assume is either this issue around the calling convention and register preservation, or it is something else as of yet unreported. ¯_(ツ)_/¯

Thanks for your continued help on this guys.

dotnetjt commented Jul 26, 2018

Nope, definitely some issue. If I take the UINT_PTR startAddress for the argument I'm looking for, gdb confirms to me that there's no object there:

-exec x/a 139941557150376
0x7f46aecda2a8:	0x0

Whereas if I take the address of a field in a class layout, run it through the exact same helper, it's good:

-exec x/a 140602064603752
0x7fe0781d8a68:	0x7fe0781d89d0

Substituting into gdb what my code does:

-exec x/hs *(UINT_PTR*)140602064603752 + 12
0x7fe0781d89dc:	u"hello profiler"

So, all I'm trying to say is that the code that looks at memory is good. But for parameters, I'm getting suspect startAddresses, which I am going to assume is either this issue around the calling convention and register preservation, or it is something else as of yet unreported. ¯_(ツ)_/¯

Thanks for your continued help on this guys.

@mjsabby

This comment has been minimized.

Show comment
Hide comment
@mjsabby

mjsabby Jul 26, 2018

Member

Microsoft/clr-samples#11 is tracking adding a sample that uses the ELT APIs for Linux X64 (as well as the well-tested Windows X64/X86) for everyone's benefit.

I did a quick test this morning, and while it wasn't exhaustive I'm able to do some inspections but the jury isn't out yet so I might poke more when I add that sample.

Your issue doesn't say but are you using a recent build of CoreCLR? The official .NET Core 2.1 latest builds may have bugs.

Given that @noahfalk is still chasing exactly what is preserved or not, can you try just preserving and restoring everything? The sample I intend to write at first will do just that.

Member

mjsabby commented Jul 26, 2018

Microsoft/clr-samples#11 is tracking adding a sample that uses the ELT APIs for Linux X64 (as well as the well-tested Windows X64/X86) for everyone's benefit.

I did a quick test this morning, and while it wasn't exhaustive I'm able to do some inspections but the jury isn't out yet so I might poke more when I add that sample.

Your issue doesn't say but are you using a recent build of CoreCLR? The official .NET Core 2.1 latest builds may have bugs.

Given that @noahfalk is still chasing exactly what is preserved or not, can you try just preserving and restoring everything? The sample I intend to write at first will do just that.

@noahfalk

This comment has been minimized.

Show comment
Hide comment
@noahfalk

noahfalk Jul 26, 2018

Member

Let me know if there's anything else I can do to help this all out.

Sorry I've been preoccupied with some other issues so I haven't gotten a chance to to look into this 2nd part of the issue yet. If there is a repro that would be handy, but if not there might be enough info above or I can tinker with our profiler tests. Sorry I just haven't gotten to this one yet.

Member

noahfalk commented Jul 26, 2018

Let me know if there's anything else I can do to help this all out.

Sorry I've been preoccupied with some other issues so I haven't gotten a chance to to look into this 2nd part of the issue yet. If there is a repro that would be handy, but if not there might be enough info above or I can tinker with our profiler tests. Sorry I just haven't gotten to this one yet.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 27, 2018

@mjsabby would you be willing to share with me what you have so far? As I've mentioned, assembly is far from my strong suit. It might be enough to keep me going and I can help test and provide feedback.

dotnetjt commented Jul 27, 2018

@mjsabby would you be willing to share with me what you have so far? As I've mentioned, assembly is far from my strong suit. It might be enough to keep me going and I can help test and provide feedback.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Jul 27, 2018

Also, using current 2.1 release.

dotnetjt commented Jul 27, 2018

Also, using current 2.1 release.

@noahfalk

This comment has been minimized.

Show comment
Hide comment
@noahfalk

noahfalk Jul 30, 2018

Member

Just wanted to give an update - haven't forgotten but its going to be at least another week before we can get to investigating this.

Member

noahfalk commented Jul 30, 2018

Just wanted to give an update - haven't forgotten but its going to be at least another week before we can get to investigating this.

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Aug 2, 2018

Hey @mjsabby and @noahfalk just an update for you.

I took what @mjsabby provided in the ELT sample referenced and plugged in:

NESTED_ENTRY MethodEntered3, _TEXT, NoHandler
  //       Upon entry :
  //           r14 = clientInfo
  //           r15 = profiledRsp

    push rax
    push rcx
    push rdx
    push rsi
    push rdi
    push r8
    push r9
    push r10
    push r11
    call C_PLTFUNC(EnterMethodStub)
    pop r11
    pop r10
    pop r9
    pop r8
    pop rdi
    pop rsi
    pop rdx
    pop rcx
    pop rax
    ret


NESTED_END MethodEntered3, _TEXT

Still the same result. The 'startAddress` given for the argument is wrong, by a long shot. I hate to beat a dead horse, but this to me seems like an issue somewhere with the API giving back the right addresses. Or, something continues to clobber the stack, despite the above ^.

I'm happy to test out whatever you throw over my way. :)

dotnetjt commented Aug 2, 2018

Hey @mjsabby and @noahfalk just an update for you.

I took what @mjsabby provided in the ELT sample referenced and plugged in:

NESTED_ENTRY MethodEntered3, _TEXT, NoHandler
  //       Upon entry :
  //           r14 = clientInfo
  //           r15 = profiledRsp

    push rax
    push rcx
    push rdx
    push rsi
    push rdi
    push r8
    push r9
    push r10
    push r11
    call C_PLTFUNC(EnterMethodStub)
    pop r11
    pop r10
    pop r9
    pop r8
    pop rdi
    pop rsi
    pop rdx
    pop rcx
    pop rax
    ret


NESTED_END MethodEntered3, _TEXT

Still the same result. The 'startAddress` given for the argument is wrong, by a long shot. I hate to beat a dead horse, but this to me seems like an issue somewhere with the API giving back the right addresses. Or, something continues to clobber the stack, despite the above ^.

I'm happy to test out whatever you throw over my way. :)

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Aug 9, 2018

@noahfalk any more traction on this?

dotnetjt commented Aug 9, 2018

@noahfalk any more traction on this?

@noahfalk

This comment has been minimized.

Show comment
Hide comment
@noahfalk

noahfalk Aug 9, 2018

Member

@sywhang I think started looking into it yesterday, not sure where he got with it so far

Member

noahfalk commented Aug 9, 2018

@sywhang I think started looking into it yesterday, not sure where he got with it so far

@bbc2

This comment has been minimized.

Show comment
Hide comment
@bbc2

bbc2 Aug 9, 2018

Hi, I think I might be hitting the same issue so I thought additional info could be useful.

My FunctionEnter does the following:

    printf(
            "  argumentInfo:\n    numRanges: %u\n    totalArgumentSize: %u\n    ranges:\n",
            argumentInfo->numRanges,
            argumentInfo->totalArgumentSize
    );

    for (uint64_t i = 0; i < argumentInfo->numRanges; i++) {
        COR_PRF_FUNCTION_ARGUMENT_RANGE range = argumentInfo->ranges[i];
        printf(
                "      startAddress: %p\n      length: %u\n",
                (void *) range.startAddress,
                range.length
        );

        printf("      data:");
        for (UINT_PTR index = 0; index < range.length; index++) {
            uint8_t byte = *(((uint8_t *) range.startAddress) + index);
            printf(" %02x", byte);
        }
        printf("\n");
    }

With the following C# function being profiled:

        static int foo(int a, int b, int c, string d)
        {
            return a + b + c + d.Length;
        }

For foo(0xaa, 0xbb, 0xcc, "bar"), I get the following output:

  argumentInfo:
    numRanges: 4
    totalArgumentSize: 20
    ranges:
      startAddress: 0x7fffa22e8d80
      length: 4
      data: 00 00 00 00
      startAddress: 0x7fffa22e8d88
      length: 4
      data: 01 00 00 00
      startAddress: 0x7fffa22e8d90
      length: 4
      data: 40 d4 01 a4
      startAddress: 0x7fffa22e8d98
      length: 8
      data: 28 d4 01 a4 ec 7f 00 00

The data doesn't make sense. I'm not familiar with the memory layout of .NET objects so maybe I missed something.

Complete code is available at https://github.com/bbc2/dotnet-test-profiler/tree/function-enter-arg-issue.

> dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   2.1.302
 Commit:    9048955601

Runtime Environment:
 OS Name:     debian
 OS Version:  
 OS Platform: Linux
 RID:         debian-x64
 Base Path:   /usr/share/dotnet/sdk/2.1.302/

Host (useful for support):
  Version: 2.1.2
  Commit:  811c3ce6c0

.NET Core SDKs installed:
  2.1.302 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.2 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.2 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.2 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download

bbc2 commented Aug 9, 2018

Hi, I think I might be hitting the same issue so I thought additional info could be useful.

My FunctionEnter does the following:

    printf(
            "  argumentInfo:\n    numRanges: %u\n    totalArgumentSize: %u\n    ranges:\n",
            argumentInfo->numRanges,
            argumentInfo->totalArgumentSize
    );

    for (uint64_t i = 0; i < argumentInfo->numRanges; i++) {
        COR_PRF_FUNCTION_ARGUMENT_RANGE range = argumentInfo->ranges[i];
        printf(
                "      startAddress: %p\n      length: %u\n",
                (void *) range.startAddress,
                range.length
        );

        printf("      data:");
        for (UINT_PTR index = 0; index < range.length; index++) {
            uint8_t byte = *(((uint8_t *) range.startAddress) + index);
            printf(" %02x", byte);
        }
        printf("\n");
    }

With the following C# function being profiled:

        static int foo(int a, int b, int c, string d)
        {
            return a + b + c + d.Length;
        }

For foo(0xaa, 0xbb, 0xcc, "bar"), I get the following output:

  argumentInfo:
    numRanges: 4
    totalArgumentSize: 20
    ranges:
      startAddress: 0x7fffa22e8d80
      length: 4
      data: 00 00 00 00
      startAddress: 0x7fffa22e8d88
      length: 4
      data: 01 00 00 00
      startAddress: 0x7fffa22e8d90
      length: 4
      data: 40 d4 01 a4
      startAddress: 0x7fffa22e8d98
      length: 8
      data: 28 d4 01 a4 ec 7f 00 00

The data doesn't make sense. I'm not familiar with the memory layout of .NET objects so maybe I missed something.

Complete code is available at https://github.com/bbc2/dotnet-test-profiler/tree/function-enter-arg-issue.

> dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   2.1.302
 Commit:    9048955601

Runtime Environment:
 OS Name:     debian
 OS Version:  
 OS Platform: Linux
 RID:         debian-x64
 Base Path:   /usr/share/dotnet/sdk/2.1.302/

Host (useful for support):
  Version: 2.1.2
  Commit:  811c3ce6c0

.NET Core SDKs installed:
  2.1.302 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.2 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.2 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.2 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download
@sywhang

This comment has been minimized.

Show comment
Hide comment
@sywhang

sywhang Aug 9, 2018

Member

Hey @dotnetjt and @bbc2, thanks for reporting this issue. I was told about this issue by @noahfalk a couple days ago, but I haven't been able to spend much time on it because of a more pressing issue I've been asked to look at.

I will spend some time today to go over this and get back to you later tonight. This may be in the form of a full sample ELT profiler code. (That is, if this is not an issue on our ELT API but simply a misuse of our API)

If it turns out that our API has an issue, then I will also follow up on this thread with what I found.

Thanks,
Sung

Member

sywhang commented Aug 9, 2018

Hey @dotnetjt and @bbc2, thanks for reporting this issue. I was told about this issue by @noahfalk a couple days ago, but I haven't been able to spend much time on it because of a more pressing issue I've been asked to look at.

I will spend some time today to go over this and get back to you later tonight. This may be in the form of a full sample ELT profiler code. (That is, if this is not an issue on our ELT API but simply a misuse of our API)

If it turns out that our API has an issue, then I will also follow up on this thread with what I found.

Thanks,
Sung

@sywhang sywhang self-assigned this Aug 9, 2018

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Aug 9, 2018

@bbc2 your code looks correct to me for grabbing an int out of memory, assuming the startAddress you are given is correct. :)

My profiler is a little more open ended and does a lot of stuff before hand, inspecting signatures to determine the type, checking to see if it's boxed, etc. But in a controlled test, you can do exactly what you are there, and that's how I encountered the same issue. Glad you were able to see it as well so I feel more confident that it's not a coding issue on my end.

@noahfalk and @sywhang thank you!!

dotnetjt commented Aug 9, 2018

@bbc2 your code looks correct to me for grabbing an int out of memory, assuming the startAddress you are given is correct. :)

My profiler is a little more open ended and does a lot of stuff before hand, inspecting signatures to determine the type, checking to see if it's boxed, etc. But in a controlled test, you can do exactly what you are there, and that's how I encountered the same issue. Glad you were able to see it as well so I feel more confident that it's not a coding issue on my end.

@noahfalk and @sywhang thank you!!

@sywhang

This comment has been minimized.

Show comment
Hide comment
@sywhang

sywhang Aug 15, 2018

Member

Hi @dotnetjt and @bbc2 ,

So sorry for the late response. I just wanted to let you know that I am still looking into this issue! I wrote an example profiler for Windows but it turns out that there are subtle changes I have to make it also work on Linux (which mostly has to do with the wide string types that the profiler API uses).. I've just been pulled away for random things so I haven't had the time to get back to this but I still wanted to let you know that I haven't run away from this :) I will get post an update to this thread as soon as I have something concrete and tested on all platforms!

Member

sywhang commented Aug 15, 2018

Hi @dotnetjt and @bbc2 ,

So sorry for the late response. I just wanted to let you know that I am still looking into this issue! I wrote an example profiler for Windows but it turns out that there are subtle changes I have to make it also work on Linux (which mostly has to do with the wide string types that the profiler API uses).. I've just been pulled away for random things so I haven't had the time to get back to this but I still wanted to let you know that I haven't run away from this :) I will get post an update to this thread as soon as I have something concrete and tested on all platforms!

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Aug 21, 2018

@sywhang any updates? Completely understand on the string types.... been there myself!

dotnetjt commented Aug 21, 2018

@sywhang any updates? Completely understand on the string types.... been there myself!

@sywhang

This comment has been minimized.

Show comment
Hide comment
@sywhang

sywhang Aug 21, 2018

Member

@dotnetjt

So I was able to replicate your issue - I wrote a sample profiler that can successfully retrieve function arguments with FunctionEnter3WithInfo callback, but the same profiler code does not seem to behave as documented on Linux. I'm still continuing my investigation on whether it is the profiler that's at fault, or the runtime API itself that's at fault. It may be entirely possible that you've unveiled a bug in a part of the profiler API that hasn't been tested well, so thank you for reporting this issue!

I'll continue posting updates as I continue my investigation.

Member

sywhang commented Aug 21, 2018

@dotnetjt

So I was able to replicate your issue - I wrote a sample profiler that can successfully retrieve function arguments with FunctionEnter3WithInfo callback, but the same profiler code does not seem to behave as documented on Linux. I'm still continuing my investigation on whether it is the profiler that's at fault, or the runtime API itself that's at fault. It may be entirely possible that you've unveiled a bug in a part of the profiler API that hasn't been tested well, so thank you for reporting this issue!

I'll continue posting updates as I continue my investigation.

@sywhang

This comment has been minimized.

Show comment
Hide comment
@sywhang

sywhang Aug 23, 2018

Member

I've successfully verified that this is indeed a runtime bug and filed #19622 to track the specific issue in the runtime (since this thread is more about the general usage of ELT API I thought it'd be better to track the runtime bug in a different thread :) )

Member

sywhang commented Aug 23, 2018

I've successfully verified that this is indeed a runtime bug and filed #19622 to track the specific issue in the runtime (since this thread is more about the general usage of ELT API I thought it'd be better to track the runtime bug in a different thread :) )

@dotnetjt

This comment has been minimized.

Show comment
Hide comment
@dotnetjt

dotnetjt Sep 12, 2018

Hi @sywhang sorry for the slow reply from myself. Guess I missed the updates to this thread. Thanks for the update. I know you have been speaking with @sporty81 as well on this. We'll just keep an eye on it.

dotnetjt commented Sep 12, 2018

Hi @sywhang sorry for the slow reply from myself. Guess I missed the updates to this thread. Thanks for the update. I know you have been speaking with @sporty81 as well on this. We'll just keep an eye on it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment