Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Conversation

@sywhang
Copy link

@sywhang sywhang commented Jan 4, 2019

EventPipe has a buffer that's stored in the Thread object, but in the runtime we may be emitting events from threads that are not Thread object. (i.e. BGC threads). This moves the pointer to EventPipeBufferList that's stored in Thread to TLS so that it can be accessed from a "non-Thread" thread.

This should address https://github.com/dotnet/coreclr/issues/21380

@sywhang sywhang requested review from jorive, noahfalk and vancem January 4, 2019 23:45
pActivityId,
pRelatedActivityId);
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pInstance = new (m_pCurrent) EventPipeEventInstance(
session,
event,
osThreadId,
pDataDest,
payload.GetSize(),
(pThread == NULL) ? NULL : pActivityId,
pRelatedActivityId);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why the conditional on pActivityId is needed. I think we can further simplify to:

Suggested change
pInstance = new (m_pCurrent) EventPipeEventInstance(
session,
event,
osThreadId,
pDataDest,
payload.GetSize(),
pActivityId,
pRelatedActivityId);

src/vm/threads.h Outdated
AppDomain* m_pAppDomain; // This field is read only by the SOS plugin to get the AppDomain
void** m_EETlsData; // ClrTlsInfo::data
EventPipeBufferList* m_pThreadEventBufferList;
bool m_threadEventWriteInProgress;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread local slots are moderately precious things, so would like to economize. You can put the m_threadEventWriteInProgress as a property of EventPipeBufferList and avoid needing it here (it belongs there anyway, it is faster (since you already have a EventPipeBufferList when you need to know EventWriteInProgress, and this variable is protecting exactly the EventPipeBufferList structure anyway, so there is no down-side to moving it).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fyi changes in DeAllocateBuffers have broken the original usage of m_threadEventWriteInProgress. It is dead-code at this point. Originally this field was used so that DeAllocateBuffers could clean up per-thread buffers that were still in use. Looking over the current code I believe we are leaking all the thread local buffers on shutdown.

I think we'd be OK checking in with the leaks and then fixing it in a follow-up PR.

Background: Prior to the PR we had two per-thread volatile fields and the code used them like this:

/* cleanup-thread in DeAllocateBuffers() */
if(!pThread->m_threadEventWriteInProgress)
{
    // delete all the memory in the buffers
    m_pThreadEventBufferList = NULL;
}
else
{
     // leak it
}


/* event writing thread in EventPipeBufferManager::WriteEvent */
m_threadEventWriteInProgress = TRUE 
pBufferList = m_pThreadEventBufferList
WriteToTheBuffer(pBufferList)
m_threadEventWriteInProgress = FALSE

We've since eliminated that code in DeAllocateBuffers() and I didn't notice any replacement.

That original code didn't look thread-safe either. The WriteEvent thread could set the in-use flag and begin using the buffers just after the cleanup thread checked they weren't in use.

@vancem
Copy link

vancem commented Jan 5, 2019

@jkotas @brianrob FYI.

Copy link

@vancem vancem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a couple of change request, but mostly it looks good.

@sywhang
Copy link
Author

sywhang commented Jan 5, 2019

Thanks Vance! I'll update the PR accordingly.

if (pThread == NULL)
{
// if pthread is NULL, it's likely we are running in something like a GC thread which is not a Thread object, so it can't have an activity ID set anyway
pInstance = new (m_pCurrent) EventPipeEventInstance(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only thing thing different between the two cases is the activity ID argument. You can just have the condition for that one argument without duplicating the whole thing.

(pThread != NULL) ? pActivityId : NULL

pActivityId,
pRelatedActivityId);
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why the conditional on pActivityId is needed. I think we can further simplify to:

Suggested change
pInstance = new (m_pCurrent) EventPipeEventInstance(
session,
event,
osThreadId,
pDataDest,
payload.GetSize(),
pActivityId,
pRelatedActivityId);

src/vm/threads.h Outdated
Thread* m_pThread;
AppDomain* m_pAppDomain; // This field is read only by the SOS plugin to get the AppDomain
void** m_EETlsData; // ClrTlsInfo::data
EventPipeBufferList* m_pThreadEventBufferList;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously these types were marked volatile and now they aren't. I would either restore those markings or walk me through the rationale of why they are no longer necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas @kouvel - Is there rationale about why some fields are placed directly in ThreadLocalInfo vs. most others with are indirected inside m_EETlsData? Its unclear to me if putting new fields here is appropriate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should be able do define your own local __thread EventPipeBufferList* t_pThreadEventBufferList. The reasons why we had to have all thread-local statics in this central structure should no longer exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

walk me through the rationale of why they are no longer necessary.

The central structure was required because of the offsets of thread-local statics were hardcoded in the debugger. They are no longer hardcoded. The code in InitThreadManager computes the current offset and communicates it to debugger via DAC.

Copy link
Member

@noahfalk noahfalk Jan 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jkotas - sorry I had two separate issues in those two comments. The text you quoted was in the first comment asking about Sung changing types from Volatile<T> to T but you answered referring to the second comment about usage of TLS slots.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@noahfalk Yeah, I meant to put the Volatile there. Thanks for catching that!
@jkotas Regarding where these are placed (in ThreadLocalInfo vs just defining __thread EventPipeBufferList* t_pThreadEventBufferList - is one preferred over the other?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My preference would be the local definition - __thread EventPipeBufferList* t_pThreadEventBufferList.

EventPipeBufferList *pBufferList = GetThreadEventBufferList();
if(pBufferList != NULL)
{
pBufferList->SetOwnedByThread(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the buffers aren't owned by the Thread object any longer then we should move the cleanup code to a path that applies to both managed and unmanaged threads. As it is, creating an unmanaged thread, logging from it, then destroying it would leak a buffer. CExecutionEngine::ThreadDetaching looks like a reasonable place though admittedly I am not that familiar with that portion of our code. @kouvel might be more knowledgable about it.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like CExecutionEngine::ThreadDetaching only applies to Thread objects and is called when the Thread object is being cleaned up. One way may be to add a destructor to ThreadLocalInfo, seems to work with __declspec(thread) but would need to ensure that it also works with clang. Another option is to keep the thread local info in Thread and create Thread objects. The Thread object is not necessarily tied to managed threads, that would be the m_exposedObject in Thread. For instance, wait threads in the thread pool (see ThreadpoolMgr::CreateWaitThread()) set up the thread to create a Thread object even though they would not run any managed code. I don't know the code well enough yet to tell when a Thread object is necessary vs not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Kount! I agree if we could do a destructor that seems very elegant.

@jkotas believed creating a Thread for those threads would be the harder of the options (https://github.com/dotnet/coreclr/issues/21380#issuecomment-447876984) but I don't have the GC knowledge to have a good opinion myself.

if (!pThreadBufferList->OwnedByThread())
{
Thread *pThread = NULL;
while ((pThread = ThreadStore::GetThreadList(pThread)) != NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brianrob was there a reason we needed to find the Thread object that was pointing at a specific buffer list and NULL it out? It appears we only unowned a buffer when the Thread was being detached/destroyed and I would assume we could either NULL the m_pEventPipeBufferList field at that point or ignore it because it would never be read from again?

@sywhang - This loop is weird now. It is iterating over all the Thread objects, but since you don't use the Thread object in the body of the loop it is just checking the current thread's list (the thread that is executing this code, not the thread refered to by the pThread object) over and over again. Depending on what Brian says I suspect we can delete this entire while((pThread = …) loop as no longer useful, and we could set the TLS slot containing the EventPipeBufferList NULL at the same time it is unowned by a thread.

// so that they can be cleaned up after the thread dies.
EventPipeBufferList *pBufferList = th->GetEventPipeBufferList();
EventPipeBufferList *pBufferList = GetThreadEventBufferList();
if(pBufferList != NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as mentioned below. If the buffers aren't owned by Thread then Thread related code shouldn't be responsible for cleaning them up.

bool success = true;
EX_TRY
{
LPCGUID pThreadActivityID = pActivityId;
Copy link

@jorive jorive Jan 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assignment intended? Variable is not used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I missed that. Thanks for catching that

@sywhang
Copy link
Author

sywhang commented Jan 31, 2019

@noahfalk @jkotas Could you please take another look at this PR? Thank you!

} // extern "C"


EXTERN_C inline EventPipeBufferList* STDCALL GetThreadEventBufferList()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These do not need to be EXTERN_C. In fact, these can be a static methods inside EventPipeBufferList.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or inside ThreadEventBufferList)

{
if (payload[j-1] == (byte)(0) && payload[j] == (byte)(0))
{
byteCount = j;
Copy link
Member

@noahfalk noahfalk Feb 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an off-by-one-error. For example in the string "cat", I'd expect byteCount = 8 but this computes 7. Below

charPayload = MemoryMarshal.Cast<byte, char>(payload.Slice(0, byteCount));

creates payload with an odd number of bytes, then casts it as a charPayload. I'm guessing the implicit rounding in the length conversion within the cast makes this work out, but it would be nice not to rely on those subtleties.

@noahfalk
Copy link
Member

noahfalk commented Feb 6, 2019

heads up, I responded to Vance's comment way above, its easy to miss in all the noise.

// when the thread dies so we can free EventPipeBufferList in the destructor.
class ThreadEventBufferList
{
public:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Since you have get&set methods, the field itself can be private. Only the methods need to be public.

for (int j = 0; j < payload.Length; j+=2)
{
if (payload[j-1] == (byte)(0) && payload[j] == (byte)(0))
if (payload[j] == (byte)(0) && payload[j+1] == (byte)(0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately now the code will read off the end of the buffer when given an odd size buffer (which isn't supposed to happen, but depending on global correctness of code to prevent overruning a buffer isn't great)

I'd suggest leaving the code how it was before, except set byteCount = j+1 and adjust the payload slice length computations.

@sywhang
Copy link
Author

sywhang commented Feb 9, 2019

@noahfalk @jkotas Is there anything else you'd like me to address in this PR? (@noahfalk, I'll fix the race condition and the leak in a subsequent PR as you mentioned in earlier comment)

{
charPayload = charPayload.Slice(0, charCount);
payload = payload.Slice((charCount + 1) * sizeof(char));
charPayload = MemoryMarshal.Cast<byte, char>(payload.Slice(0, byteCount));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
charPayload = MemoryMarshal.Cast<byte, char>(payload.Slice(0, byteCount));
//byteCount includes the null terminator, which we do not want in the managed string object
charPayload = MemoryMarshal.Cast<byte, char>(payload.Slice(0, byteCount - sizeof(char));

byteCount includes the null terminator bytes, but managed string buffers aren't canonically null-terminated. For an example, this code snippet will fail the assert evaluating 4 == 3

        char[] catChars = { 'c', 'a', 't', (char)0 };
        ReadOnlySpan<char> span = new ReadOnlySpan<char>(catChars);
        string cat = new string(span);
        string cat2 = "cat";
        Debug.Assert(cat.Length == cat2.Length);

Copy link
Member

@noahfalk noahfalk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you apply that one final fix then lgtm

int charCount = charPayload.IndexOf('\0');
if (charCount < 0)
// Try to find null terminator (0x00) from the byte span
// NOTE: we do this by hand instead of using payload.IndexOf('\n') because payload may be unaligned due to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This comment talks about payload.IndexOf('\n'), but the code is looking for zeros (\n is not a zero).

@sywhang
Copy link
Author

sywhang commented Feb 10, 2019

Merging this since CI failures seem to be infrastructure issues (timeouts). Thanks everyone for all the feedback!

@sywhang sywhang merged commit ee8cda0 into dotnet:master Feb 10, 2019
@sywhang sywhang deleted the fix-eventpipe-buffer branch May 2, 2019 00:46
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* start ripping out eventpipe buffer to tls

* can now emit events from gc threads

* cleanup

* more cleanup

* more cleanup

* tested on linux

* Addressing PR comments

* Move things around a bit to build in Linux

* change eventpipe buffer deallocation code

* more cleanup

* this while loop doesnt do anything now

* Fix build

* fixing build

* More cleanup

* more pr comments

* Fix unix build

* more pr comments

* trying to add a message to assertion that seems to be causing CIs to fail

* more pr feedback

* handle non-2-byte aligned string payloads inside payload buffers

* some more cleanup

* Fix off by one error in null index calculation

* Make Get/SetThreadEventBufferList a static member of ThreadEventBufferList

* make only the methods public in ThreadEventBufferList

* Addressing noah's comments

* fix comment and last off by 1 error


Commit migrated from dotnet/coreclr@ee8cda0
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants