Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve property parsing #41

Merged
merged 4 commits into from Sep 27, 2018
Merged

Improve property parsing #41

merged 4 commits into from Sep 27, 2018

Conversation

aydany
Copy link
Contributor

@aydany aydany commented Mar 15, 2018

This set introduces two improvements:

  1. The properties cache in the parser could store multiple copies of the same property
    This situation could occur because the last property parsed was added to the cache and returned but the buffer pointer and property index were not incremented. This is now fixed: every parsed property can exist in the cache just once and there is no duplicate property parsing.

  2. String values (ANSI and UNICODE) that belong to the last property in an event record are not guaranteed to be null-terminated. The code assumes that every string is null-terminated, which causes strlen and wcslen to potentially access invalid memory looking for a null terminator. If the memory access succeeds, the parser will throw an exception as the length of the string property will be too large. These error conditions were exposed when processing events from http.sys. I modeled the fix after the way Microsoft.Diagnostics.Tracing.TraceEventRawReaders deals with strings. All unit tests still pass and the events from http.sys are now processed without errors.

aydanyu added 2 commits March 14, 2018 22:50
The last property parsed was not added to the cache and the property
index was not advanced. This resulted in extra work during parsing and
the same property being present multiple times in the cache.
Strings at the end of a event buffer may not contain a null terminator.
@swannman
Copy link
Member

@aydany thanks for the pull request! I'm reviewing now. Also cc @zacbrown

Here's the TraceEventRawReaders reference: https://github.com/Microsoft/perfview/blob/master/src/TraceEvent/TraceEvent.cs#L3886

@swannman
Copy link
Member

swannman commented Mar 15, 2018

@aydany would you be willing to contribute a unit test that covers the non-null-terminated case so we have test coverage on this new code?

// The property was found, return it
if (name == pName) {
// advance the index since we've already processed this property
++i;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this line necessary? if we're about to return propInfo then i will go out of scope so we don't need to advance it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that i is defined as a reference to lastPropertyIndex_` at the start of the for loop.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, I missed the ampersand. Thanks!

@aydany
Copy link
Contributor Author

aydany commented Mar 16, 2018

@swannman I'll take a look at adding a test tomorrow. Thanks for the review!

// size of unicode string is length + 1 (null byte) * sizeof(wchar)
auto chars = (ULONG)wcslen((const wchar_t*)propertyStart) + 1;
propertyLength = chars * sizeof(wchar_t);
const wchar_t* p = (const wchar_t*)propertyStart;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stylistically - prefer auto here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto line below as well

}
else if (propertyInfo.nonStructType.InType == TDH_INTYPE_ANSISTRING)
{
// size of ansi string is length + 1 (null byte)
propertyLength = (ULONG)strlen((const char*)propertyStart) + 1;
const char* p = (const char*)propertyStart;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer auto here

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto line below as well

@swannman
Copy link
Member

swannman commented Apr 8, 2018

@aydany it looks like there are a few minor changes necessary before we can merge this in. Are you still willing to make those?

aydanyu added 2 commits September 27, 2018 10:00
Modified the record_generator to trim the generated record (optionally) if the last property is a string.
Include a test for UNICODE strings, but do not have an ETW event for to test the ANSI string case.
The included test results in error, when the size check changes are reverted.
@swannman
Copy link
Member

Nice! @zacbrown any concerns?

@aydany
Copy link
Contributor Author

aydany commented Sep 27, 2018

I apologize for taking way too long to get back to this. I addressed the comment about style and added a unit test to cover the code changes. The test will fail when run against master, but passes here.

Please note that the test covers only the UNICODE string case. To cover the ANSI string case we need a valid ETW event that has an ANSI string as its last property. I happened to encounter the UNICODE case accidentally and I do not have an example that can be used for the ANSI case at this time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants