New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing of procfs stat files when comm name contains spaces #12791

Merged
merged 1 commit into from Oct 19, 2016

Conversation

Projects
None yet
4 participants
@stephentoub
Member

stephentoub commented Oct 19, 2016

If a process or thread name contains a space, our procfs stat file parsing logic breaks as it sees the space in the name as a separator between elements. This in turn causes Process.GetProcesses() and other such functions that rely on these stat files to fail with InvalidDataExceptions. This commit fixes it by adding knowledge of '(' and ')' groupings to the parser.

Fixes #12755
cc: @ellismg, @ianhays, @mellinoe, @AndreyAkinshin

@mellinoe

This comment has been minimized.

Show comment
Hide comment
@mellinoe

mellinoe Oct 19, 2016

Contributor

The changes look good to me, overall. I wanted to point something out that I found in some brief testing I did. I see there's some tests for files with parentheses in their names, but when I actually create a program with parentheses in the name, I get weird output from /proc/<pid>/stat. It doesn't print out the entire name, and often has mismatched parentheses. For example, I created a program named executable-(name), and it shows up in /proc/<pid>/stat as (executable-(nam). Did you encounter this kind of behavior? I think that it is also easy to craft a name with a bunch of "right parentheses", which I believe would trigger an exception in the parser.

Contributor

mellinoe commented Oct 19, 2016

The changes look good to me, overall. I wanted to point something out that I found in some brief testing I did. I see there's some tests for files with parentheses in their names, but when I actually create a program with parentheses in the name, I get weird output from /proc/<pid>/stat. It doesn't print out the entire name, and often has mismatched parentheses. For example, I created a program named executable-(name), and it shows up in /proc/<pid>/stat as (executable-(nam). Did you encounter this kind of behavior? I think that it is also easy to craft a name with a bunch of "right parentheses", which I believe would trigger an exception in the parser.

@@ -139,7 +139,7 @@ public static IPEndPoint[] ParseActiveUdpListenersFromFiles(string udp4File, str
// Parsing logic for local and remote addresses and ports, as well as socket state.
internal static TcpConnectionInformation ParseTcpConnectionInformationFromLine(string line)
{
StringParser parser = new StringParser(line, ' ', true);
StringParser parser = new StringParser(line, ' ', skipEmpty: true);

This comment has been minimized.

@mellinoe

mellinoe Oct 19, 2016

Contributor

You could also just switch the optional parameters in the overload (make the new one last). Up to you.

@mellinoe

mellinoe Oct 19, 2016

Contributor

You could also just switch the optional parameters in the overload (make the new one last). Up to you.

@mellinoe

This comment has been minimized.

Show comment
Hide comment
@mellinoe

mellinoe Oct 19, 2016

Contributor

To expand on the last point, I created a file called

I)mN)o)tEvenTrying)ToMatch)Parentheses)

and it shows up in /proc/<pid>/stat as

(I)mN)o)tEvenTry)

which seems bizarre to me.

Contributor

mellinoe commented Oct 19, 2016

To expand on the last point, I created a file called

I)mN)o)tEvenTrying)ToMatch)Parentheses)

and it shows up in /proc/<pid>/stat as

(I)mN)o)tEvenTry)

which seems bizarre to me.

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Oct 19, 2016

Member

For example, I created a program named executable-(name), and it shows up in /proc//stat as (executable-(nam)

Just to be clear, you're not using the parsing behavior, and just commenting on what's showing up in the files in procfs, right? I've not seen the behavior you're referring to.

To expand on the last point

Hmm, I'm not sure how to handle this. Seems like maybe I just remove the throw I added for mismatched parens in order to try to be a bit more forgiving. We could also eat errors we get from parsing and just don't include processes/threads in the result if there's parsing errors.

Member

stephentoub commented Oct 19, 2016

For example, I created a program named executable-(name), and it shows up in /proc//stat as (executable-(nam)

Just to be clear, you're not using the parsing behavior, and just commenting on what's showing up in the files in procfs, right? I've not seen the behavior you're referring to.

To expand on the last point

Hmm, I'm not sure how to handle this. Seems like maybe I just remove the throw I added for mismatched parens in order to try to be a bit more forgiving. We could also eat errors we get from parsing and just don't include processes/threads in the result if there's parsing errors.

@mellinoe

This comment has been minimized.

Show comment
Hide comment
@mellinoe

mellinoe Oct 19, 2016

Contributor

From the few tests I did, it looks like, if the "name" field in proc/<pid>/stat has any parentheses, then it is always "bounded" by matching parentheses. I don't think any of the other fields in the file can ever have parentheses, so I think we may be able to just assume that, if there are any parentheses, then the name is whatever is between the outer matching pair of parentheses. If that assumption doesn't hold up, I'm not sure what to do.

Contributor

mellinoe commented Oct 19, 2016

From the few tests I did, it looks like, if the "name" field in proc/<pid>/stat has any parentheses, then it is always "bounded" by matching parentheses. I don't think any of the other fields in the file can ever have parentheses, so I think we may be able to just assume that, if there are any parentheses, then the name is whatever is between the outer matching pair of parentheses. If that assumption doesn't hold up, I'm not sure what to do.

@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Oct 19, 2016

Member

assume that, if there are any parentheses, then the name is whatever is between the outer matching pair of parentheses

Ok, I'll do that.

Member

stephentoub commented Oct 19, 2016

assume that, if there are any parentheses, then the name is whatever is between the outer matching pair of parentheses

Ok, I'll do that.

Fix parsing of procfs stat files when comm name contains spaces
If a process or thread name contains a space, our procfs stat file parsing logic breaks as it sees the space in the name as a separator between elements.  This in turn causes Process.GetProcesses() and other such functions that rely on these stat files to fail with InvalidDataExceptions.  This commit fixes it by adding knowledge of '(' and ')' groupings to the parser.
@stephentoub

This comment has been minimized.

Show comment
Hide comment
@stephentoub

stephentoub Oct 19, 2016

Member

Thanks for the feedback, @mellinoe. I've changed it to add a dedicated method to StringParser for parsing the next entry that's expected to be surrounded in the only set of top-level parentheses in the string. I also added a few more test cases based on our chat.

Member

stephentoub commented Oct 19, 2016

Thanks for the feedback, @mellinoe. I've changed it to add a dedicated method to StringParser for parsing the next entry that's expected to be surrounded in the only set of top-level parentheses in the string. I also added a few more test cases based on our chat.

@stephentoub stephentoub merged commit aab5e06 into dotnet:master Oct 19, 2016

10 checks passed

Innerloop CentOS7.1 Debug Build and Test Build finished.
Details
Innerloop CentOS7.1 Release Build and Test Build finished.
Details
Innerloop Linux ARM Emulator Debug Cross Build Build finished.
Details
Innerloop Linux ARM Emulator Release Cross Build Build finished.
Details
Innerloop OSX Debug Build and Test Build finished.
Details
Innerloop OSX Release Build and Test Build finished.
Details
Innerloop Ubuntu14.04 Debug Build and Test Build finished.
Details
Innerloop Ubuntu14.04 Release Build and Test Build finished.
Details
Innerloop Windows_NT Debug Build and Test Build finished.
Details
Innerloop Windows_NT Release Build and Test Build finished.
Details

@stephentoub stephentoub deleted the stephentoub:fix_procfs_parsing branch Oct 19, 2016

@karelz karelz modified the milestone: 1.2.0 Dec 3, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment