Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upwindows/certificates: Improve table's coverage of Personal certificates #5640
Conversation
They aren't returned by the cert enumeration APIs, but if we directly look into their %APPDATA%, we can find them. The files there are a custom file format, which we can parse to extract the actual encoded certificate. We can then forward that back to the winapi to extract info out of it. We can also proactively look for them on disk, regardless of the enumeration, which will guarantee that we will always be able to show Personal certs regardless of things like, e.g. login history. For example, if a user has never logged in on a system since boot, they may not appear in the registry, and thus the enumeration, so their Personal certs will not appear, even though they are available.
It would safely fail before with a file not found error, but this is more explicit.
| if (storeLocation != "Users" || boost::ends_with(storeId, "_Classes")) { | ||
| TLOG << " Trying harder to get Personal store."; | ||
|
|
||
| // TODO: This can technically be optimized. In certain cases, we will |
theopolis
Jul 17, 2019
Member
Do you mind moving the content of this TODO into an issue and replacing this with
// TODO(#1234): Title of the issue
Do you mind moving the content of this TODO into an issue and replacing this with
// TODO(#1234): Title of the issue
offlinemark
Jul 22, 2019
Author
Contributor
done 👍
done
| } | ||
|
|
||
| void findUserPersonalCertsOnDisk(const std::string& username, | ||
| const std::string storeId, |
theopolis
Jul 17, 2019
Member
Be aware that you are making a copy of the content when it is passed into this function as a non-reference. storeId ... etc are passed by value, is this your intention?
Be aware that you are making a copy of the content when it is passed into this function as a non-reference. storeId ... etc are passed by value, is this your intention?
offlinemark
Jul 22, 2019
Author
Contributor
done 👍
done
|
Thanks so much! Will address these shortly. |
|
Overall this lgtm, I just had a couple of questions/nits. I think another iteration and it should be g2g, thanks for adding so much love to the certs table! |
|
|
||
| Header hdr; | ||
|
|
||
| while (true) { |
muffins
Jul 17, 2019
Contributor
Hrm. Having a while (true) tends to make me pretty nervous. Any chance this could have some sort of max_depth value set and an increment, to ensure we don't loop forever?
Moreover... Scanning through your looping logic, do you know how often we hit the situation where we continue? Is there any chance you could re-write this to not make use of the while loop, but possibly a recursive function or the like?
Hrm. Having a while (true) tends to make me pretty nervous. Any chance this could have some sort of max_depth value set and an increment, to ensure we don't loop forever?
Moreover... Scanning through your looping logic, do you know how often we hit the situation where we continue? Is there any chance you could re-write this to not make use of the while loop, but possibly a recursive function or the like?
offlinemark
Jul 22, 2019
Author
Contributor
Yeah, I also think while(true) is a slight code smell, but in this particular case I found it to be one of the cleaner solutions I could think of (although I hadn't considered recursion).
re: infinite loops, those shouldn't be possible because of how we are reading from the stream. So at worst case, we will loop until we reach the end of the stream, then one of the !blob.good() checks will fail, and we will exit. (these file streams are typically less than 1kb, fwiw).
The continue will hit pretty often. The structure of the file being parsed is basically an array of something like
struct Chunk {
Header hdr;
char buf[hdr.size]; // variable length depending on the header
};
and generally the chunk we're interested in is at the end (so we are continueing past all the other chunks that don't have the propid we are looking for). (it would probably be good to have a short description of the format in a comment, I can add one)
I thought of rewriting it using normal loop constructs, or do-while, but it's tricky to avoid duplicated code because of all the error checking on the stream.
A recursive version is doable! It would look something like this:
int getEncodedCert(std::basic_istream<char> &blob,
std::vector<char>& encodedCert) {
const unsigned long CERT_CERT_PROP_ID = 0x20;
Header hdr;
blob.read(reinterpret_cast<char*>(&hdr), sizeof(hdr));
if (!blob.good()) {
return Status::failure();
}
if (hdr.propid != CERT_CERT_PROP_ID) {
blob.ignore(hdr.len);
if (!blob.good()) {
return Status::failure();
}
return getEncodedCert(blob, encodedCert); // recurse
}
encodedCert.resize(hdr.len);
blob.read(encodedCert.data(), hdr.len);
if (!blob.good()) {
return Status::failure();
}
return Status::success();
}
For me this is slightly less readable than the current (due to the recursion, maybe just me), and since I don't think the infinite loop should be an issue, I'd vote to keep it iterative for the typical reasons we prefer it in C++.
Yeah, I also think while(true) is a slight code smell, but in this particular case I found it to be one of the cleaner solutions I could think of (although I hadn't considered recursion).
re: infinite loops, those shouldn't be possible because of how we are reading from the stream. So at worst case, we will loop until we reach the end of the stream, then one of the !blob.good() checks will fail, and we will exit. (these file streams are typically less than 1kb, fwiw).
The continue will hit pretty often. The structure of the file being parsed is basically an array of something like
struct Chunk {
Header hdr;
char buf[hdr.size]; // variable length depending on the header
};
and generally the chunk we're interested in is at the end (so we are continueing past all the other chunks that don't have the propid we are looking for). (it would probably be good to have a short description of the format in a comment, I can add one)
I thought of rewriting it using normal loop constructs, or do-while, but it's tricky to avoid duplicated code because of all the error checking on the stream.
A recursive version is doable! It would look something like this:
int getEncodedCert(std::basic_istream<char> &blob,
std::vector<char>& encodedCert) {
const unsigned long CERT_CERT_PROP_ID = 0x20;
Header hdr;
blob.read(reinterpret_cast<char*>(&hdr), sizeof(hdr));
if (!blob.good()) {
return Status::failure();
}
if (hdr.propid != CERT_CERT_PROP_ID) {
blob.ignore(hdr.len);
if (!blob.good()) {
return Status::failure();
}
return getEncodedCert(blob, encodedCert); // recurse
}
encodedCert.resize(hdr.len);
blob.read(encodedCert.data(), hdr.len);
if (!blob.good()) {
return Status::failure();
}
return Status::success();
}
For me this is slightly less readable than the current (due to the recursion, maybe just me), and since I don't think the infinite loop should be an issue, I'd vote to keep it iterative for the typical reasons we prefer it in C++.
muffins
Jul 30, 2019
Contributor
I'd so follow your gut on this, if you think the current setup is good I'll be for it :) I tend to push back any time I see a while(true), but as you mention it should for the most part be ok. While a recursive solution might look a little better, it will still have the same consequences for someone trying to exhaust resources. It should be fine to leave.
I'd so follow your gut on this, if you think the current setup is good I'll be for it :) I tend to push back any time I see a while(true), but as you mention it should for the most part be ok. While a recursive solution might look a little better, it will still have the same consequences for someone trying to exhaust resources. It should be fine to leave.
offlinemark
Aug 1, 2019
Author
Contributor
Yeah, I think keeping it as is is my preference. I'd think that a recursive solution might possibly be more error prone due to being able to run out of stack space if there happened to be a blob file with a pathological number of chunks.
Yeah, I think keeping it as is is my preference. I'd think that a recursive solution might possibly be more error prone due to being able to run out of stack space if there happened to be a blob file with a pathological number of chunks.
|
@theopolis @muffins Thanks so much for your thorough reviews! I believe I have addressed all of the comments, except the one about whether to make the blob parsing function recursive. Just wanted to confirm that's what we want to do before I do it :) |
| } | ||
|
|
||
| void findUserPersonalCertsOnDisk(const std::string& username, | ||
| const std::string storeId, |
offlinemark
Jul 22, 2019
Author
Contributor
done 👍
done
|
Lgtm! I had a couple of nits/thoughts on the work. I think one last bit that has been discussed by the small council is we'd like to see unit tests for diffs coming in? Any chance you could add some unit tests for the table? Looks like we've already got some built out, would it be easy to extend the testing to add some checks for your code and increase the coverage? |
| << getSystemRoot().root_name().string() << "\\Users\\" << username | ||
| << "\\AppData\\Roaming\\Microsoft\\SystemCertificates\\My\\Certificates"; | ||
|
|
||
| try { |
muffins
Jul 30, 2019
Contributor
nit: I'll leave this one up to you, but this try is encompassing quite a few API calls. What are you trying to catch here? My concern is that if an exception is thrown in getEncodedCert, CertCreateCertificateContext, addCertRow, or in your simple enumeration of the file system, we'll always see the same error message displayed. Could this try/catch be brought in a bit closer to the parts where you think an issue might occur?
nit: I'll leave this one up to you, but this try is encompassing quite a few API calls. What are you trying to catch here? My concern is that if an exception is thrown in getEncodedCert, CertCreateCertificateContext, addCertRow, or in your simple enumeration of the file system, we'll always see the same error message displayed. Could this try/catch be brought in a bit closer to the parts where you think an issue might occur?
offlinemark
Aug 1, 2019
Author
Contributor
The try is meant for fs::directory_iterator which will throw filesystem_error. This is mainly to catch when a nonexistent certsPath is passed in which can occasionally happen. Since only filesystem_error is caught, exceptions raised from the functions you mentioned shouldn't trigger this logging message (and will either crash the program, or bubble up to a higher catch). Because of how we use directory_iterator in the for loop, I'm not sure the scope can be tighter.
https://www.boost.org/doc/libs/1_70_0/libs/filesystem/doc/reference.html#Error-reporting
The try is meant for fs::directory_iterator which will throw filesystem_error. This is mainly to catch when a nonexistent certsPath is passed in which can occasionally happen. Since only filesystem_error is caught, exceptions raised from the functions you mentioned shouldn't trigger this logging message (and will either crash the program, or bubble up to a higher catch). Because of how we use directory_iterator in the for loop, I'm not sure the scope can be tighter.
https://www.boost.org/doc/libs/1_70_0/libs/filesystem/doc/reference.html#Error-reporting
| storeLocation != "Users" || boost::ends_with(storeId, "_Classes"); | ||
|
|
||
| if (is_personal_store && not_already_added) { | ||
| VLOG(1) << "Trying harder to get Personal store."; |
muffins
Jul 30, 2019
Contributor
Hrm, what are your thoughts on the verbiage of this? if I saw this message coming across in my logs, I'd be curious what Trying harder means. How would you feel about elaborating a bit more? Perhaps something to the affect of Checking disk for additional user stored certificates or something?
Another thought I had, is there any extra added cost to "trying harder"? Is this something that we should be toggling on/off in the event someone wants more performance?
Hrm, what are your thoughts on the verbiage of this? if I saw this message coming across in my logs, I'd be curious what Trying harder means. How would you feel about elaborating a bit more? Perhaps something to the affect of Checking disk for additional user stored certificates or something?
Another thought I had, is there any extra added cost to "trying harder"? Is this something that we should be toggling on/off in the event someone wants more performance?
offlinemark
Aug 1, 2019
Author
Contributor
I addressed the first part, and added some performance notes in the PR description. tldr; performance impact should be negligible, even if a user legitimately is explicitly not interested in querying for Personal certificates.
I addressed the first part, and added some performance notes in the PR description. tldr; performance impact should be negligible, even if a user legitimately is explicitly not interested in querying for Personal certificates.
|
Thanks @muffins for the second round of review! |
| storeLocation != "Users" || boost::ends_with(storeId, "_Classes"); | ||
|
|
||
| if (is_personal_store && not_already_added) { | ||
| VLOG(1) << "Trying harder to get Personal store."; |
offlinemark
Aug 1, 2019
Author
Contributor
I addressed the first part, and added some performance notes in the PR description. tldr; performance impact should be negligible, even if a user legitimately is explicitly not interested in querying for Personal certificates.
I addressed the first part, and added some performance notes in the PR description. tldr; performance impact should be negligible, even if a user legitimately is explicitly not interested in querying for Personal certificates.
| << getSystemRoot().root_name().string() << "\\Users\\" << username | ||
| << "\\AppData\\Roaming\\Microsoft\\SystemCertificates\\My\\Certificates"; | ||
|
|
||
| try { |
offlinemark
Aug 1, 2019
Author
Contributor
The try is meant for fs::directory_iterator which will throw filesystem_error. This is mainly to catch when a nonexistent certsPath is passed in which can occasionally happen. Since only filesystem_error is caught, exceptions raised from the functions you mentioned shouldn't trigger this logging message (and will either crash the program, or bubble up to a higher catch). Because of how we use directory_iterator in the for loop, I'm not sure the scope can be tighter.
https://www.boost.org/doc/libs/1_70_0/libs/filesystem/doc/reference.html#Error-reporting
The try is meant for fs::directory_iterator which will throw filesystem_error. This is mainly to catch when a nonexistent certsPath is passed in which can occasionally happen. Since only filesystem_error is caught, exceptions raised from the functions you mentioned shouldn't trigger this logging message (and will either crash the program, or bubble up to a higher catch). Because of how we use directory_iterator in the for loop, I'm not sure the scope can be tighter.
https://www.boost.org/doc/libs/1_70_0/libs/filesystem/doc/reference.html#Error-reporting
|
|
||
| Header hdr; | ||
|
|
||
| while (true) { |
offlinemark
Aug 1, 2019
Author
Contributor
Yeah, I think keeping it as is is my preference. I'd think that a recursive solution might possibly be more error prone due to being able to run out of stack space if there happened to be a blob file with a pathological number of chunks.
Yeah, I think keeping it as is is my preference. I'd think that a recursive solution might possibly be more error prone due to being able to run out of stack space if there happened to be a blob file with a pathological number of chunks.
| certBuff.data(), | ||
| static_cast<unsigned long>(certBuff.size())); | ||
| r["common_name"] = certBuff.data(); | ||
| VLOG(1) << "cert name: " << certBuff.data(); |
theopolis
Aug 5, 2019
Member
I don't like this VLOG. If you are adding these for your own purposes of developing the table then we should remove them before landing. I don't see the value-add for a user running the table and attempting to debug.
I don't like this VLOG. If you are adding these for your own purposes of developing the table then we should remove them before landing. I don't see the value-add for a user running the table and attempting to debug.
offlinemark
Aug 5, 2019
Author
Contributor
done 👍
done
| QueryData& results) { | ||
| std::stringstream certsPath; | ||
| certsPath | ||
| << getSystemRoot().root_name().string() << "\\Users\\" << username |
theopolis
Aug 5, 2019
Member
Is this the best approach? Could this find the home directory from the users table? I am not sure if Windows users can change their home to another location. If they cannot then this seems fine.
Is this the best approach? Could this find the home directory from the users table? I am not sure if Windows users can change their home to another location. If they cannot then this seems fine.
offlinemark
Aug 6, 2019
Author
Contributor
Thanks for questioning this! I refactoring things to reuse some code from the users table for retrieving the home directory of a user. In doing so, I discovered that system accounts (e.g. Local System) also have directories for Personal certificates, and so I've modified the PR to also be able to retrieve these. (Previously, it would bail out with a filesystem error). I've modified the PR description with a few screenshots of this.
Thanks for questioning this! I refactoring things to reuse some code from the users table for retrieving the home directory of a user. In doing so, I discovered that system accounts (e.g. Local System) also have directories for Personal certificates, and so I've modified the PR to also be able to retrieve these. (Previously, it would bail out with a filesystem error). I've modified the PR description with a few screenshots of this.
| @@ -322,7 +581,25 @@ void enumerateCertStore(const HCERTSTORE& certStore, | |||
| auto certContext = CertEnumCertificatesInStore(certStore, nullptr); | |||
|
|
|||
| if (certContext == nullptr && GetLastError() == CRYPT_E_NOT_FOUND) { | |||
| TLOG << " Store was empty."; | |||
| VLOG(1) << "Store was empty."; | |||
theopolis
Aug 5, 2019
Member
This is not a great VLOG.
This is not a great VLOG.
offlinemark
Aug 5, 2019
Author
Contributor
removed 👍
removed
Similar to the user_groups table, we filter out some of the accounts we know will not have directories on disk for certificates
Previously, we assumed a certain path structure for where to find a user's Personal certificates. Now we reuse some functionality from the users table which checks the registry to retrieve the home directory.
For system accounts (e.g. Local System) the home dir paths we get back from getUserHomeDir may contain environment variables. This expands them so we can now retrieve certificates within certificate directories for system accounts.
|
Lgtm! |
|
Thanks so much @muffins and @theopolis! |
This is a follow up PR to #5631.
The winapis for certificate store enumeration generally do not show any certificates for other users' Personal stores, even when there are certificates there. Furthermore, whether the table shows information about other users' stores depends on whether the user has a registry hive mounted, which can be affected by events e.g. whether the user has logged in since boot. Personal certificates are actually stored on disk, rather than in the registry, which makes it possible to reliably read them (from an Admin process), regardless of whether they belong to a different user, and regardless of the state of the system's login history. This PR takes advantage of this to make the certificates table more complete wrt Personal certificates; Personal certificates are now guaranteed to show for all users, regardless of login history.
Testing:
I measured performance of the table with the
findUserPersonalCertsOnDiskfunction enabled and disabled (return immediately). RanMeasure-Command { .\osqueryd.exe -S "select * from certificates;" }20 times. My system has 3 local users with 1 Personal cert each.Disabled (average, ms): 1582
Enabled (average, ms): 1594
It really depends on the number of local users on the system, and the number of personal certs on disk each of them have, but I believe those should typically both be fairly low, so I don't think this PR should affect performance dramatically.
Here's an example screenshot of it running:
Here's 1 full line of output when run in json mode (happens to have been run on a different machine):
Here's a gist of full output after a fresh boot where only
markhas logged in. Notice howuserandtmponly have entries in theUsersstore location for thePersonalstore (because they are proactively searched for). No entries for other stores.https://gist.github.com/mossberg/31df3182aa719673a5eb3ba56d8a61ce
Here's full output after
userandtmphave logged in; there are entries in theUsersstore location for other stores because their hives have been loaded, so the winapis find them. https://gist.github.com/mossberg/048d4fb04c75bdd9f1d66c27e549ded7For additional testing, I added a certificate into the Local System account's directory for Personal certificates. This exists inside
C:\Windows\system32\config\systemprofile. I then verified that the table will find certificates within this directory for the Local System account. (To install a certificate there, I actually installed it to a local account, and then manually copied the blob file from the local user's certificate directory to Local System's).This code is hard to unit test because the testing system needs to have certificates and user accounts set up in a certain way. Writing a mock layer is also difficult due to the nature of the system APIs and the way they use (nested) callbacks. In general, for testing I placed a number of certificates explicitly in the Personal stores of several users on the system and ensured the queries could always find at least all Personal certificates I had installed, regardless of the user login history.