Add new ADS table #8190

nachorpaez · 2023-11-13T20:01:39Z

This PR adds a new table called ads to query Alternate Data Streams (ADS) in Windows. Resolves #5250

The table uses FindFirstStreamW and FindNextStreamW to enumerate the streams in a file and read the content.
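The enumeration described above can be sketched roughly as follows. This is a minimal illustration based on the documented behavior of `FindFirstStreamW` and `FindNextStreamW`, not the code from this PR. The helper name `listStreamNames` is hypothetical, and the non-Windows branch is only a stub so the snippet compiles on other platforms. Reading the stream content is not shown.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not from the PR): returns the raw stream names of a
// file or directory, e.g. L"::$DATA" or L":Zone.Identifier:$DATA".
std::vector<std::wstring> listStreamNames(const std::wstring& path) {
  std::vector<std::wstring> names;
#ifdef _WIN32
  WIN32_FIND_STREAM_DATA data;
  HANDLE hFind =
      FindFirstStreamW(path.c_str(), FindStreamInfoStandard, &data, 0);
  if (hFind == INVALID_HANDLE_VALUE) {
    return names; // No streams found, or the path could not be opened
  }
  do {
    names.emplace_back(data.cStreamName);
  } while (FindNextStreamW(hFind, &data));
  FindClose(hFind);
#else
  (void)path; // Stub: alternate data streams are an NTFS/Windows concept
#endif
  return names;
}
```

A path that cannot be opened simply yields an empty list in this sketch; a real table would probably want to log the error.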

If the stream name is Zone.Identifier, its values are parsed in a separate function.
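For readers unfamiliar with the format: a Zone.Identifier stream holds INI-style text, typically a `[ZoneTransfer]` section followed by `key=value` lines such as `ZoneId`, `ReferrerUrl` and `HostUrl`. A minimal parsing sketch, assuming that layout, could look like this. The function name `parseZoneIdentifier` is hypothetical and is not the function from this PR.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not from the PR): parses the INI-style text of a
// Zone.Identifier stream into key/value pairs.
std::map<std::string, std::string> parseZoneIdentifier(
    const std::string& content) {
  std::map<std::string, std::string> values;
  std::istringstream stream(content);
  std::string line;
  while (std::getline(stream, line)) {
    // Streams written on Windows usually end lines with "\r\n"
    if (!line.empty() && line.back() == '\r') {
      line.pop_back();
    }
    // Skip blank lines and section headers such as "[ZoneTransfer]"
    if (line.empty() || line.front() == '[') {
      continue;
    }
    // Split on the first '=' only, since URLs may contain '=' themselves
    const auto pos = line.find('=');
    if (pos == std::string::npos) {
      continue;
    }
    values[line.substr(0, pos)] = line.substr(pos + 1);
  }
  return values;
}
```

Each resulting pair could then become one row, with the map key in the `key` column and the map value in the `value` column.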

I tried to keep the logic similar to the extended_attributes table and how it handles the kMDItemWhereFroms and quarantine attributes.

Screenshots

Additional resources

Practical Guide to Alternative Data Streams in NTFS

Alternate Data Streams Documentation

Highway To The Danger Zone.Identifier

About URL Security Zones

Open question

NTFS ADS can also hold executable files. For example, we can hide the notepad executable in test.txt:

C:\WINDOWS>echo Test>test.txt
C:\WINDOWS>type notepad.exe>test.txt:note.exe
C:\WINDOWS>type test.txt
Test

As of now the table base64-encodes the binary content, but I would love some guidance on the best option for these cases.
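To make the current behavior concrete, here is a standalone sketch of "return printable content as-is, base64-encode everything else". It is an assumption about the general approach, not the PR's implementation; osquery has its own base64 utilities, and the names `isPrintableText`, `base64Encode` and `encodeStreamValue` are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: true if every byte is printable ASCII or whitespace.
bool isPrintableText(const std::string& data) {
  for (unsigned char c : data) {
    if (!std::isprint(c) && !std::isspace(c)) {
      return false;
    }
  }
  return true;
}

// Standard base64 (RFC 4648 alphabet) with '=' padding.
std::string base64Encode(const std::string& data) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(((data.size() + 2) / 3) * 4);
  for (std::size_t i = 0; i < data.size(); i += 3) {
    const std::size_t remaining = data.size() - i;
    // Pack up to three bytes into a 24-bit chunk
    unsigned int chunk = static_cast<unsigned char>(data[i]) << 16;
    if (remaining > 1) {
      chunk |= static_cast<unsigned char>(data[i + 1]) << 8;
    }
    if (remaining > 2) {
      chunk |= static_cast<unsigned char>(data[i + 2]);
    }
    out.push_back(kAlphabet[(chunk >> 18) & 0x3F]);
    out.push_back(kAlphabet[(chunk >> 12) & 0x3F]);
    out.push_back(remaining > 1 ? kAlphabet[(chunk >> 6) & 0x3F] : '=');
    out.push_back(remaining > 2 ? kAlphabet[chunk & 0x3F] : '=');
  }
  return out;
}

// Returns {value column, base64 column} for one stream's content.
std::pair<std::string, bool> encodeStreamValue(const std::string& raw) {
  if (isPrintableText(raw)) {
    return {raw, false};
  }
  return {base64Encode(raw), true};
}
```

With this approach a hidden executable still ends up in the `value` column in full, only encoded, which is why the size and content questions remain open.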

linux-foundation-easycla · 2023-11-13T20:01:47Z

The committers listed above are authorized under a signed CLA.

✅ login: nachorpaez / name: Nacho Rodriguez Paez (4f3eec5, ff85549, 7efb23e, 33dce52, 35061e2, 0486fb7, f03eac9)

zwass

Thank you! This is a Windows "feature" I did not know about previously 😅

specs/windows/ads.table

osquery/tables/system/windows/ads.cpp

zwass · 2023-11-22T18:19:58Z

osquery/tables/system/windows/ads.cpp

+    // Folders can have ADS streams too
+    if (!(boost::filesystem::is_regular_file(path, ec) ||
+          boost::filesystem::is_directory(path, ec))) {
+      continue;
+    }
+    enumerateStreams(results, path.string());


Would this miss directories that are included in the directory constraint?

zwass · 2023-11-22T18:22:26Z

osquery/tables/system/windows/ads.cpp

+
+  if (hFind != INVALID_HANDLE_VALUE) {
+    do {
+      // Skip first stream


Why? As a reader previously unfamiliar with ADS, I don't understand why the first stream is not relevant. Is it because the first stream is the standard file?

Yes, per FindFirstStreamW docs:

The FindFirstStreamW function opens a search handle and returns information about the first $DATA stream in the specified file or directory. For files, this is always the default, unnamed data stream, "::$DATA". Directories do not have $DATA streams by default and cannot have an unnamed data stream, but may have named data streams set after they have been created.

The unnamed data stream is the file contents, so I thought it best to skip it. However, I have just noticed that, the way the table works now, it also skips the first stream of a directory, which it shouldn't, since directories can't have an unnamed data stream.

I will address this alongside the other comments.
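One possible way to handle both files and directories is to skip a stream by its name rather than by its position, since the API reports names in the form `:name:$DATA` and the unnamed default stream as `::$DATA`. The sketch below illustrates that idea only; it is not the fix that was applied in the PR. The helper name `extractStreamName` is hypothetical, and narrow strings are used for brevity even though the Windows API returns wide strings.

```cpp
#include <string>

// Hypothetical helper: extracts "name" from a raw stream name such as
// ":name:$DATA". Returns false for the unnamed default stream "::$DATA"
// and for input that does not follow the ":name:type" layout.
bool extractStreamName(const std::string& rawName, std::string& name) {
  if (rawName.empty() || rawName.front() != ':') {
    return false;
  }
  const auto end = rawName.find(':', 1);
  if (end == std::string::npos || end == 1) {
    return false; // Malformed, or an empty name (the default stream)
  }
  name = rawName.substr(1, end - 1);
  return true;
}
```

With a check like this, a directory whose first reported stream is a named one would no longer lose it.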

zwass · 2023-11-22T18:24:56Z

osquery/tables/system/windows/ads.cpp

+            const std::string& value) {
+  Row r;
+  r["path"] = path;
+  r["directory"] = boost::filesystem::path(path).parent_path().string();


In the case of a directory would this erroneously return the parent directory? I'm thinking in that case path and directory should be the same?

Yes, if a directory is specified the table will return something like:

> select * from ads where path = 'C:\Users\ignacior\Downloads\subdir';
+------------------------------------+-----------------------------+----------+-----------------+--------+
| path                               | directory                   | key      | value           | base64 |
+------------------------------------+-----------------------------+----------+-----------------+--------+
| C:\Users\ignacior\Downloads\subdir | C:\Users\ignacior\Downloads | hide.txt | secret          | 0      |
+------------------------------------+-----------------------------+----------+-----------------+--------+

Which is a similar behaviour to the extended_attributes table:

Using a virtual database. Need help, type '.help'
osquery> select version from osquery_info;
+---------+
| version |
+---------+
| 5.4.0   |
+---------+
osquery> select * from extended_attributes where path = '/Users/ignacior/Downloads';
+---------------------------+-----------------+----------------+--------------------------------------+--------+
| path                      | directory       | key            | value                                | base64 |
+---------------------------+-----------------+----------------+--------------------------------------+--------+
| /Users/ignacior/Downloads | /Users/ignacior | com.apple.macl | BAAoRInjLHdMmrS8U0MEzCBvBADshbS1+VFBm| 1      |
+---------------------------+-----------------+----------------+--------------------------------------+--------+

I'm happy to change it though if it's better to keep path and directory the same.

tests/integration/tables/ads.cpp

nachorpaez · 2023-11-30T19:39:03Z

@zwass I applied some of your suggested changes. I will take a look at having the tests create a file later this week.

directionless · 2023-12-15T03:24:15Z

osquery/tables/system/windows/ads.cpp

+  }
+}
+
+QueryData genAds(QueryContext& context) {


I'm a little unsure of how the path and directory expansion logic work.

I think the intent is that one can query a directory and get back all the files inside (similar to the file table), or one can query a specific path (and both probably support LIKE). This makes sense to me.

But the implementation enumerates the path first, and then the directory. And I think that will result in equal work being done. Sometimes? I guess it's confusing.

If both are part of the query predicate, sqlite will filter to only return rows that match both. Eg select * from ads where directory = '/tmp/' AND path = '/var/tmp/foo' would result first in enumerating /var/tmp/foo, then all the files in /tmp, and would ultimately return nothing, because sqlite filtered them.

Though, to correct myself, that's not at all true: there's an OR in there.

I guess I'd suggest a reasonable pattern is to generate the list of things to enumerate, and then find the union of them. This would, at least, prevent duplicate enumeration.

The logic is a copy-paste from the extended_attributes table, since that is what I based my work on. I agree it is a bit confusing 😅

I can take a look into making it more efficient.
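The "build the list first, then take the union" pattern suggested above could be sketched as follows. This assumes the `path` and `directory` constraints have already been expanded into concrete file paths; the function name `unionOfTargets` and both parameter names are hypothetical.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: merges the paths resolved from `path` constraints and
// from `directory` constraints into one duplicate-free set, so each file is
// enumerated at most once.
std::set<std::string> unionOfTargets(
    const std::vector<std::string>& fromPathConstraints,
    const std::vector<std::string>& fromDirectoryConstraints) {
  std::set<std::string> targets(fromPathConstraints.begin(),
                                fromPathConstraints.end());
  targets.insert(fromDirectoryConstraints.begin(),
                 fromDirectoryConstraints.end());
  return targets;
}
```

Note that this compares strings exactly. On NTFS, where paths are usually case-insensitive, the paths would need to be normalized first for the deduplication to be reliable.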

directionless · 2023-12-15T03:27:34Z

specs/windows/ads.table

+    Column("path", TEXT, "Absolute file path", required=True, index=True),
+    Column("directory", TEXT, "Directory of file(s)", required=True),
+    Column("key", TEXT, "Name of the value generated from the stream"),
+    Column("value", TEXT, "The parsed information from the attribute"),


What kind of data shows up here? Some internet things talk about malware smuggling entire file contents here. Will that be okay to push back in a column? (We don't generally push that much data through osquery, so it feels a little amiss)

100% agree. I think the main value of this table is the content of the Zone.Identifier stream which can help during investigations to identify where the file was downloaded from.
In my PR description I left an open question about how to handle cases where a stream contains an entire file. Maybe we can set a hard limit on the length of the content and warn users if the content is too large to be displayed by osquery?
Happy to hear other thoughts.

I'm also not sure how the extended_attributes table handles these kinds of cases on *nix systems.

I'm also not sure how the extended_attributes table handles these kinds of cases on *nix systems.

The possible max value size is much more limited: https://en.wikipedia.org/wiki/Extended_file_attributes

The Linux kernel allows extended attribute to have names of up to 255 bytes and values of up to 64 KiB, as do XFS and ReiserFS, but ext2/3/4 and btrfs impose much smaller limits, requiring all the attributes (names and values) of one file to fit in one "filesystem block" (usually 4 KiB)

I think it probably depends on what we're concerned about.

If it's content, the most conservative approach would be to only fetch the value for keys on a compiled allowlist. (pushing people to use carves if they want the content)

If it's size, truncation and warnings probably make sense.
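The two options above (an allowlist for content concerns, truncation for size concerns) could be combined along these lines. This is only a sketch of the idea under discussion, not something the PR implements; the struct `LimitedValue`, the function `limitStreamValue`, the allowlist contents and the byte limit are all hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical result type for one stream's value column.
struct LimitedValue {
  std::string value;
  bool withheld;  // Key is not on the allowlist, content not returned
  bool truncated; // Content was cut at maxBytes
};

// Hypothetical helper: returns content only for allowlisted stream names and
// never returns more than maxBytes of it.
LimitedValue limitStreamValue(const std::string& key,
                              const std::string& content,
                              const std::set<std::string>& allowlist,
                              std::size_t maxBytes) {
  if (allowlist.count(key) == 0) {
    return {"", true, false};
  }
  if (content.size() > maxBytes) {
    return {content.substr(0, maxBytes), false, true};
  }
  return {content, false, false};
}
```

A caller could log a warning when `truncated` is set, and users who need the withheld content could fall back to file carving, as suggested above.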

nachorpaez added 2 commits November 13, 2023 19:27

add windows ads table

4f3eec5

add missing newline

ff85549

nachorpaez requested review from a team as code owners November 13, 2023 20:01

nachorpaez added 2 commits November 13, 2023 22:04

linter fixes

33dce52

fixed tests syntax error

35061e2

zwass requested changes Nov 22, 2023

zwass reviewed Nov 22, 2023

Addressed PR feedback

7efb23e

improved tests

0486fb7

nachorpaez requested a review from zwass December 14, 2023 22:22

directionless reviewed Dec 15, 2023

directionless mentioned this pull request Dec 21, 2023

Table Request: zone_identifier (Windows) kolide/launcher#585

Open

Merge branch 'osquery:master' into master

f03eac9

michael-myers added virtual tables Windows labels Jul 2, 2024

zwass left a comment

zwass Nov 22, 2023

zwass Nov 22, 2023

nachorpaez Nov 25, 2023

zwass Nov 22, 2023

nachorpaez Nov 30, 2023

nachorpaez commented Nov 30, 2023

directionless Dec 15, 2023

nachorpaez Dec 15, 2023

directionless Dec 15, 2023

nachorpaez Dec 15, 2023

Smjert Dec 15, 2023

directionless Dec 19, 2023
