[master task] event driven file integrity monitoring #722

Closed
marpaia opened this Issue Feb 9, 2015 · 16 comments

marpaia commented Feb 9, 2015

Let's add a top-level config as such:

"threat_intel" : {
  "file_paths": {
    "downloads": [
      "/Users/%/Downloads/%%"
    ],
    "system_binaries": [
      "/bin/%",
      "/usr/bin/%",
      "/usr/local/bin/%"
    ]
  }
}

For context, see the example config.
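A minimal sketch of reading that section, assuming the fragment sits inside the top-level JSON object of the config. The helper name `load_file_paths` is hypothetical and not part of osquery.

```python
import json

CONFIG_TEXT = """
{
  "threat_intel": {
    "file_paths": {
      "downloads": ["/Users/%/Downloads/%%"],
      "system_binaries": ["/bin/%", "/usr/bin/%", "/usr/local/bin/%"]
    }
  }
}
"""


def load_file_paths(config_text, section="threat_intel"):
    """Return {category: [pattern, ...]} from the given top-level section.

    Hypothetical helper: a missing section or key yields an empty dict.
    """
    config = json.loads(config_text)
    file_paths = config.get(section, {}).get("file_paths", {})
    # Copy the lists so callers cannot mutate the parsed config by accident.
    return {category: list(patterns) for category, patterns in file_paths.items()}


categories = load_file_paths(CONFIG_TEXT)
for category, patterns in categories.items():
    print(category, patterns)
```

Each key under `file_paths` is treated as a category name, and each value as the list of patterns to expand for it.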

On process startup, we should expand all of those paths. Let's say the expanded paths looked something like:

/Users/marpaia/Downloads/foobar.exe
/bin/bash
/usr/bin/env
/usr/local/bin/python

We should hash all of those paths and store the results in RocksDB. If hash results already exist for those paths, the stored values should be updated to reflect the new, correct hash value.

With regard to RocksDB, this will probably cause us to create one or two new column families.

The first column family (threats_map) would look something like:

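+-------------------------------------+-----------------+
| key                                 | value           |
+-------------------------------------+-----------------+
| /Users/marpaia/Downloads/foobar.exe | downloads       |
| /bin/bash                           | system_binaries |
| /usr/bin/env                        | system_binaries |
| /usr/local/bin/python               | system_binaries |
+-------------------------------------+-----------------+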

The second column family (threats_hashes) would look something like:

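+-------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| key                                 | value                                                                                                                                                                         |
+-------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| /Users/marpaia/Downloads/foobar.exe | {"sha256": "72b7650b237c3e124aa6b6e625a4f665ca137b411c0b257d696080cdf9254a01", "sha1": "1b29766d8882c8279b2c20e32f933578dfdf41b4", "md5": "f81cce1751382506604e244039bf4724"} |
| /bin/bash                           | {"sha256": "72b7650b237c3e124aa6b6e625a4f665ca137b411c0b257d696080cdf9254a01", "sha1": "1b29766d8882c8279b2c20e32f933578dfdf41b4", "md5": "f81cce1751382506604e244039bf4724"} |
| /usr/bin/env                        | {"sha256": "72b7650b237c3e124aa6b6e625a4f665ca137b411c0b257d696080cdf9254a01", "sha1": "1b29766d8882c8279b2c20e32f933578dfdf41b4", "md5": "f81cce1751382506604e244039bf4724"} |
| /usr/local/bin/python               | {"sha256": "72b7650b237c3e124aa6b6e625a4f665ca137b411c0b257d696080cdf9254a01", "sha1": "1b29766d8882c8279b2c20e32f933578dfdf41b4", "md5": "f81cce1751382506604e244039bf4724"} |
+-------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+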

So, again, on process startup, we'll have to expand all paths and make the appropriate entry in threats_map. There should be a background thread that constantly (but performantly) keeps threats_map up to date.

Once threats_map is expanded and up to date, the process should make an initial run through threats_hashes to make sure all of the data is correct. We'll obviously incur a performance penalty here.

Once threats_hashes is up to date, event subscribers should be attached to all of the expanded paths (or their parent directories). If the files change, the file should be re-hashed and threats_hashes should be updated.
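The flow described above could be sketched roughly as follows. This is only an illustration in Python: plain dicts stand in for the two RocksDB column families, the paths are assumed to be expanded already, and the function names (`hash_file`, `build_threats_map`, `refresh_hashes`, `on_file_changed`) are hypothetical.

```python
import hashlib
import json
import os
import tempfile


def hash_file(path, chunk_size=65536):
    """Return a JSON string holding the sha256, sha1 and md5 of a file."""
    digests = {name: hashlib.new(name) for name in ("sha256", "sha1", "md5")}
    with open(path, "rb") as handle:
        # Read in chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return json.dumps({name: d.hexdigest() for name, d in digests.items()})


def build_threats_map(expanded):
    """Map every expanded path to its category (stand-in for threats_map)."""
    return {path: category for category, paths in expanded.items() for path in paths}


def refresh_hashes(threats_map, threats_hashes):
    """Insert or overwrite the hash record of every path in threats_map."""
    for path in threats_map:
        threats_hashes[path] = hash_file(path)


def on_file_changed(path, threats_map, threats_hashes):
    """Event callback: re-hash a monitored file, ignore anything else."""
    if path in threats_map:
        threats_hashes[path] = hash_file(path)


# Usage with a throwaway file standing in for a monitored binary.
with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "foobar.exe")
    with open(target, "wb") as handle:
        handle.write(b"original")

    threats_map = build_threats_map({"downloads": [target]})
    threats_hashes = {}
    refresh_hashes(threats_map, threats_hashes)
    before = threats_hashes[target]

    with open(target, "wb") as handle:
        handle.write(b"tampered")
    on_file_changed(target, threats_map, threats_hashes)
    after = threats_hashes[target]
    print(before != after)
```

The initial `refresh_hashes` pass is where the startup performance penalty would land; after that, only files that actually fire an event are re-hashed.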

marpaia Feb 9, 2015

Contributor

@jedi22, when you get a chance, can you comment on here with your initial run at a plan of action?

obelisk Feb 10, 2015

Contributor

There are several parts of getting this done:

  1. Take file regex strings from the configuration data and organize them into categories.
  2. Take the regexes and resolve them into all matching files.
  3. Add a listener to each file to track changes to the file and add them to the table.
  4. Create a table for each category so generated data can be queried.

So far, every table in osquery is defined by its own .cpp file. With these tables that may no longer be the case. The tables and their names will be dynamically generated at program start based on the contents of the configuration file.

An example of such configuration data is shown above, and the plan is for the schema to remain consistent with existing event based tables:

CREATE VIRTUAL TABLE passwd_changes USING passwd_changes(target_path TEXT, time TEXT, action TEXT, transaction_id BIGINT);

The above example would translate to the following schemas:

CREATE VIRTUAL TABLE file_paths_downloads_changes USING file_paths_downloads_changes(target_path TEXT, time TEXT, action TEXT, transaction_id BIGINT);
CREATE VIRTUAL TABLE file_paths_system_binaries_changes USING file_paths_system_binaries_changes(target_path TEXT, time TEXT, action TEXT, transaction_id BIGINT);

Presently, the extra file_paths data is being parsed out and placed into a map in the OsqueryConfig structure.

obelisk Feb 18, 2015

Contributor

I just made the first commit to a pull request that will eventually implement this feature, #770.

On both OS X and Linux, you can now add a top-level dictionary to your config JSON in this form:

"additional_monitoring" : {
    "file_paths": {
      "system_binaries": [
        "/bin/bash",
        "/bin/zsh"
      ]
    }
  }

Wildcards are also supported, and every file that a wildcard resolves to will have a file listener attached to it. This will generate table entries whenever the file changes.

Example output of this table is:

+--------------+-----------------+------------+---------+----------------+
| target_path  | category        | time       | action  | transaction_id |
+--------------+-----------------+------------+---------+----------------+
| /tmp/bash    | system_binaries | 1424224965 | UPDATED | 0              |
| /tmp/bash    | system_binaries | 1424224972 | UPDATED | 0              |
| /tmp/bash    | system_binaries | 1424224978 | UPDATED | 0              |
+--------------+-----------------+------------+---------+----------------+

As you can see, the key under file_paths (here system_binaries) becomes the value of category, so you can easily distinguish your lists this way. This has several advantages over the previously proposed approach:

  1. It does not require many tables to be searched
  2. The table can be defined in a single file and not by code that has no file presence
  3. It makes it quite easy and efficient to join multiple categories together with other tables
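To illustrate the single-table layout, here is a rough sketch using Python's sqlite3 module. An ordinary in-memory table stands in for the osquery virtual table, and the downloads row is made up purely to show a second category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_changes ("
    "target_path TEXT, category TEXT, time TEXT, action TEXT, transaction_id BIGINT)"
)
rows = [
    ("/tmp/bash", "system_binaries", "1424224965", "UPDATED", 0),
    ("/tmp/bash", "system_binaries", "1424224972", "UPDATED", 0),
    # Hypothetical second category, added only to show the filter.
    ("/tmp/report.pdf", "downloads", "1424224980", "UPDATED", 0),
]
conn.executemany("INSERT INTO file_changes VALUES (?, ?, ?, ?, ?)", rows)

# One table serves every category; the category column narrows the result.
system_changes = conn.execute(
    "SELECT target_path, time FROM file_changes WHERE category = ? ORDER BY time",
    ("system_binaries",),
).fetchall()

# Counting per category needs no UNION across per-category tables.
per_category = dict(
    conn.execute("SELECT category, COUNT(*) FROM file_changes GROUP BY category")
)
print(system_changes)
print(per_category)
```

With one table per category, the second query would need a UNION over tables whose names are only known after reading the config.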

Outstanding Issues

Currently these tables allocate memory using new, which is never freed. This will cause issues for our leak tests, even though in practice never freeing it should be fine: the pointers are never lost, allocations happen only once, and the tables live for the entire run of the program.

This issue was resolved after we started returning a static reference in config.

We would like the ability to optionally define a callback with user_data, instead of always having to define one whether or not it is used.

Also, it seems that the timestamps on events are temperamental. This should be investigated before a final merge.

obelisk Feb 19, 2015

Contributor

It is important to note:

Monitoring /bin/% will not alert you of new files added to /bin/

It will alert on change or removal of already existing files.

marpaia Feb 19, 2015

Contributor

@jedi22 we definitely need to fix that

obelisk Feb 25, 2015

Contributor

This is fixed on OS X because fsevents is really robust and generally awesome.

The inotify API is going to take more work to get to a stage where we are happy with the functionality of it.

marpaia Feb 25, 2015

Contributor

👍 nice!

obelisk Feb 25, 2015

Contributor

Progress on inotify is coming along. There are probably a few bugs and edge cases to work out, but we should now have recursive directory monitoring with updates, meaning that if a folder is added, it is detected and added to the listening group.

Our overflow handling exists but is currently untested. The thread should exit if an overflow is encountered less than 10 seconds after the previous one.
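The overflow rule can be sketched like this. It is an illustration in Python rather than the actual inotify publisher code; the class name `OverflowGuard` is hypothetical, and only the 10-second threshold comes from the description above.

```python
import time


class OverflowGuard:
    """Decide whether the event thread should stop after a queue overflow."""

    def __init__(self, min_interval=10.0):
        self.min_interval = min_interval  # seconds, taken from the comment above
        self.last_overflow = None  # timestamp of the previous overflow, if any

    def should_exit(self, now=None):
        """Record an overflow at `now` and report whether it came too soon."""
        if now is None:
            now = time.monotonic()
        too_soon = (
            self.last_overflow is not None
            and now - self.last_overflow < self.min_interval
        )
        self.last_overflow = now
        return too_soon


guard = OverflowGuard()
first = guard.should_exit(now=100.0)   # no previous overflow
second = guard.should_exit(now=125.0)  # 25 s after the previous one
third = guard.should_exit(now=130.0)   # 5 s after the previous one
print(first, second, third)
```

Passing `now` explicitly keeps the rule testable without sleeping; a real event loop would call `should_exit()` with no argument each time the kernel reports an overflow.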

Example Served Configuration

{
  "scheduledQueries": [
    {
      "name": "time",
      "query": "select * from time;",
      "interval": 1
    }
  ],
  "additional_monitoring" : {
    "file_paths": {
      "system_files": [
        "/etc"
      ],
      "core_binaries": [
        "/bin/"
      ],
      "other_binaries": [
        "/sbin/",
        "/usr/sbin"
      ]
    }
  }
}

Output:

osquery> select * from file_changes;

+----------------------------+----------------+------------+---------------------+----------------+----------------------------------+------------------------------------------+------------------------------------------------------------------+
| target_path                | category       | time       | action              | transaction_id | md5                              | sha1                                     | sha256                                                           |
+----------------------------+----------------+------------+---------------------+----------------+----------------------------------+------------------------------------------+------------------------------------------------------------------+
| /sbin/dhclient             | other_binaries | 1424906630 | ATTRIBUTES_MODIFIED | 0              | a66ed71ff10aeca7c7da78751f49d2ac | c78ad296ac89a28b6f6940cc4e8323eefa0c6753 | 4508f701c57be1de511834498c99c4a7e8df5c3ba02a68c9e5a35150c846e0b5 |
| /bin/bash                  | core_binaries  | 1424906655 | ATTRIBUTES_MODIFIED | 0              | 164ebd6889588da166a52ca0d57b9004 | 8e3aa19fdc42e87659746f6dc8ea3af74ab30362 | 8c4d49445d0050884e0703571f187338b10c7836b08ed822cc5fc6cf15ac76b0 |
| /etc/passwd                | system_files   | 1424906686 | ATTRIBUTES_MODIFIED | 0              | 75c3fe2762cb49e8752b184782b29bf7 | fc40119ce13444f4440b7178e41014405d9d4f8c | 7aea7dc57d9534710ef9b90d31fb9789fbed48a9b1f133c2a1c27aef3ac9526d |
| /usr/sbin/adduser          | other_binaries | 1424906738 | ATTRIBUTES_MODIFIED | 0              | 8d2d0e52087d5410833fda48bc4a92d3 | 3676086256f82ece513d9942b8442e5ccfe8fa97 | 18e157120375a40bea7e412d2ce97fa61bb3fb131d63545bfd201c45da7103f9 |
| /usr/sbin/grub-install     | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 4dd119dd265cc2d74f27f1863d4cdfae | 56a6c844e317e12128cb7a5105b51db566df375c | f0297edf2432fd85d487639de50039eb7520976938d4661c35eeb1609eb8215c |
| /usr/sbin/grub-macbless    | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 641cd7ea45e9afaee445ca2f4c0c4152 | c7c81e8c64c35d65fb7b40ac332c60526499a4a0 | 27a339352999089efb8f6fb296a1cca9f000a72146966f7767b8f1513b1896d4 |
| /usr/sbin/grub-mkconfig    | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 1312e159da6a017d4e0b067598ba4365 | 2d35016860c8ee4cb7fc1f104fc35ad9ace1b039 | 02a5063699dfd052216f22df4752860192a5b116eda61516e036791b4224c862 |
| /usr/sbin/grub-mkdevicemap | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 809949137438d12cb9f812018c136af1 | 60f449ed7530d1ba0bd208fe9372c517c38aee5e | 293e5876916454da53693d2657c87e082c3df062e85c848e2490906f1ae3fb86 |
| /usr/sbin/grub-probe       | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 0ddfb4197eb788ac8cf75091e7290032 | d2fc2255da3ed398806f9613b68fa255c50509e0 | 25c93b73d9dfd0cad18bbb7196a9e56aec8be33b9c8b5d1615a46c7f26e13a21 |
| /usr/sbin/grub-reboot      | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 6082170501f578d40a2853a0d56f5a25 | e50f6adbfdcee2ac7a1d6539eaecbc32c912b4c7 | 30c9513d0690a255a4977587d9ead3854638d5781c82f3f53c78783f38c71853 |
| /usr/sbin/grub-set-default | other_binaries | 1424906770 | ATTRIBUTES_MODIFIED | 0              | 0602cd96d977c0c14558d1a9f2ad8500 | fc3ded3079cf1c89c2e5f85e70d39c8264ed7639 | 32bd4f931a2bde91bf467eb63f6bded21f2f66cc935d7a5b435fcbbf36b6c632 |
+----------------------------+----------------+------------+---------------------+----------------+----------------------------------+------------------------------------------+------------------------------------------------------------------+

Also, much thanks to @theopolis for his help and his existing inotify code :D

obelisk Feb 28, 2015

Contributor

The inotify and fsevents interfaces should now be the same (save for some bugs I haven't found yet). You should be able to use the entire wildcard spec implemented in osquery to monitor files, folders, or both.

The osquery wildcard spec

osquery has a simplified wildcarding system for matching operating system directories and files.

% - Match all
%% - Match all recursively
%XX - Match all ending in XX
XX% - Match all starting with XX

Examples

/bin/% - Match every file in /bin
/bin/%% - Match all files in /bin and all files in any subdirectory (n levels deep, up to a limit)
/bin/%sh - Match all files in /bin ending with sh: /bin/bash /bin/sh /bin/zsh
/bin/ba% - Match all files in /bin starting with ba: /bin/bash
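The four forms above can be approximated in Python by translating them to glob patterns. This is a sketch of the semantics only, not how resolveFilePattern is implemented; `to_glob` and `resolve_files` are hypothetical names. Note two differences that come from Python's glob rather than from osquery: names starting with a dot are not matched by a wildcard, and there is no depth limit on %%.

```python
import glob
import os
import tempfile


def to_glob(pattern):
    """Translate an osquery-style pattern into a glob pattern."""
    parts = []
    for component in pattern.split("/"):
        if component == "%%":
            parts.append("**")  # recursive match, needs recursive=True
        else:
            # Escape glob metacharacters in the literal text, then map % to *.
            pieces = component.split("%")
            parts.append("*".join(glob.escape(piece) for piece in pieces))
    return "/".join(parts)


def resolve_files(pattern):
    """Return the sorted regular files matched by the pattern."""
    matches = glob.glob(to_glob(pattern), recursive=True)
    return sorted(path for path in matches if os.path.isfile(path))


def relative_names(paths, root):
    """Strip the temporary root so the results are easy to read."""
    return [os.path.relpath(path, root) for path in paths]


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "sub"))
    for name in ("bash", "sh", "zsh", "cat", os.path.join("sub", "deep")):
        open(os.path.join(root, name), "w").close()

    every = relative_names(resolve_files(root + "/%"), root)
    recursive = relative_names(resolve_files(root + "/%%"), root)
    ending_sh = relative_names(resolve_files(root + "/%sh"), root)
    starting_ba = relative_names(resolve_files(root + "/ba%"), root)
    print(every, recursive, ending_sh, starting_ba)
```

The translation happens to accept combined forms as well, but that says nothing about how osquery treats them.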

Note

%XX% and XX%XX are undefined and will not resolve wildcards in the expected way. Support may be added in the future, but there are no plans to do so.

Filesystem API Changes

In this update I have tried to keep resolveFilePattern API-compatible. It should still be, but I would encourage anyone with extensions that use this functionality to test that they still work as intended. This will also help surface any bugs in the new implementation.

As far as features go, you can now specify a bitmask for returning:
* Only files
* Only folders
* Both
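As a rough illustration of the bitmask idea in Python: the flag names `FILES`, `FOLDERS` and `ALL` below are placeholders, since the actual constant names are not given in this thread, and `filter_matches` is a hypothetical helper.

```python
import enum
import os
import tempfile


class ReturnSetting(enum.IntFlag):
    """Hypothetical flag names; the real osquery constants may differ."""

    FILES = 1
    FOLDERS = 2
    ALL = FILES | FOLDERS


def filter_matches(paths, setting):
    """Keep only the kinds of path that the bitmask asks for."""
    kept = []
    for path in paths:
        if os.path.isdir(path):
            if setting & ReturnSetting.FOLDERS:
                kept.append(path)
        elif setting & ReturnSetting.FILES:
            kept.append(path)
    return kept


with tempfile.TemporaryDirectory() as root:
    folder = os.path.join(root, "sub")
    regular = os.path.join(root, "bash")
    os.mkdir(folder)
    open(regular, "w").close()

    candidates = [folder, regular]
    only_files = filter_matches(candidates, ReturnSetting.FILES)
    only_folders = filter_matches(candidates, ReturnSetting.FOLDERS)
    both = filter_matches(candidates, ReturnSetting.ALL)
    print(len(only_files), len(only_folders), len(both))
```

Using a bitmask rather than two booleans leaves room for further option bits to be combined into the same argument.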

You may notice there is another flag called REC_EVENT_OPT. This flag is meant to optimize wildcard resolution for the event tables. It is not supported outside that context and may be removed in a future update.

Other notes

Events, at least with fsevents, do not fire on links. This may be fixed in the future but it is a known issue now.

Prior to this, fsevents was not returning results for touches. This has been fixed by adding the new event type kFSEventStreamEventFlagItemInodeMetaMod. If you were waiting for this functionality, it is now there.

Listening only applies from the last wildcard or path component onward. For example, if you add monitoring to /Users/%/Downloads/%%, you will get updates when any user adds, removes, or updates a file in their Downloads folder. You will NOT, however, get updates if a new user is added.
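One way to read this limitation, sketched in Python: everything before the last component is resolved once at startup, and listeners are attached only to the directories that exist at that moment. This is an interpretation of the behaviour described, not the actual publisher code; `listen_roots` is a hypothetical name, and the sketch assumes the prefix contains only single % wildcards.

```python
import glob
import os
import tempfile


def listen_roots(pattern):
    """Resolve everything before the last component to existing directories."""
    prefix, _, _leaf = pattern.rpartition("/")
    # Single-level wildcards in the prefix are expanded once, right now.
    candidates = glob.glob(prefix.replace("%", "*"))
    return sorted(path for path in candidates if os.path.isdir(path))


with tempfile.TemporaryDirectory() as users:
    os.makedirs(os.path.join(users, "alice", "Downloads"))
    pattern = users + "/%/Downloads/%%"
    roots_at_startup = listen_roots(pattern)

    # A user created after startup is not covered by the existing listeners.
    os.makedirs(os.path.join(users, "bob", "Downloads"))
    bob_is_watched = os.path.join(users, "bob", "Downloads") in roots_at_startup
    print(len(roots_at_startup), bob_is_watched)
```

Picking up the new user would require either re-resolving the prefix periodically or also listening on the parent of the first wildcard.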

Conclusion

This table is nearing its v1 release and should have all the features listed above. If you find any bugs, please open a task and link it here, and I will try to resolve it.

The merge goal for this code is Friday, March 6th, 2015.

theopolis Feb 28, 2015

Contributor

Epic! Can we summarize this in a wiki format next week?

obelisk Mar 2, 2015

Contributor

For sure

wxsBSD Mar 3, 2015

Contributor

I decided to implement an idea of mine in osquery and it turns out I need many of these features, so thank you for writing them! I hope you don't mind me hijacking this issue but I think it is related.

I am not sure things are working right or if I'm just misunderstanding everything. I decided to take a look at how file_changes works and here's what I've found. I'm doing this on an OS X 10.10.2 system.

This is the config I'm using:

{
  "scheduledQueries": [
  ],
  "additional_monitoring" : {
    "file_paths": {
      "system_binaries": [
        "/Users/wxs/tmp/foo/%"
      ]
    }
  }
}

If I start with an empty /Users/wxs/tmp/foo/, while osqueryi is running I can do echo "foo" > /Users/wxs/tmp/foo/bar and there will not be an entry in the file_changes table. If, however, ~/tmp/foo/bar does exist at the time I start osqueryi and I modify or delete that file the change is properly recorded in file_changes. I can even delete the file and recreate it and the create event is recorded.

I suspect this has to do with the fsevents code missing the event for "file created in this directory" but I'm not able to track it down right now before I have to get some sleep. 😜

Contributor

wxsBSD commented Mar 3, 2015

I decided to implement an idea of mine in osquery and it turns out I need many of these features, so thank you for writing them! I hope you don't mind me hijacking this issue but I think it is related.

I am not sure things are working right or if I'm just misunderstanding everything. I decided to take a look at how file_changes works and here's what I've found. I'm doing this on an OS X 10.10.2 system.

This is the config I'm using:

{
  "scheduledQueries": [
  ],
  "additional_monitoring" : {
    "file_paths": {
      "system_binaries": [
        "/Users/wxs/tmp/foo/%"
      ]
    }
  }
}
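For readers who want to see which patterns a config like this declares, here is a small Python sketch that reads the same structure. The helper name `monitored_patterns` is hypothetical, and this is not how osquery itself loads its config; it only walks the `additional_monitoring` / `file_paths` keys shown above.

```python
import json

# The config from this comment, reproduced as a string.
CONFIG_TEXT = """
{
  "scheduledQueries": [],
  "additional_monitoring": {
    "file_paths": {
      "system_binaries": ["/Users/wxs/tmp/foo/%"]
    }
  }
}
"""

def monitored_patterns(config_text):
    """Return (category, pattern) pairs declared under additional_monitoring."""
    config = json.loads(config_text)
    file_paths = config.get("additional_monitoring", {}).get("file_paths", {})
    return [
        (category, pattern)
        for category, patterns in file_paths.items()
        for pattern in patterns
    ]

if __name__ == "__main__":
    for category, pattern in monitored_patterns(CONFIG_TEXT):
        print(category, pattern)
```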

If I start with an empty /Users/wxs/tmp/foo/, while osqueryi is running I can do echo "foo" > /Users/wxs/tmp/foo/bar and there will not be an entry in the file_changes table. If, however, ~/tmp/foo/bar does exist at the time I start osqueryi and I modify or delete that file the change is properly recorded in file_changes. I can even delete the file and recreate it and the create event is recorded.

I suspect this has to do with the fsevents code missing the event for "file created in this directory" but I'm not able to track it down right now before I have to get some sleep. 😜

Contributor

theopolis commented Mar 3, 2015

@jedi22 think it's possible to add a unittest for the workflow @wxsBSD described?

Contributor

obelisk commented Mar 3, 2015

Absolutely, I will attempt to fix this today

Contributor

wxsBSD commented Mar 3, 2015

I was confused last night. I thought this was all dealt with in #770, which was already merged. Turns out that #784 also is relevant, which was merged while I slept apparently. I believe this is working exactly as advertised in the latest master. Again, thank you for this!

Contributor

obelisk commented Mar 3, 2015

blog_120906_08
