Mark time column as "additional" for event tables #8020
Seems kinda safe, the one thing that gives me pause is that I don't remember how `row_id` is calculated.
sqlite needs each row to have a unique `row_id`. If there are no indexes it generates one by concatenating everything. If there are indexes, it concatenates those columns.
But I can't remember how additional works. It would be a problem if the time was the rowid.
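The concern above can be illustrated with a minimal sketch. It only mirrors the behavior as described in this comment (concatenate the index columns if there are any, otherwise concatenate every column); the function `row_key`, the separator, and the sample rows are hypothetical and are not taken from osquery's source.

```python
from typing import Dict, List


def row_key(row: Dict[str, str], index_columns: List[str]) -> str:
    """Build a uniqueness key by concatenating column values.

    Hypothetical illustration only, not osquery's implementation.
    """
    # Use the index columns when present, otherwise fall back to all columns.
    columns = index_columns if index_columns else list(row.keys())
    return "|".join(str(row[c]) for c in columns)


# Two distinct events that happen to share the same timestamp.
rows = [
    {"time": "1700000000", "path": "/tmp/a", "action": "CREATED"},
    {"time": "1700000000", "path": "/tmp/b", "action": "CREATED"},
]

# No index columns: every column contributes, so the keys differ.
keys_all = [row_key(r, []) for r in rows]

# If `time` alone acted as the key, the two rows would collide.
keys_time = [row_key(r, ["time"]) for r in rows]

print(keys_all)
print(keys_time)
```

Under these assumptions, the two events stay distinguishable only while `time` is not the sole column feeding the key, which is why it matters how "additional" columns are treated.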
Good question, it looks like it does concatenate all the columns together if there're only
osquery/osquery/core/tables.cpp, lines 280 to 287 in 9f09b7c
Thank you for digging that up in tables.cpp. Interesting how `pk` column changes, but maybe that doesn't matter.
It's in the part that @bgirardeau-figma has shown, since with only additional and no other primary key column which is actually unique, all the columns are part of the primary key. Plus
Background
Osquery optimizes event tables in scheduled queries to only return data since the last time the scheduled query was run (unless `--events_optimize=false` is set). Sometimes this behavior is not desirable if the query is for a specifically set time range, so there is code to turn off the optimization if a constraint on the `time` column is present. However this does not work today because the constraint is not available to the table implementation unless it is marked "additional" in the schema.
Change
This change adds `additional=True` to `time` columns in event tables, disabling event optimization when a time constraint is present in a scheduled query.
If the old behavior is desired (having the time constraint filtered by SQL while optimization is applied), this is still possible by using the `+` operator to disable indexing in a query like `+time > 0`.
In theory this change has been the expected behavior, but it's possible others are relying on the old behavior. This should help fix #7352 (at least partially).
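The `+` workaround relies on SQLite's rule that a unary `+` in front of a column keeps that term from being used for indexing. Here is a rough sketch of the effect using Python's `sqlite3` module. It uses an ordinary table with a regular index as a stand-in for an osquery virtual table, so it shows the planner behavior only by analogy; the table `events` and the index `idx_events_time` are made up for the illustration.

```python
import sqlite3

# In-memory database with an ordinary indexed table (a stand-in, not osquery).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (time INTEGER, path TEXT)")
conn.execute("CREATE INDEX idx_events_time ON events (time)")


def plan(sql: str) -> str:
    """Return the human-readable query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The plan description is the last column of each row.
    return "; ".join(row[-1] for row in rows)


# Plain constraint: the planner is free to use the index on `time`.
indexed_plan = plan("SELECT * FROM events WHERE time > 0")

# Unary plus: the term is still evaluated, but not via the index.
unindexed_plan = plan("SELECT * FROM events WHERE +time > 0")

print(indexed_plan)
print(unindexed_plan)
```

For a virtual table the analogous effect is that the `+time` term is not offered to the table as a usable constraint, so SQLite filters the rows itself after they are returned.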
Test Plan
I did a couple basic tests with the query planner and disk events tables to make sure the time constraint was passed to xFilter when expected. It would be helpful if anyone else has more time or ideas on how to best test this.