Add target-diff #664

Open
wants to merge 1 commit into
base: main

JSCU-CNI
Contributor

@JSCU-CNI JSCU-CNI commented Apr 4, 2024

This PR adds the command target-diff, which can be used to compare two or more targets against one another:

$ target-diff --help

target-diff

positional arguments:
  {shell,fs,query}      Mode for differentiating targets
    shell               Open an interactive shell to compare two or more targets.
    fs                  Yield records about differences between target filesystems.
    query               Differentiate plugin outputs between two or more targets.

options:
  -d, --deep            Compare file contents even if metadata suggests they have been left unchanged (default: False)
  -l LIMIT, --limit LIMIT
                        How many bytes to compare before assuming a file is left unchanged (0 for no limit) (default:
                        32768)

fs mode outputs records denoting filesystem changes from one target to the other:

$ target-diff --deep fs src.tar dst.tar

<differential/file/created hostname=None domain=None src_target='src.tar' dst_target='dst.tar' path='/changes/only_on_dst'>
<differential/file/deleted hostname=None domain=None src_target='src.tar' dst_target='dst.tar' path='/changes/only_on_src'>
<differential/file/modified hostname=None domain=None src_target='src.tar' dst_target='dst.tar' path='/changes/changed' diff=[b'--- \n', b'+++ \n', b'@@ -1 +1 @@\n', b'-SRC', b'+DST']>
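The diff field above has the same shape as the output of Python's difflib in bytes mode. The following is a minimal sketch of how a limited content comparison and such a diff could be produced. It is not the PR's implementation: files_differ and unified_diff_bytes are hypothetical helper names, and reading the limit as "compare at most this many bytes from the start of each file" is an assumption taken from the --limit help text.

```python
from __future__ import annotations

import difflib
import io
from typing import BinaryIO


def files_differ(src: BinaryIO, dst: BinaryIO, limit: int = 32768, chunk_size: int = 4096) -> bool:
    """Return True if the first `limit` bytes differ (limit=0 means no limit).

    Hypothetical helper; assumes buffered streams whose read(n) only returns
    fewer than n bytes at end of file.
    """
    remaining = limit if limit > 0 else None
    while remaining is None or remaining > 0:
        size = chunk_size if remaining is None else min(chunk_size, remaining)
        a = src.read(size)
        b = dst.read(size)
        if a != b:
            return True
        if not a:
            # Both streams reached end of file without a difference.
            return False
        if remaining is not None:
            remaining -= len(a)
    # Limit reached without finding a difference: assume unchanged.
    return False


def unified_diff_bytes(src_data: bytes, dst_data: bytes) -> list[bytes]:
    """Build a unified diff over byte strings with difflib.diff_bytes."""
    return list(
        difflib.diff_bytes(
            difflib.unified_diff,
            src_data.splitlines(keepends=True),
            dst_data.splitlines(keepends=True),
        )
    )


if __name__ == "__main__":
    print(files_differ(io.BytesIO(b"SRC"), io.BytesIO(b"DST")))
    print(unified_diff_bytes(b"SRC", b"DST"))
```

With a non-zero limit, a difference located after the first limit bytes goes unnoticed, which is the trade-off the option's help text describes.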

Using query mode, you can compare plugin outputs from one target to the other:

$ target-diff query -f users src.tar dst.tar

<differential/record/unchanged hostname=None domain=None src_target='src.tar' dst_target='dst.tar' record=<unix/user hostname='dst_target' domain=None name='root' passwd='x' uid=0 gid=0 gecos='root' home='/root' shell='/bin/bash' source='/etc/passwd'>>
<differential/record/unchanged hostname=None domain=None src_target='src.tar' dst_target='dst.tar' record=<unix/user hostname='dst_target' domain=None name='user' passwd='x' uid=1000 gid=1000 gecos='user' home='/home/user' shell='/bin/bash' source='/etc/passwd'>>
<differential/record/created hostname=None domain=None src_target='src.tar' dst_target='dst.tar' record=<unix/user hostname='dst_target' domain=None name='dst_user' passwd='x' uid=1001 gid=1001 gecos='dst_user' home='/home/dst_user' shell='/bin/bash' source='/etc/passwd'>>
<differential/record/deleted hostname=None domain=None src_target='src.tar' dst_target='dst.tar' record=<unix/user hostname='src_target' domain=None name='src_user' passwd='x' uid=1001 gid=1001 gecos='src_user' home='/home/src_user' shell='/bin/bash' source='/etc/passwd'>>
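The classification into unchanged, created and deleted records can be illustrated with plain dictionaries. This is a sketch only: diff_records is a hypothetical function, the records are simplified stand-ins for flow.record records, and the ignored field names are assumptions chosen to mirror the hostname difference visible in the output above.

```python
from typing import Dict, Iterable, Iterator, Tuple

Row = Dict[str, object]


def diff_records(
    src_records: Iterable[Row],
    dst_records: Iterable[Row],
    ignored: Tuple[str, ...] = ("hostname", "domain"),
) -> Iterator[Tuple[str, Row]]:
    """Yield (kind, record) pairs, comparing records while skipping ignored fields."""

    def key(record: Row) -> tuple:
        # Comparison key: every field except the ignored ones, in a stable order.
        return tuple(sorted((k, v) for k, v in record.items() if k not in ignored))

    src_by_key = {key(record): record for record in src_records}
    seen = set()

    for record in dst_records:
        k = key(record)
        if k in src_by_key:
            seen.add(k)
            yield "unchanged", record
        else:
            yield "created", record

    # Anything on the source side that never matched a destination record.
    for k, record in src_by_key.items():
        if k not in seen:
            yield "deleted", record


if __name__ == "__main__":
    src = [
        {"hostname": "src_target", "name": "root", "uid": 0},
        {"hostname": "src_target", "name": "src_user", "uid": 1001},
    ]
    dst = [
        {"hostname": "dst_target", "name": "root", "uid": 0},
        {"hostname": "dst_target", "name": "dst_user", "uid": 1001},
    ]
    for kind, record in diff_records(src, dst):
        print(kind, record["name"])
```

Without ignoring hostname, every record would be reported as both created and deleted, because the two targets report different hostnames.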

In shell mode, you can browse the target filesystems as in target-shell; directory listings show which files and directories have been changed, added or deleted. Using the plugin command, plugin outputs can be compared from within the shell context.

$ target-diff shell src.tar dst.tar

(dst_target/src_target)/diff />help

Target Diff
==========


Documented commands (type help <topic>):
=================================================================
cat  clear  diff   exit  help  ls    plugin  previous  set
cd   cyber  enter  find  list  next  prev    python  

(dst_target/src_target)/diff />cd changes
(dst_target/src_target)/diff /changes>ls
changed
only_on_dst (deleted)
only_on_src (created)
subdirectory_both
subdirectory_dst (deleted)
subdirectory_src (created)
unchanged

target-diff depends on fox-it/flow.record#107. To allow tests to run for this PR, we've temporarily bumped flow.record to 3.15.dev10 in pyproject.toml.

When three or more targets are provided, you can choose between treating every target as a 'delta' and comparing every target against one 'absolute' target. Treating targets as 'deltas' is useful if you have multiple snapshots of the same target from different points in time. Treating targets as 'absolutes' can be useful in situations where you have a 'golden image' that you want to compare different targets against.
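The two strategies amount to different ways of pairing targets. The sketch below assumes that 'delta' means comparing each target with the one that follows it, and that in 'absolute' mode the first target is the baseline. pair_targets and its absolute parameter are hypothetical names, not part of the PR.

```python
from typing import List, Sequence, Tuple


def pair_targets(targets: Sequence[str], absolute: bool = False) -> List[Tuple[str, str]]:
    """Return the (src, dst) pairs to compare for a list of two or more targets."""
    if len(targets) < 2:
        raise ValueError("at least two targets are required")

    if absolute:
        # Assumption: the first target is the 'golden image' baseline.
        baseline = targets[0]
        return [(baseline, target) for target in targets[1:]]

    # Delta: each target is compared with its successor.
    return list(zip(targets, targets[1:]))


if __name__ == "__main__":
    snapshots = ["day1.tar", "day2.tar", "day3.tar"]
    print(pair_targets(snapshots))
    print(pair_targets(snapshots, absolute=True))
```

With exactly two targets, both strategies produce the same single pair.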

To keep code duplication low between tools/diff.py and tools/shell.py, this PR adds a superclass ExtendedCmd to shell.py that contains most of the functionality that is shared between the two. Both TargetCmd and DifferentialCli inherit from this class.
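As an illustration of that layout, here is a minimal sketch assuming the shared base builds on Python's standard cmd.Cmd. The class names come from the description above; the commands and prompts are placeholders, not the actual contents of shell.py or diff.py.

```python
import cmd


class ExtendedCmd(cmd.Cmd):
    """Shared base: commands that both shells need live here (placeholder content)."""

    def do_exit(self, line: str) -> bool:
        """Exit the shell."""
        # Returning True tells cmd.Cmd.cmdloop() to stop.
        return True

    def emptyline(self) -> bool:
        # Override the default, which would repeat the last command.
        return False


class TargetCmd(ExtendedCmd):
    """Stand-in for the single-target shell."""

    prompt = "target> "


class DifferentialCli(ExtendedCmd):
    """Stand-in for the diff shell; adds diff-specific commands on top of the base."""

    prompt = "diff> "

    def do_diff(self, line: str) -> bool:
        """Placeholder for a diff-specific command."""
        self.stdout.write("diff is only available in the differential shell\n")
        return False
```

Both subclasses get exit from the base class, while do_diff exists only on DifferentialCli.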

Member

@yunzheng yunzheng left a comment

I did not review the whole PR; I just added some small flow.record quality-of-life suggestions following the merge of fox-it/flow.record#115.

Available in flow.record==3.15.

@@ -34,7 +34,7 @@ dependencies = [
     "dissect.regf>=3.3.dev,<4.0.dev",
     "dissect.util>=3.0.dev,<4.0.dev",
     "dissect.volume>=3.0.dev,<4.0.dev",
-    "flow.record~=3.14.0",
+    "flow.record~=3.15.dev10 ",
Member

Suggested change
-    "flow.record~=3.15.dev10 ",
+    "flow.record~=3.15.0",

Comment on lines +16 to +21
from flow.record import (
    IGNORE_FIELDS_FOR_COMPARISON,
    Record,
    RecordOutput,
    set_ignored_fields_for_comparison,
)
Member

We can now use the ignore_fields_for_comparison context manager.

Suggested change
-from flow.record import (
-    IGNORE_FIELDS_FOR_COMPARISON,
-    Record,
-    RecordOutput,
-    set_ignored_fields_for_comparison,
-)
+from flow.record import (
+    ignore_fields_for_comparison,
+    Record,
+    RecordOutput,
+)

Comment on lines +312 to +331
old_ignored_values = IGNORE_FIELDS_FOR_COMPARISON
set_ignored_fields_for_comparison(["_generated", "_source", "hostname", "domain"])

src_records = set(get_plugin_output_records(plugin_name, plugin_arg_parts, self.src_target))
src_records_seen = set()

for dst_record in get_plugin_output_records(plugin_name, plugin_arg_parts, self.dst_target):
    if dst_record in src_records:
        src_records_seen.add(dst_record)
        yield RecordUnchangedRecord(
            src_target=self.src_target.path, dst_target=self.dst_target.path, record=dst_record
        )
    else:
        yield RecordCreatedRecord(
            src_target=self.src_target.path, dst_target=self.dst_target.path, record=dst_record
        )
for record in src_records - src_records_seen:
    yield RecordDeletedRecord(src_target=self.src_target.path, dst_target=self.dst_target.path, record=record)

set_ignored_fields_for_comparison(old_ignored_values)
Member

This can now be wrapped in a with ignore_fields_for_comparison([..]): block.

2 participants