[merged] Atomic/diff.py: Use go-mtree for file comparisons #777
baude wants to merge 1 commit into projectatomic:master
Atomic/diff.py
Outdated
| diffp.add_argument("-k", "--keywords", nargs='+', | ||
| choices=['link', 'nlink', 'mode', 'type', 'time', 'uid', 'gid', 'size', 'sha256digest'], | ||
| default=['link', 'nlink', 'mode', 'type', 'time', 'uid', 'gid', 'size'], | ||
| help=_("iExclusive keywords to be used for file level comparision")) |
|
This works well for me.
| diffp.add_argument("-k", "--keywords", nargs='+', | ||
| choices=['link', 'nlink', 'mode', 'type', 'time', 'uid', 'gid', 'size', 'sha256digest'], | ||
| default=['link', 'nlink', 'mode', 'type', 'time', 'uid', 'gid', 'size'], | ||
| help=_("Exclusive keywords to be used for file level comparision")) |
@rhatdan I can do 'all', but if we ever add two digest algorithms, they would both get used. Adding it for now as it makes sense.
Atomic/diff.py
Outdated
| self.chroot_left = chroot_left | ||
| self.chroot_right = chroot_right | ||
| self.delta(self.compare) | ||
| self.chroot_left = [] #chroot_left |
This comment is useless as written; either describe what it means or remove it.
| self.parse_mtree_json() | ||
|
|
||
| def parse_mtree_json(self): | ||
| def extra(_result): #pylint: disable=unused-variable |
These variables are used, aren't they? Can the pylint directive be removed?
Because of the use of eval below, pylint doesn't recognize that they are used.
Leaving it until there is a better solution.
OK, I thought they might have been left over.
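One possible way around the eval problem discussed above, sketched here rather than taken from this PR: look the handlers up in an explicit dictionary. Because the nested functions are then referenced by name, pylint can see that they are used and the disable comments are no longer needed. The record layout (a "type" and a "path" key, plus a hypothetical "keys" list for modified files) and the name parse_mtree_results are assumptions for illustration, not the actual gomtree JSON schema.

```python
def parse_mtree_results(results):
    """Sort result records into left-only, right-only and changed buckets.

    Each record is assumed to look like
    {"type": "extra" | "missing" | "modified", "path": "...", "keys": [...]}.
    """
    left, right, common_diff = [], [], []

    def extra(result):
        # File exists only on the right-hand side.
        right.append(result["path"])

    def missing(result):
        # File exists only on the left-hand side.
        left.append(result["path"])

    def modified(result):
        # File exists on both sides; keep the (assumed) list of reasons.
        common_diff.append((result["path"], result.get("keys", [])))

    # Explicit dispatch table instead of eval(): every handler is
    # referenced here, so static checkers treat them as used.
    handlers = {"extra": extra, "missing": missing, "modified": modified}
    for result in results:
        handlers[result["type"]](result)
    return left, right, common_diff
```

An unknown "type" raises KeyError here, which may or may not be the behaviour wanted in the real code.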
| def extra(_result): #pylint: disable=unused-variable | ||
| self.right.append(_result['path']) | ||
|
|
||
| def missing(_result): #pylint: disable=unused-variable |
Because of the use of eval below, pylint doesn't recognize that they are used.
Leaving it until there is a better solution.
| def missing(_result): #pylint: disable=unused-variable | ||
| self.left.append(_result['path']) | ||
|
|
||
| def modified(_result): #pylint: disable=unused-variable |
Because of the use of eval below, pylint doesn't recognize that they are used.
Leaving it until there is a better solution.
| _print_diff(self.right) | ||
| if len(self.common_diff): | ||
| util.write_out("\nCommon files that are different:") | ||
| util.write_out("\nCommon files that are different: (reason)") |
Not sure what you are doing here?
I supply a reason in the output to show why things are different. Example:
Files only in centos:latest:
usr/bin/view
usr/bin/ex
usr/bin/rvi
usr/bin/vi
etc/virc
usr/bin/rview
Files only in 19f8d8b7147b:
run/secrets
Common files that are different: (reason)
var/lib/rpm/Basenames (time)
var/lib/rpm/Packages (time)
How does that happen? This looks like it will just output "(reason)".
That line of code only prints the header (title). The rest comes from the _print_diff method.
Do you handle this if there are multiple reasons?
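A minimal sketch of how several reasons could be shown on one line, assuming each differing file carries a list of keyword names. format_diff_line is a hypothetical helper for illustration; it is not the _print_diff method from this PR.

```python
def format_diff_line(path, reasons):
    """Return 'path (reason1, reason2)' for one differing file."""
    # Sort so the output is stable regardless of the order reported.
    return "{} ({})".format(path, ", ".join(sorted(reasons)))
```

With a single reason this yields the same shape as the sample output above, e.g. "var/lib/rpm/Packages (time)".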
| --help | ||
| --json | ||
| --display | ||
| --keywords, -k |
It would be nice to automatically expand the keywords.
I couldn't figure that out ... do you know how to?
OK, we can merge and I will look into adding this.
| **--json** | ||
| Output in the form of JSON. | ||
|
|
||
| **-k** **--keywords** |
Give a list of the available keywords.
|
|
||
| Compare files by 'sha256digests' and 'time' between images 'foo1' and 'foo2' | ||
|
|
||
| atomic diff foo1 foo2 --keywords sha256digest time |
I think the handling of multiple keywords does not work well. What happens if I do
atomic diff --keywords sha256digest time foo1 foo2
Does it work?
Or what if I had a container named time?
atomic diff --keywords sha256digest time foo1
It only works if the keywords are the last arguments. Verified this in argparse as well. Example:
[bbaude@bbaude atomic (diff_mtree)]$ sudo ./atomic diff -k sha256digest centos:latest 19f8d8b7147b
atomic diff: argument -k/--keywords: invalid choice: 'centos:latest' (choose from 'all', 'link', 'nlink', 'mode', 'type', 'time', 'uid', 'gid', 'size', 'sha256digest')
Try 'atomic diff --help' for more information.
^^ Fails. Keywords have to be last.
[bbaude@bbaude atomic (diff_mtree)]$ sudo ./atomic diff centos:latest 19f8d8b7147b -k sha256digest
Ideas and/or suggestions welcome.
Let's make it a comma-separated list? Or do it the way we did for TOP?
-o [{time,stime,ppid,uid,gid,user,group}], --optional [{time,stime,ppid,uid,gid,user,group}]
FWIW using "--" between the keyword list and the image names also works, but that's not very easy to use/discover.
Yes, but not easy to understand for most.
Added the ability to use -k one or more times. One keyword per -k.
Why not do the ","-separated list?
-k time -k stime -k ppid
Versus
-k time,stime,ppid
But I guess it is not that big a deal.
Most people will just do -k all, or is this the default?
'all' is the default. I thought we agreed to mimic the top usage, which is one option per keyword. Want something different?
OK, I thought atomic top did both, but if it uses -o foo -o bar, then LGTM.
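The argparse behaviour discussed in this thread can be reproduced with a small standalone parser. This is a sketch under assumptions: the positional name compares and the function build_parser are made up for illustration, and only the keyword choices are taken from the error message above. With nargs='+', -k swallows every following argument, including the image names; with action="append", each -k takes exactly one value, so the option can appear anywhere on the command line.

```python
import argparse

KEYWORDS = ['all', 'link', 'nlink', 'mode', 'type', 'time',
            'uid', 'gid', 'size', 'sha256digest']


def build_parser():
    parser = argparse.ArgumentParser(prog="atomic diff")
    # Hypothetical positional: the two images or containers to compare.
    parser.add_argument("compares", nargs=2)
    # One keyword per -k. default=None avoids the argparse gotcha where
    # appended values are added on top of a list default.
    parser.add_argument("-k", "--keywords", action="append",
                        choices=KEYWORDS, default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["-k", "sha256digest", "-k", "time", "foo1", "foo2"])
    # Fall back to 'all' when no -k was given.
    print(args.keywords or ["all"], args.compares)
```

Because each -k consumes a single value, a container that happens to be named time is treated as a positional argument rather than as a keyword.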
tests/integration/test_diff.sh
Outdated
| # logic can be added to a "cleanup stack", by cascading function calls | ||
| # within traps. See tests/integration/test_mount.sh for an example. | ||
| trap teardown EXIT | ||
| if [ -e ${GOMTREE} ]; then |
Probably better to indicate a skip here by returning 77.
Also, if you want these tests to run on the Atomic Host platforms until gomtree is pulled in automatically, you can simply add to the YAML:
packages:
- gomtree
@jlebon is that done with an exit 77 or some other way?
Yes exactly. See e.g. test_verify.sh.
c1c6641 to 7c4d8ea
The previous algorithm for comparing files used Python's dircmp, which performs only a shallow comparison. This left a small possibility that two files being compared could differ without the difference being caught.
We now use go-mtree to do the comparison. It can emulate the shallow comparison we had before, but it can also add a sha256digest to the comparison through the new --keywords option.
Also, made slight tweaks to the gomtree functions in Atomic.util so we can debug and influence the return of JSON data.
This solves projectatomic#761
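To illustrate why a shallow comparison can miss a change, here is a small standalone Python sketch; it does not use go-mtree or any code from this PR. It relies on filecmp.cmp with shallow=True, which treats two regular files as equal when their os.stat signatures (type, size, modification time) match, and contrasts that with a SHA-256 digest of the contents. The helper names sha256_of and demo are made up for this example.

```python
import filecmp
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo():
    """Return (shallow_equal, digest_equal) for two same-sized files."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "a")
        second = os.path.join(tmp, "b")
        with open(first, "wb") as handle:
            handle.write(b"aaaa")
        with open(second, "wb") as handle:
            handle.write(b"bbbb")
        # Force identical timestamps so only the contents differ.
        os.utime(first, (0, 0))
        os.utime(second, (0, 0))
        filecmp.clear_cache()
        shallow_equal = filecmp.cmp(first, second, shallow=True)
        digest_equal = sha256_of(first) == sha256_of(second)
        return shallow_equal, digest_equal
```

Here the shallow check reports the files as equal while the digests differ, which is the kind of case the sha256digest keyword is meant to catch.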
📌 Commit 171cba2 has been approved by
☀️ Test successful - status-atomicjenkins