Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

md5sum matcher returns wrong value for a binary file in windows #1898

Closed
kenichi-yamada-nssol opened this issue Jun 7, 2017 · 6 comments
Closed
Assignees
Labels
Type: Bug Feature not working as expected

Comments

@kenichi-yamada-nssol
Copy link

Description

md5sum matcher returns wrong hash value for a binary file in Windows environment. I suspect the wrong value is calculated from wrong bits obtaining through unnecessary conversion caused by using of text mode. Please fix it to use binary mode for a binary file?

InSpec and Platform Version

inspec version
1.25.1

Replication Case

obtaining md5sum value for notepad.exe using Certutil.

CertUtil -hashfile C:\Windows\notepad.exe MD5
3b 50 8c ae 5d eb cb a9 28 b5 bc 35 55 17 e2 e6

control/md5sum.rb

describe file('C:\Windows\notepad.exe') do
  its('md5sum') { should eq '3b508cae5debcba928b5bc355517e2e6' }
end

execute test

inspec exec md5sum

Profile: InSpec Profile (md5sum)
Version: 0.1.0
Target:  local://


File C:\Windows\notepad.exe
     [MAJR]  md5sum should eq "3b508cae5debcba928b5bc355517e2e6"

     expected: "3b508cae5debcba928b5bc355517e2e6"
          got: "058e95f26d995a71eaf2352b0c1c0566"

     (compared using ==)


Test Summary: 0 successful, 1 failures, 0 skipped

I can get the same value as Inspec reported like below:

irb
irb(main):001:0> require 'digest'
=> true
irb(main):002:0> Digest::MD5.hexdigest(IO.read('C:/Windows/notepad.exe', {:mode=>'r', :external_encoding=>'UTF-8', :internal_encoding=>'UTF-8', :textmode=>true}))
=> "058e95f26d995a71eaf2352b0c1c0566"

Possible Solutions

Stacktrace

@adamleff
Copy link
Contributor

adamleff commented Jun 7, 2017

This is caused by Train (the underlying transport library that InSpec uses to run commands and read files) gets the content of the file and pulls it back to the host running InSpec in order to compute the checksum. The checksum is not calculated on the remote host.

On Windows, the file content is gathered as such:

Get-Content("path-to-file") | Out-String

... which you can see in the code here: https://github.com/chef/train/blob/master/lib/train/extras/file_windows.rb#L34

Outputting a binary file as a string and then calculating the checksum is obviously not going to yield the results you're looking for. I don't believe this was ever intended to do checksums on non-text files, hence the issue you're experiencing.

Here's what it looks like as if InSpec were doing the calculations. You can see the checksums are different even though the source file is the same - one is the original file, and one is the string representation of the original file:

PS C:\Users\vagrant> get-content c:\windows\system32\notepad.exe | out-string > notepad.out
PS C:\Users\vagrant> CertUtil -hashfile notepad.out MD5
MD5 hash of file notepad.out:
d3 34 6a e8 25 8e 25 e1 da bb 35 cc d9 a3 cc 82
CertUtil: -hashfile command completed successfully.
PS C:\Users\vagrant> CertUtil -hashfile c:\windows\system32\notepad.exe MD5
MD5 hash of file c:\windows\system32\notepad.exe:
95 9a 31 d0 cd 01 3c ea 0c 66 db 7c 03 bc bd df
CertUtil: -hashfile command completed successfully.

Thank you for logging this issue, @kenichi-yamada-nssol. I will flag this as a bug and leave this issue opened for further analysis by the other maintainers.

@adamleff adamleff added the Type: Bug Feature not working as expected label Jun 7, 2017
@kenichi-yamada-nssol
Copy link
Author

kenichi-yamada-nssol commented Jun 8, 2017

For your information, using the combination "Get-Content -Encoding Byte" and "Set-Content -Encoding Byte" can handle a file as keep the same as original bytes. Any other similar method is acceptable, it is important to avoid CRLF conversion in text mode, anyway.

Get-Content -path c:\windows\notepad.exe -Encoding Byte -ReadCount 0 | Set-Content -path notepad.out -Encoding Byte
CertUtil -hashfile notepad.out MD5
MD5 ハッシュ (ファイル notepad.out):
3b 50 8c ae 5d eb cb a9 28 b5 bc 35 55 17 e2 e6
CertUtil: -hashfile コマンドは正常に完了しました。

@adamleff
Copy link
Contributor

@kenichi-yamada-nssol the challenge for us is that we compute things like checksums on the machine on which InSpec was initially run. So I could initiate an InSpec run against a Windows host from my macOS workstation and I should be able to compute the checksum of a Windows file.

Currently, we do that by grabbing the content of the file (incorrectly, I might add, for binary files) and bringing it back to the InSpec workstation and then using the Ruby stdlib to compute the digest. While I realize this isn't terribly efficient, it's the only way we can guarantee a higher rate of success; for example, Certificate Utilities may not be installed on the target Windows machine, so we don't want to rely on using CertUtil.

Your tip of using -Encoding Byte is great! However, I still haven't been able to figure out how to pack the resulting list of byte ints back into a Ruby object such that the content matches what was on the target system and the checksums match. If you have suggestions, I'm all ears!

@username-is-already-taken2
Copy link
Contributor

@adamleff - what do you think about adding a md5 method to the file resource and shelling to compute the hash?

Doing something like this in windows?

image

@adamleff
Copy link
Contributor

adamleff commented Sep 7, 2017

I know so little about Windows... is Get-FileHash something guaranteed to be installed on most Windows machines? If so, I'm totally good with it, but I wouldn't add it to InSpec... I'd probably modify Train's md5sum method to calculate the hash on the target.

This is where it's defined in Train: https://github.com/chef/train/blob/master/lib/train/extras/file_common.rb#L44-L50

I'd override that method within https://github.com/chef/train/blob/master/lib/train/extras/file_windows.rb

@jerryaldrichiii
Copy link
Contributor

Closing this error in favor of tracking in inspec/train#235.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Feature not working as expected
Projects
None yet
Development

No branches or pull requests

4 participants