Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows Compatibility #75

Closed
shibumi opened this issue Aug 25, 2020 · 10 comments
Closed

Windows Compatibility #75

shibumi opened this issue Aug 25, 2020 · 10 comments

Comments

@shibumi
Copy link
Collaborator

shibumi commented Aug 25, 2020

Description of issue or feature request:

I had a a deep-dive into Windows today via installing a Windows10 Vagrant box, starting Jetbrains GoLand in it and executing the tests for master and for my branches.

I have bad news.

Did we ever execute tests via appveyor on Windows, besides go fmt?
If so, then I am interested in what appveyor is actually doing, because the tests on Windows definitely fail.

There are several issues. First I thought it's the Github Runner, that is behaving odd, but in reality it's our code.
Somehow generateKeyID calculates a different checksum on Windows. I have no explanation for this.

Here is what I did:

  1. First I compared file sizes of our test data files, these were differently. Later I read at StackOverflow, that these are just different metrics: https://askubuntu.com/questions/573586/files-sizes-differs-from-windows-to-linux#:~:text=The%20different%20file%20structure%20which,sizes%20in%20different%20operating%20systems.

  2. So I decided to calculate the checksum on Linux via md5sum and sha256sum and on Windows via CertUtil -hashfile <input file> sha256. Well.. and all of our test data files, had a different check sum. Do you have any explanation for this?

I tried setting .gitattributes and eol=lf and such stuff, but it didn't help.

What is even more odd is, that we use pem.Decode(), so actually the file format is irrelevant. But still we are ending up with a different keyID. I expect an issue in EncodeCanonicalData, but I can't really explain what is going on there.

Current behavior:
Right now we are not Windows compatible and I doubt that we ever were windows compatible.

Expected behavior:
We should be compatible with windows.

PS:
Do we have checks for Windows/Linux compatibility with in-toto-py and/or TUF?

CC: @SantiagoTorres @trishankatdatadog maybe you have an explanation for this.

@SantiagoTorres
Copy link
Member

Hmm, first I wanted to share this admittedly not great repo to help you set up a dev machine in windows. I'm pretty sure I've used py-in-toto to generate metadata on windows and consume it on linux and vice versa.

Somehow generateKeyID calculates a different checksum on Windows. I have no explanation for this.

I thought I had explained this on slack, but I think this has to do to the fact that windows defaults to utf-16, so you may be hashing a bunch of additional zeros. Have you printed the buffers themselves before they are being passed to the hashing function? (i.e., dump the whole memory of the hashable buffer and then compare them)

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 25, 2020

Hi @SantiagoTorres
I dig a little bit deeper. Quickfix for most of the test data is using LF for the files. Maybe we should add a .gitattributes file.

Another Problem I have spotted is this one here:

 // Read key bytes
	pemBytes, err := ioutil.ReadAll(pemFile)
	if err != nil {
		return err
	}

	key, err := decodeAndParse(pemBytes)
	if err != nil {
		return err
	}

	// Use type switch to identify the key format
	switch key.(type) {
	case *rsa.PublicKey:
		if err := k.setKeyComponents(pemBytes, []byte{}, rsaKeyType, scheme, keyIdHashAlgorithms); err != nil {
			return err
		}

As shown in the snippet above, we are reading the pemBytes and using them as input for setKeyComponents. This means, that we are operating on the bytes from the file, hence it works as long as the line separator is the same (LF-LF). But when we try to import a CLRF formatted file, we will end up with a different checksum, because we will load the file bytes directly as public key string in our key object.

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 25, 2020

Update:

I wonder if .gitattributes:

*  eol=lf

would make sense for the whole project, although I think that fixing the problem above in the source code, might fix the problem in general. I will need to have a look on the other test output.

@lukpueh
Copy link
Member

lukpueh commented Aug 25, 2020

Did we ever execute tests via appveyor on Windows, besides go fmt?

Judging from the Appveyor build log it does look like it:

...
go test ./...
go: downloading golang.org/x/crypto v0.0.0-20191029031824-8986dd9e96cf
go: extracting golang.org/x/crypto v0.0.0-20191029031824-8986dd9e96cf
go: finding golang.org/x/crypto v0.0.0-20191029031824-8986dd9e96cf
ok  	github.com/in-toto/in-toto-golang/in_toto	0.837s
Build success

You could use rdp and see what's going on on the appveyor workers

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 25, 2020

Judging from the Appveyor build log it does look like it:

Has appveyor some different git settings? Maybe autocrlf = false? It definitely does not work in my Windows VM with CRLF files.

@shibumi shibumi mentioned this issue Aug 26, 2020
2 tasks
@shibumi
Copy link
Collaborator Author

shibumi commented Aug 26, 2020

I have build a patch for the issue, I have explained here: #75 (comment)

and indeed, this patch fixes our Windows compatibility problems, that I have introduced in my in-toto run PR.
Have a look on my Github Actions report for Windows: https://github.com/shibumi/in-toto-golang/runs/1029349929?check_suite_focus=true

This works now (please ignore, the send coverage error).

You can find the patch here: #76

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 26, 2020

@lukpueh This can be closed now :)

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 26, 2020

@lukpueh Ok, I think we are not done here.. I played around with the github actions branch I have and it looks like it only worked, because I have accidentally pushed a gitattributes file to it. The bugs around key loading are fixed through the PR we have merged, but we still have issues:

  shell: C:\Program Files\PowerShell\7\pwsh.EXE -command ". '{0}'"
  env:
    GOROOT: C:\hostedtoolcache\windows\go\1.14.7\x64
go: downloading golang.org/x/crypto v0.0.0-20191029031824-8986dd9e96cf
json: unsupported type: func(*testing.T)
open bad/path: The system cannot find the path specified.
--- FAIL: TestRecordArtifacts (0.00s)
    runlib_test.go:216: RecordArtifacts returned '(map[demo.layout.template:map[sha256:991b055e7001d6f1481849cecb55292e99058ca3d45be0ccaf3ba9a5b34ca6fa] foo.tar.gz:map[sha256:52947cb78b91ad01fe81cd6aef42d1f6817e92b9e6936c1e5aabb7c98514f355] tmpdir/tmpfile:map[sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad]], %!s(<nil>))', expected '(map[demo.layout.template:map[sha256:019e121a1e0a34aecde0aebb642162b11db4248c781cb8119f81f592723a0424] foo.tar.gz:map[sha256:52947cb78b91ad01fe81cd6aef42d1f6817e92b9e6936c1e5aabb7c98514f355] tmpdir/tmpfile:map[sha256:ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad]], nil)'
--- FAIL: TestInTotoRun (0.15s)
    runlib_test.go:331: InTotoRun returned '({{link Name map[demo.layout.template:map[sha256:991b055e7001d6f1481849cecb55292e99058ca3d45be0ccaf3ba9a5b34ca6fa]] map[foo.tar.gz:map[sha256:52947cb78b91ad01fe81cd6aef42d1f6817e92b9e6936c1e5aabb7c98514f355]] map[return-value:%!s(float64=0) stderr:err stdout:out] [sh -c printf out; printf err >&2] map[]} [{be6371bc627318218191ce0780fd3183cce6c36da02938a477d2e4dfae1804a6 c0d67be0689b1f9470929c023b6733124833aea33c345d737f1a5047cac75a46a3d88b9b9c81610dbc2c18d9fdb08ad40170cdc6ce3e45609c8fdcd5bff82a08}]}, %!s(<nil>))', expected '({{link Name map[demo.layout.template:map[sha256:019e121a1e0a34aecde0aebb642162b11db4248c781cb8119f81f592723a0424]] map[foo.tar.gz:map[sha256:52947cb78b91ad01fe81cd6aef42d1f6817e92b9e6936c1e5aabb7c98514f355]] map[return-value:%!s(float64=0) stderr:err stdout:out] [sh -c printf out; printf err >&2] map[]} [{be6371bc627318218191ce0780fd3183cce6c36da02938a477d2e4dfae1804a6 08a6c42b8433502f2869bb3dc73f8348f6b6f89e42bbc63f91a33e7171d762e138ed5d695fb83cebec958203e17b2285f95b198d758bc62cf30e1f7408d6c10c}]}, nil)'
WARNING: syntax error in pattern, pattern was '['
[begin test warning output]
WARNING: Expected command for step 'foo' (rm -rf .) and command reported by 'foo.a.link' (rm -rf /) differ.
[end test warning output]
FAIL
coverage: 95.6% of statements
FAIL	github.com/in-toto/in-toto-golang/in_toto	1.284s
FAIL
##[error]Process completed with exit code 1.

This time it's the demo.layout.template file

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 26, 2020

The problem seems to be the RecordArtifact function:

func RecordArtifact(path string) (map[string]interface{}, error) {

	hashObjectMap := createMap()

	// Read file from passed path
	contents, err := ioutil.ReadFile(path)

	hashedContentsMap := make(map[string]interface{})

	if err != nil {
		return nil, err
	}

	// Create a map of all the hashes present in the hash_func list
	hashFunc := []string{"sha256"}
	for _, element := range hashFunc {

		result := hashObjectMap[element].Compute(contents)

		hashedContentsMap[element] = result
	}

	// Return it in a format that is conformant with link metadata artifacts
	return hashedContentsMap, nil
}

We are reading the raw input of a file path here. It's only failing for demo.layout.template, because foo.tar.gz is a binary archive (so no line separators) and the test file tmpfile is just a string without any line separator.

I don't know how we are going to fix this now.

For the Github Actions PR I plan to ship the attributes file for now, this fixes this temporary, but we really need to think about reading layout files and link files and everything file related. What we need is a function, that normalizes line separators. @trishankatdatadog pointed me to a securesystemslib function, that does this.

@shibumi
Copy link
Collaborator Author

shibumi commented Aug 26, 2020

Note. I fixed the issue in RecordArtifact in this commit here: 04b13d5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants