Fix Unicode issue in Contain and ContainExactly. #378

Merged
merged 1 commit into from
Aug 13, 2015

Conversation

tkirill
Contributor

@tkirill tkirill commented Jun 29, 2015

Fixes #377.

These are the only places where Get-Content is used in assertions.

@tkirill
Contributor Author

tkirill commented Jun 29, 2015

How can I fix the build? Locally, all tests pass on PowerShell v4.

@dlwyatt
Member

dlwyatt commented Jun 29, 2015

Those timeouts indicate a problem on the build server, don't worry about it for now.

As for the PR, I'd be concerned about what happens if the code encounters a file that isn't ASCII or UTF-8. (UTF-16, which PowerShell calls Unicode, for instance; that is what you get by default when you use Out-File or the redirection operators.)
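To make the UTF-16 concern concrete, here is a minimal sketch of what such a file looks like on disk. It is not taken from the PR: the scratch file name is hypothetical, and the file is written through .NET rather than Out-File so the result does not depend on the defaults of a particular PowerShell version.

```powershell
# Hypothetical scratch file, used only for this illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'pester-378-utf16-sketch.txt'

# U+30EC, built from its code point so the script file's own encoding does not matter.
$text = [string][char]0x30EC

# Encoding.Unicode is UTF-16 little-endian; WriteAllText emits its BOM first.
[System.IO.File]::WriteAllText($path, $text, [System.Text.Encoding]::Unicode)

# Show the raw bytes: the BOM (FF FE) followed by the code unit in little-endian order.
$bytes = [System.IO.File]::ReadAllBytes($path)
$hex = ($bytes | ForEach-Object { '{0:X2}' -f $_ }) -join ' '
$hex   # FF FE EC 30

Remove-Item $path
```

The leading FF FE pair is the byte order mark, which gives a reader a way to recognize the file as UTF-16 regardless of what encoding the caller asked for.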

@tkirill
Contributor Author

tkirill commented Jun 29, 2015

@dlwyatt I took a character that takes 3 bytes in UTF-8 (0xE3 0x83 0xAC). I saved it to a file as UTF-16 with a BOM, and this test passed:

Describe "Get-Content" {
    It "magically works with UTF-16" {
        'test.txt' | Should Contain 'レ'

        $utf8 = Get-Content -Encoding UTF8 'test.txt'
        $utf16 = Get-Content -Encoding Unicode 'test.txt'
        $oem = Get-Content -Encoding Oem 'test.txt'

        $utf8 | Should Be 'レ'
        $utf16 | Should Be 'レ'
        $oem | Should Be 'レ'
    }
}

The test passes even without my changes, but I don't know why. Maybe when PowerShell sees a BOM it assumes UTF-16 and ignores the -Encoding parameter? This is strange. Anyway, I think it is fair to say that this PR changes nothing for UTF-16.

Also, when I save the character as UTF-8, this test

Describe "Should Contains" {
    It "works with UTF-8" {
        'test.txt' | Should Contain 'レ'

        $utf8 = Get-Content -Encoding UTF8 'test.txt'

        $utf8 | Should Be 'レ'
    }
}

fails without this PR.
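The UTF-8 case can be reproduced outside Pester with a short sketch. This is an illustration under stated assumptions, not the PR's code: the scratch file name is hypothetical, the file is written without a BOM, and the comment about the default read applies to Windows PowerShell (newer PowerShell versions default to UTF-8 and would not show the difference).

```powershell
# Hypothetical scratch file, used only for this illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'pester-378-utf8-sketch.txt'

# U+30EC, built from its code point so the script file's own encoding does not matter.
$text = [string][char]0x30EC

# UTF-8 encoder that does NOT emit a BOM, so the file carries no encoding hint.
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($path, $text, $utf8NoBom)

# The file holds exactly the three bytes quoted in the comment above.
$bytes = [System.IO.File]::ReadAllBytes($path)
$hex = ($bytes | ForEach-Object { '{0:X2}' -f $_ }) -join ' '
$hex   # E3 83 AC

# Without -Encoding, Windows PowerShell decodes BOM-less files with the system
# code page, so the three bytes are unlikely to round-trip to the original character.
$default = Get-Content $path

# With an explicit encoding, the bytes decode back to the original character.
$explicit = Get-Content -Encoding UTF8 $path
$explicit -eq $text   # True

Remove-Item $path
```

Comparing `$default` with `$explicit` on Windows PowerShell shows why an assertion that calls Get-Content without an encoding can miss the character in a BOM-less UTF-8 file.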

@dlwyatt dlwyatt self-assigned this Jul 13, 2015
@tkirill
Contributor Author

tkirill commented Aug 12, 2015

@dlwyatt If you don't like these changes, I would be happy if you could provide brief instructions on how to implement custom Should assertions. That would solve my little problem too 👍

@dlwyatt
Member

dlwyatt commented Aug 13, 2015

Sorry about that, I lost track of this one. I'll look it over now and merge if everything seems okay.

dlwyatt added a commit that referenced this pull request Aug 13, 2015
Fix Unicode issue in `Contain` and `ContainExactly`.
@dlwyatt dlwyatt merged commit cd40e83 into pester:master Aug 13, 2015
Development

Successfully merging this pull request may close these issues.

Contain doesn't support Unicode
2 participants