Update relevant files to UTF8 with BOM #214

Closed
dtgm opened this issue Jun 9, 2016 · 1 comment

dtgm commented Jun 9, 2016

The notes at https://github.com/chocolatey/choco/wiki/CreatePackages#character-encoding regarding BOMs are inaccurate. The wiki needs to be updated.

  • Do not save your *.nuspec files with a Byte Order Mark (BOM). A BOM is neither required nor recommended for UTF-8, because it can lead to several issues.

This is largely FUD; while older programs may have issues with a BOM, including one has more benefits than drawbacks.

  • PowerShell scripts need to be saved in UTF-8 with BOM. PowerShell is ignoring the standards and needs a BOM in order to recognize scripts as UTF-8. Otherwise it processes non ASCII characters incorrectly.

I realized many of my PS1 files do not have a BOM.
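
To illustrate the quoted PowerShell point, here is a minimal sketch of what goes wrong when UTF-8 bytes are decoded with a single-byte legacy encoding, which is roughly what happens when a BOM-less script is read using the system ANSI code page. ISO-8859-1 (code page 28591) is used only as a stand-in for that code page; that choice is an assumption, and the variable names are illustrative.

```powershell
# Build the test string from a code point so this snippet itself stays pure ASCII.
$original = [string][char]0x00E9                      # LATIN SMALL LETTER E WITH ACUTE

# Encode as UTF-8 without a BOM: one character becomes two bytes (0xC3 0xA9).
$utf8NoBom = New-Object System.Text.UTF8Encoding($false)
$bytes = $utf8NoBom.GetBytes($original)

# Assumption: ISO-8859-1 stands in for the legacy ANSI code page of the machine.
$legacy = [System.Text.Encoding]::GetEncoding(28591)

# Decoding the same bytes with the wrong encoding yields two characters
# (U+00C3 U+00A9) instead of the single original one.
$misread = $legacy.GetString($bytes)

# Decoding as UTF-8 recovers the original character.
$roundTrip = $utf8NoBom.GetString($bytes)

"{0} byte(s); misread length {1}; round-trip matches: {2}" -f $bytes.Length, $misread.Length, ($roundTrip -eq $original)
```

With a BOM present, the reader has an unambiguous signal that the bytes are UTF-8, so the second decoding path is taken.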

dtgm closed this as completed in 3dcd713 Jun 9, 2016

dtgm commented Jun 9, 2016

$path = "C:\dev\choco\chocolatey-packages\automatic\[a-zA-Z0-9]*"
$include = @(
  "*.ahk",
  "*.md",
  "*.nuspec",
  "*.ps1",
  "*.psm1",
  "*.py",
  "*.txt",
  "*.xml"
)
$files = get-childitem -Path $path -Include $include -Recurse
# Find files that already start with the UTF-8 BOM (EF BB BF) by reading only the first 3 bytes of each file
# See http://superuser.com/a/418520/64039
Function ContainsBOM
{
    return $input | where {
        $contents = new-object byte[] 3
        $stream = [System.IO.File]::OpenRead($_.FullName)
        $stream.Read($contents, 0, 3) | Out-Null
        $stream.Close()
        $contents[0] -eq 0xEF -and $contents[1] -eq 0xBB -and $contents[2] -eq 0xBF }
}
$filesBomTrue = New-Object System.Collections.ArrayList($null)
$filesBomFalse = New-Object System.Collections.ArrayList($null)
foreach ($f in $files) {
  if ($f | ContainsBOM) {[void]$filesBomTrue.Add($f)}
  else {[void]$filesBomFalse.Add($f)}
}
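# Re-save every BOM-less file as UTF-8 with BOM ([text.encoding]::UTF8 writes the BOM)
foreach ($f in $filesBomFalse) {
  # -LiteralPath avoids wildcard expansion for file names containing [ or ]
  $text = Get-Content -Raw -Encoding UTF8 -LiteralPath $f.FullName
  [io.file]::WriteAllText($f.FullName, $text, [text.encoding]::UTF8)
}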

This repo will now require files to be encoded as UTF-8 with BOM.
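
A check along the following lines could be used to confirm that a file meets this requirement after conversion. It is only a sketch: `Test-Utf8Bom` is a hypothetical helper name, not something defined in this repo, and the demo works on a throwaway temp file rather than real package files.

```powershell
# Hypothetical helper: returns $true when the file starts with the UTF-8 BOM (EF BB BF).
function Test-Utf8Bom {
    param([Parameter(Mandatory = $true)][string]$LiteralPath)
    $buffer = New-Object byte[] 3
    $stream = [System.IO.File]::OpenRead($LiteralPath)
    try {
        $read = $stream.Read($buffer, 0, 3)
    }
    finally {
        # Release the file handle even if the read throws.
        $stream.Dispose()
    }
    return ($read -eq 3 -and $buffer[0] -eq 0xEF -and $buffer[1] -eq 0xBB -and $buffer[2] -eq 0xBF)
}

# Demo: create a BOM-less temp file, convert it, and check before and after.
$tmp = [System.IO.Path]::GetTempFileName()
$noBom = New-Object System.Text.UTF8Encoding($false)
[System.IO.File]::WriteAllText($tmp, "Write-Host 'hi'", $noBom)
$before = Test-Utf8Bom -LiteralPath $tmp

# Same conversion idea as the script above: read as UTF-8, write back with BOM.
$text = [System.IO.File]::ReadAllText($tmp, $noBom)
[System.IO.File]::WriteAllText($tmp, $text, [System.Text.Encoding]::UTF8)
$after = Test-Utf8Bom -LiteralPath $tmp

Remove-Item -LiteralPath $tmp
"BOM before: $before; BOM after: $after"
```

The `try`/`finally` around the read is a design choice so that a locked or unreadable file does not leave a handle open while scanning many files.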
