
Unify cmdlets with parameter 'Encoding' to be of type System.Text.Encoding #5080

Merged

Conversation

@JamesWTruher (Member) commented Oct 10, 2017

This unifies file encoding across the inbox cmdlets to be UTF-8 without a BOM on all platforms. This is a breaking change, as cmdlets on Windows previously used a number of different encodings. Sharing a single encoding supports better interoperability with traditional Linux shells.

Use ArgumentCompletionsAttribute on the parameter so it still accepts the currently used strings.
Register encoding providers on all platforms. This seems safe enough, and there are definitely more encodings available on CoreFX after this is done, which will be useful for some of the web cmdlet scenarios.

Validate that files are created with UTF-8 encoding without BOM
Update tests to validate the Encoding parameter's new type, and create new tests for parameter type validation.

Breaking Change: -Encoding Byte has been removed from the filesystem provider cmdlets. A new -Byte parameter now signals that the input is, or the output should be, a stream of bytes.
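As a sketch of the unified default (the s_defaultEncoding field name appears in the diff later in this conversation; the wrapping class here is illustrative only):

```csharp
using System.Text;

internal static class EncodingDefaults // illustrative container, not the PR's type
{
    // UTF8Encoding(false) produces UTF-8 without emitting the 3-byte
    // EF BB BF byte-order mark, matching typical Linux tooling.
    internal static readonly Encoding s_defaultEncoding =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
}
```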


// BigEndianUTF32 encoding is possible, but requires creation
internal static Encoding BigEndianUTF32Encoding = new UTF32Encoding(true,true);
Member:

Use named parameters instead of true, true
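For reference, the named-parameter form being requested (these are the actual parameter names of the UTF32Encoding(bool, bool) constructor; the container class is illustrative):

```csharp
using System.Text;

internal static class EncodingHelpers // illustrative container
{
    // Named arguments make the two booleans self-documenting.
    internal static Encoding BigEndianUTF32Encoding =
        new UTF32Encoding(bigEndian: true, byteOrderMark: true);
}
```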

Member Author:

fixed

{
return Encoding == FileSystemCmdletProviderEncoding.Byte;
} // get
get { return _usingByteEncoding; }
Member:

Please use automatic property and remove _usingByteEncoding.
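The auto-property form being requested might look like this (property name inferred from the _usingByteEncoding backing field; not necessarily the exact code that landed):

```csharp
// Replaces the private _usingByteEncoding field and the hand-written
// getter with a compiler-generated backing field.
public bool UsingByteEncoding { get; set; }
```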

Member Author:

fixed

@@ -7688,9 +7631,14 @@ public bool WasStreamTypeSpecified
{
get
{
return (Encoding != FileSystemCmdletProviderEncoding.String);
} // get
return _wasStreamTypeSpecified;
Member:

Same as above.

Member Author:

fixed

@@ -109,16 +104,17 @@ internal static Encoding GetDefaultEncoding()

/// <summary>
/// Facade for getting OEM encoding
/// OEM encodings work on all platforms, or rather codepage 437 is available on both Windows
Member:

Probably the comment intended to be codepage 437 is available on both Windows and Non-Windows

Member Author:

indeed so, fixed

internal class ByteEncoding : System.Text.Encoding
{
// the encoding for this is not bigendian and not BOM
public Encoding ActualEncoding = new UnicodeEncoding(false, false);
Member:

Use named parameters

Member Author:

fixed

[pscustomobject]@{ Key = $testStr } | export-csv $outputFile
$bytes = Get-FileBytes $outputFile
$bytes[0,1,2] -join "-" | should not be ($utf8Preamble -join "-")
$bytes -join "-" | should match ($utf8bytes -join "-")
Member:

Should this be MatchExactly

Member Author:

not needed (as they're numbers). MatchExactly isn't the same as Be; it's just a case-sensitive match (-cmatch). I just need to be sure that my string is in the output somewhere.

[pscustomobject]@{ Key = $testStr } | export-clixml $outputFile
$bytes = Get-FileBytes $outputFile
$bytes[0,1,2] -join "-" | should not be ($utf8Preamble -join "-")
$bytes -join "-" | should match ($utf8bytes -join "-")
Member:

Should this be MatchExactly

Member Author:

not needed - for the same reason as above, i'm just making sure that the magic bytes are in the file somewhere

${testStr} >> $outputFile
$bytes = Get-FileBytes $outputFile
$Expected = $( $ExpectedWithNewline; $ExpectedWithNewline )
$bytes -join "-" | should be ($Expected -join "-")
Member:

Should this be BeExactly

Member Author:

nope - they're numbers so case doesn't provide value

}
It "Encoding parameter of command '<Command>' is type 'Encoding'" -testcase $testCases {
param ( $command )
$command.Parameters['Encoding'].ParameterType.FullName | Should Be "System.Text.Encoding"
Member:

Should be BeExactly?

Member Author:

it could be - BeExactly is a case-sensitive string compare, but the case of the response is not the point, only that it is the proper fullname.
I'm in the file so I'll update it, but we should be cautious about being overly pedantic.

@@ -523,4 +523,4 @@ public override object Transform(EngineIntrinsics engineIntrinsics, object input
}

#endregion
}
Member:

Missed adding the change for the Encoding parameter?

@dantraMSFT (Contributor) commented Oct 10, 2017

internal static class EncodingConversion

I think this class should be in its own CS file; it's not a path utility and discoverability is problematic.

In fact, I think this and the associated ArgumentToEncodingTransformationAttribute should be moved into an encoding file.

internal const string Utf7 = "utf7";
internal const string Utf32 = "utf32";
internal const string Default = "default";
internal const string OEM = "oem";
internal const string Byte = "byte"; // This will return unicode, but 'byte' is special as it uses unicode encoding without a BOM
internal static ByteEncoding byteEncoding = new ByteEncoding();
Contributor:

This field should be static readonly and use the same casing as the constants defined above it (internal static readonly ByteEncoding ByteEncoding).

There are numerous references to this field in the various cmdlets' Encoding parameters, but its purpose is opaque. I suggest a comment here or, perhaps, an accessor method that encapsulates the pattern instead of repeating it in the various cmdlets, or even initializing the field to new ByteEncoding().ActualEncoding.

}
set
{
if ( value == EncodingConversion.byteEncoding )
Contributor:

I believe you want to use Object.ReferenceEquals or override Equals in your ByteEncoding class and handle equality there; otherwise, you're relying on Encoding.Equals which may not do what you expect.
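A sketch of the identity comparison being suggested (the helper name is hypothetical, shown here as a standalone fragment):

```csharp
using System.Text;

// Encoding.Equals performs value equality: two distinct instances of an
// equivalent encoding would compare equal and defeat a sentinel check.
// ReferenceEquals tests for the one specific sentinel instance.
private static bool IsByteEncodingSentinel(Encoding value)
{
    return object.ReferenceEquals(value, EncodingConversion.byteEncoding);
}
```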

}
set
{
if ( value == EncodingConversion.byteEncoding )
Contributor:

These set methods need a comment as to why the comparison to EncodingConversion.byteEncoding is needed to provide a guide when creating new cmdlets that provide an Encoding parameter.

Member:

Maybe add a helper, something like:

EncodingConversion.SetEncodingParameterHelper(ref _encoding, value);

Or:

_encoding = EncodingConversion.EncodingParameterHelper(value);


// We need a way to provide an encoding and also
// a way to notify that the user wants "byte" encoding
// which is just a stream
internal class ByteEncoding : System.Text.Encoding
Contributor:

Should this class and the EncodingConversion.byteEncoding field be moved here and made public?

In other words, is the problem you're solving with this custom instance something third-party cmdlets will need to solve as well?

Member Author:

this is really a dodge for backward compatibility. -Encoding Byte should never have existed, but it does, and I don't want this to be a difference between 6 and 5.x. This byte encoding is really needed for the *-Content cmdlets, where the parameter is used as a flag indicating that the user is looking for a stream of bytes; I would rather not make this public

Member:

Since we're already making this big breaking change, I think we should fix this by not having ByteEncoding as Byte is an artificial "encoding". The *-Content cmdlets should have a different parameter to output or receive a byte stream.

Collaborator:

It would be clearer to users to have a -Byte or -Binary parameter.
@mklement0 @rkeithhill What do you think?

Member:

I kind of prefer -AsByteArray myself.

cc @HemantMahawar @joeyaiello

Contributor:

Purely from a usability perspective, I don't have a problem with -Encoding Byte, because even though it is technically not a text encoding, the -Encoding parameter can be conceptualized as being about how to interpret the data on reading and writing, and -Encoding Byte (a byte stream) is one way of (non-)interpreting it.

This keeps all considerations regarding encoding/non-encoding focused on a single parameter, which simplifies things (it doesn't introduce another parameter that you have to learn about and remember), both in terms of use and documentation.

Collaborator (@iSazonov, Oct 16, 2017):

How can we interpret Get-Content -Encoding Utf8 | Set-Content -Encoding 'CP1251' -Byte?

$CR = [text.encoding]::unicode.GetBytes($asciiCR)
# create the expected - utf8 encoding without a bom
$encoding = [text.utf8encoding]::new($false)
$BOM = $encoding.GetPreamble()
Contributor:

This is misleading: the test is meant to verify the absence of a BOM, yet it requests the expected BOM from the very encoding under test. If the Encoding class has a bug or was created incorrectly, the test will still pass. In other words, if the array is empty, as expected, the test passes; if the array is non-empty, the test still passes.

I suggest removing $BOM completely and not relying on the implicit empty array.
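For reference, the .NET behavior that makes the check vacuous: a UTF-8 encoding constructed without a BOM returns an empty preamble, so any prefix comparison against it succeeds.

```csharp
using System.Text;

var noBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false);
byte[] empty = noBom.GetPreamble();      // empty array: matches any prefix

var withBom = new UTF8Encoding(encoderShouldEmitUTF8Identifier: true);
byte[] bom = withBom.GetPreamble();      // 0xEF, 0xBB, 0xBF
```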

Member Author:

done

$TXT = [text.encoding]::unicode.GetBytes($asciiString)
$CR = [text.encoding]::unicode.GetBytes($asciiCR)
# create the expected - utf8 encoding without a bom
$encoding = [text.utf8encoding]::new($false)
Contributor:

Suggest matching the casing of the type [Text.UTF8Encoding].

Member Author:

I fixed up the cases of everything while I was there


$testPath = Join-Path -Path $TestDrive -ChildPath 'TailWithEncoding.txt'
$content | Set-Content -Path $testPath -Encoding $encoding
$content | Set-Content -Path $testPath -Encoding $encodingName
Contributor:

Should this test cover the ByteEncoding path? It doesn't appear to.

Member Author:

covered by the tests in engine/Basic

}
else if ( encoding.IndexOf(wordToComplete, StringComparison.InvariantCultureIgnoreCase) == 0 )
{
encodings.Add(new CompletionResult(encoding, encoding, CompletionResultType.Text, encoding));
Collaborator:

What is different from Line 635?

{
// the encodingName is guaranteed to be valid, so it is safe to pass null to method
// Convert(Cmdlet cmdlet, string encoding) as the value of 'cmdlet'.
return EncodingConversion.Convert(null, encodingName);
Collaborator:

EncodingConversion.Convert() does these comparisons too. I believe we could directly call Convert() and catch an exception, change EncodingConversion.Convert to return null, or create a TryConvert().
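A TryConvert along these lines could precompute the name-to-encoding map (a hypothetical sketch; names are taken from the constants quoted above, and the table is abbreviated):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

internal static class EncodingLookup // illustrative container
{
    private static readonly Dictionary<string, Encoding> s_map =
        new Dictionary<string, Encoding>(StringComparer.OrdinalIgnoreCase)
        {
            ["ascii"] = Encoding.ASCII,
            ["unicode"] = Encoding.Unicode,
            ["bigendianunicode"] = Encoding.BigEndianUnicode,
            ["utf8"] = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false),
            ["utf32"] = Encoding.UTF32,
            // ... remaining well-known names elided
        };

    // Lookup failures surface as a bool instead of an exception.
    internal static bool TryConvert(string name, out Encoding encoding)
        => s_map.TryGetValue(name, out encoding);
}
```

A later commit message in this PR mentions converting the string matching into a dictionary, which is the same idea.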

/// Defines the values that can be supplied as the encoding parameter in the
/// FileSystemContentDynamicParametersBase class.
/// </summary>
public enum FileSystemCmdletProviderEncoding
Collaborator:

It seems we shouldn't remove public API. Maybe use [ObsoleteAttribute].

Member:

Jim, can you check if this API is in PowerShell Standard 3.0? This enum isn't used at all now, right? So leaving it in with Obsolete doesn't help.

uint currentAnsiCp = NativeMethods.GetACP();
s_defaultEncoding = Encoding.GetEncoding((int)currentAnsiCp);
#endif
s_defaultEncoding = new UTF8Encoding(false);
Collaborator:

I don't like the change. Previously (discussion #3248, PR #3467) I fixed this for Windows because we are still very sensitive to the OEM encoding. We have tons of applications in enterprises that don't work if we don't set the correct OEM encoding on the system. If MSFT wants to cut it off, do it first in the Windows 10 kernel and get feedback.

Member Author:

@iSazonov where are the tests which provide validation for the changes you made in PR #3467? The discussion in #3248 doesn't seem to have any concrete examples of misbehavior, which I would really like to understand. As for your last point, I would rather get feedback on breaking behavior outside of the Windows ship cycles.

Member:

@iSazonov, @PowerShell/powershell-committee has already decided to go with UTF8 (No BOM) as default. Can you give me a specific example where this is known to break? If OEM is needed, perhaps we could add a setting in the powershell json file but still default to utf8

Collaborator:

PR #3467 has no tests because (1) we did not have a full set of encoding tests at the time, (2) we were waiting for the encoding RFC, and (3) we only restored Windows PowerShell behavior by removing a breaking change.
About concrete examples: I am speaking about business applications, and I have no idea how to demonstrate it concretely. I understand that you hardly ever changed the OEM encoding on your computers. But ask yourself why customers are forced to do so. If MSFT wants to see and understand it, it needs a TAP with non-English enterprises. As I said earlier, the simple answer is that we have many business applications that only work with a specific OEM encoding and read/write files in OEM. PowerShell also works in this environment! The #3467 was caused by the fact that the scripts no longer worked and we needed to re-encode the files. With that, the value of PowerShell Core disappears completely.

We can look at it from the other side. A year ago there was a categorical position that we should be backward compatible and avoid making breaking changes as much as possible. The position was later changed: now we want to ease adoption by UNIX users. This has led to the creation of many incompatible changes, and the current change is from this series. These two principles have proved to be contradictory; we need to decide which one is in charge.

The first means that we want to allow many Windows users to use PowerShell on Unix without much effort, and to get feedback to start the process of evolutionary convergence of the platforms.

The second actually means that we are creating a brand-new product that cannot be version 6, but must be PowerShell Core 1.0. It is fair to say that Windows users will have to migrate to a new product instead of using it directly. In this case, I agree that we can be ahead of the Windows 10 system, and even its Notepad, and use UTF-8 by default like all the other changes. Only then, the community should not be expected to begin using PowerShell Core extensively before version 3.
The second way is very dangerous. Microsoft can break the Windows PowerShell community and fail to build a new one; there are already well-known and popular alternatives (PHP, JS, Perl) that have proven their reliability and performance and have already formed significant communities. I have seen examples of Microsoft quickly destroying communities before; the recovery took several years. It's sad. If it were my business, I would prefer to keep today's real customers, not abstract future ones, and make the first step something similar to WSL: a Windows PowerShell for Unix, to capture a reliable bridgehead. Only then would the community be prepared to accept breaking changes as understandable and necessary. Sorry for so many words :-)

Member:

It seems again that a viable option is to have a setting for powershell.runtimeconfig.json to have this be OEM for compat reasons.

Contributor:

I haven't followed this closely, but what happened to the $PSDefaultEncoding = 'WindowsLegacy' proposal from the RFC?

Contributor:

And, just to clarify: for backward compatibility, it's both OEM and "ANSI" code pages that must be considered.

Member:

@mklement0 the simplified version we scoped for 6.0.0 does not include WindowsLegacy

Member:

@JamesWTruher can you open an issue to cover @iSazonov 's concern?

Member Author:

@mklement0 WindowsLegacy was pulled waiting for customer need - opened #5204 to track.

@@ -459,7 +478,7 @@ internal static Encoding Convert(Cmdlet cmdlet, string encoding)
{
if (string.IsNullOrEmpty(encoding))
{
// no parameter passed, default to Unicode (OS preferred)
// no parameter passed, default to (OS preferred)
Collaborator:

Typo?

Member Author:

indeed so, and incorrect (it should be UTF8) - fixed

// We will provide a partial list for tab completion we will convert to an encoding.
// If the user has an encoding object, we will cheerfully use it, but we can't use
// it as a validation set, as a user may have a specialty encoding which we have
// no idea about.
Collaborator:

I think we could use ArgumentCompletionsAttribute #4835

new string[] { Unknown, String, Unicode, BigEndianUnicode, Ascii, Utf8, Utf7, Utf32, Default, OEM });
string msg = StringUtil.Format(PathUtilsStrings.OutFile_WriteToFileEncodingUnknown,
encoding, validEncodingValues);
new string[] { Unknown, String, Unicode, BigEndianUnicode, Byte, Ascii, Utf8, Utf8Bom, Utf8NoBom, Utf7, Utf32, Default, OEM });
Collaborator:

Can we reuse TabCompletionResults ?

[ValidateSetAttribute(new string[] { "Unicode", "UTF7", "UTF8", "ASCII", "UTF32", "BigEndianUnicode", "Default", "OEM" })]
public string Encoding { get; set; }
[ArgumentToEncodingTransformationAttribute()]
[ArgumentCompleter(typeof(EncodingArgumentCompleter))]
Collaborator:

I think we could use ArgumentCompletionsAttribute #4835

@@ -470,6 +489,9 @@ internal static Encoding Convert(Cmdlet cmdlet, string encoding)
if (string.Equals(encoding, String, StringComparison.OrdinalIgnoreCase))
return System.Text.Encoding.Unicode;

if (string.Equals(encoding, Byte, StringComparison.OrdinalIgnoreCase))
return byteEncoding;
Member:

If we aren't using a dictionary for fast lookup, we can at least order the tests from most likely (guessing of course) to least likely.

e.g., is Unknown really the most likely case? I hope not.

Member Author:

reordered

// the encoding for this is not bigendian and not BOM
public Encoding ActualEncoding = new UnicodeEncoding(false, false);
public override unsafe int GetByteCount(char* c, int i1) { throw new NotImplementedException(); }
public override int GetByteCount(char[] c, int i1, int i2) { throw new NotImplementedException(); }
Member:

I believe it's not hard for someone to get an instance of this encoding, so maybe we should be implementing all of these methods, assuming it's straightforward to do so (and mostly it seems like it is.)
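Since the class wraps a concrete UnicodeEncoding in its ActualEncoding field, delegating overrides are straightforward; a sketch for two of them (fragment only, shown outside the class body):

```csharp
// Forward to the wrapped UTF-16 (no BOM) encoding instead of throwing.
public override int GetByteCount(char[] chars, int index, int count)
    => ActualEncoding.GetByteCount(chars, index, count);

public override int GetBytes(char[] chars, int charIndex, int charCount,
                             byte[] bytes, int byteIndex)
    => ActualEncoding.GetBytes(chars, charIndex, charCount, bytes, byteIndex);
```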

Member Author:

done



@SteveL-MSFT added the 'Review - Committee' label (the PR/Issue needs a review from the PowerShell Committee) Oct 13, 2017
internal const string Utf7 = "utf7";
internal const string Utf32 = "utf32";
internal const string Default = "default";
internal const string OEM = "oem";
internal const string Byte = "byte"; // This will return unicode, but 'byte' is special as it uses unicode encoding without a BOM
Contributor (@mklement0, Oct 15, 2017):

Not knowing the inner workings: How will this return Unicode? With Get-Content, you get a byte array, and on output (Set-Content / Add-Content) you're forced to provide a byte array as input too (however you constructed that).

@JamesWTruher (Member Author):

@adityapatwardhan @dantraMSFT @SteveL-MSFT I believe I've made updates that address your blocking issues

@SteveL-MSFT (Member):

@PowerShell/powershell-committee reviewed this and has agreed on:

  1. name the parameter -AsByteArray
  2. output warning if -Encoding is used with -AsByteArray that Encoding is not being used

@SteveL-MSFT added the 'Committee-Reviewed' label (PS-Committee has reviewed this and made a decision) and removed the 'Review - Committee' label Oct 18, 2017
@mklement0 (Contributor) commented Oct 19, 2017

A message from the COSP (Crying Over Spilt Milk) department:

output warning if -Encoding is used with -AsByteArray that Encoding is not being used

This results in an awkward user experience, because users will expect incompatible parameters to be in different parameter sets, with attempts to combine them resulting in an error, not a warning.

A step backward in usability, and a breaking change to boot - and no gain that I can discern.

I also agree with @iSazonov that forgoing compatibility with the legacy encoding behavior on Windows altogether will dismay many Windows users, especially those whose "ANSI" / OEM code pages aren't based on the Latin alphabet.

If the idea is to focus on the Unix crowd first, I see trouble as well: broken quoting, broken CLI, broken globbing, lack of && and ||, ...

@SteveL-MSFT (Member):

@mklement0 one of the considerations the committee acknowledged is the complexity of using multiple parameter sets to make parameters mutually exclusive, so we were OK with using a warning as a simplification until we can revisit the recalled RFC on making it easier to both define and understand mutually exclusive parameters

…oding

Created support classes so tab completion would continue to work
Create an ArgumentTransformationAttribute for the parameter to accept the currently used strings
Register encoding providers on all platforms. This seems safe enough, and there are definitely more
encodings available on CoreFX after this is done, which will be useful for some of the web cmdlet scenarios

Validate that files are created with UTF-8 encoding without BOM
Update tests to validate Encoding parameter to new type and create new tests for
parameter type validation.
refactor encoding class and other encoding utilities into new file
update redirection operator tests to
convert string matching for encoding into dictionary, which should make it more efficient
Add a new parameter to the filesystem provider cmdlets to indicate whether we want a byte stream
Unify Send-MailMessage with other cmdlets
Created a warning when both -AsByteStream and -Encoding are used together
updated tests to check that the warning is emitted
Also improve encoding completion to handle wildcards
Also make sure that Encoding parameters are properties rather than fields.
Remove unused OpenStreamReader which takes a string since it's no longer in use
…bute for encoding attribute.

Simplify logic for converting string to encoding
EncodingConversion.Utf8Bom,
EncodingConversion.Utf8NoBom,
EncodingConversion.Utf32
)]
Member:

How about using ArgumentCompletions(EncodingConversion.TabCompletionResults) here? It would be easy for future updates too.

Member:

Same comments for other instances of ArgumentCompletions usage.

Member Author:

that doesn't work, because arrays aren't constant (I tried that first)

Member:

OK, then we are good here. #Closed

@daxian-dbw (Member):

@dantraMSFT Can you please take another look and update your review?

@daxian-dbw dismissed dantraMSFT's stale review October 24, 2017 02:39

Comments have been addressed by later commits.

@daxian-dbw (Member) commented Oct 24, 2017

I'm getting this PR merged so that it can be validated by today's daily builds and, hopefully, catch the Beta.9 release.
Actually, this PR should have had its last commit marked with the '[Feature]' tag to trigger a full build run.

@daxian-dbw merged commit be70072 into PowerShell:master Oct 24, 2017
@zjalexander added the 'Documentation Needed in this repo' label Nov 15, 2017
@joeyaiello removed the 'Documentation Needed in this repo' label Oct 15, 2018
@JamesWTruher deleted the jameswtruher/fileEncodingMin branch May 11, 2022 18:27
Labels: Breaking-Change (breaking change that may affect users), Committee-Reviewed (PS-Committee has reviewed this and made a decision)
10 participants