Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Detect Unix paths on Unix and Dos&Unc paths on Windows #15925

Merged
merged 36 commits into from Mar 4, 2017

Conversation

tmds
Copy link
Member

@tmds tmds commented Feb 7, 2017

Fixes https://github.com/dotnet/corefx/issues/1745

\CC @stephentoub

TODO:

  • Equals test HashCode check 4a7ce12

{
// Implicit file
yield return new object[] { new Uri("/sharepath/path/file"), new Uri("/sharepath/path/file"), true };
yield return new object[] { new Uri("/sharepath/path/file"), new Uri("/sharepath/path/File"), false };
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stephentoub
These two inputs are failing the test (note: file vs File).

yield return new object[] { new Uri("file:///sharepath/path/file"), new Uri("file:///sharepath/path/File"), false };
yield return new object[] { new Uri("/sharepath/path/file"), new Uri("/sharepath/path/File"), false };

The Uri's are not Equal (great!), but the test also expects the HashCode to be the same. Change the test?

@@ -36,13 +36,23 @@ public void GetNormalizedStrongNameHash()
}

[Fact]
public void GetNormalizedUrlHash()
[PlatformSpecific(TestPlatforms.Windows)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: @sepidehms just went through all of our platform-specific tests and added comments next to them to indicate why they're platform-specific (#15932). Could you add such comments for these?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

public void GetNormalizedUrlHash_Unix()
{
// Validating that we match the exact hash the desktop IsolatedStorage implementation would create.
Uri uri = new Uri(@"file:///Users/jerem/Documents/Visual Studio 2015/Projects/LongPath/LongPath/bin/Debug/TestAssembly.EXE");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This path can be used in the Uri ctor on Windows as well, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Will use same test for both.

@@ -20,6 +20,8 @@ public class Tests
private const string Mime_MediaTypeNames_Image_Jpeg = "image/jpeg"; // System.Net.Mime.MediaTypeNames.Image.Jpeg
private const string s_DocumentXml = @"<Hello>Test</Hello>";
private const string s_ResourceXml = @"<Resource>Test</Resource>";
private static readonly bool s_IsWindowsSystem = Path.DirectorySeparatorChar == '\\';
private static readonly string s_LocalFile = s_IsWindowsSystem ? @"c:/resources/image1.jpg" : "/resources/image1.jpg";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s_LocalFile => s_localFile (or maybe s_fullPathToLocalFile, or something like that)

@@ -80,7 +80,7 @@ public void GetRequestResponseAfterAbort_Throws()
[Fact]
public void NotImplementedMembers_Throws()
{
WebRequest request = WebRequest.Create("file://anything");
WebRequest request = WebRequest.Create("file:///anything");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These paths previously worked on both Windows and Unix without the third slash. Is that changing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed. On Unix, Unix paths must start with a '/' on Windows the behavior is unchanged.
00a1a71#diff-06f80dbdb7963930a25fb8a935d08e0dR2193

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does that have to change? I'm concerned about this being a breaking change. Are there no valid situations someone could have been using this today without the extra '/'?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does that have to change? I'm concerned about this being a breaking change. Are there no valid situations someone could have been using this today without the extra '/'?

I wrote the requirement more strict at first. The path could not be empty and had to start with a slash '/'. At that point, I updated the tests. I then relaxed it to, if there is a path it must start with '/'.

file://anything -> host=anything, path=/
file:///anything -> host=, path=/anything

So the tests will now also work with 'file://anything'.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are tests for these cases as well? The below are also valid on Unix..

file:// -> host=, path=, AbsoluteUri should be file:///
file:/// -> host=, path=/, AbsoluteUri should be file:///

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Priya91 these are tested here: https://github.com/dotnet/corefx/pull/15925/files#diff-37ac644755d2501e671ac255288654a3R715

 +                // Explicit with empty host and empty path
 +                yield return new object[] { "file://", "/", "", "" };
 +                // Explicit with empty host and non-empty path
 +                yield return new object[] { "file:///", "/", "", "" };
 +                yield return new object[] { "file:////", "//", "", "" };

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revert the change of unix paths having to start with '/'. (For implicit paths, this is the case still.) But for file:// uris, a backslash will do to. The Uri class will convert these backslashes to forwardslashes.
e.g.:

/one\two ->          path = /one\two
file://\one\two ->   path = /one/two
file:///one%5Ctwo -> path = /one\two

_flags |= Flags.UnixPath;
}
//Windows: UNC path?
else if (IsWindowsSystem && _syntax.InFact(UriSyntaxFlags.FileLikeUri) && (i - idx >= 2 && i - idx != 3 &&
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe Roslyn will effectively trim away the dead branches based on it seeing that IsWindowsSystem is hardcoded to be true or false. Can you double-check the IL just to confirm? If it doesn't, we may want to rethink this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checked. Roslyn trims away the dead branches.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks.

@@ -2177,6 +2190,12 @@ private unsafe ParsingError PrivateParseMinimal()
if (err != ParsingError.None)
return err;

// Unix paths must start with '/'
if (InFact(Flags.UnixPath) && idx < (ushort)length && pUriString[idx] != '/')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InFact(Flags.UnixPath) will only ever return true if we're on a unix system, right (since otherwise it wouldn't have been recognized)? If so, we may want to preface this check with !IsWindowsSystem &&... that way, assuming my previous comment about Roslyn is correct, this whole if statement would end up being compiled away on Windows and would only be in the Unix binary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Same comment applies to other similar situations)

@@ -3648,8 +3667,8 @@ private unsafe void ParseRemaining()
}
}

//NB: A string must have at least 3 characters and at least 1 before ':'
if (idx + 2 >= length || end == idx)
//NB: A string must have at least 3(Windows)/1(Unix) characters and at least 1 before ':'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I assume the 3/1 chars being referred to here are:

  • On Unix the opening slash
  • On Windows the opening drive letter, colon, and slash?

If so, might be worth clarifying that further in the comment. If not, I'm obviously not understanding, so it'd be worth spelling it out :)

@@ -184,7 +184,7 @@ public void Iri_IncompleteEscapedCharacter()
[Fact]
public void Iri_ReservedCharacters()
{
string input = "/?#??#%[]";
string input = "?/#??#%[]";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changed because the original input would trigger recognition as a unix path? Is there any concern about such things in the wild?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would actually cause an IndexOutOfRange exception to be thrown. I investigated it further now, and the cause was newHost being an empty string instead of null.
Reverted the IriTest and fixed here: tmds@c13d66a.


namespace System.PrivateUri.Tests
{
public static class UriTests
{
private static readonly bool s_IsWindowsSystem = Path.DirectorySeparatorChar == '\\';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other test suites, we've used RuntimeInformation for this instead of adding such statics, e.g. RuntimeInformation.IsOSPlatform(OSPlatform.Windows)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use PlatformDetection.IsWindows, include it as a linked filed from Common\test\System location.

uri = new Uri(@"file://c:/directory/filename");
b = uri.IsWellFormedOriginalString();
Assert.False(b);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can/should we validate the inverse on Unix, that these aren't recognized as file paths?

Though related to some of my previous comments, I also wonder whether we can/should try to make the implementation recognize, on unix, unix file paths in addition to rather than instead of windows ones.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests verify these are invalid Uris on Unix (https://github.com/dotnet/corefx/pull/15925/files#diff-37ac644755d2501e671ac255288654a3R1295) The constructor throws for these strings.

+                // Implicit Windows drive 
+                yield return new object[] { @"c:\", UriKind.Absolute };
+                yield return new object[] { @"c:\path\file", UriKind.Absolute };
+                
+                // Implicit UNC 
+                yield return new object[] { @"\\unchost", UriKind.Absolute };
+                yield return new object[] { @"\\unchost\", UriKind.Absolute };
+                yield return new object[] { @"\\unchost\path\file", UriKind.Absolute };

I also wonder whether we can/should try to make the implementation recognize, on unix, unix file paths in addition to rather than instead of windows ones.

Technically it is possible. I doubt whether it will be used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it is possible. I doubt whether it will be used.

It may not be, or at least not on purpose. But since such code would have worked on desktop and on .NET Core 1.x on Unix, if we can support it, it could be good, too. What does Mono do with such paths on Unix?

else
{
yield return new object[] { "/some/path" };
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could be:

yield return new object[] { s_IsWindowsSystem ? "\\\\wddata\\some\\path" : "/some/path" };

@stephentoub
Copy link
Member

let me know if you if I should add support for Dos paths? and Unc paths starting with '\' and '/'?

Let's try to make this fix as additive-only as possible, i.e. remove as little existing behavior as possible to add support for recognizing Unix file paths. Especially since Mono supports those things, I would like us not to drop such support unless it's absolutely necessary (and it sounds like it's not).

@Priya91
Copy link
Contributor

Priya91 commented Feb 21, 2017

@tmds There are some merge commits in the PR history, can you please clean them up.

@stephentoub
Copy link
Member

There are some merge commits in the PR history, can you please clean them up.

Once we're all happy with the PR, we can merge it via "squash and merge" rather than "create a merge commit" to get rid of them.

@tmds
Copy link
Member Author

tmds commented Feb 22, 2017

Making it additive and doing cleanup made the PR much more digestible.
@stephentoub @Priya91 @CIPop @davidsh ptal

@stephentoub
Copy link
Member

@dotnet-bot test this please

@@ -81,6 +83,7 @@ public void IriRelativeResolution_CompareImplcitAndExplicitUncWithNoUnicode_AllP
}

[Fact]
[PlatformSpecific(TestPlatforms.Windows)] // Unc paths must start with '\' on Unix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Unc paths must start with \ on Unix

Is this how Mono does it as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks for working on this, @tmds.

@stephentoub
Copy link
Member

@dotnet-bot test Outerloop Windows_NT Debug please
@dotnet-bot test Outerloop Ubuntu14.04 Debug please
@dotnet-bot test Outerloop OSX Debug please

@tmds
Copy link
Member Author

tmds commented Feb 25, 2017

Looks great. Thanks for working on this, @tmds.

Thanks. Making it additive and some refactoring really improved the PR. Will someone else do a review also?

@stephentoub
Copy link
Member

@geoffkizer, @CIPop, @JeremyKuhne, can one or more of you review as well?

@@ -95,6 +95,7 @@ private enum Flags : ulong
// and Host values in case of a custom user Parser
DosPath = 0x08000000,
UncPath = 0x10000000,
UnixPath = 0x100000000000,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be listed in order.

@@ -45,8 +47,8 @@ public void IriRelativeResolution_CompareImplcitAndExplicitFileWithReservedChar_
[Fact]
public void IriRelativeResolution_CompareImplcitAndExplicitFileWithUnicodeIriOn_AllPropertiesTheSame()
{
string unicodeImplicitTestFile = @"c:\path\\u30AF\path3\\u30EB\u30DE.text";
string nonUnicodeImplicitFileBase = @"c:\path\file.txt";
string unicodeImplicitTestFile = s_isWindowsSystem ? @"c:\path\\u30AF\path3\\u30EB\u30DE.text" : "/path//u30AF/path3//u30EB/u30DE.text";

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say, I'm confused by this test. What's the point of the \u30AF etc stuff if it's not meant to be a Unicode char?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably it should be Unicode. Shall I change it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. What's the intent of the test? Is there history that shows that this was intended to test Unicode chars? Maybe @CIPop or @davidsh understand this better?

@stephentoub
Copy link
Member

@geoffkizer, @CIPop, @JeremyKuhne, are you guys good with these changes?

@stephentoub
Copy link
Member

Ok, given the lack of further responses in the last week, I'm going to go ahead with this.

@tmds, thanks for working on this!

@stephentoub stephentoub merged commit 7df6b74 into dotnet:master Mar 4, 2017
@karelz karelz modified the milestone: 2.0.0 Mar 7, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…#15925)

* Uri: detect Unix path on Unix, detect Dos&Unc path on Windows

* Update System.IO.Packaging/tests

* Update System.Xml.XmlSchema.XmlSchemaValidatorApi.Tests

* Update Common.Tests

* Update System.Private.Uri.Functional.Tests

* Update System.Private.Uri.ExtendedFunctional.Tests

* Update System.Runtime/tests

* Uri.MethodsTests.cs: comment out failing inputs

* Restore uri minimal length of 3

* Require unix paths to start with a /

* Allow Unix path file uri to start with '\'

* FileWebRequestTest: revert extra slashes added (file:/// back to file://)

* Uri: fix empty newHost bug and revert workaround in IriTest

* Annotate PlatformSpecific tests

* IdentityHelperTests: use file path that works on Windows and Linux

* System.IO.Packaging/tests: rename s_LocalFile to s_fullPathToLocalFile

* Uri: update comments on minimal Uri length check, don't check length twice on Windows

* Use PlatformDetection.IsWindows

* Ensure Roslyn gets rid of UnixPath checks on Windows by adding IsWindowsSystem (const bool) checks

* Uri: Fix uapaot build by including Uri.Windows.cs

* IdentityHelperTests: rename GetNormalizedUrlHash_Unix -> GetNormalizedUrlHash

* Uri.MethodsTests: don't require GetHashCode to return different values when Uris only differ in case

* s_IsWindowsSystem -> s_isWindowsSystem

* Detect Dos and Unc paths on Unix

* Update tests

* Clean-up

* Clean-up II

* Clean-up III

* Clean-up IV

* Clean-up V

* Move Flags.UnixPath to the end of the Flags enum


Commit migrated from dotnet/corefx@7df6b74
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
10 participants