Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FlxStringUtil.getDomain() fixes + improvements #1993

Merged

Conversation

seraku24
Copy link
Contributor

@seraku24 seraku24 commented Nov 5, 2016

I had encountered an issue with site-locking unexpectedly failing and traced the root cause to the helper function FlxStringUtil.getDomain() not accepting the URL that was passed in.

This pull request firstly introduces unit testing that was originally missing for the function in question and secondly replaces the existing logic with a singular regular expression that handles valid inputs. Additionally, the function now supports case-insensitivity by forcing the extracted domain to lower case.

Replaced logic to properly support extracting the domain from valid URLs.
Added unit tests to validate behavior against both non-local and local paths.
JavaScript regex does not support look-behind.  The updated expression should be compatible against all targets.
@Gama11 Gama11 added the Bug label Nov 6, 2016
@Gama11 Gama11 changed the title Fix parsing of domains from URLs. FlxStringUtil.getDomain() fixes + improvements Nov 6, 2016
@Gama11 Gama11 merged commit 57dc460 into HaxeFlixel:dev Nov 6, 2016
@@ -262,14 +262,8 @@ class FlxStringUtil
*/
public static function getDomain(url:String):String
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would probably be nice if the docs mentioned the special meaning of "local" as a return value, but I couldn't really find a good way to phrase that.

  • Returns the domain of a URL, or "local" if it's not a valid URL

Not technically correct I guess, local file paths are still considered URLs.

  • Returns the domain of a URL, or "local" if it's a local file path

Not really correct either, it'll return "local" whenever the match fails, so it could be a completely invalid URL.

Any ideas? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function should probably not be returning "local" at all. That's more of a detail left for the FlxBasePreloader.isHostUrlAllowed() function. I've got an idea to use an Enum for this, but need to refactor the preloader logic to support it. Let me put together a PR with what I think makes more sense and we can pick up the discussion there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh, I'm inclined to leave that alone for now. Changing it would be a breaking change - we can't guruantee that only FlxBasePreloader relies on it, other people might use it as well.

Copy link
Contributor Author

@seraku24 seraku24 Nov 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep the String return type but just return "" in the event there is no domain? That makes more sense from a design/intention perspective. It would be breaking if there are people that use getDomain and expect "local" as the failure return, but it would be an easier fix than using something like Option<String>.

I also noticed that my change does break one aspect of the sitelock system. The existing logic was able to parse http://localhost:3000/ and return "localhost" even though that is not a two-level domain. However, this scenario only occurs for the local case, which does not really depend on the domain being anything in particular. It is the failure to grab a domain that signifies the local case, even if the URL was not actually a local path. isHostUrlAllowed certainly needs fixing to ensure that it handles the remote vs. local cases properly, regardless of what getDomain returns. I'll at least get that PR going, since it's almost-but-not-quite-broken right now.

Copy link
Contributor Author

@seraku24 seraku24 Nov 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, and I just realized that there is a problem with IPv4 addresses for hosts. http://127.0.0.1:3000/ would return "0.1" which is incorrect. Although, that would have been a problem for the previous version of the function as well. And of course, IPv6 breaks the parsing too.

At the end of the day, how much do we care about accuracy here? There is a very simple and clear fix for this, but let me know if you think it is worth doing:

  1. Define getHost() which returns the entire host section of the URL. By having a new function here, we can define the behavior however we want without breaking anyone.
  2. Update getDomain() to use getHost() and further break down the host into just the two-level domain, handling IP addresses and localhost as needed.

Side note... We should testfile:///D:/folder/file.extension as a local path, and make sure it does not get parsed as a remote host, just in case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering even the old function was good enough for years / nobody reported any issues with it thus far, I don't think there's a need to support every edge case. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point. I always worry that I end up overthinking and over-engineering things. And often it is the case of low-hanging fruit where I might think, "It would only take a few minutes to do it right, so better to get it done now and have it done right." As things appear to be stable enough for the time-being, we can always revisit this later as needed.

Many thanks for your feedback. (:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the software engineer in me just couldn't leave well enough alone. If you have the time, please take a look at the commit in my fork: seraku24/flixel@7937ffd. There's absolutely no rush on this, but I figured I would get the work done while it was fresh in my head. Do let me know if you think I should go ahead and submit the PR now.

@seraku24 seraku24 deleted the seraku24-fix-sitelock-domain-parsing branch November 6, 2016 19:00
Aurel300 pushed a commit to larsiusprime/haxeflixel that referenced this pull request Apr 18, 2018
Replaced logic to properly support extracting the domain from valid URLs.
Added unit tests to validate behavior against both non-local and local paths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants