Escape domain detection has a leak #339

alexnederlof opened this Issue Sep 8, 2013 · 0 comments




When Crawljax checks whether it has left the domain, it tests whether the host of its current URL occurs anywhere in the new URL.

This is done here in

However, when it visits a social platform like Google+, the new URL contains the original domain as well, as a query parameter, so the check wrongly concludes that the crawler is still on the same domain.

Crawljax should check that the hostname occurs in the host part of the current URL.
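
For illustration, here is a minimal sketch of the difference between the two checks. It is not Crawljax's actual code: the class name `DomainCheck`, both method names, and the example URLs are hypothetical, and it assumes the URLs parse with `java.net.URI`.

```java
import java.net.URI;

public class DomainCheck {

    // Leaky check as described above: the current host is searched for
    // anywhere in the new URL, including its query string.
    static boolean sameDomainByContains(String currentUrl, String newUrl) {
        String host = URI.create(currentUrl).getHost();
        return host != null && newUrl.contains(host);
    }

    // Proposed check: compare only the host components of both URLs.
    static boolean sameDomainByHost(String currentUrl, String newUrl) {
        String currentHost = URI.create(currentUrl).getHost();
        String newHost = URI.create(newUrl).getHost();
        // equalsIgnoreCase(null) is false, so a URL without a host never matches.
        return currentHost != null && currentHost.equalsIgnoreCase(newHost);
    }

    public static void main(String[] args) {
        // Hypothetical URLs: a share link that carries the original site as a parameter.
        String current = "http://example.com/index.html";
        String share = "https://plus.google.com/share?url=http://example.com/index.html";

        System.out.println(sameDomainByContains(current, share)); // true: the leak
        System.out.println(sameDomainByHost(current, share));     // false: domain was left
    }
}
```

Whether subdomains such as `www.example.com` should count as the same domain is a separate decision that this sketch does not address; it treats only an exact, case-insensitive host match as "same domain".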

alexnederlof was assigned Sep 8, 2013
alexnederlof added a commit that referenced this issue Nov 7, 2013
Crawler checks if left domain by hostname.
Previously this was done using `String.contains(x)`. However, that does not
work when the new URL is on another domain and the original domain is
passed through as a query parameter. Because the new method compares
hostnames, this can no longer happen.

Fixes #339
amesbah closed this in #357 Nov 8, 2013