Improvement suggestion with multiple domains in one single URL. #30
Comments
This is an interesting feature request. I think it shouldn't be on by default; the default behavior of not showing overlapping results is, I think, the most common need, and, like you say, the fastest. We can't simply try all possible substrings of a found match. For example, that would mean that [...] Perhaps we can limit this feature to at most one nested match within a URL's "path" element (including [...] Finally, I'm not sure how to expose this in the API or in the command-line tool. Any ideas?
I fully agree this should not be the default. Right now, by default, the requested behaviour is obtained only when a space (or similar) acts as a "delimiter".
I also fully agree that this requirement is not very common. It was just an idea, as I came across this specific case. ;) Thanks
This has nothing to do with separators, though. Your suggestion above would mean we'd support overlapping matches, which we don't now. If anything, adding [...]
Yes, the suggestion is an optional overlapping match. Now:
In this example, domainB.com is identified only when there is no "overlap". Suggestion about
then "reassemble", removing the extra space (to have the first full path with dir)
What you're suggesting there would be a recursive search; at best, that would be quadratic complexity, and quite a lot of added code for a niche use case. I've decided to not implement this for now. It should be fairly easy to implement this outside xurls. If you find that the API is nice and the code is generally useful, we can look at adding that later, but for now I don't think it's worth the effort.
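For anyone who wants to try this outside xurls, here is a minimal sketch of the "at most one nested match within the path" idea mentioned earlier in the thread. It is only an illustration under stated assumptions: `urlLike` is a deliberately simplified stand-in pattern (not the one xurls uses), `findNested` is a hypothetical helper name, and the input string is made up.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlLike is a simplified stand-in for a relaxed URL pattern.
// Assumption: it is NOT the pattern xurls uses; it only knows .com and .org.
var urlLike = regexp.MustCompile(`(?:https?://)?[a-z0-9.-]+\.(?:com|org)(?:/\S*)?`)

// findNested (hypothetical helper) returns the usual non-overlapping matches
// and, after each one, at most one extra match found inside that match's path.
// Capping the search at one nested level avoids a recursive search.
func findNested(re *regexp.Regexp, text string) []string {
	var out []string
	for _, m := range re.FindAllString(text, -1) {
		out = append(out, m)

		// Drop the scheme, if any, so the first "/" marks the start of the path.
		rest := m
		if i := strings.Index(rest, "://"); i >= 0 {
			rest = rest[i+3:]
		}
		slash := strings.IndexByte(rest, '/')
		if slash < 0 {
			continue // no path, so nothing to search inside
		}

		// Search the path once; keep only the first nested match.
		if inner := re.FindString(rest[slash+1:]); inner != "" {
			out = append(out, inner)
		}
	}
	return out
}

func main() {
	fmt.Printf("%q\n", findNested(urlLike, "http://example.com/dir/example.org/page"))
	// prints ["http://example.com/dir/example.org/page" "example.org/page"]
}
```

Since the helper takes any `*regexp.Regexp`, a real matcher could be passed in place of `urlLike`, assuming the matcher you use is exposed as a `*regexp.Regexp`. Each match's path is scanned only once, so the extra work stays proportional to the input length.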
That's OK. Thanks
Hi,
Thanks for providing us with xurls.
I came across the following case:
I wonder if there is an easy (and still fast) way for xurls to identify that there are 2 "URLs" inside?
So this could possibly report something like:
Possibly by adding an additional option to support it on demand only.
If there is a space in the string, both are found (as expected).
This is only a suggestion. If this impacts performance badly, it is probably better not to implement it.
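To make the difference between the two situations concrete, here is a small sketch using Go's standard `regexp` package. The pattern `urlLike`, the helper `matches`, and both input strings are assumptions made up for illustration; they are not xurls' own pattern or the strings from the original report. `FindAllString` returns non-overlapping matches, which is why the second domain only shows up once a space separates it from the first URL.

```go
package main

import (
	"fmt"
	"regexp"
)

// urlLike is a simplified stand-in for a relaxed URL pattern.
// Assumption: it is NOT the pattern xurls uses; it only knows .com and .org.
var urlLike = regexp.MustCompile(`(?:https?://)?[a-z0-9.-]+\.(?:com|org)(?:/\S*)?`)

// matches (hypothetical helper) returns all non-overlapping matches in text.
func matches(text string) []string {
	return urlLike.FindAllString(text, -1)
}

func main() {
	// No space: the second domain is swallowed by the first URL's path.
	fmt.Printf("%q\n", matches("http://example.com/dir/example.org/page"))
	// prints ["http://example.com/dir/example.org/page"]

	// With a space: the first match ends at the space, so both are found.
	fmt.Printf("%q\n", matches("http://example.com/dir/ example.org/page"))
	// prints ["http://example.com/dir/" "example.org/page"]
}
```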