Documentation speaks of re.match while meaning re.search #17

Closed
shaneaevans opened this issue Sep 9, 2011 · 1 comment

Comments

@shaneaevans
Member

Reported by Vasily Alexeev on Trac http://dev.scrapy.org/ticket/328

In the link extractor reference we see passages like:

"allow (str or list) – a single regular expression (or list of regular expressions) that the (absolute) urls must match in order to be extracted. If not given (or empty), it will match all links."

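There are two quite different methods for working with regexps: matching (re.match, which only succeeds at the start of the string) and searching (re.search, which succeeds anywhere in the string). A quick look at the source reveals that in this case we are dealing with searching, not matching: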

_matches = lambda url, regexs: any((r.search(url) for r in regexs))

So the documentation is misleading and should be corrected.
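
To make the difference concrete, here is a minimal sketch using only the standard `re` module. The pattern and URL are made up for illustration and are not taken from the Scrapy docs; the `_matches` helper is the one quoted above.

```python
import re

# Hypothetical pattern and URL, chosen only for illustration.
pattern = re.compile(r"item\.php")
url = "http://www.example.com/shop/item.php?id=1"

# re.match semantics: succeeds only if the pattern matches at the
# very start of the string.
matched = pattern.match(url) is not None

# re.search semantics: succeeds if the pattern matches anywhere
# in the string.
searched = pattern.search(url) is not None

# The helper quoted above from the link extractor source.
_matches = lambda url, regexs: any((r.search(url) for r in regexs))

# With search semantics the URL is accepted even though the pattern
# does not match at the start of the URL.
extracted = _matches(url, [pattern])

print(matched, searched, extracted)
```

A reader who takes "must match" to mean re.match would expect this URL to be rejected unless the pattern were written as something like `.*item\.php`, while with the helper as quoted it is accepted.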

@pablohoffman
Member

This ticket has been open for too long without any specific suggestion for improvement. I personally don't think that "match" is misleading there, as it's a well-known term in the context of regular expressions, and nowhere does the text refer to the re.match function itself.
