LinkExtractor chokes when only one link is bogus #907
Closed
Comments
From the user's perspective we should definitely skip bogus links, so +1 to this feature. I think it is better to catch specific errors one by one, i.e. to fix this issue, wrap only `urljoin` in try-except, not the whole loop body.
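A rough sketch of that suggestion, for illustration only. This is not Scrapy's actual code: `join_links_skipping_bogus` is a hypothetical helper, it uses the standard library `urljoin`, and it assumes the failure on a bogus link surfaces as a `ValueError` (which is what `urllib.parse` raises for a malformed IPv6 host, for example). Only the `urljoin` call is guarded, so unrelated bugs in the loop body still surface.

```python
from urllib.parse import urljoin


def join_links_skipping_bogus(base_url, hrefs):
    """Resolve each href against base_url, skipping the ones urljoin rejects.

    Hypothetical helper, not part of Scrapy.
    """
    links = []
    for href in hrefs:
        try:
            # Guard only the call that is expected to fail on bogus input.
            url = urljoin(base_url, href)
        except ValueError:
            # Bogus link: skip it and keep the rest.
            continue
        links.append(url)
    return links


hrefs = ["b.html", "http://[::1", "/c"]  # the middle one has a malformed host
print(join_links_skipping_bogus("http://example.com/a/", hrefs))
```

With this shape, one bad href costs only that link and the remaining links are still returned.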
dangra added a commit that referenced this issue on Aug 16, 2015: [MRG+1] [LinkExtractors] Ignore bogus links (#907)
Fixed in #1352.
`_extract_links()` either returns all extracted links (can be an empty list) or fails. It would be nice to wrap it in a `try/except`, return what could be extracted, and skip bogus links. Example session: