Fixed default value of attrs argument in SgmlLinkExtractor to be tuple #661
Hey @ananana, Thanks for this PR! Could you please add some tests for it? I like how you used |
I made the following changes:
Still left to do:
Added tests for |
plenty good testcases to support this change, for me. Thanks very much. +1 to merge this (on the discussion about SGMLParser not parsing |
self.deny_extensions = set(['.' + e for e in deny_extensions])
tag_func = lambda x: x in tags
attr_func = lambda x: x in attrs
self.deny_extensions = set(['.' + e for e in arg_to_iter(deny_extensions)])
dangra
Mar 26, 2014
Member
no need to create a set from a list comprehension, I know it was there but it is a good moment to replace it by
set('.' + e for e in arg_to_iter(deny_extensions))
kmike
Mar 26, 2014
Member
as Python 2.6 is no longer supported: {'.' + e for e in arg_to_iter(deny_extensions)}
dangra
Mar 26, 2014
Member
+1
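The three spellings discussed in this thread build the same set; a minimal sketch follows. The `arg_to_iter` defined here is a simplified stand-in written for illustration (an assumption, not Scrapy's actual implementation), and the sample extension list is made up.

```python
def arg_to_iter(arg):
    """Simplified stand-in (an assumption) for the helper used in the PR:
    wrap None, strings and other non-iterables so the result can be iterated."""
    if arg is None:
        return []
    if isinstance(arg, (str, bytes, dict)) or not hasattr(arg, "__iter__"):
        return [arg]
    return arg


deny_extensions = ["pdf", "zip", "pdf"]  # illustrative input with a duplicate

# Original form: builds a throwaway list first, then a set from it.
from_list = set(['.' + e for e in arg_to_iter(deny_extensions)])

# Generator expression: no intermediate list.
from_genexp = set('.' + e for e in arg_to_iter(deny_extensions))

# Set comprehension: available from Python 2.7 onward.
from_setcomp = {'.' + e for e in arg_to_iter(deny_extensions)}

# A bare string is treated as one extension, not as a sequence of characters.
single = {'.' + e for e in arg_to_iter("pdf")}
```

All three forms produce equal sets; the set comprehension is simply the shortest once Python 2.6 compatibility is not required.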
LGTM. thanks!
attr_func = lambda x: x in attrs
self.deny_extensions = {'.' + e for e in arg_to_iter(deny_extensions)}
tag_func = lambda x: x in arg_to_iter(tags)
attr_func = lambda x: x in arg_to_iter(attrs)
kmike
Mar 26, 2014
Member
I suspect that if you remove arg_to_iter and make attrs=('href') again then new tests will still pass because 'href' in 'href' is True. We may e.g. add an element with "ref" attribute to the testing suite to check that condition, but I'm fine with merging without this fix.
That's true, I just added a new test for that (with a "ref" attribute) for the sake of thoroughness.
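The pitfall behind this exchange can be reproduced in a few lines. This is a standalone sketch rather than the PR's test code, and the variable names are chosen here for illustration: `('href')` is just the string `'href'`, so `in` performs a substring check, and only an attribute such as `ref` exposes the difference.

```python
wrong_attrs = ('href')    # parentheses only: this is the string 'href'
right_attrs = ('href',)   # trailing comma: a one-element tuple

attr_func_wrong = lambda x: x in wrong_attrs
attr_func_right = lambda x: x in right_attrs

# 'href' matches in both cases, so a test using only "href" cannot tell them apart.
href_wrong = attr_func_wrong('href')   # True (substring of itself)
href_right = attr_func_right('href')   # True (tuple membership)

# 'ref' is a substring of 'href', so the string version wrongly accepts it.
ref_wrong = attr_func_wrong('ref')     # True
ref_right = attr_func_right('ref')     # False
```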
As a last request before merging, can we get these changes squashed into a single commit?
…d list parameters (attrs, tags, deny_extensions)
Rebased and squashed.
Fixed default value of attrs argument in SgmlLinkExtractor to be tuple
attrs is now tuple)
tags and attrs in arg_to_iter(), so tag_func and attr_func would accept both strings and lists. (so this version of the constructor is compatible with the old one)