Commit
add parser_cls argument, changes default html parser to html.HTMLParser
This changes the default HTML parser to html.HTMLParser and adds a parser_cls parameter to Selector for specifying a different parser class. The parameter lets users who care a great deal about performance plug in a custom parser. Scrapy is affected because it uses the default here, but the change does not appear to have a measurable impact on performance, according to @kmike's benchmark: https://gist.github.com/kmike/af647777cef39c3d01071905d176c006
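The pattern described above, a constructor that takes a parser class and falls back to a default, can be sketched roughly as follows. This is a toy illustration only, not the code in this commit: MiniSelector, TagCollector and LinkOnlyCollector are hypothetical names, and the standard library's html.parser.HTMLParser is swapped in for the real parser so the sketch runs without extra dependencies. Only the parser_cls keyword name is taken from the commit title.

```python
from html.parser import HTMLParser  # stdlib parser, used here only as a stand-in


class TagCollector(HTMLParser):
    """Hypothetical default parser: records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


class LinkOnlyCollector(TagCollector):
    """Hypothetical custom parser: records only <a> start tags."""

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.tags.append(tag)


class MiniSelector:
    """Toy stand-in for a selector (an assumption, not the library's Selector)."""

    def __init__(self, text, parser_cls=TagCollector):
        # Instantiate whichever parser class the caller supplied,
        # falling back to the default when none is given.
        parser = parser_cls()
        parser.feed(text)
        parser.close()
        self.tags = parser.tags


doc = "<html><body><p>Hi</p><a href='/x'>x</a></body></html>"
default_tags = MiniSelector(doc).tags
custom_tags = MiniSelector(doc, parser_cls=LinkOnlyCollector).tags
```

Callers that never pass parser_cls get the default behaviour, which is why a downstream project relying on the default is affected when that default changes.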
1 parent 1bba625, commit 13eb040
Showing 2 changed files with 14 additions and 4 deletions.