forked from adbar/trafilatura
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
introduce `MAX_REDIRECTS` config setting and fix urllib3 redirect handling

Fixes issue adbar#450.

After setting `MAX_REDIRECTS` to 5, I could fetch the original URL from the issue:

`trafilatura -u https://www.hydrogeninsight.com/production/breaking-us-reveals-the-seven-regional-hydrogen-hubs-to-receive-7bn-of-government-funding/2-1-1534596`

I also fixed this old issue: adbar#128.

The underlying urllib3 bug has not been fixed: urllib3/urllib3#2475. I had to pass the retry strategy to the actual request method because it doesn't propagate from the pool manager.
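The workaround described in the message can be sketched roughly as follows. This is a minimal illustration and not the code from this commit: the config fragment, the helper names `load_max_redirects`, `build_retry`, and `fetch`, and the choice to set `total` equal to `redirect` are all assumptions made for the example. Only `urllib3.util.Retry`, `urllib3.PoolManager`, and the `retries` argument of `request` are actual urllib3 API.

```python
from configparser import ConfigParser

import urllib3
from urllib3.util import Retry

# Hypothetical config fragment; the section name and layout are assumptions.
CONFIG_TEXT = """
[DEFAULT]
MAX_REDIRECTS = 5
"""


def load_max_redirects(text: str = CONFIG_TEXT) -> int:
    """Read the redirect limit from an INI-style config string."""
    parser = ConfigParser()
    parser.read_string(text)
    return parser.getint("DEFAULT", "MAX_REDIRECTS")


def build_retry(max_redirects: int) -> Retry:
    """Build a retry strategy that allows up to max_redirects redirects."""
    # `total` takes precedence over the other counters, so it must be
    # at least as large as `redirect` for the redirect limit to apply.
    return Retry(total=max_redirects, redirect=max_redirects)


def fetch(url, pool, retry):
    """Send a GET request, passing the retry strategy on the request itself
    instead of relying on the default configured on the pool manager."""
    return pool.request("GET", url, retries=retry)


if __name__ == "__main__":
    retry = build_retry(load_max_redirects())
    pool = urllib3.PoolManager(retries=retry)
    response = fetch("https://example.org", pool, retry)
    print(response.status)
```

Passing `retries=` per request keeps the redirect limit explicit at the call site, which is the behaviour the message says could not be relied on when the strategy was set only on the pool manager.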
1 parent d31c8d7, commit dfc03f6
Showing 4 changed files with 29 additions and 8 deletions.