If a URL points to a file, don't try to fetch markup #1

hmol · 2016-02-05T09:21:21Z

When crawling http://www.the-website-to-crawl.com, if the application reaches a URL that points to a file, e.g. http://www.the-website-to-crawl.com/reports/report.pdf, it throws an exception. This is because it then tries to fetch HTML markup from the PDF file. So when the crawler reaches a URL for a file, it should not fetch markup; it should just continue.
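To illustrate the requested behaviour, here is a minimal sketch in Python. It is not the project's actual code. The function names, the extension list, and the idea of preferring the `Content-Type` header when it is available are all assumptions made for illustration.

```python
from posixpath import splitext
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical set of extensions treated as non-HTML files; extend as needed.
NON_MARKUP_EXTENSIONS = {
    ".pdf", ".zip", ".jpg", ".jpeg", ".png", ".gif",
    ".doc", ".docx", ".xls", ".xlsx",
}


def url_points_to_file(url: str) -> bool:
    """Guess from the URL path alone whether the target is a non-HTML file."""
    path = urlsplit(url).path  # the query string and fragment are excluded
    _, extension = splitext(path)
    return extension.lower() in NON_MARKUP_EXTENSIONS


def is_markup_response(content_type: Optional[str]) -> bool:
    """Return True only when a Content-Type header value indicates HTML."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in {"text/html", "application/xhtml+xml"}


def should_fetch_markup(url: str, content_type: Optional[str] = None) -> bool:
    """Decide whether the crawler should parse the response body for links.

    If a Content-Type value is known, trust it; otherwise fall back to
    the file extension in the URL path.
    """
    if content_type is not None:
        return is_markup_response(content_type)
    return not url_points_to_file(url)
```

With this check in place, the crawler could still record the status code for `report.pdf` but skip parsing its body, so the crawl continues instead of throwing.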


hmol · 2016-02-09T20:47:48Z

Fixed

hmol closed this as completed Feb 9, 2016

hmol pushed a commit that referenced this issue Feb 19, 2017

Merge pull request #1 from hmol/develop

1942d54

Merging changes from hmol/Linkcrawler

robsiera pushed a commit to robsiera/LinkCrawler that referenced this issue Oct 30, 2020

Merge pull request hmol#1 from ed-graham/feature/show-new-links-for-3…

5ca2f87

…xx-responses Print Location header for 3xx and also any exception messages
