An iPhone app that collects every image URL on a web page, recursively does the same for the pages it links to, and displays the images in a collection view

yuliang/ImageCrawler

ImageCrawler

This is a PoC or programming exercise I did one weekend. It starts with a URL you specify, then parses the HTML to gather img and href links. The img links go into a mutable set for later display. The href links go into another mutable set, which ensures the same URL is not visited twice. It then repeats the process for each unvisited href link, in either depth-first or breadth-first order.
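The crawl described above can be sketched roughly as follows. This is an illustrative Python version, not the app's actual code: the names `crawl`, `fetch`, `max_depth`, `max_pages`, and the in-memory `PAGES` dictionary are all assumptions made for the example, and a regular expression stands in for whatever HTML parsing the app really uses.

```python
import re
from collections import deque
from urllib.parse import urljoin

# Attribute values may be wrapped in double or single quotes.
IMG_RE = re.compile(r'<img\b[^>]*?\bsrc\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)
LINK_RE = re.compile(r'<a\b[^>]*?\bhref\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def crawl(start_url, fetch, max_depth=2, max_pages=50):
    """Breadth-first crawl. `fetch` (hypothetical) maps a URL to HTML or None."""
    images = set()              # image URLs collected for later display
    visited = {start_url}       # page URLs already queued, so none is visited twice
    queue = deque([(start_url, 0)])
    pages_crawled = 0
    while queue and pages_crawled < max_pages:
        url, depth = queue.popleft()   # queue.pop() here would give depth-first order
        html = fetch(url)
        if html is None:
            continue
        pages_crawled += 1
        for src in IMG_RE.findall(html):
            images.add(urljoin(url, src))
        if depth >= max_depth:
            continue                   # level limit reached: do not follow links
        for href in LINK_RE.findall(html):
            link = urljoin(url, href)
            if link not in visited:
                visited.add(link)
                queue.append((link, depth + 1))
    return images


# Hypothetical in-memory "site" so the sketch runs without network access.
PAGES = {
    "http://example.com/": '<img src="a.png"><a href="/next">next</a>',
    "http://example.com/next": "<img src='b.png'><a href='/'>home</a><a href='/deep'>deep</a>",
    "http://example.com/deep": '<img src="c.png">',
}

found = crawl("http://example.com/", PAGES.get, max_depth=1)
```

With `max_depth=1` the page at `/deep` is never fetched, so only the first two images are collected; raising the limit to 2 picks up the third.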

It deals with cases such as:

  • Relative paths
  • URLs with or without a trailing '/'
  • Double or single quotes around attribute values in tags
  • Thread-safe access to the sets and queues
  • Limits, such as the total number of pages to crawl (depth first) or the number of levels to descend (breadth first); without them, the crawl would run indefinitely for most URLs
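Two of the cases listed above, URL normalization and thread-safe bookkeeping, might look something like this. It is a Python sketch under assumptions: `normalize` and `VisitedSet` are hypothetical names, and treating URLs that differ only by a trailing '/' as the same page is a simplification chosen for deduplication rather than something every server guarantees.

```python
import threading
from urllib.parse import urljoin


def normalize(base_url, link):
    """Resolve a possibly relative link and drop any trailing '/'."""
    return urljoin(base_url, link).rstrip("/")


class VisitedSet:
    """A set guarded by a lock so several worker threads can share it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._urls = set()

    def add_if_new(self, url):
        """Atomically add `url`; return True only for the first caller."""
        with self._lock:
            if url in self._urls:
                return False
            self._urls.add(url)
            return True

    def __len__(self):
        with self._lock:
            return len(self._urls)


# Eight threads race to claim the same 100 URLs; each URL is claimed once.
visited = VisitedSet()
claimed = []  # list.append is safe to call from several threads in CPython


def worker():
    for i in range(100):
        url = normalize("http://example.com/", f"page{i}/")
        if visited.add_if_new(url):
            claimed.append(url)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Combining the membership check and the insertion under one lock is the important part: checking first and adding later in a separate step would let two threads both decide that the same URL is new.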

screenshot 1

screenshot 2
