During my co-op semester at Gravit-e Technology, a coworker and I discussed building a website/tool to compare prices across multiple buying and selling platforms. He pointed me to a very cool Golang library called Colly, which supports multithreaded scraping. Compared to a traditional scraper, it essentially has a built-in spider and is optimized for multithreading. This is the start of the project: so far I've successfully parsed the Craigslist automotive section into CSV format.
- How to avoid getting banned from Craigslist using delays and proxies (both are built-in library features)
- Selecting items
- Multithreading
- What Golang was
- Error handling