Fetches quotes from quotes.toscrape.com
cd tutorial
rm -f quotes.jl; scrapy crawl quotes -o quotes.jl  # -o appends to .jl files, so clear the old output first
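
For reference, a minimal sketch of such a spider (the CSS selectors match quotes.toscrape.com's markup; the actual spider in tutorial/spiders/ may differ in detail):

    import scrapy

    class QuotesSpider(scrapy.Spider):
        name = "quotes"
        start_urls = ["https://quotes.toscrape.com"]

        def parse(self, response):
            # each quote lives in its own div.quote block
            for quote in response.css("div.quote"):
                yield {
                    "text": quote.css("span.text::text").get(),
                    "author": quote.css("small.author::text").get(),
                    "tags": quote.css("div.tags a.tag::text").getall(),
                }
            # keep following the pagination link until there are no more pages
            next_page = response.css("li.next a::attr(href)").get()
            if next_page is not None:
                yield response.follow(next_page, callback=self.parse)
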
Fetches each quote author's bio from quotes.toscrape.com
cd tutorial
rm -f authors.jl; scrapy crawl author -o authors.jl
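
A sketch of the two-step pattern this spider relies on: follow the "(about)" link next to each author, then parse the bio page (this mirrors the author spider from Scrapy's own tutorial; the repo's version may differ):

    import scrapy

    class AuthorSpider(scrapy.Spider):
        name = "author"
        start_urls = ["https://quotes.toscrape.com"]

        def parse(self, response):
            # follow the "(about)" link next to each author name
            yield from response.follow_all(css=".author + a", callback=self.parse_author)
            # keep paginating through the quote listing
            yield from response.follow_all(css="li.next a", callback=self.parse)

        def parse_author(self, response):
            def extract(query):
                return response.css(query).get(default="").strip()

            yield {
                "name": extract("h3.author-title::text"),
                "birthdate": extract(".author-born-date::text"),
                "bio": extract(".author-description::text"),
            }
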
You need to enter your GitHub username and password in spiders/github_emails.py. It outputs the email addresses listed in your GitHub account.
cd tutorial
rm -f emails.jl; scrapy crawl emails -o emails.jl
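
The login step typically follows Scrapy's FormRequest.from_response pattern. A rough sketch, assuming GitHub's classic login form fields (login, password); the email selector is a guess about the settings page's markup, and accounts with two-factor auth enabled will need extra handling:

    import scrapy

    class EmailsSpider(scrapy.Spider):
        name = "emails"
        start_urls = ["https://github.com/login"]
        username = "YOUR_USERNAME"  # fill these in, as noted above
        password = "YOUR_PASSWORD"

        def parse(self, response):
            # from_response carries over the form's hidden CSRF token
            yield scrapy.FormRequest.from_response(
                response,
                formdata={"login": self.username, "password": self.password},
                callback=self.after_login,
            )

        def after_login(self, response):
            yield response.follow("/settings/emails", callback=self.parse_emails)

        def parse_emails(self, response):
            # selector is an assumption about the settings page's markup
            for email in response.css("span.css-truncate-target::text").getall():
                yield {"email": email.strip()}
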
Finds all JS files referenced by &lt;script&gt; tags at a given URL and downloads them
cd tutorial
scrapy crawl jstrackscrap
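
A sketch of the approach: collect every src attribute from the page's script tags, follow each one, and write the response body to disk (the start URL and filename handling are illustrative):

    import scrapy

    class JsTrackSpider(scrapy.Spider):
        name = "jstrackscrap"
        start_urls = ["https://quotes.toscrape.com"]  # illustrative target

        def parse(self, response):
            # every external script referenced by a <script src=...> tag
            for src in response.css("script::attr(src)").getall():
                yield response.follow(src, callback=self.save_js)

        def save_js(self, response):
            # save each JS file under its basename in the working directory
            filename = response.url.split("/")[-1].split("?")[0] or "script.js"
            with open(filename, "wb") as f:
                f.write(response.body)
            self.log(f"saved {filename}")
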
The github-repos spider logs in to GitHub and fetches all of the user's repo links. It then has to visit each repo's settings page and fetch its name. The twist: with some random probability we break the session (remove cookies) before accessing a repo's settings, simulating the website dropping our session for some reason. The crawler should detect that the session is broken and re-establish it.
Note: Insert your GitHub username and password at line 46 of spiders/github-repos.py
clear; rm -f repo_description.jl; scrapy crawl github-repos -o repo_description.jl
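
The recovery logic can be sketched as follows: send some requests from a fresh, cookie-less cookiejar to simulate the broken session, detect the redirect back to the login page, log in again inside that jar, and retry the page we were bounced from. The selectors, form field names, and the 0.3 probability below are assumptions; the real logic lives in spiders/github-repos.py:

    import random
    import scrapy

    class GithubReposSpider(scrapy.Spider):
        name = "github-repos"
        start_urls = ["https://github.com/login"]
        username = "YOUR_USERNAME"  # see the note above
        password = "YOUR_PASSWORD"

        def parse(self, response):
            yield scrapy.FormRequest.from_response(
                response,
                formdata={"login": self.username, "password": self.password},
                callback=self.after_login,
            )

        def after_login(self, response):
            # illustrative selector for the repo links on the profile page
            for href in response.css("a[itemprop='name codeRepository']::attr(href)").getall():
                url = response.urljoin(href + "/settings")
                if random.random() < 0.3:
                    # "break" the session: issue this request from an empty
                    # cookiejar so GitHub bounces us to the login page
                    yield scrapy.Request(url, callback=self.parse_settings,
                                         meta={"cookiejar": "broken"},
                                         dont_filter=True)
                else:
                    yield response.follow(url, callback=self.parse_settings)

        def parse_settings(self, response):
            if "/login" in response.url:
                # we were redirected to the login page: the session is broken.
                # Log in again inside the same cookiejar, then retry the
                # settings page we originally asked for.
                original = response.meta.get("redirect_urls", [response.url])[0]
                yield scrapy.FormRequest.from_response(
                    response,
                    formdata={"login": self.username, "password": self.password},
                    callback=self.retry_settings,
                    meta={"cookiejar": response.meta.get("cookiejar"),
                          "retry_url": original},
                )
            else:
                # illustrative selector for the repo name field
                yield {"name": response.css("input#rename-field::attr(value)").get()}

        def retry_settings(self, response):
            yield scrapy.Request(response.meta["retry_url"],
                                 callback=self.parse_settings,
                                 meta={"cookiejar": response.meta.get("cookiejar")},
                                 dont_filter=True)

Note that dont_filter=True matters on the retried request: without it, Scrapy's duplicate filter would silently drop the second attempt at a URL it has already seen.
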