We start from a web page that lists courses on the Python programming language, each shown as a separate block.
Each block links to that course's page, which contains information such as:
- course name
- course description
- number of exercises
- participants
- time in hours
- url
- videos
- xp points
We need to collect this information for each course. The output should be a .csv file
with one row per course and one column per field.
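As a rough sketch of the intended layout, the snippet below writes a header and one placeholder row using Python's standard csv module. The snake_case column names are assumptions derived from the field list above, and the row values are placeholders, not real course data.

```python
import csv
import io

# Column names are assumed snake_case versions of the fields listed above.
FIELDS = [
    "course_name",
    "course_description",
    "number_of_exercises",
    "participants",
    "time_hours",
    "url",
    "videos",
    "xp_points",
]


def rows_to_csv(rows):
    """Serialize a list of course dicts to CSV text, one row per course."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder values only; each cell just echoes its column name.
example_row = {name: f"<{name}>" for name in FIELDS}
csv_text = rows_to_csv([example_row])
print(csv_text)
```

When the project is run with Scrapy, the CSV file is produced by Scrapy's own feed export, so this helper is only an illustration of the column layout.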
Make sure Scrapy is installed in your environment.
Navigate to the desired folder and create a Scrapy project from the console:
scrapy startproject datacamp
Copy this project's files into that datacamp folder. The resulting path should be ..../datacamp/datacamp.../
Open a console, change directory into that datacamp folder, and run the following command:
scrapy crawl my_scraper -o datacamp.csv