Crawls information from DataCamp and outputs it into a .csv file

mmustafaicer/datacampcrawler
DataCamp Crawler

Problem Statement

We have a starting webpage that lists courses related to the Python programming language, each shown as a separate block.

[screenshot: start_url]

Each link redirects the user to that specific course page. That course page contains information such as:

  • course name
  • course description
  • number of exercises
  • participants
  • time (hours)
  • url
  • videos
  • XP points
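The fields above map naturally onto one record per course. Below is a minimal sketch of normalizing the raw scraped strings into typed values; the input formats (e.g. "4 hours", "1,350,000 participants", "4,650 XP") and the field names are assumptions for illustration, not the exact markup of the DataCamp pages:

```python
import re

def to_int(text):
    """Extract the first integer from a scraped string like '1,350,000 participants'."""
    match = re.search(r"[\d,]+", text)
    return int(match.group().replace(",", "")) if match else None

def build_record(name, description, raw):
    """Assemble one course record; `raw` holds the scraped count strings (hypothetical keys)."""
    return {
        "course_name": name.strip(),
        "course_description": description.strip(),
        "number_of_exercises": to_int(raw["exercises"]),
        "participants": to_int(raw["participants"]),
        "time_hours": to_int(raw["time"]),
        "url": raw["url"],
        "videos": to_int(raw["videos"]),
        "xp_points": to_int(raw["xp"]),
    }
```

Keeping the cleanup in one place like this makes the spider's parse callback a thin layer of selectors over a well-defined record shape.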

[screenshot: example]

We need to collect this information for each course. The output should look like this in .csv format:


[screenshot: excel]

Run

Make sure you have Scrapy installed in your environment. Navigate to the desired folder and create a Scrapy project from the console:

scrapy startproject datacamp

Copy this project's files into that datacamp folder. It should end up as ..../datacamp/datacamp.../

Open a console, change directory into the inner datacamp folder, and run the following command:

scrapy crawl my_scraper -o datacamp.csv
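Once the crawl finishes, datacamp.csv can be inspected with the standard csv module. A sketch that ranks courses by XP, assuming the hypothetical column names used above (adjust to match the spider's actual item fields):

```python
import csv
import io

def top_courses_by_xp(csv_text, n=3):
    """Return the n course names with the highest XP from the exported CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: int(r["xp_points"]), reverse=True)
    return [r["course_name"] for r in rows[:n]]
```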
