brunocascio/scrapy-slideshare

Scrapy Slideshare

A script for scraping SlideShare presentations, built on the Python framework Scrapy.

How to use

  • Clone this repo:

    git clone https://github.com/brunocascio/scrapy-slideshare && cd scrapy-slideshare

  • Install dependencies:

    chmod +x install.sh && sudo ./install.sh

    NOTE: The install script works only on Debian-like systems and has been tested only on Ubuntu 14.04. Please consider contributing install instructions for other platforms. For more info, visit the download page.

  • Run the Spider :D

    scrapy crawl SlideShareSpider -a url=<URL_OF_SLIDESHARE>

  • The images are downloaded to the images/<nameOfSlideShare> folder in the repository root.
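
To scrape several presentations in one go, the crawl command above can be wrapped in a small script. The following is only a sketch and not part of this repo: the helper names build_crawl_command and crawl_all are hypothetical, and it assumes that scrapy is on the PATH and that the script is run from the cloned repository (or given its path as project_dir). It treats a non-zero exit code as a failure; whether Scrapy reports every scraping problem that way is not verified here.

```python
import subprocess


def build_crawl_command(url):
    """Return the argument list for one crawl, mirroring the command above."""
    return ["scrapy", "crawl", "SlideShareSpider", "-a", f"url={url}"]


def crawl_all(urls, project_dir="."):
    """Run the spider once per URL, one after another.

    Returns the list of URLs whose crawl exited with a non-zero status.
    project_dir is assumed to be the cloned repository, so that Scrapy
    can locate the project settings.
    """
    failed = []
    for url in urls:
        result = subprocess.run(build_crawl_command(url), cwd=project_dir)
        if result.returncode != 0:
            failed.append(url)
    return failed
```

Calling crawl_all with a list of SlideShare URLs runs the same command as the single-URL step for each entry and returns the ones that did not finish cleanly, so they can be retried.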
