
Using Python and Scrapy, this repository scrapes book data for uses such as a book recommendation system with pre-built assets.


DeStRoYeR-droid/Scrapy-GoodReads


Scrapy-GoodReads


Description 📝

This repository is my first attempt at web scraping with Scrapy. It crawls GoodReads and yields the details of the books listed in the start_urls of the /Learning/Spiders file.

For each book, the crawler retrieves the image URL, the title, and the description.


To run the code 👨🏽‍💻

pip install -r requirements.txt

Change directory to Learning/spider


scrapy crawl GoodReads -o BooksData.json
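(stores the output in the BooksData.json file; note that -o appends to the file if it already exists, so running the command twice leaves the file holding two JSON arrays back to back, which is not valid JSON)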

scrapy crawl GoodReads
(runs the crawler normally and displays the output in the terminal)
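
Because the -o option appends, a BooksData.json file written by several runs may hold several JSON arrays one after another. The following is a minimal sketch, not part of this repository, of one way to read such a file back with the standard library. The function name load_appended_json and the "title" key used in the usage note are assumptions made for illustration.

```python
import json


def load_appended_json(text):
    """Parse text that holds one or more JSON documents written back to back.

    Each top-level array is flattened into a single list of records.
    """
    decoder = json.JSONDecoder()
    items = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it here.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            items.extend(value)
        else:
            items.append(value)
    return items
```

For example, reading the file with open("BooksData.json", encoding="utf-8") and passing its contents to load_appended_json should return one flat list of book records, whose "title" values (if the spider uses that key) can then be inspected. Deleting the file before each run avoids the problem entirely.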


Future prospects

As of now, the links must be entered manually in the scraper.py file. I would like to replace this with a command-line argument.
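
One possible route, sketched here under assumptions rather than taken from the repository: Scrapy passes values given with -a NAME=VALUE on the command line to the spider's __init__ as keyword arguments. The helper below turns a comma-separated string of links into a list. The names parse_start_urls and urls are hypothetical, and the wiring into the spider is shown only as a comment.

```python
def parse_start_urls(urls_arg):
    """Split a comma-separated string of links into a clean list.

    Returns an empty list when no argument was supplied.
    """
    if not urls_arg:
        return []
    # Strip surrounding whitespace and drop empty pieces such as a trailing comma.
    return [url.strip() for url in urls_arg.split(",") if url.strip()]


# Hypothetical wiring inside the spider class (assumes a `urls` argument):
#
#     def __init__(self, urls=None, *args, **kwargs):
#         super().__init__(*args, **kwargs)
#         self.start_urls = parse_start_urls(urls)
```

With that in place, the crawl could be started as scrapy crawl GoodReads -a urls="<link1>,<link2>". Splitting on commas assumes the links themselves contain no commas; a different separator would be needed otherwise.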
