books_scraper

Books scraper for Google Scholar and Goodreads

Prerequisites

Install Python for your operating system. You can download Python 3.8.2 from here.
This program uses Selenium WebDriver to fetch Goodreads bookshelf data, so you need a driver installed for your browser. The supported browsers are Chrome, Firefox, Edge and Safari. It has been tested with Firefox and Safari (on macOS 10.14.6).
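
Before running the scraper, it can help to confirm that a driver for your browser is actually reachable. The following is a minimal sketch, not part of this repository: `DRIVER_EXECUTABLES` and `find_driver` are hypothetical names, and the executable names are the conventional ones for each browser's driver, assumed to be on your `PATH`.

```python
import shutil

# Conventional executable names for each browser's WebDriver binary.
# (Assumption: the driver is installed under its usual name and is on PATH.)
DRIVER_EXECUTABLES = {
    "chrome": "chromedriver",
    "firefox": "geckodriver",
    "edge": "msedgedriver",
    "safari": "safaridriver",
}


def find_driver(browser):
    """Return the path of the driver executable for `browser`, or None if absent."""
    try:
        executable = DRIVER_EXECUTABLES[browser.lower()]
    except KeyError:
        raise ValueError(f"Unsupported browser: {browser!r}") from None
    return shutil.which(executable)


if __name__ == "__main__":
    # Report which of the supported browsers have a driver available.
    for name in DRIVER_EXECUTABLES:
        path = find_driver(name)
        print(f"{name}: {path or 'driver not found on PATH'}")
```

If the check reports that no driver was found for the browser you intend to use, install that driver first.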

How to install and run

Open a shell
cd some_folder_where_you_want_this_code
git clone https://github.com/bsodhi/books_scraper.git
cd books_scraper
python3 -m venv give_some_name
source give_some_name/bin/activate
pip install -r requirements.txt
python3 books_scraper/scraper.py -h

Output format

The output is written as a CSV file. For Goodreads data, the following columns are written to the CSV file: ["author", "title", "isbn", "language", "avg_rating", "ratings", "pub_year", "book_format", "pages", "genre"]

For Google Scholar data, the columns are: ["author", "title", "citedby", "url", "abstract"]
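
As a rough illustration of working with the Goodreads output, the sketch below reads rows with the columns listed above and filters them by rating. It assumes the CSV starts with a header row holding those column names, which this README does not confirm. The `load_rows` helper and the sample row are placeholders made up for the example, not real scraper output.

```python
import csv
import io

# Column names as listed in this README for Goodreads data.
GOODREADS_COLUMNS = ["author", "title", "isbn", "language", "avg_rating",
                     "ratings", "pub_year", "book_format", "pages", "genre"]


def load_rows(stream, columns):
    """Read CSV rows into dicts, checking that the header matches `columns`."""
    reader = csv.DictReader(stream)
    if reader.fieldnames != columns:
        raise ValueError(f"Unexpected header: {reader.fieldnames}")
    return list(reader)


# Placeholder data standing in for a real output file (assumed header row).
sample = io.StringIO(
    ",".join(GOODREADS_COLUMNS) + "\n"
    "Example Author,Example Title,0000000000,English,4.25,100,2001,"
    "Paperback,320,Fiction\n"
)

rows = load_rows(sample, GOODREADS_COLUMNS)

# All values are read as strings, so convert before comparing numerically.
well_rated = [row["title"] for row in rows if float(row["avg_rating"]) >= 4.0]
print(well_rated)
```

To read a real output file, pass `open(path, newline="", encoding="utf-8")` in place of the in-memory sample. The same approach should work for Google Scholar data with its own column list.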

This code was written with a lot of help from the Stack Overflow community and the Python API documentation. Greatly appreciated!

