Python Book Scraper Program

This is a beginner/intermediate-friendly program that pulls data from the website http://books.toscrape.com and exports it as CSV/Excel files for readability. This project will help you learn how to read HTML and use several Python packages (BeautifulSoup, os, requests, csv, ...). It requires a basic understanding of the Python programming language, such as how to make lists and use loops. If needed, I highly recommend reviewing the documentation and trying a couple of exercises on Replit first.
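To give a feel for what "reading HTML" with BeautifulSoup looks like, here is a minimal sketch. It is not the code in main.py. The HTML string is a hypothetical, trimmed-down stand-in for a product page, and the names SAMPLE_HTML, RATING_WORDS and parse_book are made up for illustration. It assumes the page keeps the UPC and prices in a table of label/value rows and encodes the rating as a CSS class; check the real page source before relying on that.

```python
from bs4 import BeautifulSoup

# Hypothetical, trimmed-down snippet shaped like a product page;
# a real page contains far more markup than this.
SAMPLE_HTML = """
<article class="product_page">
  <h1>Example Book</h1>
  <p class="star-rating Three"></p>
  <table class="table table-striped">
    <tr><th>UPC</th><td>abc123</td></tr>
    <tr><th>Price (excl. tax)</th><td>£51.77</td></tr>
    <tr><th>Price (incl. tax)</th><td>£51.77</td></tr>
  </table>
</article>
"""

# Assumption: the rating is spelled out as a word in the class attribute.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_book(html):
    """Return a dict of fields pulled from one product page."""
    soup = BeautifulSoup(html, "html.parser")

    # Each table row holds a label in <th> and its value in <td>.
    table = {
        row.th.get_text(strip=True): row.td.get_text(strip=True)
        for row in soup.select("table tr")
    }

    # BeautifulSoup returns the class attribute as a list of class names.
    rating_classes = soup.find("p", class_="star-rating")["class"]
    rating_word = next(c for c in rating_classes if c in RATING_WORDS)

    return {
        "title": soup.find("h1").get_text(strip=True),
        "upc": table["UPC"],
        "price_excl_tax": table["Price (excl. tax)"],
        "price_incl_tax": table["Price (incl. tax)"],
        "rating": RATING_WORDS[rating_word],
    }


book = parse_book(SAMPLE_HTML)
print(book)
```

In a real run the HTML would come from requests (for example, the text of a response) rather than from a hard-coded string.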

The Program

This program pulls data from the website http://books.toscrape.com and organizes it into categories to be exported to CSV/Excel files for easy viewing and sorting. For each book it retrieves the title, URL, UPC, price with and without tax, description, category, rating, image URL, and the image itself. It organizes this data into multiple CSV/Excel files by category. The point of the program is to make viewing the relevant data from an entire website straightforward and quick. Instead of scrolling through page after page of books, all of the data is assembled for you to search through, so you can quickly find what you need about whatever book you are looking for. This project can be adapted to scrape other websites in future projects (please make sure that you can do so legally).
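The "one file per category" idea can be sketched with the standard csv and os modules. This is an illustration under stated assumptions, not the logic in main.py: the function name write_category_csvs, the FIELDS column names, and the sample rows are all hypothetical.

```python
import csv
import os
import tempfile
from collections import defaultdict

# Hypothetical column names, based on the fields listed above.
FIELDS = ["title", "url", "upc", "price_incl_tax", "price_excl_tax",
          "description", "category", "rating", "image_url"]


def write_category_csvs(books, out_dir):
    """Write one CSV per category and return the list of paths created."""
    by_category = defaultdict(list)
    for book in books:
        by_category[book["category"]].append(book)

    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for category, rows in sorted(by_category.items()):
        # Assumes category names are safe to use as file names.
        path = os.path.join(out_dir, f"{category}.csv")
        # newline="" avoids blank lines between rows on Windows;
        # utf-8-sig adds a BOM, which helps Excel detect the encoding.
        with open(path, "w", newline="", encoding="utf-8-sig") as handle:
            writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
            writer.writeheader()
            writer.writerows(rows)
        paths.append(path)
    return paths


# Made-up sample rows; missing fields are written as empty cells.
sample = [
    {"title": "Book A", "category": "Poetry", "rating": 3},
    {"title": "Book B", "category": "Travel", "rating": 5},
    {"title": "Book C", "category": "Poetry", "rating": 1},
]
created = write_category_csvs(sample, tempfile.mkdtemp())
print([os.path.basename(p) for p in created])
```

Grouping the rows first and writing each file once keeps every CSV open only briefly, which is simpler than keeping one file handle open per category during the scrape.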

Setting up the IDE (if preferred):

Go to https://www.jetbrains.com/pycharm/ and download PyCharm for your operating system.
Download Python version 3.12.1 for your operating system from https://www.python.org/downloads/.
Open PyCharm and create a new project.
Download the code.
Extract the code to your preferred location.
In the IDE, go to File > Open, navigate to the location where you extracted the code, and select main.py.
(Optional) Alternatively, navigate to the extracted folder and double-click main.py; it will run automatically in the terminal.

Instructions for Terminal (Windows):

Download the files from this repository or clone it using the command below.

$ git clone https://github.com/yetty300/Book_Scraper
Navigate to the directory containing the repository.

$ cd Book_Scraper
Using these terminal commands, create and activate a virtual environment.

$ python -m venv env

$ env\Scripts\activate
(Some setups require this activation step before continuing.)
Use the command below to install the packages listed in the configuration file Requirements.txt.

$ pip install -r Requirements.txt
Run the file main.py to download product data in CSV format and product images.

$ python main.py

Instructions for Mac:

Download the files from this repository or clone it using the command below.

$ git clone https://github.com/yetty300/Book_Scraper
Navigate to the directory containing the repository.

$ cd Book_Scraper
Using these terminal commands, create and activate a virtual environment.

$ python3 -m venv env

$ source env/bin/activate
Use the command below to install the packages listed in the configuration file Requirements.txt.

$ python3 -m pip install -r Requirements.txt
Run the file main.py to download product data in CSV format and product images.

$ python main.py
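
On either platform, if you are unsure whether the virtual environment is actually active before installing packages or running the scraper, a quick check like the one below can help. It is a small sketch using only the standard library; the function name in_virtualenv is made up for illustration, and the check assumes the environment was created with python -m venv as in the steps above.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv environment."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line reports False, activate the environment again and rerun the install command so the packages end up in env rather than in your global Python.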
