ScrapBook: ISBN-based Book Information Retrieval

In this article

The complete code is available here:

Chaloklum-Books/Add_new_books_dataset.ipynb

The dataset table is available to explore on Kaggle here:

Chaloklum Bookshop Dataset

Short Story

All of the book ISBNs were collected from my local bookstore.

Chaloklum Bookshop is a bookstore in Koh Phangan, Thailand, that sells more than 4,000 second-hand books in many languages, such as English, German, and French.

A large number of tourists visit every year, most of them from Western countries, and many foreigners also live and work here. The bookstore therefore sells books in a variety of languages. Almost all of the books are second-hand, and foreigners who live here frequently exchange books with the shop. You can bring your own books to exchange, too.

This bookstore has been open for many years but has never recorded information about its books or its book trades. So I decided to catalogue as many books in the store as possible: I used my phone to scan each book's barcode for its ISBN, then used Python to retrieve information about the books from two sources, Google Books and Goodreads. 😃

Tools

Here is the list of tools I used.

Books ISBN Collecting

Android smart phone
Application: 'Barcode to Text' by 1room

Data Retrieving and Management Libraries

Python 3.8
NumPy 1.21.6
Json 2.0.9
BeautifulSoup 4.9
Requests 2.23
Pandas 1.3.5

Data Sources

Google Books API
Goodreads.com

Book's ISBN Collecting Process

Scan the barcodes of all the books in the bookstore using my smartphone and the Barcode to Text application.
- The barcode on a book represents its ISBN.
- During this step I also wrote the code and tried it on some sample ISBNs.
Export the ISBNs to a .txt file. Here is an example of what it looks like:

books_isbn.txt
9784478048122
9784478048009
4091780326
9784872577969
4091780334
CIP2005001965
6142204340028
...
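The sample file mixes 13-digit codes, 10-digit codes, and entries such as CIP2005001965 that are not ISBNs at all. A minimal sketch of how such lines could be sorted before any lookup is shown here. The function names are hypothetical and the labels only loosely mirror the isbn_10, isbn_13, and isbn_other column names in the dataset; this is not necessarily how the notebook does it. It relies on the standard ISBN check-digit rules.

```python
import re


def is_valid_isbn10(code: str) -> bool:
    """True if code is 10 characters with a correct mod-11 check digit."""
    if not re.fullmatch(r"[0-9]{9}[0-9X]", code):
        return False
    # "X" in the last position stands for the value 10.
    values = [10 if ch == "X" else int(ch) for ch in code]
    # Weights run from 10 down to 1; a valid code sums to a multiple of 11.
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def is_valid_isbn13(code: str) -> bool:
    """True if code is a 978/979 number with a correct mod-10 check digit."""
    if not re.fullmatch(r"97[89][0-9]{10}", code):
        return False
    # Weights alternate 1, 3, 1, 3, ...; a valid code sums to a multiple of 10.
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(code))
    return total % 10 == 0


def classify_isbn(raw: str) -> str:
    """Label one scanned line as 'isbn_13', 'isbn_10', or 'isbn_other'."""
    code = raw.strip().replace("-", "").upper()
    if is_valid_isbn13(code):
        return "isbn_13"
    if is_valid_isbn10(code):
        return "isbn_10"
    return "isbn_other"


if __name__ == "__main__":
    for line in ["9784478048122", "4091780326", "CIP2005001965"]:
        print(line, classify_isbn(line))
```

Anything labelled isbn_other can be kept aside for manual review rather than sent to the data sources.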

Use Python and the libraries mentioned above to scrape the web page of each book, given its ISBN.
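As an illustration of the Google Books side of this step, the sketch below queries the public volumes endpoint with an `isbn:` search term and pulls a few `volumeInfo` fields into a flat dictionary. It is a sketch under assumptions, not the notebook's code: the function names are hypothetical, the output keys are chosen to resemble the dataset columns, and it uses the standard-library `urllib` instead of Requests so that it has no dependencies. Goodreads scraping is not shown, because it depends on page markup that is not described here.

```python
import json
import urllib.parse
import urllib.request

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(isbn: str) -> str:
    """Build a volumes search URL restricted to one ISBN."""
    return GOOGLE_BOOKS_URL + "?" + urllib.parse.urlencode({"q": f"isbn:{isbn}"})


def extract_fields(payload: dict):
    """Flatten the first matching volume; return None when nothing matched."""
    items = payload.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    # Missing keys become None so that old or obscure books do not raise errors.
    return {
        "title": info.get("title"),
        "subtitle": info.get("subtitle"),
        "authors": info.get("authors"),
        "publisher": info.get("publisher"),
        "published_date": info.get("publishedDate"),
        "page_count": info.get("pageCount"),
        "categories": info.get("categories"),
        "language": info.get("language"),
    }


def fetch_book(isbn: str, timeout: float = 10.0):
    """Query Google Books for one ISBN (performs a network request)."""
    with urllib.request.urlopen(build_query_url(isbn), timeout=timeout) as resp:
        payload = json.load(resp)
    return extract_fields(payload)
```

Calling `fetch_book("9784478048122")` returns a dictionary or `None`. For several thousand ISBNs it would be sensible to pause between requests and to record failed lookups so they can be retried later.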

Collected Dataset

There are more than 4,000 books, but some are too old to have any information in either the Google Books API or Goodreads.

An example of the collected dataset table:

	id	isbn_10	isbn_13	isbn_other	isbn_book	authors	title	subtitle	publisher	published_date	page_count	categories	language	google_desc	rating_avg	#reviews	#ratings	#text_reviews	thumbnail	genre	goodreads_desc	text_reviews	also_enjoy	status
0	1	nan	nan	nan	9788020609564	['Jiří Šolc']	Útěky a návraty Bohumila Laušmana	osud českého politika	nan	2008	403	['Political prisoners / Czechoslovakia / 1948-1968 / czenas']	cs	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	-1
1	2	nan	nan	nan	9789637253089	nan	A nők tartják az égbolt felét	egy rendkívüli asszony rendkívüli élettörténete	nan	2005	nan	nan	hu	nan	nan	nan	nan	nan	nan	nan	nan	nan	nan	-1
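In the sample rows, the authors and categories cells look like Python list literals (for example `['Jiří Šolc']`), while missing values appear as `nan`. Assuming the table is read back from a text export, where these cells arrive as strings, a small helper like the hypothetical `parse_list_cell` below could turn them back into lists. This is a sketch based only on the rows shown above.

```python
import ast
import math


def parse_list_cell(cell) -> list:
    """Convert a cell such as "['Jiří Šolc']" into a list; nan becomes []."""
    # A real float NaN, as produced when a CSV is loaded with pandas.
    if cell is None or (isinstance(cell, float) and math.isnan(cell)):
        return []
    text = str(cell).strip()
    if not text or text.lower() == "nan":
        return []
    try:
        # literal_eval only parses literals, so it is safe on untrusted text.
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a literal: treat the raw text as a single entry.
        return [text]
    return list(value) if isinstance(value, (list, tuple)) else [str(value)]
```

With pandas, the helper could be applied per column, for example `df["authors"].apply(parse_list_cell)`, assuming `df` holds the loaded table.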

I also have a Notion page where you can look up books using this table!

Problems and Obstacles Encountered

Some books have no barcode (less than 5% of the books).

Some books don't even have a cover :'<
Some books have indistinct barcodes, either covered with price tags or marked over in dark ink.

Barcodes can still be scanned if they are only slightly damaged,
but they cannot be scanned at all if too much of the barcode is covered.

p4zaa/ScrapBook-ISBN-based-retrieval
