Skip to content

typesense/showcase-books-search

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
 
 
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

πŸ“š Instant Books Search, powered by Typesense

This is a demo that showcases some of Typesense's features using a 28 Million database of books from OpenLibrary (Internet Archive).

View it live here: books-search.typesense.org

Tech Stack

This search experience is powered by Typesense which is a blazing-fast, open source typo-tolerant search-engine. It is an open source alternative to Algolia and an easier-to-use alternative to ElasticSearch.

The book dataset is from openlibrary.org. If you're able to contribute book metadata, please do πŸ™

The app was built using the Typesense Adapter for InstantSearch.js and is hosted on S3, with CloudFront for a CDN.

The search backend is powered by a geo-distributed 3-node Typesense cluster running on Typesense Cloud, with nodes in Oregon, Frankfurt and Mumbai.

The dataset has ~28M records, takes up 6.8GB on disk and 14.3GB in RAM when indexed in Typesense. Takes ~3 hours to index these 28M records.

Repo structure

  • src/ and index.html - contain the frontend UI components, built with Typesense Adapter for InstantSearch.js
  • scripts/indexer - contains the script to index the book data into Typesense.
  • scripts/data - contains a 1K sample subset of the books database. But you can download the full dataset from the link above.

Development

To run this project locally, install the dependencies and run the local server:

yarn
bundle # JSON parsing takes a while to run using JS when indexing, so we're using Ruby just for indexing

yarn run typesenseServer

ln -s .env.development .env

yarn run indexer:extractAuthors # This will output an authors.jsonl file
yarn run indexer:transformDataset # This will output a transformed_dataset.json file
BATCH_SIZE=100000 yarn run indexer:importToTypesense # This will import the JSONL file into Typesense

yarn start

Open http://localhost:3000 to see the app.

Deployment

The app is hosted on S3, with Cloudfront for a CDN.

yarn build
yarn deploy

About

A site to instantly search 28M books from OpenLibrary using Typesense Search (an open source alternative to Algolia / ElasticSearch) ⚑ πŸ“š πŸ”

Topics

Resources

License

Stars

Watchers

Forks

Sponsor this project