Gutenfinder: search widget for Project Gutenberg
Gutenfinder is a little project to make a search widget for Project Gutenberg.
It is under development.
CLI examples:
Serve a webapp on localhost, port 8080:
$ gutenfinder web serve \
--postgres="user=username dbname=gutenfinder host=dbhost sslmode=disable" \
--elastic="http://gutenfinder_dev_elastic:9200"
Search for eBooks on the command line:
$ gutenfinder search --term beancurd
{
  "id": 9,
  "term": "beancurd",
  "result_count": 1,
  "items": [
    {
      "id": 467,
      "ebook_id": 724,
      "ebook_title": "Have We No Rights? A frank discussion of the \"rights\" of missionaries",
      "html_snippet": "be made of a couple of\r\nplanks laid on sawhorses, and you may have to eat boiled rice, greens,\r\nand \u003cem\u003ebeancurd\u003c/em\u003e"
    }
  ]
}
Search terms are matched against each eBook's title, its full text, and any other properties currently indexed.
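As a rough illustration of consuming this output from another program, here is a minimal Python sketch that parses the sample response above and turns each item into a plain-text summary. The field names come from the sample; the helper names `plain_snippet` and `summarize` are hypothetical and not part of Gutenfinder, and the tag stripping is a simple regex that assumes the snippet only contains basic highlight markup such as `<em>`.

```python
import html
import json
import re

# Sample response copied from the CLI example above (raw string keeps JSON escapes intact).
SAMPLE = r'''
{
  "id": 9,
  "term": "beancurd",
  "result_count": 1,
  "items": [
    {
      "id": 467,
      "ebook_id": 724,
      "ebook_title": "Have We No Rights? A frank discussion of the \"rights\" of missionaries",
      "html_snippet": "be made of a couple of\r\nplanks laid on sawhorses, and you may have to eat boiled rice, greens,\r\nand \u003cem\u003ebeancurd\u003c/em\u003e"
    }
  ]
}
'''


def plain_snippet(html_snippet):
    """Hypothetical helper: drop tags, unescape entities, collapse whitespace."""
    text = re.sub(r"<[^>]+>", "", html_snippet)  # assumes simple, well-formed tags
    text = html.unescape(text)
    return " ".join(text.split())


def summarize(response):
    """Hypothetical helper: return (ebook_id, title, plain snippet) per result item."""
    return [
        (item["ebook_id"], item["ebook_title"], plain_snippet(item["html_snippet"]))
        for item in response["items"]
    ]


if __name__ == "__main__":
    for ebook_id, title, snippet in summarize(json.loads(SAMPLE)):
        print(f"[{ebook_id}] {title}\n    {snippet}")
```

In practice the JSON would come from the standard output of `gutenfinder search` rather than from an embedded string.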
Getting started
Dependencies
Install the following dependencies:
Due to an issue with the dockerized Elasticsearch, the following tweak must be made on Linux:
sudo sysctl -w vm.max_map_count=262144
This setting will be lost on reboot, but can be made permanent by adding vm.max_map_count=262144 to /etc/sysctl.conf.
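To confirm the setting took effect, the current value can be read back from procfs. The following is a small sketch, not part of Gutenfinder: it assumes a Linux host that exposes `/proc/sys/vm/max_map_count`, and it treats any value at or above 262144 as sufficient, which is an assumption based on the command above. The function names are hypothetical.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # value used in the sysctl command above
PROC_PATH = Path("/proc/sys/vm/max_map_count")  # Linux procfs entry for this setting


def parse_max_map_count(raw):
    """Parse the text content of the procfs entry into an integer."""
    return int(raw.strip())


def is_sufficient(current, required=REQUIRED_MAX_MAP_COUNT):
    """Assumption: any value at or above the required one is acceptable."""
    return current >= required


if __name__ == "__main__":
    if PROC_PATH.exists():
        current = parse_max_map_count(PROC_PATH.read_text())
        status = "ok" if is_sufficient(current) else "too low"
        print(f"vm.max_map_count = {current} ({status})")
    else:
        print("vm.max_map_count is not exposed on this system")
```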
Initialize the project
$ ./script/init
$ ./script/download
This will download a lot of data from a Project Gutenberg archive.
Run with Docker
The following sequence of commands will build and launch the app, run migrations, read the eBook catalogue, and index the eBooks in Elasticsearch.
$ ./script/docker-compose-up
$ ./script/db-migrate
$ ./script/catalogue-load-xml
$ ./script/catalogue-load-text
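Because each step depends on the one before it, it can be convenient to run them from a single wrapper that stops at the first failure. Below is a minimal Python sketch of that idea. It assumes each script exits with a non-zero status on failure and returns once its work is done; neither behaviour is confirmed here. The `run_steps` function is hypothetical and not shipped with the project.

```python
import subprocess

# Script paths taken from the command sequence above, in the same order.
STEPS = [
    "./script/docker-compose-up",
    "./script/db-migrate",
    "./script/catalogue-load-xml",
    "./script/catalogue-load-text",
]


def run_steps(steps, run=subprocess.run):
    """Run each step in order; raise on the first non-zero exit status.

    `run` is injectable so the sequencing logic can be exercised without
    actually launching the scripts.
    """
    completed = []
    for step in steps:
        result = run([step])
        if result.returncode != 0:
            raise RuntimeError(f"{step} exited with status {result.returncode}")
        completed.append(step)
    return completed
```

Calling `run_steps(STEPS)` from the repository root would then execute the four scripts in order and abort as soon as one of them fails.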
Credits
I am not affiliated with Project Gutenberg in any way. This is just for fun.
License
MIT-style, see the LICENSE file.