TBR Engine

A modular book recommendation and management engine built in Python. This project ingests a StoryGraph export, performs data cleaning and feature engineering, and provides both ranked and random book recommendations from a user's TBR (To Be Read) list. It also supports persistent updates: marking books as finished or DNF (did not finish) and adding new entries.

Features

CSV ingestion and structured preprocessing
Data cleaning and normalization
Recency-based scoring
Weighted ranking system
Author-preference recommendation logic
Exploration via controlled randomness
Deduplication and optional diversity filtering
CLI-based interaction
Persistent state management using CSV storage

Project Structure

tbr-engine/
│
├── data/
│   ├── raw/
│   │   └── storyGraph_export.csv
│   └── processed/
│       └── books.csv
│
├── ingest/
│   └── load_csv.py
│
├── preprocess/
│   ├── clean_books.py
│   └── normalize.py
│
├── ranking/
│   └── score.py
│
├── cli/
│   └── manage_books.py
│
├── main.py
└── README.md

How It Works

1. Ingestion

Loads a StoryGraph CSV export from data/raw.
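
As a rough illustration of the ingestion step, the sketch below reads a CSV export into a list of row dictionaries. It uses only the standard library so it runs without dependencies (the project itself installs pandas and numpy). The function name load_export, the column names, and the inline sample rows are assumptions for illustration, not the contents of ingest/load_csv.py or of a real export.

```python
import csv
import io


def load_export(source):
    """Read a CSV export from a file-like object into a list of dict rows."""
    rows = []
    for raw in csv.DictReader(source):
        # Trim whitespace in headers and cells; missing cells become "".
        rows.append(
            {key.strip(): (value or "").strip()
             for key, value in raw.items() if key is not None}
        )
    return rows


# Inline sample standing in for a file under data/raw/ (hypothetical columns).
sample = io.StringIO(
    "Title,Authors,Read Status,Star Rating\n"
    "Book One, Author A ,read,4.5\n"
    "Book Two,Author B,to-read,\n"
)
books = load_export(sample)
print(len(books), books[0]["Authors"])
```

With a real file, the same function could be called as load_export(open(path, newline="", encoding="utf-8")).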

2. Cleaning

Normalizes categorical fields
Handles missing ratings
Ensures consistent date formatting
Preserves all reading statuses
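
The cleaning steps above might look something like the following sketch. The field names (status, rating, date_read), the accepted date formats, and the choice to store a missing rating as None are all assumptions; the actual logic in preprocess/clean_books.py may differ.

```python
from datetime import datetime

# Date layouts to try, in order (assumed; adjust to the real export).
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d")


def parse_date(text):
    """Return an ISO date string, or None if the text is empty or unparseable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_book(row):
    """Return a cleaned copy of one raw row."""
    cleaned = dict(row)
    # Normalize the categorical status field: trim and lowercase.
    cleaned["status"] = row.get("status", "").strip().lower()
    # Keep missing ratings as None so they are not mistaken for a score of 0.
    rating = row.get("rating", "").strip()
    cleaned["rating"] = float(rating) if rating else None
    cleaned["date_read"] = parse_date(row.get("date_read", ""))
    return cleaned


def clean_books(rows):
    """Clean every row; no row is dropped, so all reading statuses are preserved."""
    return [clean_book(row) for row in rows]


raw_rows = [
    {"title": "Book One", "status": " Read ", "rating": "4.5", "date_read": "2024/01/05"},
    {"title": "Book Two", "status": "to-read", "rating": "", "date_read": ""},
]
cleaned_rows = clean_books(raw_rows)
print(cleaned_rows[0]["status"], cleaned_rows[0]["date_read"], cleaned_rows[1]["rating"])
```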

3. Feature Engineering

Min-max normalization of ratings
Recency scoring based on days since read
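
A minimal sketch of these two features follows. Min-max normalization maps each rating x to (x - min) / (max - min). The README does not state the recency formula, so the exponential decay and the half_life_days parameter below are assumptions chosen only to show one way of turning "days since read" into a score between 0 and 1.

```python
from datetime import date


def min_max(values):
    """Scale values linearly into [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def recency_score(date_read, today, half_life_days=180):
    """Score 1.0 for a book read today, halving every half_life_days (assumed decay)."""
    days_since = max((today - date_read).days, 0)
    return 0.5 ** (days_since / half_life_days)


ratings = [2.0, 3.0, 4.0]
normalized = min_max(ratings)
score = recency_score(date(2024, 1, 1), today=date(2024, 6, 29))
print(normalized, score)
```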

4. Ranking Logic

Read books are scored using a weighted combination of rating and recency
TBR books are ranked using author preferences learned from past ratings
Slight randomness is added for exploration
An author diversity constraint is applied
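
The ranking ideas above can be sketched as follows. This is not the code in ranking/score.py: the weights, the use of a per-author mean rating as the "preference", the default score for unseen authors, the jitter size, and the max_per_author cap are all illustrative assumptions.

```python
import random
from collections import defaultdict


def read_score(rating_norm, recency, w_rating=0.7, w_recency=0.3):
    """Weighted blend of a normalized rating and a recency score (assumed weights)."""
    return w_rating * rating_norm + w_recency * recency


def author_preferences(read_books):
    """Mean rating per author, ignoring unrated books."""
    ratings = defaultdict(list)
    for book in read_books:
        if book["rating"] is not None:
            ratings[book["author"]].append(book["rating"])
    return {author: sum(vals) / len(vals) for author, vals in ratings.items()}


def rank_tbr(tbr_books, prefs, default=3.0, jitter=0.1, max_per_author=1, rng=None):
    """Rank TBR books by author preference plus a small random bonus."""
    if rng is None:
        rng = random.Random()
    scored = [
        (prefs.get(book["author"], default) + rng.uniform(0, jitter), book)
        for book in tbr_books
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Diversity constraint: keep at most max_per_author books per author.
    picked, per_author = [], defaultdict(int)
    for _, book in scored:
        if per_author[book["author"]] < max_per_author:
            picked.append(book)
            per_author[book["author"]] += 1
    return picked


read_books = [
    {"author": "Author A", "rating": 5.0},
    {"author": "Author A", "rating": 4.0},
    {"author": "Author B", "rating": 2.0},
    {"author": "Author B", "rating": None},
]
tbr_books = [
    {"title": "A1", "author": "Author A"},
    {"title": "A2", "author": "Author A"},
    {"title": "B1", "author": "Author B"},
    {"title": "C1", "author": "Author C"},
]
prefs = author_preferences(read_books)
# jitter=0.0 makes this demo deterministic; a positive value adds exploration.
ranked_titles = [b["title"] for b in rank_tbr(tbr_books, prefs, jitter=0.0)]
print(ranked_titles)
```

Passing a seeded random.Random as rng keeps the exploration reproducible between runs.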

5. CLI Interaction

Users can:

Generate a smart recommendation
Mark a book as finished (automatically sets the finish date to today)
Mark a book as DNF
Add a new book to TBR
Persist changes to CSV
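
One way these state updates could work is sketched below, using the standard library's csv module. The field names, status labels, and helper names (set_status, add_to_tbr, save_books) are hypothetical and are not taken from cli/manage_books.py.

```python
import csv
import io
from datetime import date

FIELDS = ["title", "author", "status", "date_read"]  # assumed column set


def set_status(books, title, status, today=None):
    """Update the first book matching title; finished books get today's date."""
    for book in books:
        if book["title"].lower() == title.lower():
            book["status"] = status
            if status == "read":
                book["date_read"] = (today or date.today()).isoformat()
            return True
    return False


def add_to_tbr(books, title, author):
    """Append a new unread entry."""
    books.append({"title": title, "author": author, "status": "to-read", "date_read": ""})


def save_books(books, handle):
    """Write all books to an open text handle as CSV."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(books)


library = [{"title": "Book One", "author": "Author A", "status": "to-read", "date_read": ""}]
set_status(library, "book one", "read", today=date(2024, 3, 1))
add_to_tbr(library, "Book Two", "Author B")
set_status(library, "Book Two", "dnf")

buffer = io.StringIO()  # stands in for data/processed/books.csv
save_books(library, buffer)
print(buffer.getvalue())
```

When writing to a real file, open it with newline="" as the csv module documentation recommends.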

Installation

Clone the repository:

git clone https://github.com/tranguyeenn/optimization-books-engine

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate

Install dependencies:

pip install pandas numpy

Usage

Run the application:

python main.py

You will be prompted with a menu:

1 - Smart recommendation
2 - Mark book as finished
3 - Mark book as DNF
4 - Add book to TBR
5 - Exit
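
A menu like this is often implemented as a small dispatch loop. The sketch below is an assumption about structure, not the code in main.py: run_menu and its handlers, read_choice, and show parameters are hypothetical names, and the input and output functions are injectable so the loop can be driven by a script instead of the keyboard.

```python
MENU = {
    "1": "Smart recommendation",
    "2": "Mark book as finished",
    "3": "Mark book as DNF",
    "4": "Add book to TBR",
    "5": "Exit",
}


def run_menu(handlers, read_choice=input, show=print):
    """Show the menu until Exit is chosen; return the choices that were handled."""
    handled = []
    while True:
        for key, label in MENU.items():
            show(f"{key} - {label}")
        choice = read_choice("Choose an option: ").strip()
        if choice == "5":
            return handled
        action = handlers.get(choice)
        if action is None:
            show("Invalid choice, try again.")
            continue
        action()
        handled.append(choice)


# Scripted demo: pick option 1, then an invalid option, then exit.
log = []
scripted = iter(["1", "9", "5"])
result = run_menu(
    {"1": lambda: log.append("recommend")},
    read_choice=lambda prompt: next(scripted),
    show=lambda *args: None,
)
print(result, log)
```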

Changes are automatically saved to:

data/processed/books.csv

Design Principles

Separation of concerns
Modular architecture
Clean pipeline orchestration
Persistent state management

Future Improvements

Web-based UI (Streamlit or Flask)
REST API layer
Recommendation diversity controls
Collaborative filtering extensions
SQLite or database backend
Unit testing and CI integration

Author

Trang Nguyen

License

MIT License
