## Book Recommendation Project

### Books Dataset

Books are identified by their ISBN codes.  
Additionally, content-based information is included, such as **Book-Title**, **Book-Author**, **Year-Of-Publication**, and **Publisher**, which have been retrieved from Amazon Web Services.  
If a book has multiple authors, only the first author appears in the data.

Also included are cover image URLs in three sizes:  
**Image-URL-S**, **Image-URL-M**, **Image-URL-L** (small, medium, large).  
These URLs point to Amazon's website.

### Ratings Dataset

This dataset contains book rating information.  
Ratings (**Book-Rating**) can be:
- **explicit**, on a scale of 1–10 (higher value = better rating), or  
- **implicit**, indicated by a value of 0 (user has not provided a numerical rating).
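
The two kinds of rating can be separated with a simple filter. The sketch below is illustrative only: the sample rows and the `split_ratings` helper are hypothetical, and it assumes each row is a `(user_id, isbn, rating)` tuple.

```python
# Hypothetical sample rows: (user_id, isbn, Book-Rating)
sample_ratings = [
    (1, "isbn-a", 0),   # implicit: no numerical rating given
    (1, "isbn-b", 8),   # explicit
    (2, "isbn-a", 5),   # explicit
    (3, "isbn-c", 0),   # implicit
    (3, "isbn-b", 10),  # explicit
]


def split_ratings(rows):
    """Split rows into explicit (1-10) and implicit (0) ratings."""
    explicit = [row for row in rows if 1 <= row[2] <= 10]
    implicit = [row for row in rows if row[2] == 0]
    return explicit, implicit


explicit, implicit = split_ratings(sample_ratings)
print(f"{len(explicit)} explicit, {len(implicit)} implicit")
```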

## Project Objective

The project's objective is to build a book recommendation system that uses the Surprise library to implement a user-specific recommendation model. The system aims to predict which books an individual user is likely to appreciate, based on that user's previous ratings and the behavior of other users.

## Project Components

### 1. Data Preprocessing and Quality Checking

- Merging book and rating data
- Removing invalid ISBN codes
- Handling implicit entries (0-ratings)
- Optionally filtering out infrequent users and books
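
A minimal sketch of these cleaning steps, written in plain Python so that it runs without dependencies (in practice the same logic would likely be expressed with pandas). It makes several assumptions that the project description does not confirm: "invalid" ISBNs are taken to mean strings that fail the ISBN-10 checksum, implicit 0-ratings are simply dropped, and the thresholds and helper names (`is_valid_isbn10`, `filter_infrequent`, `preprocess`) are hypothetical.

```python
import re
from collections import Counter

ISBN10_PATTERN = re.compile(r"[0-9]{9}[0-9X]")


def is_valid_isbn10(isbn):
    """Return True if the string passes the ISBN-10 checksum."""
    isbn = isbn.strip().upper()
    if not ISBN10_PATTERN.fullmatch(isbn):
        return False
    digits = [10 if ch == "X" else int(ch) for ch in isbn]
    # Weights run from 10 down to 1; a valid sum is divisible by 11.
    total = sum(w * d for w, d in zip(range(10, 0, -1), digits))
    return total % 11 == 0


def filter_infrequent(rows, min_user_ratings, min_book_ratings):
    """Single pass: keep rows whose user and book both meet the thresholds.

    Counts are taken before filtering, so a kept user may end up with
    fewer rows than the threshold; repeat the pass if that matters.
    """
    user_counts = Counter(user for user, _, _ in rows)
    book_counts = Counter(isbn for _, isbn, _ in rows)
    return [
        (user, isbn, rating)
        for user, isbn, rating in rows
        if user_counts[user] >= min_user_ratings
        and book_counts[isbn] >= min_book_ratings
    ]


def preprocess(rows, min_user_ratings=2, min_book_ratings=2):
    """Drop invalid ISBNs, drop implicit 0-ratings, filter rare users/books."""
    rows = [r for r in rows if is_valid_isbn10(r[1])]
    rows = [r for r in rows if r[2] > 0]
    return filter_infrequent(rows, min_user_ratings, min_book_ratings)


# Hypothetical sample rows: (user_id, isbn, rating)
raw_rows = [
    (1, "0195153448", 8),
    (1, "080442957X", 6),
    (2, "0195153448", 7),
    (2, "080442957X", 0),  # implicit rating, dropped
    (3, "0195153449", 9),  # fails the checksum, dropped
    (3, "0195153448", 5),
]
clean_rows = preprocess(raw_rows)
print(clean_rows)
```

The thresholds trade coverage for density: higher values leave fewer users and books but give each remaining one more ratings to learn from.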

### 2. Building a Recommendation Model with the Surprise Library

- Training the model on user–book ratings
- Experimenting with different algorithms (e.g., **SVD**, **KNNWithMeans**, **BaselineOnly**)
- Evaluating model performance with cross-validation (MAE, RMSE)
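
For reference, the two evaluation metrics can be computed by hand as shown below. This is only an illustration of the definitions, using hypothetical (true rating, predicted rating) pairs; in the project itself, Surprise's `cross_validate` function can report both measures per fold.

```python
import math


def mae(pairs):
    """Mean absolute error over (true, predicted) rating pairs."""
    return sum(abs(true - pred) for true, pred in pairs) / len(pairs)


def rmse(pairs):
    """Root mean squared error over (true, predicted) rating pairs."""
    return math.sqrt(sum((true - pred) ** 2 for true, pred in pairs) / len(pairs))


# Hypothetical (true rating, predicted rating) pairs on the 1-10 scale
pairs = [(8, 7.0), (5, 6.0), (10, 8.0), (3, 3.0)]

print(f"MAE:  {mae(pairs):.4f}")
print(f"RMSE: {rmse(pairs):.4f}")
```

RMSE penalizes large errors more heavily than MAE, so a model with a few badly wrong predictions will look worse on RMSE than on MAE.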

### 3. Generating Predictions and Recommendations

- Using an anti-test set to predict ratings for books the user has not yet read
- Creating user-specific **Top-N recommendations**
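
The idea behind these two steps can be sketched in plain Python. In Surprise, `Trainset.build_anti_testset()` and an algorithm's `test()` method play these roles, and `test()` returns predictions with the fields `(uid, iid, r_ui, est, details)`; the helpers below (`anti_test_pairs`, `get_top_n`) and all sample values are hypothetical stand-ins that mimic that layout rather than the library's own code.

```python
from collections import defaultdict


def anti_test_pairs(ratings):
    """Return all (user, item) pairs that do NOT appear in the ratings."""
    users = sorted({user for user, _, _ in ratings})
    items = sorted({item for _, item, _ in ratings})
    seen = {(user, item) for user, item, _ in ratings}
    return [(u, i) for u in users for i in items if (u, i) not in seen]


def get_top_n(predictions, n=10):
    """Group predictions by user and keep the n highest estimates.

    Each prediction is a (uid, iid, r_ui, est, details) tuple.
    """
    per_user = defaultdict(list)
    for uid, iid, _r_ui, est, _details in predictions:
        per_user[uid].append((iid, est))
    return {
        uid: sorted(items, key=lambda pair: pair[1], reverse=True)[:n]
        for uid, items in per_user.items()
    }


# Hypothetical known ratings: (user, item, rating)
known = [("u1", "b1", 8), ("u1", "b2", 6), ("u2", "b2", 9), ("u2", "b3", 7)]
unseen = anti_test_pairs(known)

# Hypothetical model output for unseen pairs (estimates are made up)
predictions = [
    ("u1", "b3", None, 7.5, {}),
    ("u1", "b4", None, 8.2, {}),
    ("u2", "b1", None, 6.1, {}),
]
top_n = get_top_n(predictions, n=1)
print(unseen)
print(top_n)
```

Note that the anti-test set grows with the product of users and books, so on a sparse dataset of this kind it may be necessary to predict for one user at a time rather than for everyone at once.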

### 4. Analysis and Interpretation of Results

- Examining the model's accuracy and its limitations
- Considering the impact of data structure on model performance
- Presenting possibilities for further development (e.g., content-based enrichment, hybrid models)

The project's end result is a functional prototype of a book recommendation system that predicts user preferences and offers each user personalized book suggestions based on the rating data.

%%writefile .gitignore
# Python
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# Virtual Environment
venv/
ENV/
env/
.venv

# Jupyter Notebook
.ipynb_checkpoints
*/.ipynb_checkpoints/*
*.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# Datasets
*.csv
*.tsv
*.txt
# Assumption: keep a dependency list tracked if the project has one
!requirements.txt
*.json
*.xml
data/
datasets/
raw_data/
processed_data/

# Model files
*.pkl
*.pickle
*.joblib
*.h5
*.pt
*.pth
models/
saved_models/
checkpoints/

# Surprise library specific
.surprise_data/

# Large files
*.zip
*.tar.gz
*.rar

# Images
*.jpg
*.jpeg
*.png
*.gif
*.bmp

# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store

# Logs and databases
*.log
*.sql
*.sqlite
*.db

# Cache
__pycache__/
*.pyc
.pytest_cache/
.mypy_cache/
.dmypy.json
dmypy.json

# Environment variables
.env
.env.local

# Jupyter temporary files
.jupyter/
*.nbconvert.ipynb

# Documentation
docs/_build/

# OS
Thumbs.db

Writing .gitignore
