# BikeTrend: Prototyping an Article Recommendation System for Bike Product Retailers
#### Paddle the Wave Session 2025 05 19


## Introduction

A bike product retailer's success depends on managing inventory accurately. Online articles on sites like [PinkBike](https://www.pinkbike.com/) can have an outsized effect on sales. Unfortunately, there are often too many articles for short-staffed retailers to sort through. This creates an opportunity for a tool that can sort through a high volume of articles and identify the ones most relevant to a bike retailer's products.

We can classify articles into three major types based on their relevance to inventory management of bike products:
* Positive: Articles whose content implies a product's sales will increase
* Negative: Articles whose content implies a product's sales will decrease
* Neutral: Articles whose content is not relevant to a product
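
The three types above can be written down as a small data structure. This is a minimal sketch for illustration only: `Relevance`, `ArticleLabel`, and `needs_inventory_review` are hypothetical names and are not taken from the project notebooks.

```python
from dataclasses import dataclass
from enum import Enum


class Relevance(Enum):
    """The three article types described above."""
    POSITIVE = "positive"  # content implies the product's sales will increase
    NEGATIVE = "negative"  # content implies the product's sales will decrease
    NEUTRAL = "neutral"    # content is not relevant to the product


@dataclass(frozen=True)
class ArticleLabel:
    """Hypothetical record pairing one article with one product."""
    article_title: str
    product_name: str
    relevance: Relevance


def needs_inventory_review(label: ArticleLabel) -> bool:
    """Only positive or negative articles should prompt an inventory decision."""
    return label.relevance is not Relevance.NEUTRAL


# Placeholder values, not real article or product names.
example = ArticleLabel("Trail bike review", "Example Frame", Relevance.POSITIVE)
print(needs_inventory_review(example))
```

Keeping the label separate from the article text makes it possible to attach a different relevance to the same article for each product.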

## Goal
This notebook demonstrates how to prototype BikeTrend, an article recommendation system application using data science tools. We built the recommender to identify [PinkBike](https://www.pinkbike.com/) articles most relevant to specific bike products sold by [Esker Cycles](https://eskercycles.com/).

## Key Results
- **Formulate**: Brainstorm ideas as a team
- **Collect**: Collect the initial datasets for the model build
- **Clean**: Clean the datasets to have the correct shape and content to fit into a model
- **Analyze**: Explore the cleaned datasets and edit content as needed
- **Model**: Fit different models to the datasets, recommend articles for products, and evaluate their relevance.
- **Present**: Present the model outputs in compelling ways (for example, dashboards, visualizations, or reports).

**Run the following notebooks and explore how we prototyped BikeTrend.**

## 0. Formulate Problem

First, we brainstormed as a team around the problem using the [Question Storming](https://experiencinginformation.com/2011/11/02/questionstorming-framing-the-problem/) technique. This allowed us to narrow in on the few ideas that would make the most compelling prototype.

The results of our Question Storm can be found [here](https://ridethenextwave-my.sharepoint.com/:x:/p/nick_capaldini/EWQWeLLgcz9KroLso29B4fABtNmctnTHb2WQNEyxtPARqg?e=AsxVzH).

## 1. Collect Dataset

Next, we collect the necessary product text data from the [Esker Cycles](https://eskercycles.com/) website.

Run the following notebook to extract the necessary text data from Esker Cycles.

[01-CollectProducts.ipynb](./01-CollectProducts.ipynb)

This notebook saves the dataset in the file `./data/product-data-raw.json`.
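
Once the file exists, a quick sanity check can confirm that it loads and show which fields it contains. The sketch below is an assumption-laden illustration: the schema of `product-data-raw.json` is not documented here, so the code assumes only that the file holds a JSON list of objects (or a single object). `load_records` and `summarize` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_records(path):
    """Load a JSON file and return its records as a list of dicts.

    Assumption: the file holds a list of objects or one single object.
    """
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        return [data]
    return list(data)


def summarize(records):
    """Count the records and collect every field name they use."""
    fields = set()
    for record in records:
        fields.update(record.keys())
    return {"count": len(records), "fields": sorted(fields)}
```

For example, `summarize(load_records("./data/product-data-raw.json"))` would report the number of products collected and the field names found.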

## 2. Clean Datasets

The raw article and product text is not filtered and cannot be directly used for machine learning. Here we use various methods to clean the text data and prepare it for machine learning.

Run the following notebooks to clean the datasets. 

#### 2.1 Clean Article Data
Raw article data is cleaned to prepare for modeling.

[02-01-CleanArticles.ipynb](./02-01-CleanArticles.ipynb)

This notebook saves the cleaned dataset in the file `./intermediate_data/article-data-clean.json`.

#### 2.2 Clean Product Data
Raw product data is cleaned to prepare for modeling.

[02-02-CleanProducts.ipynb](./02-02-CleanProducts.ipynb)

This notebook saves the cleaned dataset in the file `./intermediate_data/product-data-clean.json`.
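
As an illustration of the kind of cleaning this step involves, the sketch below normalizes one raw text string. It is not the code from the cleaning notebooks, whose exact methods are not listed here. The specific steps (unescaping HTML entities, stripping tags, lowercasing, dropping punctuation) are assumptions about what typical text cleaning for scraped web pages looks like, and `clean_text` is a hypothetical name.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")           # HTML tags such as <p> or </div>
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")  # anything except letters, digits, whitespace
SPACE_RE = re.compile(r"\s+")             # runs of whitespace


def clean_text(raw):
    """Return a lowercase, tag-free, punctuation-free version of raw."""
    text = html.unescape(raw)         # turn entities like &amp; into characters
    text = TAG_RE.sub(" ", text)      # replace tags with a space
    text = text.lower()
    text = NON_WORD_RE.sub(" ", text) # replace punctuation with a space
    return SPACE_RE.sub(" ", text).strip()


print(clean_text("<p>New 29&quot; Wheels!</p>"))
```

Replacing removed characters with a space, rather than deleting them, avoids gluing neighboring words together.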

## 3. Analyze Data

Now that the data has been cleaned, we can analyze it. Here we do some quick exploration of the cleaned datasets for any insights, ideas, or edits that can inform our model building.

Run the following notebooks to analyze the datasets. 

#### 3.1 Analyze Article Data
Cleaned article data is analyzed in anticipation of modeling.

[03-01-AnalyzeArticles.ipynb](./03-01-AnalyzeArticles.ipynb)

#### 3.2 Analyze Product Data
Cleaned product data is analyzed in anticipation of modeling.

[03-02-AnalyzeProducts.ipynb](./03-02-AnalyzeProducts.ipynb)
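
Quick exploration of cleaned text often starts with term counts and document lengths. The sketch below shows one way to compute both. It assumes each document is an already cleaned, whitespace-separated string; the stop-word list is a small illustrative set, and `top_terms` and `length_stats` are hypothetical names rather than functions from the analysis notebooks.

```python
from collections import Counter

# Small illustrative stop-word list; a real analysis would likely use a fuller one.
STOP_WORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "with"}
)


def top_terms(documents, n=5):
    """Return the n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tok for tok in doc.split() if tok not in STOP_WORDS)
    return counts.most_common(n)


def length_stats(documents):
    """Return min, max, and mean document length measured in tokens."""
    lengths = [len(doc.split()) for doc in documents]
    if not lengths:
        return {"min": 0, "max": 0, "mean": 0.0}
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": sum(lengths) / len(lengths),
    }
```

Very short documents or terms that dominate every document are the kind of finding that may prompt further edits before modeling.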

## 4. Model Recommendation

Next, we build a recommendation model using the cleaned datasets.

Run the following notebook to build a recommendation model using the data provided.

[04-ModelRec.ipynb](./04-ModelRec.ipynb)

This notebook saves the recommendation model in the file `./intermediate_data/recommender`.
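
The modeling approach used in `04-ModelRec.ipynb` is not described in this overview. As one common baseline for matching articles to a product description, the sketch below ranks articles by TF-IDF cosine similarity using only the standard library. This is an assumption about a plausible technique, not the project's actual model, and it scores relevance only; it does not distinguish positive from negative articles. All function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [doc.split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency: ln((1 + N) / (1 + df)) + 1
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(product_text, articles, top_n=3):
    """Return (article_index, score) pairs, most similar to the product first."""
    vectors = tfidf_vectors([product_text] + list(articles))
    product_vec, article_vecs = vectors[0], vectors[1:]
    scored = [(i, cosine(product_vec, vec)) for i, vec in enumerate(article_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Since `sklearn` appears in the version cell at the end of this notebook, the same idea could likely be expressed with scikit-learn's `TfidfVectorizer` and `cosine_similarity` instead of hand-written functions.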

## 5. Present Recommendations

Finally, we set up the recommendation system so that it can be presented in a Mesmerizing, Original, Professional, and Simple way to a non-technical audience.

Run the following Python file to present the recommendations.

[05-PresentRec.py](./05-PresentRec.py)

## Version and Hardware Information

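```python
# Requires the watermark IPython extension: pip install watermark
%load_ext watermark
%watermark -v -m -p ipywidgets,matplotlib,numpy,streamlit,pandas,sklearn
```

If the extension is not installed, the cell fails with `ModuleNotFoundError: No module named 'watermark'`.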

---

**Authors:**
[Salah Mohamoud](mailto:salah.mohamoud.dev@gmail.com),
[Sai Keertana Lakku](mailto:saikeertana005@gmail.com),
[Zhen Zhuang](mailto:zhuangzhen17cs@gmail.com),
[Nick Capaldini](mailto:nick.capaldini@ridethenextwave.com), Ride The Next Wave, May 19, 2025

---