# News Recommendation System using K-Nearest Neighbors (KNN)

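This notebook builds a news recommendation system with K-Nearest Neighbors (KNN): it loads articles from a database, fits a `NewsRecommendationSystem` on them, saves the fitted model, and checks the recommendations it returns.

## 1. Import Libraries
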
In [1]:
import warnings
warnings.filterwarnings("ignore")

In [2]:
import os
import pandas as pd
import joblib

from sqlalchemy import create_engine

In [3]:
import sys

sys.path.append("..")
from recommender import NewsRecommendationSystem

---

## 2. Database Configuration


In [4]:
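# Load the database URL from the environment
DATABASE_URL = os.getenv("DATABASE_URL")
if not DATABASE_URL:
    raise RuntimeError("DATABASE_URL is not set; export it before running this notebook.")

# Create the SQLAlchemy engine (connections are opened lazily, on first use)
engine = create_engine(DATABASE_URL)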

## 3. Load and Inspect Data

In [5]:
QUERY = """
SELECT 
    "Post".id,
    "Post".title AS heading,
    "Post"."createdAt" AS date,
    "Post"."visitCount",
    "Category".name AS category_name,
    "Sentiment".name AS sentiment_name
FROM 
    "Post"
LEFT JOIN "Category" ON "Post"."categoryId" = "Category".id
LEFT JOIN "Sentiment" ON "Post"."sentimentId" = "Sentiment".id
WHERE
    -- Posts without a sentiment (NULL after the LEFT JOIN) are also dropped by this comparison
    "Sentiment".name != 'NEGATIVE'
"""
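The query combines a `LEFT JOIN` with a `!=` filter on the joined table, which interacts with SQL's NULL handling: a post with no sentiment row yields `NULL != 'NEGATIVE'`, which is not true, so the row is filtered out. The sketch below reproduces this on an in-memory SQLite database. The `post` and `sentiment` tables are simplified, hypothetical stand-ins for the notebook's schema, and SQLite is used only so the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema: not the notebook's actual tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sentiment (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT, sentiment_id INTEGER);
    INSERT INTO sentiment VALUES (1, 'POSITIVE'), (2, 'NEGATIVE');
    INSERT INTO post VALUES (1, 'a', 1), (2, 'b', 2), (3, 'c', NULL);
    """
)

base_query = """
SELECT post.id FROM post
LEFT JOIN sentiment ON post.sentiment_id = sentiment.id
WHERE {condition}
ORDER BY post.id
"""

# Same shape as the notebook's filter: post 3 (no sentiment) is dropped.
strict_condition = "sentiment.name != 'NEGATIVE'"
strict_ids = [row[0] for row in conn.execute(base_query.format(condition=strict_condition))]

# Explicitly keeping NULLs retains posts that have no sentiment yet.
lenient_condition = "sentiment.name IS NULL OR sentiment.name != 'NEGATIVE'"
lenient_ids = [row[0] for row in conn.execute(base_query.format(condition=lenient_condition))]

print(strict_ids)   # [1]
print(lenient_ids)  # [1, 3]
conn.close()
```

If unlabelled posts should be recommendable, the second condition is the one to use; if they should be excluded, the original filter already does that.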

In [6]:
df = pd.read_sql_query(QUERY, engine)

In [7]:
df.head()

|   | id | heading | date | visitCount | category_name | sentiment_name |
|---|----|---------|------|------------|---------------|----------------|
| 0 | cm6cglxr40001gjp8jmi8adbz | नेपालमा शिक्षा प्रणाली सुधार | 2025-01-25 17:23:15.616 | 0 | economy | POSITIVE |
| 1 | cm6cglyjo0002gjp8n9x8e2om | नेपालमा महिलाको सशक्तिकरण | 2025-01-25 17:23:16.645 | 0 | economy | NEUTRAL |
| 2 | cm6cglzcu0003gjp8cl64f1im | नेपालमा आगामी चुनावको तयारी | 2025-01-25 17:23:17.695 | 0 | economy | POSITIVE |
| 3 | cm6cgm06d0004gjp8rdu7icct | नेपालमा महिलाको सशक्तिकरण | 2025-01-25 17:23:18.758 | 0 | economy | POSITIVE |
| 4 | cm6cgm10u0005gjp8t3x1g2a0 | नेपालका प्रमुख पर्यटकीय गन्तव्यहरू | 2025-01-25 17:23:19.855 | 0 | opinion | POSITIVE |


In [8]:
# Print dataset statistics
print(f"Total number of articles: {df.shape[0]}")
print("\nArticles per category:")
print(df.groupby("category_name").size())

Total number of articles: 64

Articles per category:
category_name
diaspora         7
economy          9
entertainment    8
health           5
literature       6
national         6
opinion          6
sports           5
technology       7
world            5
dtype: int64


---

## 4. Prepare Data for Recommendation System

In [9]:
# Convert DataFrame to a list of dictionaries for easier processing
articles = df.to_dict(orient="records")

In [10]:
# Display the first few records
print("Sample article records:")
print(articles[:5])

Sample article records:
[{'id': 'cm6cglxr40001gjp8jmi8adbz', 'heading': 'नेपालमा शिक्षा प्रणाली सुधार', 'date': Timestamp('2025-01-25 17:23:15.616000'), 'visitCount': 0, 'category_name': 'economy', 'sentiment_name': 'POSITIVE'}, {'id': 'cm6cglyjo0002gjp8n9x8e2om', 'heading': 'नेपालमा महिलाको सशक्तिकरण', 'date': Timestamp('2025-01-25 17:23:16.645000'), 'visitCount': 0, 'category_name': 'economy', 'sentiment_name': 'NEUTRAL'}, {'id': 'cm6cglzcu0003gjp8cl64f1im', 'heading': 'नेपालमा आगामी चुनावको तयारी', 'date': Timestamp('2025-01-25 17:23:17.695000'), 'visitCount': 0, 'category_name': 'economy', 'sentiment_name': 'POSITIVE'}, {'id': 'cm6cgm06d0004gjp8rdu7icct', 'heading': 'नेपालमा महिलाको सशक्तिकरण', 'date': Timestamp('2025-01-25 17:23:18.758000'), 'visitCount': 0, 'category_name': 'economy', 'sentiment_name': 'POSITIVE'}, {'id': 'cm6cgm10u0005gjp8t3x1g2a0', 'heading': 'नेपालका प्रमुख पर्यटकीय गन्तव्यहरू', 'date': Timestamp('2025-01-25 17:23:19.855000'), 'visitCount': 0, 'category_name':

---

## 5. Train the News Recommendation Model


### Initialize the recommendation system with KNN


In [11]:
recommender = NewsRecommendationSystem(k=5)

### Train the model

In [12]:
recommender.fit(articles)
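The internals of `NewsRecommendationSystem` are not shown in this notebook. As a rough illustration of what a KNN recommender over article metadata can look like, here is a minimal sketch that one-hot encodes category and sentiment and ranks the other articles by Euclidean distance. `TinyKNNRecommender` and its feature choice are assumptions made for this example, not the project's implementation.

```python
import math


def one_hot(value, vocabulary):
    """Encode a categorical value as a 0/1 vector over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


class TinyKNNRecommender:
    """Hypothetical stand-in; not the notebook's NewsRecommendationSystem."""

    def __init__(self, k=5):
        self.k = k
        self.vectors = {}

    def fit(self, articles):
        # Build the vocabularies from the data, then one feature vector per article.
        categories = sorted({a["category_name"] for a in articles})
        sentiments = sorted({a["sentiment_name"] for a in articles})
        self.vectors = {
            a["id"]: one_hot(a["category_name"], categories)
            + one_hot(a["sentiment_name"], sentiments)
            for a in articles
        }
        return self

    def recommend(self, article_id, limit=None):
        # Brute-force search: distance from the query to every other article.
        query = self.vectors[article_id]
        scored = [
            (other_id, math.dist(query, vector))
            for other_id, vector in self.vectors.items()
            if other_id != article_id
        ]
        scored.sort(key=lambda pair: pair[1])  # nearest first
        return scored[: limit or self.k]


toy_articles = [
    {"id": "a", "category_name": "economy", "sentiment_name": "POSITIVE"},
    {"id": "b", "category_name": "economy", "sentiment_name": "NEUTRAL"},
    {"id": "c", "category_name": "health", "sentiment_name": "POSITIVE"},
    {"id": "d", "category_name": "sports", "sentiment_name": "NEUTRAL"},
]
toy_model = TinyKNNRecommender(k=2).fit(toy_articles)
toy_result = toy_model.recommend("a")
print(toy_result)
```

In this toy data, articles `b` and `c` each differ from `a` in exactly one feature, so they tie as nearest neighbours, while `d` differs in both and is left out at `k=2`.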

### Save the model

In [None]:
MODEL_DIR = os.path.join(os.getcwd(), "..", "models")
# Create the models directory if it does not exist yet
os.makedirs(MODEL_DIR, exist_ok=True)

MODEL_PATH = os.path.join(MODEL_DIR, "news_recommendation_model.pkl")

In [17]:
try:
    recommender.save_model(MODEL_PATH)
    print(f"Model saved to: {MODEL_PATH}")
except Exception as e:
    print(f"Failed to save model: {e}")

Model saved to: c:\Users\Suyash Shrestha\Personal\_Nepali_news_project\News-algorithm\news_algorithms\news_recommendation\notebooks\..\models\news_recommendation_model.pkl
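The saved path above still contains a literal `..` segment because it was assembled with `os.path.join`. One way to get a normalized path, sketched here with `pathlib`, is to resolve it before use. The directory layout in this example is created inside a temporary folder and is purely illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Illustrative layout: <tmp>/project/notebooks is the "current" directory.
    notebooks_dir = Path(tmp) / "project" / "notebooks"
    notebooks_dir.mkdir(parents=True)

    # resolve() collapses the ".." segment into an absolute, normalized path.
    model_dir = (notebooks_dir / ".." / "models").resolve()
    model_dir.mkdir(parents=True, exist_ok=True)
    model_dir.mkdir(parents=True, exist_ok=True)  # repeating the call is harmless

    model_path = model_dir / "news_recommendation_model.pkl"
    has_dotdot = ".." in model_path.parts
    dir_exists = model_dir.is_dir()
    print(model_path.name, has_dotdot, dir_exists)
```

In the notebook itself, `Path.cwd()` would take the place of `notebooks_dir`.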


---

## 6. Test the Model with Recommendations

### Select an article ID for testing


In [12]:
sample_article_id = articles[0]["id"]

### Load the saved model and get recommendations


In [13]:
recommender_saved = NewsRecommendationSystem.load_model(MODEL_PATH)

In [14]:
loaded_recommendations = recommender_saved.recommend(sample_article_id)

In [15]:
loaded_recommendations

[('cm6cglzcu0003gjp8cl64f1im', 1.261624932178372),
 ('cm6dw2oy20004gjx8aqtorl2q', 1.2652088311816925),
 ('cm6cgm06d0004gjp8rdu7icct', 1.2883576606946117),
 ('cm6cgmwvj0016gjp8skudadsf', 1.4142135623730951),
 ('cm6dw2mk70001gjx8a1tkndwj', 1.4174116963698524)]
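The result is a list of `(article_id, distance)` pairs that appears to be ordered from nearest to farthest, so a smaller number means a closer match. The sketch below checks that ordering on the values printed above and converts each distance into a bounded similarity score. The `1 / (1 + distance)` mapping is a common convention assumed here for illustration; the notebook does not define one.

```python
import math

# Values copied from the output above.
sample_output = [
    ("cm6cglzcu0003gjp8cl64f1im", 1.261624932178372),
    ("cm6dw2oy20004gjx8aqtorl2q", 1.2652088311816925),
    ("cm6cgm06d0004gjp8rdu7icct", 1.2883576606946117),
    ("cm6cgmwvj0016gjp8skudadsf", 1.4142135623730951),
    ("cm6dw2mk70001gjx8a1tkndwj", 1.4174116963698524),
]

distances = [distance for _, distance in sample_output]
is_sorted = distances == sorted(distances)  # nearest neighbour first


def to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1]; assumed convention."""
    return 1.0 / (1.0 + distance)


similarities = [(rec_id, to_similarity(d)) for rec_id, d in sample_output]
for rec_id, score in similarities:
    print(f"{rec_id}: {score:.3f}")

# The fourth distance equals sqrt(2).
fourth_is_sqrt2 = math.isclose(distances[3], math.sqrt(2))
print(is_sorted, fourth_is_sqrt2)
```

A distance of exactly √2 is what the Euclidean metric gives for two one-hot encoded vectors that differ in a single categorical feature. That is consistent with a one-hot feature encoding, but the notebook does not show how the recommender builds its features.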

### Display the recommended articles


In [15]:
recommendations_with_distances = recommender.recommend(sample_article_id, limit=5)

print(f"Recommendations for article ID {sample_article_id}:")
for rec_id, distance in recommendations_with_distances:
    article = next(a for a in articles if a["id"] == rec_id)
    print(f"ID: {article['id']}")
    print(f"Heading: {article['heading']}")
    print(f"Category: {article['category_name']}")
    # print(f"Distance: {distance:.4f}")
    print("-"*50)

Recommendations for article ID cm6cglxr40001gjp8jmi8adbz:
ID: cm6cglzcu0003gjp8cl64f1im
Heading: नेपालमा आगामी चुनावको तयारी
Category: economy
--------------------------------------------------
ID: cm6dw2oy20004gjx8aqtorl2q
Heading: नेपालमा आगामी चुनावको तयारी
Category: economy
--------------------------------------------------
ID: cm6cgm06d0004gjp8rdu7icct
Heading: नेपालमा महिलाको सशक्तिकरण
Category: economy
--------------------------------------------------
ID: cm6cgmwvj0016gjp8skudadsf
Heading: नेपालमा शिक्षा प्रणाली सुधार
Category: health
--------------------------------------------------
ID: cm6dw2mk70001gjx8a1tkndwj
Heading: नेपालका प्रमुख खेलकुद गतिविधिहरू
Category: economy
--------------------------------------------------
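The display loop looks up each recommended article with `next(...)` over the full list, which scans the list once per recommendation and raises `StopIteration` if an ID is missing. A possible alternative, sketched below on made-up records, is to build an ID-to-article dictionary once and handle missing IDs explicitly. The names `sample_articles`, `articles_by_id`, and `describe` are introduced only for this example.

```python
# Made-up records with the same keys the notebook's articles use.
sample_articles = [
    {"id": "a1", "heading": "First", "category_name": "economy"},
    {"id": "a2", "heading": "Second", "category_name": "health"},
]

# Build the index once; each lookup afterwards is a dictionary access.
articles_by_id = {article["id"]: article for article in sample_articles}


def describe(rec_id):
    """Return a one-line summary, or a clear message if the ID is unknown."""
    article = articles_by_id.get(rec_id)
    if article is None:
        return f"{rec_id}: not found in the loaded articles"
    return f"{article['id']} | {article['category_name']} | {article['heading']}"


print(describe("a2"))
print(describe("missing"))
```

A missing ID can occur when the model was fitted on a different snapshot of the database than the one currently loaded into `articles`.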


---

## 7. Save and Load the Model

### Save the trained model


In [17]:
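# Persist the same object directly with joblib; this overwrites the file written by save_model() above
joblib.dump(recommender, MODEL_PATH)
print(f"Model saved to {MODEL_PATH}")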

Model saved to c:\Users\Suyash Shrestha\Personal\_Nepali_news_project\News-algorithm\news_algorithms\news_recommendation\notebooks\..\models\news_recommendation_model.pkl


### Load the Saved Model

In [18]:
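# Optional: a fresh, unfitted instance with explicit hyperparameters. It is not needed for
# loading, because joblib.load() below returns the fitted object that was dumped.
# Note that this rebinds `recommender`, so the fitted in-memory model is no longer reachable by that name.
recommender = NewsRecommendationSystem(k=5, metric="euclidean", time_decay_factor=0.1)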

In [19]:
recommender_saved = joblib.load(MODEL_PATH)
print("Model loaded successfully.")

Model loaded successfully.
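A "loaded successfully" message only shows that deserialization did not raise. A stronger check is to compare the restored object with the original. The sketch below does this with the standard-library `pickle` module, which joblib's persistence builds on, and a plain dictionary standing in for the fitted model. The `fitted_state` contents are invented for the example.

```python
import pickle
import tempfile
from pathlib import Path

# Invented stand-in for a fitted model's state.
fitted_state = {
    "k": 5,
    "metric": "euclidean",
    "vectors": {"a": [1.0, 0.0], "b": [0.0, 1.0]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"

    # Write the state to disk, then read it back as a separate object.
    with path.open("wb") as fh:
        pickle.dump(fitted_state, fh)
    with path.open("rb") as fh:
        restored = pickle.load(fh)

# Equal content, but a distinct object: the round trip preserved the state.
round_trip_ok = restored == fitted_state and restored is not fitted_state
print(round_trip_ok)
```

For the recommender itself, an equivalent check would be to compare the output of `recommend` for the same article ID before saving and after loading. Pickle-based files can execute arbitrary code when loaded, so they should only be loaded from trusted sources.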


---

## 8. Generate Recommendations from Loaded Model

In [20]:
def get_recommendation_data(recommended_id: str):
    """
    Fetch and display details of a recommended article.
    
    Args:
        recommended_id: The ID of the article to fetch.
    """
    article = next(a for a in articles if a["id"] == recommended_id)
    print(f"ID: {article['id']}")
    print(f"Category: {article['category_name']}")
    print(f"Heading: {article['heading']}")
    print(f"Date: {article['date']}")

### Test the saved model with recommendations


In [21]:
new_test_article_id = articles[3]["id"]
print("Testing with article:", articles[3])

recommendations = recommender_saved.recommend(new_test_article_id, limit=5)
print("Recommended articles:")
for rec_id, _distance in recommendations:
    get_recommendation_data(rec_id)
    print("------------------------")

Testing with article: {'id': 'cm6cgm06d0004gjp8rdu7icct', 'heading': 'नेपालमा महिलाको सशक्तिकरण', 'date': Timestamp('2025-01-25 17:23:18.758000'), 'visitCount': 0, 'category_name': 'economy', 'sentiment_name': 'POSITIVE'}
Recommended articles:
ID: cm6cglxr40001gjp8jmi8adbz
Category: economy
Heading: नेपालमा शिक्षा प्रणाली सुधार
Date: 2025-01-25 17:23:15.616000
------------------------
ID: cm6cglzcu0003gjp8cl64f1im
Category: economy
Heading: नेपालमा आगामी चुनावको तयारी
Date: 2025-01-25 17:23:17.695000
------------------------
ID: cm6dw2oy20004gjx8aqtorl2q
Category: economy
Heading: नेपालमा आगामी चुनावको तयारी
Date: 2025-01-26 17:23:57.770000
------------------------
ID: cm6cglyjo0002gjp8n9x8e2om
Category: economy
Heading: नेपालमा महिलाको सशक्तिकरण
Date: 2025-01-25 17:23:16.645000
------------------------
ID: cm6cgmimg000pgjp8hmypm7nc
Category: diaspora
Heading: नेपालमा महिलाको सशक्तिकरण
Date: 2025-01-25 17:23:42.665000
------------------------
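One quick way to sanity-check output like this is to measure how many recommendations share the query article's category. The sketch below computes that fraction for the result above. `category_hit_rate` is a helper name introduced for this example, and the metric is a simple heuristic rather than an evaluation the notebook performs.

```python
# Categories copied from the output above: the query article and its five recommendations.
query_category = "economy"
recommended_categories = ["economy", "economy", "economy", "economy", "diaspora"]


def category_hit_rate(query_category, recommended_categories):
    """Fraction of recommendations in the same category as the query article."""
    if not recommended_categories:
        return 0.0
    hits = sum(1 for category in recommended_categories if category == query_category)
    return hits / len(recommended_categories)


rate = category_hit_rate(query_category, recommended_categories)
print(f"Same-category recommendations: {rate:.0%}")  # Same-category recommendations: 80%
```

A high rate is not automatically better: a recommender that only ever returns same-category articles may offer little variety, so the number is best read alongside the recommended headings themselves.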


---

## Conclusion

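- We built a KNN-based news recommendation system (k=5) on 64 articles with non-negative sentiment, spread across 10 categories.
- The fitted model was saved to disk and reloaded, both through `save_model`/`load_model` and through `joblib`.
- For both test articles, four of the five recommendations came from the same category (economy) as the query article.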